Researchers from Stephen James’s Robot Learning Lab in London have developed a groundbreaking system named Genima that leverages AI-generated images to train robots. By fine-tuning Stable Diffusion, Genima visualizes robots’ movements, guiding them in both simulation and the real world.
Genima functions as a behavior-cloning agent, transforming the traditional approach to robotic training. It “draws joint-actions” on RGB images, which a controller then maps to a sequence of joint positions. This method allows robots, from mechanical arms to humanoids and driverless cars, to learn complex tasks more efficiently.
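The two-stage pipeline described above can be sketched in miniature. Note the hedging: in Genima, stage 1 is a fine-tuned Stable Diffusion model and stage 2 is a learned controller network; the function names, sphere colors, and the analytic decoding below are purely illustrative stand-ins to show the data flow, not the authors' implementation.

```python
import numpy as np

# Illustrative sketch of a Genima-style two-stage pipeline.
# Stage 1 "draws" future joint targets as colored spheres on an RGB
# observation; stage 2 maps the drawn image back to joint targets.
# All names and details here are hypothetical.

COLORS = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]  # one color per joint

def draw_joint_targets(rgb, targets_px, radius=4):
    """Stage 1 stand-in: paint a colored disc at each 2D joint target.

    In Genima this image would come from a fine-tuned diffusion model;
    here we rasterize the targets directly to show the image format.
    """
    img = rgb.copy()
    h, w, _ = img.shape
    for i, (u, v) in enumerate(targets_px):
        ys, xs = np.ogrid[:h, :w]
        mask = (xs - u) ** 2 + (ys - v) ** 2 <= radius ** 2
        img[mask] = COLORS[i % len(COLORS)]
    return img

def controller(drawn_img):
    """Stage 2 stand-in: recover each sphere's pixel center by color.

    The real controller is a learned network mapping visual targets to
    joint positions; this decodes the colors analytically instead.
    """
    recovered = []
    for c in COLORS:
        ys, xs = np.nonzero(np.all(drawn_img == c, axis=-1))
        if len(xs):
            recovered.append((int(xs.mean()), int(ys.mean())))
    return recovered

rgb = np.zeros((64, 64, 3), dtype=np.uint8)  # stand-in camera frame
targets = [(10, 20), (32, 32), (50, 12)]     # hypothetical joint targets
drawn = draw_joint_targets(rgb, targets)
print(controller(drawn))  # → [(10, 20), (32, 32), (50, 12)]
```

The key design point the sketch illustrates is that the action representation lives entirely in image space, so an internet-pretrained image model can be repurposed to propose actions without ever outputting joint angles directly.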
According to Interesting Engineering, the researchers conducted extensive studies using Genima across 25 tasks from the RLBench simulation and 9 real-world manipulation scenarios. They discovered that by lifting actions into image-space, internet pre-trained diffusion models could generate policies that outperform state-of-the-art visuomotor approaches. Notably, Genima demonstrated robustness against scene perturbations and adaptability to novel objects, making it competitive even with 3D agents that utilize additional data such as depth and keypoints.
However, the study highlights that Genima has its limitations. As with all behavior-cloning agents, it focuses on distilling expert behaviors rather than discovering new ones. The system relies on camera calibration to render targets accurately, assuming that the robot is always visible from a specific viewpoint.
The researchers validated Genima’s effectiveness by benchmarking it against the ACT neural network on real-robot setups. Using a Franka Emika Panda robot equipped with external and wrist cameras, they trained multi-task agents from scratch across nine diverse tasks, including handling dynamic and deformable objects. The system visualizes desired actions—such as opening a box or hanging a scarf—as colored spheres drawn atop images, indicating future joint movements.
In trials, Genima completed the 25 simulated and nine real-world tasks with average success rates of 50% and 64%, respectively, according to MIT Technology Review. The potential of pre-trained diffusion models to transform robotics parallels their revolutionary impact on image generation, signaling a promising future for robotic training techniques.