
Nowadays robots can perform a variety of tasks, as long as they are trained on real-world data. But an important question is: what if they could bypass this step? The result would be many more general-purpose robots developed at a faster rate.

Google’s DeepMind has introduced a self-improving AI model called “RoboCat”, which may be the key to machines that can generate their own training data and improve their technique without much human interference, according to a blog post published by the firm.

The researchers state in their post: “RoboCat learns much faster than other state-of-the-art models. It can pick up a new task with as few as 100 demonstrations because it draws from a large and diverse dataset. This capability will help accelerate robotics research, as it reduces the need for human-supervised training, and is an important step towards creating a general-purpose robot.”

They further explain that the new model is based on DeepMind’s multimodal model Gato (“cat” in Spanish), which can process language, images and actions in both simulated and physical environments.

According to Interesting Engineering, the researchers used Gato’s architecture together with a large training dataset of image-and-action sequences of various robot arms solving hundreds of different tasks.

They then trained RoboCat to learn new tasks by following five steps:

  1. Collect between 100 and 1,000 demonstrations of a new task or robot, using a robotic arm controlled by a human.
  2. Fine-tune RoboCat on this new task/arm, creating a specialized spin-off agent.
  3. Let the spin-off agent practice on this new task/arm an average of 10,000 times, generating more training data.
  4. Incorporate the demonstration data and self-generated data into RoboCat’s existing training dataset.
  5. Train a new version of RoboCat on the new training dataset.
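The five steps above can be sketched as a simple data-growing loop. This is a minimal, hypothetical illustration only: the function names (`collect_demonstrations`, `fine_tune`, and so on) and the data representations are stand-ins invented for clarity, not DeepMind’s actual code or API.

```python
# Hypothetical sketch of RoboCat's five-step self-improvement cycle.
# All names and data structures are illustrative stand-ins.

def collect_demonstrations(task, n=100):
    """Step 1: human-teleoperated demonstrations of the new task."""
    return [f"{task}-demo-{i}" for i in range(n)]

def fine_tune(base_agent, demos):
    """Step 2: fine-tune a specialized spin-off agent on the new task."""
    return {"base": base_agent, "demos": list(demos)}

def self_practice(spinoff, n=10_000):
    """Step 3: the spin-off practices, generating more training data."""
    return [f"self-generated-{i}" for i in range(n)]

def train_robocat(dataset):
    """Step 5: train a new generalist model on the grown dataset."""
    return {"model": "RoboCat-next", "data_size": len(dataset)}

def self_improvement_cycle(dataset, task):
    demos = collect_demonstrations(task)      # step 1
    spinoff = fine_tune("RoboCat", demos)     # step 2
    practice = self_practice(spinoff)         # step 3
    dataset = dataset + demos + practice      # step 4: merge all data
    return train_robocat(dataset), dataset    # step 5

model, dataset = self_improvement_cycle([], "stack-blocks")
print(model["data_size"])  # 10100: 100 demos + 10,000 practice episodes
```

The key design point the article describes is that step 3 makes the loop self-improving: each cycle the dataset grows mostly from the robot’s own practice rather than from human demonstrations.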

This new and diverse training technique taught the AI model to operate different robotic arms within just a few hours. Even though it had only been trained on arms with two-pronged grippers, RoboCat was able to adapt to a more complex arm with a three-fingered gripper and twice as many controllable inputs.

The more tasks RoboCat learned, the better it became at learning new ones. Early versions of the model succeeded just 36 percent of the time when faced with previously unseen tasks, but the latest and most advanced version, which had trained on a greater diversity of tasks, more than doubled this success rate on the same tasks.

Information provided by the news website Interesting Engineering.