Researchers at Nvidia have created a deep-learning system that lets a robot learn a task simply by observing a human’s actions. The system is designed to enhance communication between humans and robots while furthering research that will enable people to work alongside robots seamlessly. Through a demonstration, a user can communicate a task to the robot and provide clues about how best to perform it, the researchers say.

Using NVIDIA TITAN X GPUs, the researchers trained a sequence of neural networks to perform duties associated with perception, program generation, and program execution. As a result, the robot was able to learn a task from a single demonstration in the real world.

Once the robot sees a task, it generates a human-readable description of the steps needed to perform the task again, according to the Nvidia developer website.
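To make the idea of a human-readable program concrete, here is a minimal Python sketch. It is not Nvidia's implementation: the block-stacking scenario, the `generate_program` and `execute` names, and the sentence template are all assumptions chosen for illustration. It assumes the perception stage has already reduced the demonstration to the final arrangement of objects.

```python
def generate_program(stacks):
    """Turn an observed final arrangement into readable steps.

    `stacks` is a hypothetical perception output: a list of stacks,
    each listing object names from bottom to top.
    """
    steps = []
    for stack in stacks:
        # Pair each object with the one directly above it, bottom-up,
        # so that every object is placed on something already in position.
        for lower, upper in zip(stack, stack[1:]):
            steps.append(f"Place the {upper} on the {lower}")
    return steps


def execute(steps, act=print):
    """Run the program step by step; `act` stands in for a robot controller."""
    for number, step in enumerate(steps, start=1):
        act(f"Step {number}: {step}")


if __name__ == "__main__":
    observed = [["blue cube", "red cube", "green cube"]]
    execute(generate_program(observed))
```

Because the intermediate program is plain text, a person can read it and catch a misunderstanding before the robot moves.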

The key to achieving this capability is using synthetic data to train the neural networks. Current approaches to training neural networks require large amounts of labeled training data, which is a serious bottleneck. With synthetic data generation, an almost unlimited amount of labeled training data can be produced with very little effort.

This is also the first time an image-centric domain randomization approach has been used on a robot. Domain randomization is a technique to produce synthetic data with large amounts of diversity, which then fools the perception network into seeing the real-world data as simply another variation of its training data. The researchers chose to process the data in an image-centric manner to ensure that the networks are not dependent on the camera or environment.
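The general idea of domain randomization can be sketched in a few lines of Python. This is an illustrative sketch only, not the researchers' pipeline: the `SceneSample` fields, the parameter ranges, and the function names are assumptions, and a real system would pass each sample to a renderer to produce an image. The point it shows is that every sample gets random appearance parameters while its label is known for free, because the generator chose the object's position itself.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneSample:
    cube_xy: tuple          # ground-truth label, known because we placed the cube
    cube_color: tuple       # randomized RGB values in [0, 1)
    light_intensity: float  # randomized lighting
    camera_height: float    # randomized viewpoint (assumed to be in meters)
    num_distractors: int    # random clutter objects added to the scene


def sample_scene(rng):
    """Draw one randomized scene description (all ranges are hypothetical)."""
    return SceneSample(
        cube_xy=(rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)),
        cube_color=tuple(rng.random() for _ in range(3)),
        light_intensity=rng.uniform(0.2, 2.0),
        camera_height=rng.uniform(0.5, 1.5),
        num_distractors=rng.randint(0, 10),
    )


def generate_dataset(n, seed=0):
    """Generate n labeled scene descriptions, reproducibly for a given seed."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]


if __name__ == "__main__":
    for sample in generate_dataset(3):
        print(sample)
```

With enough variation in color, lighting, and viewpoint, a real camera image ideally falls inside the range of appearances the network has already seen.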

The team says it will continue to explore the use of synthetic training data for robotic manipulation in order to extend the method to additional scenarios.