Training Robots Using Audio



A team of roboticists from Stanford University and the Toyota Research Institute found that adding audio data to visual data when training robots improves how quickly and accurately they learn.

The team explained that training AI-based robots typically involves exposing them to large amounts of visual information while ignoring the associated audio. This led them to wonder whether fitting robots with microphones, so they could collect sound from the task they are learning, might help them learn it better.

The example they provided was opening a box of cereal and filling a bowl – in this scenario, it may be helpful to hear the sound of the box being opened and of the dry cereal being poured into the bowl.

According to Techxplore, the team tested their theory by designing and carrying out four robot-learning experiments. In the first, a robot learned to flip a bagel in a frying pan with a spatula; in the second, to erase an image from a whiteboard with an eraser; in the third, to pour dice from one cup into another; and in the fourth, to choose the correct size of tape from three samples and use it to tape a wire to a plastic strip.

The robot used for all these experiments had a grasping claw, and each experiment was run in two ways: with video only, and with both video and audio. After running all the experiments, the researchers compared the results by judging how quickly and easily the robots learned and carried out the tasks, as well as how accurately they performed them.
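The article does not describe how the robot's policy combines the two input streams, but a common approach in multimodal robot learning is "late fusion": encode the camera frames and the microphone spectrogram separately, concatenate the resulting feature vectors, and map the fused vector to an action. The sketch below illustrates that idea with toy random-weight encoders; all dimensions, names, and the linear encoders are illustrative assumptions, not the researchers' actual architecture.

```python
import numpy as np

class MultimodalPolicy:
    """Toy late-fusion policy (illustrative, not the paper's model):
    visual and audio inputs are encoded separately, their features are
    concatenated, and a linear head maps the fused vector to an action."""

    def __init__(self, vision_dim, audio_dim, action_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Placeholder linear "encoders" with small random weights.
        self.w_vision = rng.standard_normal((vision_dim, 32)) * 0.01
        self.w_audio = rng.standard_normal((audio_dim, 16)) * 0.01
        # Linear head over the concatenated (32 + 16)-d fused feature.
        self.w_action = rng.standard_normal((32 + 16, action_dim)) * 0.01

    def act(self, frame, spectrogram):
        v = frame.reshape(-1) @ self.w_vision        # visual feature, shape (32,)
        a = spectrogram.reshape(-1) @ self.w_audio   # audio feature, shape (16,)
        fused = np.concatenate([v, a])               # late fusion by concatenation
        return fused @ self.w_action                 # action vector

# Hypothetical sizes: a 64x64 grayscale frame, a 128x16 audio spectrogram,
# and a 7-dimensional action (e.g., arm joint targets plus gripper).
policy = MultimodalPolicy(vision_dim=64 * 64, audio_dim=128 * 16, action_dim=7)
frame = np.zeros((64, 64))
spectrogram = np.zeros((128, 16))
action = policy.act(frame, spectrogram)
print(action.shape)  # (7,)
```

A video-only baseline, as in the researchers' comparison, would simply drop the audio branch and feed the 32-d visual feature straight to the action head.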

The researchers found that adding audio significantly improved speed and accuracy on certain tasks, but did not help with others. Audio dramatically improved the robot's ability to tell whether there were any dice left in the cup, and helped it gauge whether it was applying the right amount of pressure while erasing the whiteboard. However, sound did not help it determine whether it had successfully flipped the bagel or fully erased the image from the board.

The team concluded that adding audio to teaching material for AI robots could provide better results for some applications.