Tuesday, July 31, 2018

OpenAI's Dactyl System Gives Robots Humanlike Dexterity

In a forthcoming paper ("Dexterous In-Hand Manipulation"), OpenAI researchers describe a system that uses reinforcement learning, in which the AI (known as Dactyl) learns through trial and error, to direct robot hands in grasping and manipulating objects with state-of-the-art precision. More impressive still, it was trained entirely in a computer simulation and wasn't given any human demonstrations to learn from. The researchers used the MuJoCo physics engine to simulate a physical environment in which a real robot might operate, and Unity to render images for training a computer vision model to recognize poses. But this approach had its limitations, the team writes -- the simulation was merely a "rough approximation" of the physical setup, which made it "unlikely" to produce systems that would translate well to the real world. Their solution was to randomize aspects of the environment, such as its physics (friction, gravity, joint limits, object dimensions, and more) and its visual appearance (lighting conditions, hand and object poses, materials, and textures). This both reduced the likelihood of overfitting -- a phenomenon in which a neural network learns noise in the training data, hurting its performance -- and increased the chances of producing an algorithm that would successfully choose actions based on real-world fingertip positions and object poses.
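To make the domain randomization idea concrete, here is a minimal Python sketch of sampling fresh physics parameters at the start of each simulated episode. The PhysicsParams container, the parameter names, and the randomization ranges are illustrative assumptions rather than values from the paper; a real setup would write the sampled values into the simulator (e.g. MuJoCo model fields) before each rollout.

```python
import random
from dataclasses import dataclass

# Minimal sketch of per-episode domain randomization. The fields and ranges
# below are assumptions for illustration; the actual system randomizes many
# more quantities (observation noise, action delays, and so on).

@dataclass
class PhysicsParams:
    friction_scale: float      # multiplier on nominal surface friction
    gravity_z: float           # m/s^2, nominal -9.81
    object_size_scale: float   # multiplier on the object's dimensions
    joint_limit_jitter: float  # radians added to each joint limit

def sample_physics_params(rng: random.Random) -> PhysicsParams:
    """Draw a fresh set of physics parameters at the start of each episode."""
    return PhysicsParams(
        friction_scale=rng.uniform(0.7, 1.3),
        gravity_z=-9.81 * rng.uniform(0.9, 1.1),
        object_size_scale=rng.uniform(0.95, 1.05),
        joint_limit_jitter=rng.gauss(0.0, 0.02),
    )

if __name__ == "__main__":
    rng = random.Random(0)
    for episode in range(3):
        params = sample_physics_params(rng)
        # A real training loop would push these values into the simulator
        # before rolling out the policy for this episode.
        print(f"episode {episode}: {params}")
```

The same idea carries over to the visual side: lighting, materials, textures, and camera and object poses are resampled each time the renderer produces training images for the vision model.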

Next, the researchers trained the control model -- a recurrent neural network -- across 384 machines, each with 16 CPU cores, allowing them to generate roughly two years of simulated experience per hour. After optimizing it on an eight-GPU PC, they moved on to the next step: training a convolutional neural network to predict the position and orientation of objects in the robot's "hand" from three simulated camera images.

Once both models were trained, it was on to validation tests. The researchers used a Shadow Dexterous Hand -- a five-fingered robotic hand with a total of 24 degrees of freedom, mounted on an aluminum frame -- to manipulate objects. Two sets of cameras, meanwhile -- motion-capture cameras as well as RGB cameras -- served as the system's eyes, allowing it to track the object's position and orientation. In the first of two tests, the algorithms were tasked with reorienting a block labeled with letters of the alphabet. The team chose a random goal, and each time the AI achieved it, they selected a new one until the robot (1) dropped the block, (2) spent more than a minute manipulating the block, or (3) reached 50 successful rotations. In the second test, the block was swapped for an octagonal prism.

The result? The models not only exhibited "unprecedented" performance but also naturally discovered types of grasps observed in humans, such as the tripod (a grip that uses the thumb, index finger, and middle finger), the prismatic grip (in which the thumb and a finger oppose each other), and the tip pinch grip. They also learned how to pivot and slide the robot hand's fingers, and how to use gravitational, translational, and torsional forces to slot the object into the desired position.
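For readers who want the evaluation protocol in code form, the sketch below mirrors the trial loop described above: new goal orientations keep being sampled until the block is dropped, a single goal takes more than a minute, or 50 consecutive goals have been reached. The env and policy interfaces (reset, step, is_dropped, goal_reached, sample_goal) are hypothetical stand-ins for illustration, not OpenAI's actual API.

```python
import time
import random

MAX_SUCCESSES = 50          # stop after 50 consecutive goals, as in the trials above
TIME_LIMIT_PER_GOAL = 60.0  # seconds allowed per goal before the trial ends

def run_trial(env, policy, rng: random.Random) -> int:
    """Return the number of consecutive goal orientations achieved in one trial."""
    obs = env.reset()
    successes = 0
    goal = env.sample_goal(rng)
    goal_start = time.monotonic()
    while True:
        action = policy(obs, goal)
        obs = env.step(action)
        if env.is_dropped():
            break                                 # (1) the block fell off the hand
        if time.monotonic() - goal_start > TIME_LIMIT_PER_GOAL:
            break                                 # (2) spent too long on one goal
        if env.goal_reached(goal):
            successes += 1
            if successes >= MAX_SUCCESSES:
                break                             # (3) hit the success cap
            goal = env.sample_goal(rng)           # otherwise pick a new goal
            goal_start = time.monotonic()
    return successes
```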
