Chapter 78 — Perceptual Robotics

Active in-hand object recognition

This video showcases the implementation of active object learning and recognition using the framework proposed in Browatzki et al. [1, 2]. The first phase shows the robot trying to learn the visual representation of several paper cups differing by a few key features. The robot executes a pre-programmed exploration program to look at the cup from all sides. The (very low-resolution) visual input is tracked and so-called key-frames are extracted which represent the (visual) exploration. After learning, the robot tries to recognize cups that have been placed into its hands using a similar exploration program based on visual information - due to the low-resolution input and the highly similar objects, the robot, however, fails to make the correct decision. The video then shows the second, advanced, exploration, which is based on actively seeking the view that is expected to provide maximum information about the object. For this, the robot embeds the learned visual information into a proprioceptive map indexed by the two joint angles of the hand. In this map, the robot now tries to predict the joint-angle combination that provides the most information about the object, given the current state of exploration. The implementation uses particle filtering to track a large number of object (view) hypotheses at the same time. Since the robot now uses a multisensory representation, the subsequent object-recognition trials are all correct, despite poor visual input and highly similar objects. References: [1] B Browatzki, V. Tikhanoff, G. Metta, H.H. Bülthoff, C. Wallraven: Active in-hand object recognition on a humanoid robot, IEEE Trans. Robot. 30(5), 1260-1269 (2014); [2] B. Browatzki, V. Tikhanoff, G. Metta, H.H. Bülthoff, C. Wallraven: Active object recognition on a humanoid robot, Proc. IEEE Int. Conf. Robot. Autom. (ICRA), St. Paul (2012), pp. 2021-2028.
Christian Wallraven
Thanks to Vladimir Tikhanoff for filming the iCub exploration. Thanks to the people at IIT (Vladimir Tikhanoff, Georgio Metta) for the collaboration
Latitude ="37.584686 , Longitude = 127.026715"    (link to Google Maps)