Jan Peters, Daniel D. Lee, Jens Kober, Duy Nguyen-Tuong, J. Andrew Bagnell and Stefan Schaal
Machine learning offers robotics a framework and a set of tools for the design of sophisticated and hard-to-engineer behaviors; conversely, the challenges of robotic problems provide inspiration, impact, and validation for developments in robot learning. The relationship between the disciplines has sufficient promise to be likened to that between physics and mathematics. In this chapter, we attempt to strengthen the links between the two research communities by providing a survey of work in robot learning for learning control and behavior generation in robots. We highlight key challenges in robot learning as well as notable successes. We discuss how contributions tamed the complexity of the domain and study the role of algorithms, representations, and prior knowledge in achieving these successes. As a result, a particular focus of our chapter lies on model learning for control and robot reinforcement learning. We demonstrate how machine learning approaches may be profitably applied, and we note throughout open questions and the tremendous potential for future research.
Inverted helicopter hovering
Author: Pieter Abbeel
Video ID: 352
An example of simulation-based optimization using a learned forward model.
This brief video shows a successful application of reinforcement learning to the design of a controller for sustained inverted flight of an autonomous helicopter. The authors began by learning a stochastic, nonlinear forward model of the helicopter’s dynamics. Then, a reinforcement learning algorithm was applied to automatically learn a controller for autonomous inverted hovering.
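The two-stage idea described above (first learn a stochastic forward model from logged data, then optimize a controller purely in simulation on that model) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the cited work: the one-dimensional system (`TRUE_A`, `TRUE_B`, `TRUE_NOISE`), the linear least-squares model, the quadratic cost, and the grid search over a single feedback gain `k` stand in for the helicopter dynamics, the nonlinear model, and the reinforcement learning algorithm used by the authors.

```python
import random

# Hypothetical 1-D "real" system, unknown to the learner (a stand-in for the robot).
TRUE_A, TRUE_B, TRUE_NOISE = 1.1, 0.5, 0.01

def true_step(x, u, rng):
    """One step of the real (open-loop unstable) system."""
    return TRUE_A * x + TRUE_B * u + rng.gauss(0.0, TRUE_NOISE)

def collect_data(n, rng):
    """Drive the system with random inputs and log (x, u, x_next) transitions."""
    data, x = [], 0.0
    for _ in range(n):
        u = rng.uniform(-1.0, 1.0)
        x_next = true_step(x, u, rng)
        data.append((x, u, x_next))
        x = x_next if abs(x_next) < 5.0 else 0.0  # reset if the state drifts too far
    return data

def fit_forward_model(data):
    """Least-squares fit of x_next ~ a*x + b*u, plus the residual noise std."""
    sxx = sum(x * x for x, _, _ in data)
    sxu = sum(x * u for x, u, _ in data)
    suu = sum(u * u for _, u, _ in data)
    sxy = sum(x * y for x, _, y in data)
    suy = sum(u * y for _, u, y in data)
    det = sxx * suu - sxu * sxu          # determinant of the 2x2 normal equations
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    var = sum((y - a * x - b * u) ** 2 for x, u, y in data) / len(data)
    return a, b, var ** 0.5

def simulated_cost(k, model, horizon=30, rollouts=5):
    """Average quadratic cost of the controller u = -k*x in the learned model."""
    a, b, sigma = model
    total = 0.0
    for seed in range(rollouts):         # same seeds for every k, so comparisons are fair
        rng = random.Random(seed)
        x = 1.0
        for _ in range(horizon):
            u = -k * x
            total += x * x + 0.1 * u * u
            x = a * x + b * u + rng.gauss(0.0, sigma)
    return total / rollouts

def optimize_controller(model):
    """Grid search over the feedback gain, evaluated only in simulation."""
    candidates = [i / 10.0 for i in range(0, 41)]
    return min(candidates, key=lambda k: simulated_cost(k, model))

model = fit_forward_model(collect_data(200, random.Random(0)))
best_k = optimize_controller(model)
print("learned model (a, b, sigma):", model)
print("selected gain k:", best_k)
```

The controller never touches the real system during optimization; only the learned model is queried. Reusing the same random seeds for every candidate gain makes the simulated comparison deterministic, which is one common way to keep policy search in a stochastic model well behaved.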
The video illustrates Section 15.2.5 -- Applications of Model Learning, Springer Handbook of Robotics, 2nd ed. (2016).
Reference: A.Y. Ng, A. Coates, M. Diel, V. Ganapathi, J. Schulte, B. Tse, E. Berger, E. Liang: Autonomous inverted helicopter flight via reinforcement learning, IX Int. Symp. Exp. Robot. 2004, Springer Tract. Adv. Robot. 21, 363-372 (2006)