An Application of Reinforcement Learning to Aerobatic Helicopter Flight
  Greg McChesney, Texas Tech University
  Apr 08, 2009, CS5331: Autonomous Mobile Robots
Overview
  Creating a robot that can fly autonomously
  Software developed at Stanford's AI lab
  The paper is slightly outdated: many new maneuvers have been demonstrated since
Learning Approach
  Apprenticeship learning
    Collect data from a human pilot flying the maneuver (multiple times)
    Learn a model from the data
    Find a controller that performs the maneuver in simulation under the learned model
    Test on the helicopter (pray it doesn’t crash)
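The "learn a model from the data" step can be illustrated with a minimal sketch, assuming a linear model s[t+1] ≈ A s[t] + B u[t] fitted by least squares. The function name `fit_linear_model`, the two-dimensional toy system, and the synthetic "flight log" are all illustrative assumptions, not the model or data used in the paper.

```python
import numpy as np


def fit_linear_model(states, inputs):
    """Least-squares fit of s[t+1] ~ A s[t] + B u[t] from one logged trajectory.

    states: array of shape (T + 1, n); inputs: array of shape (T, m).
    """
    regressors = np.hstack([states[:-1], inputs])   # shape (T, n + m)
    targets = states[1:]                            # shape (T, n)
    theta, *_ = np.linalg.lstsq(regressors, targets, rcond=None)
    n = states.shape[1]
    A = theta[:n].T   # maps the current state to the next state
    B = theta[n:].T   # maps the current input to the next state
    return A, B


# Synthetic stand-in for pilot data (assumption: noise-free toy dynamics).
rng = np.random.default_rng(0)
A_true = np.array([[1.0, 0.1], [0.0, 0.95]])
B_true = np.array([[0.0], [0.1]])
T = 200
inputs = rng.normal(size=(T, 1))
states = np.zeros((T + 1, 2))
for t in range(T):
    states[t + 1] = A_true @ states[t] + B_true @ inputs[t]

A_hat, B_hat = fit_linear_model(states, inputs)
print(np.round(A_hat, 3))
print(np.round(B_hat, 3))
```

Because the toy data are noise-free, the fit recovers the generating matrices; real flight data would be noisy and would need several demonstrations, as the slide notes.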
Helicopters
  State
    Position
    Orientation
    Velocity
    Angular velocity
  Controlled with 4 inputs
    Cyclic pitch (longitudinal and lateral)
    Main rotor collective pitch
    Tail rotor pitch
  Subtract the effect of gravity from the data when learning the model
Controller Design
  Use a Markov decision process (MDP)
  Sextuple (S, A, T, H, s(0), R)
    S: set of states
    A: set of actions (inputs)
    T: dynamics model, a set of probability distributions over the next state
    H: horizon, the number of time steps of interest
    s(0): initial state
    R: reward function
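To make the sextuple concrete, here is a minimal sketch of a finite-horizon MDP as a Python data structure, with a rollout that sums rewards over the horizon. The class `FiniteHorizonMDP`, the toy one-dimensional problem, and the policy `halve_the_state` are hypothetical names introduced for illustration; S and A are left implicit as plain floats.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np


@dataclass
class FiniteHorizonMDP:
    """Holds (T, H, s(0), R); S and A are implied by the value types."""
    sample_next_state: Callable[[float, float, np.random.Generator], float]  # T
    reward: Callable[[float, float], float]                                  # R
    horizon: int                                                             # H
    initial_state: float                                                     # s(0)


def rollout_return(mdp, policy, rng):
    """Run one episode and return the sum of rewards over the horizon."""
    state = mdp.initial_state
    total = 0.0
    for t in range(mdp.horizon):
        action = policy(state, t)
        total += mdp.reward(state, action)
        state = mdp.sample_next_state(state, action, rng)
    return total


def make_toy_mdp(noise_std):
    """Hypothetical 1-D problem: drive a scalar state toward zero."""
    return FiniteHorizonMDP(
        sample_next_state=lambda s, a, rng: s + a + noise_std * rng.normal(),
        reward=lambda s, a: -(s ** 2),
        horizon=3,
        initial_state=2.0,
    )


def halve_the_state(state, t):
    """Hypothetical policy: push the state halfway toward zero each step."""
    return -0.5 * state


rng = np.random.default_rng(0)
print(rollout_return(make_toy_mdp(noise_std=0.1), halve_the_state, rng))
```

The dynamics are sampled rather than computed deterministically because T is a set of probability distributions over the next state, not a single next state.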
Differential Dynamic Programming (DDP)
  Compute a linear approximation of the dynamics around the current trajectory
  Compute the optimal solution to the resulting linear quadratic regulator (LQR) problem
  Must take the error state into account
  Add a cost for changes in the inputs (needed in real flight tests)
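The LQR step inside DDP can be sketched as a backward Riccati recursion. This is a minimal illustration for fixed, time-invariant linear dynamics x[t+1] = A x[t] + B u[t] and a quadratic cost (the negative of a quadratic reward); full DDP would re-linearize around the current trajectory and repeat. The function name `lqr_backward_pass` and the double-integrator example are assumptions made for illustration.

```python
import numpy as np


def lqr_backward_pass(A, B, Q, R, Q_final, horizon):
    """Finite-horizon LQR gains K[0..H-1] for the control law u[t] = -K[t] x[t].

    Minimizes sum_t (x'Qx + u'Ru) + x_H' Q_final x_H.
    """
    P = Q_final               # cost-to-go matrix at the final time step
    gains = []
    for _ in range(horizon):  # sweep backward in time
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains.reverse()           # gains[t] now corresponds to time step t
    return gains


# Toy example (assumption): a double integrator with a 0.1 s time step.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)
R = 0.1 * np.eye(1)
gains = lqr_backward_pass(A, B, Q, R, Q, horizon=100)

x = np.array([1.0, 0.0])      # start 1 unit away from the target, at rest
for K in gains:
    x = A @ x - B @ (K @ x)   # apply u = -K x
final_norm = float(np.linalg.norm(x))
print(final_norm)
```

Penalizing changes in the inputs, as the slide mentions, could be handled in this framework by appending the previous input to the state vector so that the change appears in the quadratic cost.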
DDP (Continued)
  2 phases
    Run DDP to find an open-loop input sequence
    Run DDP again, refining the inputs as deviations from the nominal open-loop input sequence
  Integral control: accounts for wind and errors in the model
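Why an integral term helps against wind can be seen in a small simulation. This is a hedged sketch with hand-picked gains on a hypothetical one-dimensional system, not the DDP controller from the paper: a constant disturbance leaves a steady offset under proportional feedback alone, and the offset disappears once the accumulated error is fed back as well.

```python
def simulate(kp, ki, wind, dt=0.05, steps=2000):
    """Simulate x' = u + wind with u = -kp * x - ki * z, where z integrates x.

    Returns the final position error x. All parameters are illustrative.
    """
    x = 0.0   # position error
    z = 0.0   # integrated position error
    for _ in range(steps):
        u = -kp * x - ki * z
        x, z = x + dt * (u + wind), z + dt * x
    return x


offset_without_integral = simulate(kp=2.0, ki=0.0, wind=0.5)
offset_with_integral = simulate(kp=2.0, ki=1.0, wind=0.5)
print(offset_without_integral)   # settles near wind / kp
print(offset_with_integral)      # settles near zero
```

Without the integral term the error settles where the proportional input exactly cancels the wind, which is wind / kp. With it, the integrated error keeps growing until the input cancels the wind at zero error.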
Rewards
  24 features
  Used inverse reinforcement learning
  Rewards from inverse reinforcement learning usually did not produce the correct result
  Took the inverse reinforcement learning results and manually tuned them to get good results
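A reward built from features can be sketched as a weighted sum. The example below assumes quadratic features of the state error, the inputs, and the change in inputs; the specific features, their number, and the weights are placeholders for illustration, not the 24 features or the tuned weights from the paper. The inverse reinforcement learning step that would propose the weights is not shown.

```python
import numpy as np


def quadratic_features(state_error, inputs, prev_inputs):
    """Stack squared state errors, squared inputs, and squared input changes."""
    return np.concatenate([
        state_error ** 2,
        inputs ** 2,
        (inputs - prev_inputs) ** 2,
    ])


def reward(weights, state_error, inputs, prev_inputs):
    """Reward is the negative weighted sum of the features (higher is better)."""
    features = quadratic_features(state_error, inputs, prev_inputs)
    return -float(weights @ features)


# Placeholder weights (assumption), standing in for hand-tuned values.
weights = np.array([1.0, 0.5, 2.0, 3.0])
example = reward(
    weights,
    state_error=np.array([1.0, 2.0]),
    inputs=np.array([1.0]),
    prev_inputs=np.array([0.0]),
)
print(example)
```

Manual tuning, as described on the slide, amounts to adjusting the entries of `weights` and checking the resulting flight behavior.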
Helicopter
  Xcell Tempest
    54” long
    19” high
    13 lbs
    Two-stroke engine
  Orientation sensors
  GPS: doesn’t work during flips
Flip
Roll
Tail-In Funnel
Nose-In Funnel
Questions
  Motivations / who pays for it?
    I can see applications in the defense sector
    DARPA
  Could more maneuvers be done just by changing some parameters?
    Probably not: the controller is learned based on a model, so you would need to create a new model
More Questions
  What’s the relationship between reinforcement learning and MDPs?
    Not sure
  Could a helicopter like this operate in the West Texas wind storms?
Fun Stuff
  Videos: dxqn0fcnE
  Helicopter: elicopterkits/1025_Spectra_G/1025_kit_main.asp