1 An Application of Reinforcement Learning to Aerobatic Helicopter Greg McChesney Texas Tech University Apr 08, 2009 CS5331: Autonomous.


1 An Application of Reinforcement Learning to Aerobatic Helicopter
Greg McChesney, Texas Tech University
Greg.mcchesney@ttu.edu
Apr 08, 2009, CS5331: Autonomous Mobile Robots

2 Overview
• Creating a robot that can fly autonomously
• Software developed at Stanford as part of their AI lab
• This paper is slightly outdated, as many new maneuvers have been created since

3 Learning Approach
• Apprenticeship
  - Collect data from a human pilot attempting the maneuver (multiple times)
  - Learn a model from the data
  - Find a controller that performs the maneuver in simulation, based on the model
  - Test on the helicopter (pray it doesn’t crash)
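The "learn a model from the data" step can be illustrated with a very small sketch. It assumes a scalar linear model x[t+1] = a*x[t] + b*u[t] fitted by least squares to one logged trajectory; the function names, the scalar model, and the sample numbers are hypothetical stand-ins, not the model used in the paper.

```python
def fit_linear_model(states, inputs):
    """Least-squares fit of x[t+1] = a*x[t] + b*u[t] from one logged trajectory."""
    # Accumulate the entries of the 2x2 normal equations.
    sxx = sxu = suu = sxy = suy = 0.0
    for x, u, x_next in zip(states[:-1], inputs, states[1:]):
        sxx += x * x
        sxu += x * u
        suu += u * u
        sxy += x * x_next
        suy += u * x_next
    det = sxx * suu - sxu * sxu
    if abs(det) < 1e-12:
        raise ValueError("data does not excite the system enough to identify a and b")
    # Solve the normal equations with Cramer's rule.
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b


def simulate(a, b, x0, inputs):
    """Roll the scalar model forward; stands in for logged flight data here."""
    xs = [x0]
    for u in inputs:
        xs.append(a * xs[-1] + b * u)
    return xs


# Hypothetical "pilot" inputs and the states they produce under a known model.
pilot_inputs = [1.0, -0.5, 0.25, 0.8, -1.0, 0.3]
logged_states = simulate(0.9, 0.5, 1.0, pilot_inputs)
a_hat, b_hat = fit_linear_model(logged_states, pilot_inputs)
```

Because the sample data is noise-free, the fit recovers the generating coefficients; with real flight logs the same regression would only give an approximation.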

4 Helicopter State
• Position
• Velocity
• Angular velocity
• Controlled with 4 input dimensions, among them:
  - Cyclic pitch
  - Tail rotor
• Take gravity out when calculating the model

5 Controller Design
• Use a Markov decision process (MDP)
• Sextuple (S, A, T, H, s(0), R)
  - S: set of states
  - A: set of actions (inputs)
  - T: dynamics model, a set of probability distributions for the next state
  - H: horizon, or number of time steps of interest
  - s(0): initial state
  - R: reward function
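One way to picture the sextuple is as a plain data container plus a rollout loop. The sketch below is an assumption-laden toy: the class name, the 5-cell chain problem, the reward taking both state and action, and the sampling-style transition function are all hypothetical and chosen only to make the six components concrete.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FiniteHorizonMDP:
    """Container mirroring the sextuple (S, A, T, H, s(0), R)."""
    states: List[int]                                      # S
    actions: List[int]                                     # A
    transition: Callable[[int, int, random.Random], int]   # T, as a sampler
    horizon: int                                           # H
    initial_state: int                                     # s(0)
    reward: Callable[[int, int], float]                    # R (assumed R(s, a))


def make_chain_mdp(slip: float = 0.1) -> FiniteHorizonMDP:
    """Hypothetical toy problem: reach and hold cell 2 on a 5-cell line."""

    def transition(s, a, rng):
        if rng.random() < slip:          # with probability `slip` the move fails
            return s
        return min(max(s + a, 0), 4)     # otherwise move, clipped to the line

    def reward(s, a):
        return -float((s - 2) ** 2) - 0.1 * a * a

    return FiniteHorizonMDP(list(range(5)), [-1, 0, 1], transition, 5, 0, reward)


def rollout(mdp, policy, seed=0):
    """Sum of rewards collected over H steps, starting from s(0)."""
    rng = random.Random(seed)
    s, total = mdp.initial_state, 0.0
    for _ in range(mdp.horizon):
        a = policy(s)
        total += mdp.reward(s, a)
        s = mdp.transition(s, a, rng)
    return total


def toward_target(s):
    """Hand-written policy: step toward cell 2, then stay."""
    return (s < 2) - (s > 2)


deterministic_return = rollout(make_chain_mdp(slip=0.0), toward_target)
```

A controller-design method then searches for the policy that maximizes the expected value of this sum.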

6 Differential Dynamic Programming (DDP)
• Compute the linear approximation of the dynamics
• Compute the optimal solution to the resulting linear quadratic regulator (LQR)
  - Must take the error state into account
  - Cost for change in input: needed in real testing
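The LQR step on this slide can be sketched for the simplest possible case. The code below assumes a scalar, time-invariant linear model and a quadratic cost, and runs the backward Riccati recursion; it covers only the LQR solve, not the relinearization loop of full DDP or the cost on input changes, and all names and numbers are illustrative.

```python
def lqr_gains(a, b, q, r, horizon):
    """Backward Riccati recursion for x[t+1] = a*x[t] + b*u[t].

    Minimizes the sum over t < horizon of q*x[t]**2 + r*u[t]**2, plus the
    terminal cost q*x[horizon]**2. Returns (gains, p0), where u[t] = -gains[t]*x[t]
    and the optimal cost from x[0] is p0 * x[0]**2.
    """
    p = q  # cost-to-go coefficient at the final time step
    gains = [0.0] * horizon
    for t in reversed(range(horizon)):
        k = (a * b * p) / (r + b * b * p)   # optimal feedback gain at step t
        p = q + a * a * p - a * b * p * k   # cost-to-go coefficient at step t
        gains[t] = k
    return gains, p


def closed_loop_cost(a, b, q, r, gains, x0):
    """Simulate u[t] = -gains[t]*x[t] and add up the quadratic cost."""
    x, cost = x0, 0.0
    for k in gains:
        u = -k * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost + q * x * x


gains, p0 = lqr_gains(a=1.1, b=0.5, q=1.0, r=0.1, horizon=20)
x0 = 2.0
simulated = closed_loop_cost(1.1, 0.5, 1.0, 0.1, gains, x0)
```

Here `x` plays the role of the error state: the deviation from the target trajectory that the controller drives toward zero.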

7 DDP, Continued
• 2 phases
  - Use DDP to find an open-loop input sequence
  - Use DDP again, refining the inputs as a deviation from the nominal open-loop input sequence
• Integral control: takes wind and errors in the model into account
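Why an integral term helps against wind can be seen in a toy simulation. The slide does not say how the integral term enters the DDP formulation, so the sketch below swaps in a hand-set proportional gain and integral gain around a nominal input; the plant, the gains, and the constant wind value are all assumptions made for illustration.

```python
def track(k_p, k_i, wind=0.05, steps=400):
    """Final tracking error of a scalar plant under a constant disturbance.

    Plant (hypothetical): x[t+1] = x[t] + 0.1*u[t] + wind.
    Control: u = u_nom - k_p*error - k_i*(running sum of past errors).
    """
    x_nom, u_nom = 0.0, 0.0      # nominal (open-loop) state and input
    x, integral = 1.0, 0.0       # start away from the nominal state
    for _ in range(steps):
        error = x - x_nom
        u = u_nom - k_p * error - k_i * integral   # deviation from nominal input
        integral += error                          # accumulate the error
        x = x + 0.1 * u + wind
    return x - x_nom


error_without_integral = track(k_p=2.0, k_i=0.0)
error_with_integral = track(k_p=2.0, k_i=0.5)
```

With proportional feedback alone the constant wind leaves a steady offset; adding the integral term removes it, which is the effect the slide attributes to integral control.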

8 Rewards
• 24 features
• Used inverse reinforcement learning
• Rewards from inverse reinforcement learning usually did not produce the correct result
• Took the inverse reinforcement learning results and manually tuned them to get good results
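A reward built from features is commonly written as a weighted sum, with the weights being what inverse reinforcement learning proposes and what is then tuned by hand. The slide does not list the 24 features, so the quadratic penalty terms, function names, and weights below are hypothetical stand-ins.

```python
def quadratic_features(error_state, inputs, prev_inputs):
    """Hypothetical feature vector: squared errors, inputs, and input changes."""
    feats = [e * e for e in error_state]
    feats += [u * u for u in inputs]
    feats += [(u - p) ** 2 for u, p in zip(inputs, prev_inputs)]
    return feats


def linear_reward(weights, features):
    """Reward as a negative weighted sum of penalty features."""
    if len(weights) != len(features):
        raise ValueError("need exactly one weight per feature")
    return -sum(w * f for w, f in zip(weights, features))


features = quadratic_features(error_state=[0.5, -1.0], inputs=[0.2], prev_inputs=[0.0])
weights = [1.0, 2.0, 0.5, 10.0]   # made-up values; tuning would adjust these
r = linear_reward(weights, features)
```

Manual tuning in this picture means adjusting individual entries of `weights` until the resulting controller flies the maneuver well.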

9 Helicopter
• Xcell Tempest
• 54” long
• 19” high
• 13 lbs
• Two-stroke engine
• Orientation sensors
• GPS: doesn’t work during flips

11 Flip

12 Roll

13 Tail-In Funnel

14 Nose-In Funnel

15 Questions
• Motivations / who pays for it?
  - I can see applications in the defense sector (DARPA)
• Could more maneuvers be done just by changing some parameters?
  - Probably not: the controller is learned based on a model, so you would need to create a new model

16 More Questions
• What's the relationship between reinforcement learning and MDPs?
  - Not sure
• Could a helicopter like this operate in the West Texas wind storms?

17 Fun Stuff
• Videos:
  - http://heli.stanford.edu/
  - http://www.youtube.com/watch?v=VCdxqn0fcnE
• Helicopter:
  - http://www.miniatureaircraftusa.com/helicopterkits/1025_Spectra_G/1025_kit_main.asp

