Training an Adaptive Critic Flight Controller


1 Training an Adaptive Critic Flight Controller
Silvia Ferrari
Advisor: Prof. Robert F. Stengel
Princeton University
FAA/NASA Joint University Program on Air Transportation, Princeton University, Princeton, NJ
April 5-6, 2001

2 Introduction
Classical/neural synthesis of control systems
  Prior knowledge
  Adaptive control and artificial neural networks
Adaptive critics
  Learn in real time
  Cope with noise
  Cope with many variables
  Plan over time in a complex way
  ...
Adaptation takes place during every time interval:
  Action network takes immediate control action
  Critic network estimates projected cost
Notes: Discuss, as in the GNC paper, how gain-scheduled controllers have been used to control nonlinear systems: why (i.e., changing operating conditions), what they consist of (briefly), and where they fail. Mention aircraft. Explain what we propose and what the objectives of the general approach are: one particular feature of NNs is that they can learn in an on-line fashion (therefore allowing for...); one architecture that allows for this is adaptive critics (brain-like capabilities), which have the following characteristics.. Begin to introduce why we start with dynamic programming and then move on to adaptive critics.. Critic networks remove the learning process one step from the control network.
KEYWORDS: reinforcement learning, dynamic programming

3 Motivation
Provide full envelope control
Multiphase learning:
  Pre-training phase, motivated by corresponding linear controller
  On-line training phase, during simulations or testing
On-line training accounts for:
  Differences between actual and assumed dynamic models
  Nonlinear effects not captured in linearizations
Potential applications:
  Incorporate pilot's knowledge into controller a priori
  Uninhabited air vehicle control
  Aerobatic flight control
Notes: Introduce the PI structure and the idea for the PINN. Motivation…
  1- .. by replacing the gains with neural networks
  2- Explain well what is covered here and what is instead the ultimate goal; this is why the training is called "pre-training". Performance baselines for this phase are established by an equivalent linear model..
  3- The overall procedure provides improved global control w.r.t. gain scheduling, because ...
  4- This phase already defines a global nonlinear neural network controller with less effort than.. (also compare to previous such methods) and is demonstrated here
  5- On-line training improves control response for large, coupled motions

4 Table of Contents
Aircraft control design approach
Initialization, or pre-training phase
Adaptive critic neural network controller
On-line training
Resilient backpropagation

5 Aircraft Control Design Approach
Modeling → Linearizations → Linear Control → Initialization → On-line Training → Full Envelope Control!

6 Linear control design:
Linearizations:
[Figure: Aircraft Flight Envelope; axes: Altitude (m) versus Velocity (m/s)]
Linear control design:
  Longitudinal
  Lateral-directional

7 Linear Proportional-Integral Controller
Closed-loop stability:
[Block diagram: blocks CF, CB, CI, Hx, Hu and AIRCRAFT; signals yc, ys(t), x(t), u(t)]
Omitting Δ's, for simplicity: yc = desired output, (xc, uc) = set point.
Notes: Deltas are omitted for simplicity. Study the formulas and notation from the paper in case of questions..

8 Proportional-Integral Neural Network Controller
[Block diagram: blocks NNF, NNB, NNI, NNC, SVG, CSG and AIRCRAFT; signals yc, ys(t), x(t), xc, u(t), uc, uB(t), uI(t), e, a]
λ = ∂V/∂Δxa
Notes: Orange lines indicate the main training lines and show the flow of the main information; at each time step, training occurs along these lines. Other flows are also possible, such as the control passed from the action network to the critic for an AD design, but they are not shown for simplicity (see on-line training slides..). Note that the action network is the sum of NNB and NNI. Note that the total control is the sum of uc and the NNs' contributions..
Where:
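The note that the total control is the sum of uc and the network contributions can be sketched as follows. This is only an illustration under stated assumptions: `tiny_net`, `total_control`, the weights and the dimensions are hypothetical, the scheduling input a is omitted for brevity, and the networks are taken to be bias-free so that a zero deviation produces a zero correction. None of this is taken from the presentation itself.

```python
import math

def tiny_net(weights_in, weights_out):
    """Hypothetical stand-in for NNB or NNI: one tanh hidden layer, no biases."""
    def net(inputs):
        hidden = [math.tanh(sum(w * v for w, v in zip(row, inputs)))
                  for row in weights_in]
        return [sum(w * h for w, h in zip(row, hidden)) for row in weights_out]
    return net

def total_control(u_c, nn_b, nn_i, dx, xi):
    """Total control u = uc + uB + uI (assumed sign convention: plain sum)."""
    u_b = nn_b(dx)   # feedback contribution from the state deviation
    u_i = nn_i(xi)   # integral contribution from the integrated output error
    return [c + b + i for c, b, i in zip(u_c, u_b, u_i)]

# Illustrative weights only: two state deviations, one integral state, one control.
nn_b = tiny_net([[0.5, -0.2], [0.1, 0.3]], [[-1.0, 0.4]])
nn_i = tiny_net([[0.8]], [[-0.6]])

u_trim = total_control([0.25], nn_b, nn_i, [0.0, 0.0], [0.0])
u_pert = total_control([0.25], nn_b, nn_i, [0.1, -0.05], [0.02])
```

With zero deviations the sketch returns the set-point control uc unchanged; any nonzero deviation adds the two network corrections on top of it.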

9 Algebraic Neural Network Pre-training Phase
Feedback: CB|L → NNB|L, CB|LD → NNB|LD
Integral Error: CI|L → NNI|L, CI|LD → NNI|LD
Critic: P|L → NNC|L, P|LD → NNC|LD
Combine longitudinal and lateral-directional networks: NNB|L, NNB|LD → NNB, etc. ...
Obtain action network: NNB + NNI → NNA

10 Comparison of Neural Network and Linear Controllers Between Training Points
Flight condition: (14,000 m; 220 m/s)
[Figure: two panels comparing NN control with linear control versus time (sec). "Velocity and Climb Angle Command": velocity (m/s) and climb angle (deg). "Roll and Sideslip Angle Command": roll angle (deg) and sideslip angle (deg).]

11 Adaptive Critic Implementation: Action Network On-line Training
Train action network, at time t, holding the critic parameters fixed
[Block diagram: blocks NNA (NNB + NNI), Aircraft Model, NNC, Utility Function, Optimality Condition, NNA Target; signals x(t), x(t+1), a]
[Balakrishnan and Biega, 1996]
Notes: The aircraft model block indicates that a discretized model of the airplane is used at that location, but in reality other information can be used as well, for instance an integrator to produce xi(t); similarly for the other blocks.. They are only indicative of what is being used.

12 Adaptive Critic Implementation: Critic Network On-line Training
Train critic network, at time t, holding the action parameters fixed
[Block diagram: blocks NNA, Aircraft Model, Utility Function, NNC, NNC Target; signals x(t), x(t+1), a]
[Balakrishnan and Biega, 1996]
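The alternation described on the last two slides (update the action network with the critic held fixed, then the critic with the action held fixed) can be illustrated on a toy problem. The sketch below is not the presented controller. It assumes a scalar linear model x(t+1) = a·x + b·u, a quadratic utility U = ½(q·x² + r·u²), an action "network" reduced to a single gain (u = -k·x), and a critic reduced to a single gain that approximates dV/dx (λ = p·x). With one parameter each, every "training" step is an exact fit to its target rather than a gradient-based update. All names and numbers are hypothetical.

```python
def adaptive_critic_lq(a, b, q, r, iterations=200):
    """Alternate action and critic updates for a scalar linear-quadratic toy problem."""
    k = 0.0  # action gain: u = -k * x
    p = 0.0  # critic gain: lam(x) = p * x, an estimate of dV/dx
    for _ in range(iterations):
        # Action update, critic held fixed. The optimality condition
        #   dU/du + (dx_next/du) * lam(x_next) = 0
        # becomes r*u + b*p*(a*x + b*u) = 0, which gives the target gain:
        k = a * b * p / (r + b * b * p)

        # Critic update, action held fixed. The critic target is
        #   lam(x) = dU/dx + (du/dx)*dU/du + (dx_next/dx)*lam(x_next)
        # which, with u = -k*x and x_next = (a - b*k)*x, gives:
        a_cl = a - b * k
        p = q + r * k * k + a_cl * a_cl * p
    return k, p

a, b, q, r = 1.05, 0.5, 1.0, 1.0
k, p = adaptive_critic_lq(a, b, q, r)

# At convergence p should satisfy the discrete-time Riccati equation.
riccati_residual = p - (q + a * a * p - (a * b * p) ** 2 / (r + b * b * p))
closed_loop_pole = a - b * k
```

In this toy setting the alternation reduces to the familiar Riccati recursion, so the residual gives a simple check that the two updates are consistent with each other.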

13 On-line Neural Network Training Goal
Given a target, t(p), for the network output, z(p): with network parameters, w, provided by the initialization phase.
Scaling effect:
[Figure: network diagram with labels p, v, w, s, z and E]

14 Comparison of Neural Network Training Algorithms
Columns: Technique; Speed; Implementation Complexity; Memory Requirement; Main Drawbacks
Backpropagation: Poor; Low; Small; Scaling
Levenberg-Marquardt: Excellent; Medium; Large
Extended Kalman Filter: (Highest); High
Resilient: Medium-High; Medium-; Local convergence

15 Resilient Backpropagation
NN architecture and w (initialization)
* * *
Store w and Δ
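For readers unfamiliar with resilient backpropagation, the general idea can be sketched in a few lines. The update uses only the sign of each gradient component and adapts a separate step size Δ per weight, which is why it is insensitive to gradient scaling. The code below follows a commonly used variant without weight backtracking (often called iRprop-); it is a generic sketch, not the algorithm shown on the slide, and the function name, constants and test problem are assumptions chosen for illustration.

```python
def sign(x):
    return (x > 0) - (x < 0)

def rprop_minimize(grad, w, steps=200, delta0=0.1,
                   eta_plus=1.2, eta_minus=0.5,
                   delta_min=1e-6, delta_max=50.0):
    """Minimize a function from its gradient using sign-based Rprop updates."""
    w = list(w)
    deltas = [delta0] * len(w)   # one step size per weight
    prev_g = [0.0] * len(w)      # gradient stored from the previous iteration
    for _ in range(steps):
        g = list(grad(w))
        for i in range(len(w)):
            s = g[i] * prev_g[i]
            if s > 0:
                # Same sign as before: grow the step size.
                deltas[i] = min(deltas[i] * eta_plus, delta_max)
            elif s < 0:
                # Sign change (a minimum was overshot): shrink the step size
                # and skip the weight update for this iteration.
                deltas[i] = max(deltas[i] * eta_minus, delta_min)
                g[i] = 0.0
            w[i] -= sign(g[i]) * deltas[i]
            prev_g[i] = g[i]     # store the gradient alongside w and deltas
    return w

# Badly scaled quadratic: E = 0.5 * (1000 * w0**2 + 0.001 * w1**2)
def scaled_grad(w):
    return [1000.0 * w[0], 0.001 * w[1]]

w_final = rprop_minimize(scaled_grad, [1.0, 1.0])
```

Because only gradient signs are used, both weights follow the same trajectory here even though their gradients differ by six orders of magnitude, a case in which plain gradient descent with a single learning rate would struggle.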

16 Resilient Backpropagation Algorithm Performance
Adaptive critics neural network controller test case: Action Network
[Figure: mean-squared error performance versus epochs]

17 Summary and Conclusions
Adaptive critic flight controller:
  Improve aircraft control performance under extreme conditions
  Systematic approach for designing nonlinear control systems, innovative neural network training techniques
Adaptive critic neural network controller implementation
  Algebraic pre-training based on a priori knowledge
  On-line training during simulations (severe conditions)
Future Work:
  Testing: acrobatic maneuvers, severe operating conditions, coupling and nonlinear effects!
Notes: The aircraft nonlinear model is in terms of the given state and control vectors. In this and the following slides, the body frame, the inertial frame and all of the significant aircraft angles are introduced to illustrate the complexity of the problem at hand.

