Download presentation
Presentation is loading. Please wait.
Published byGodwin Abraham Robinson Modified over 9 years ago
1
Adaptive Critic Design for Aircraft Control Silvia Ferrari Advisor: Prof. Robert F. Stengel Princeton University FAA/NASA Joint University Program on Air Transportation, MIT, Cambridge, MA January 18-19, 2001
2
Table of Contents Introduction and motivation Optimal control problem Dynamic programming formulation Adaptive critic designs Proportional-Integral Neural Network Controller Pre-training of action and critic networks On-line training of action and critic networks Summary and Conclusions
3
Introduction Gain scheduled linear controllers for nonlinear systems Classical/neural synthesis of control systems Prior knowledge Adaptive control and artificial neural networks Adaptive critics Learn in real time Cope with noise Cope with many variables Plan over time in a complex way... Action network takes immediate control action Critic network estimates projected cost
4
Motivation for Neural Network-Based Controller Network of networks motivated by linear control structure Multiphase learning: Pre-training On-line training during piloted simulations or testing Improved global control Pre-training phase provides: A global neural network controller An excellent initialization point for on-line learning On-line training accounts for: Differences between actual and assumed dynamic models Nonlinear effects not captured in linearizations
5
Nonlinear Business Jet Aircraft Model Aircraft Equations of Motion: State vector: Control vector: Parameters: Output vector: XBXB ZBZB Thrust Drag Lift V mg p q r YBYB Bolza Type Cost Function Minimization:
6
Value Function Minimization The minimization of J can be imbedded in the following problem: Value Function for [t, t f ]: V Time [x(t f ), t f ] t0t0 ttftf J Time [x(t f ), t f ] t0t0 ttftf
7
Discretized Optimization Problem Approximate the equations of motion by a difference equation: Similarly, the cost function: The cost of operation during the last stage is: where:
8
The Principle of Optimality Similarly, the optimal cost over the last two intervals is given by: The optimal cost, from to, is then: By the principle of optimality, a b c Given V ab, if V abc is optimal from a to c, then V bc is optimal from b to c: V * abc = V ab + V * bc
9
A Recurrence Relationship of Dynamic Programming Then, for a k-stage process: Backward Dynamic Programming So, the optimal cost for a 2-stage process can be re-written as: b c e d f V cf V df V ef V bc V bd V be Begin at t f Move backward to t 0 Store all u and V Off-line process a b'
10
Forward Dynamic Programming Estimate cost-to-go function, Determine immediate control action Move forward in time, updating cost-to-go function On-line process V bc c b f a V ab Cost associated with going from t k to t f : Utility Estimated cost for Hereon, let t = t k, t + 1 = t k+1, etc...
11
Adaptive Critic Designs.. at a Glance! Legend: 'H': Heuristic 'DP': Dynamic Programming 'D': Dual 'P': Programming 'G': Global 'AD': Action Dependent HDP u NN A NN C x V x DHP NN A NN C x x u GDHP NN A NN C x x u AD u NN C NN A x x ADHDP ADDHP ADGDHP
12
Proportional-Integral Controller Closed-loop stability: y c = desired output, (x c,u c ) = set point. ycyc x(t)x(t) ys(t)ys(t) CICI CFCF CBCB HuHu HxHx + + + + - - - u(t)u(t) AIRCRAFT (t)(t) Omitting 's, for simplicity:
13
Assuming small perturbations, expand about nominal solution: where: The state variable is augmented to include the output error integral, (t), and the Proportional-Integral cost function is defined as: Linearized Business Jet Aircraft Model
14
where P is a Riccati matrix. Furthermore, From the Euler-Lagrange equations, the optimal * control law is: Linear Proportional-Integral Control Law Formulation it can be shown that the optimal value function is and its partial derivative with respect to x a is:
15
Where: Proportional-Integral Neural Network Controller ycyc x(t)x(t) + + + + - u(t)u(t) ucuc + - uB(t)uB(t) uI(t)uI(t) xcxc ys(t)ys(t) e a NN I NN F NN B SVG CSG AIRCRAFT NN C V/ x a (t)(t)
16
Feedback and Command-Integral (Action) Neural Network Pre-Training Feedback Requirements (pre-training phase): (R1) (R2) NN B accounts for regulation, u B = NN B ( x, a). For each operating point, k, a z output and inputs: ~ Command-Integral Requirements (pre-training phase): NN I provides dynamic compensation, u I = NN I ( , a). For each operating point, k, a z output and inputs: (R1) (R2)
17
Critic Neural Network Pre-Training Critic Requirements (pre-training phase): NN C estimates the value function derivatives, = NN C ( x a, a). For each operating point, k, a z output and training inputs: From the Proportional-Integral optimal value function derivatives: (R1) (R2)
18
Action and Critic Neural Network On-line Training by Dual Heuristic Programming The cost-to-go at time t, The same cost function is optimized in pre-training and in on-line learning The utility function is defined as: must be minimized w.r.t., for an estimated cost-to-go at (t + 1).
19
Optimality Conditions for Dual Heuristic Programming Differentiating both sides of the value function, V[ x a (t)], w.r.t. x a (t): Differentiating w.r.t. the control, Optimality equation:
20
NN C Aicraft Model (Discrete Time) Utility NN A xa(t)xa(t) a x a (t+1) Action Network On-line Training Train action network, at time t, holding the critic parameters fixed [Balakrishnan and Biega, 1996] NN B NN I + + uB(t)uB(t) uI(t)uI(t) a (t)(t)
21
Critic Network On-line Training Train critic network, at time t, holding the action parameters fixed [Balakrishnan and Biega, 1996] NN C Utility NN A xa(t)xa(t) a x a (t+1) NN C + Aicraft Model (Discrete Time)
22
Summary Aircraft optimal control problem Linear multivariable control structure Corresponding nonlinear controller Adaptive critic architecture: Action and critic networks Algebraic pre-training based on a-priori knowledge On-line training during simulations (severe conditions)
23
Conclusions and Future Work Nonlinear, adaptive, real-time controller: Maintain stability and robustness throughout flight envelope Improve aircraft control performance under extreme conditions Systematic approach for designing nonlinear control systems motivated by well-known linear control structures Innovative neural network training techniques Future Work: Adaptive critic architecture implementation Testing: acrobatic maneuvers, severe operating conditions, coupling and nonlinear effects
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.