Adaptive Critic Design for Aircraft Control Silvia Ferrari Advisor: Prof. Robert F. Stengel Princeton University FAA/NASA Joint University Program on Air.

Slides:



Advertisements
Similar presentations
Lect.3 Modeling in The Time Domain Basil Hamed
Advertisements


Neural Network Control of a Hypersonic Inlet Joint University Program Meeting April 05, 2001 Nilesh V. Kulkarni Advisors Prof. Minh Q. Phan Dartmouth College.
Robust and Efficient Control of an Induction Machine for an Electric Vehicle Arbin Ebrahim and Dr. Gregory Murphy University of Alabama.
Training an Adaptive Critic Flight Controller
Lecture 11: Recursive Parameter Estimation
President UniversityErwin SitompulModern Control 11/1 Dr.-Ing. Erwin Sitompul President University Lecture 11 Modern Control
Linear dynamic systems
 KHAIRUNNISA BINTI ABD AZIZ  MOHD SHAHRIL BIN IBRAHIM  MOHD SUFI B AB MAJID  MOHD SUPIAN B MOHD NAZIP  MOKTAR BIN YAAKOP.
280 SYSTEM IDENTIFICATION The System Identification Problem is to estimate a model of a system based on input-output data. Basic Configuration continuous.
Introduction to Digital Systems Saint-Petersburg State University Faculty of Applied Mathematics – Control Processes Lections 8 ─ 10 prof. Evgeny I. Veremey.
February 24, Final Presentation AAE Final Presentation Backstepping Based Flight Control Asif Hossain.
Empirical Virtual Sliding Target Guidance law Presented by: Jonathan Hexner Itay Kroul Supervisor: Dr. Mark Moulin.

CH 1 Introduction Prof. Ming-Shaung Ju Dept. of Mechanical Engineering NCKU.
Silvia Ferrari Princeton University
Performance Optimization of the Magneto-hydrodynamic Generator at the Scramjet Inlet Nilesh V. Kulkarni Advisors: Prof. Minh Q. Phan Dartmouth College.
Approximating the Algebraic Solution of Systems of Interval Linear Equations with Use of Neural Networks Nguyen Hoang Viet Michal Kleiber Institute of.
IFAC Control of Distributed Parameter Systems, July 20-24, 2009, Toulouse Inverse method for pyrolysable and ablative materials with optimal control formulation.
A Shaft Sensorless Control for PMSM Using Direct Neural Network Adaptive Observer Authors: Guo Qingding Luo Ruifu Wang Limei IEEE IECON 22 nd International.
Colorado Center for Astrodynamics Research The University of Colorado 1 STATISTICAL ORBIT DETERMINATION Satellite Tracking Example of SNC and DMC ASEN.
Introductory Control Theory CS 659 Kris Hauser.
4. Linear optimal Filters and Predictors 윤영규 ADSLAB.
Book Adaptive control -astrom and witten mark
20/10/2009 IVR Herrmann IVR: Introduction to Control OVERVIEW Control systems Transformations Simple control algorithms.
Optimal Nonlinear Neural Network Controllers for Aircraft Joint University Program Meeting October 10, 2001 Nilesh V. Kulkarni Advisors Prof. Minh Q. Phan.
Neural Network Based Online Optimal Control of Unknown MIMO Nonaffine Systems with Application to HCCI Engines OBJECTIVES  Develop an optimal control.
1 Adaptive, Optimal and Reconfigurable Nonlinear Control Design for Futuristic Flight Vehicles Radhakant Padhi Assistant Professor Dept. of Aerospace Engineering.
Optimal Feedback Control of the Magneto-hydrodynamic Generator for a Hypersonic Vehicle Nilesh V. Kulkarni Advisors: Prof. Minh Q. Phan Dartmouth College.
To clarify the statements, we present the following simple, closed-loop system where x(t) is a tracking error signal, is an unknown nonlinear function,
1 Adaptive Control Neural Networks 13(2000): Neural net based MRAC for a class of nonlinear plants M.S. Ahmed.
September Bound Computation for Adaptive Systems V&V Giampiero Campa September 2008 West Virginia University.
Statistical learning and optimal control: A framework for biological learning and motor control Lecture 4: Stochastic optimal control Reza Shadmehr Johns.
Advanced Control of Marine Power System
Akram Bitar and Larry Manevitz Department of Computer Science
Motivation Thus far we have dealt primarily with the input/output characteristics of linear systems. State variable, or state space, representations describe.
Structure and Synthesis of Robot Motion Topics in Control Subramanian Ramamoorthy School of Informatics 19 February, 2009.
Feedback Control Systems (FCS) Dr. Imtiaz Hussain URL :
Robotics Research Laboratory 1 Chapter 7 Multivariable and Optimal Control.
Adaptive Feedback Scheduling with LQ Controller for Real Time Control System Chen Xi.
A Semi-Blind Technique for MIMO Channel Matrix Estimation Aditya Jagannatham and Bhaskar D. Rao The proposed algorithm performs well compared to its training.
Nonlinear Predictive Control for Fast Constrained Systems By Ahmed Youssef.
CONTROL OF MULTIVARIABLE SYSTEM BY MULTIRATE FAST OUTPUT SAMPLING TECHNIQUE B. Bandyopadhyay and Jignesh Solanki Indian Institute of Technology Mumbai,
Lecture 5 Neural Control
Stochastic Optimal Control of Unknown Linear Networked Control System in the Presence of Random Delays and Packet Losses OBJECTIVES Develop a Q-learning.
Chapter 8: Adaptive Networks
Hazırlayan NEURAL NETWORKS Backpropagation Network PROF. DR. YUSUF OYSAL.
Reinforcement Learning for Intelligent Control Presented at Chinese Youth Automation Conference National Academy of Science, Beijing, 8/22/05 George G.

Adaptive Optimal Control of Nonlinear Parametric Strict Feedback Systems with application to Helicopter Attitude Control OBJECTIVES  Optimal adaptive.
دانشگاه صنعتي اميركبير دانشكده مهندسي پزشكي استاد درس دكتر فرزاد توحيدخواه بهمن 1389 کنترل پيش بين-دکتر توحيدخواه MPC Stability-2.
Effects of System Uncertainty on Adaptive-Critic Flight Control Silvia Ferrari Advisor: Prof. Robert F. Stengel Princeton University FAA/NASA Joint University.
CHEE825 Fall 2005J. McLellan1 Nonlinear Empirical Models.
Simulation and Experimental Verification of Model Based Opto-Electronic Automation Drexel University Department of Electrical and Computer Engineering.
Dynamic Neural Network Control (DNNC): A Non-Conventional Neural Network Model Masoud Nikravesh EECS Department, CS Division BISC Program University of.
Towards Adaptive Optimal Control of the Scramjet Inlet Nilesh V. Kulkarni Advisors: Prof. Minh Q. Phan Dartmouth College Prof. Robert F. Stengel Princeton.
Neural Networks The Elements of Statistical Learning, Chapter 12 Presented by Nick Rizzolo.
Computacion Inteligente Least-Square Methods for System Identification.
Silvia Ferrari and Mark Jensenius Department of Mechanical Engineering Duke University Crystal City, VA, September 28, 2005 Robust and.
A PID Neural Network Controller
Deep Feedforward Networks
Introduction to Soft Computing
Student: Hao Xu, ECE Department
The use of Neural Networks to schedule flow-shop with dynamic job arrival ‘A Multi-Neural Network Learning for lot Sizing and Sequencing on a Flow-Shop’
Homework 9 Refer to the last example.
Hafez Sarkawi (D1) Control System Theory Lab
NONLINEAR AND ADAPTIVE SIGNAL ESTIMATION
NONLINEAR AND ADAPTIVE SIGNAL ESTIMATION
Chapter 7 Inverse Dynamics Control
Akram Bitar and Larry Manevitz Department of Computer Science
Presentation transcript:

Adaptive Critic Design for Aircraft Control Silvia Ferrari Advisor: Prof. Robert F. Stengel Princeton University FAA/NASA Joint University Program on Air Transportation, MIT, Cambridge, MA January 18-19, 2001

Table of Contents Introduction and motivation Optimal control problem Dynamic programming formulation Adaptive critic designs Proportional-Integral Neural Network Controller Pre-training of action and critic networks On-line training of action and critic networks Summary and Conclusions

Introduction Gain scheduled linear controllers for nonlinear systems Classical/neural synthesis of control systems Prior knowledge Adaptive control and artificial neural networks Adaptive critics Learn in real time Cope with noise Cope with many variables Plan over time in a complex way... Action network takes immediate control action Critic network estimates projected cost

Motivation for Neural Network-Based Controller Network of networks motivated by linear control structure Multiphase learning: Pre-training On-line training during piloted simulations or testing Improved global control Pre-training phase provides: A global neural network controller An excellent initialization point for on-line learning On-line training accounts for: Differences between actual and assumed dynamic models Nonlinear effects not captured in linearizations

Nonlinear Business Jet Aircraft Model Aircraft Equations of Motion: State vector: Control vector: Parameters: Output vector: XBXB ZBZB Thrust Drag Lift V mg p q r   YBYB Bolza Type Cost Function Minimization:

Value Function Minimization The minimization of J can be imbedded in the following problem: Value Function for [t, t f ]: V Time  [x(t f ), t f ] t0t0 ttftf J Time  [x(t f ), t f ] t0t0 ttftf

Discretized Optimization Problem Approximate the equations of motion by a difference equation: Similarly, the cost function: The cost of operation during the last stage is: where:

The Principle of Optimality Similarly, the optimal cost over the last two intervals is given by: The optimal cost, from to, is then: By the principle of optimality, a b c Given V ab, if V abc is optimal from a to c, then V bc is optimal from b to c: V * abc = V ab + V * bc

A Recurrence Relationship of Dynamic Programming Then, for a k-stage process: Backward Dynamic Programming So, the optimal cost for a 2-stage process can be re-written as: b c e d f V cf V df V ef V bc V bd V be Begin at t f Move backward to t 0 Store all u and V Off-line process a b'

Forward Dynamic Programming Estimate cost-to-go function, Determine immediate control action Move forward in time, updating cost-to-go function On-line process V bc c b f a V ab Cost associated with going from t k to t f : Utility Estimated cost for Hereon, let t = t k, t + 1 = t k+1, etc...

Adaptive Critic Designs.. at a Glance! Legend: 'H': Heuristic 'DP': Dynamic Programming 'D': Dual 'P': Programming 'G': Global 'AD': Action Dependent HDP u NN A NN C x V x DHP NN A NN C x x u GDHP NN A NN C x x u AD u NN C NN A x x ADHDP ADDHP ADGDHP

Proportional-Integral Controller Closed-loop stability: y c = desired output, (x c,u c ) = set point. ycyc x(t)x(t) ys(t)ys(t) CICI CFCF CBCB HuHu HxHx u(t)u(t) AIRCRAFT (t)(t) Omitting  's, for simplicity:

Assuming small perturbations, expand about nominal solution: where: The state variable is augmented to include the output error integral,  (t), and the Proportional-Integral cost function is defined as: Linearized Business Jet Aircraft Model

where P is a Riccati matrix. Furthermore, From the Euler-Lagrange equations, the optimal * control law is: Linear Proportional-Integral Control Law Formulation it can be shown that the optimal value function is and its partial derivative with respect to  x a is:

Where: Proportional-Integral Neural Network Controller ycyc x(t)x(t) u(t)u(t) ucuc + - uB(t)uB(t) uI(t)uI(t) xcxc ys(t)ys(t) e a NN I NN F NN B SVG CSG AIRCRAFT NN C  V/  x a (t)(t)

Feedback and Command-Integral (Action) Neural Network Pre-Training Feedback Requirements (pre-training phase): (R1) (R2) NN B accounts for regulation,  u B = NN B (  x, a). For each operating point, k, a z output and inputs: ~ Command-Integral Requirements (pre-training phase): NN I provides dynamic compensation,  u I = NN I ( , a). For each operating point, k, a z output and inputs: (R1) (R2)

Critic Neural Network Pre-Training Critic Requirements (pre-training phase): NN C estimates the value function derivatives,  = NN C (  x a, a). For each operating point, k, a z output and training inputs: From the Proportional-Integral optimal value function derivatives: (R1) (R2)

Action and Critic Neural Network On-line Training by Dual Heuristic Programming The cost-to-go at time t, The same cost function is optimized in pre-training and in on-line learning The utility function is defined as: must be minimized w.r.t., for an estimated cost-to-go at (t + 1).

Optimality Conditions for Dual Heuristic Programming Differentiating both sides of the value function, V[  x a (t)], w.r.t.  x a (t): Differentiating w.r.t. the control, Optimality equation:

NN C Aicraft Model (Discrete Time) Utility NN A xa(t)xa(t) a  x a (t+1) Action Network On-line Training Train action network, at time t, holding the critic parameters fixed [Balakrishnan and Biega, 1996] NN B NN I + + uB(t)uB(t) uI(t)uI(t) a (t)(t)

Critic Network On-line Training Train critic network, at time t, holding the action parameters fixed [Balakrishnan and Biega, 1996] NN C Utility NN A xa(t)xa(t) a  x a (t+1) NN C + Aicraft Model (Discrete Time)

Summary Aircraft optimal control problem Linear multivariable control structure Corresponding nonlinear controller Adaptive critic architecture: Action and critic networks  Algebraic pre-training based on a-priori knowledge  On-line training during simulations (severe conditions)

Conclusions and Future Work Nonlinear, adaptive, real-time controller: Maintain stability and robustness throughout flight envelope Improve aircraft control performance under extreme conditions Systematic approach for designing nonlinear control systems motivated by well-known linear control structures Innovative neural network training techniques Future Work: Adaptive critic architecture implementation Testing: acrobatic maneuvers, severe operating conditions, coupling and nonlinear effects