Stochastic Optimal Control of Unknown Linear Networked Control System in the Presence of Random Delays and Packet Losses

Presentation transcript:

Student: Hao Xu, ECE Department
Faculty Advisor: Dr. Jagannathan Sarangapani, ECE Department

OBJECTIVES
Develop a Q-learning-based stochastic suboptimal controller for an unknown networked control system (NCS) with random delays and packet losses.
Develop an adaptive estimator (AE)-based stochastic optimal controller.
Investigate the effects of delays and packet losses on the stability of the NCS with unknown dynamics.

BACKGROUND
Networked control can reduce installation costs and increase productivity through the use of wireless communication technology.
The challenging problems in the control of network-based systems are network delays and packet losses. These effects not only degrade the performance of the NCS but can also destabilize it.
Approximate dynamic programming (ADP) techniques aim to solve optimal control problems for complex systems in a forward-in-time manner, without knowledge of the system dynamics.
Figure 1 The wireless networked control system
The proposed approach to optimal controller design combines Q-learning with an adaptive estimator (AE), whereas the suboptimal controller design uses the Q-learning scheme alone.
The delays and packet losses are incorporated in the dynamic model used for controller development.

Networked Control System Model
Networked control system representation: [equation not preserved in the transcript]
Figure 2 depicts a block diagram representation.
Figure 2 Block diagram of the networked control system

Q-learning Stochastic Suboptimal Control
1. Define the Q-function: [equation not preserved]
2. Define the update law to tune the Q-function: [equation and symbol definitions not preserved]
3. Use the mean values of the delays and packet losses instead of the random delays and packet losses; the H matrix then becomes a time-invariant matrix.
4. Define the update law to tune the H matrix online in the least-squares sense:
1) Vectorize the H matrix: [equation not preserved]
2) Update law: [equation and symbol definitions not preserved]
5. Develop the stochastic suboptimal control: [equation not preserved]
6. Convergence: [conditions and limits not preserved]

AE-based Stochastic Optimal Control
1. When random delays and packet losses are considered, the H matrix becomes time-varying. However, we assume that it changes slowly.
2. Set up the stochastic Q-function: [equation not preserved]
3. Use the adaptive estimator to represent the Q-function: [equation not preserved], where one of the terms is the Kronecker product quadratic polynomial basis vector.
4. Define the update law to tune the approximated H matrix:
1) Represent the residual error: [equation and symbol definitions not preserved]
2) Update law for the time-varying matrix H: [equation not preserved], where the tuning parameter is a constant.
5. Determine the AE stochastic optimal control input: [equation not preserved]
6. Convergence: [conditions and limits not preserved]

AE-based Stochastic Optimal Control (2)
Figure 3 presents the block diagram of the AE-based stochastic optimal regulator for the NCS.
Figure 3 Stochastic optimal regulator block diagram

Simulation Results
Consider the linear time-invariant inverted pendulum dynamics: [equation not preserved]
With the random delays and packet losses introduced by the network, the original time-invariant system is discretized and represented as a time-varying system: [equation not preserved]
(Note: because the random delays and packet losses are considered, the NCS model is not only time-varying but also a function of the time index k.)
Performance evaluation of the proposed suboptimal and optimal control:
1) Stability:
Figure 5 Stability performance
As shown in Figure 5, a PID controller designed without considering delays and packet losses leaves the NCS unstable (Figure 5(a)). With the proposed Q-learning suboptimal control and AE-based optimal control, the NCS remains stable (Figure 5(b), (c)).
2) Optimality:
Figure 6 Optimal performance
As shown in Figure 6(a), the proposed AE-based optimal controller achieves a lower cost-to-go than the proposed Q-learning suboptimal controller. In Figure 6(b), the AE-based optimal control drives the NCS states to zero faster than the Q-learning suboptimal control. This indicates that the proposed AE-based optimal control is more effective than the Q-learning suboptimal control.

CONCLUSIONS
The proposed Q-learning-based suboptimal and AE-based optimal control designs for an NCS with unknown dynamics in the presence of random delays and packet losses outperform a traditional controller.
Both the Q-learning-based suboptimal control and the AE-based optimal control keep the NCS stable.
The proposed AE-based optimal control is more effective than the proposed Q-learning-based suboptimal control.

FUTURE WORK
Design suboptimal and optimal control for nonlinear networked control systems (NNCS) with unknown dynamics in the presence of random delays and packet losses.
Design a novel wireless network protocol to decrease the effects of random delays and packet losses.
Optimize the NNCS globally, across both the control part and the wireless network part.
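
The Q-learning steps listed in the transcript (vectorize the H matrix, fit it in the least-squares sense, then derive the control gain from it) can be illustrated with a minimal sketch. The poster's own update laws are not preserved here, so this is not the authors' algorithm: it is a standard batch least-squares policy iteration for a quadratic Q-function, written under several assumptions. The plant matrices `A` and `B`, the cost weights `Qc` and `Rc`, and the helper names `quad_basis`, `unvec`, and `q_learning_lqr` are all hypothetical, the plant is a small stable system rather than the inverted pendulum, and random delays and packet losses are not modelled (which loosely corresponds to the time-invariant H case in step 3). The learner only calls `step`; it never reads `A` or `B`.

```python
import numpy as np

def quad_basis(z):
    """Quadratic basis such that z^T H z == quad_basis(z) @ h for symmetric H,
    where h stacks the upper-triangular entries of H."""
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)  # off-diagonal terms appear twice
    return scale * np.outer(z, z)[i, j]

def unvec(h, n):
    """Rebuild the symmetric n-by-n matrix H from its upper-triangular entries."""
    H = np.zeros((n, n))
    i, j = np.triu_indices(n)
    H[i, j] = h
    H[j, i] = h
    return H

def q_learning_lqr(step, n_x, n_u, K0, Qc, Rc, iters=10, samples=60, seed=0):
    """Model-free policy iteration: fit H by least squares, then improve K."""
    rng = np.random.default_rng(seed)
    K = K0.copy()
    for _ in range(iters):
        Phi, cost = [], []
        for _ in range(samples):
            x = rng.normal(size=n_x)
            u = -K @ x + rng.normal(size=n_u)   # exploration noise on the input
            x_next = step(x, u)                 # dynamics are a black box here
            u_next = -K @ x_next                # next action follows the current policy
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, u_next])
            # Bellman equation: z^T H z - z_next^T H z_next = stage cost
            Phi.append(quad_basis(z) - quad_basis(z_next))
            cost.append(x @ Qc @ x + u @ Rc @ u)
        h, *_ = np.linalg.lstsq(np.array(Phi), np.array(cost), rcond=None)
        H = unvec(h, n_x + n_u)
        # Greedy gain: u = -inv(H_uu) @ H_ux @ x minimizes z^T H z over u
        K = np.linalg.solve(H[n_x:, n_x:], H[n_x:, :n_x])
    return K, H

# Hypothetical stable plant standing in for the unknown dynamics (an assumption).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc, Rc = np.eye(2), np.eye(1)
K_learned, H_learned = q_learning_lqr(
    lambda x, u: A @ x + B @ u, 2, 1, np.zeros((1, 2)), Qc, Rc
)
```

The zero initial gain works only because this example plant is open-loop stable; an unstable plant such as an inverted pendulum would need a stabilizing initial gain for this policy-iteration variant.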