Stochastic Optimal Control of Unknown Linear Networked Control System in the Presence of Random Delays and Packet Losses

Student: Hao Xu, ECE Department
Faculty Advisor: Dr. Jagannathan Sarangapani, ECE Department

OBJECTIVES

Develop a Q-learning based stochastic suboptimal controller for an unknown networked control system (NCS) with random delays and packet losses.
Develop an adaptive estimator (AE)-based stochastic optimal controller.
Investigate the effects of delays and packet losses on the stability of the NCS with unknown dynamics.

BACKGROUND

Networked control can reduce installation costs and increase productivity through the use of wireless communication technology.
The challenging problems in the control of network-based systems are network delays and packet losses. These effects not only degrade the performance of the NCS but can also destabilize the system.
Approximate dynamic programming (ADP) techniques aim to solve optimal control problems for complex systems in a forward-in-time manner, without knowledge of the system dynamics.

Figure 1. The wireless networked control system

The proposed approach to optimal controller design combines Q-learning with an adaptive estimator (AE), whereas the suboptimal controller design uses Q-learning alone.
The delays and packet losses are incorporated in the dynamic model that is used for the controller development.

Networked Control System Model

Figure 2 depicts a block diagram representation of the networked control system.

Figure 2. Block diagram of the networked control system

Q-learning Stochastic Suboptimal Control

1. Define the Q-function: [...]
2. Define the update law to tune the Q-function: [...], where [...]
3. Use the mean values of the delays and packet losses in place of their random values, so that the H matrix becomes time-invariant.
4. Define the update law to tune the H matrix online in the least-squares sense.
   1) Vectorize the H matrix: [...]
   2) Update law: [...], where [...] and [...]
5. Develop the stochastic suboptimal control input.
6. Convergence: when [...], [...] and [...] at the same time.

AE-based Stochastic Optimal Control

1. When random delays and packet losses are considered, the H matrix becomes time-varying. However, we assume that it changes slowly.
2. Set up the stochastic Q-function: [...]
3. Use the adaptive estimator to represent the Q-function: [...], where [...] and [...] is the Kronecker product quadratic polynomial basis vector.
4. Define the update law to tune the approximated H matrix.
   1) Represent the residual error: [...], where [...] and [...]
   2) Update law for the time-varying matrix H: [...], where [...] is a constant, and [...]
5. Determine the AE stochastic optimal control input.
6. Convergence: when [...], then [...] and [...]

AE-based Stochastic Optimal Control (2)

Figure 3 presents the block diagram of the AE-based stochastic optimal regulator for the NCS.

Figure 3. Stochastic optimal regulator block diagram

Simulation Results

Consider the linear time-invariant inverted pendulum dynamics: [...]
With the random delays and packet losses introduced by the network, the original time-invariant system was discretized and represented as a time-varying system. (Note: because the delays and packet losses are random, the NCS model is time-varying and depends explicitly on the time index k.)

Performance evaluation of the proposed suboptimal and optimal control

1) Stability

Figure 5. Stability performance

As shown in Figure 5, a PID controller designed without considering delays and packet losses leaves the NCS unstable (Figure 5(a)). With the proposed Q-learning suboptimal control and AE-based optimal control, the NCS remains stable (Figure 5(b), (c)).

2) Optimality

Figure 6. Optimal performance

As shown in Figure 6(a), the proposed AE-based optimal controller achieves a lower cost-to-go than the proposed Q-learning suboptimal controller. In Figure 6(b), the AE-based optimal control drives the NCS states to zero faster than the Q-learning suboptimal control. This indicates that the proposed AE-based optimal control is more effective than the Q-learning suboptimal control.

CONCLUSIONS

The proposed Q-learning based suboptimal and AE-based optimal control designs for an NCS with unknown dynamics in the presence of random delays and packet losses outperform a traditional controller.
Both the Q-learning based suboptimal control and the AE-based optimal control keep the NCS stable.
The proposed AE-based optimal control is more effective than the proposed Q-learning based suboptimal control.

FUTURE WORK

Design suboptimal and optimal control for nonlinear networked control systems (NNCS) with unknown dynamics in the presence of random delays and packet losses.
Design a novel wireless network protocol to decrease the effects of random delays and packet losses.
Optimize the NNCS globally, from both the control side and the wireless network side.
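Illustrative sketch: least-squares Q-learning for a time-invariant H matrix

The Q-learning steps listed under "Q-learning Stochastic Suboptimal Control" (quadratic Q-function, vectorized H matrix, least-squares tuning, control input derived from H) can be illustrated with a minimal sketch. It is not the poster's update law. It assumes a delay-free, loss-free plant, which corresponds to the time-invariant H case only; it uses batch least squares rather than an online recursion; and the plant matrices, cost weights, and all function names are hypothetical. The plant model is used only to generate data, never by the learner.

```python
import numpy as np

def quad_basis(z):
    """Kronecker-product quadratic basis: z^T H z == vec(H) . kron(z, z)."""
    return np.kron(z, z)

def evaluate_policy(A, B, Qc, Rc, K, rng, n_samples=200):
    """Estimate H for the policy u = -K x from the Bellman equation
    Q(x_k, u_k) - Q(x_{k+1}, -K x_{k+1}) = x_k^T Qc x_k + u_k^T Rc u_k."""
    n, m = B.shape
    regressors, costs = [], []
    for _ in range(n_samples):
        x = rng.standard_normal(n)
        u = rng.standard_normal(m)       # exploratory input for excitation
        x_next = A @ x + B @ u           # plant acts only as a data source
        u_next = -K @ x_next             # next input follows the current policy
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, u_next])
        regressors.append(quad_basis(z) - quad_basis(z_next))
        costs.append(x @ Qc @ x + u @ Rc @ u)
    h, *_ = np.linalg.lstsq(np.array(regressors), np.array(costs), rcond=None)
    H = h.reshape(n + m, n + m)
    return 0.5 * (H + H.T)               # keep the symmetric part

def improve_policy(H, n):
    """Minimize z^T H z over u: u = -inv(H_uu) H_ux x."""
    H_ux, H_uu = H[n:, :n], H[n:, n:]
    return np.linalg.solve(H_uu, H_ux)

def q_learning_lqr(A, B, Qc, Rc, K0, n_iter=20, seed=0):
    """Alternate least-squares policy evaluation and policy improvement."""
    rng = np.random.default_rng(seed)
    K = K0
    for _ in range(n_iter):
        H = evaluate_policy(A, B, Qc, Rc, K, rng)
        K = improve_policy(H, A.shape[0])
    return K, H

def riccati_gain(A, B, Qc, Rc, n_iter=2000):
    """Model-based reference gain from Riccati iteration (for comparison only)."""
    P = Qc.copy()
    for _ in range(n_iter):
        K = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
        P = Qc + A.T @ P @ (A - B @ K)
    return K

# Hypothetical open-loop stable plant, so the zero gain is an admissible start.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
Qc, Rc = np.eye(2), np.eye(1)
K0 = np.zeros((1, 2))

K_learned, H_learned = q_learning_lqr(A, B, Qc, Rc, K0)
K_reference = riccati_gain(A, B, Qc, Rc)
```

In this sketch the learned gain can be compared with the model-based reference, for example with np.allclose(K_learned, K_reference, atol=1e-6). The iteration needs a stabilizing initial gain, which is why the hypothetical plant is chosen open-loop stable. An unstable plant such as the inverted pendulum would need a stabilizing K0 instead of zero.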