Optimal Control
T. F. Edgar, Spring 2012
Optimal Control
- Static optimization (finite dimensions)
- Calculus of variations (infinite dimensions)
- Maximum principle (Pontryagin) / minimum principle
Based on state-space models.

General nonlinear control problem:
$\min_{\mathbf{u}} V(\mathbf{x},\mathbf{u})$
s.t. $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x},\mathbf{u},t)$, $\mathbf{x}(t_0)$ given
$V(\mathbf{x},\mathbf{u}) = \Phi(\mathbf{x}(t_f)) + \int_{t_0}^{t_f} L(\mathbf{x},\mathbf{u},t)\,dt$
Special cases of $V$
- Minimum fuel: $\int_0^{t_f} |\mathbf{u}|\,dt$
- Minimum time: $\int_0^{t_f} 1\,dt$
- Maximum range: $x(t_f)$
- Quadratic loss: $\int_0^{t_f} (\mathbf{x}^T\mathbf{Q}\mathbf{x} + \mathbf{u}^T\mathbf{R}\mathbf{u})\,dt$
An analytical solution exists if the state equation is linear, i.e., $\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}$.
“Linear quadratic” problem (LQP)
Note that $ISE = \int_0^{t_f} x^2\,dt$ by itself is not solvable in a realistic sense (the resulting $u$ is unbounded), so a control weighting is needed in $V$, e.g.,
$V = \int_0^{t_f} (x^2 + r u^2)\,dt$
where $r$ is a tuning parameter (it affects overshoot).
$V = \text{profit}$?
Example: maximize the conversion at the exit of a tubular reactor: $\max x_3(t_f)$, where $x_3$ is a concentration and $t$ is a residence-time parameter.
In other cases, when $x$ and $u$ are deviation variables, an objective such as $\int (x^2 + r u^2)\,dt$ does not directly relate to profit. (See T. F. Edgar, Comput. Chem. Eng., Vol. 29, 41 (2004).)
Initial conditions
(a) $x(0) \neq 0$, $x(t_f) \to x_d = 0$, or $V = \int_0^{t_f} (x - x_d)^2\,dt$ (set-point change; $x_d$ is the desired $x$)
(b) $x(0) \neq 0$, impulse disturbance, $x_d = 0$
(c) $x(0) = 0$, model includes a disturbance term
Other considerations: “open loop” vs. “closed loop”
- “Open loop”: the optimal control is an explicit function of time and depends on $x(0)$ (“programmed control”).
- “Closed loop”: feedback control; $u(t)$ depends on $x(t)$ but not on $x(0)$, e.g., $u(t) = -K(t)\,x(t)$.
Feedback control is advantageous in the presence of noise and model errors. Optimal feedback control arises from a specific optimal control problem, the LQP.
Derivation of the Minimum Principle
$\min V(\mathbf{x},\mathbf{u}) = \Phi(\mathbf{x}(t_f)) + \int_{t_0}^{t_f} L(\mathbf{x}(t),\mathbf{u}(t),t)\,dt$
s.t. $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x},\mathbf{u},t)$, with $\mathbf{x}$ ($n \times 1$) and $\mathbf{u}$ ($r \times 1$).
$\Phi$, $L$, $\mathbf{f}$ have continuous first partials w.r.t. $\mathbf{x}$, $\mathbf{u}$, $t$.
Form the Lagrangian
$\bar{V} = \Phi + \int_{t_0}^{t_f} \left[ L + \boldsymbol{\lambda}^T (\mathbf{f} - \dot{\mathbf{x}}) \right] dt$
The multipliers $\boldsymbol{\lambda}$ are the adjoint variables (costates).
Define $H = L + \boldsymbol{\lambda}^T \mathbf{f}$ (the Hamiltonian). Integrating by parts,
$\int_{t_0}^{t_f} \boldsymbol{\lambda}^T \dot{\mathbf{x}}\,dt = \boldsymbol{\lambda}^T \mathbf{x}\big|_{t_f} - \boldsymbol{\lambda}^T \mathbf{x}\big|_{t_0} - \int_{t_0}^{t_f} \dot{\boldsymbol{\lambda}}^T \mathbf{x}\,dt$
so that
$\bar{V} = \Phi + \int_{t_0}^{t_f} \left( H - \boldsymbol{\lambda}^T \dot{\mathbf{x}} \right) dt = \left[ \Phi - \boldsymbol{\lambda}^T \mathbf{x} \right]_{t_f} + \boldsymbol{\lambda}^T \mathbf{x}\big|_{t_0} + \int_{t_0}^{t_f} \left( H + \dot{\boldsymbol{\lambda}}^T \mathbf{x} \right) dt$
Since $\bar{V}$ is a Lagrangian, we treat this as an unconstrained problem in the variables $\mathbf{x}(t)$, $\boldsymbol{\lambda}(t)$, $\mathbf{u}(t)$ and use the variations $\delta\mathbf{x}(t)$, $\delta\mathbf{u}(t)$, $\delta\bar{V}$ (the variation $\delta\boldsymbol{\lambda}(t)$ simply recovers the original constraint, the state equation):
$\delta\bar{V} = 0 = \left( \frac{\partial\Phi}{\partial\mathbf{x}} - \boldsymbol{\lambda}^T \right) \delta\mathbf{x}\Big|_{t_f} + \boldsymbol{\lambda}^T \delta\mathbf{x}\big|_{t_0} + \int_{t_0}^{t_f} \left( H_u\,\delta\mathbf{u} + \left( H_x + \dot{\boldsymbol{\lambda}}^T \right) \delta\mathbf{x} \right) dt$
Since $\delta\mathbf{x}(t)$ and $\delta\mathbf{u}(t)$ are arbitrary ($\neq 0$):
$\frac{\partial H}{\partial\mathbf{x}} + \dot{\boldsymbol{\lambda}}^T = 0 \;\Rightarrow\; \dot{\boldsymbol{\lambda}} = -\left( \frac{\partial H}{\partial\mathbf{x}} \right)^T$ ($n$ equations, the “adjoint equations”)
$\frac{\partial H}{\partial\mathbf{u}} = 0$, the “optimality equation” (for a weak minimum)
At $t = t_f$: $\frac{\partial\Phi}{\partial\mathbf{x}} - \boldsymbol{\lambda}^T = 0 \;\Rightarrow\; \boldsymbol{\lambda}(t_f) = \frac{\partial\Phi}{\partial\mathbf{x}}\Big|_{t_f}$ ($n$ boundary conditions)
If $\mathbf{x}(t_0)$ is specified, then $\delta\mathbf{x}(t_0) = 0$.
Together these conditions form a two-point boundary value problem (“TPBVP”).
Example: $\frac{dx_1}{dt} = u - x_1$ (first-order transfer function)
$\min V = \frac{1}{2}\int_0^{t_f} (x_1^2 + u^2)\,dt$ (LQP)
$H = \frac{1}{2}(x_1^2 + u^2) + \lambda_1 (u - x_1)$
$\dot{\lambda}_1 = -x_1 + \lambda_1$, $\lambda_1(t_f) = 0$
$H_u = u + \lambda_1 = 0 \;\Rightarrow\; u_{opt} = -\lambda_1$ (but $\lambda_1(t)$ is not yet known)
Free canonical equations (eliminate $u$):
(1) $\dot{x}_1 = u - x_1 = -\lambda_1 - x_1$ ($x_1(0)$ known)
(2) $\dot{\lambda}_1 = -x_1 + \lambda_1$, $\lambda_1(t_f) = 0$
Combining (1) and (2): $\ddot{\lambda}_1 = 2\lambda_1$, so
$\lambda_1 = k_1 e^{\sqrt{2}\,t} + k_2 e^{-\sqrt{2}\,t}$, with $0 = k_1 e^{\sqrt{2}\,t_f} + k_2 e^{-\sqrt{2}\,t_f}$
$x_1 = \lambda_1 - \dot{\lambda}_1 = k_1 (1 - \sqrt{2}) e^{\sqrt{2}\,t} + k_2 (1 + \sqrt{2}) e^{-\sqrt{2}\,t}$, with $x_1(0) = k_1 (1 - \sqrt{2}) + k_2 (1 + \sqrt{2})$
$u_{opt}(t) = \frac{x_1(0)}{(\sqrt{2}-1) + (\sqrt{2}+1) e^{2\sqrt{2}\,t_f}} \left( e^{\sqrt{2}\,t} - e^{2\sqrt{2}\,t_f - \sqrt{2}\,t} \right) = c_1 e^{\sqrt{2}\,t} - c_2 e^{-\sqrt{2}\,t}$
Note $u < 0$ for all $t < t_f$ when $x_1(0) > 0$: the control initially acts to reduce $x_1(t)$, as expected.
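As a check, here is a minimal numerical sketch (not part of the original slides) that evaluates this closed-form $u_{opt}(t)$; the values $x_1(0) = 1$ and $t_f = 5$ are illustrative assumptions.

```python
# Sketch: evaluate the closed-form optimal control for dx1/dt = u - x1.
import numpy as np

s2 = np.sqrt(2.0)

def u_opt(t, x0=1.0, tf=5.0):
    # u_opt(t) = x1(0)/[(sqrt2 - 1) + (sqrt2 + 1) e^{2 sqrt2 tf}]
    #            * (e^{sqrt2 t} - e^{2 sqrt2 tf - sqrt2 t})
    denom = (s2 - 1.0) + (s2 + 1.0) * np.exp(2.0 * s2 * tf)
    return x0 / denom * (np.exp(s2 * t) - np.exp(2.0 * s2 * tf - s2 * t))

t = np.linspace(0.0, 5.0, 6)
print(u_opt(t))  # negative on [0, tf) for x1(0) > 0, as claimed above
```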
Another example: $\dot{x}_1 = x_2$, $\dot{x}_2 = u$ (double integrator)
$V = \frac{1}{2}\int_0^{\infty} (x_1^2 + x_2^2 + u^2)\,dt$
$H = \frac{1}{2}x_1^2 + \frac{1}{2}x_2^2 + \frac{1}{2}u^2 + \lambda_1 x_2 + \lambda_2 u$
$\dot{\lambda}_1 = -\frac{\partial H}{\partial x_1} = -x_1$
$\dot{\lambda}_2 = -\frac{\partial H}{\partial x_2} = -x_2 - \lambda_1$
$H_u = 0 = u + \lambda_2 \;\Rightarrow\; u_{opt} = -\lambda_2$
Free canonical equations:
$\dot{x}_1 = x_2$, $\dot{x}_2 = -\lambda_2$, $\dot{\lambda}_1 = -x_1$, $\dot{\lambda}_2 = -x_2 - \lambda_1$ ($\mathbf{x}$, $\boldsymbol{\lambda}$ coupled)
Eliminating the other variables gives $\lambda_2^{(4)} - \ddot{\lambda}_2 + \lambda_2 = 0$.
Characteristic equation: $r^4 - r^2 + 1 = 0$. With $r' = r^2$: $r'^2 - r' + 1 = 0$, so $r' = 0.5 \pm 0.866j$ and $r = \pm 0.866 \pm 0.5j$ (4 roots; apply the boundary conditions).
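A quick numerical sketch (an addition, using numpy) confirming the four roots:

```python
# Sketch: numerically confirm the roots of r^4 - r^2 + 1 = 0.
import numpy as np

coeffs = [1, 0, -1, 0, 1]   # coefficients of r^4 - r^2 + 1
print(np.roots(coeffs))     # approx. +/-0.866 +/- 0.5j, all of magnitude 1
```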
Can motivate feedback control via discrete time, one step ahead:
$x_{k+1} = e\,x_k + f\,u_k$
Set $k = 0$: $x_1 = e\,x_0 + f\,u_0$ ($x_0$ fixed)
$\min V = x_1^2 + a\,u_0^2 = (e\,x_0 + f\,u_0)^2 + a\,u_0^2$
$\frac{\partial V}{\partial u_0} = 2f(e\,x_0 + f\,u_0) + 2a\,u_0 = 0$
$0 = e\,x_0 + f\,u_0 + \frac{a}{f}\,u_0 \;\Rightarrow\; u_0 = \frac{-e\,x_0}{f + a/f}$
This is feedback control: $u_0$ is proportional to the measured state $x_0$.
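A tiny sketch of the resulting feedback law; the constants $e$, $f$, $a$ and the state $x_0$ are illustrative assumptions:

```python
# One-step-ahead feedback law u0 = -e*x0 / (f + a/f), derived above.
def one_step_u(x0, e=0.9, f=0.5, a=0.1):
    return -e * x0 / (f + a / f)

x0 = 2.0
u0 = one_step_u(x0)
x1 = 0.9 * x0 + 0.5 * u0          # next state under the optimal u0
print(u0, x1**2 + 0.1 * u0**2)    # control move and achieved cost V
```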
Continuous-Time LQP
$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}$
$V = \frac{1}{2}\mathbf{x}^T(t_f)\,\mathbf{S}\,\mathbf{x}(t_f) + \frac{1}{2}\int_0^{t_f} (\mathbf{x}^T\mathbf{Q}\mathbf{x} + \mathbf{u}^T\mathbf{R}\mathbf{u})\,dt$, with $\mathbf{S}, \mathbf{Q} \geq \mathbf{0}$ and $\mathbf{R} > \mathbf{0}$
$H = \boldsymbol{\lambda}^T(\mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}) + \frac{1}{2}\mathbf{x}^T\mathbf{Q}\mathbf{x} + \frac{1}{2}\mathbf{u}^T\mathbf{R}\mathbf{u}$
$\dot{\boldsymbol{\lambda}} = -\mathbf{Q}\mathbf{x} - \mathbf{A}^T\boldsymbol{\lambda}$, $\boldsymbol{\lambda}(t_f) = \mathbf{S}\mathbf{x}(t_f)$
$H_u = \mathbf{0} = \mathbf{B}^T\boldsymbol{\lambda} + \mathbf{R}\mathbf{u} \;\Rightarrow\; \mathbf{u}_{opt} = -\mathbf{R}^{-1}\mathbf{B}^T\boldsymbol{\lambda}$ (requires $\mathbf{R} > \mathbf{0}$; $H_{uu} = \mathbf{R} > \mathbf{0}$ confirms a minimum)
Free canonical equations:
$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} - \mathbf{B}\mathbf{R}^{-1}\mathbf{B}^T\boldsymbol{\lambda}$ ($\mathbf{x}(0)$ given)
$\dot{\boldsymbol{\lambda}} = -\mathbf{Q}\mathbf{x} - \mathbf{A}^T\boldsymbol{\lambda}$ ($\boldsymbol{\lambda}(t_f)$ given)
Let $\boldsymbol{\lambda} = \mathbf{P}\mathbf{x}$ (Riccati transformation). Then
$\mathbf{u}_{opt} = -\mathbf{R}^{-1}\mathbf{B}^T\mathbf{P}\mathbf{x}$; let $\mathbf{K} = \mathbf{R}^{-1}\mathbf{B}^T\mathbf{P}$ (feedback control)
This yields an ODE in $\mathbf{P}$:
(1) $\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} - \mathbf{B}\mathbf{R}^{-1}\mathbf{B}^T\mathbf{P}\mathbf{x}$
(2) $\dot{\boldsymbol{\lambda}} = \dot{\mathbf{P}}\mathbf{x} + \mathbf{P}\dot{\mathbf{x}} = -\mathbf{Q}\mathbf{x} - \mathbf{A}^T\mathbf{P}\mathbf{x}$
Substituting Eq. (1) into Eq. (2) and factoring out $\mathbf{x}$:
$\dot{\mathbf{P}} + \mathbf{P}\mathbf{A} + \mathbf{A}^T\mathbf{P} - \mathbf{P}\mathbf{B}\mathbf{R}^{-1}\mathbf{B}^T\mathbf{P} + \mathbf{Q} = \mathbf{0}$ (Riccati ODE)
$\mathbf{P}(t_f) = \mathbf{S}$ (backward-time integration)
As $t_f \to \infty$, $\mathbf{P} \to \mathbf{P}_e$ (steady state); solve the steady-state (algebraic) equation. $\mathbf{P}$ is symmetric: $\mathbf{P} = \mathbf{P}^T$.
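A minimal sketch of backward integration of the Riccati ODE, written in the reversed time $\tau = t_f - t$ so that $dP/d\tau = PA + A^T P - PBR^{-1}B^T P + Q$; the matrices come from the example on the next slide, while $S = 0$ and the horizon are assumptions:

```python
# Sketch: integrate the Riccati ODE backward in time via tau = tf - t,
# with dP/dtau = P A + A' P - P B R^{-1} B' P + Q and P(tau = 0) = S.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[-1.0, 0.0], [1.0, 0.0]])
B = np.array([[1.0], [0.0]])
Q = np.diag([0.0, 1.0])
R = np.array([[0.1]])
S = np.zeros((2, 2))                       # assumed terminal weight

def riccati_rhs(tau, p):
    P = p.reshape(2, 2)
    dP = P @ A + A.T @ P - P @ B @ np.linalg.solve(R, B.T) @ P + Q
    return dP.ravel()

sol = solve_ivp(riccati_rhs, (0.0, 20.0), S.ravel(), rtol=1e-8)
print(sol.y[:, -1].reshape(2, 2))          # approaches the steady-state P
```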
Example: $\mathbf{Q} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$, $t_f \to \infty$, $\mathbf{A} = \begin{bmatrix} -1 & 0 \\ 1 & 0 \end{bmatrix}$, $\mathbf{B} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $R = 0.1$
Plugging into the steady-state Riccati equation:
$5P_{11}^2 + P_{11} - P_{12} = 0$
$10P_{12}^2 - 1 = 0$
$P_{12}(1 + 10P_{11}) - P_{22} = 0$
$P_{11} = 0.1706$, $P_{12} = P_{21} = 0.3162$, $P_{22} = 0.8556$
Feedback matrix: $\mathbf{K} = \mathbf{R}^{-1}\mathbf{B}^T\mathbf{P} = \begin{bmatrix} 1.706 & 3.162 \end{bmatrix}$, with $\mathbf{u} = -\mathbf{K}\mathbf{x}$
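As a cross-check, SciPy's algebraic Riccati solver reproduces these numbers; a sketch:

```python
# Sketch: verify the steady-state P and K with scipy's ARE solver.
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[-1.0, 0.0], [1.0, 0.0]])
B = np.array([[1.0], [0.0]])
Q = np.diag([0.0, 1.0])
R = np.array([[0.1]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)   # K = R^{-1} B' P, with u = -K x
print(P)                          # approx [[0.1706, 0.3162], [0.3162, 0.8556]]
print(K)                          # approx [[1.706, 3.162]]
```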
In general there are three ways to solve the steady-state Riccati equation:
(1) integration of the ODEs to steady state;
(2) Newton-Raphson (nonlinear equation solver);
(3) transition matrix (analytical solution).
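For instance, way (2) applied to the three scalar equations of the example above, using a general nonlinear equation solver (the initial guess is an assumption):

```python
# Sketch of way (2): Newton-type solution of the steady-state equations.
import numpy as np
from scipy.optimize import fsolve

def riccati_eqs(p):
    p11, p12, p22 = p
    return [5.0 * p11**2 + p11 - p12,
            10.0 * p12**2 - 1.0,
            p12 * (1.0 + 10.0 * p11) - p22]

print(fsolve(riccati_eqs, [1.0, 1.0, 1.0]))  # approx [0.1706, 0.3162, 0.8556]
```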
Transition matrix approach
$\dot{\boldsymbol{\gamma}} = \frac{d}{dt}\begin{bmatrix} \mathbf{x} \\ \boldsymbol{\lambda} \end{bmatrix} = \begin{bmatrix} \mathbf{A} & -\mathbf{B}\mathbf{R}^{-1}\mathbf{B}^T \\ -\mathbf{Q} & -\mathbf{A}^T \end{bmatrix} \boldsymbol{\gamma}$
Reverse-time integration (boundary condition at $t = t_f$): let $\tau = t_f - t$, so $\tau = 0$ when $t = t_f$:
$\frac{d\boldsymbol{\gamma}}{d\tau} = \begin{bmatrix} -\mathbf{A} & \mathbf{B}\mathbf{R}^{-1}\mathbf{B}^T \\ \mathbf{Q} & \mathbf{A}^T \end{bmatrix} \boldsymbol{\gamma} = \mathbf{Z}\boldsymbol{\gamma}$
$\boldsymbol{\gamma}(\tau) = e^{\mathbf{Z}\tau}\,\boldsymbol{\gamma}(\tau = 0)$
Partition the matrix exponential:
$\begin{bmatrix} \mathbf{x} \\ \boldsymbol{\lambda} \end{bmatrix} = \boldsymbol{\gamma} = \begin{bmatrix} \boldsymbol{\theta}_{11} & \boldsymbol{\theta}_{12} \\ \boldsymbol{\theta}_{21} & \boldsymbol{\theta}_{22} \end{bmatrix} \boldsymbol{\gamma}(\tau = 0)$
(1) $\mathbf{x}(\tau) = \boldsymbol{\theta}_{11}\mathbf{x}(t_f) + \boldsymbol{\theta}_{12}\boldsymbol{\lambda}(t_f) = \boldsymbol{\theta}_{11}\mathbf{x}(t_f) + \boldsymbol{\theta}_{12}\mathbf{P}(t_f)\mathbf{x}(t_f)$
(2) $\boldsymbol{\lambda}(\tau) = \boldsymbol{\theta}_{21}\mathbf{x}(t_f) + \boldsymbol{\theta}_{22}\boldsymbol{\lambda}(t_f) \;\Rightarrow\; \mathbf{P}(\tau)\mathbf{x}(\tau) = \boldsymbol{\theta}_{21}\mathbf{x}(t_f) + \boldsymbol{\theta}_{22}\mathbf{P}(t_f)\mathbf{x}(t_f)$
Combine (1) and (2) and factor out $\mathbf{x}(t_f)$:
$\mathbf{P}(\tau)\left[ \boldsymbol{\theta}_{11} + \boldsymbol{\theta}_{12}\mathbf{P}(t_f) \right] = \boldsymbol{\theta}_{21} + \boldsymbol{\theta}_{22}\mathbf{P}(t_f)$
For a fixed integration step $\Delta t$, the $\boldsymbol{\theta}_{ij}(\Delta t)$ are fixed, giving the recursion
$\mathbf{P}(t - \Delta t)\left[ \boldsymbol{\theta}_{11} + \boldsymbol{\theta}_{12}\mathbf{P}(t) \right] = \boldsymbol{\theta}_{21} + \boldsymbol{\theta}_{22}\mathbf{P}(t)$
Boundary condition: $\mathbf{P}(t_f) = \mathbf{S}$. Integrate $\mathbf{P}$ backward in time, then integrate the state forward:
$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}$, $\mathbf{u} = -\mathbf{R}^{-1}\mathbf{B}^T\mathbf{P}\mathbf{x}$
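A sketch of this recursion for the earlier example; the step $\Delta t = 0.05$, the number of steps, and $S = 0$ are assumptions:

```python
# Sketch: build Z for reversed time, partition exp(Z*dt), and step P
# backward from P(tf) = S via P(t-dt) = (t21 + t22 P)(t11 + t12 P)^{-1}.
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0, 0.0], [1.0, 0.0]])
B = np.array([[1.0], [0.0]])
Q = np.diag([0.0, 1.0])
Rinv = np.array([[10.0]])   # R = 0.1
n = 2

Z = np.block([[-A, B @ Rinv @ B.T],
              [Q, A.T]])
theta = expm(Z * 0.05)      # assumed dt = 0.05
t11, t12 = theta[:n, :n], theta[:n, n:]
t21, t22 = theta[n:, :n], theta[n:, n:]

P = np.zeros((n, n))        # P(tf) = S = 0 (assumed)
for _ in range(400):        # march backward 400 steps (20 time units)
    P = np.linalg.solve((t11 + t12 @ P).T, (t21 + t22 @ P).T).T
print(P)                    # approaches the steady-state P above
```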
Integral Action (eliminate offset)
Add a term $\dot{\mathbf{u}}^T\tilde{\mathbf{R}}\dot{\mathbf{u}}$ or an integrated-state term $\mathbf{x}_0^T\tilde{\mathbf{Q}}\mathbf{x}_0$ (the second method, below) to the objective function.
Example: $\dot{x}_1 = a x_1 + b u$
$V = \frac{1}{2}\int \left( q x_1^2 + r u^2 + \tilde{q}\left(\frac{du}{dt}\right)^2 \right) dt$
Augment the state equation:
$\dot{x}_1 = a x_1 + b u$ ($u$ becomes a new state variable)
$\frac{du}{dt} = w$ (new control variable)
Calculate the feedback control:
$w_{opt} = -k_1 x_1 - k_2 u$
$\frac{du}{dt} = -k_1 x_1 - k_2 (\dot{x}_1 - a x_1)\frac{1}{b}$
Integrating: $u = k'_1 \int x_1\,dt + k'_2 x_1$ (PI form)
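A numerical sketch of this first method: treat $u$ as a state and $w = du/dt$ as the control, and let SciPy's steady-state Riccati solver stand in for the hand derivation (the constants $a$, $b$, $q$, $r$, $\tilde{q}$ are illustrative assumptions):

```python
# Sketch: augmented LQP with state [x1, u] and control w = du/dt.
import numpy as np
from scipy.linalg import solve_continuous_are

a, b = -1.0, 1.0                      # assumed plant dx1/dt = a*x1 + b*u
q, r, qtilde = 1.0, 0.1, 1.0          # assumed weights

Aa = np.array([[a, b], [0.0, 0.0]])   # d/dt [x1, u] = Aa [x1, u] + Ba w
Ba = np.array([[0.0], [1.0]])
Qa = np.diag([q, r])                  # weights on x1^2 and u^2
Ra = np.array([[qtilde]])             # weight on (du/dt)^2

P = solve_continuous_are(Aa, Ba, Qa, Ra)
k1, k2 = np.linalg.solve(Ra, Ba.T @ P).ravel()
print(k1, k2)   # w_opt = -k1*x1 - k2*u, which integrates to PI form
```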
Second method: let $x_0 = \int x_1\,dt$, so $\dot{x}_0 = x_1$.
$V = \frac{1}{2}\int \left( q x_1^2 + r u^2 + \tilde{q}\,x_0^2 \right) dt$
Augmented state equations:
$\dot{x}_0 = x_1$
$\dot{x}_1 = a x_1 + b u$
Optimal control: $u = -k_1 x_1 - k_0 x_0 = -k_1 x_1 - k_0 \int x_1\,dt$ (again PI form)
With more state variables, this approach yields a PID controller.
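A matching sketch of the second method, augmenting with the integral state $x_0$ (same assumed constants as above):

```python
# Sketch: augmented LQP with state [x0, x1], where x0 = integral of x1.
import numpy as np
from scipy.linalg import solve_continuous_are

a, b = -1.0, 1.0                       # assumed plant constants
q, r, qtilde = 1.0, 0.1, 1.0           # assumed weights

Aa = np.array([[0.0, 1.0], [0.0, a]])  # dx0/dt = x1, dx1/dt = a*x1 + b*u
Ba = np.array([[0.0], [b]])
Qa = np.diag([qtilde, q])              # weights on x0^2 and x1^2
Ra = np.array([[r]])

P = solve_continuous_are(Aa, Ba, Qa, Ra)
k0, k1 = np.linalg.solve(Ra, Ba.T @ P).ravel()
print(k0, k1)   # u = -k1*x1 - k0*x0: PI control with integral action
```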