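School of Automation, XJTU © CAI YUANLI

An optimal policy has the property that, whatever the past control policy has been, the control policy for the remaining stages must be optimal with respect to the current state. (Bellman 1957)

[Example] Let C be any point on the optimal path from A to B. Then the optimal path from C to B is the C-to-B portion of the optimal path from A to B.

[Figure: a path from A to B passing through the intermediate point C]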
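The path example can be checked numerically. The sketch below is only an illustration: the graph, its node names other than A, B, and C, and its edge weights are made up, and Dijkstra's algorithm is used here simply as a convenient way to obtain optimal paths. It verifies that every tail of the optimal A-to-B path is itself optimal.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (cost, path) from source to target."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []

def path_cost(graph, path):
    """Sum of edge weights along a path given as a list of nodes."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))

# Hypothetical directed network; the weights are arbitrary illustration values.
graph = {
    "A": {"P": 2, "Q": 5},
    "P": {"C": 2, "Q": 1},
    "Q": {"C": 3, "B": 9},
    "C": {"R": 1, "B": 6},
    "R": {"B": 2},
}

total_cost, path = shortest_path(graph, "A", "B")

# Principle of optimality: from every intermediate point on the optimal path,
# the remaining part of that path is an optimal path to B.
for i in range(1, len(path) - 1):
    tail = path[i:]
    best_cost, _ = shortest_path(graph, path[i], "B")
    assert path_cost(graph, tail) == best_cost

print(path, total_cost)
```

The check compares costs rather than node sequences, so it remains valid when several optimal paths share the same cost.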
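2 Fundamentals of Deterministic Optimal Control

2.1 Continuous-Time Systems

$$\dot{x}(t) = f(x(t), u(t), t), \quad x(t_0) = x_0 \tag{2.1}$$

$$J(x_0, t_0) = \varphi[T, x(T)] + \int_{t_0}^{T} L(x, u, t)\,dt \tag{2.2}$$

Problem: find $u^*(t) \in \mathfrak{A}$ such that $J^*(x_0, t_0) \le J(x_0, t_0)$, where $J^*$ is the cost achieved by $u^*$ and $J$ is the cost of any other control in $\mathfrak{A}$.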
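To make the roles of (2.1) and (2.2) concrete, the following sketch evaluates the cost of a few candidate controls by simulation. Everything specific in it is an assumption for illustration: the scalar system $\dot{x} = -x + u$, the running cost $x^2 + u^2$, the terminal cost $x(T)^2$, the three candidate controls, and the forward-Euler discretization. It does not solve the optimal control problem; it only shows that different controls give different values of $J$.

```python
import math

def evaluate_cost(f, L, phi, u, x0, t0, T, n_steps=10_000):
    """Approximate J(x0, t0) of (2.2) for an open-loop control u(t)."""
    dt = (T - t0) / n_steps
    x, t, running = x0, t0, 0.0
    for _ in range(n_steps):
        control = u(t)
        running += L(x, control, t) * dt   # left Riemann sum for the integral term
        x += f(x, control, t) * dt         # forward Euler step for the dynamics (2.1)
        t += dt
    return phi(T, x) + running

# Hypothetical example problem (not taken from the notes).
def f(x, u, t):
    return -x + u

def L(x, u, t):
    return x**2 + u**2

def phi(T, xT):
    return xT**2

# Three hypothetical candidate controls.
controls = {
    "zero": lambda t: 0.0,
    "constant": lambda t: 0.5,
    "decaying": lambda t: -0.4 * math.exp(-1.4 * t),
}

costs = {}
for name, u in controls.items():
    costs[name] = evaluate_cost(f, L, phi, u, x0=1.0, t0=0.0, T=1.0)
    print(f"{name:>8}: J = {costs[name]:.4f}")
```

For the zero control the state is $x(t) = e^{-t}$, so the exact cost is $\tfrac12 + \tfrac12 e^{-2}$, which gives a quick sanity check on the discretization.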
In general,

$$J^*[x(t), t] = \min_{\substack{u(\tau) \\ t \le \tau \le T}} \left\{ \varphi[T, x(T)] + \int_{t}^{T} L[x(\tau), u(\tau), \tau]\,d\tau \right\} \tag{2.3}$$

By the principle of optimality,

$$J^*[x(t), t] = \min_{\substack{u(\tau) \\ t \le \tau \le t+\Delta t}} \left\{ \int_{t}^{t+\Delta t} L[x(\tau), u(\tau), \tau]\,d\tau + J^*[x(t+\Delta t), t+\Delta t] \right\}$$

Expanding the second term on the right-hand side in a Taylor series:

$$J^*[x(t+\Delta t), t+\Delta t] = J^*[x(t), t] + J_t^*[x(t), t]\,\Delta t + J_x^*[x(t), t]^T f[x(t), u(t), t]\,\Delta t + o(\Delta t)$$

Substituting this expansion, approximating the integral by $L[x(t), u(t), t]\,\Delta t + o(\Delta t)$, cancelling $J^*[x(t), t]$ on both sides, dividing by $\Delta t$, and letting $\Delta t \to 0$ gives:

$$-J_t^*[x(t), t] = \min_{u(t)} \left\{ L[x(t), u(t), t] + J_x^*[x(t), t]^T f[x(t), u(t), t] \right\} \tag{2.4}$$

Define the Hamiltonian function

$$H[x(t), u(t), J_x^*, t] = L[x(t), u(t), t] + J_x^*[x(t), t]^T f[x(t), u(t), t] \tag{2.5}$$

Then

$$-J_t^*[x(t), t] = \min_{u(t)} H[x(t), u(t), J_x^*, t] \tag{2.6}$$
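As an illustration of how (2.4) to (2.6) are used, consider a scalar linear-quadratic problem. The system, the weights $a, b, q, r, s$, and the quadratic trial form of $J^*$ are assumptions chosen for this example and are not part of the original notes.

$$\dot{x} = a x + b u, \qquad L = \tfrac{1}{2}\left(q x^2 + r u^2\right), \qquad \varphi = \tfrac{1}{2} s\, x(T)^2, \qquad r > 0$$

Try $J^*(x, t) = \tfrac{1}{2} p(t) x^2$, so that $J_x^* = p x$ and $J_t^* = \tfrac{1}{2} \dot{p} x^2$. The Hamiltonian (2.5) becomes

$$H = \tfrac{1}{2} q x^2 + \tfrac{1}{2} r u^2 + p x (a x + b u)$$

Setting $\partial H / \partial u = r u + b p x = 0$ gives the minimizing control

$$u^* = -\frac{b}{r}\, p(t)\, x$$

Substituting $u^*$ into (2.6):

$$-\tfrac{1}{2} \dot{p} x^2 = \tfrac{1}{2} q x^2 + a p x^2 - \frac{b^2}{2 r} p^2 x^2$$

Since this must hold for every $x$, $p(t)$ satisfies the scalar Riccati equation

$$-\dot{p} = 2 a p - \frac{b^2}{r} p^2 + q, \qquad p(T) = s$$

where the terminal value follows from matching $J^*$ to the terminal cost $\varphi$ at $t = T$.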
Equation (2.6) is the Hamilton-Jacobi-Bellman (HJB) equation.

Boundary condition:

$$J^*[x(T), T] = \varphi[T, x(T)] \tag{2.7}$$

[Pontryagin's minimum principle]

$$H[x(t), u(t), \lambda(t), t] = L[x(t), u(t), t] + \lambda^T(t) f[x(t), u(t), t] \tag{2.8}$$

$$u^*(t) = \arg\min_{u(t)} H[x(t), u(t), \lambda(t), t] \tag{2.9}$$
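The pointwise minimization in (2.9) can be sketched numerically. The example below is a rough illustration under stated assumptions: the scalar system $\dot{x} = a x + b u$, the running cost $\tfrac12(q x^2 + r u^2)$, the parameter values, the sample values of $x$ and $\lambda$, and the brute-force grid search are all hypothetical choices, not the method of the notes. It compares the grid minimizer with the stationary point $u = -b\lambda/r$ and shows how a bounded control set changes the answer.

```python
# Hypothetical problem data (illustration only).
A, B, Q, R = -1.0, 1.0, 1.0, 1.0

def hamiltonian(x, u, lam):
    """H = L + lam * f as in (2.8), for the assumed scalar system and cost."""
    running_cost = 0.5 * (Q * x**2 + R * u**2)
    dynamics = A * x + B * u
    return running_cost + lam * dynamics

def argmin_hamiltonian(x, lam, u_min=-5.0, u_max=5.0, n_grid=10_001):
    """Brute-force version of (2.9): scan a uniform grid over [u_min, u_max]."""
    step = (u_max - u_min) / (n_grid - 1)
    candidates = [u_min + i * step for i in range(n_grid)]
    return min(candidates, key=lambda u: hamiltonian(x, u, lam))

x, lam = 2.0, 0.8  # arbitrary sample state and costate values

# Wide control set: the minimizer is the stationary point of H.
u_grid = argmin_hamiltonian(x, lam)
u_closed_form = -B * lam / R  # from dH/du = R*u + B*lam = 0

# Bounded control set [-0.5, 0.5]: the minimizer sits on the boundary.
u_bounded = argmin_hamiltonian(x, lam, u_min=-0.5, u_max=0.5)

print(u_grid, u_closed_form, u_bounded)
```

The bounded case is the reason (2.9) is stated as a minimization rather than as the stationarity condition $\partial H / \partial u = 0$: when the control set is restricted, the minimizer may lie on its boundary.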