concentrate on the decisions in the short interval from $t$ to $t + \Delta t$ by summarizing the outcome in the remaining period in the value function, $J^*(x^*(t+\Delta t), t+\Delta t)$. By the definition of the value function, no admissible control can do better than the value function if the initial state is the same. Consider the following special type of control, $u(t')$, $t \le t' \le t_1$: the control is arbitrary between time $t$ and time $t+\Delta t$ and optimal in the remaining period given the state reached at time $t+\Delta t$. Then the corresponding value of the objective functional satisfies
\[
J^*(x^*(t), t) \ge \int_t^{t+\Delta t} f_0(x(t'), u(t'), t')\,dt' + J^*(x(t+\Delta t), t+\Delta t),
\tag{1.11}
\]
where $x(t')$, $t \le t' \le t_1$, is the state variable corresponding to the control $u(t')$ with the initial state $x(t) = x^*(t)$. Combining (1.10) and (1.11) yields
\[
\begin{aligned}
J^*(x^*(t), t) &= \int_t^{t+\Delta t} f_0(x^*(t'), u^*(t'), t')\,dt' + J^*(x^*(t+\Delta t), t+\Delta t)\\
&\ge \int_t^{t+\Delta t} f_0(x(t'), u(t'), t')\,dt' + J^*(x(t+\Delta t), t+\Delta t)
\quad\text{for any } u(t') \in U,\ t \le t' \le t+\Delta t.
\end{aligned}
\tag{1.12}
\]
This shows that the optimal control in the interval $[t, t+\Delta t]$ maximizes the sum of the objective functional in the interval and the maximum possible value of the functional in the rest of the period $[t+\Delta t, t_1]$.
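Inequality (1.12) is just the continuous-time form of the principle of optimality, and it is easy to check numerically in a discretized setting. The following sketch is only an illustration with a toy payoff and dynamics of our own choosing (not an example from the text): one-step reward $-(x^2+u^2)/2$, transition $x_{k+1} = x_k + u_k$, a finite control set, and a bounded integer state grid. It computes the value function by backward induction and verifies that a control which is arbitrary on the first step and optimal thereafter never does better than the optimal control.

    # Discrete-time analogue of (1.11)-(1.12); horizon, dynamics and payoff
    # below are illustrative choices, not the book's continuous-time problem.
    N = 10                      # horizon, playing the role of t1
    U = [-1, 0, 1]              # admissible control set
    X = range(-15, 16)          # state grid (wide enough not to bind)

    def f0(x, u):
        return -(x**2 + u**2) / 2        # one-step payoff, analogue of f0(x, u, t)

    def step(x, u):
        return max(min(x + u, 15), -15)  # dynamics, clipped to the grid

    # Backward induction gives the value function J*(x, k) of the text.
    V = {(x, N): 0.0 for x in X}
    for k in range(N - 1, -1, -1):
        for x in X:
            V[(x, k)] = max(f0(x, u) + V[(step(x, u), k + 1)] for u in U)

    # Principle of optimality, as in (1.12): being arbitrary on the first step
    # and optimal afterwards can never beat being optimal from the start.
    x0 = 4
    for u in U:
        lhs = V[(x0, 0)]
        rhs = f0(x0, u) + V[(step(x0, u), 1)]
        print(f"u={u:+d}:  J*(x0, 0) = {lhs:.2f}  >=  f0 + J*(next, 1) = {rhs:.2f}")

Equality holds for the control that is optimal on the first step as well, which is exactly the content of (1.12).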
If both sides of the inequality are differentiable, Taylor's expansion around $t$ yields¹
\[
\begin{aligned}
-\bigl(\partial J^*(x^*(t), t)/\partial t\bigr)\Delta t
&= f_0(x^*(t), u^*(t), t)\,\Delta t + \bigl(\partial J^*(x^*(t), t)/\partial x\bigr) f_1(x^*(t), u^*(t), t)\,\Delta t + \cdots\\
&\ge f_0(x^*(t), u(t), t)\,\Delta t + \bigl(\partial J^*(x^*(t), t)/\partial x\bigr) f_1(x^*(t), u(t), t)\,\Delta t + \cdots
\quad\text{for any } u(t) \in U,
\end{aligned}
\tag{1.13}
\]
where $\cdots$ represents higher-order terms, which become negligible as $\Delta t$ tends to zero since they approach zero faster than $\Delta t$. Note that we used $x(t) = x^*(t)$, $\dot{x}(t) = f_1(x(t), u(t), t)$, and $\dot{x}^*(t) = f_1(x^*(t), u^*(t), t)$.

¹ The details of Taylor's expansion here are as follows. Taylor's theorem states that if $F(t)$ is differentiable at $t = a$, then $F(t) = F(a) + (t-a)F'(a) + o(t-a)$, where $\lim_{t \to a} o(t-a)/(t-a) = 0$. Noting that $F(t+\Delta t) \equiv \int_t^{t+\Delta t} f_0(t')\,dt'$ satisfies $F(t) = 0$ and $F'(t) = f_0(t)$, we obtain
\[
\int_t^{t+\Delta t} f_0(x^*(t'), u^*(t'), t')\,dt' + J^*(x^*(t+\Delta t), t+\Delta t)
= f_0(x^*(t), u^*(t), t)\,\Delta t + J^*(x^*(t), t)
+ \bigl[\bigl(\partial J^*(x^*(t), t)/\partial x\bigr)\dot{x}^*(t) + \partial J^*(x^*(t), t)/\partial t\bigr]\Delta t + o(\Delta t)
\]
and
\[
\begin{aligned}
\int_t^{t+\Delta t} f_0(x(t'), u(t'), t')\,dt' + J^*(x(t+\Delta t), t+\Delta t)
&= f_0(x(t), u(t), t)\,\Delta t + J^*(x(t), t)
+ \bigl[\bigl(\partial J^*(x(t), t)/\partial x\bigr)\dot{x}(t) + \partial J^*(x(t), t)/\partial t\bigr]\Delta t + o(\Delta t)\\
&= f_0(x^*(t), u(t), t)\,\Delta t + J^*(x^*(t), t)
+ \bigl[\bigl(\partial J^*(x^*(t), t)/\partial x\bigr)\dot{x}(t) + \partial J^*(x^*(t), t)/\partial t\bigr]\Delta t + o(\Delta t),
\end{aligned}
\]
where we used $x(t) = x^*(t)$. Substituting these two equations into (1.12) yields (1.13).

Inequality (1.13) has a natural economic interpretation. For example, if a firm is contemplating the optimal capital accumulation policy, $f_0(x^*(t), u(t), t)\Delta t$ is approximately the amount of profit earned in the period $[t, t+\Delta t]$; $\partial J^*(x^*(t), t)/\partial x$ is the marginal value of capital, or the contribution of an additional unit of capital at time $t$; and $f_1(x^*(t), u(t), t)\Delta t = \dot{x}(t)\Delta t$ is approximately the amount of capital accumulated in the period $[t, t+\Delta t]$. Thus $(\partial J^*/\partial x) f_1 \Delta t$ represents the value of the capital accumulated during the period. (1.13), therefore, shows that the optimal control vector maximizes the sum of the current profits and the value of the increased capital.

Dividing (1.13) by $\Delta t$ and taking limits as $\Delta t$ approaches zero, we obtain
\[
\begin{aligned}
-\partial J^*(x^*(t), t)/\partial t
&= f_0(x^*(t), u^*(t), t) + \bigl(\partial J^*(x^*(t), t)/\partial x\bigr) f_1(x^*(t), u^*(t), t)\\
&\ge f_0(x^*(t), u(t), t) + \bigl(\partial J^*(x^*(t), t)/\partial x\bigr) f_1(x^*(t), u(t), t)
\quad\text{for any } u(t) \in U.
\end{aligned}
\tag{1.14}
\]
Thus the optimal control vector $u^*(t)$ maximizes
\[
f_0(x^*(t), u, t) + \bigl(\partial J^*(x^*(t), t)/\partial x\bigr) f_1(x^*(t), u, t)
\tag{1.15}
\]
at each instant of time, and we have finally transformed the problem of finding the optimal path into that of finding optimal numbers at each point in time. From the above discussion, it should be clear that (1.15) captures both the instantaneous effect and the indirect effect through a change in the state variable.
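As a purely illustrative check of (1.14)-(1.15), not an example from the text, consider the scalar linear-quadratic problem of maximizing $\int_t^{t_1} -\tfrac{1}{2}(x^2 + u^2)\,ds$ subject to $\dot{x} = u$, so $f_0 = -\tfrac{1}{2}(x^2+u^2)$ and $f_1 = u$. Its value function is known in closed form, $J^*(x,t) = -\tfrac{1}{2}\tanh(t_1 - t)\,x^2$. The sketch below evaluates (1.15) on a grid of controls, confirms that it is maximized at the optimal feedback $u^* = -\tanh(t_1 - t)\,x$, and confirms that the maximized value equals $-\partial J^*/\partial t$, as (1.14) requires; the horizon and evaluation point are arbitrary choices.

    import numpy as np

    # Illustrative scalar LQ problem (our own choice, not from the text):
    #   maximize  integral_t^{t1} -(x^2 + u^2)/2 ds   subject to  x_dot = u.
    # Value function: J*(x, t) = -tanh(t1 - t) * x^2 / 2, which solves the
    # Riccati equation p' = p^2 - 1 with p(t1) = 0.

    t1 = 1.0            # terminal time (assumed)
    t, x = 0.3, 2.0     # an arbitrary point (t, x*(t)) at which to check (1.14)

    p = np.tanh(t1 - t)                        # so J*(x, t) = -p * x**2 / 2
    dJ_dx = -p * x                             # marginal value of the state
    dJ_dt = 0.5 * x**2 / np.cosh(t1 - t)**2    # time derivative of J*(x, t)

    def expr_115(u):
        """The expression (1.15): f0(x*, u, t) + (dJ*/dx) f1(x*, u, t)."""
        return -(x**2 + u**2) / 2 + dJ_dx * u

    u_grid = np.linspace(-5.0, 5.0, 100001)    # grid standing in for the set U
    u_best = u_grid[np.argmax(expr_115(u_grid))]

    print("u maximizing (1.15): ", u_best)     # close to -p*x, i.e. u*(t)
    print("optimal feedback -p*x:", -p * x)
    print("max of (1.15):       ", expr_115(u_best))
    print("-dJ*/dt:             ", -dJ_dt)     # equal up to grid error, as in (1.14)

The printed maximizer coincides with the analytic optimal feedback, and the maximized value of (1.15) agrees with $-\partial J^*/\partial t$, which is the content of (1.14).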
(1.14) can be rewritten as