22 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 30, NO. 1, FEBRUARY 2000

Optimal Design of CMAC Neural-Network Controller for Robot Manipulators

Young H. Kim and Frank L. Lewis, Fellow, IEEE

Abstract—This paper is concerned with the application of quadratic optimization for motion control to feedback control of robotic systems using cerebellar model arithmetic computer (CMAC) neural networks. Explicit solutions to the Hamilton–Jacobi–Bellman (H–J–B) equation for optimal control of robotic systems are found by solving an algebraic Riccati equation. It is shown how the CMAC's can cope with nonlinearities through optimization with no preliminary off-line learning phase required. The adaptive-learning algorithm is derived from Lyapunov stability analysis, so that both system-tracking stability and error convergence can be guaranteed in the closed-loop system. The filtered-tracking error or critic gain and the Lyapunov function for the nonlinear analysis are derived from the user input in terms of a specified quadratic-performance index. Simulation results from a two-link robot manipulator show the satisfactory performance of the proposed control schemes even in the presence of large modeling uncertainties and external disturbances.

Index Terms—CMAC neural network, optimal control, robotic control.

I. INTRODUCTION

THERE has been some work related to applying optimal-control techniques to the nonlinear robotic manipulator. These approaches often combine feedback linearization and optimal-control techniques. Johansson [6] showed explicit solutions to the Hamilton–Jacobi–Bellman (H–J–B) equation for optimal control of robot motion and how optimal control and adaptive control may act in concert in the case of unknown or uncertain system parameters. Dawson et al.
[5] used a general control law known as modified computed-torque control (MCTC) and quadratic optimal-control theory to derive a parameterized proportional-derivative (PD) form for an auxiliary input to the controller. However, in actual situations, the robot dynamics is rarely known completely, and thus, it is difficult to express real robot dynamics in exact mathematical equations or to linearize the dynamics with respect to the operating point.

Neural networks have been used for approximation of nonlinear systems, for classification of signals, and for associative memory. For control engineers, the approximation capability of neural networks is usually used for system identification or identification-based control. More work is now appearing on the use of neural networks in direct, closed-loop controllers that yield guaranteed performance [13]. The robotic application of neural-network-based, closed-loop control can be found in [12]. For indirect or identification-based, robotic-system control, several neural-network and learning schemes can be found in the literature. Most of these approaches consider neural networks as very general computational models. Although a pure neural-network approach without knowledge of robot dynamics may be promising, it is important to note that this approach will not be very practical due to the high dimensionality of the input–output space. In this way, the training or off-line learning process by pure connectionist models would require a neural network of impractical size and an unreasonable number of repetition cycles. The pure connectionist approach has poor generalization properties.

Manuscript received June 2, 1997; revised June 23, 1999. This research was supported by NSF Grant ECS-9521673. The authors are with the Automation and Robotics Research Institute, University of Texas at Arlington, Fort Worth, TX 76118-7115 USA (e-mail: ykim50@hotmail.com; flewis@arri.uta.edu). Publisher Item Identifier S 1094-6977(00)00364-3.
In this paper, we propose a nonlinear optimal-design method that integrates linear optimal-control techniques and CMAC neural-network learning methods. The linear optimal control has an inherent robustness against a certain range of model uncertainties [9]. However, nonlinear dynamics cannot be taken into consideration in linear optimal-control design. We use the CMAC neural networks to adaptively estimate nonlinear uncertainties, yielding a controller that can tolerate a wider range of uncertainties. The salient feature of this H–J–B control design is that we can use a priori knowledge of the plant dynamics as the system equation in the corresponding linear optimal-control design. The neural network is used to improve performance in the face of unknown nonlinearities by adding nonlinear effects to the linear optimal controller.

The paper is organized as follows. In Section II, we will review some fundamentals of the CMAC neural networks. In Section III, we give a new control design for rigid robot systems using the H–J–B equation. In Section IV, a CMAC controller combined with the optimal-control signal is proposed. In Section V, a two-link robot controller is designed and simulated in the face of large uncertainties and external disturbances.

II. BACKGROUND

Let $\mathbb{R}$ denote the real numbers, $\mathbb{R}^n$ the real $n$-vectors, and $\mathbb{R}^{m \times n}$ the real $m \times n$ matrices. We define the norm of a vector $x \in \mathbb{R}^n$ as $\|x\| = \sqrt{x_1^2 + \cdots + x_n^2}$ and the norm of a matrix $A \in \mathbb{R}^{m \times n}$ as $\|A\| = \sqrt{\lambda_{\max}[A^T A]}$, where $\lambda_{\max}[\cdot]$ and $\lambda_{\min}[\cdot]$ are the largest and smallest eigenvalues of a matrix. The absolute value is denoted as $|\cdot|$. Given $A, B \in \mathbb{R}^{m \times n}$, the Frobenius norm is defined by $\|A\|_F^2 = \mathrm{tr}(A^T A) = \sum_{i,j} a_{ij}^2$, with $\mathrm{tr}(\cdot)$ as the trace operator. The associated inner product is $\langle A, B \rangle_F = \mathrm{tr}(A^T B)$. The Frobenius norm is compatible with the two-norm, so that $\|Ax\|_2 \le \|A\|_F \|x\|_2$ with $A \in \mathbb{R}^{m \times n}$ and $x \in \mathbb{R}^n$.

1094–6977/00$10.00 © 2000 IEEE
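The norm relations above can be spot-checked numerically. The following sketch (matrix dimensions chosen arbitrarily for illustration) verifies that the induced two-norm equals $\sqrt{\lambda_{\max}[A^T A]}$ and that the Frobenius norm is compatible with the vector two-norm:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
x = rng.standard_normal(4)

# Induced two-norm: largest singular value = sqrt(lambda_max[A^T A])
two_norm = np.linalg.norm(A, 2)
assert np.isclose(two_norm, np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A))))

# Frobenius norm: sqrt(tr(A^T A))
fro_norm = np.linalg.norm(A, "fro")
assert np.isclose(fro_norm, np.sqrt(np.trace(A.T @ A)))

# Compatibility with the two-norm: ||A x||_2 <= ||A||_F ||x||_2
assert np.linalg.norm(A @ x) <= fro_norm * np.linalg.norm(x) + 1e-12
```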
KIM AND LEWIS: OPTIMAL DESIGN OF NEURAL-NETWORK CONTROLLER 23

Fig. 1. Architecture of a CMAC neural network.

A. CMAC Neural Networks

Fig. 1 shows the architecture and operation of the CMAC. The CMAC can be used to approximate a nonlinear mapping $y(x): X \to Y$, where $X \subset \mathbb{R}^n$ is the application in the $n$-dimensional input space and $Y \subset \mathbb{R}^m$ is the application output space. The CMAC algorithm consists of two primary functions for determining the value of a complex function, as shown in Fig. 1:
$$R: X \to A, \qquad P: A \to Y \qquad (1)$$
where
$X$: continuous $n$-dimensional input space;
$A$: $N_A$-dimensional association space;
$Y$: $m$-dimensional output space.
The function $\rho = R(x)$ is fixed and maps each point $x$ in the input space onto the association space $A$. The function $P(\rho)$ computes an output $y \in Y$ by projecting the association vector determined by $R(x)$ onto a vector of adjustable weights such that
$$y = P(\rho) = W^T \rho. \qquad (2)$$
$R(x)$ in (1) is the multidimensional receptive-field function.

1) Receptive-Field Function: Given $x = [x_1\ x_2\ \cdots\ x_n]^T \in \mathbb{R}^n$, let $[x_{i,\min}, x_{i,\max}]$, $1 \le i \le n$, be the domain of interest. For this domain, select integers $N_i$ and strictly increasing partitions
$$\pi_i = [x_{i,1}\ x_{i,2}\ \cdots\ x_{i,N_i}], \qquad 1 \le i \le n.$$
For each component of the input space, the receptive-field basis function can be defined as rectangular [1] or triangular [4] or any continuously bounded function, e.g., Gaussian [3].

2) Multidimensional Receptive-Field Functions: Given any $x \in \mathbb{R}^n$, the multidimensional receptive-field functions are defined as
$$\varphi_{j_1, j_2, \ldots, j_n}(x) = \rho_{1,j_1}(x_1) \cdot \rho_{2,j_2}(x_2) \cdots \rho_{n,j_n}(x_n) \qquad (3)$$
with $j_i = 1, \ldots, N_i$ and $i = 1, \ldots, n$. The output of the CMAC is given by
$$y_j(x) = \sum_{\ell=1}^{N_A} w_{j\ell}\, \varphi_\ell(x), \qquad j = 1, \ldots, m \qquad (4)$$
where
$w_{j\ell}$: output-layer weight values;
$\varphi$: continuous, multidimensional receptive-field function;
$N_A$: number of the association points.
The effect of receptive-field basis function type and partition number along each dimension on the CMAC performance has not yet been systematically studied.

The output of the CMAC can be expressed in a vector notation as
$$y(x) = W^T \varphi(x) \qquad (5)$$
where
$W$: matrix of adjustable weight values;
$\varphi(x)$: vector of receptive-field functions.
Based on the approximation property of the CMAC, there exist ideal weight values $W$, so that the function to be approximated can be represented as
$$f(x) = W^T \varphi(x) + \varepsilon(x) \qquad (6)$$
with $\varepsilon(x)$ the "functional reconstruction error" and $\|\varepsilon(x)\|$ bounded.
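As a concrete illustration of (1)–(5), the sketch below builds a two-input CMAC with triangular receptive fields; the partition counts, field width, and weight sizes are invented here for the example and are not taken from the paper's simulations. Note how only a few entries of $\varphi(x)$ are active for any given input, which is the source of the CMAC's local-learning behavior.

```python
import numpy as np

def triangular_basis(x, centers, width):
    """One-dimensional triangular receptive fields rho_{i,j}(x_i)."""
    return np.maximum(0.0, 1.0 - np.abs(x - centers) / width)

def cmac_output(x, partitions, width, W):
    """CMAC forward pass y = W^T phi(x), with phi the tensor product of the
    per-dimension receptive-field activations (eqs. (3)-(5))."""
    # Per-dimension activations rho_{i,j}(x_i) on each partition pi_i
    rhos = [triangular_basis(xi, p, width) for xi, p in zip(x, partitions)]
    # Multidimensional receptive fields: outer product, flattened into phi
    phi = rhos[0]
    for r in rhos[1:]:
        phi = np.outer(phi, r).ravel()
    return W.T @ phi, phi

# Two inputs, N_1 = N_2 = 5 knots each -> N_A = 25 association points
parts = [np.linspace(-1.0, 1.0, 5), np.linspace(-1.0, 1.0, 5)]
W = np.zeros((25, 2))          # N_A x m adjustable weights (untrained here)
y, phi = cmac_output(np.array([0.2, -0.4]), parts, width=0.5, W=W)
```

With knot spacing equal to the field width, at most two fields fire per dimension, so at most four of the 25 entries of $\varphi(x)$ are nonzero here.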
24 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 30, NO. 1, FEBRUARY 2000

Then, an estimate of $f(x)$ can be given by
$$\hat{f}(x) = \hat{W}^T \varphi(x) \qquad (7)$$
where $\hat{W}$ are estimates of the ideal weight values. The Lyapunov method is applied to derive reinforcement adaptive learning rules for the weight values. Since these adaptive learning rules are formulated from the stability analysis of the controlled system, the system performance can be guaranteed for closed-loop control.

B. Robot Arm Dynamics and Properties

The dynamics of an $n$-link robot manipulator may be expressed in the Lagrange form [9]
$$M(q)\ddot{q} + V_m(q,\dot{q})\dot{q} + F\dot{q} + f_c(\dot{q}) + g(q) + \tau_d(t) = \tau(t) \qquad (8)$$
with
$q(t) \in \mathbb{R}^n$: joint variable;
$M(q) \in \mathbb{R}^{n \times n}$: inertia;
$V_m(q,\dot{q}) \in \mathbb{R}^{n \times n}$: Coriolis/centripetal forces;
$g(q) \in \mathbb{R}^n$: gravitational forces;
$F \in \mathbb{R}^{n \times n}$: diagonal matrix of viscous friction coefficients;
$f_c(\dot{q}) \in \mathbb{R}^n$: Coulomb friction coefficients;
$\tau_d(t) \in \mathbb{R}^n$: external disturbances.
The external control torque to each joint is $\tau(t) \in \mathbb{R}^n$. Given a desired trajectory $q_d(t) \in \mathbb{R}^n$, the tracking errors are
$$e(t) = q_d(t) - q(t) \quad \text{and} \quad \dot{e}(t) = \dot{q}_d(t) - \dot{q}(t) \qquad (9)$$
and the instantaneous performance measure is defined as
$$r(t) = \dot{e}(t) + \Lambda e(t) \qquad (10)$$
where $\Lambda \in \mathbb{R}^{n \times n}$ is the constant-gain matrix or critic (not necessarily symmetric). The robot dynamics (8) may be written as
$$M(q)\dot{r}(t) = -V_m(q,\dot{q})r(t) - \tau(t) + h(x) \qquad (11)$$
where the robot nonlinear function is
$$h(x) = M(q)\left(\ddot{q}_d + \Lambda\dot{e}\right) + V_m(q,\dot{q})\left(\dot{q}_d + \Lambda e\right) + F\dot{q} + f_c(\dot{q}) + g(q) + \tau_d(t) \qquad (12)$$
and, for instance,
$$x(t) = \left[e^T\ \dot{e}^T\ q_d^T\ \dot{q}_d^T\ \ddot{q}_d^T\right]^T. \qquad (13)$$
This key function $h(x)$ captures all the unknown dynamics of the robot arm. Now define a control-input torque as
$$\tau(t) = h(x) - u(t) \qquad (14)$$
with $u(t) \in \mathbb{R}^n$ an auxiliary control input to be optimized later. The closed-loop system becomes
$$M(q)\dot{r}(t) = -V_m(q,\dot{q})r(t) + u(t). \qquad (15)$$
Property 1—Inertia: The inertia matrix $M(q)$ is uniformly bounded and
$$m_1 I \le M(q) \le m_2 I, \qquad m_1, m_2 > 0 \ \text{and} \ I \in \mathbb{R}^{n \times n}. \qquad (16)$$
Property 2—Skew Symmetry: The matrix
$$N(q,\dot{q}) = \dot{M}(q) - 2V_m(q,\dot{q}) \qquad (17)$$
is skew-symmetric.

III. OPTIMAL COMPUTED-TORQUE CONTROLLER DESIGN

A. H–J–B Optimization

Define the velocity-error dynamics
$$\dot{e}(t) = -\Lambda e(t) + r(t). \qquad (18)$$
The following augmented system is obtained:
$$\frac{d}{dt}\begin{bmatrix} e \\ r \end{bmatrix} = \begin{bmatrix} -\Lambda & I_n \\ 0_{n \times n} & -M^{-1}V_m \end{bmatrix}\begin{bmatrix} e \\ r \end{bmatrix} + \begin{bmatrix} 0_{n \times n} \\ M^{-1} \end{bmatrix} u(t) \qquad (19)$$
or, with shorter notation,
$$\dot{z}(t) = A(q,\dot{q})z(t) + B(q)u(t) \qquad (20)$$
with $A(q,\dot{q}) \in \mathbb{R}^{2n \times 2n}$, $B(q) \in \mathbb{R}^{2n \times n}$, and $u(t) \in \mathbb{R}^n$. $z(t)$ is defined as $z(t)^T = [e(t)^T\ r(t)^T]$. A quadratic performance index $J(u)$ is as follows:
$$J(u) = \int_{t_0}^{\infty} L(z,u)\, dt \qquad (21)$$
with the Lagrangian
$$L(z,u) = \frac{1}{2}z^T Q z + \frac{1}{2}u^T R u = \frac{1}{2}\begin{bmatrix} e \\ r \end{bmatrix}^T \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{12}^T & Q_{22} \end{bmatrix}\begin{bmatrix} e \\ r \end{bmatrix} + \frac{1}{2}u^T R u. \qquad (22)$$
Given the performance index $J(u)$, the control objective is to find the auxiliary control input $u(t)$ that minimizes (21) subject to the differential constraints imposed by (19).
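As a numerical sketch of (19)–(22), the fragment below forms the augmented matrices $A(q,\dot q)$, $B(q)$ and evaluates the Lagrangian for a hypothetical two-link arm; the inertia, Coriolis, gain, and weighting values are invented placeholders, not the paper's Section V model.

```python
import numpy as np

n = 2
Lam = np.diag([5.0, 5.0])                  # gain matrix Lambda in (10)
M = np.array([[2.0, 0.3], [0.3, 1.0]])     # placeholder inertia (symmetric, pos. def.)
Vm = np.array([[0.1, -0.2], [0.2, 0.1]])   # placeholder Coriolis/centripetal matrix
Minv = np.linalg.inv(M)

# Augmented system (19)-(20): z = [e; r], zdot = A z + B u
A = np.block([[-Lam, np.eye(n)],
              [np.zeros((n, n)), -Minv @ Vm]])
B = np.vstack([np.zeros((n, n)), Minv])

e = np.array([0.1, -0.05])
edot = np.array([0.0, 0.02])
r = edot + Lam @ e                         # filtered tracking error (10)
z = np.concatenate([e, r])

# Lagrangian (22) with block-structured Q; values illustrative only
Q = np.block([[10.0 * np.eye(n), -2.0 * np.eye(n)],
              [-2.0 * np.eye(n), 2.0 * np.eye(n)]])
R = 0.5 * np.eye(n)
u = np.zeros(n)
L = 0.5 * z @ Q @ z + 0.5 * u @ R @ u      # instantaneous cost inside (21)
```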
The optimal control that achieves this objective will be denoted by $u^*(t)$. It is worth noting for now that only the part of the control input to the robotic system denoted by $u(t)$ in (14) is penalized. This is reasonable from a practical standpoint, since the gravity, Coriolis, and friction-compensation terms in (12) cannot be modified by the optimal-design phase.

A necessary and sufficient condition for $u^*(t)$ to minimize (21) subject to (20) is that there exist a function $V = V(z,t)$ satisfying the H–J–B equation [10]
$$\frac{\partial V(z,t)}{\partial t} + \min_{u(t)} H\!\left(z, u, \frac{\partial V(z,t)}{\partial z}, t\right) = 0 \qquad (23)$$
where the Hamiltonian of optimization is defined as
$$H\!\left(z, u, \frac{\partial V(z,t)}{\partial z}, t\right) = L(z,u) + \left(\frac{\partial V(z,t)}{\partial z}\right)^T \dot{z}(t). \qquad (24)$$
KIM AND LEWIS: OPTIMAL DESIGN OF NEURAL-NETWORK CONTROLLER 25

and $V(z,t)$ is referred to as the value function. It satisfies the partial differential equation
$$-\frac{\partial V(z,t)}{\partial t} = L(z,u^*) + \left(\frac{\partial V(z,t)}{\partial z}\right)^T \dot{z}. \qquad (25)$$
The minimum is attained for the optimal control $u(t) = u^*(t)$, and the Hamiltonian is then given by
$$H^* = \min_{u(t)} H\!\left(z, u, \frac{\partial V(z,t)}{\partial z}, t\right) = H\!\left(z, u^*, \frac{\partial V(z,t)}{\partial z}, t\right) = -\frac{\partial V(z,t)}{\partial t}. \qquad (26)$$

Lemma 1: The following function $V$, composed of $z$, $M(q)$, and a positive symmetric matrix $K = K^T \in \mathbb{R}^{n \times n}$, satisfies the H–J–B equation:
$$V(z,t) = \frac{1}{2}z^T P(q) z = \frac{1}{2}z^T \begin{bmatrix} K & 0_{n \times n} \\ 0_{n \times n} & M(q) \end{bmatrix} z > 0 \qquad (27)$$
where $K$ and $\Lambda$ in (10) and (27) can be found from the Riccati differential equation
$$PA + A^T P - PBR^{-1}B^T P + \dot{P} + Q = 0_{2n \times 2n}. \qquad (28)$$
The optimal control $u^*(t)$ that minimizes (21) subject to (20) is
$$u^*(t) = -R^{-1}B^T(q)P z(t) = -R^{-1}r(t). \qquad (29)$$
See Appendix A for proof.

Theorem 1: Let the symmetric weighting matrices $Q$, $R$ be chosen such that
$$Q = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{12}^T & Q_{22} \end{bmatrix}, \qquad R^{-1} = Q_{22} \qquad (30)$$
with $Q_{12} + Q_{12}^T < 0_{n \times n}$. Then the $K$ and $\Lambda$ required in Lemma 1 can be determined from the following relations:
$$K = -\frac{1}{2}\left(Q_{12} + Q_{12}^T\right) > 0_{n \times n} \qquad (31)$$
$$K\Lambda + \Lambda^T K = Q_{11} \qquad (32)$$
with (32) solved for $\Lambda$ using Lyapunov equation solvers (e.g., MATLAB [15]). See Appendix B for proof.

Remarks:

1) In order to guarantee positive definiteness of the constructed matrix $Q$, the following inequality [7] must be satisfied:
$$\lambda_{\min}(Q_{22}) > \|Q_{12}\|^2 / \lambda_{\min}(Q_{11}). \qquad (33)$$
2) With the optimal-feedback control law $u^*(t)$ calculated using Theorem 1, the torques $\tau^*(t)$ to apply to the robotic system are calculated according to the control input
$$\tau^*(t) = h(x) - u^*(t) \qquad (34)$$
where $h(x)$ is given by (12). It is referred to as an optimal computed-torque controller (OCTC).

B. Stability Analysis

Theorem 2: Suppose that matrices $K$ and $\Lambda$ exist that satisfy the hypotheses of Lemma 1, and in addition, there exist constants $k_1$ and $k_2$ such that $0 < k_1 < k_2 < \infty$, and the spectrum of $P$ is bounded in the sense that $k_1 I \le P \le k_2 I$ on $(t_0, \infty)$. Then using the feedback control in (29) and (20) results in the controlled nonlinear system
$$\dot{z}(t) = \left[A(q,\dot{q}) - B(q)R^{-1}B^T(q)P\right]z(t). \qquad (35)$$
This is globally exponentially stable (GES) regarding the origin in $\mathbb{R}^{2n}$.

Proof: The quadratic function $V(z,t)$ is a suitable Lyapunov function candidate, because it is positive and radially growing with $\|z\|$. It is continuous and has a unique minimum at the origin of the error space. It remains to show that $dV/dt < 0$ for all $z \neq 0$.
From the solution of the H–J–B equation (A12), it follows that
$$\frac{dV(z,t)}{dt} = -L(z,u^*). \qquad (36)$$
Substituting (29) into (36) gives
$$\frac{dV(z,t)}{dt} = -\frac{1}{2}\left\{z^T Q z + \left(B^T P z\right)^T R^{-1}\left(B^T P z\right)\right\} < 0, \qquad \forall\, t > 0,\ z \neq 0. \qquad (37)$$
The time derivative of the Lyapunov function is negative definite, and the assertion of the theorem then follows directly from the properties of the Lyapunov function [9].

IV. CMAC NEURAL-CONTROLLER DESIGN

The block diagram in Fig. 2 shows the major components that embody the CMAC neural controller. The external-control torques to the joints are composed of the optimal-feedback control law given in Theorem 1 plus the CMAC neural-network output components.

The nonlinear robot function can be represented by a CMAC neural network
$$h(x) = W^T \varphi(x) + \varepsilon(x), \qquad \|\varepsilon(x)\| \le \varepsilon_M \qquad (38)$$
where $\varphi(x)$ is a multidimensional receptive-field function for the CMAC. Then a functional estimate $\hat{h}(x)$ of $h(x)$ can be written as
$$\hat{h}(x) = \hat{W}^T \varphi(x). \qquad (39)$$
The external torque is given by
$$\tau(t) = \hat{W}^T \varphi(x) - u^*(t) - v(t) \qquad (40)$$
where $v(t)$ is a robustifying vector. Then (11) becomes
$$M(q)\dot{r}(t) = -V_m(q,\dot{q})r(t) + \tilde{W}^T \varphi(x) + \varepsilon(x) + \tau_d(t) + u^*(t) + v(t) \qquad (41)$$
26 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 30, NO. 1, FEBRUARY 2000

Fig. 2. CMAC neural controller based on the H–J–B optimization.

with the weight-estimation error $\tilde{W} = W - \hat{W}$. The state-space description of (41) can be given by
$$\dot{z}(t) = A(q,\dot{q})z(t) + B(q)\left\{u^*(t) + \tilde{W}^T\varphi(x) + \varepsilon(x) + \tau_d(t) + v(t)\right\} \qquad (42)$$
with $z$, $A$, and $B$ given in (19) and (20). Inserting the optimal-feedback control law (29) into (42), we obtain
$$\dot{z}(t) = \left[A - BR^{-1}B^T P\right]z(t) + B\left\{\tilde{W}^T\varphi(x) + \varepsilon(x) + \tau_d(t) + v(t)\right\}. \qquad (43)$$

Theorem 3: Let the control action $u^*(t)$ be provided by the optimal controller (29), with the robustifying term given by
$$v(t) = -k_z\, r(t)/\|r(t)\| \qquad (44)$$
with $k_z \ge b_d$ (a bound on the disturbance $\|\tau_d(t)\|$) and $r(t)$ defined as the instantaneous-performance measure (10). Let the adaptive learning rule for neural-network weights be given by
$$\dot{\hat{W}} = F\varphi(x)\left(B^T(q)Pz\right)^T - \kappa F\|z\|\hat{W} \qquad (45)$$
with $F = F^T > 0_{N_A \times N_A}$ and $\kappa > 0$. Then the errors $e(t)$, $r(t)$, and $\tilde{W}(t)$ are "uniformly ultimately bounded." Moreover, the errors $e(t)$ and $r(t)$ can be made arbitrarily small by adjusting weighting matrices.

Proof: Consider the following Lyapunov function:
$$L = \frac{1}{2}z^T P(q) z + \frac{1}{2}\mathrm{tr}\left(\tilde{W}^T F^{-1} \tilde{W}\right) \qquad (46)$$
where $K$ is positive definite and symmetric given by (31). The time derivative $\dot{L}$ of the Lyapunov function becomes
$$\dot{L} = z^T P(q)\dot{z} + \frac{1}{2}z^T \dot{P}(q) z + \mathrm{tr}\left(\tilde{W}^T F^{-1} \dot{\tilde{W}}\right). \qquad (47)$$
Evaluating (47) along the trajectory of (43) yields
$$\dot{L} = z^T P(q)Az - z^T P(q)BR^{-1}B^T P(q)z + \frac{1}{2}z^T \dot{P}(q) z + z^T P(q)B\left\{\tilde{W}^T\varphi(x) + \varepsilon(x) + \tau_d + v\right\} + \mathrm{tr}\left(\tilde{W}^T F^{-1}\dot{\tilde{W}}\right). \qquad (48)$$
Using $z^T P(q)Az = \frac{1}{2}z^T\left(A^T P(q) + P(q)A\right)z$, and from the Riccati equation (28), we have
$$A^T P + PA + \dot{P} = -Q + PBR^{-1}B^T P. \qquad (49)$$
Then the time derivative of the Lyapunov function becomes
$$\dot{L} = -\frac{1}{2}z^T Q z - \frac{1}{2}z^T P(q)BR^{-1}B^T P(q)z + z^T P(q)B\left\{\varepsilon(x) + \tau_d + v\right\} + \mathrm{tr}\left\{\tilde{W}^T\left(F^{-1}\dot{\tilde{W}} + \varphi(x)z^T P(q)B\right)\right\}. \qquad (50)$$
Applying the robustifying term (44) and the adaptive learning rule (45), we obtain
$$\dot{L} \le -\frac{1}{2}\|z\|^2\left\{\lambda_{\min}(Q) + \lambda_{\min}\left(R^{-1}\right)\right\} + \varepsilon_M\|z\| + \kappa\|z\|\left(W_M\|\tilde{W}\|_F - \|\tilde{W}\|_F^2\right). \qquad (51)$$
The following inequality is used in the previous derivation (with the ideal weights bounded by $\|W\|_F \le W_M$):
$$\mathrm{tr}\left\{\tilde{W}^T\left(W - \tilde{W}\right)\right\} = \left\langle \tilde{W}, W \right\rangle_F - \|\tilde{W}\|_F^2 \le \|\tilde{W}\|_F W_M - \|\tilde{W}\|_F^2. \qquad (52)$$
Completing the square terms yields
$$\dot{L} \le -\|z\|\left\{\frac{1}{2}\lambda_{\min}(Q)\|z\| + \kappa\left(\|\tilde{W}\|_F - \frac{W_M}{2}\right)^2 - \frac{\kappa W_M^2}{4} - \varepsilon_M\right\}. \qquad (53)$$
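To make the design loop concrete, the sketch below computes the gains of Theorem 1 and then performs one Euler step of the control law (29), the robustifying term (44), and the learning rule (45). All numerical values (weighting blocks, learning rates, receptive-field vector) are hypothetical, and $\Lambda = \frac{1}{2}K^{-1}Q_{11}$ is used as one particular solution of (32), valid when $Q_{11}$ is symmetric.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2

# --- Gains from Theorem 1 (hypothetical weighting blocks) ---
Q11 = np.diag([100.0, 100.0])
Q12 = -10.0 * np.eye(n)               # satisfies Q12 + Q12^T < 0
Q22 = 4.0 * np.eye(n)
K = -0.5 * (Q12 + Q12.T)              # eq. (31), positive definite
Lam = 0.5 * np.linalg.inv(K) @ Q11    # one solution of K Lam + Lam^T K = Q11 (32)
Rinv = Q22                            # eq. (30)
assert np.allclose(K @ Lam + Lam.T @ K, Q11)

# --- One control/update step (eqs. (29), (44), (45)) ---
N_A = 25                              # association points (example size)
phi = rng.random(N_A)                 # receptive-field vector at current x
What = np.zeros((N_A, n))             # CMAC weight estimates
F = 10.0 * np.eye(N_A)                # learning-rate matrix, F = F^T > 0
kappa, k_z, dt = 0.1, 1.0, 0.001

e = np.array([0.05, -0.02])
edot = np.array([0.0, 0.01])
r = edot + Lam @ e                    # filtered error (10); note B^T P z = r
z = np.concatenate([e, r])

u_star = -Rinv @ r                    # optimal auxiliary control (29)
v = -k_z * r / np.linalg.norm(r)      # robustifying term (44)
tau = What.T @ phi - u_star - v       # external torque (40)

# Adaptive learning rule (45), Euler-integrated over one step dt
What_dot = F @ np.outer(phi, r) - kappa * np.linalg.norm(z) * F @ What
What = What + dt * What_dot
```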