2.3 LU Decomposition and Its Applications 43 Isaacson,E.,and Keller,H.B.1966,Analysis of Numerical Methods (New York:Wiley),82.1. Johnson,L.W.,and Riess,R.D.1982,Numerica/Analysis,2nd ed.(Reading,MA:Addison- Wesley),$2.2.1. Westlake,J.R.1968,A Handbook of Numerical Matrix Inversion and Solution of Linear Equations (New York:Wiley). 2.3 LU Decomposition and Its Applications Suppose we are able to write the matrix A as a product of two matrices, 8= L.U=A (2.3.1) where L is lower triangular (has elements only on the diagonal and below)and U is upper triangular (has elements only on the diagonal and above).For the case of a 4 x 4 matrix A,for example,equation(2.3.1)would look like this: 011 0 0 0 311 12 613 B14 [a11 a12 a13 a14 server 2 a21 022 0 0 022 23 24 a21 a22 a23 a24 (Nort a31 a32 a33 0 0 0 334 a31 a32 a33 a34 041 a42 a43 044 0 0 344 a41 a42 a43 a44 America University Press. THE (2.3.2) ART We can use a decomposition such as(2.3.1)to solve the linear set Progra A·x=(LU)x=L·(Ux)=b (2.3.3) OF SCIENTIFIC( by first solving for the vector y such that 61 L·y=b (2.3.4) and then solving U·x=y (2.3.5) COMPUTING (ISBN 188810920 What is the advantage of breaking up one linear set into two successive ones? The advantage is that the solution of a triangular set of equations is quite trivial,as we have already seen in 82.2 (equation 2.2.4).Thus,equation(2.3.4)can be solved by forward substitution as follows, uurggoglrion Numerical Recipes 10-621 43106 b1 1= 011 (outside i- (2.3.6) North Software. = bi i=2,3,,N j=1 while(2.3.5)can then be solved by backsubstitution exactly as in equations(2.2.2)- (2.2.4), yN TN= BNN (2.3.7) i= i=N-1,N-2,,1 Bu
2.3 LU Decomposition and Its Applications 43 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). Isaacson, E., and Keller, H.B. 1966, Analysis of Numerical Methods (New York: Wiley), §2.1. Johnson, L.W., and Riess, R.D. 1982, Numerical Analysis, 2nd ed. (Reading, MA: AddisonWesley), §2.2.1. Westlake, J.R. 1968, A Handbook of Numerical Matrix Inversion and Solution of Linear Equations (New York: Wiley). 2.3 LU Decomposition and Its Applications Suppose we are able to write the matrix A as a product of two matrices, L · U = A (2.3.1) where L is lower triangular (has elements only on the diagonal and below) and U is upper triangular (has elements only on the diagonal and above). For the case of a 4 × 4 matrix A, for example, equation (2.3.1) would look like this: α11 000 α21 α22 0 0 α31 α32 α33 0 α41 α42 α43 α44 · β11 β12 β13 β14 0 β22 β23 β24 0 0 β33 β34 000 β44 = a11 a12 a13 a14 a21 a22 a23 a24 a31 a32 a33 a34 a41 a42 a43 a44 (2.3.2) We can use a decomposition such as (2.3.1) to solve the linear set A · x = (L · U) · x = L · (U · x) = b (2.3.3) by first solving for the vector y such that L · y = b (2.3.4) and then solving U · x = y (2.3.5) What is the advantage of breaking up one linear set into two successive ones? The advantage is that the solution of a triangular set of equations is quite trivial, as we have already seen in §2.2 (equation 2.2.4). Thus, equation (2.3.4) can be solved by forward substitution as follows, y1 = b1 α11 yi = 1 αii bi − i−1 j=1 αijyj i = 2, 3,...,N (2.3.6) while (2.3.5) can then be solved by backsubstitution exactly as in equations (2.2.2)– (2.2.4), xN = yN βNN xi = 1 βii yi − N j=i+1 βijxj i = N − 1, N − 2,..., 1 (2.3.7)
44 Chapter 2.Solution of Linear Algebraic Equations Equations(2.3.6)and(2.3.7)total (for each right-hand side b)N2 executions of an inner loop containing one multiply and one add.If we have N right-hand sides which are the unit column vectors (which is the case when we are inverting a matrix),then taking into account the leading zeros reduces the total execution count of (2.3.6)from N3 to N3,while (2.3.7)is unchanged at N3 Notice that,once we have the LU decomposition of A,we can solve with as many right-hand sides as we then care to,one at a time.This is a distinct advantage over the methods of $2.1 and $2.2. Performing the LU Decomposition How then can we solve for L and U,given A?First,we write out the i,ith component of equation(2.3.1)or(2.3.2).That component always is a sum beginning with ICAL 01B1j十···=ai对 RECIPES The number of terms in the sum depends,however,on whether i or j is the smaller number.We have.in fact.the three cases. 9 i<j: 011j+02B21+··+aiP对=ai (2.3.8) i=j:ai1B1j+ai2B2j+...+aiiBii=aij (2.3.9) 9 i>j: a1B1j+a2P2j+…+afi=a (2.3.10) 色 9 Equations(2.3.8)-(2.3.10)total N2 equations for the N2+N unknown a's and B's(the diagonal being represented twice).Since the number of unknowns is greater than the number ofequations,we are invited to specify N of the unknowns arbitrarily and then try to solve for the others.In fact,as we shall see,it is always possible to take ai≡1i=1,.,N (2.3.11) A surprising procedure,now,is Crout's algorithm,which quite trivially solves the set of N2+N equations(2.3.8)-(2.3.11)for all the a's and B's by just arranging 兰丝母 10621 the equations in a certain order!That order is as follows: Numerica .Set ai =1,i =1,...,N (equation 2.3.11). 43126 For each j=1,2,3,...,N do these two procedures:First,for i= 1,2,....j,use (2.3.8),(2.3.9),and (2.3.11)to solve for Bij,namely i1 North =a1 (2.3.12) k= (Wheni=1 in 2.3.12 the summation term is taken to mean zero.)Second. fori=j+1,j+2,...,N use (2.3.10)to solve for aii,namely 2.3.13) Be sure to do both procedures before going on to the next j
44 Chapter 2. Solution of Linear Algebraic Equations Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). Equations (2.3.6) and (2.3.7) total (for each right-hand side b) N 2 executions of an inner loop containing one multiply and one add. If we have N right-hand sides which are the unit column vectors (which is the case when we are inverting a matrix), then taking into account the leading zeros reduces the total execution count of (2.3.6) from 1 2N3 to 1 6N3, while (2.3.7) is unchanged at 1 2N3. Notice that, once we have the LU decomposition of A, we can solve with as many right-hand sides as we then care to, one at a time. This is a distinct advantage over the methods of §2.1 and §2.2. Performing the LU Decomposition How then can we solve for L and U, given A? First, we write out the i, jth component of equation (2.3.1) or (2.3.2). That component always is a sum beginning with αi1β1j + ··· = aij The number of terms in the sum depends, however, on whether i or j is the smaller number. We have, in fact, the three cases, i<j : αi1β1j + αi2β2j + ··· + αiiβij = aij (2.3.8) i = j : αi1β1j + αi2β2j + ··· + αiiβjj = aij (2.3.9) i>j : αi1β1j + αi2β2j + ··· + αijβjj = aij (2.3.10) Equations (2.3.8)–(2.3.10) total N 2 equations for the N 2 + N unknown α’s and β’s (the diagonal being represented twice). Since the number of unknowns is greater than the number of equations, we are invited to specify N of the unknowns arbitrarily and then try to solve for the others. In fact, as we shall see, it is always possible to take αii ≡ 1 i = 1,...,N (2.3.11) A surprising procedure, now, is Crout’s algorithm, which quite trivially solves the set of N2 + N equations (2.3.8)–(2.3.11) for all the α’s and β’s by just arranging the equations in a certain order! That order is as follows: • Set αii = 1, i = 1,...,N (equation 2.3.11). • For each j = 1, 2, 3,...,N do these two procedures: First, for i = 1, 2,...,j, use (2.3.8), (2.3.9), and (2.3.11) to solve for β ij , namely βij = aij − i−1 k=1 αikβkj . (2.3.12) (When i = 1 in 2.3.12 the summation term is taken to mean zero.) Second, for i = j + 1, j + 2,...,N use (2.3.10) to solve for αij , namely αij = 1 βjj aij − j−1 k=1 αikβkj . (2.3.13) Be sure to do both procedures before going on to the next j
2.3 LU Decomposition and Its Applications 45 a etc. http://www.nr.com or call 1-800-872- (including this one) internet ( ① diagonal elements subdiagonal elements -7423(North America to any server computer,is users to make one paper etc. only),or send to dir Copyright (C) from NUMERICAL RECIPES IN C:THE ART OF SCIENTIFIC COMPUTING(ISBN 1988-1992 by Cambridge University Press.Programs Figure 2.3.1.Crout's algorithm for LU decomposition of a matrix.Elements of the original matrix are modified in the order indicated by lower case letters:a,b,c.etc.Shaded boxes show the previously modified elements that are used in modifying two typical elements,each indicated by an"x". 1788-1982 If you work through a few iterations of the above procedure,you will see that the a's and B's that occur on the right-hand side of equations(2.3.12)and(2.3.13) .Further reproduction, Numerical Recipes 10-621 are already determined by the time they are needed.You will also see that every a is used only once and never again.This means that the corresponding a ij or Bij can be stored in the location that the a used to occupy:the decomposition is"in place." 43195 [The diagonal unity elements a(equation 2.3.11)are not stored at all.]In brief, (outside Crout's method fills in the combined matrix of a's and B's. North Software. 11 012613 3141 ing of 021 322323 524 (2.3.14) Q31 032 333 34 visit website machine L041 Q42 043 344」 by columns from left to right,and within each column from top to bottom (see Figure 2.3.1). What about pivoting?Pivoting(i.e.,selection of a salubrious pivot element for the division in equation 2.3.13)is absolutely essential for the stability of Crout's
2.3 LU Decomposition and Its Applications 45 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). c g i b d f h j diagonal elements subdiagonal elements etc. etc. x x a e Figure 2.3.1. Crout’s algorithm for LU decomposition of a matrix. Elements of the original matrix are modified in the order indicated by lower case letters: a, b, c, etc. Shaded boxes show the previously modified elements that are used in modifying two typical elements, each indicated by an “x”. If you work through a few iterations of the above procedure, you will see that the α’s and β’s that occur on the right-hand side of equations (2.3.12) and (2.3.13) are already determined by the time they are needed. You will also see that every a ij is used only once and never again. This means that the corresponding α ij or βij can be stored in the location that the a used to occupy: the decomposition is “in place.” [The diagonal unity elements αii (equation 2.3.11) are not stored at all.] In brief, Crout’s method fills in the combined matrix of α’s and β’s, β11 β12 β13 β14 α21 β22 β23 β24 α31 α32 β33 β34 α41 α42 α43 β44 (2.3.14) by columns from left to right, and within each column from top to bottom (see Figure 2.3.1). What about pivoting? Pivoting (i.e., selection of a salubrious pivot element for the division in equation 2.3.13) is absolutely essential for the stability of Crout’s
46 Chapter 2.Solution of Linear Algebraic Equations method.Only partial pivoting(interchange of rows)can be implemented efficiently. However this is enough to make the method stable.This means,incidentally,that we don't actually decompose the matrix A into LU form,but rather we decompose a rowwise permutation of A.(If we keep track of what that permutation is,this decomposition is just as useful as the original one would have been.) Pivoting is slightly subtle in Crout's algorithm.The key point to notice is that equation (2.3.12)in the case of i=j (its final application)is exactly the same as equation(2.3.13)except for the division in the latter equation;in both cases the upper limit of the sum isk =j-1(=i-1).This means that we don't have to commit ourselves as to whether the diagonal element Bi is the one that happens 8 to fall on the diagonal in the first instance,or whether one of the (undivided)ai's below it in the column,i=j+1,...,N,is to be"promoted"to become the diagonal B.This can be decided after all the candidates in the column are in hand.As you should be able to guess by now,we will choose the largest one as the diagonal B (pivot element),then do all the divisions by that element en masse.This is Crout's method with partial pivoting.Our implementation has one additional wrinkle:It MERICAI initially finds the largest element in each row,and subsequently (when it is looking for the maximal pivot element)scales the comparison as if we had initially scaled all the equations to make their maximum coefficient equal to unity;this is the implicit pivoting mentioned in $2.1. Press. THE ART #include <math.h> #include "nrutil.h" 9 #define TINY 1.0e-20 A small number. Program void ludcmp(float *a,int n,int *indx,float *d) Given a matrix a[1..n][1..n],this routine replaces it by the LU decomposition of a rowwise permutation of itself.a and n are input.a is output,arranged as in equation(2.3.14)above; indx [1..n]is an output vector that records the row permutation effected by the partial to dir pivoting:d is output as 1 depending on whether the number of row interchanges was even or odd,respectively.This routine is used in combination with lubksb to solve linear equations or invert a matrix. OF SCIENTIFIC COMPUTING(ISBN 1988-19920 int i,imax,,k; float big,dum,sum,tempi float *vv; vv stores the implicit scaling of each row vv=vector(1,n); *d=1.0; No row interchanges yet. Numerical Recipes 10-621 43106 for(i=1;1<=n;i++)[ Loop over rows to get the implicit scaling informa- big=0.0; tion. for (j=1;j<=n;j++) (outside 膜 if ((temp=fabs(a[i][j]))>big)big-temp; oftware. if (big ==0.0)nrerror("Singular matrix in routine ludcmp"); No nonzero largest element. v[1]=1.0/b1g; Save the scaling. for(j=1;j<=n;j++){ This is the loop over columns of Crout's method. for (i=1;i<j;i++) This is equation (2.3.12)except for i=j. sum=a[i][j]; for (k=1;k<i;k++)sum -a[i][k]*a[k][j]; a[i][j]=sum; b1g=0.0 Initialize for the search for largest pivot element. for (i=j;i<-n;i++){ This is i=j of equation(2.3.12)andi=j+1...N sum=a[i][j]; of equation (2.3.13). for (k=1;k<j;k++)
46 Chapter 2. Solution of Linear Algebraic Equations Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). method. Only partial pivoting (interchange of rows) can be implemented efficiently. However this is enough to make the method stable. This means, incidentally, that we don’t actually decompose the matrix A into LU form, but rather we decompose a rowwise permutation of A. (If we keep track of what that permutation is, this decomposition is just as useful as the original one would have been.) Pivoting is slightly subtle in Crout’s algorithm. The key point to notice is that equation (2.3.12) in the case of i = j (its final application) is exactly the same as equation (2.3.13) except for the division in the latter equation; in both cases the upper limit of the sum is k = j − 1 (= i − 1). This means that we don’t have to commit ourselves as to whether the diagonal element βjj is the one that happens to fall on the diagonal in the first instance, or whether one of the (undivided) α ij ’s below it in the column, i = j + 1,...,N, is to be “promoted” to become the diagonal β. This can be decided after all the candidates in the column are in hand. As you should be able to guess by now, we will choose the largest one as the diagonal β (pivot element), then do all the divisions by that element en masse. This is Crout’s method with partial pivoting. Our implementation has one additional wrinkle: It initially finds the largest element in each row, and subsequently (when it is looking for the maximal pivot element) scales the comparison as if we had initially scaled all the equations to make their maximum coefficient equal to unity; this is the implicit pivoting mentioned in §2.1. #include <math.h> #include "nrutil.h" #define TINY 1.0e-20 A small number. void ludcmp(float **a, int n, int *indx, float *d) Given a matrix a[1..n][1..n], this routine replaces it by the LU decomposition of a rowwise permutation of itself. a and n are input. a is output, arranged as in equation (2.3.14) above; indx[1..n] is an output vector that records the row permutation effected by the partial pivoting; d is output as ±1 depending on whether the number of row interchanges was even or odd, respectively. This routine is used in combination with lubksb to solve linear equations or invert a matrix. { int i,imax,j,k; float big,dum,sum,temp; float *vv; vv stores the implicit scaling of each row. vv=vector(1,n); *d=1.0; No row interchanges yet. for (i=1;i<=n;i++) { Loop over rows to get the implicit scaling informabig=0.0; tion. for (j=1;j<=n;j++) if ((temp=fabs(a[i][j])) > big) big=temp; if (big == 0.0) nrerror("Singular matrix in routine ludcmp"); No nonzero largest element. vv[i]=1.0/big; Save the scaling. } for (j=1;j<=n;j++) { This is the loop over columns of Crout’s method. for (i=1;i<j;i++) { This is equation (2.3.12) except for i = j. sum=a[i][j]; for (k=1;k<i;k++) sum -= a[i][k]*a[k][j]; a[i][j]=sum; } big=0.0; Initialize for the search for largest pivot element. for (i=j;i<=n;i++) { This is i = j of equation (2.3.12) and i = j + 1 ...N sum=a[i][j]; of equation (2.3.13). for (k=1;k<j;k++)
2.3 LU Decomposition and Its Applications 47 sum -a[i][k]*a[k][j]; a[i][i]=sum; if (dum=vv[i]*fabs(sum))>=big){ Is the figure of merit for the pivot better than the best so far? big=dum; imax=i; if (i !imax){ Do we need to interchange rows? for(k=1;k<=n;k++){ Yes,do so... dum=a [imax][k]; a[imax][k]=a[j][k]; http://www.nr. a[j][k]=dum; *d=-(*d) .and change the parity of d. 83g vv[imax]=vv[j]; Also interchange the scale factor. granted for 2 indx[i]=imax; 1.800 if (a[j][j]==0.0)a[j][j]=TINY; If the pivot element is zero the matrix is singular (at least to the precision of the algorithm).For some applications on singular matrices,it is desirable to substitute from NUMERICAL RECIPESI 19881992 9 TINY for zero if (j !n){ Now,finally,divide by the pivot element. dum=1.0/(a[j][j]); (Nort server for(1=j+1;1<=n;1+)a[i][j]*=dum; THE Go back for the next column in the reduction. America computer, free_vector(vv,1,n); ART 9 Programs Here is the routine for forward substitution and backsubstitution,implementing equations (2.3.6)and (2.3.7). void lubksb(float **a,int n,int *indx,float b[]) to dir Solves the set of n linear equations A.X=B.Here a[1..n][1..n]is input,not as the matrix A but rather as its LU decomposition,determined by the routine ludcmp.indx[1..n]is input as the permutation vector returned by ludcmp.b[1..n]is input as the right-hand side vector SCIENTIFIC COMPUTING(ISBN B,and returns with the solution vector X.a,n,and indx are not modified by this routine 1881892 and can be left in place for successive calls with different right-hand sides b.This routine takes into account the possibility that b will begin with many zero elements,so it is efficient for use in matrix inversion. 10-621 int i,ii=0,ip,j; Numerical Recipes -43108 float sum; for (i=1;i<=n;i++){ When ii is set to a positive value,it will become the ip=indx[i]; index of the first nonvanishing element of b.We now (outside sum=b[ip]; do the forward substitution,equation (2.3.6).The Software. b[ip]=b[i]; only new wrinkle is to unscramble the permutation 1f(11) as we go. for (j=ii;j<=i-1;j++)sum-=a[i][j]*b[j]; else if (sum)ii=i; A nonzero element was encountered,so from now on we b[i]-sum; will have to do the sums in the loop above. for(i=n;1>=1;1-)[ Now we do the backsubstitution,equation (2.3.7). sum=b[i]; for (j=i+1;j<=n;j++)sum -a[i][j]*b[j]; b[i]=sum/a[i][i]; Store a component of the solution vector X. All done!
2.3 LU Decomposition and Its Applications 47 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). sum -= a[i][k]*a[k][j]; a[i][j]=sum; if ( (dum=vv[i]*fabs(sum)) >= big) { Is the figure of merit for the pivot better than the best so far? big=dum; imax=i; } } if (j != imax) { Do we need to interchange rows? for (k=1;k<=n;k++) { Yes, do so... dum=a[imax][k]; a[imax][k]=a[j][k]; a[j][k]=dum; } *d = -(*d); ...and change the parity of d. vv[imax]=vv[j]; Also interchange the scale factor. } indx[j]=imax; if (a[j][j] == 0.0) a[j][j]=TINY; If the pivot element is zero the matrix is singular (at least to the precision of the algorithm). For some applications on singular matrices, it is desirable to substitute TINY for zero. if (j != n) { Now, finally, divide by the pivot element. dum=1.0/(a[j][j]); for (i=j+1;i<=n;i++) a[i][j] *= dum; } } Go back for the next column in the reduction. free_vector(vv,1,n); } Here is the routine for forward substitution and backsubstitution, implementing equations (2.3.6) and (2.3.7). void lubksb(float **a, int n, int *indx, float b[]) Solves the set of n linear equations A·X = B. Here a[1..n][1..n] is input, not as the matrix A but rather as its LU decomposition, determined by the routine ludcmp. indx[1..n] is input as the permutation vector returned by ludcmp. b[1..n] is input as the right-hand side vector B, and returns with the solution vector X. a, n, and indx are not modified by this routine and can be left in place for successive calls with different right-hand sides b. This routine takes into account the possibility that b will begin with many zero elements, so it is efficient for use in matrix inversion. { int i,ii=0,ip,j; float sum; for (i=1;i<=n;i++) { When ii is set to a positive value, it will become the index of the first nonvanishing element of b. We now do the forward substitution, equation (2.3.6). The only new wrinkle is to unscramble the permutation as we go. ip=indx[i]; sum=b[ip]; b[ip]=b[i]; if (ii) for (j=ii;j<=i-1;j++) sum -= a[i][j]*b[j]; else if (sum) ii=i; A nonzero element was encountered, so from now on we b[i]=sum; will have to do the sums in the loop above. } for (i=n;i>=1;i--) { Now we do the backsubstitution, equation (2.3.7). sum=b[i]; for (j=i+1;j<=n;j++) sum -= a[i][j]*b[j]; b[i]=sum/a[i][i]; Store a component of the solution vector X. } All done! }