0.6 The Euclidean inner product and norm

0.6.1 Definitions. The scalar $\langle x, y\rangle = y^*x$ is the Euclidean inner product (standard inner product, usual inner product, scalar product, dot product) of $x, y \in \mathbb{C}^n$. The Euclidean norm (usual norm, Euclidean length) function on $\mathbb{C}^n$ is the real-valued function $\|x\|_2 = \langle x, x\rangle^{1/2} = (x^*x)^{1/2}$; two important properties of this function are that $\|x\|_2 > 0$ for all nonzero $x \in \mathbb{C}^n$ and $\|\alpha x\|_2 = |\alpha|\,\|x\|_2$ for all $x \in \mathbb{C}^n$ and all $\alpha \in \mathbb{C}$. The function $\langle\cdot, \cdot\rangle : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ is linear in the first argument and conjugate linear in the second; that is, $\langle\alpha x_1 + \beta x_2, y\rangle = \alpha\langle x_1, y\rangle + \beta\langle x_2, y\rangle$ and $\langle x, \alpha y_1 + \beta y_2\rangle = \bar{\alpha}\langle x, y_1\rangle + \bar{\beta}\langle x, y_2\rangle$ for all $\alpha, \beta \in \mathbb{C}$ and $y_1, y_2 \in \mathbb{C}^n$. If $V$ is a real or complex vector space and $f : V \times V \to \mathbb{C}$ is a function that is linear in its first argument and conjugate linear in its second argument, we say that $f$ is sesquilinear on $V$; $f$ is a semi-inner product on $V$ if it is sesquilinear on $V$ and $f(x, x) \ge 0$ for every $x \in V$; $f$ is an inner product on $V$ if it is sesquilinear on $V$ and $f(x, x) > 0$ for every nonzero $x \in V$. An inner product space is a pair $(V, f)$ in which $V$ is a real or complex vector space and $f$ is an inner product on $V$.

0.6.2 Orthogonality and orthonormality. Two vectors $x, y \in \mathbb{C}^n$ are orthogonal if $\langle x, y\rangle = 0$. In $\mathbb{R}^2$ and $\mathbb{R}^3$, "orthogonal" has the conventional geometric interpretation of "perpendicular." A list of vectors $x_1, \dots, x_m \in \mathbb{C}^n$ is said to be orthogonal if $\langle x_i, x_j\rangle = 0$ for all distinct $i, j \in \{1, \dots, m\}$. An orthogonal list of nonzero vectors is linearly independent. A vector whose Euclidean norm is 1 is said to be normalized (a unit vector). For any nonzero $x \in \mathbb{C}^n$, $x/\|x\|_2$ is a unit vector. An orthogonal list of vectors is an orthonormal list if each of its elements is a unit vector. An orthonormal list of vectors is linearly independent. Each of these concepts has a straightforward generalization to the context of an inner product space.

0.6.3 The Cauchy–Schwarz inequality. The Cauchy–Schwarz inequality states that $|\langle x, y\rangle| \le \|x\|_2\,\|y\|_2$ for all $x, y \in \mathbb{C}^n$, with equality if and only if one of the vectors is a scalar multiple of the other. The angle $\theta$ between two nonzero real vectors $x, y \in \mathbb{R}^n$ is defined by
$$\cos\theta = \frac{\langle x, y\rangle}{\|x\|_2\,\|y\|_2}, \qquad 0 \le \theta \le \pi \tag{0.6.3.1}$$

0.6.4 Gram–Schmidt orthonormalization. Any finite independent list of vectors in an inner product space may be replaced by an orthonormal list with the same span. This replacement may be carried out in many ways, but there is a systematic way to do so that has a useful special property. The Gram–Schmidt process starts with a list of vectors $x_1, \dots, x_n$ and (if the given list is linearly independent) produces an orthonormal list of vectors $z_1, \dots, z_n$ such that $\operatorname{span}\{z_1, \dots, z_k\} = \operatorname{span}\{x_1, \dots, x_k\}$ for each $k = 1, \dots, n$. The vectors $z_i$ may be calculated in turn as follows: Let $y_1 = x_1$ and normalize it: $z_1 = y_1/\|y_1\|_2$. Let $y_2 = x_2 - \langle x_2, z_1\rangle z_1$ ($y_2$ is orthogonal to $z_1$) and normalize it: $z_2 = y_2/\|y_2\|_2$. Once $z_1, \dots, z_{k-1}$ have been determined, the vector
$$y_k = x_k - \langle x_k, z_{k-1}\rangle z_{k-1} - \langle x_k, z_{k-2}\rangle z_{k-2} - \cdots - \langle x_k, z_1\rangle z_1$$
is orthogonal to $z_1, \dots, z_{k-1}$; normalize it: $z_k = y_k/\|y_k\|_2$. Continue until $k = n$. If we denote $Z = [z_1 \ \dots \ z_n]$ and $X = [x_1 \ \dots \ x_n]$, the Gram–Schmidt process gives a factorization $X = ZR$, in which the square matrix $R = [r_{ij}]$ is nonsingular and upper triangular; that is, $r_{ij} = 0$ whenever $i > j$.

If the vectors $x_1, \dots, x_k$ are orthonormal and the vectors $x_1, \dots, x_k, x_{k+1}, \dots, x_n$ are linearly independent, applying the Gram–Schmidt process to the latter list produces the list $x_1, \dots, x_k, z_{k+1}, \dots, z_n$ of orthonormal vectors.

The Gram–Schmidt process may be applied to any finite list of vectors, independent or not. If $x_1, \dots, x_n$ are linearly dependent, the Gram–Schmidt process produces a vector $y_k = 0$ for the least value of $k$ for which $x_k$ is a linear combination of $x_1, \dots, x_{k-1}$.
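The recursion just described translates directly into code. The following is a minimal NumPy sketch, not from the text: the function name, tolerance, and error handling are illustrative choices. It returns the factors $Z$ and $R$ of the factorization $X = ZR$ described above.

```python
import numpy as np

def gram_schmidt(X, tol=1e-12):
    """Classical Gram-Schmidt on the columns of X (assumed linearly independent).

    Returns Z with orthonormal columns and upper-triangular R with X = Z @ R.
    """
    X = np.asarray(X, dtype=complex)
    n = X.shape[1]
    Z = np.zeros_like(X)
    R = np.zeros((n, n), dtype=complex)
    for k in range(n):
        y = X[:, k].copy()
        for i in range(k):
            # r_ik = <x_k, z_i> = z_i^* x_k  (Euclidean inner product)
            R[i, k] = np.vdot(Z[:, i], X[:, k])
            y -= R[i, k] * Z[:, i]
        R[k, k] = np.linalg.norm(y)            # ||y_k||_2
        if abs(R[k, k]) < tol:
            # here x_k is (numerically) a linear combination of x_1,...,x_{k-1}
            raise ValueError(f"column {k} depends on its predecessors")
        Z[:, k] = y / R[k, k]
    return Z, R

# spot-check: Z^* Z = I and X = Z R with R upper triangular
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
Z, R = gram_schmidt(X)
assert np.allclose(Z.conj().T @ Z, np.eye(3)) and np.allclose(Z @ R, X)
```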
0.6.5 Orthonormal bases. An orthonormal basis of an inner product space is a basis whose elements constitute an orthonormal list. Since any finite ordered basis may be transformed with the Gram–Schmidt process to an orthonormal basis, any finite-dimensional inner product space has an orthonormal basis, and any orthonormal list may be extended to an orthonormal basis. Such a basis is pleasant to work with, since the cross terms in inner product calculations all vanish.

0.6.6 Orthogonal complements. Given any subset $S \subset \mathbb{C}^n$, the orthogonal complement of $S$ is the set $S^\perp = \{x \in \mathbb{C}^n : x^*y = 0 \text{ for all } y \in S\}$. Even if $S$ is not a subspace, $S^\perp$ is always a subspace. We have $(S^\perp)^\perp = \operatorname{span} S$, and $(S^\perp)^\perp = S$ if $S$ is a subspace. It is always the case that $\dim S^\perp + \dim (S^\perp)^\perp = n$. If $S_1$ and $S_2$ are subspaces, then $(S_1 + S_2)^\perp = S_1^\perp \cap S_2^\perp$.

For a given $A \in M_{m,n}$, $\operatorname{range} A$ is the orthogonal complement of $\operatorname{nullspace} A^*$. Therefore, for a given $b \in \mathbb{C}^m$, the linear system $Ax = b$ has a solution (not necessarily unique) if and only if $b^*z = 0$ for every $z \in \mathbb{C}^m$ such that $A^*z = 0$. This equivalence is sometimes stated as the Fredholm alternative (theorem of the alternative): exactly one of the following two statements is true: either (1) $Ax = b$ has a solution, or (2) $A^*y = 0$ has a solution such that $y^*b \ne 0$.

If $A \in M_{m,n}$ and $B \in M_{m,q}$, if $X \in M_{m,r}$ and $Y \in M_{m,s}$, and if $\operatorname{range} X = \operatorname{nullspace} A^*$ and $\operatorname{range} Y = \operatorname{nullspace} B^*$, then we have the following companion to (0.2.7.1) and (0.2.7.2):
$$\operatorname{range} A \cap \operatorname{range} B = \operatorname{nullspace} \begin{bmatrix} X^* \\ Y^* \end{bmatrix} \tag{0.6.6.1}$$
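The solvability criterion above is straightforward to test numerically: $Ax = b$ is consistent exactly when $b$ is orthogonal to a basis of $\operatorname{nullspace} A^*$. A small sketch, assuming SciPy's `null_space` is available; the matrix, vectors, and tolerance are illustrative.

```python
import numpy as np
from scipy.linalg import null_space

def is_consistent(A, b, tol=1e-10):
    """Ax = b is solvable iff b is orthogonal to nullspace(A^*)  (0.6.6)."""
    Z = null_space(A.conj().T)          # orthonormal basis for nullspace(A^*)
    return np.linalg.norm(Z.conj().T @ b) < tol

A = np.array([[1.0, 2.0], [2.0, 4.0], [0.0, 1.0]])   # rank 2; range(A) is a plane in R^3
b_in = A @ np.array([1.0, -1.0])                     # lies in range(A)
b_out = b_in + null_space(A.conj().T)[:, 0]          # add a component orthogonal to range(A)
print(is_consistent(A, b_in))    # True
print(is_consistent(A, b_out))   # False
```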
0.7 Partitioned sets and matrices

A partition of a set $S$ is a collection of subsets of $S$ such that each element of $S$ is a member of one and only one of the subsets. For example, a partition of the set $\{1, 2, \dots, n\}$ is a collection of subsets $\alpha_1, \dots, \alpha_t$ (called index sets) such that each integer between 1 and $n$ is in one and only one of the index sets. A sequential partition of $\{1, 2, \dots, n\}$ is a partition in which the index sets have the special form $\alpha_1 = \{1, \dots, i_1\}$, $\alpha_2 = \{i_1 + 1, \dots, i_2\}$, ..., $\alpha_t = \{i_{t-1} + 1, \dots, n\}$.

A partition of a matrix is a decomposition of the matrix into submatrices such that each entry of the original matrix is in one and only one of the submatrices. Partitioning of matrices is often a convenient device for perception of useful structure. For example, partitioning $B = [b_1 \ \dots \ b_n] \in M_n(\mathbf{F})$ according to its columns reveals the presentation $AB = [Ab_1 \ \dots \ Ab_n]$ of the matrix product, partitioned according to the columns of $AB$.

0.7.1 Submatrices. Let $A \in M_{m,n}(\mathbf{F})$. For index sets $\alpha \subseteq \{1, \dots, m\}$ and $\beta \subseteq \{1, \dots, n\}$, we denote by $A[\alpha, \beta]$ the (sub)matrix of entries that lie in the rows of $A$ indexed by $\alpha$ and the columns indexed by $\beta$. For example,
$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} [\{1, 3\}, \{1, 2, 3\}] = \begin{bmatrix} 1 & 2 & 3 \\ 7 & 8 & 9 \end{bmatrix}$$
If $\alpha = \beta$, the submatrix $A[\alpha] = A[\alpha, \alpha]$ is a principal submatrix of $A$. An $n$-by-$n$ matrix has $\binom{n}{k}$ distinct principal submatrices of size $k$.

For $A \in M_n(\mathbf{F})$ and $k \in \{1, \dots, n\}$, $A[\{1, \dots, k\}]$ is a leading principal submatrix and $A[\{k, \dots, n\}]$ is a trailing principal submatrix.

It is often convenient to indicate a submatrix or principal submatrix via deletion, rather than inclusion, of rows or columns. This may be accomplished by complementing the index sets. Let $\alpha^c = \{1, \dots, m\} \setminus \alpha$ and $\beta^c = \{1, \dots, n\} \setminus \beta$ denote the index sets complementary to $\alpha$ and $\beta$, respectively. Then $A[\alpha^c, \beta^c]$ is the submatrix obtained by deleting the rows indexed by $\alpha$ and the columns indexed by $\beta$. For example, the submatrix $A[\alpha, \emptyset^c]$ contains the rows of $A$ indexed by $\alpha$; $A[\emptyset^c, \beta]$ contains the columns of $A$ indexed by $\beta$.

The determinant of an $r$-by-$r$ submatrix of $A$ is called a minor; if we wish to indicate the size of the submatrix, we call its determinant a minor of size $r$. If the $r$-by-$r$ submatrix is a principal submatrix, then its determinant is a principal minor (of size $r$); if the submatrix is a leading principal submatrix, then its determinant is a leading principal minor; if the submatrix is a trailing principal submatrix, then its determinant is a trailing principal minor. By convention, the empty principal minor is 1; that is, $\det A[\emptyset] = 1$. A signed minor, such as those appearing in the Laplace expansion (0.3.1.1) $[(-1)^{i+j} \det A_{ij}]$, is called a cofactor; if we wish to indicate the size of the submatrix, we call its signed determinant a cofactor of size $r$.
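In NumPy, the submatrix $A[\alpha, \beta]$ is obtained by fancy indexing with `np.ix_` (using 0-based index sets). A brief sketch reproducing the example above; the helper function name is an illustrative convenience, not standard terminology.

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)     # [[1,2,3],[4,5,6],[7,8,9]]

# A[{1,3}, {1,2,3}] in the text's 1-based notation (0-based below)
alpha, beta = [0, 2], [0, 1, 2]
print(A[np.ix_(alpha, beta)])          # [[1 2 3], [7 8 9]]

def principal_minor(A, alpha):
    """det A[alpha] for a principal submatrix; alpha is a 0-based index set."""
    return np.linalg.det(A[np.ix_(alpha, alpha)])

print(principal_minor(A, [0, 1]))      # det [[1,2],[4,5]] = -3
```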
0.7.2 Partitions, block matrices, and multiplication. If $\alpha_1, \dots, \alpha_t$ constitute a partition of $\{1, \dots, m\}$ and $\beta_1, \dots, \beta_s$ constitute a partition of $\{1, \dots, n\}$, then the matrices $A[\alpha_i, \beta_j]$ form a partition of the matrix $A \in M_{m,n}(\mathbf{F})$, $1 \le i \le t$, $1 \le j \le s$. If $A \in M_{m,n}(\mathbf{F})$ and $B \in M_{n,p}(\mathbf{F})$ are partitioned so that the two partitions of $\{1, \dots, n\}$ coincide, the two matrix partitions are said to be conformal. In this event,
$$(AB)[\alpha_i, \gamma_j] = \sum_{k=1}^{s} A[\alpha_i, \beta_k]\,B[\beta_k, \gamma_j] \tag{0.7.2.1}$$
in which $\gamma_1, \dots, \gamma_u$ is any partition of $\{1, \dots, p\}$ and the respective collections of submatrices $A[\alpha_i, \beta_k]$ and $B[\beta_k, \gamma_j]$ are conformal partitions of $A$ and $B$, respectively. The left-hand side of (0.7.2.1) is a submatrix of the product $AB$ (calculated in the usual way), and each summand on the right-hand side is a standard matrix product. Thus, multiplication of conformally partitioned matrices mimics usual matrix multiplication. The sum of two partitioned matrices $A, B \in M_{m,n}(\mathbf{F})$ of the same size has a similarly pleasant representation if the partitions of their rows (respectively, of their columns) are the same:
$$(A + B)[\alpha_i, \beta_j] = A[\alpha_i, \beta_j] + B[\alpha_i, \beta_j]$$
If a matrix is partitioned by sequential partitions of its rows and columns, the resulting partitioned matrix is called a block matrix. For example, if the rows and columns of $A \in M_n(\mathbf{F})$ are partitioned by the same sequential partition $\alpha_1 = \{1, \dots, k\}$, $\alpha_2 = \{k + 1, \dots, n\}$, the resulting block matrix is
$$A = \begin{bmatrix} A[\alpha_1, \alpha_1] & A[\alpha_1, \alpha_2] \\ A[\alpha_2, \alpha_1] & A[\alpha_2, \alpha_2] \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$$
in which the blocks are $A_{ij} = A[\alpha_i, \alpha_j]$. Computations with block matrices are employed throughout the book; 2-by-2 block matrices are the most important and useful.
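A quick numerical spot-check of (0.7.2.1) for one choice of conformal partitions; the sizes and index sets below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 5, 6, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, p))

# sequential partitions: rows of A, shared middle indices, columns of B
a1, a2 = slice(0, 2), slice(2, m)      # partition of {1,...,m}
b1, b2 = slice(0, 3), slice(3, n)      # shared partition of {1,...,n}
g1, g2 = slice(0, 2), slice(2, p)      # partition of {1,...,p}

# (AB)[a1, g2] = A[a1,b1] B[b1,g2] + A[a1,b2] B[b2,g2]   (0.7.2.1)
lhs = (A @ B)[a1, g2]
rhs = A[a1, b1] @ B[b1, g2] + A[a1, b2] @ B[b2, g2]
assert np.allclose(lhs, rhs)
```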
0.7.3 The inverse of a partitioned matrix. It can be useful to know the corresponding blocks in the inverse of a partitioned nonsingular matrix $A$, that is, to present the inverse of a partitioned matrix in conformally partitioned form. This may be done in a variety of apparently different, but equivalent, ways, assuming that certain submatrices of $A \in M_n(\mathbf{F})$ and $A^{-1}$ are also nonsingular. For simplicity, let $A$ be partitioned as a 2-by-2 block matrix
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$$
with $A_{ii} \in M_{n_i}(\mathbf{F})$, $i = 1, 2$, and $n_1 + n_2 = n$. A useful expression for the correspondingly partitioned presentation of $A^{-1}$ is
$$A^{-1} = \begin{bmatrix} (A_{11} - A_{12}A_{22}^{-1}A_{21})^{-1} & A_{11}^{-1}A_{12}(A_{21}A_{11}^{-1}A_{12} - A_{22})^{-1} \\ A_{22}^{-1}A_{21}(A_{12}A_{22}^{-1}A_{21} - A_{11})^{-1} & (A_{22} - A_{21}A_{11}^{-1}A_{12})^{-1} \end{bmatrix} \tag{0.7.3.1}$$
assuming that all the relevant inverses exist. This expression for $A^{-1}$ may be verified by doing a partitioned multiplication by $A$ and then simplifying. In general index set notation, we may write
$$A^{-1}[\alpha] = \left(A[\alpha] - A[\alpha, \alpha^c]\,A[\alpha^c]^{-1}A[\alpha^c, \alpha]\right)^{-1}$$
and
$$A^{-1}[\alpha, \alpha^c] = A[\alpha]^{-1}A[\alpha, \alpha^c]\left(A[\alpha^c, \alpha]\,A[\alpha]^{-1}A[\alpha, \alpha^c] - A[\alpha^c]\right)^{-1} = \left(A[\alpha, \alpha^c]\,A[\alpha^c]^{-1}A[\alpha^c, \alpha] - A[\alpha]\right)^{-1}A[\alpha, \alpha^c]\,A[\alpha^c]^{-1}$$
again assuming that the relevant inverses exist. There is an intimate relationship between these representations and the Schur complement; see (0.8.5). Notice that $A^{-1}[\alpha]$ is a submatrix of $A^{-1}$, while $A[\alpha]^{-1}$ is the inverse of a submatrix of $A$; these two objects are not, in general, the same.
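A numerical spot-check of (0.7.3.1) on a random matrix, for which the relevant inverses exist with probability 1; the block sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2 = 3, 2
A = rng.standard_normal((n1 + n2, n1 + n2))
A11, A12 = A[:n1, :n1], A[:n1, n1:]
A21, A22 = A[n1:, :n1], A[n1:, n1:]

inv = np.linalg.inv
block = np.block([
    [inv(A11 - A12 @ inv(A22) @ A21), inv(A11) @ A12 @ inv(A21 @ inv(A11) @ A12 - A22)],
    [inv(A22) @ A21 @ inv(A12 @ inv(A22) @ A21 - A11), inv(A22 - A21 @ inv(A11) @ A12)],
])
assert np.allclose(block, inv(A))      # (0.7.3.1)
```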
0.7.4 The Sherman–Morrison–Woodbury formula. Suppose that a nonsingular matrix $A \in M_n(\mathbf{F})$ has a known inverse $A^{-1}$ and consider $B = A + XRY$, in which $X$ is $n$-by-$r$, $Y$ is $r$-by-$n$, and $R$ is $r$-by-$r$ and nonsingular. If $B$ and $R^{-1} + YA^{-1}X$ are nonsingular, then
$$B^{-1} = A^{-1} - A^{-1}X(R^{-1} + YA^{-1}X)^{-1}YA^{-1} \tag{0.7.4.1}$$
If $r$ is much smaller than $n$, then $R$ and $R^{-1} + YA^{-1}X$ may be much easier to invert than $B$. For example, if $x, y \in \mathbf{F}^n$ are nonzero vectors, $X = x$, $Y = y^T$, $y^TA^{-1}x \ne -1$, and $R = [1]$, then (0.7.4.1) becomes a formula for the inverse of a rank-1 adjustment to $A$:
$$(A + xy^T)^{-1} = A^{-1} - (1 + y^TA^{-1}x)^{-1}A^{-1}xy^TA^{-1} \tag{0.7.4.2}$$
In particular, if $B = I + xy^T$ for $x, y \in \mathbf{F}^n$ and $y^Tx \ne -1$, then $B^{-1} = I - (1 + y^Tx)^{-1}xy^T$.
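In practice, (0.7.4.1) is used to update a known inverse after a low-rank change at the cost of an $r$-by-$r$ solve rather than an $n$-by-$n$ one. A minimal sketch, with illustrative names and a randomly generated test problem:

```python
import numpy as np

def woodbury_update(A_inv, X, R, Y):
    """(A + X R Y)^{-1} from a known A^{-1}, via (0.7.4.1).

    Only the r-by-r matrix R^{-1} + Y A^{-1} X is inverted, r = X.shape[1].
    """
    K = np.linalg.inv(R) + Y @ A_inv @ X
    return A_inv - A_inv @ X @ np.linalg.solve(K, Y @ A_inv)

rng = np.random.default_rng(3)
n, r = 6, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)    # comfortably nonsingular
X, Y = rng.standard_normal((n, r)), rng.standard_normal((r, n))
R = np.eye(r)
B_inv = woodbury_update(np.linalg.inv(A), X, R, Y)
assert np.allclose(B_inv, np.linalg.inv(A + X @ R @ Y))
```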
0.7.5 Complementary nullities. Suppose that $A \in M_n(\mathbf{F})$ is nonsingular, let $\alpha$ and $\beta$ be nonempty subsets of $\{1, \dots, n\}$, and write $|\alpha| = r$ and $|\beta| = s$ for the cardinalities of $\alpha$ and $\beta$. The law of complementary nullities is
$$\operatorname{nullity}(A[\alpha, \beta]) = \operatorname{nullity}\left(A^{-1}[\beta^c, \alpha^c]\right) \tag{0.7.5.1}$$
which is equivalent to the rank identity
$$\operatorname{rank}(A[\alpha, \beta]) = \operatorname{rank}\left(A^{-1}[\beta^c, \alpha^c]\right) + r + s - n \tag{0.7.5.2}$$
Since we can permute rows and columns to place first the $r$ rows indexed by $\alpha$ and the $s$ columns indexed by $\beta$, it suffices to consider the presentations
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \quad\text{and}\quad A^{-1} = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$$
in which $A_{11}$ and $B_{11}^T$ are $r$-by-$s$ and $A_{22}$ and $B_{22}^T$ are $(n-r)$-by-$(n-s)$. Then (0.7.5.1) says that $\operatorname{nullity} A_{11} = \operatorname{nullity} B_{22}$.

The underlying principle here is very simple. Suppose that the nullity of $A_{11}$ is $k$. If $k \ge 1$, let the columns of $X \in M_{s,k}(\mathbf{F})$ be a basis for the null space of $A_{11}$. Since $A$ is nonsingular,
$$A \begin{bmatrix} X \\ 0 \end{bmatrix} = \begin{bmatrix} A_{11}X \\ A_{21}X \end{bmatrix} = \begin{bmatrix} 0 \\ A_{21}X \end{bmatrix}$$
has full rank, so $A_{21}X$ has $k$ independent columns. But
$$\begin{bmatrix} B_{12}(A_{21}X) \\ B_{22}(A_{21}X) \end{bmatrix} = A^{-1}\begin{bmatrix} 0 \\ A_{21}X \end{bmatrix} = A^{-1}A\begin{bmatrix} X \\ 0 \end{bmatrix} = \begin{bmatrix} X \\ 0 \end{bmatrix}$$
so $B_{22}(A_{21}X) = 0$ and hence $\operatorname{nullity} B_{22} \ge k = \operatorname{nullity} A_{11}$, a statement that is trivially correct if $k = 0$. A similar argument starting with $B_{22}$ shows that $\operatorname{nullity} A_{11} \ge \operatorname{nullity} B_{22}$. For a different approach, see (3.5.P13).

Of course, (0.7.5.1) also tells us that $\operatorname{nullity} A_{12} = \operatorname{nullity} B_{12}$, $\operatorname{nullity} A_{21} = \operatorname{nullity} B_{21}$, and $\operatorname{nullity} A_{22} = \operatorname{nullity} B_{11}$. If $r + s = n$, then $\operatorname{rank} A_{11} = \operatorname{rank} B_{22}$ and $\operatorname{rank} A_{22} = \operatorname{rank} B_{11}$, while if $n = 2r = 2s$, then we also have $\operatorname{rank} A_{12} = \operatorname{rank} B_{12}$ and $\operatorname{rank} A_{21} = \operatorname{rank} B_{21}$.
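A small numerical illustration of the rank identity (0.7.5.2); the index sets are arbitrary, and the check relies on `matrix_rank`'s default tolerance and on random submatrices having their generic (full) ranks.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
A = rng.standard_normal((n, n))
A_inv = np.linalg.inv(A)

alpha = [0, 1, 4]                       # |alpha| = r = 3  (0-based index sets)
beta = [1, 2]                           # |beta|  = s = 2
alpha_c = [i for i in range(n) if i not in alpha]
beta_c = [j for j in range(n) if j not in beta]

r, s = len(alpha), len(beta)
rank = np.linalg.matrix_rank
# rank A[alpha, beta] = rank A^{-1}[beta^c, alpha^c] + r + s - n   (0.7.5.2)
assert rank(A[np.ix_(alpha, beta)]) == rank(A_inv[np.ix_(beta_c, alpha_c)]) + r + s - n
```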