rank A21 = rank B21. Finally, (0.7.5.2) tells us that the rank of an r-by-s submatrix of an n-by-n nonsingular matrix is at least r + s − n.

0.7.6 Rank in a partitioned matrix and rank-principal matrices. Partition A ∈ Mn(F) as

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad A_{11} \in M_r(F),\quad A_{22} \in M_{n-r}(F)$$

If A11 is nonsingular, then of course the first block row [A11 A12] and the first block column of A each have rank r. Remarkably, the converse is true:

$$\text{if } \operatorname{rank} A = \operatorname{rank}\,[A_{11}\ \ A_{12}] = \operatorname{rank}\begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix} = r, \text{ then } A_{11} \text{ is nonsingular} \tag{0.7.6.1}$$

This follows from (0.4.6(c)): If A11 is singular, then rank A11 = k < r, and there are nonsingular S, T ∈ Mr(F) such that

$$S A_{11} T = \begin{bmatrix} I_k & 0 \\ 0 & 0_{r-k} \end{bmatrix}$$

Therefore,

$$\hat{A} = \begin{bmatrix} S & 0 \\ 0 & I_{n-r} \end{bmatrix} A \begin{bmatrix} T & 0 \\ 0 & I_{n-r} \end{bmatrix} = \begin{bmatrix} S A_{11} T & S A_{12} \\ A_{21} T & A_{22} \end{bmatrix}$$

has rank r, as do its first block row and column. The first block row of Â has r linearly independent rows, so its rth row is nonzero; since the rth row of S A11 T is zero, some column of S A12 must have a nonzero rth entry. The first r columns of Â are linearly independent and each has a zero entry in position r, so adjoining that column of Â produces at least r + 1 linearly independent columns. This contradicts rank Â = rank A = r, so A11 must be nonsingular.

Let A ∈ Mm,n(F) and suppose that rank A = r > 0. Let A = XY^T be a full-rank factorization with X ∈ Mm,r(F) and Y ∈ Mn,r(F); see (0.4.6(c)). Let α, β ⊆ {1, ..., m} and γ, δ ⊆ {1, ..., n} be index sets of cardinality r. Then A[α, γ] = X[α, ∅^c] Y[γ, ∅^c]^T ∈ Mr(F), which is nonsingular whenever rank X[α, ∅^c] = rank Y[γ, ∅^c] = r. The multiplicativity property (0.3.5) ensures that

$$\det A[\alpha, \gamma]\, \det A[\beta, \delta] = \det A[\alpha, \delta]\, \det A[\beta, \gamma] \tag{0.7.6.2}$$

Suppose that A ∈ Mn(F) and rank A = r. We say that A is rank principal if it has a nonsingular r-by-r principal submatrix. It follows from (0.7.6.1) that if there is some index set α ⊆ {1, ..., n} of cardinality r such that

$$\operatorname{rank} A = \operatorname{rank} A[\alpha, \varnothing^c] = \operatorname{rank} A[\varnothing^c, \alpha] \tag{0.7.6.3}$$

(that is, if there are r linearly independent rows of A such that the corresponding r columns are linearly independent), then A is rank principal; moreover, A[α] is nonsingular. If A ∈ Mn(F) is symmetric or skew symmetric, or if A ∈ Mn(C) is Hermitian or skew Hermitian, then rank A[α, ∅^c] = rank A[∅^c, α] for every index set α, so A satisfies (0.7.6.3) and is therefore rank principal.
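As an illustrative aside (not part of the text), the following short numerical sketch exercises (0.7.6.1). It assumes Python with NumPy; the block sizes and the random construction are hypothetical choices, and the rank statements hold with probability 1 for Gaussian data.

```python
# Sketch of (0.7.6.1): if rank A equals the rank of the first block row and of the
# first block column, and that common value is r, then the leading r-by-r block
# A11 must be nonsingular.
import numpy as np

rng = np.random.default_rng(0)

r, n = 3, 5
X = rng.standard_normal((n, r))
Y = rng.standard_normal((n, r))
A = X @ Y.T                      # rank A = 3 with probability 1

A11 = A[:r, :r]
block_row = A[:r, :]             # [A11  A12]
block_col = A[:, :r]             # first block column of A

print(np.linalg.matrix_rank(A),            # 3
      np.linalg.matrix_rank(block_row),    # 3
      np.linalg.matrix_rank(block_col))    # 3
print(abs(np.linalg.det(A11)) > 1e-12)     # True: A11 is nonsingular, as (0.7.6.1) predicts
```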
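A similar aside (again not from the text; NumPy is assumed, and the helper minor() is ad hoc) checks the determinantal identity (0.7.6.2) for a random rank-2 matrix and index sets of cardinality 2.

```python
# Sketch of (0.7.6.2): for a rank-r matrix A and row/column index sets of cardinality r,
# det A[alpha, gamma] * det A[beta, delta] = det A[alpha, delta] * det A[beta, gamma].
import numpy as np

rng = np.random.default_rng(1)
r, m, n = 2, 4, 5
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # rank A = 2

def minor(A, rows, cols):
    """Determinant of the submatrix of A with the given row and column index sets."""
    return np.linalg.det(A[np.ix_(rows, cols)])

alpha, beta = [0, 2], [1, 3]        # row index sets of cardinality r
gamma, delta = [0, 4], [2, 3]       # column index sets of cardinality r

lhs = minor(A, alpha, gamma) * minor(A, beta, delta)
rhs = minor(A, alpha, delta) * minor(A, beta, gamma)
print(np.isclose(lhs, rhs))          # True
```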
0.7.7 Commutativity, anticommutativity, and block diagonal matrices. Two matrices A, B ∈ Mn(F) are said to commute if AB = BA. Commutativity is not typical, but one important instance is encountered frequently. Suppose that $\Lambda = [\Lambda_{ij}]_{i,j=1}^{s} \in M_n(F)$ is a block matrix in which $\Lambda_{ij} = 0$ if i ≠ j; $\Lambda_{ii} = \lambda_i I_{n_i}$ for some λi ∈ F for each i = 1, ..., s; and λi ≠ λj if i ≠ j. Partition $B = [B_{ij}]_{i,j=1}^{s} \in M_n(F)$ conformally with Λ. Then ΛB = BΛ if and only if λi Bij = Bij λj for each i, j = 1, ..., s, that is, (λi − λj) Bij = 0 for each i, j = 1, ..., s. These identities are satisfied if and only if Bij = 0 whenever i ≠ j. Thus, Λ commutes with B if and only if B is block diagonal conformal with Λ; see (0.9.2).

Two matrices A, B ∈ Mn(F) are said to anticommute if AB = −BA. For example, the matrices

$$\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

anticommute.

0.7.8 The vec mapping. Partition a matrix A ∈ Mm,n(F) according to its columns: A = [a1 ... an]. The mapping vec : Mm,n(F) → F^{mn} is

$$\operatorname{vec} A = [a_1^T \ \cdots \ a_n^T]^T$$

that is, vec A is the vector obtained by stacking the columns of A, left to right. The vec operator can be a convenient tool in problems involving matrix equations.

0.8 Determinants again

Some additional facts about and identities for the determinant are useful for reference.

0.8.1 Compound matrices. Let A ∈ Mm,n(F). Let α ⊆ {1, ..., m} and β ⊆ {1, ..., n} be index sets of cardinality r, with 1 ≤ r ≤ min{m, n}. The $\binom{m}{r}$-by-$\binom{n}{r}$ matrix whose (α, β) entry is det A[α, β] is called the rth compound matrix of A and is denoted by Cr(A). In forming the rows and columns of Cr(A), we arrange the index sets lexicographically, that is, {1, 2, 4} before {1, 2, 5} before {1, 3, 4}, and so on. For example, if

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{bmatrix} \tag{0.8.1.0}$$

then

$$C_2(A) = \begin{bmatrix} \det\begin{bmatrix}1&2\\4&5\end{bmatrix} & \det\begin{bmatrix}1&3\\4&6\end{bmatrix} & \det\begin{bmatrix}2&3\\5&6\end{bmatrix} \\ \det\begin{bmatrix}1&2\\7&8\end{bmatrix} & \det\begin{bmatrix}1&3\\7&10\end{bmatrix} & \det\begin{bmatrix}2&3\\8&10\end{bmatrix} \\ \det\begin{bmatrix}4&5\\7&8\end{bmatrix} & \det\begin{bmatrix}4&6\\7&10\end{bmatrix} & \det\begin{bmatrix}5&6\\8&10\end{bmatrix} \end{bmatrix} = \begin{bmatrix} -3 & -6 & -3 \\ -6 & -11 & -4 \\ -3 & -2 & 2 \end{bmatrix}$$

If A ∈ Mm,k(F), B ∈ Mk,n(F), and r ≤ min{m, k, n}, it follows from the Cauchy–Binet formula (0.8.7) that

$$C_r(AB) = C_r(A)\, C_r(B) \tag{0.8.1.1}$$

which is the multiplicativity property of the rth compound matrix.
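To make the example above concrete, here is a short sketch (an aside, not from the text; NumPy is assumed, and compound() is an ad hoc helper built directly from the definition) that reproduces C2(A) for the matrix in (0.8.1.0).

```python
# Compute the r-th compound matrix as the array of r-by-r minors,
# with row and column index sets listed in lexicographic order.
import numpy as np
from itertools import combinations

def compound(A, r):
    rows = list(combinations(range(A.shape[0]), r))
    cols = list(combinations(range(A.shape[1]), r))
    return np.array([[np.linalg.det(A[np.ix_(a, b)]) for b in cols] for a in rows])

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 10]], dtype=float)
print(np.round(compound(A, 2)))
# [[ -3.  -6.  -3.]
#  [ -6. -11.  -4.]
#  [ -3.  -2.   2.]]
```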
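The same construction can be used to spot-check the multiplicativity property (0.8.1.1) numerically; the sketch below (an aside, assuming NumPy, with random rectangular matrices chosen only for illustration) does so for r = 2.

```python
# Check the Cauchy-Binet consequence (0.8.1.1): Cr(AB) = Cr(A) Cr(B).
import numpy as np
from itertools import combinations

def compound(A, r):
    rows = list(combinations(range(A.shape[0]), r))
    cols = list(combinations(range(A.shape[1]), r))
    return np.array([[np.linalg.det(A[np.ix_(a, b)]) for b in cols] for a in rows])

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))    # A in M_{4,3}
B = rng.standard_normal((3, 5))    # B in M_{3,5}
r = 2                              # r <= min{4, 3, 5}

print(np.allclose(compound(A @ B, r), compound(A, r) @ compound(B, r)))   # True
```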
We define C0(A) = 1. We have C1(A) = A; if A ∈ Mn(F), then Cn(A) = det A.

If A ∈ Mm,k(F) and t ∈ F, then $C_r(tA) = t^r C_r(A)$.
If 1 ≤ r ≤ n, then $C_r(I_n) = I_{\binom{n}{r}} \in M_{\binom{n}{r}}$.
If A ∈ Mn is nonsingular and 1 ≤ r ≤ n, then $C_r(A)^{-1} = C_r(A^{-1})$.
If A ∈ Mn and 1 ≤ r ≤ n, then $\det C_r(A) = (\det A)^{\binom{n-1}{r-1}}$.
If A ∈ Mm,n(F) and r = rank A, then rank Cr(A) = 1.
If A ∈ Mm,n(F) and 1 ≤ r ≤ min{m, n}, then $C_r(A^T) = C_r(A)^T$.
If A ∈ Mm,n(C) and 1 ≤ r ≤ min{m, n}, then $C_r(A^*) = C_r(A)^*$.

If Δ = [dij] ∈ Mn(F) is upper (respectively, lower) triangular (see (0.9.3)), then Cr(Δ) is upper (respectively, lower) triangular; its main diagonal entries are the $\binom{n}{r}$ possible products of r entries chosen from the list d11, ..., dnn, that is, they are the $\binom{n}{r}$ scalars $d_{i_1 i_1} \cdots d_{i_r i_r}$ such that $1 \le i_1 < \cdots < i_r \le n$, arranged lexicographically. Consequently, if D = diag(d1, ..., dn) ∈ Mn(F) is diagonal, then so is Cr(D); its main diagonal entries are the $\binom{n}{r}$ possible products of r entries chosen from the list d1, ..., dn, that is, they are the $\binom{n}{r}$ scalars $d_{i_1} \cdots d_{i_r}$ such that $1 \le i_1 < \cdots < i_r \le n$, arranged lexicographically. See Chapter 6 of Fiedler (1986) for a detailed discussion of compound matrices.

0.8.2 The adjugate and the inverse. If A ∈ Mn(F) and n ≥ 2, the transposed matrix of cofactors of A,

$$\operatorname{adj} A = \left[(-1)^{i+j} \det A[\{j\}^c, \{i\}^c]\right] \tag{0.8.2.0}$$

is the adjugate of A; it is also called the classical adjoint of A. For example,

$$\operatorname{adj}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

A calculation using the Laplace expansion for the determinant reveals the basic property of the adjugate:

$$(\operatorname{adj} A)\, A = A\, (\operatorname{adj} A) = (\det A)\, I \tag{0.8.2.1}$$

Thus, adj A is nonsingular if A is nonsingular, and $\det(\operatorname{adj} A) = (\det A)^{n-1}$. If A is nonsingular, then

$$\operatorname{adj} A = (\det A) A^{-1}, \quad\text{that is,}\quad A^{-1} = (\det A)^{-1} \operatorname{adj} A \tag{0.8.2.2}$$

For example,

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = (ad - bc)^{-1} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

if ad ≠ bc. In particular, $\operatorname{adj}(A^{-1}) = A/\det A = (\operatorname{adj} A)^{-1}$.

If A is singular and rank A ≤ n − 2, then every minor of A of size n − 1 is zero, so adj A = 0. If A is singular and rank A = n − 1, then some minor of A of size n − 1 is nonzero, so adj A ≠ 0 and rank adj A ≥ 1. Moreover, some list of n − 1 columns of A is linearly independent, so the identity (adj A) A = (det A) I = 0 ensures that the null space of adj A has dimension at least n − 1 and hence rank adj A ≤ 1. We conclude that rank adj A = 1. The full-rank factorization (0.4.6(e)) ensures that adj A = αxy^T for some nonzero α ∈ F and nonzero x, y ∈ F^n that are determined as follows: Compute

$$\alpha (Ax) y^T = A (\operatorname{adj} A) = 0 = (\operatorname{adj} A) A = \alpha x (y^T A)$$
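As an illustrative aside (not from the text; NumPy is assumed, and adjugate() is an ad hoc helper computed entrywise from (0.8.2.0)), the following sketch checks the basic property (0.8.2.1) and the rank-one behavior of the adjugate of a singular matrix of rank n − 1.

```python
# adj A via cofactors: (adj A)_{ij} = (-1)^{i+j} det A[{j}^c, {i}^c].
import numpy as np

def adjugate(A):
    n = A.shape[0]
    adj = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            # delete row j and column i, as in (0.8.2.0)
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
print(np.allclose(adjugate(A) @ A, np.linalg.det(A) * np.eye(3)))   # True, i.e. (0.8.2.1)

S = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 1., 1.]])                   # singular, rank 2 = n - 1
print(np.linalg.matrix_rank(adjugate(S)))      # 1, as the text asserts
```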
and conclude that Ax = 0 and y^T A = 0, that is, x (respectively, y) is determined up to a nonzero scalar factor as a nonzero element of the one-dimensional null space of A (respectively, A^T).

The function A → adj A is continuous on Mn (each entry of adj A is a multinomial in the entries of A) and every matrix in Mn is a limit of nonsingular matrices, so properties of the adjugate can be deduced from continuity and properties of the inverse function. For example, if A, B ∈ Mn are nonsingular, then adj(AB) = (det AB)(AB)^{-1} = (det A)(det B) B^{-1} A^{-1} = (det B) B^{-1} (det A) A^{-1} = (adj B)(adj A). Continuity then ensures that

$$\operatorname{adj}(AB) = (\operatorname{adj} B)(\operatorname{adj} A) \quad\text{for all } A, B \in M_n \tag{0.8.2.3}$$

For any c ∈ F and any A ∈ Mn(F), adj(cA) = c^{n−1} adj A. In particular, adj(cI) = c^{n−1} I and adj 0 = 0.

If A is nonsingular, then

$$\operatorname{adj}(\operatorname{adj} A) = \operatorname{adj}\big((\det A)A^{-1}\big) = (\det A)^{n-1} \operatorname{adj} A^{-1} = (\det A)^{n-1} (A/\det A) = (\det A)^{n-2} A$$

so continuity ensures that

$$\operatorname{adj}(\operatorname{adj} A) = (\det A)^{n-2} A \quad\text{for all } A \in M_n \tag{0.8.2.4}$$

If A + B is nonsingular, then A(A + B)^{-1} B = B(A + B)^{-1} A, so continuity ensures that

$$A \operatorname{adj}(A + B)\, B = B \operatorname{adj}(A + B)\, A \quad\text{for all } A, B \in M_n \tag{0.8.2.5}$$

Let A, B ∈ Mn and suppose that A commutes with B. If A is nonsingular, then BA^{-1} = A^{-1}ABA^{-1} = A^{-1}BAA^{-1} = A^{-1}B, so A^{-1} commutes with B. But BA^{-1} = (det A)^{-1} B adj A and A^{-1}B = (det A)^{-1} (adj A) B, so adj A commutes with B. Continuity ensures that adj A commutes with B whenever A commutes with B, even if A is singular.

If A = [aij] is upper triangular, then adj A = [bij] is upper triangular and each $b_{ii} = \prod_{j \ne i} a_{jj}$; if A is diagonal, then so is adj A.

The adjugate is the transpose of the gradient of det A:

$$\operatorname{adj} A = \left[\frac{\partial}{\partial a_{ij}} \det A\right]^T \tag{0.8.2.6}$$

If A is nonsingular, it follows from (0.8.2.6) that

$$\left[\frac{\partial}{\partial a_{ij}} \det A\right]^T = (\det A)\, A^{-1} \tag{0.8.2.7}$$

If A ∈ Mn is nonsingular, then adj A^T = (det A^T) A^{-T} = (det A) A^{-T} = ((det A) A^{-1})^T = (adj A)^T. Continuity ensures that

$$\operatorname{adj} A^T = (\operatorname{adj} A)^T \quad\text{for all } A \in M_n(F) \tag{0.8.2.8}$$

A similar argument shows that

$$\operatorname{adj} A^* = (\operatorname{adj} A)^* \quad\text{for all } A \in M_n \tag{0.8.2.9}$$
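A brief numerical aside (not from the text; NumPy is assumed): for a nonsingular A one may compute adj A as (det A)A^{-1} by (0.8.2.2), which makes it easy to spot-check (0.8.2.3), (0.8.2.4), and (0.8.2.8) on random matrices, which are nonsingular with probability 1.

```python
# Spot-check three adjugate identities using adj A = (det A) A^{-1} for nonsingular A.
import numpy as np

def adjugate(A):
    return np.linalg.det(A) * np.linalg.inv(A)    # valid when A is nonsingular

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

print(np.allclose(adjugate(A @ B), adjugate(B) @ adjugate(A)))              # (0.8.2.3)
print(np.allclose(adjugate(adjugate(A)), np.linalg.det(A) ** (n - 2) * A))  # (0.8.2.4)
print(np.allclose(adjugate(A.T), adjugate(A).T))                            # (0.8.2.8)
```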
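Another aside (not from the text; NumPy assumed): since det A is an affine function of each individual entry, a central difference quotient recovers ∂(det A)/∂a_ij essentially exactly, so (0.8.2.6) can be checked numerically.

```python
# Finite-difference check of (0.8.2.6): the gradient of det A is (adj A)^T.
import numpy as np

rng = np.random.default_rng(4)
n = 3
A = rng.standard_normal((n, n))
adjA = np.linalg.det(A) * np.linalg.inv(A)       # adj A for nonsingular A

h = 1e-6
grad = np.empty((n, n))
for i in range(n):
    for j in range(n):
        Ap, Am = A.copy(), A.copy()
        Ap[i, j] += h
        Am[i, j] -= h
        grad[i, j] = (np.linalg.det(Ap) - np.linalg.det(Am)) / (2 * h)

print(np.allclose(grad.T, adjA, atol=1e-6))      # True: grad(det A) = (adj A)^T
```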
Let A = [a1 ... an] ∈ Mn(F) be partitioned according to its columns and let b ∈ F^n. Define

$$A \leftarrow_i b = [a_1 \ \ldots \ a_{i-1} \ \ b \ \ a_{i+1} \ \ldots \ a_n]$$

that is, $(A \leftarrow_i b)$ denotes the matrix whose ith column is b and whose remaining columns coincide with those of A. Examination of the Laplace expansion (0.3.1.1) of $\det(A \leftarrow_i b)$ by minors along column i reveals that it is the ith entry of the vector (adj A) b, that is,

$$\left[\det(A \leftarrow_i b)\right]_{i=1}^{n} = (\operatorname{adj} A)\, b \tag{0.8.2.10}$$

Applying this vector identity to each column of C = [c1 ... cn] ∈ Mn(F) gives the matrix identity

$$\left[\det(A \leftarrow_i c_j)\right]_{i,j=1}^{n} = (\operatorname{adj} A)\, C \tag{0.8.2.11}$$

0.8.3 Cramer's rule. Cramer's rule is a useful way to present analytically a particular entry of the solution to Ax = b when A ∈ Mn(F) is nonsingular. The identity

$$A \left[\det(A \leftarrow_i b)\right]_{i=1}^{n} = A (\operatorname{adj} A)\, b = (\det A)\, b$$

follows from (0.8.2.10) and (0.8.2.1). If det A ≠ 0, we obtain Cramer's rule

$$x_i = \frac{\det(A \leftarrow_i b)}{\det A}$$

for the ith entry xi of the solution vector x. Cramer's rule also follows directly from multiplicativity of the determinant. The system Ax = b may be rewritten as $A (I \leftarrow_i x) = A \leftarrow_i b$, and taking determinants of both sides (using multiplicativity) gives

$$(\det A)\, \det(I \leftarrow_i x) = \det(A \leftarrow_i b)$$

But $\det(I \leftarrow_i x) = x_i$, and the formula follows.

0.8.4 Minors of the inverse. Jacobi's identity generalizes the adjugate formula for the inverse of a nonsingular matrix and relates the minors of A^{-1} to those of A ∈ Mn(F):

$$\det A^{-1}[\alpha^c, \beta^c] = (-1)^{p(\alpha,\beta)}\, \frac{\det A[\beta, \alpha]}{\det A} \tag{0.8.4.1}$$

in which $p(\alpha, \beta) = \sum_{i \in \alpha} i + \sum_{j \in \beta} j$. Our universal convention is that det A[∅] = 1. For principal submatrices, Jacobi's identity assumes the simple form

$$\det A^{-1}[\alpha^c] = \frac{\det A[\alpha]}{\det A} \tag{0.8.4.2}$$

0.8.5 Schur complements and determinantal formulae. Let A = [aij] ∈ Mn(F) be given and suppose that α ⊆ {1, ..., n} is an index set such that A[α] is nonsingular. An important formula for det A, based on the 2-partition of A using α and α^c, is

$$\det A = \det A[\alpha]\, \det\!\left(A[\alpha^c] - A[\alpha^c, \alpha]\, A[\alpha]^{-1}\, A[\alpha, \alpha^c]\right) \tag{0.8.5.1}$$
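Finally, an illustrative aside (not from the text; NumPy assumed, with a random matrix chosen so that A and A[α] are nonsingular with probability 1): Cramer's rule (0.8.3) and the Schur complement determinant formula (0.8.5.1) are easy to verify numerically.

```python
# Check Cramer's rule and the Schur complement determinant formula on a random system.
import numpy as np

rng = np.random.default_rng(5)
n = 4
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# Cramer's rule: x_i = det(A <-_i b) / det A
x_cramer = np.empty(n)
for i in range(n):
    Ai = A.copy()
    Ai[:, i] = b                       # replace column i of A by b
    x_cramer[i] = np.linalg.det(Ai) / np.linalg.det(A)
print(np.allclose(x_cramer, np.linalg.solve(A, b)))     # True

# Schur complement formula (0.8.5.1) with alpha = {1, 2} (0-based {0, 1}):
a, ac = [0, 1], [2, 3]
A11, A12 = A[np.ix_(a, a)], A[np.ix_(a, ac)]
A21, A22 = A[np.ix_(ac, a)], A[np.ix_(ac, ac)]
schur = A22 - A21 @ np.linalg.inv(A11) @ A12
print(np.allclose(np.linalg.det(A),
                  np.linalg.det(A11) * np.linalg.det(schur)))   # True
```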