10 1. Linear algebra

any orthonormal basis $H$, $P = HH'$, where $H'H = I_k$. Thus, incidentally, $\mathrm{tr}\, P = k$ and the dimension of the image space is expressed in the trace. Moreover, this representation shows that for any two orthogonal projections $P_1 = HH'$ and $P_2 = GG'$,
\[
P_1P_2 = 0 \iff H'G = 0 \iff G'H = 0 \iff P_2P_1 = 0.
\]

Definition 1.1 $P_1$ and $P_2$ are said to be mutually orthogonal projections iff $P_1$ and $P_2$ are orthogonal projections such that $P_1P_2 = 0$. We write $P_1 \perp P_2$ when this is the case.

Although orthogonal projection and orthogonal transformation are far from synonymous, there is nevertheless a very close connection between the two concepts. If we partition any orthogonal transformation $H = (H_1, \ldots, H_k)$, then the purely algebraic fact
\[
HH' = I = H_1H_1' + \cdots + H_kH_k'
\]
represents a precisely corresponding partition of the identity into mutually orthogonal projections.

As a last comment on orthogonal projection, if $P$ is the orthogonal projection on the subspace $V \subset \mathbb{R}^n$, then $Q = I - P$, which satisfies $Q = Q' = Q^2$, is also an orthogonal projection. In fact, since $PQ = 0$, $\mathrm{Im}\, Q$ and $\mathrm{Im}\, P$ are orthogonal subspaces and, thus, $Q$ is the orthogonal projection on $V^\perp$.

1.7 Matrix decompositions

Denote the groups of triangular matrices with positive diagonal elements by
\[
L_n^+ = \{T \in \mathbb{R}^{n\times n} : T \text{ is lower triangular},\ t_{ii} > 0,\ i = 1, \ldots, n\},
\]
\[
U_n^+ = \{T \in \mathbb{R}^{n\times n} : T \text{ is upper triangular},\ t_{ii} > 0,\ i = 1, \ldots, n\}.
\]

An important implication of Proposition 1.2 for matrices is the following matrix decomposition.

Proposition 1.13 If $A \in \mathbb{R}^{n\times n}$ is nonsingular, then $A = TH$ for some $H \in O_n$ and $T \in L_n^+$. Moreover, this decomposition is unique.

Proof. The existence follows from the Gram-Schmidt method applied to the basis formed by the rows of $A$. The rows of $H$ form the orthonormal basis obtained at the end of that procedure, and the elements of $T = (t_{ij})$ are the coefficients needed to go from one basis to the other. By the Gram-Schmidt construction itself, it is clear that $T \in L_n^+$. For unicity, suppose $TH = T_1H_1$, where $T_1 \in L_n^+$ and $H_1 \in O_n$. Then $T_1^{-1}T = H_1H'$ is a matrix in $L_n^+ \cap O_n$. But $I_n$ is the only such matrix (why?). Hence, $T = T_1$ and $H = H_1$. $\Box$
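Not part of the original text: a minimal numerical sketch of Proposition 1.13, assuming NumPy. The book builds $T$ and $H$ by Gram-Schmidt on the rows of $A$; the same factorization can be obtained from the QR factorization of $A'$ (since $A' = QR$ gives $A = R'Q'$), with a sign fix so the diagonal of $T$ is positive. The function name `th_decomposition` is made up for illustration.

```python
import numpy as np

def th_decomposition(A):
    """Factor a nonsingular A as A = T @ H with T lower triangular,
    positive diagonal (T in L_n^+), and H orthogonal (H in O_n).
    Sketch via QR of A': A' = Q R implies A = R' Q'."""
    Q, R = np.linalg.qr(A.T)
    # Flip signs so diag(R) > 0; (Q D)(D R) = Q R is still valid since D^2 = I.
    d = np.sign(np.diag(R))
    T = (d[:, None] * R).T      # T = (D R)' is lower triangular with t_ii > 0
    H = (Q * d).T               # H = (Q D)' is orthogonal
    return T, H                 # A = T @ H

# Small demonstration on a random nonsingular matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
T, H = th_decomposition(A)
```

One can check that `T @ H` reproduces `A`, that `H @ H.T` is the identity, and that `T` is lower triangular with a positive diagonal, matching the uniqueness claim of the proposition.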
A slight generalization of Proposition 1.13 when $A \in \mathbb{R}^{n\times p}$ is of rank $A = p$ is proposed in Problem 1.8.7. Another similar triangular decomposition, known in statistics as the Bartlett decomposition, for positive definite matrices can now be easily obtained.

Proposition 1.14 If $S \in P_n$, then $S = TT'$ for a unique $T \in L_n^+$.

Proof. Since $S > 0$, $S = HDH'$, where $H \in O_n$ and $D = \mathrm{diag}(\lambda_i)$ with $\lambda_i > 0$. Let $D^{1/2} = \mathrm{diag}(\lambda_i^{1/2})$ and $A = HD^{1/2}$. Then we can write $S = AA'$, where $A$ is nonsingular. From Proposition 1.13, there exist $T \in L_n^+$ and $G \in O_n$ such that $A = TG$. But then $S = TGG'T' = TT'$. For unicity, suppose $TT' = T_1T_1'$, where $T_1 \in L_n^+$. Then $T_1^{-1}TT'(T_1')^{-1} = I$, which implies that $T_1^{-1}T \in L_n^+ \cap O_n = \{I\}$. Hence, $T = T_1$. $\Box$

Other notions of linear algebra, such as the Kronecker product and the "vec" operator, will be recalled when needed in the sequel.

1.8 Problems

1. Consider the partitioned matrix
\[
S = (s_{ij}) = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}.
\]
   (i) If $S_{11}$ is nonsingular, prove that $|S| = |S_{11}| \cdot |S_{22} - S_{21}S_{11}^{-1}S_{12}|$.
   (ii) For $S > 0$, prove Hadamard's inequality, $|S| \leq \prod_i s_{ii}$.
   (iii) Let $S$ and $S_{11}$ be nonsingular. Prove that
\[
S^{-1} = \begin{pmatrix} S_{11}^{-1} + S_{11}^{-1}S_{12}S_{22.1}^{-1}S_{21}S_{11}^{-1} & -S_{11}^{-1}S_{12}S_{22.1}^{-1} \\ -S_{22.1}^{-1}S_{21}S_{11}^{-1} & S_{22.1}^{-1} \end{pmatrix},
\]
   where $S_{22.1} = S_{22} - S_{21}S_{11}^{-1}S_{12}$.
   (iv) Let $S$ and $S_{22}$ be nonsingular. Prove that
\[
S^{-1} = \begin{pmatrix} S_{11.2}^{-1} & -S_{11.2}^{-1}S_{12}S_{22}^{-1} \\ -S_{22}^{-1}S_{21}S_{11.2}^{-1} & S_{22}^{-1} + S_{22}^{-1}S_{21}S_{11.2}^{-1}S_{12}S_{22}^{-1} \end{pmatrix},
\]
   where $S_{11.2} = S_{11} - S_{12}S_{22}^{-1}S_{21}$.
   Hint: Define
\[
A = \begin{pmatrix} I & 0 \\ -S_{21}S_{11}^{-1} & I \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} I & -S_{11}^{-1}S_{12} \\ 0 & I \end{pmatrix}
\]
   and consider the product $ASB$.

2. Establish with the partitioning $x = (x_1', x_2')'$, $S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}$,
that
\[
x'S^{-1}x = (x_1 - S_{12}S_{22}^{-1}x_2)'S_{11.2}^{-1}(x_1 - S_{12}S_{22}^{-1}x_2) + x_2'S_{22}^{-1}x_2.
\]

3. For any $A \in \mathbb{R}^{p\times q}$, $B \in \mathbb{R}^{q\times p}$, prove the following:
   (i) $|I_p + AB| = |I_q + BA|$.
   Hint:
\[
\begin{pmatrix} I_p + AB & A \\ 0 & I_q \end{pmatrix} = \begin{pmatrix} I_p & A \\ -B & I_q \end{pmatrix}\begin{pmatrix} I_p & 0 \\ B & I_q \end{pmatrix}, \quad
\begin{pmatrix} I_p & A \\ 0 & I_q + BA \end{pmatrix} = \begin{pmatrix} I_p & 0 \\ B & I_q \end{pmatrix}\begin{pmatrix} I_p & A \\ -B & I_q \end{pmatrix}.
\]
   (ii) The nonzero eigenvalues of $AB$ and $BA$ are the same.

4. Prove Proposition 1.2.

5. Prove Proposition 1.10.

6. Show that if $P$ defines an orthogonal projection, then the eigenvalues of $P$ are either 0 or 1.

7. Demonstrate the slight generalizations of Proposition 1.13:
   (i) If $A \in \mathbb{R}^{n\times p}$ is of rank $A = p$, then $A = HT$ for some $T \in U_p^+$ and $H$ satisfying $H'H = I_p$. Further, $T$ and $H$ are unique.
   Hint: For unicity, note that if $A = HT = H_1T_1$ with $T_1 \in U_p^+$ and $H_1'H_1 = I_p$, then $\mathrm{Im}\, A = \mathrm{Im}\, H = \mathrm{Im}\, H_1$ and $H_1H_1'$ is the orthogonal projection on $\mathrm{Im}\, H_1$.
   (ii) If $A \in \mathbb{R}^{n\times p}$ is of rank $A = n$, then $A = TH$, where $T \in L_n^+$ and $HH' = I_n$. Further, $T$ and $H$ are unique.

8. Assuming $A$ and $A + uv'$ are nonsingular, prove
\[
(A + uv')^{-1} = A^{-1} - \frac{A^{-1}uv'A^{-1}}{1 + v'A^{-1}u}.
\]

9. Vector differentiation. Let $f(x)$ be a real-valued function of $x \in \mathbb{R}^n$. Define $\partial f(x)/\partial x = (\partial f(x)/\partial x_i)$. Verify
   (i) $\partial a'x/\partial x = a$,
   (ii) $\partial x'Ax/\partial x = 2Ax$, if $A$ is symmetric.

10. Matrix differentiation [Srivastava and Khatri (1979), p. 37]. Let $f(S)$ be a real-valued function of the symmetric matrix $S \in \mathbb{R}^{n\times n}$. Define $\partial f(S)/\partial S = \left(\tfrac{1}{2}(1 + \delta_{ij})\,\partial f(S)/\partial s_{ij}\right)$. Verify
   (i) $\partial\, \mathrm{tr}(S^{-1}A)/\partial S = -S^{-1}AS^{-1}$, if $A$ is symmetric,
   (ii) $\partial \ln|S|/\partial S = S^{-1}$.
Hint for (ii): $S^{-1} = |S|^{-1}\mathrm{adj}(S)$.

11. Rayleigh's quotient. Assume $S \geq 0$ in $\mathbb{R}^{n\times n}$ with eigenvalues $\lambda_1 \geq \cdots \geq \lambda_n$ and corresponding eigenvectors $x_1, \ldots, x_n$. Prove:
   (i) $\lambda_n \leq \dfrac{x'Sx}{x'x} \leq \lambda_1$, $\forall x \neq 0$.
   (ii) For any fixed $j = 2, \ldots, n$,
\[
\frac{x'Sx}{x'x} \leq \lambda_j, \quad \forall x \neq 0 \text{ such that } \langle x, x_1\rangle = \cdots = \langle x, x_{j-1}\rangle = 0.
\]

12. Demonstrate that if $A$ is symmetric and $B > 0$, then
\[
\sup_{|h|=1} \frac{h'Ah}{h'Bh} = \lambda_1(AB^{-1}),
\]
where $\lambda_1(AB^{-1})$ denotes the largest eigenvalue of $AB^{-1}$.

13. Let $A_m > 0$ in $\mathbb{R}^{n\times n}$ ($m = 1, 2, \ldots$) be a sequence. For any $A \in \mathbb{R}^{n\times n}$, define $||A||^2 = \sum_{i,j} a_{ij}^2$, and let $\lambda_{1,m} \geq \cdots \geq \lambda_{n,m}$ be the ordered eigenvalues of $A_m$. Prove that if $\lambda_{1,m} \to 1$ and $\lambda_{n,m} \to 1$, then $\lim_{m\to\infty} ||A_m - I|| = 0$.

14. In $\mathbb{R}^p$, prove that if $|x_1| = |x_2|$, then there exists $H \in O_p$ such that $Hx_1 = x_2$.
   Hint: When $x_1 \neq 0$, consider $H \in O_p$ with first row $x_1'/|x_1|$.

15. Show that for any $V \in \mathbb{R}^{n\times n}$ and any $m = 1, 2, \ldots$,
   (i) if $(I - tV)$ is nonsingular, then [Srivastava and Khatri (1979), p. 33]
\[
(I - tV)^{-1} = \sum_{i=0}^{m} t^iV^i + t^{m+1}V^{m+1}(I - tV)^{-1};
\]
   (ii) if $V > 0$ with eigenvalues $\lambda_1 \geq \cdots \geq \lambda_n$ and $|t| < 1/\lambda_1$, then
\[
(I - tV)^{-1} = \sum_{i=0}^{\infty} t^iV^i.
\]
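Several of the results above lend themselves to quick numerical sanity checks. Below is a sketch, assuming NumPy, that exercises two of them: the Bartlett decomposition of Proposition 1.14 (NumPy's Cholesky routine returns exactly the lower-triangular factor $T \in L_n^+$) and the rank-one update identity of Problem 8. The random matrices here are made-up test data, not part of the book.

```python
import numpy as np

rng = np.random.default_rng(1)

# Proposition 1.14 (Bartlett decomposition): S = T T' with T lower
# triangular and positive diagonal; np.linalg.cholesky returns this T.
X = rng.standard_normal((6, 4))
S = X.T @ X + np.eye(4)            # positive definite by construction
T = np.linalg.cholesky(S)

# Problem 8 (rank-one update of an inverse), checked numerically:
# (A + uv')^{-1} = A^{-1} - A^{-1} u v' A^{-1} / (1 + v' A^{-1} u).
A = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)   # almost surely nonsingular
u = rng.standard_normal(4)
v = rng.standard_normal(4)
Ainv = np.linalg.inv(A)
lhs = np.linalg.inv(A + np.outer(u, v))
rhs = Ainv - np.outer(Ainv @ u, v @ Ainv) / (1.0 + v @ Ainv @ u)
```

With these definitions, `T @ T.T` recovers `S`, the diagonal of `T` is positive, and `lhs` agrees with `rhs` to floating-point precision.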
2 Random vectors

2.1 Introduction

A random vector is simply a vector whose components are random variables. The variables are the characteristics of interest that will be observed on each of the selected units in the sample. Questions about the probability that one variable takes values in a given set, or that two or more variables simultaneously take values in given sets, are common in multivariate analysis. Chapter 2 gives a collection of important probability concepts on random vectors, such as distribution functions, expected values, characteristic functions, discrete and absolutely continuous distributions, independence, etc.

2.2 Distribution functions

First, some basic notation concerning "rectangles," useful to describe the distribution function of a random vector, is given. Let $\bar{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\} = [-\infty, \infty]$. It is convenient to define a partial order on $\bar{\mathbb{R}}^n$ by
\[
x \leq y \text{ iff } x_i \leq y_i,\ \forall i = 1, \ldots, n, \quad\text{and}\quad x < y \text{ iff } x_i < y_i,\ \forall i = 1, \ldots, n.
\]
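The componentwise order above is only a partial order: two vectors need not be comparable. A small illustrative sketch, assuming NumPy (the vectors `x`, `y`, `z` are made-up examples, not from the text):

```python
import numpy as np

# Componentwise order on R^n: x <= y iff x_i <= y_i for every i.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 5.0, 4.0])
leq = bool(np.all(x <= y))   # every coordinate of x is <= the matching one of y
lt = bool(np.all(x < y))     # strict order fails: x_1 < y_1 does not hold

# The order is partial: neither z <= x nor x <= z holds below.
z = np.array([0.0, 9.0, 0.0])
incomparable = (not np.all(z <= x)) and (not np.all(x <= z))
```

Here `leq` is true while `lt` is false, showing that $x \leq y$ does not imply $x < y$, and `incomparable` is true, showing that some pairs are simply not ordered.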