2.6 Singular Value Decomposition

A very useful tool in matrix analysis is the singular value decomposition (SVD). It will be seen that the singular values of a matrix are good measures of the "size" of the matrix and that the corresponding singular vectors are good indications of strong/weak input or output directions.

Theorem 2.4 Let $A \in \mathbb{F}^{m\times n}$. There exist unitary matrices
$$U = [u_1, u_2, \ldots, u_m] \in \mathbb{F}^{m\times m}, \qquad V = [v_1, v_2, \ldots, v_n] \in \mathbb{F}^{n\times n}$$
such that
$$A = U\Sigma V^*, \qquad \Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix},$$
where
$$\Sigma_1 = \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_p \end{bmatrix}$$
and $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_p \geq 0$, $p = \min\{m, n\}$.

Proof. Let $\sigma = \|A\|$ and without loss of generality assume $m \geq n$. Then, from the definition of $\|A\|$, there exists a $z \in \mathbb{F}^n$ such that
$$\|Az\| = \sigma \|z\|.$$
By Lemma 2.2, there is a matrix $\tilde{U} \in \mathbb{F}^{m\times n}$ such that $\tilde{U}^*\tilde{U} = I$ and
$$Az = \sigma \tilde{U} z.$$
Now let
$$x = \frac{z}{\|z\|} \in \mathbb{F}^n, \qquad y = \frac{\tilde{U}z}{\|\tilde{U}z\|} \in \mathbb{F}^m.$$
We have $Ax = \sigma y$. Let $V = [x \;\; V_1] \in \mathbb{F}^{n\times n}$ and $U = [y \;\; U_1] \in \mathbb{F}^{m\times m}$ be unitary.² Consequently, $U^*AV$ has the following structure:
$$A_1 := U^*AV = \begin{bmatrix} y^*Ax & y^*AV_1 \\ U_1^*Ax & U_1^*AV_1 \end{bmatrix} = \begin{bmatrix} \sigma y^*y & y^*AV_1 \\ \sigma U_1^*y & U_1^*AV_1 \end{bmatrix} = \begin{bmatrix} \sigma & w^* \\ 0 & B \end{bmatrix},$$

²Recall that it is always possible to extend an orthonormal set of vectors to an orthonormal basis for the whole space.
where $w := V_1^*A^*y \in \mathbb{F}^{n-1}$ and $B := U_1^*AV_1 \in \mathbb{F}^{(m-1)\times(n-1)}$. Since
$$\left\| A_1^* \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \right\|_2^2 = \sigma^2 + w^*w,$$
it follows that $\|A_1\|^2 \geq \sigma^2 + w^*w$. But since $\sigma = \|A\| = \|A_1\|$, we must have $w = 0$. An obvious induction argument gives
$$U^*AV = \Sigma.$$
This completes the proof. □

The scalar $\sigma_i$ is the $i$th singular value of $A$, and the vectors $u_i$ and $v_j$ are, respectively, the $i$th left singular vector and the $j$th right singular vector. It is easy to verify that
$$Av_i = \sigma_i u_i, \qquad A^*u_i = \sigma_i v_i.$$
The preceding equations can also be written as
$$A^*Av_i = \sigma_i^2 v_i, \qquad AA^*u_i = \sigma_i^2 u_i.$$
Hence $\sigma_i^2$ is an eigenvalue of $AA^*$ or $A^*A$, $u_i$ is an eigenvector of $AA^*$, and $v_i$ is an eigenvector of $A^*A$.

The following notation for singular values is often adopted:
$$\overline{\sigma}(A) = \sigma_{\max}(A) = \sigma_1 = \text{the largest singular value of } A;$$
$$\underline{\sigma}(A) = \sigma_{\min}(A) = \sigma_p = \text{the smallest singular value of } A.$$

Geometrically, the singular values of a matrix $A$ are precisely the lengths of the semiaxes of the hyperellipsoid $E$ defined by
$$E = \{\, y : y = Ax, \; x \in \mathbb{C}^n, \; \|x\| = 1 \,\}.$$
Thus $v_1$ is the direction in which $\|y\|$ is largest for all $\|x\| = 1$, while $v_n$ is the direction in which $\|y\|$ is smallest for all $\|x\| = 1$. From the input/output point of view, $v_1$ ($v_n$) is the highest (lowest) gain input (or control) direction, while $u_1$ ($u_m$) is the highest (lowest) gain output (or observing) direction. This can be illustrated by the following $2 \times 2$ matrix:
$$A = \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix} \begin{bmatrix} \cos\theta_2 & -\sin\theta_2 \\ \sin\theta_2 & \cos\theta_2 \end{bmatrix}.$$
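This $2 \times 2$ construction is easy to check numerically. The following is a minimal MATLAB sketch, not from the text; the angle and gain values (t1, t2, s1, s2) are arbitrary choices for illustration. It assembles $A$ from the two rotations and the diagonal gain matrix, then confirms that svd recovers $\sigma_1$ and $\sigma_2$ and that $v_1$ is the highest-gain input direction.

    % Build A = R(t1)*diag(s1,s2)*R(t2) as in the text and inspect its SVD.
    t1 = 0.7; t2 = -0.3; s1 = 3; s2 = 0.5;    % arbitrary values, s1 >= s2 > 0
    R = @(t) [cos(t) -sin(t); sin(t) cos(t)]; % planar rotation matrix
    A = R(t1) * diag([s1 s2]) * R(t2);

    [U, S, V] = svd(A);     % A = U*S*V'
    diag(S)'                % displays [3 0.5]: the singular values
    norm(A * V(:, 1))       % displays 3:   v1 is the highest-gain input direction
    norm(A * V(:, 2))       % displays 0.5: v2 is the lowest-gain input direction

Because $A$ is given here exactly as a product of rotations and a diagonal gain, the computed singular values match s1 and s2 to machine precision.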
It is easy to see that $A$ maps a unit circle to an ellipsoid with semiaxes of lengths $\sigma_1$ and $\sigma_2$. Hence it is often convenient to introduce the following alternative definitions for the largest singular value $\overline{\sigma}$:
$$\overline{\sigma}(A) := \max_{\|x\|=1} \|Ax\|$$
and for the smallest singular value $\underline{\sigma}$ of a tall matrix:
$$\underline{\sigma}(A) := \min_{\|x\|=1} \|Ax\|.$$

Lemma 2.5 Suppose $A$ and $\Delta$ are square matrices. Then

(i) $|\underline{\sigma}(A + \Delta) - \underline{\sigma}(A)| \leq \overline{\sigma}(\Delta)$;

(ii) $\underline{\sigma}(A\Delta) \geq \underline{\sigma}(A)\,\underline{\sigma}(\Delta)$;

(iii) $\overline{\sigma}(A^{-1}) = \dfrac{1}{\underline{\sigma}(A)}$ if $A$ is invertible.

Proof. (i) By definition,
$$\underline{\sigma}(A + \Delta) := \min_{\|x\|=1} \|(A + \Delta)x\| \geq \min_{\|x\|=1} \{\|Ax\| - \|\Delta x\|\} \geq \min_{\|x\|=1} \|Ax\| - \max_{\|x\|=1} \|\Delta x\| = \underline{\sigma}(A) - \overline{\sigma}(\Delta).$$
Hence $-\overline{\sigma}(\Delta) \leq \underline{\sigma}(A + \Delta) - \underline{\sigma}(A)$. The other inequality, $\underline{\sigma}(A + \Delta) - \underline{\sigma}(A) \leq \overline{\sigma}(\Delta)$, follows by replacing $A$ with $A + \Delta$ and $\Delta$ with $-\Delta$ in the preceding argument.

(ii) This follows by noting that
$$\underline{\sigma}(A\Delta) := \min_{\|x\|=1} \|A\Delta x\| = \sqrt{\min_{\|x\|=1} x^*\Delta^*A^*A\Delta x} \geq \underline{\sigma}(A) \min_{\|x\|=1} \|\Delta x\| = \underline{\sigma}(A)\,\underline{\sigma}(\Delta).$$

(iii) Let the singular value decomposition of $A$ be $A = U\Sigma V^*$; then $A^{-1} = V\Sigma^{-1}U^*$. Hence $\overline{\sigma}(A^{-1}) = \overline{\sigma}(\Sigma^{-1}) = 1/\underline{\sigma}(\Sigma) = 1/\underline{\sigma}(A)$. □

Note that (ii) may not be true if $A$ and $\Delta$ are not square matrices. For example, consider $A = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\Delta = \begin{bmatrix} 3 & 4 \end{bmatrix}$; then $\underline{\sigma}(A\Delta) = 0$ but $\underline{\sigma}(A) = \sqrt{5}$ and $\underline{\sigma}(\Delta) = 5$.
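Both the lemma and the counterexample are easy to probe numerically. Here is a minimal MATLAB sketch, not from the text; the helpers smin and smax are our own shorthand, and the random test matrices are arbitrary.

    smin = @(M) min(svd(M)); smax = @(M) max(svd(M)); % shorthand helpers

    A = randn(3); D = randn(3);              % random square test matrices
    abs(smin(A + D) - smin(A)) <= smax(D)    % (i):   displays logical 1
    smin(A * D) >= smin(A) * smin(D)         % (ii):  displays logical 1
    abs(smax(inv(A)) - 1 / smin(A))          % (iii): displays ~ 0

    % Counterexample to (ii) when the factors are not square:
    A = [1; 2]; D = [3 4];
    [smin(A * D), smin(A) * smin(D)]         % displays [0, 11.1803]

The last line shows $\underline{\sigma}(A\Delta) = 0$ while $\underline{\sigma}(A)\,\underline{\sigma}(\Delta) = 5\sqrt{5} \approx 11.18$, as claimed.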
Some useful properties of the SVD are collected in the following lemma.

Lemma 2.6 Let $A \in \mathbb{F}^{m\times n}$ and
$$\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > \sigma_{r+1} = \cdots = 0, \quad r \leq \min\{m, n\}.$$
Then

1. $\mathrm{rank}(A) = r$;

2. $\mathrm{Ker}\,A = \mathrm{span}\{v_{r+1}, \ldots, v_n\}$ and $(\mathrm{Ker}\,A)^\perp = \mathrm{span}\{v_1, \ldots, v_r\}$;

3. $\mathrm{Im}\,A = \mathrm{span}\{u_1, \ldots, u_r\}$ and $(\mathrm{Im}\,A)^\perp = \mathrm{span}\{u_{r+1}, \ldots, u_m\}$;

4. $A \in \mathbb{F}^{m\times n}$ has a dyadic expansion:
$$A = \sum_{i=1}^{r} \sigma_i u_i v_i^* = U_r \Sigma_r V_r^*,$$
where $U_r = [u_1, \ldots, u_r]$, $V_r = [v_1, \ldots, v_r]$, and $\Sigma_r = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$;

5. $\|A\|_F^2 = \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_r^2$;

6. $\|A\| = \sigma_1$;

7. $\sigma_i(U_0 A V_0) = \sigma_i(A)$, $i = 1, \ldots, p$, for any appropriately dimensioned unitary matrices $U_0$ and $V_0$;

8. Let $k < r = \mathrm{rank}(A)$ and $A_k := \sum_{i=1}^{k} \sigma_i u_i v_i^*$; then
$$\min_{\mathrm{rank}(B) \leq k} \|A - B\| = \|A - A_k\| = \sigma_{k+1}.$$

Proof. We shall only give a proof for part 8. It is easy to see that $\mathrm{rank}(A_k) \leq k$ and $\|A - A_k\| = \sigma_{k+1}$. Hence, we only need to show that $\min_{\mathrm{rank}(B) \leq k} \|A - B\| \geq \sigma_{k+1}$. Let $B$ be any matrix such that $\mathrm{rank}(B) \leq k$. Then
$$\|A - B\| = \|U\Sigma V^* - B\| = \|\Sigma - U^*BV\| \geq \left\| \begin{bmatrix} I_{k+1} & 0 \end{bmatrix} (\Sigma - U^*BV) \begin{bmatrix} I_{k+1} \\ 0 \end{bmatrix} \right\| = \|\Sigma_{k+1} - \hat{B}\|,$$
where $\hat{B} = \begin{bmatrix} I_{k+1} & 0 \end{bmatrix} U^*BV \begin{bmatrix} I_{k+1} \\ 0 \end{bmatrix} \in \mathbb{F}^{(k+1)\times(k+1)}$ and $\mathrm{rank}(\hat{B}) \leq k$. Let $x \in \mathbb{F}^{k+1}$ be such that $\hat{B}x = 0$ and $\|x\| = 1$. Then
$$\|A - B\| \geq \|\Sigma_{k+1} - \hat{B}\| \geq \|(\Sigma_{k+1} - \hat{B})x\| = \|\Sigma_{k+1}x\| \geq \sigma_{k+1}.$$
Since $B$ is arbitrary, the conclusion follows. □
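Part 8 underlies low-rank approximation by truncated SVD. Here is a minimal MATLAB sketch, not from the text (the matrix size and the value of k are arbitrary choices), that assembles $A_k$ from the first $k$ dyads and checks that the 2-norm error equals $\sigma_{k+1}$:

    A = randn(6, 4);                            % arbitrary test matrix
    k = 2;                                      % target rank, k < rank(A)

    [U, S, V] = svd(A);
    Ak = U(:, 1:k) * S(1:k, 1:k) * V(:, 1:k)';  % truncated dyadic expansion

    rank(Ak)          % displays 2
    norm(A - Ak)      % the 2-norm approximation error ...
    S(k+1, k+1)       % ... equals sigma_{k+1}, as part 8 asserts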
Illustrative MATLAB Commands:

>> [U, S, V] = svd(A)  % A = U*S*V'

Related MATLAB Commands: cond, condest

2.7 Semidefinite Matrices

A square Hermitian matrix $A = A^*$ is said to be positive definite (semidefinite), denoted by $A > 0$ ($A \geq 0$), if $x^*Ax > 0$ ($\geq 0$) for all $x \neq 0$. Suppose $A \in \mathbb{F}^{n\times n}$ and $A = A^* \geq 0$; then there exists a $B \in \mathbb{F}^{n\times r}$ with $r \geq \mathrm{rank}(A)$ such that $A = BB^*$.

Lemma 2.7 Let $B \in \mathbb{F}^{m\times n}$ and $C \in \mathbb{F}^{k\times n}$. Suppose $m \geq k$ and $B^*B = C^*C$. Then there exists a matrix $U \in \mathbb{F}^{m\times k}$ such that $U^*U = I$ and $B = UC$.

Proof. Let $V_1$ and $V_2$ be unitary matrices such that
$$B = V_1 \begin{bmatrix} B_1 \\ 0 \end{bmatrix}, \qquad C = V_2 \begin{bmatrix} C_1 \\ 0 \end{bmatrix},$$
where $B_1$ and $C_1$ have full row rank. Then $B_1$ and $C_1$ have the same number of rows, and $V_3 := B_1 C_1^* (C_1 C_1^*)^{-1}$ satisfies $V_3^* V_3 = I$ since $B^*B = C^*C$. Hence $V_3$ is a unitary matrix and $V_3^* B_1 = C_1$. Finally, let
$$U = V_1 \begin{bmatrix} V_3 & 0 \\ 0 & V_4 \end{bmatrix} V_2^*$$
for any suitably dimensioned $V_4$ such that $V_4^* V_4 = I$. □

We can define the square root of a positive semidefinite matrix $A$, $A^{1/2} = (A^{1/2})^* \geq 0$, by
$$A = A^{1/2} A^{1/2}.$$
Clearly, $A^{1/2}$ can be computed by using spectral decomposition or the SVD: let $A = U\Lambda U^*$; then
$$A^{1/2} = U \Lambda^{1/2} U^*,$$
where $\Lambda = \mathrm{diag}\{\lambda_1, \ldots, \lambda_n\}$ and $\Lambda^{1/2} = \mathrm{diag}\{\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_n}\}$.
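As a quick illustration of this formula, the following minimal MATLAB sketch, not from the text, computes $A^{1/2}$ by spectral decomposition; the test matrix $A = MM^*$ is an arbitrary construction that guarantees $A = A^* \geq 0$.

    M = randn(4, 3);
    A = M * M';                       % Hermitian and positive semidefinite

    [U, L] = eig(A);                  % spectral decomposition A = U*L*U'
    Ahalf = U * sqrt(max(L, 0)) * U'; % max(.,0) clips tiny roundoff negatives

    norm(Ahalf * Ahalf - A)           % displays ~ 0: Ahalf squares back to A
    norm(Ahalf - Ahalf')              % displays ~ 0: Ahalf is Hermitian

MATLAB's built-in sqrtm(A) returns the same principal square root for a Hermitian positive semidefinite $A$.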