2.1 TRACES,MOMENTS AND COMBINATORICS 17 fi(i)=fi(i-1)+1 (resp.,fi(i)=fn(i-1)-1)if i is the first (resp.,second) member of the part of II to which i belongs.It is easy to check that fi is a Dyck path,and furthermore that the map IIfi puts non-crossing pair partitions of into bijective correspondence with Dyck paths of length k.Now choose a Wigner word w whose associated Dyck path D(w),see the proof of Lemma 2.1.6, equals fi.One can verify that I=II. (iii)Given I=I,one can verify that D(w)=D(w),from which the equiva- lence of w and w'follows. 0 2.1.4 Proof of Lemma 2.1.7:Sentences and Graphs By Chebyshev's inequality,it is enough to prove that mlE(w,}2)-么,1=0. Proceeding as in (2.1.10),one has 4)-=京 (2.1.21) 1…=1 …= where T=ETNTN-ETNET] (2.1.22) The role of words in the proof of Lemma 2.1.6 is now played by pairs of words, which is a particular case of a sentence. Definition 2.1.12 (.-Sentences)Given a set.9,an.9-sentence a is a finite se- quence of.-words w1,...,wn,at least one word long.Two .9-sentences al,a2 are called equivalent,denoted a ~a2,if there is a bijection on that maps one into the other. As with words,for a sentence a =(w1,w2,...,wn),we define the support as supp(a)=Usupp(wi),and the weight wt(a)as the cardinality of supp(a). Definition 2.1.13(Graph associated to an.9-sentence)Given a sentence a= (w1),with w)we set Ga=(Va,Ea)to be the graph with set of vertices Va=supp(a)and (undirected)edges Ea={+}J=1,,(w)-1,i=1…,k. We define the set of self edges as E=e EEa:e=fu,u),uEVa}and the set of connecting edges as Eg=EaEg
2.1 TRACES, MOMENTS AND COMBINATORICS 17 fΠ(i) = fΠ(i−1)+1 (resp., fΠ(i) = fΠ(i−1)−1) if i is the first (resp., second) member of the part of Π to which i belongs. It is easy to check that fΠ is a Dyck path, and furthermore that the map Π 7→ fΠ puts non-crossing pair partitions of Kk into bijective correspondence with Dyck paths of length k. Now choose a Wigner word w whose associated Dyck path D(w), see the proof of Lemma 2.1.6, equals fΠ. One can verify that Πw = Π. (iii) Given Πw = Πw′ , one can verify that D(w) = D(w ′ ), from which the equivalence of w and w ′ follows. ⊓⊔ 2.1.4 Proof of Lemma 2.1.7 : Sentences and Graphs By Chebyshev’s inequality, it is enough to prove that lim N→∞ |E hLN,x k i 2 − hL¯N,x k i 2 | = 0. Proceeding as in (2.1.10), one has E(hLN,x k i 2 )− hL¯N,x k i 2 = 1 N2 N ∑ i1 ,...,i k=1 i ′ 1 ,...,i ′ k=1 T¯N i,i ′ , (2.1.21) where T¯N i,i ′ = ETN i T N i ′ −ETN i ET N i ′ . (2.1.22) The role of words in the proof of Lemma 2.1.6 is now played by pairs of words, which is a particular case of a sentence. Definition 2.1.12 (S -Sentences) Given a set S , an S -sentence a is a finite sequence of S -words w1,...,wn, at least one word long. Two S -sentences a1,a2 are called equivalent, denoted a1 ∼ a2, if there is a bijection on S that maps one into the other. As with words, for a sentence a = (w1,w2,...,wn), we define the support as supp(a) = Sn i=1 supp(wi), and the weight wt(a) as the cardinality of supp(a). Definition 2.1.13 (Graph associated to an S -sentence) Given a sentence a = (w1,...,wk), with wi = s i 1 s i 2 ···s i ℓ(wi) , we set Ga = (Va,Ea) to be the graph with set of vertices Va = supp(a) and (undirected) edges Ea = {{s i j ,s i j+1 }, j = 1,...,ℓ(wi)−1,i = 1,...,k}. We define the set of self edges as E s a = {e ∈ Ea : e = {u,u},u ∈Va} and the set of connecting edges as E c a = Ea \E s a .
18 2.WIGNER MATRICES In words,the graph associated with a sentence a=(w1,...,wk)is obtained by piecing together the graphs of the individual words w,(and in general,it differs from the graph associated with the word obtained by concatenating the words w).Unlike the graph of a word,the graph associated with a sentence may be disconnected.Note that the sentence a defines k paths in the graph Ga.For e Ea, we use Na to denote the number of times the union of these paths traverses the edge e(in any direction).We note that equivalent sentences generate the same graphs Ga and the same passage-counts Na Coming back to the evaluation of Tir,see(2.1.21),recall the closed words wi,wi of length k+1,and define the two-word sentence ai.=(wi,wr).Then, 0=LΠZ战)】) (2.1.23) -ⅡE(Z)ΠEy)ⅡEz)ⅡEY")] eEE eEE eEEwy In particular,0 unless N2 for all eEAlso,0 unless EE0.Further,(2.1.23)shows that if ai.r~aj.r then T=T.Finally, if N >t then there are exactly CN,N-sentences that are equivalent to a given N-sentence of weight t.We make the following definition: denotes a set ofrepresentatives for equivalence classes of sentences of weight t consisting of two closed t-words (w1,w2),each of length k+1, with Na≥2 for eache∈Ea,and Ew,nEw2≠0. (2.1.24) One deduces from(2.1.21)and (2.1.23)that E(Lw,3)-(亿N,2 (2.1.25) 之e(528马5的 -盟2凸0)™)™0m*) eEE eEE We have completed the preliminaries to Proof of Lemma.1.7 In view of (2.15),it suffices to check thatis empty fort>k+2.Since we need it later,we prove a slightly stronger claim,namely hatw2 is empty for1≥k+l。 Toward this end,note that ifthen isa conected graphwith vertices and at most k edges(since Na >2 for eE Ea),which is impossible when
18 2. WIGNER MATRICES In words, the graph associated with a sentence a = (w1,...,wk) is obtained by piecing together the graphs of the individual words wi (and in general, it differs from the graph associated with the word obtained by concatenating the words wi). Unlike the graph of a word, the graph associated with a sentence may be disconnected. Note that the sentence a defines k paths in the graph Ga. For e ∈ Ea, we use N a e to denote the number of times the union of these paths traverses the edge e (in any direction). We note that equivalent sentences generate the same graphs Ga and the same passage-counts N a e . Coming back to the evaluation of T¯ i,i ′ , see (2.1.21), recall the closed words wi ,wi ′ of length k +1, and define the two-word sentence ai,i ′ = (wi ,wi ′). Then, T¯N i,i ′ = 1 Nk h ∏e∈Ec a i,i ′ E(Z N a e 1,2 ) ∏e∈Es a i,i ′ E(Y N a e 1 ) (2.1.23) − ∏e∈Ec wi E(Z N wi e 1,2 ) ∏e∈Es wi E(Y N wi e 1 ) ∏e∈Ec w i ′ E(Z N w i ′ e 1,2 ) ∏e∈Es w i ′ E(Y N w i ′ e 1 ) i . In particular, T¯N i,i ′ = 0 unless N ai,i ′ e ≥ 2 for all e ∈ Eai,i ′ . Also, T¯N i,i ′ = 0 unless Ewi ∩Ewi ′ 6= /0. Further, (2.1.23) shows that if ai,i ′ ∼ aj,j ′ then T¯N i,i ′ = T¯N j,j ′ . Finally, if N ≥ t then there are exactly CN,t N-sentences that are equivalent to a given N-sentence of weight t. We make the following definition: W (2) k,t denotes a set of representatives for equivalence classes of sentences a of weight t consisting of two closed t-words (w1,w2), each of length k +1, with N a e ≥ 2 for each e ∈ Ea, and Ew1 ∩Ew2 6= /0 . (2.1.24) One deduces from (2.1.21) and (2.1.23) that E(hLN,x k i 2 )− hL¯N,x k i 2 (2.1.25) = 2k ∑ t=1 CN,t Nk+2 ∑ a=(w1,w2)∈W (2) k,t ∏e∈Ec a E(Z N a e 1,2 ) ∏e∈Es a E(Y N a e 1 ) − ∏e∈Ec w1 E(Z N w1 e 1,2 ) ∏e∈Es w1 E(Y N w1 e 1 ) ∏e∈Ec w2 E(Z N w2 e 1,2 ) ∏e∈Es w2 E(Y N w2 e 1 ) . We have completed the preliminaries to Proof of Lemma 2.1.7 In view of (2.1.25), it suffices to check that W (2) k,t is empty for t ≥ k + 2. Since we need it later, we prove a slightly stronger claim, namely that W (2) k,t is empty for t ≥ k +1. Toward this end, note that if a ∈ W (2) k,t then Ga is a connected graph, with t vertices and at most k edges (since N a e ≥ 2 for e ∈ Ea), which is impossible when
2.1 TRACES,MOMENTS AND COMBINATORICS 19 t>k+1.Considering the case t=k+1,it follows that Ga is a tree,and each edge must be visited by the paths generated by a exactly twice.Because the path generated by wi in the tree Ga starts and end at the same vertex,it must visit each edge an even number of times.Thus,the set of edges visited by w is disjoint from the set of edges visited by contradicting the definition of Q Remark 2.1.14 Note that in the course of the proof of Lemma 2.1.7,we actually showed that for N >2k, E((LNx2)-(LN,x)2 (2.1.26) a=(1,92)W -夏EZ)刀')E(Z)刀] that is,that the summation in (2.1.25)can be restricted tot<k Exercise 2.1.15 Consider symmetric random matrices XN,with the zero mean independent random variables [X(i,)N no longer assumed identically distributed nor all of variance 1/N.Check that Theorem 2.1.1 still holds if one assumes that for all e>0, lim #{6,):1-NEXw61<=1, N2 and for all k>1,there exists a finite r&independent of N such that sup Ev不xx(,≤rk. 1<i<j<N Exercise 2.1.16 Check that the conclusion of Theorem 2.1.1 remains true when convergence in probability is replaced by almost sure convergence. Hint:Using Chebyshev's inequality and the Borel-Cantelli lemma,it is enough to verify that for all integers k,there exists a constant C=C(k)such that (,)-,9≤是 Exercise 2.1.17 In the setup of Theorem 2.1.1,assume that r<for all k but not necessarily that E[Z2]=1.Show that for any integer number k, sup E[(LN,.x】=:C(L,L≤)<∞. NEN
2.1 TRACES, MOMENTS AND COMBINATORICS 19 t > k + 1. Considering the case t = k + 1, it follows that Ga is a tree, and each edge must be visited by the paths generated by a exactly twice. Because the path generated by w1 in the tree Ga starts and end at the same vertex, it must visit each edge an even number of times. Thus, the set of edges visited by w1 is disjoint from the set of edges visited by w2, contradicting the definition of W (2) k,t . ⊓⊔ Remark 2.1.14 Note that in the course of the proof of Lemma 2.1.7, we actually showed that for N > 2k, E(hLN,x k i 2 )− hL¯N,x k i 2 (2.1.26) = k ∑ t=1 CN,t Nk+2 ∑ a=(w1,w2)∈W (2) k,t h ∏e∈Ec a E(Z N a e 1,2 ) ∏e∈Es a E(Y N a e 1 ) − ∏e∈Ec w1 E(Z N w1 e 1,2 ) ∏e∈Es w1 E(Y N w1 e 1 ) ∏e∈Ec w2 E(Z N w2 e 1,2 ) ∏e∈Es w2 E(Y N w2 e 1 ) i , that is, that the summation in (2.1.25) can be restricted to t ≤ k. Exercise 2.1.15 Consider symmetric random matrices XN, with the zero mean independent random variables {XN(i, j)}1≤i≤j≤N no longer assumed identically distributed nor all of variance 1/N. Check that Theorem 2.1.1 still holds if one assumes that for all ε > 0, lim N→∞ #{(i, j) : |1−NEXN(i, j) 2 | < ε} N2 = 1, and for all k ≥ 1, there exists a finite rk independent of N such that sup 1≤i≤j≤N E √ NXN(i, j) k ≤ rk . Exercise 2.1.16 Check that the conclusion of Theorem 2.1.1 remains true when convergence in probability is replaced by almost sure convergence. Hint: Using Chebyshev’s inequality and the Borel-Cantelli lemma, it is enough to verify that for all integers k, there exists a constant C = C(k) such that |E hLN,x k i 2 − hL¯N,x k i 2 | ≤ C N2 . Exercise 2.1.17 In the setup of Theorem 2.1.1, assume that rk < ∞ for all k but not necessarily that E[Z 2 1,2 ] = 1. Show that for any integer number k, sup N∈N E[hLN,x k i] =: C(rℓ ,ℓ ≤ k) < ∞
20 2.WIGNER MATRICES Exercise 2.1.18 We develop in this exercise the limit theory for Wishart matrices. Let M=M(N)be a sequence of integers such that limM(W)/N=oa∈[l,∞). Consider an Nx M(N)matrix Yy with i.i.d.entries of mean zero and variance 1/N,and such that E (NYN(1,1))<.Define the Nx N Wishart matrix as WN=YNYN,and let LN denote the empirical measure of the eigenvalues of W. Set LN ELN. (①Write N(Lw,xas Σ EYN(i1,j)YN(i2,j)YN(i2,j)YN(is,j2)...YN(ik,jk)YN(i1,jk) 11,,1k J1....Jk and show that the only contributions to the sum (divided by N)that survive the passage to the limit are those in which each term appears exactly twice. Hint:use the words i1ji2j2...ji and a bi-partite graph to replace the Wigner analysis. (ii)Code the contributions as Dyck paths,where the even heights correspond to i indices and the odd heights correspond to j indices.Let e=e(i,j)denote the number of times the excursion makes a descent from an odd height to an even height(this is the number of distinct j indices in the tuple!),and show that the combinatorial weight of such a path is asymptotic to a (iii)Let e denote the number of times the excursion makes a descent from an even height to an odd height,and set B&= ∑ X= 习 Dyck paths of length 2k Dyck paths of length 2k (The B&are the kth moments of any weak limit of LN.)Prove that 民=立-1a=龙A-M-1k之1 =1 i=l (iv)Setting Ba(=)=0Bx,prove that Ba(=)=1+=Ba(=)2+(a-1)=Ba(=), and thus the limit F of LN possesses the Stieltjes transform(see Definition 2.4.1) --1Ba(1/=),where 1-(a-1-V1-4z哔-- Ba(=)= 2
20 2. WIGNER MATRICES Exercise 2.1.18 We develop in this exercise the limit theory for Wishart matrices. Let M = M(N) be a sequence of integers such that lim N→∞ M(N)/N = α ∈ [1,∞). Consider an N × M(N) matrix YN with i.i.d. entries of mean zero and variance 1/N, and such that E N k/2 |YN(1,1)| k ≤ rk < ∞. Define the N×N Wishart matrix as WN =YNY T N , and let LN denote the empirical measure of the eigenvalues of WN. Set L¯N = ELN. (i) Write NhL¯N,x k i as ∑ i1,...,ik j1,..., jk EYN(i1, j1)YN(i2, j1)YN(i2, j2)YN(i3, j2)···YN(ik , jk)YN(i1, jk) and show that the only contributions to the sum (divided by N) that survive the passage to the limit are those in which each term appears exactly twice. Hint: use the words i1 j1i2 j2 ... jk i1 and a bi-partite graph to replace the Wigner analysis. (ii) Code the contributions as Dyck paths, where the even heights correspond to i indices and the odd heights correspond to j indices. Let ℓ = ℓ(i,j) denote the number of times the excursion makes a descent from an odd height to an even height (this is the number of distinct j indices in the tuple!), and show that the combinatorial weight of such a path is asymptotic to N k+1α ℓ . (iii) Let ℓ¯denote the number of times the excursion makes a descent from an even height to an odd height, and set βk = ∑ Dyck paths of length 2k α ℓ , γk = ∑ Dyck paths of length 2k α ℓ¯ . (The βk are the kth moments of any weak limit of L¯N.) Prove that βk = α k ∑ j=1 γk−jβ j−1 , γk = k ∑ j=1 βk−jγ j−1 ,k ≥ 1. (iv) Setting ˆβα(z) = ∑ ∞ k=0 z kβk , prove that ˆβα(z) = 1 + z ˆβα(z) 2 +(α − 1)z ˆβα(z), and thus the limit Fα of L¯N possesses the Stieltjes transform (see Definition 2.4.1) −z −1 ˆβα(1/z), where ˆβα(z) = 1−(α −1)z− r 1−4z h α+1 2 − (α−1) 2z 4 i 2z
2.1 TRACES,MOMENTS AND COMBINATORICS 21 (v)Conclude that Fa possesses a density fa supported on [b-,b+],with b-= (1-va)2,b=(1+va)2,satisfying fa)=r-b-b4- 2,x∈[b-,b+] (2.1.27) 27πx (This is the famous Marchenko-Pastur law,due to [MaP67].) (vi)Prove the analog of Lemma 2.1.7 for Wishart matrices,and deduce that LN-Fa weakly,in probability. (vii)Note that Fi is the image of the semicircle distribution under the transforma- tionxx2 2.1.5 Some useful approximations This section is devoted to the following simple observation that often allows one to considerably simplify arguments concerning the convergence of empirical mea- sures. Lemma 2.1.19 (Hoffman-Wielandt)Let A,B be Nx N symmetric matrices,with eigenvalues1≤2≤..≤λfamdλB≤25≤..≤λR.Then, ∑I4-P≤r(4-B2. ✉1 Proof Note that trA2=∑,(24)2 and trB2=∑,(2B)2.Let U denote the matrix diagonalizing B written in the basis determined by A,and let D4,Dg denote the diagonal matrices with diagonal elementsrespectively.Then, tAB=trDAUDBUT=∑AG: The last sum is linear in the coefficients vi=and the orthogonality of U implies that v=1,ivi=1.Thus, rAB≤ sup Σ4月 (2.1.28) j20:=1,∑=1 But this is a maximization of a linear functional over the convex set of doubly stochastic matrices,and the maximum is obtained at the extreme points,which are well known to correspond to permutations The maximum among permuta- tions is then easily checked to beCollecting these facts together implies Lemma 2.1.19.Alternatively,one sees directly that a maximizing V=[vu}in (2.1.28),is the identity matrix.Indeed,assume w.1.o.g.that v<1.We then construct a matrix V={vi}with vi=1 and vn=vn for i>1 such that V is also
2.1 TRACES, MOMENTS AND COMBINATORICS 21 (v) Conclude that Fα possesses a density fα supported on [b−,b+], with b− = (1− √ α) 2 , b+ = (1+ √ α) 2 , satisfying fα(x) = p (x−b−)(b+ −x) 2πx , x ∈ [b−,b+]. (2.1.27) (This is the famous Marchenko-Pastur ˇ law, due to [MaP67].) (vi) Prove the analog of Lemma 2.1.7 for Wishart matrices, and deduce that LN → Fα weakly, in probability. (vii) Note that F1 is the image of the semicircle distribution under the transformation x 7→ x 2 . 2.1.5 Some useful approximations This section is devoted to the following simple observation that often allows one to considerably simplify arguments concerning the convergence of empirical measures. Lemma 2.1.19 (Hoffman-Wielandt) Let A, B be N ×N symmetric matrices, with eigenvalues λ A 1 ≤ λ A 2 ≤ ... ≤ λ A N and λ B 1 ≤ λ B 2 ≤ ... ≤ λ B N . Then, N ∑ i=1 |λ A i −λ B i | 2 ≤ tr(A−B) 2 . Proof Note that trA 2 = ∑i (λ A i ) 2 and trB 2 = ∑i (λ B i ) 2 . Let U denote the matrix diagonalizing B written in the basis determined by A, and let DA,DB denote the diagonal matrices with diagonal elements λ A i ,λ B i respectively. Then, trAB = trDAUDBU T = ∑ i, j λ A i λ B j u 2 i j . The last sum is linear in the coefficients vi j = u 2 i j, and the orthogonality of U implies that ∑j vi j = 1,∑i vi j = 1. Thus, trAB ≤ sup vi j≥0:∑j vi j=1,∑i vi j=1 ∑ i, j λ A i λ B j vi j . (2.1.28) But this is a maximization of a linear functional over the convex set of doubly stochastic matrices, and the maximum is obtained at the extreme points, which are well known to correspond to permutations The maximum among permutations is then easily checked to be ∑i λ A i λ B i . Collecting these facts together implies Lemma 2.1.19. Alternatively, one sees directly that a maximizing V = {vi j} in (2.1.28), is the identity matrix. Indeed, assume w.l.o.g. that v11 < 1. We then construct a matrix V¯ = {v¯i j} with ¯v11 = 1 and ¯vii = vii for i > 1 such that V¯ is also