22 2.WIGNER MATRICES a maximizing matrix.Indeed,because vi<1,there exist a j and a k with vij>0 and vi >0.Set v min(v1i,VKI)>0 and define vi =v11+v,Vki=v&i+v and vij=vI-v,V&I =VkI -v,and vab vab for all other pairs ab.Then, ∑(何-)=(+9--) i. =v(2-)(2-15)≥0 Thus,satisfies the constraints,is also a maximum,and the number of zero elements in the first row and column of V is larger by I at least from the corresponding one for V.If v=1,the claims follows,while if vi<1,one repeats this(at most 2N-2 times)to conclude.Proceeding in this manner with all diagonal elements of V,one sees that indeed the maximum of the right hand side of(2.1.28)is∑,λ4B,as claimed. 0 Remark 2.1.20 The statement and proof of Lemma 2.1.19 carry over to the case where 4 and B are both Hermitian matrices. Lemma 2.1.19 allows one to perform all sorts of truncations when proving con- vergence of empirical measures.For example,let us prove the following variant of Wigner's Theorem 2.1.1. Theorem 2.1.21 Assume XN is as in (2.1.2),except that instead of (2.1.1),only <is assumed.Then,the conclusion of Theorem 2.1.I still holds. Proof Fix a constant C and consider the symmetric matrix Xy whose elements satisfy,forl≤i≤j≤N, x(i.j)=Xx(i.j)1 xx(.lc-E(Xx(i,j)1 M(.msc). Then,withN denoting the eigenvalues of y,ordered,it follows from Lemma 2.1.19that 2-f≤- But, 所=Xv-龙w)P ≤[x(e-E(.' Since r2<,and the involved random variables are identical in law to either Z1.2 or Yi,it follows that E(())converges to 0 uniformly in
22 2. WIGNER MATRICES a maximizing matrix. Indeed, because v11 < 1, there exist a j and a k with v1 j > 0 and vk1 > 0. Set v = min(v1 j ,vk1) > 0 and define ¯v11 = v11 +v, ¯vk j = vk j +v and v¯1 j = v1 j −v, ¯vk1 = vk1 −v, and ¯vab = vab for all other pairs ab. Then, ∑ i, j λ A i λ B j (v¯i j −vi j) = v(λ A 1 λ B 1 +λ A k λ B j −λ A k λ B 1 −λ A 1 λ B j ) = v(λ A 1 −λ A k )(λ B 1 −λ B j ) ≥ 0. Thus, V¯ = {v¯i j} satisfies the constraints, is also a maximum, and the number of zero elements in the first row and column of V¯ is larger by 1 at least from the corresponding one for V. If ¯v11 = 1, the claims follows, while if ¯v11 < 1, one repeats this (at most 2N − 2 times) to conclude. Proceeding in this manner with all diagonal elements of V, one sees that indeed the maximum of the right hand side of (2.1.28) is ∑i λ A i λ B i , as claimed. ⊓⊔ Remark 2.1.20 The statement and proof of Lemma 2.1.19 carry over to the case where A and B are both Hermitian matrices. Lemma 2.1.19 allows one to perform all sorts of truncations when proving convergence of empirical measures. For example, let us prove the following variant of Wigner’s Theorem 2.1.1. Theorem 2.1.21 Assume XN is as in (2.1.2), except that instead of (2.1.1), only r2 < ∞ is assumed. Then, the conclusion of Theorem 2.1.1 still holds. Proof Fix a constant C and consider the symmetric matrix XˆN whose elements satisfy, for 1 ≤ i ≤ j ≤ N, XˆN(i, j) = XN(i, j)1√ N|XN(i, j)|≤C −E(XN(i, j)1√ N|XN(i, j)|≤C ). Then, with ˆλ N i denoting the eigenvalues of XˆN, ordered, it follows from Lemma 2.1.19 that 1 N N ∑ i=1 |λ N i − ˆλ N i | 2 ≤ 1 N tr(XN −XˆN) 2 . But, WN := 1 N tr(XN −XˆN) 2 ≤ 1 N2 ∑ i, j h√ NXN(i, j)1| √ NXN(i, j)|≥C −E( √ NXN(i, j)1| √ NXN(i, j)|≥C ) i2 . Since r2 < ∞, and the involved random variables are identical in law to either Z1,2 or Y1, it follows that E[(√ NXN(i, j))21| √ NXN(i, j)|≥C ] converges to 0 uniformly in
2.1 TRACES,MOMENTS AND COMBINATORICS 23 N,i,/when C converges to infinity.Hence,one may chose for each g a large enough C such that P(WNI>g)<g.Further,let Lip(R)={f∈Cb(R):suplf(x)I≤l,sup f-f@≤1}. x-川 Then,on the event {Wv<E),it holds that for fE Lip(R), a-,1≤∑a-别≤VE, where Ly denotes the empirical measure of the eigenvalues of Xy,and Jensen's inequality was used in the second inequality.This,together with the weak conver- gence in probability of LN toward the semicircle law assured by Theorem 2.1.1, and the fact that weak convergence is equivalent to convergence with respect to the Lipschitz bounded metric,see Theorem C.8,complete the proof of Theorem 2.1.21. 2.1.6 Maximal eigenvalues and Fiiredi-Komlos enumeration Wigner's theorem asserts the weak convergence of the empirical measure ofeigen- values to the compactly supported semicircle law.One immediately is led to sus- pect that the maximal eigenvalue of Yy should converge to the value 2,the largest element of the support of the semicircle distribution.This fact,however,does not follow from Wigner's theorem.Nonetheless,the combinatorial techniques we have already seen allow one to prove the following,where we use the notation introduced in (2.1.1)and (2.1.2). Theorem 2.1.22 (Maximal eigenvalue)Consider a Wigner matrix Xy satisfying rk kCk for some constant C and all integers k.Then.h comverges to 2 in probability. Remark:The assumption of Theorem 2.1.22 holds if the random variables |Z1.2 and Y|possess a finite exponential moment. Proof of Theorem 2.1.22 Fix 8>0 and let g:RR+be a continuous function supported on [2-6,2],with (o,g)=1.Then,applying Wigner's theorem 2.1.1, P2N<2-)≤PLw,8)=0)≤PILw,8-(o,8>)-→w-0.(2.129) We thus need to provide a complementary estimate on the probability that is large.We do that by estimating(LN,x2k)for k growing with N,using the bounds
2.1 TRACES, MOMENTS AND COMBINATORICS 23 N,i, j, when C converges to infinity. Hence, one may chose for each ε a large enough C such that P(|WN| > ε) < ε. Further, let Lip(R) = { f ∈ Cb(R) : sup x | f(x)| ≤ 1,sup x6=y | f(x)− f(y) |x−y| ≤ 1}. Then, on the event {|WN| < ε}, it holds that for f ∈ Lip(R), |hLN, fi − hLˆN, fi| ≤ 1 N ∑ i |λ N i − ˆλ N i | ≤ √ ε , where LˆN denotes the empirical measure of the eigenvalues of XˆN, and Jensen’s inequality was used in the second inequality. This, together with the weak convergence in probability of LˆN toward the semicircle law assured by Theorem 2.1.1, and the fact that weak convergence is equivalent to convergence with respect to the Lipschitz bounded metric, see Theorem C.8, complete the proof of Theorem 2.1.21. ⊓⊔ 2.1.6 Maximal eigenvalues and Furedi-Koml ¨ os enumeration ´ Wigner’s theorem asserts the weak convergence of the empirical measure of eigenvalues to the compactly supported semicircle law. One immediately is led to suspect that the maximal eigenvalue of XN should converge to the value 2, the largest element of the support of the semicircle distribution. This fact, however, does not follow from Wigner’s theorem. Nonetheless, the combinatorial techniques we have already seen allow one to prove the following, where we use the notation introduced in (2.1.1) and (2.1.2). Theorem 2.1.22 (Maximal eigenvalue) Consider a Wigner matrix XN satisfying rk ≤ k Ck for some constant C and all integers k. Then, λ N N converges to 2 in probability. Remark: The assumption of Theorem 2.1.22 holds if the random variables |Z1,2| and |Y1| possess a finite exponential moment. Proof of Theorem 2.1.22 Fix δ > 0 and let g : R 7→ R+ be a continuous function supported on [2−δ,2], with hσ,gi = 1. Then, applying Wigner’s theorem 2.1.1, P(λ N N < 2−δ) ≤ P(hLN,gi = 0) ≤ P(|hLN,gi− hσ,gi| > 1 2 ) →N→∞ 0. (2.1.29) We thus need to provide a complementary estimate on the probability that λ N N is large. We do that by estimating hL¯N,x 2k i for k growing with N, using the bounds
24 2.WIGNER MATRICES on r&provided in the assumptions.The key step is contained in the following combinatorial lemma,that gives information on the sets see(2.1.16). Lemma 2.1.23 For all integers k>2t-2 one has the estimate |%≤2k3(-21+2) (2.1.30) The proof of Lemma 2.1.23 is deferred to the end of this section. Equipped with Lemma 2.1.23,we have for 2k<N,using(2.1.17), 亿w,x2 (2.1.31) ΣN-+spⅡEZ)ΠE(Y To evaluate the last expectation,fix wEr,and let denote the number ofedges in Ef with N2=2.Holder's inequality then gives 黑E2LE0)≤n-. with the convention that ro =1.Since Gw is connected,E-1=t-1.On the other hand,by noting that Ne>3 for E-/edges,one has 2k>3(El- 1+21+2E.Hence,2k-21<6(k+1-t).Since r2g is a non-decreasing function of g bounded below by 1,we get,substituting back in (2.1.31),that for some constant c1 =c1(C)>0 and all k<N, 两宫( k+1- T6k+1-) 2(6(k+1-t0)6c)+1-1 4“(贷 (2.1.32)
24 2. WIGNER MATRICES on rk provided in the assumptions. The key step is contained in the following combinatorial lemma, that gives information on the sets Wk,t , see (2.1.16). Lemma 2.1.23 For all integers k > 2t −2 one has the estimate |Wk,t | ≤ 2 k k 3(k−2t+2) . (2.1.30) The proof of Lemma 2.1.23 is deferred to the end of this section. Equipped with Lemma 2.1.23, we have for 2k < N, using (2.1.17), hL¯N,x 2k i (2.1.31) ≤ k+1 ∑ t=1 N t−(k+1) |W2k,t | sup w∈W2k,t ∏e∈Ec w E(Z N w e 1,2 ) ∏e∈Es w E(Y N w e 1 ) ≤ 4 k k+1 ∑ t=1 (2k) 6 N k+1−t sup w∈W2k,t ∏e∈Ec w E(Z N w e 1,2 ) ∏e∈Es w E(Y N w e 1 ). To evaluate the last expectation, fix w ∈ W2k,t , and let l denote the number of edges in E c w with N w e = 2. H¨older’s inequality then gives ∏e∈Ec w E(Z N w e 1,2 ) ∏e∈Es w E(Y N w e 1 ) ≤ r2k−2l , with the convention thatr0 = 1. Since Gw is connected, |E c w | ≥ |Vw|−1 =t −1. On the other hand, by noting that N w e ≥ 3 for |E c w | − l edges, one has 2k ≥ 3(|E c w | − l) + 2l + 2|E s w|. Hence, 2k − 2l ≤ 6(k + 1 − t). Since r2q is a non-decreasing function of q bounded below by 1, we get, substituting back in (2.1.31), that for some constant c1 = c1(C) > 0 and all k < N, hL¯N,x 2k i ≤ 4 k k+1 ∑ t=1 (2k) 6 N k+1−t r6(k+1−t) ≤ 4 k k+1 ∑ t=1 (2k) 6 (6(k +1−t))6C N k+1−t ≤ 4 k k ∑ i=0 k c1 N i . (2.1.32)
2.1 TRACES,MOMENTS AND COMBINATORICS 25 Choose next a sequence k(N)→w-o∞such that k(N)ei/W→w-s0butk(W)/logN→ws oo.Then,for anyδ>0,and all N large, P(a>(2+δ)≤P(W(Lw,x2)>(2+8)2W) N(LN,x2(N)) (2+6)2kW 2N4k(N) ≤ (2+8)2而→w0, completing the proof of Theorem 2.1.22,modulo Lemma 2.1.23. ▣ Proof of Lemma 2.1.23 The idea of the proof it to keep track of the number of possibilities to prevent words in from having weight k/2+1.Toward this end,let wE be given.A parsing of the word w is a sentence aw =(w1,...,wn) such that the word obtained by concatenating the words w;is w.One can imagine creating a parsing of w by introducing commas between parts of w. We say that a parsing a=aw of w is an FK parsing (after Furedi and Komlos), and call the sentence a an FK sentence,if the graph associated with a is a tree,if Na <2 for all e E Ea,and if for any i=1,...,n-1,the first letter of w belongs to =suppwj.If the one word sentence a=w is an FK parsing,we say thatw is an FK word.Note that the constituent words in an FK parsing are FK words. As will become clear next,the graph of an FK word consists of trees whose edges have been visited twice by w,glued together by edges that have been visited only once.Recalling that a Wigner word is either a one letter word or a closed word of odd length and maximal weight(subject to the constraint that edges are visited at least twice),this leads to the following lemma. Lemma 2.1.24 Each FK word can be written in a unique way as a concatenation of pairwise disjoint Wigner words.Further,there are at most 2-equivalence classes of FK words of length n. Proof of Lemma 2.1.24 Let w=s1...sn be an FK word of length n.By definition, Gw is a tree.Let [si,i+)=1denote those edges of G visited only once by the walk induced by w.Defining io=1,one sees that the words w=5i1+15 j1,are closed,disjoint,and visit each edge in the tree Gw,exactly twice.In particular,withl:=i-ij-1-1,it holds that l is even (possibly,l=0 if w is a one letter word),and further if thenThis decomposi- tion being unique,one concludes that for any =with N denoting the number of
2.1 TRACES, MOMENTS AND COMBINATORICS 25 Choose next a sequence k(N)→N→∞ ∞ such that k(N) c1 /N →N→∞ 0 but k(N)/logN →N→∞ ∞. Then, for any δ > 0, and all N large, P(λ N N > (2+δ)) ≤ P(NhLN,x 2k(N) i > (2+δ) 2k(N) ) ≤ NhL¯N,x 2k(N) i (2+δ) 2k(N) ≤ 2N4 k(N) (2+δ) 2k(N) →N→∞ 0, completing the proof of Theorem 2.1.22, modulo Lemma 2.1.23. ⊓⊔ Proof of Lemma 2.1.23 The idea of the proof it to keep track of the number of possibilities to prevent words in Wk,t from having weight ⌊k/2⌋+1. Toward this end, let w ∈ Wk,t be given. A parsing of the word w is a sentence aw = (w1,...,wn) such that the word obtained by concatenating the words wi is w. One can imagine creating a parsing of w by introducing commas between parts of w. We say that a parsing a = aw of w is an FK parsing (after F¨uredi and Koml´os), and call the sentence a an FK sentence, if the graph associated with a is a tree, if N a e ≤ 2 for all e ∈ Ea, and if for any i = 1,...,n−1, the first letter of wi+1 belongs to Si j=1 suppwj . If the one word sentence a = w is an FK parsing, we say that w is an FK word. Note that the constituent words in an FK parsing are FK words. As will become clear next, the graph of an FK word consists of trees whose edges have been visited twice by w, glued together by edges that have been visited only once. Recalling that a Wigner word is either a one letter word or a closed word of odd length and maximal weight (subject to the constraint that edges are visited at least twice), this leads to the following lemma. Lemma 2.1.24 Each FK word can be written in a unique way as a concatenation of pairwise disjoint Wigner words. Further, there are at most 2 n−1 equivalence classes of FK words of length n. Proof of Lemma 2.1.24 Let w = s1 ···sn be an FK word of length n. By definition, Gw is a tree. Let {sij ,sij+1} r j=1 denote those edges of Gw visited only once by the walk induced by w. Defining i0 = 1, one sees that the words ¯wj = sij−1+1 ···sij , j ≥ 1, are closed, disjoint, and visit each edge in the tree Gw¯j exactly twice. In particular, with lj := ij − ij−1 − 1, it holds that lj is even (possibly, lj = 0 if ¯wj is a one letter word), and further if lj > 0 then ¯wj ∈ Wlj ,lj/2+1 . This decomposition being unique, one concludes that for any z, with Nn denoting the number of
26 2.WIGNER MATRICES equivalence classes of FK words of length n,and with |:=1, 三:-含之n%wel =1W=1=1 lyeven +宫w (2.1.33) in the sense of formal power series.By the proof of Lemma 2.1.6,1= C=B1.Hence,by Lemma 2.1.3,for<1/4, +∑244=A)=1-- 00 =1 22 Substituting in(2.1.33),one sees that(again,in the sense of power series) 邛(e2) 1-1-421,z+分 1-zβ(2) 2-1+-p=5+-标 Using that 告() one concludes that 三=++22(0)月 from which Lemma 2.1.24 follows. 0 Our interest in FK parsings is the following FK parsing w'of a word w= s1...sn.Declare an edge e of Gw to be new (relative to w)if for some index 1 <i<n we have e={si,s+1}and sits1,...,si.If the edge e is not new, then it is old.Define w'to be the sentence obtained by breaking w(that is,"insert- ing a comma")at all visits to old edges of Gw and at third and subsequent visits to new edges of G. Since a word w can be recovered from its FK parsing by omitting the extra commas,and since the number of equivalence classes of FK words is estimated by Lemma 2.1.24,one could hope to complete the proof of Lemma 2.1.23 by controlling the number of possible parsed FK sequences.A key step toward this end is the following lemma,which explains how FK words are fitted together to form FK sentences.Recall that any FK word w can be written in a unique way as a concatenation of disjoint Wigner words wi,i=1,...,r.With s,denoting the first (and last)letter of wi,define the skeleton of w as the word s...sr.Finally,for a
26 2. WIGNER MATRICES equivalence classes of FK words of length n, and with |W0,1| := 1, ∞ ∑ n=1 Nnz n = ∞ ∑ r=1 ∑ {l j } r j=1 lj even r ∏ j=1 z lj+1 |Wlj ,lj/2+1 | = ∞ ∑ r=1 z+ ∞ ∑ l=1 z 2l+1 |W2l,l+1| !r , (2.1.33) in the sense of formal power series. By the proof of Lemma 2.1.6, |W2l,l+1| = Cl = βl . Hence, by Lemma 2.1.3, for |z| < 1/4, z+ ∞ ∑ l=1 z 2l+1 |W2l,l+1| = z ˆβ(z 2 ) = 1− √ 1−4z 2 2z . Substituting in (2.1.33), one sees that (again, in the sense of power series) ∞ ∑ n=1 Nnz n = z ˆβ(z 2 ) 1−z ˆβ(z 2) = 1− √ 1−4z 2 2z−1+ √ 1−4z 2 = − 1 2 + z+ 1 √ 2 1−4z 2 . Using that r 1 1−t = ∞ ∑ k=0 t k 4 k 2k k , one concludes that ∞ ∑ n=1 Nnz n = z+ 1 2 (1+2z) ∞ ∑ n=1 z 2n 2n n , from which Lemma 2.1.24 follows. ⊓⊔ Our interest in FK parsings is the following FK parsing w ′ of a word w = s1 ···sn. Declare an edge e of Gw to be new (relative to w) if for some index 1 ≤ i < n we have e = {si ,si+1} and si+1 6∈ {s1,...,si}. If the edge e is not new, then it is old. Define w ′ to be the sentence obtained by breaking w (that is, “inserting a comma”) at all visits to old edges of Gw and at third and subsequent visits to new edges of Gw. Since a word w can be recovered from its FK parsing by omitting the extra commas, and since the number of equivalence classes of FK words is estimated by Lemma 2.1.24, one could hope to complete the proof of Lemma 2.1.23 by controlling the number of possible parsed FK sequences. A key step toward this end is the following lemma, which explains how FK words are fitted together to form FK sentences. Recall that any FK word w can be written in a unique way as a concatenation of disjoint Wigner words wi , i = 1,...,r. With si denoting the first (and last) letter of wi , define the skeleton of w as the word s1 ···sr . Finally, for a