its entries are chosen to be i.i.d., then the eigenvalues of (1/√n)A are asymptotically uniformly distributed on the unit disk of the complex plane. This is commonly referred to as Girko's full-circle law, which is exemplified in Figure 1.5. It has been proved in various degrees of rigor and generality in [173, 197, 85, 68, 9]. If the off-diagonal entries A_{ij} and A_{ji} are Gaussian and pairwise correlated with correlation coefficient ρ, then [238] shows that the eigenvalues of (1/√n)A are asymptotically uniformly distributed on an elliptical region of the complex plane whose axes coincide with the real and imaginary axes and have radii 1 + ρ and 1 − ρ, respectively. When ρ = 1, the projection of such an elliptic law onto the real axis is equal to the semicircle law.

Fig. 1.5 The full-circle law and the eigenvalues of a realization of a matrix of size n = 500.
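The full-circle law is easy to visualize numerically. The sketch below is not the code behind Figure 1.5; it assumes numpy and matplotlib are available and uses Gaussian entries for convenience. It plots the eigenvalues of (1/√n)A for one realization with n = 500 and overlays the unit circle:

```python
import numpy as np
import matplotlib.pyplot as plt

# One realization of an n x n matrix A with i.i.d. zero-mean, unit-variance
# complex Gaussian entries. By the full-circle law, the eigenvalues of
# A/sqrt(n) fill the unit disk approximately uniformly as n grows.
n = 500
rng = np.random.default_rng(0)
A = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)

eigs = np.linalg.eigvals(A / np.sqrt(n))

theta = np.linspace(0, 2 * np.pi, 400)
plt.scatter(eigs.real, eigs.imag, s=4)
plt.plot(np.cos(theta), np.sin(theta))   # unit circle for reference
plt.axis("equal")
plt.show()
```

Correlating the pairs (A_{ij}, A_{ji}) with coefficient ρ before plotting would illustrate the elliptic law in the same way.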
Most of the results surveyed above pertain to the eigenvalues of square matrices with independent entries. However, as we saw in Section 1.2, key problems in wireless communications involve the singular values of rectangular matrices H; even if those matrices have independent entries, the matrices HH† whose eigenvalues are of interest do not have independent entries.

When the entries of H are zero-mean i.i.d. Gaussian, HH† is commonly referred to as a Wishart matrix. The analysis of the joint distribution of the entries of Wishart matrices is as old as random matrix theory itself [311]. The joint distribution of the eigenvalues of such matrices is known as the Fisher-Hsu-Roy distribution and was discovered simultaneously and independently by Fisher [75], Hsu [120], Girshick [89] and Roy [210]. The corresponding marginal distributions can be expressed in terms of the Laguerre polynomials [125].

The asymptotic theory of singular values of rectangular matrices has concentrated on the case where the matrix aspect ratio converges to a constant,

$$\frac{K}{N} \to \beta \qquad\qquad (1.19)$$

as the size of the matrix grows.

The first success in the quest for the limiting empirical singular value distribution of rectangular random matrices is due to Marčenko and Pastur [170] in 1967. This landmark paper considers matrices of the form

$$W = W_0 + H T H^{\dagger} \qquad\qquad (1.20)$$

where T is a real diagonal matrix independent of H, W0 is a deterministic Hermitian matrix, and the columns of the N × K matrix H are i.i.d. random vectors whose distribution satisfies a certain symmetry condition (encompassing the cases of independent entries and uniform distribution on the unit sphere). In the special case where W0 = 0, T = I, and H has i.i.d. entries with variance 1/N, the limiting spectrum of W found in [170] is the density in (1.10). In the special case of square H, the asymptotic density function of the singular values, corresponding to the square root of the random variable whose p.d.f. is (1.10) with β = 1, is equal to the quarter circle law:

$$q(x) = \frac{1}{\pi}\sqrt{4 - x^2}, \qquad 0 \le x \le 2. \qquad\qquad (1.21)$$

As we will see in Section 2, in general (W0 ≠ 0 or T ≠ I) no closed-form expression is known for the limiting spectrum. Rather, [170] characterized it indirectly through its Stieltjes transform,⁴ which uniquely determines the distribution function. Since [170], this transform, which can be viewed as an iterated Laplace transform, has played a fundamental role in the theory of random matrices.

⁴ The Stieltjes transform is defined in Section 2.2.1. The Dutch mathematician T. J. Stieltjes (1856–1894) provided the first inversion formula for this transform in [246].
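The square-root relation behind (1.21) can be made explicit. Equation (1.10) is not reproduced in this excerpt; assuming it is the standard Marčenko-Pastur density with β = 1 and variance-1/N entries, f(x) = √(x(4 − x))/(2πx) supported on [0, 4], the density of the singular values s = √x follows from the usual change of variables:

$$q(s) = 2s\,f(s^2) = 2s \cdot \frac{\sqrt{s^2(4 - s^2)}}{2\pi s^2} = \frac{1}{\pi}\sqrt{4 - s^2}, \qquad 0 \le s \le 2,$$

which is the quarter circle law (1.21).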
Fig. 1.6 The quarter circle law compared with a histogram of the average of 100 empirical singular value density functions of a matrix of size 100 × 100.

Figure 1.6 compares the quarter circle law density function (1.21) with the average of 100 empirical singular value density functions of a 100 × 100 square matrix H with independent zero-mean complex Gaussian entries with variance 1/100.
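A comparison along the lines of Figure 1.6 can be reproduced with a few lines of code. The following is a minimal sketch (not the code used to generate the figure), assuming numpy and matplotlib are available:

```python
import numpy as np
import matplotlib.pyplot as plt

# Pool the singular values of 100 independent 100 x 100 matrices H with
# i.i.d. zero-mean complex Gaussian entries of variance 1/100, and compare
# the resulting histogram with the quarter circle law (1.21).
n = 100
num_trials = 100
rng = np.random.default_rng(0)

sv = []
for _ in range(num_trials):
    H = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
    sv.append(np.linalg.svd(H, compute_uv=False))
sv = np.concatenate(sv)

x = np.linspace(0, 2, 200)
plt.hist(sv, bins=40, density=True, alpha=0.5, label="empirical")
plt.plot(x, np.sqrt(4 - x**2) / np.pi, label="quarter circle law (1.21)")
plt.legend()
plt.show()
```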
Despite the ground-breaking nature of Marčenko and Pastur's contribution, it remained in obscurity for quite some time. For example, in 1977 Grenander and Silverstein [101] rediscovered (1.10) motivated by a neural network problem where the entries of H take only two values. Also unaware of the in-probability convergence result of [170], in 1978 Wachter [296] arrived at the same solution but in the stronger sense of almost sure convergence, under the condition that the entries of H have uniformly bounded central moments of order higher than 2 as well as the same means and variances within a row. The almost sure convergence for the model (1.20) considered in [170] was shown in [227]. Even as late as 1991, rediscoveries of the Marčenko-Pastur law can be found in the Physics literature [50].

The case where W0 = 0 in (1.20), T is not necessarily diagonal but Hermitian, and H has i.i.d. entries was solved by Silverstein [226], also in terms of the Stieltjes transform.

The special case of (1.20) where W0 = 0, H has zero-mean i.i.d. Gaussian entries and T = (YY†)⁻¹, where the K × m matrix Y also has zero-mean i.i.d. Gaussian entries with variance 1/m, independent of H, is called a (central) multivariate F-matrix. Because of the statistical applications of such matrices, their asymptotic spectrum has received considerable attention, culminating in the explicit expression found by Silverstein [223] in 1985.

The speed of convergence to the limiting spectrum is studied in [8]. For our applications it is more important, however, to assess the speed of convergence of the performance measures (e.g. capacity and MMSE) to their asymptotic limits. Note that the sums on the right side of (1.4) involve dependent terms. Thanks to that dependence, the convergence in (1.13) and (1.15) is quite remarkable: the deviations from the respective limits multiplied by N converge to Gaussian random variables with fixed mean⁵ and variance. This has been established for general continuous functions, not just the logarithmic and rational functions of (1.13) and (1.15), in [15] (see also [131]).

⁵ The mean is zero in the interesting special case where H has i.i.d. complex Gaussian entries [15].

The matrix of eigenvectors of Wishart matrices is known to be uniformly distributed on the manifold of unitary matrices (the so-called Haar measure) (e.g. [125, 67]). In the case of HH† where H has i.i.d. non-Gaussian entries, much less success has been reported in the asymptotic characterization of the eigenvectors [153, 224, 225].
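The Haar property admits a simple numerical check. For a Haar-distributed N × N unitary matrix, N times the squared magnitude of any single entry is approximately Exp(1) distributed for large N; this is a standard consequence of uniformity on the complex unit sphere, used here only as an illustration and not taken from the text. A sketch assuming numpy is available:

```python
import numpy as np

# If H has i.i.d. complex Gaussian entries, the eigenvector matrix U of the
# Wishart matrix HH^dagger is Haar distributed, so N*|U[0,0]|^2 should be
# approximately exponentially distributed with unit mean for large N.
N = 64
num_trials = 2000
rng = np.random.default_rng(1)

samples = np.empty(num_trials)
for t in range(num_trials):
    H = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
    _, U = np.linalg.eigh(H @ H.conj().T)   # columns of U are eigenvectors
    samples[t] = N * np.abs(U[0, 0]) ** 2

# Exp(1) has mean 1 and variance 1.
print("mean:", samples.mean(), "variance:", samples.var())
```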
For matrices whose entries are Gaussian and correlated according to a Toeplitz structure, an integral equation is known for the Stieltjes transform of the asymptotic spectrum as a function of the Fourier transform of the correlation function [147, 198, 55]. Other results on random matrices with correlated and weakly dependent entries can be found in [170, 196, 146, 53, 199, 145]. Reference [191], in turn, considers a special class of random matrices with dependent entries that falls outside the Marčenko-Pastur framework and that arises in the context of the statistical physics of disordered systems.

Incidentally, another application of the Stieltjes transform approach is the generalization of Wigner's semicircle law to the sum of a Wigner matrix and a deterministic Hermitian matrix. Provided Lindeberg-type conditions are satisfied by the entries of the random component, [147] obtained the deformed semicircle law, which is only known in closed form in the Stieltjes transform domain.

Sometimes, an alternative to the characterization of asymptotic spectra through the Stieltjes transform is used, based on the proof of convergence and evaluation of moments such as (1/N) tr{(HH†)^k}. For most cases of practical interest, the limiting spectrum has bounded support. Thus, the moment convergence theorem can be applied to obtain results on the limiting spectrum through its moments [297, 314, 315, 313].

An important recent development in asymptotic random matrix analysis has been the realization that the non-commutative free probability theory introduced by Voiculescu [283, 285] in the mid-1980s is applicable to random matrices. In free probability, the classical notion of independence of random variables is replaced by that of "freeness" or "free independence".

The power of the concept of free random matrices is best illustrated by the following setting. In general, we cannot find the eigenvalues of the sums of random matrices from the eigenvalues of the individual matrices (unless they have the same eigenvectors), and therefore the asymptotic spectrum of the sum cannot be obtained from the individual asymptotic spectra. An obvious exception is the case of independent diagonal matrices, in which case the spectrum of the sum is simply the convolution of the spectra. When the random matrices are asymptotically free [287], the asymptotic spectrum of the sum is also obtainable from the individual asymptotic spectra. Instead of convolu-