Figure 1. Constructing a super concentrator.

of our magical graph, where |Li| = n, and |Ri| = 3n/4. (ii) A super concentrator C′ connecting the input set R1 to the output set R2. The input and output sets have size 3n/4, and therefore C′ exists by induction. (iii) A perfect matching between L1 and L2. The input and output sets of our graph are L1 and L2 respectively. This is illustrated in Figure 1.

We need to prove that the graph we have constructed, C, is indeed a super concentrator and derive an upper bound on the number of its edges. Let S be a set of input vertices and T a set of output vertices such that |S| = |T| = k. If k ≤ n/2, then |ΓG1(S)| ≥ |S| and |ΓG2(T)| ≥ |T|, since G1, G2 are magical graphs. Hence, by Hall's theorem there exists a perfect matching from S to ΓG1(S) and from T to ΓG2(T). Let S′ ⊆ ΓG1(S) be the set of vertices matched to vertices in S, and likewise for T′ and T. Since C′ is a super concentrator, the sets S′ and T′ can be connected by k disjoint paths. Consequently, S and T can be connected by k disjoint paths in C.

If the two sets S and T are large, i.e. |S| = |T| = k > n/2, then there must exist at least k − n/2 vertices in S that are matched to vertices in T by direct matching edges of (iii) above. Therefore we can delete the matched vertices from S and T and reduce the problem to the previous case of k ≤ n/2. It follows that C is a super concentrator.

We still need to provide an upper bound on the number of edges e(n) in our n-input graph C. We obtain the following recursion:

e(n) ≤ 2nd + n + e(3n/4)   for n > n0,
e(n) ≤ n²                  for n ≤ n0.

Solving this recursion yields e(n) ≤ Kn, where K is a constant that depends only on n0 and d. Therefore we obtain a super concentrator with O(n) edges, as required.
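For concreteness, the bound e(n) ≤ Kn follows by simply unrolling the recursion: the input size shrinks by a factor of 3/4 at each level, so the per-level costs form a geometric series. For n > n0,

e(n) ≤ (2d + 1)(n + (3/4)n + (3/4)²n + · · ·) + n0² ≤ 4(2d + 1)n + n0²,

so one may take, say, K = 4(2d + 1) + n0². (This is a crude choice of constant; the only point is that K depends on d and n0 alone.)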
A word about algorithms to construct such graphs: Suppose that we have an algorithm which constructs magical graphs of left size n in time t(n). It should be clear that the above recursive construction yields an algorithm that constructs a super concentrator with input/output size n in time O(t(n)).

Finally, we note that super concentrators are but one example among a host of network construction problems in which expanders serve as a key building block. These include the celebrated AKS sorting network [AKS83] and the variety of communication and computation networks which appear in [WZ93] and its extensive list of references.

1.3.2. Construction of good error correcting codes. We now turn to Shannon's problem concerning communicating over a noisy channel and present a solution due to Sipser and Spielman [SS96]. We observe a simple but useful property of magical graphs. Let G be such a graph with n left vertices and 3n/4 right vertices. We show that for every nonempty S ⊂ L with s = |S| ≤ n/(10d) there exists a vertex u ∈ R with exactly one neighbor in S, namely, |Γ(u) ∩ S| = 1. To see this, consider e(S, Γ(S)), the number of edges between S and Γ(S). Clearly, e(S, Γ(S)) = d · |S| = ds. On the other hand, since |Γ(S)| ≥ 5ds/8, the average number of neighbors in S that a vertex in Γ(S) has is at most 8/5 < 2. But every vertex in Γ(S) has at least one neighbor in S, so there must be some (indeed, many) vertices in Γ(S) with exactly one neighbor in S.

We use the magical graph G to construct a code C ⊂ {0,1}^n with rate at least 1/4 and distance at least 1/(10d). To this end, represent the magical graph G = (R, L, E) by a matrix A with row set R and column set L, where a_ij equals 1 or 0 depending on whether or not the i-th vertex in R is adjacent to the j-th vertex in L. The code is defined as the right kernel of A, viz. C = {x ∈ {0,1}^n | Ax = 0}. (Here calculations are done over the field with two elements.) Clearly C is a linear subspace of {0,1}^n of dimension ≥ n/4. Hence |C| ≥ 2^(n/4), yielding the claimed lower bound on the code's rate.

To prove a lower bound on the distance, first observe that since C is a linear code (i.e. a linear subspace of {0,1}^n), the smallest distance between two of its codewords equals the smallest weight of a nonzero codeword. Let x ≠ 0 be an n-bit vector with support S = {j ∈ L : x_j = 1}. If |S| < n/(10d), then, as we saw, there is some i ∈ R with |Γ(i) ∩ S| = 1. It follows that the i-th coordinate of Ax is 1, and so x is not in the right kernel of A and cannot be a codeword. It follows that the normalized distance of C is at least 1/(10d).

The above construction is a special case of a so-called LDPC (for Low Density Parity Check) code. This idea was first suggested by Gallager [Gal63] and has inspired (among many others) the works by Bassalygo, Pinsker and Margulis [Pin73, BP73, Mar73], the first to explicitly define expander graphs and construct them. After being nearly dormant for about 20 years, LDPC codes regained prominence in the 90's and are now believed to give simultaneously the best coding parameters as well as the best algorithmic performance in various settings. For a survey of this fascinating field, see Richardson and Urbanke [RU].

Only fairly recently [CRVW02] did the art of explicit constructions of expanding graphs reach the level that makes the above simple argument feasible. It should also be mentioned that this construction not only yields codes with linear distance but also linear time iterative decoding. We will review these "lossless expanders" in Section 10.

As in the previous application, the time complexity of constructing the magical graph dominates the time to construct the (parity check matrix of the) appropriate code. This is yet another reason to seek efficient algorithms to construct these graphs. The next application calls for an even more concise and efficient description of these graphs.
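As a small illustration of the parity-check construction of Section 1.3.2, here is a sketch that builds the matrix A of a bipartite graph with n left and 3n/4 right vertices, computes the dimension of its right kernel over GF(2), and brute-forces the minimum codeword weight for a tiny n. The graph used is random with no magical-graph guarantee, so the printed distance is only an illustration of the mechanics, not of the 1/(10d) bound; the helper names (parity_check_matrix, gf2_rank, min_distance) are of this sketch only.

```python
import itertools
import random

def parity_check_matrix(n, d, seed=0):
    """Matrix A with row set R (3n/4 right vertices) and column set L (n left
    vertices).  Each left vertex gets d random right neighbors -- a random
    stand-in for a magical graph, not a graph with any expansion guarantee."""
    rng = random.Random(seed)
    m = 3 * n // 4
    A = [[0] * n for _ in range(m)]
    for j in range(n):                      # column j = j-th left vertex
        for i in rng.sample(range(m), d):   # its d distinct right neighbors
            A[i][j] = 1
    return A

def gf2_rank(rows):
    """Rank over GF(2); each row is a 0/1 list."""
    pivots = {}                             # leading-bit position -> pivot row (as int)
    rank = 0
    for r in rows:
        v = int("".join(map(str, r)), 2)
        while v:
            lead = v.bit_length() - 1
            if lead in pivots:
                v ^= pivots[lead]           # eliminate the leading bit
            else:
                pivots[lead] = v
                rank += 1
                break
    return rank

def min_distance(A, n):
    """Brute-force minimum weight of a nonzero x with Ax = 0 over GF(2)
    (exponential in n -- tiny examples only)."""
    best = None
    for x in itertools.product((0, 1), repeat=n):
        w = sum(x)
        if w and all(sum(a * b for a, b in zip(row, x)) % 2 == 0 for row in A):
            best = w if best is None else min(best, w)
    return best

n, d = 12, 5
A = parity_check_matrix(n, d)
print("dim C =", n - gf2_rank(A), ">= n/4 =", n // 4, "(A has only 3n/4 rows)")
print("min distance of C:", min_distance(A, n))
```

Note that the rate bound is automatic: A has only 3n/4 rows, so its rank is at most 3n/4 and the kernel has dimension at least n/4, exactly as argued above.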
1.3.3. Deterministic error amplification for RP. Our last problem revolves around deciding membership in a language L ∈ RP, with a given bound on the algorithm's error probability. The solution we present is due to Karp, Pippenger, and Sipser [KPS85]. It carries out dependent sampling of random strings using magical graphs.

As we explained above, we have to decide whether a given k-bit string x belongs to L or not. By assumption, there is a polytime algorithm that upon receiving x and a random k-bit string r calculates a function f(x, r) such that f(x, r) = 1 whenever x ∈ L, but f(x, r) = 1 with probability at most 1/16 (the probability is over the choice of r) when x ∉ L.

To reduce the probability of error we will be considering several strings r. However, our goal is to reduce the failure probability below some set threshold while we utilize as few such strings r as possible. In other words, fix some x ∉ L and let B = {r ∈ {0,1}^k | f(x, r) = 1} be the set of strings r that are "bad" in that they fail on input x. We would like to make it as likely as possible that at least one of the r's we consider lies outside of B. The only information we have about the set B ⊆ {0,1}^k is that it is not too big, |B| ≤ n/16 where n = 2^k.

For any given integer d, we offer an algorithm for the membership problem that evaluates f only d times and fails with probability ≤ 1/(10d). The algorithm is rather simple. Fix an (n, n; d)-magical graph G = (L, R, E) with n = 2^k, where each vertex in R and each vertex in L is associated with a unique k-bit string. To decide whether a given x is in L, sample a k-bit string r, which may be considered as a vertex in L. Let r1, ..., rd ∈ R be the (strings associated with) the d neighbors of r. The algorithm outputs 1 iff f(x, r1) = f(x, r2) = ... = f(x, rd) = 1.

Clearly this algorithm fails iff x ∉ L and r1, ..., rd ∈ B, i.e. Γ(r) ⊆ B. Let S ⊂ L be the set of left vertices that satisfy this condition (so we fail iff r ∈ S). Clearly Γ(S) ⊆ B. But we must have |S| ≤ n/(10d), or else we get a contradiction: |B| ≥ |Γ(S)| > (5d/8) · (n/(10d)) ≥ n/16 (this is the moment of magic here...). This upper bound on |S| means that we fail with probability at most 1/(10d) while using only the original k random bits. We can reduce the probability of error arbitrarily by increasing d appropriately. A key point is that we have reached this reduction in error probability without using any additional random bits.

Here are a few comments on this algorithm.

Unlike the previous two examples, the size n of the graph used is exponential in the natural size of the problem considered (the parameter k here). This means that for an efficient implementation of the new algorithm, our encoding of the magical graph must be much more efficient than in the previous applications. Specifically, given the name of a vertex (a k-bit string), we must be able to generate its d neighbors in time poly(k), which is far smaller than the size of the graph. We will later see that even this level of "explicitness" is achievable.

Next, with the d (dependent) samples used here, we can reduce the error to O(1/d). This is much inferior to the exponential decay of the error as a function of d when we "waste" random bits and make d independent samples. We will later see that (other) dependent sampling via expanders (which uses only a few more random bits than the solution above) can achieve such an exponential decay as well.

Another comment concerns the 1-sided errors.
Many probabilistic algorithms err both on inputs in the language L and those outside it, and the above amplification does not work as stated. However, we will later see that an appropriate modification of dependent sampling via expanders can achieve nearly optimal error reduction in such situations as well.
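To make the sampling scheme of Section 1.3.3 concrete, here is a minimal sketch of the one-sided amplification step. Both the base test f(x, r) and the oracle neighbors(r), which returns the d right-hand neighbors of the left vertex r in an (n, n; d)-magical graph, are assumptions of this sketch; in an actual implementation the latter would come from an explicit expander construction.

```python
import random

def amplified_test(f, neighbors, x, k):
    """One-sided error amplification in the style of Karp-Pippenger-Sipser (sketch).

    f(x, r)      -- hypothetical base RP test: always 1 when x is in the language,
                    and 1 with probability <= 1/16 over r otherwise.
    neighbors(r) -- hypothetical oracle giving the d neighbors in R of the left
                    vertex r of an (n, n; d)-magical graph, where n = 2**k.
    """
    r = random.getrandbits(k)                      # one uniformly random k-bit string
    return all(f(x, r_i) for r_i in neighbors(r))  # accept iff every f(x, r_i) = 1
```

The procedure evaluates f at most d times and consumes no randomness beyond the k bits used to sample r, yet by the argument above it errs with probability at most 1/(10d).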
These problems and results have developed into a whole subfield of Theoretical Computer Science called Randomness Extraction. Two excellent surveys of these issues are [Gol97] and [Sha04].

2. Graph expansion and eigenvalues

2.1. Edge expansion and a combinatorial definition of expanders. Let us introduce some conventions now. Unless we say otherwise, a graph G = (V, E) is undirected and d-regular (all vertices have the same degree d; that is, each vertex is incident to exactly d edges). Self loops and multiple edges are allowed. The number of vertices |V| is denoted by n. Unlike the previous section, graphs need not be bipartite. For S, T ⊂ V, denote the set of edges from S to T by E(S, T) = {(u, v) | u ∈ S, v ∈ T, (u, v) ∈ E}. Here we think of every undirected edge as a pair of directed edges, so E(S, T) is a set of directed edges. It will also be convenient to define E(S) as the set of edges for which both vertices belong to S.

Definition 2.1.
(1) The Edge Boundary of a set S, denoted ∂S, is ∂S = E(S, S̄). This is the set of edges emanating from the set S to its complement.
(2) The (edge) Expansion Ratio of G, denoted h(G), is defined as:

h(G) = min_{S : |S| ≤ n/2} |∂S| / |S|.

There are two important avenues for extending this definition. The first is in considering different notions of boundary. The most notable is vertex expansion, where we count the number of neighboring vertices of vertex sets S rather than the number of outgoing edges. See Sections 4 and 10 for more on this. The second avenue proceeds to explore expansion as a function of the set size. See subsection 4.6.

Definition 2.2. A sequence of d-regular graphs {Gi}i∈N of size increasing with i is a Family of Expander Graphs if there exists ε > 0 such that h(Gi) ≥ ε for all i.

Issues concerning the explicit construction of mathematical objects are fundamental to all of computer science, and expander graphs are no exception. There are two natural levels of efficiency to be considered in the construction of such graphs, which we have already seen in the examples of the previous section. In the first we require that an n-vertex graph should be generated "from scratch" in time polynomial in n. In the stronger version we demand that the neighborhood of any given vertex should be computable in time that is polynomial in the description length of the vertex (usually polynomial in log n).

The technicalities of these definitions may seem odd to the uninitiated reader, but they reflect a very natural need. Expander graphs are to be used by various algorithms. The algorithms' performance will depend on efficiently obtaining the relevant information of the expanders being used.
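To make Definition 2.1 concrete, here is a brute-force computation of h(G), a sketch meant only for toy examples since it is exponential in n; the adjacency-matrix representation and helper names are choices of this sketch. The n-cycle used below is the standard example of a 2-regular family that is not a family of expanders: its expansion ratio shrinks like 4/n.

```python
from itertools import combinations

def edge_boundary(adj, S):
    """|dS|: number of edges (with multiplicity) from S to its complement.
    adj[u][v] = number of edges between u and v."""
    S = set(S)
    n = len(adj)
    return sum(adj[u][v] for u in S for v in range(n) if v not in S)

def expansion_ratio(adj):
    """Brute-force h(G) = min over nonempty S with |S| <= n/2 of |dS| / |S|."""
    n = len(adj)
    best = float("inf")
    for k in range(1, n // 2 + 1):
        for S in combinations(range(n), k):
            best = min(best, edge_boundary(adj, S) / k)
    return best

def cycle(n):
    """Adjacency matrix of the n-cycle (2-regular)."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][(i + 1) % n] += 1
        adj[i][(i - 1) % n] += 1
    return adj

for n in (6, 10, 14):
    print(n, expansion_ratio(cycle(n)))   # roughly 4/n: the minimizing S is half the cycle
```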
Definition 2.3. Let {Gi}i be a family of expander graphs where Gi is a d-regular graph on ni vertices and the integers {ni} are increasing, but not too fast (e.g. n_{i+1} ≤ n_i² will do).
(1) The family is called Mildly Explicit if there is an algorithm that generates the j-th graph in the family, Gj, in time polynomial in j. (That is, Gj is computed in time < A·j^B for some constants A, B > 0.)
(2) The family is called Very Explicit if there is an algorithm that, on input of an integer i, a vertex v ∈ V(Gi) and k ∈ {1, ..., d}, computes the k-th neighbor of the vertex v in the graph Gi. This algorithm's run time should be polynomial in its input length (the number of bits needed to express the triple (i, v, k)).

2.2. Examples of expander graphs.

(1) A family of 8-regular graphs Gm for every integer m. The vertex set is Vm = Zm × Zm. The neighbors of the vertex (x, y) are (x + y, y), (x − y, y), (x, y + x), (x, y − x), (x + y + 1, y), (x − y + 1, y), (x, y + x + 1), (x, y − x + 1) (all operations are mod m).
This family of graphs, due to Margulis [Mar73], is the first explicitly constructed family of expander graphs. Margulis' proof of expansion was based on representation theory and did not provide any specific bound on the expansion ratio h. Gabber and Galil [GG81] later derived such a bound using harmonic analysis. In Section 8 we show that Margulis' graphs are expanders. Note that this family is very explicit.

(2) A family of 3-regular p-vertex graphs for every prime p. Here Vp = Zp, and a vertex x is connected to x + 1, x − 1, and to its inverse x⁻¹ (operations are mod p, and we define the inverse of 0 to be 0).
Here, the proof of expansion depends on a deep result in Number Theory, the Selberg 3/16 theorem; see the discussion in subsection 11.1.2 for more details. This family is only mildly explicit, since we are at present unable to generate large primes deterministically. See [Gra05] for a survey of the Agrawal-Kayal-Saxena polytime primality testing algorithm.

2.3. Graph spectrum and an algebraic definition of expansion. The Adjacency Matrix of an n-vertex graph G, denoted A = A(G), is an n × n matrix whose (u, v) entry is the number of edges in G between vertex u and vertex v. Being real and symmetric, the matrix A has n real eigenvalues which we denote by λ1 ≥ λ2 ≥ ··· ≥ λn. We can also associate with it an orthonormal system of eigenvectors v1, ..., vn with Avi = λivi. We often refer to the eigenvalues of A(G) as the Spectrum of the graph G. The spectrum of a graph encodes a lot of information about it. Here are some simple illustrations of how certain properties of a d-regular graph are reflected in its spectrum:

• λ1 = d, and the corresponding eigenvector is v1 = 1/√n = (1/√n, ..., 1/√n), where 1 denotes the all-ones vector.
• The graph is connected iff λ1 > λ2.
• The graph is bipartite iff λ1 = −λn.

As seen in the next theorem, the graph's second eigenvalue is closely related to its expansion parameter.
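Before stating it, here is a quick numerical illustration of the three spectral facts just listed; it is only a sketch for toy graphs, using the n-cycle, a 2-regular graph that is connected and bipartite exactly when n is even.

```python
import numpy as np

def cycle_adjacency(n):
    """Adjacency matrix of the n-cycle: connected, 2-regular, bipartite iff n is even."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] += 1
        A[i, (i - 1) % n] += 1
    return A

for n in (8, 9):
    eig = np.sort(np.linalg.eigvalsh(cycle_adjacency(n)))[::-1]  # decreasing order
    print(f"n = {n}: lambda_1 = {eig[0]:.3f}, lambda_2 = {eig[1]:.3f}, "
          f"lambda_n = {eig[-1]:.3f}")
    # lambda_1 = d = 2, with eigenvector proportional to the all-ones vector;
    # lambda_1 > lambda_2 since the cycle is connected;
    # lambda_n = -2 = -lambda_1 exactly when n is even (the bipartite case).
```

One can also check numerically that λ2 = 2cos(2π/n) approaches d = 2 as n grows, matching the vanishing expansion of the cycle computed in the previous sketch; the theorem below makes this connection between λ2 and h(G) precise.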