464 SHLOMO HOORY, NATHAN LINIAL, AND AVI WIGDERSON

As before, this simple approach seems to give away a factor of √(β₀βₜ), which is important for certain applications. For further discussion, see Alon-Feige-Wigderson-Zuckerman [AFWZ95] or Bilu-Hoory [BH04]. For a different analysis using perturbation theory, which gives a Chernoff-like bound for expander walks, see [Gil98].

3.3. Applications.

3.3.1. Efficient error reduction in probabilistic algorithms. We now return to the computational problem raised in subsection 1.1.3 of reducing the error in probabilistic algorithms while trying to save on the random bits used for this task.

Let A be a probabilistic algorithm for the language (= set of binary strings) L. We first deal with the simpler case where the algorithm makes errors only on inputs outside L (so-called one-sided error), in which case we say that L is in the complexity class RP. We then deal with the case where A may err both on inputs in and outside L (so-called two-sided error), in which case L is in the corresponding complexity class BPP.

Again, recall that if we do not attempt to save random bits, an obvious reduction in error can be achieved by running the algorithm many (say t) times, each time with independent random bits. In the one-sided error case we would take the conjunction of the answers, and in the two-sided error case we would take the majority. In both cases the error probability drops exponentially in t. However, the number of random bits used increases by a factor of t. We now proceed to achieve the same error probability in both cases, with far fewer random bits, using expander walks.

One-sided error. Let A be an algorithm to decide membership in L that is randomized with a one-sided error. To decide whether a given input x belongs to L, the algorithm samples a string r ∈ {0,1}^m and computes in polynomial time a boolean function A(x, r). If x ∈ L, then A(x, r) = 1 for all r.
If x ∉ L, the probability (over the choice of r) that A(x, r) = 1 is at most β. Again, our goal is to reduce the probability of error to below a given threshold without a substantial increase in the number of random bits required. To this end, choose an explicit (n, d, α)-graph G = (V, E), with V = {0,1}^m, and a value of α sufficiently smaller than the bound β, the error probability of the given algorithm. Note that the choice of α puts a lower bound on d (but, as we shall see later, d can be taken to be O(α^{-2})).

For a given input x, let Bx = B ⊆ {0,1}^m be the set of all strings r for which the algorithm A errs on input x. We now introduce another algorithm A′ for the membership problem.

(1) Pick a vertex v0 ∈ V uniformly at random.
(2) Start from it a length-t random walk, say (v0, v1, ..., vt).
(3) Return A(x, v0) ∧ A(x, v1) ∧ ··· ∧ A(x, vt).

We note that for the new algorithm A′ to be efficient, this walk has to be efficiently computed; hence the importance of having G explicitly described. By Theorem 3.6,

Pr[A′ fails] = Pr[vi ∈ B for all i] ≤ (β + α)^t.

Compared with the algorithm from Section 1, the new algorithm achieves an exponential reduction in error probability, while the number of random bits used is only m + t log d = m + O(t).
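The three steps of A′ can be sketched in code. This is a minimal illustration, not from the survey: the function name, the `neighbors` callback standing in for an explicitly described expander, and the toy parameters are all ours. Only the starting vertex costs a full fresh sample; each subsequent step costs only log d random bits.

```python
import random

def amplified_decide(A, x, V, neighbors, t, rng=random):
    """One-sided error reduction by an expander walk (sketch).

    A(x, r)      -- base algorithm with one-sided error: A(x, r) = 1
                    for all r when x is in L, and A(x, r) = 1 with
                    probability at most beta when x is not in L.
    V, neighbors -- vertex set and neighbor function of an explicit
                    (n, d, alpha)-graph with V = {0,1}^m.
    t            -- walk length; the error drops like (beta + alpha)^t.
    """
    v = rng.choice(V)                 # step (1): v0, the only full sample
    answer = A(x, v)
    for _ in range(t):                # step (2): length-t random walk
        v = rng.choice(neighbors(v))  # log d random bits per step
        answer &= A(x, v)             # step (3): conjunction of answers
    return answer
```

Since a member of L always receives answer 1, A′ inherits the one-sided error: it fails only when the entire walk stays inside the bad set B.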
EXPANDER GRAPHS AND THEIR APPLICATIONS 465

Two-sided errors. When our basic algorithm can err on every input, not just when x ∉ L, we say that it makes a two-sided error. We show that the probability of success can be amplified as well in algorithms which make two-sided errors, using the same trick. We say that the language L belongs to the complexity class BPP if there is a polynomial-time randomized algorithm A to decide whether a given input x belongs to L. It is assumed that for every x (either in or out of L), A errs with probability ≤ β ≤ 1/10. To reduce our probability of error we can run A on t independently sampled random strings and take a majority vote. It is a simple consequence of the Chernoff bound that the resulting error probability decreases exponentially with t. To save on randomness we again use expander walks.

As before, we assume that A uses m random bits, and we employ an (n, d, α)-graph on the vertex set V = {0,1}^m. Again let Bx = B ⊂ V be the collection of all random strings for which the algorithm A errs on input x. Our modified algorithm A′ works as follows:

(1) Pick a vertex v0 ∈ V uniformly at random.
(2) Start from it a length-t random walk, say (v0, ..., vt).
(3) Return majority{A(x, vi)}.

The algorithm A′ fails iff a majority of the vi's belong to B. Fix a set of indices K ⊂ {0, 1, ..., t} of cardinality |K| ≥ (t + 1)/2. By Theorem 3.10,

Pr[vi ∈ B for all i ∈ K] ≤ (β + α)^{|K|−1} ≤ (β + α)^{(t−1)/2}.

We will assume that α + β ≤ 1/8 and apply the union bound over the possible choices of K to deduce that

Pr[A′ fails] ≤ 2^t · (β + α)^{(t−1)/2} = O(2^{−t/2}).

So here too we achieve an exponential reduction of the error probability using only m + O(t) random bits. In the following table we collect the main parameters of the various techniques presented for error reduction.

Method                                                      | Error probability | No. of random bits
Randomized algorithm A                                      | 1/10              | m
t independent repetitions of A                              | 2^{−t}            | t · m
Sampling a point and its neighbors in an (n, t, 1/√t)-graph | 1/t               | m
A random walk of length t on an (n, d, 1/40)-graph          | 2^{−t/2}          | m + O(t)

Further progress and reading. The exact form of the exponential decay in error using expander walks, and its dependence on the spectral gap, was found by Gillman [Gil98] and is a natural optimal generalization of the Chernoff bound for independent sampling. It is easy to see that a good approximation to the probability of hitting any set (event) in the space actually gives a good sampler for the average of any real function, and indeed expander walks are used for that purpose (for a good survey on randomness-efficient samplers, see [Gol97]).

3.3.2. Hardness of approximating maximum clique size. We now turn to a different application of random walks on expanders to computational complexity. We show how they are used in enhancing hardness of approximation factors of the clique problem. We first give the necessary background on hardness of approximation.
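Returning briefly to the two-sided case above: the walk-based amplifier differs from the one-sided sketch only in its final vote. A minimal illustration (the interface and names are ours, not the survey's; it assumes α + β ≤ 1/8 as in the analysis above):

```python
import random

def amplified_majority(A, x, V, neighbors, t, rng=random):
    """Two-sided error reduction: expander walk plus majority vote.

    A errs with probability at most beta on every input; the walk is
    taken on an (n, d, alpha)-graph with alpha + beta <= 1/8, giving
    failure probability O(2^(-t/2)) with only m + O(t) random bits.
    """
    v = rng.choice(V)
    votes = A(x, v)                       # each A(x, v_i) is 0 or 1
    for _ in range(t):
        v = rng.choice(neighbors(v))
        votes += A(x, v)
    return 1 if 2 * votes > t + 1 else 0  # strict majority of t+1 answers
```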
466 SHLOMO HOORY, NATHAN LINIAL, AND AVI WIGDERSON

Recall that a clique in a graph G is a subset of vertices S ⊆ V(G) in which every two vertices are adjacent. The clique number ω(G) of a graph G is defined as the largest cardinality of a clique in G. An important computational problem is estimating this parameter. No efficient (or even subexponential-time) algorithm is known for this problem. The discussion below explains why it is unlikely that such an algorithm exists.

Among the great achievements of theoretical computer science in the 1970's and 80's was the discovery of numerous natural decision problems that are NP-complete. The determination of the clique number is among these problems. We are given a graph G and an integer k and are asked to determine whether ω(G) ≥ k. The fact that this problem is NP-complete means that if there is a polynomial-time algorithm to solve the clique problem, then P = NP; that is, every problem in NP has a polynomial-time algorithm. That conclusion is considered extremely unlikely, even though we seem far from proving it. One empirical reason for the belief that P ≠ NP is that the class NP is extremely rich. Literally thousands of important problems from many diverse branches of science and technology are known to be NP-complete. These problems have been attacked (independently) for decades by many scientists and engineers because of their practical importance, and no efficient algorithm for any of them has been found.

Assuming these problems, most of which are about finding the optimal solution to a given problem, are all hard, a natural relaxation is to seek an approximate solution. How hard are these problems? For a long time only a few hardness results for approximation were known. A breakthrough in the study of this fundamental problem was made with the proof of the PCP Theorem in [AS98, ALM98]. The connection between such theorems and hardness of approximation was established in [FGL91].
We are unable to go into this fascinating subject at any length and refer the reader to surveys by Sudan [Sud04] and by Arora-Lund [AL96].

As mentioned above, in a classical paper, Karp [Kar72] showed that it is NP-hard to determine exactly the clique number of a given graph. An early triumph of the PCP theory was the proof that it is NP-hard even to approximate the clique number to within any constant factor.²

Theorem 3.12 (Feige-Goldwasser-Lovász-Safra-Szegedy [FGL91]). There are two constants 1 > δ1 > δ2 > 0 such that it is NP-hard to decide for a given n-vertex graph G whether ω(G) ≤ δ2n or ω(G) ≥ δ1n.

In this section we will show that even obtaining a very rough approximation for ω(G) is NP-hard. That is, we will show that it is NP-hard to approximate ω(G) even within a factor of n^ε for some fixed ε > 0. Specifically, we show the following theorem.

Theorem 3.13. There exists a constant ε > 0 with the following property. If there exists a polynomial-time algorithm A whose output on every n-vertex graph G satisfies n^{−ε} ≤ A(G)/ω(G) ≤ n^ε, then NP = P.

This theorem was proven in [ALM98]. Our proof will follow [AFWZ95]. It assumes an algorithm A as in the theorem and creates from it an efficient algorithm B for the problem of approximating the clique number to within a constant factor.

² In fact, the simplest form of the PCP Theorem is almost equivalent to this statement.
EXPANDER GRAPHS AND THEIR APPLICATIONS 467

Such a conversion is called a reduction. If the resulting algorithm B is deterministic, then we can use Theorem 3.12 to conclude that the existence of algorithm A implies P = NP.

We will first present a probabilistic reduction (due to Berman and Schnitger [BS92]) between the two problems, namely an algorithm B which is probabilistic polynomial time. Note that with this weaker reduction, the assumption of Theorem 3.12 only implies a probabilistic polynomial-time algorithm for all problems in NP (or RP = NP in complexity-theoretic lingo). After describing this probabilistic reduction, we will show how to eliminate the randomness from the algorithm B, resulting in a deterministic algorithm B′. For this we will again employ walks on expander graphs. This will prove the conclusion P = NP and thus the fact that approximating the clique number to within n^ε is NP-hard.

We should note that a much stronger hardness result is known about approximating the clique number, due to Håstad [Hås99] (whose proof requires much more advanced techniques). He showed that efficiently approximating ω(G) to within n^{1−δ}, for any δ > 0, implies that NP = RP (via a probabilistic reduction). This was very recently derandomized by Zuckerman [Zuc05], again via more difficult techniques than mere expander walks, to yield that even such an approximation is actually NP-hard. Note that since a factor-n approximation is trivial, this result is surprisingly tight.

The probabilistic reduction. The randomized reduction of [BS92] yields a version of Theorem 3.13 where the same assumptions lead to a weaker conclusion:

Lemma 3.14 (Theorem 3.13, weak version). If there exists a polynomial-time algorithm A whose output on every n-vertex graph G satisfies n^{−ε} ≤ A(G)/ω(G) ≤ n^ε, then NP ⊆ RP. Here ε > 0 is some absolute constant.
The proof follows by converting algorithm A into a probabilistic polynomial-time algorithm B that can solve the decision problem considered in Theorem 3.12. This shows that NP ⊆ RP.³

In order to apply algorithm B to a given n-vertex graph G = (V, E), consider a graph H with vertex set V^t, where t = log n. The vertices (v1, ..., vt) and (u1, ..., ut) of H are adjacent if the subgraph of G induced by the set {v1, ..., vt} ∪ {u1, ..., ut} is a clique. Whether ω(G) is below δ2n or above δ1n, this gap is significantly amplified in H. The amplification is so strong that a random subset of m = poly(n) vertices of H tends to behave very differently with respect to the clique number of the induced graph.

Here is what algorithm B does on input G = (V, E):

(1) Pick m random vertices from V^t and compute the subgraph H′ of H induced on this set of vertices.
(2) Apply algorithm A to H′.
(3) Algorithm B returns 1 if A(H′) > (1/2)δ1^t m, and otherwise it returns 0.

We need the following simple combinatorial observation.

Claim 3.14.1. Every clique in H is contained in a clique of the form S^t where S is an inclusion-maximal clique in G. In particular, ω(H) = ω(G)^t.

³ This in fact shows only NP ⊆ BPP, but it is a standard fact (using the self-reducibility of NP-complete problems) that the inclusion NP ⊆ BPP actually implies the stronger conclusion NP ⊆ RP.
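Claim 3.14.1 can be checked by brute force on a toy graph. The sketch below uses our own helper names, runs in exponential time, and is meant only for very small instances:

```python
from itertools import combinations, product

def is_clique(adj, S):
    """adj[u] = set of neighbors of u; singletons vacuously count."""
    return all(v in adj[u] for u, v in combinations(S, 2))

def clique_number(adj):
    """Largest k such that some k-subset of vertices is a clique."""
    n = len(adj)
    return max(k for k in range(1, n + 1)
               if any(is_clique(adj, S) for S in combinations(adj, k)))

def tuple_graph(adj, t):
    """The graph H on V^t: two t-tuples are adjacent iff the union of
    their entries induces a clique in G."""
    Vt = list(product(adj, repeat=t))
    H = {u: set() for u in Vt}
    for u, w in combinations(Vt, 2):
        if is_clique(adj, set(u) | set(w)):
            H[u].add(w)
            H[w].add(u)
    return H
```

On a triangle plus an isolated vertex, for instance, ω(G) = 3 and the construction with t = 2 gives ω(H) = 9, matching ω(H) = ω(G)^t.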
468 SHLOMO HOORY, NATHAN LINIAL, AND AVI WIGDERSON

Proof. Clearly if S is a clique in G, then the set S^t is a clique in H, and in particular ω(H) ≥ ω(G)^t. On the other hand, consider some clique S′ in H. Let S ⊆ V(G) be the set of those vertices of G that appear as an entry in any of the t-tuples in S′. Clearly S forms a clique in G, and |S′| ≤ |S|^t, whence also ω(H) ≤ ω(G)^t. □

We need to show two things:

(1) If ω(G) ≥ δ1n, then almost surely ω(H′) ≥ (1/2)δ1^t m.
(2) If ω(G) ≤ δ2n, then almost surely ω(H′) ≤ 2δ2^t m.

With a proper choice of m = poly(n) and t = log n this proves the lemma.

For the first claim, consider a clique Q in H of size ω(H) = ω(G)^t ≥ (δ1n)^t. The expected number of vertices from Q in H′ is |Q| · |V(H′)|/|V(H)| ≥ δ1^t m. By the Chernoff bound,⁴ with high probability this intersection is at least (1/2)δ1^t m, as stated.

For the other claim we need to show that it is very unlikely that the m vertices we sample from H include a large clique. For this analysis it suffices to consider subsets of inclusion-maximal cliques Q in H, of which, by Claim 3.14.1, there are at most 2^n. The cardinality of Q does not exceed (δ2n)^t, and so we expect to sample |Q| · m/n^t ≤ δ2^t m vertices from Q. We consider it a failure if we sample more than 2δ2^t m vertices from Q. Again by Chernoff's bound, the failure probability does not exceed exp(−Ω(mδ2^t)). As mentioned, there are at most 2^n inclusion-maximal cliques in H, and so the total probability of failure is still o(1), provided that mδ2^t ≫ n. This can be guaranteed by a proper choice of m = poly(n).

The deterministic reduction. Again we assume there exists a polynomial-time algorithm A distinguishing between the two cases of Theorem 3.13. This time we use A to derive a deterministic polynomial-time algorithm B′ that can distinguish between the two cases of Theorem 3.12, whence NP = P.

The only difference between algorithm B′ and algorithm B is that algorithm B′ uses a derandomized sampling to construct the graph H′.
This is done as follows. Choose some (n, d, α)-expander 𝒢 on the same vertex set as G. In order to select the vertices of H′, we no longer take a random sample of t-tuples from V(G)^t. Rather, we consider all t-tuples representing a length-(t − 1) walk in the graph 𝒢. The resulting graph H′ has m = n·d^{t−1} vertices. Since d is fixed and t = Θ(log n), it follows that m is polynomial in n. We are already familiar with the idea that length-(t − 1) random walks on 𝒢 should behave like random t-tuples in V^t. We need to establish this principle in the present context as well.

Claim 3.15. If ω(G) ≤ δ2n, then ω(H′) ≤ (δ2 + 2α)^t m.

Proof. As before, a clique in H′ corresponds to all length-(t − 1) walks in 𝒢 that are confined to some clique in G. Consequently, ω(H′)/m is the largest probability that such a walk is confined to some clique S in G (i.e., the maximum of this probability over all choices of a clique S). By assumption |S| ≤ ω(G) ≤ δ2n, and our claim follows now by Theorem 3.9. □

The complementary statement that we need is:

Claim 3.16. If ω(G) ≥ δ1n, then ω(H′) ≥ (δ1 − 2α)^t m.

⁴ This is an upper bound on the tail of a binomial distribution. Let Z be Binomial(N, p), i.e. the sum of N independent 0-1 random variables, each of which is 1 with probability p. Then Pr[|Z − E[Z]| ≥ ∆ · E[Z]] ≤ 2 exp(−Np∆²/3), for all 0 < ∆ < 1.
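The derandomized sampling above can be sketched as follows. This is a toy illustration under our own naming, with a complete graph standing in for an explicit (n, d, α)-expander; the vertices of H′ are exactly the n·d^{t−1} walks of length t − 1:

```python
from itertools import combinations

def walks(expander, t):
    """All t-tuples that are length-(t-1) walks in the auxiliary expander:
    n * d^(t-1) tuples, versus n^t arbitrary tuples."""
    W = [(v,) for v in expander]
    for _ in range(t - 1):
        W = [w + (u,) for w in W for u in expander[w[-1]]]
    return W

def derandomized_H(G, expander, t):
    """Build H' on the walk tuples; adjacency is inherited from H:
    two tuples are adjacent iff their entries jointly induce a clique in G."""
    def is_clique(S):
        return all(v in G[u] for u, v in combinations(S, 2))
    Vt = walks(expander, t)
    H = {u: set() for u in Vt}
    for u, w in combinations(Vt, 2):
        if is_clique(set(u) | set(w)):
            H[u].add(w)
            H[w].add(u)
    return H
```

With G a triangle plus an isolated vertex and K4 as the (toy) expander, H′ has 4·3 = 12 vertices for t = 2, and the six walks confined to the triangle form a clique, as in the proof of Claim 3.15.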