《理论计算机科学》课程教学资源（阅读文献）Analysis Of Boolean Functions（Ryan O’Donnell）.pdf_P31-P35

36 1.Boolean functions and the Fourier expansion (b)Show that this representation is unique.(Hint:If g as in(1.11)has at least one nonzero coefficient,consider g(a)where a e(0,1)n is the indicator vector of a minimal S with cs #0.) (c)Show that all coefficients cs in the representation(1.11)will be inte- gers in the range [-2",2"]. (d)Let f:(False,True)-(False,True).Let p(x)be f's multilinear rep- resentation when False,True are 1,-1E R(i.e.,p is the Fourier ex- pansion of f)and let g(x)be f's multilinear representation when False,True are 0,1E R.Show that q(x)=-p(1-2x1,...,1-2xn). 1.10 Let f:[-1,1)"-R be not identically 0.The (real)degree of f,denoted deg(f),is defined to be the degree of its multilinear(Fourier)expansion; i.e.,max(lSI:fS)≠0. (a)Show that deg(f)=deg(a+bf)for any a,b R(assuming b#0,a+ bf≠0). (b)Show that deg(f)sk if and only if f is a real linear combination of functions g1,...,gs,each of which depends on at most k input coordi- nates. (c)Which functions in Exercise 1.1 have "nontrivial"degree?(Here f: {-1,1)n-R has "nontrivial"degree if deg(f)<n.) 1.11 Suppose that f (-1,1)-(-1,1)has deg(f)=k1. (a)Show that f's real multilinear representation over (0,1}(see Exer- cise 1.9),call it g(x),also has deg(g)=k. (b)Using Exercise 1.9(c),(d),deduce that f's Fourier spectrum is"21- granular",meaning each f(S)is an integer multiple of 21-k (c)Show that∑ssinlf(S)≤2k-1. 1.12 A Hadamard Matrix is any Nx N real matrix with +1 entries and orthog- onal rows.Particular examples are the Walsh-Hadamard Matrices HN, inductively defined for N=2as follows:H1=1H H2n -H2m (a)Let's index the rows and columns of H2 by the integers (0,1,2,...,2"- 1)rather than [2"].Further,let's identify such an integer i with its binary expansion (io,i1,...,in-1)EF2,where io is the least significant bit and in-1 the most.For example,if n=3,we identify the index i=6 with (0,1,1).Now show that the (y,x)entry of H2m is (-1)r'* (b)Show that if f:F2-R is represented as a column vector in R2"(ac- cording to the indexing scheme from part (a))then 2-"H2f=f.Here we think of f as also being a function 2-R,identifying subsets Ss(0,1,...,n-1)with their indicator vectors. (c)Show how to compute H2f using just n2"additions and subtractions (rather than 22m additions and subtractions as the usual matrix-vector multiplication algorithm would require).This computation is called Copyright@Ryan O'Donnell,2014

36 1. Boolean functions and the Fourier expansion (b) Show that this representation is unique. (Hint: If q as in (1.11) has at least one nonzero coefficient, consider q(a) where a ∈ {0,1} n is the indicator vector of a minimal S with cS 6= 0.) (c) Show that all coefficients cS in the representation (1.11) will be integers in the range [−2 n ,2 n ]. (d) Let f : {False,True} n → {False,True}. Let p(x) be f ’s multilinear representation when False,True are 1,−1 ∈ ❘ (i.e., p is the Fourier expansion of f ) and let q(x) be f ’s multilinear representation when False,True are 0,1 ∈ ❘. Show that q(x) = 1 2 − 1 2 p(1−2x1,...,1−2xn). 1.10 Let f : {−1,1} n → ❘ be not identically 0. The (real) degree of f , denoted deg(f ), is defined to be the degree of its multilinear (Fourier) expansion; i.e., max{|S| : fb(S) 6= 0}. (a) Show that deg(f ) = deg(a + b f ) for any a,b ∈ ❘ (assuming b 6= 0, a + b f 6= 0). (b) Show that deg(f ) ≤ k if and only if f is a real linear combination of functions g1,..., gs, each of which depends on at most k input coordinates. (c) Which functions in Exercise 1.1 have “nontrivial” degree? (Here f : {−1,1} n → ❘ has “nontrivial” degree if deg(f ) < n.) 1.11 Suppose that f : {−1,1} n → {−1,1} has deg(f ) = k ≥ 1. (a) Show that f ’s real multilinear representation over {0,1} (see Exercise 1.9), call it q(x), also has deg(q) = k. (b) Using Exercise 1.9(c),(d), deduce that f ’s Fourier spectrum is “21−k - granular”, meaning each fb(S) is an integer multiple of 21−k . (c) Show that P S⊆[n] |fb(S)| ≤ 2 k−1 . 1.12 A Hadamard Matrix is any N × N real matrix with ±1 entries and orthogonal rows. Particular examples are the Walsh–Hadamard Matrices HN , inductively defined for N = 2 n as follows: H1 = £ 1 ¤ , H2 n+1 = · H2 n H2 n H2 n −H2 n ¸ . (a) Let’s index the rows and columns of H2 n by the integers {0,1,2,...,2 n− 1} rather than [2n ]. Further, let’s identify such an integer i with its binary expansion (i0,i1,...,in−1) ∈ ❋n 2 , where i0 is the least significant bit and in−1 the most. For example, if n = 3, we identify the index i = 6 with (0,1,1). Now show that the (γ, x) entry of H2 n is (−1)γ·x . (b) Show that if f : ❋n 2 → ❘ is represented as a column vector in ❘2 n (according to the indexing scheme from part (a)) then 2−nH2 n f = fb. Here we think of fb as also being a function ❋n 2 → ❘, identifying subsets S ⊆ {0,1,...,n−1} with their indicator vectors. (c) Show how to compute H2 n f using just n2 n additions and subtractions (rather than 22n additions and subtractions as the usual matrix-vector multiplication algorithm would require). This computation is called Copyright © Ryan O’Donnell, 2014

1.7. Exercises and notes 37 the Fast Walsh–Hadamard Transform and is the method of choice for computing the Fourier expansion of a generic function f : ❋n 2 → ❘ when n is large. (d) Show that taking the Fourier transform is essentially an “involution”: f bb= 2 −n f (using the notations from part (b)). 1.13 Let f : {−1,1} n → ❘ and let 0 < p ≤ q < ∞. Show that kf kp ≤ kf kq. (Hint: Use Jensen’s inequality with the convex function t 7→ t q/p .) Extend the inequality to the case q = ∞, where kf k∞ is defined to be maxx∈{−1,1} n {|f (x)|}. 1.14 Compute the mean and variance of each function from Exercise 1.1. 1.15 Let f : {−1,1} n → ❘. Let K ⊆ [n] and let z ∈ {−1,1} K . Suppose g : {−1,1} [n]\K → ❘ is the subfunction of f gotten by restricting the K-coordinates to be z. Show that E[g] = P T⊆K fb(T) z T . 1.16 If f : {−1,1} n → {−1,1}, show that Var[f ] = 4·dist(f ,1)·dist(f ,−1). Deduce Proposition 1.15. 1.17 Extend Fact 1.14 by proving the following: If F is a {−1,1}-valued random variable with mean µ then Var[F] = E[(F −µ) 2 ] = 1 2 E[(F − F 0 ) 2 ] = 2Pr[F 6= F 0 ] = E[|F −µ|], where F 0 is an independent copy of F. 1.18 For any f : {−1,1} n → ❘, show that 〈f =k , f =` 〉 = ( Wk [f ] if k = `, 0 if k 6= `. 1.19 Let f : {−1,1} n → {−1,1}. (a) Suppose W1 [f ] = 1. Show that f (x) = ±χS for some |S| = 1. (b) Suppose W≤1 [f ] = 1. Show that f depends on at most 1 input coordinate. (c) Suppose W≤2 [f ] = 1. Must f depend on at most 2 input coordinates? At most 3 input coordinates? What if we assume W2 [f ] = 1? 1.20 Let f : {−1,1} n → ❘ satisfy f = f =1 . Show that Var[f 2 ] = 2 P i6=j fb(i) 2 fb(j) 2 . 1.21 Prove that there are no functions f : {−1,1} n → {−1,1} with exactly 2 nonzero Fourier coefficients. What about exactly 3 nonzero Fourier coeffi- cients? 1.22 Verify Propositions 1.25 and 1.26. 1.23 In this exercise you will prove some basic facts about “distances” between probability distributions. Let ϕ and ψ be probability densities on ❋n 2 . (a) Show that the total variation distance between ϕ and ψ, defined by dTV(ϕ,ψ) = max A⊆❋n 2 n¯ ¯ ¯Pr y∼ϕ [y ∈ A]− Pr y∼ψ [y ∈ A] ¯ ¯ ¯ o , Copyright © Ryan O’Donnell, 2014

38 1. Boolean functions and the Fourier expansion is equal to 1 2 kϕ−ψk1. (b) Show that the collision probability of ϕ, defined to be Pr y,y 0∼ϕ independently [y = y 0 ], is equal to kϕk 2 2 /2n . (c) The χ 2 -distance of ϕ from ψ is defined by dχ 2 (ϕ,ψ) = E y∼ψ h³ ϕ(y) ψ(y) −1 ´2 i , assuming ψ has full support. Show that the χ 2 -distance of ϕ from uniform is equal to Var[ϕ]. (d) Show that the total variation distance of ϕ from uniform is at most 1 2 p Var[ϕ]. 1.24 Let A ⊆ {−1,1} n have “volume” δ, meaning E[1A] = δ. Suppose ϕ is a probability density supported on A, meaning ϕ(x) = 0 when x ∉ A. Show that kϕk 2 2 ≥ 1/δ with equality if ϕ = ϕA, the uniform density on A. 1.25 Show directly from the definition that the convolution operator is associative and commutative. 1.26 Verify that (1) ⇐⇒ (2) in Definition 1.28. 1.27 Suppose an algorithm is given query access to a linear function f : ❋n 2 → ❋2 and its task is to determine which linear function f is. Show that querying f on n inputs is necessary and sufficient. 1.28 (a) Generalize Exercise 1.5 as follows: Let f : ❋n 2 → {−1,1} and suppose that dist(f ,χS∗ ) = δ. Show that |fb(S)| ≤ 2δ for all S 6= S ∗ . (Hint: Use the union bound.) (b) Deduce that the BLR Test rejects f with probability at least 3δ − 10δ 2 +8δ 3 . (c) Show that this lower bound cannot be improved to cδ−O(δ 2 ) for any c > 3. 1.29 (a) We call f : ❋n 2 → ❋2 an affine function if f (x) = a· x+b for some a ∈ ❋n 2 , b ∈ ❋2. Show that f is affine if and only if f (x)+ f (y)+ f (z) = f (x+ y+z) for all x, y, z,∈ ❋n 2 (b) Let f : ❋n 2 → ❘. Suppose we choose x, y, z ∼ ❋n 2 independently and uniformly. Show that E[f (x)f (y)f (z)f (x+ y+ z)] = P S fb(S) 4 . (c) Give a 4-query test for a function f : ❋n 2 → ❋2 with the following property: if the test accepts with probability 1−² then f is ²-close to being affine. All four query inputs should have the uniform distribution on ❋n 2 (but of course need not be independent). (d) Give an alternate 4-query test for being affine in which three of the query inputs are uniformly distributed and the fourth is not random. Copyright © Ryan O’Donnell, 2014

1.7.Exercises and notes 39 (Hint:Show that f is affine if and only if f(x)+f(y)+f(0)=f(x+y) for all x,yF2.) 1.30 Permutations teSn act on strings xe[-1,1)7 in the natural way:(x")i= x(i).They also act on functions f (-1,1)"-R via f"(x)=f(x")for all xe (-1,1)".We say that functions g,h:(-1,1)7-(-1,1)are (permutation- )isomorphic if g=h"for some ne Sn.We call Aut(f)=(ne Sn:f"=f} the(permutation-)automorphism group of f. (a)Show that fi(S)=f(n(S))for all Ss[n]. For future reference,when we write(f(S))sI=k,we mean the sequence of degree-k Fourier coefficients of f,listed in lexicographic order of the k-sets S. Given complete truth tables of some g and h we might wish to deter- mine whether they are isomorphic.One way to do this would be to define a canonical form can(f):{-1,1)"-(-1,1}for each f:(-1,1)"-(-1,1), meaning that:(i)can(f)is isomorphic to f;(ii)ifg is isomorphic to h then can(g)=can(h).Then we can determine whether g is isomorphic to h by checking whether can(g)=can(h).Here is one possible way to define a canonical form for f: 1.Set Po=Sn. 2.For each k=1,2,3,...,n, 3.Define Pr to be the set of all eP-1 that make the sequence (f(S))s=k maximal in lexicographic order on R(). 4.Let can(f)=f"for (any)ne Pn. (6)Show that this is well-defined,meaning that can(f)is the same func- tion for any choice of E Pn. (c)Show that can(f)is indeed a canonical form;i.e.,it satisfies (i)and (ii) above. (d)Show that if f((1)),...,f((n))are distinct numbers then can(f)can be computed in 0(2")time. (e)We could more generally consider g,h:{-1,1)"-(-1,1)to be isomor- phic if g(x)=h(±xrl,,+xx(n)）for some permutationπon[n]and some choice of signs.Extend the results of this exercise to handle this definition. Notes.The Fourier expansion for real-valued Boolean functions dates back to Walsh [Wal23]who introduced a complete orthonormal basis for L2([0,1]) consisting of+1-valued functions,constant on dyadic intervals.Using the or- dering introduced by Paley [Pal32],the nth Walsh basis function wn:[0,1]- (-1,1)is defined by wn(x)=ori(x)",where n=on2 and ri(x)(the ith Rademacher function at is defined to be(1)with for non-dyadic xe[0,1].Walsh's interest was in comparing and contrasting Copyright Ryan O'Donnell,2014

1.7. Exercises and notes 39 (Hint: Show that f is affine if and only if f (x)+ f (y)+ f (0) = f (x + y) for all x, y ∈ ❋n 2 .) 1.30 Permutations π ∈ Sn act on strings x ∈ {−1,1} n in the natural way: (x π )i = xπ(i) . They also act on functions f : {−1,1} n → ❘ via f π (x) = f (x π ) for all x ∈ {−1,1} n . We say that functions g, h : {−1,1} n → {−1,1} are (permutation- )isomorphic if g = h π for some π ∈ Sn. We call Aut(f ) = {π ∈ Sn : f π = f } the (permutation-)automorphism group of f . (a) Show that cf π(S) = fb(π −1 (S)) for all S ⊆ [n]. For future reference, when we write (fb(S))|S|=k, we mean the sequence of degree-k Fourier coefficients of f , listed in lexicographic order of the k-sets S. Given complete truth tables of some g and h we might wish to determine whether they are isomorphic. One way to do this would be to define a canonical form can(f ) : {−1,1} n → {−1,1} for each f : {−1,1} n → {−1,1}, meaning that: (i) can(f ) is isomorphic to f ; (ii) if g is isomorphic to h then can(g) = can(h). Then we can determine whether g is isomorphic to h by checking whether can(g) = can(h). Here is one possible way to define a canonical form for f : 1. Set P0 = Sn. 2. For each k = 1,2,3,...,n, 3. Define Pk to be the set of all π ∈ Pk−1 that make the sequence (cf π(S))|S|=k maximal in lexicographic order on ❘( n k ) . 4. Let can(f ) = f π for (any) π ∈ Pn. (b) Show that this is well-defined, meaning that can(f ) is the same function for any choice of π ∈ Pn. (c) Show that can(f ) is indeed a canonical form; i.e., it satisfies (i) and (ii) above. (d) Show that if fb({1}),..., fb({n}) are distinct numbers then can(f ) can be computed in Oe(2n ) time. (e) We could more generally consider g,h : {−1,1} n → {−1,1} to be isomorphic if g(x) = h(±xπ(1),...,±xπ(n) ) for some permutation π on [n] and some choice of signs. Extend the results of this exercise to handle this definition. Notes. The Fourier expansion for real-valued Boolean functions dates back to Walsh [Wal23] who introduced a complete orthonormal basis for L 2 ([0,1]) consisting of ±1-valued functions, constant on dyadic intervals. Using the ordering introduced by Paley [Pal32], the nth Walsh basis function wn : [0,1] → {−1,1} is defined by wn(x) = Q∞ i=0 ri(x) ni , where n = P∞ i=0 ni2 i and ri(x) (the “ith Rademacher function at x”) is defined to be (−1)xi , with x = P∞ i=0 xi2 −(i+1) for non-dyadic x ∈ [0,1]. Walsh’s interest was in comparing and contrasting Copyright © Ryan O’Donnell, 2014

40 1.Boolean functions and the Fourier expansion the properties of this basis with the usual basis of trigonometric polynomials and also Haar's basis [Haal0l. The first major study of the Walsh functions came in the remarkable paper of Paley [Pal32],which included strong results on the LP-norms of truncations of Walsh series.Sadly,Paley died in an avalanche one year later (at age 26) while skiing near Banff.The next major development in the study of Walsh series was conceptual,with Vilenkin [Vil47]and Fine [Fin49]independently suggesting the more natural viewpoint of the Walsh functions as characters of the discrete group Z2.There was significant subsequent work in the 1950s and 1960s,but it's somewhat unnatural from our point of view because it relies fundamentally on ordering the Rademacher and Walsh functions ac- cording to binary expansions.Bonami [Bon68]and Kiener [Kie69]seem to have been the first authors to take our viewpoint,treating bits x1,x2,x3,... symmetrically and ordering Fourier characters rs according to ISI rather than max(S).Bonami also obtained the first hypercontractivity result for the Boolean cube.This proved to be a crucial tool for analysis of Boolean func- tions;see Chapter 9.For an early survey on Walsh series,see Balashov and Rubinshtein [BR73]. Turning to Boolean functions and computer science,the idea of using Boolean logic to study "switching functions"(as engineers originally called Boolean functions)dates to the late 1930s and is usually credited to Naka- shima [Nak35],Shannon [Sha37],and Shestakov [She38].Muller [Mul54b] seems to be the first to have used Fourier coefficients in the study of Boolean functions;he mentions computing them while classifying all functions f: (0,1)4-(0,1)up to certain equivalences.The first publication devoted to Boolean Fourier coefficients was by Ninomiya [Nin58],who expanded on Muller's use of Fourier coefficients for the classification of Boolean functions up to various isomorphisms.Golomb [Gol59]independently pursued the same project (his work is the content of Exercise 1.30);he was also the first to recognize the connection to Walsh series.The use of"Fourier-Walsh analysis" in the study of Boolean functions quickly became well known in the early 1960s.Several symposia on applications of Walsh functions took place in the early 1970s,with Lechner's 1971 monograph [Lec71]and Karpovsky's 1976 book [Kar76]becoming the standard references.However,the use of Boolean analysis in theoretical computer science seemed to wane until 1988,when the outstanding work of Kahn,Kalai,and Linial [KKL88]ushered in a new area of sophistication. The original analysis by Blum,Luby,and Rubinfeld [BLR9O]for their linearity test was combinatorial;our proof of Theorem 1.30 is the elegant ana- lytic one due to by Bellare,Coppersmith,Hastad,Kiwi,and Sudan [BCH+96]. In fact,the essence of this analysis appears already in the 1953 work of Copyright Ryan O'Donnell,2014

40 1. Boolean functions and the Fourier expansion the properties of this basis with the usual basis of trigonometric polynomials and also Haar’s basis [Haa10]. The first major study of the Walsh functions came in the remarkable paper of Paley [Pal32], which included strong results on the L p -norms of truncations of Walsh series. Sadly, Paley died in an avalanche one year later (at age 26) while skiing near Banff. The next major development in the study of Walsh series was conceptual, with Vilenkin [Vil47] and Fine [Fin49] independently suggesting the more natural viewpoint of the Walsh functions as characters of the discrete group ❩ n 2 . There was significant subsequent work in the 1950s and 1960s, but it’s somewhat unnatural from our point of view because it relies fundamentally on ordering the Rademacher and Walsh functions according to binary expansions. Bonami [Bon68] and Kiener [Kie69] seem to have been the first authors to take our viewpoint, treating bits x1, x2, x3,... symmetrically and ordering Fourier characters χS according to |S| rather than max(S). Bonami also obtained the first hypercontractivity result for the Boolean cube. This proved to be a crucial tool for analysis of Boolean functions; see Chapter 9. For an early survey on Walsh series, see Balashov and Rubinshtein [BR73]. Turning to Boolean functions and computer science, the idea of using Boolean logic to study “switching functions” (as engineers originally called Boolean functions) dates to the late 1930s and is usually credited to Nakashima [Nak35], Shannon [Sha37], and Shestakov [She38]. Muller [Mul54b] seems to be the first to have used Fourier coefficients in the study of Boolean functions; he mentions computing them while classifying all functions f : {0,1} 4 → {0,1} up to certain equivalences. The first publication devoted to Boolean Fourier coefficients was by Ninomiya [Nin58], who expanded on Muller’s use of Fourier coefficients for the classification of Boolean functions up to various isomorphisms. Golomb [Gol59] independently pursued the same project (his work is the content of Exercise 1.30); he was also the first to recognize the connection to Walsh series. The use of “Fourier–Walsh analysis” in the study of Boolean functions quickly became well known in the early 1960s. Several symposia on applications of Walsh functions took place in the early 1970s, with Lechner’s 1971 monograph [Lec71] and Karpovsky’s 1976 book [Kar76] becoming the standard references. However, the use of Boolean analysis in theoretical computer science seemed to wane until 1988, when the outstanding work of Kahn, Kalai, and Linial [KKL88] ushered in a new area of sophistication. The original analysis by Blum, Luby, and Rubinfeld [BLR90] for their linearity test was combinatorial; our proof of Theorem 1.30 is the elegant analytic one due to by Bellare, Coppersmith, Håstad, Kiwi, and Sudan [BCH+96]. In fact, the essence of this analysis appears already in the 1953 work of Copyright © Ryan O’Donnell, 2014