1.2.The "Fourier expansion":functions as multilinear polynomials 21 f=max2,the“maximum”function on2bits: max2(+1,+1)=+1, max2(-1,+1)=+1, max2(+1,-1)=+1, max2(-1,-1)=-1. Then max2 can be expressed as a multilinear polynomial, max2(x1,x2)=+x1+5x2-x12; (1.1) this is the "Fourier expansion"of max2.As another example,consider the majority function on 3 bits,Majs:(-1,1)3-(-1,1),which outputs the +1 bit occurring more frequently in its input.Then it's easy to verify the Fourier expansion Maj3(x1,x2,x3)=5x1+x2+5x3-5x1x2xg: (1.2) The functions max2 and Majs will serve as running examples in this chapter. Let's see how to obtain such multilinear polynomial representations in general.Given an arbitrary Boolean function f:(-1,1)"-(-1,1)there is a familiar method for finding a polynomial that interpolates the 2"values that f assigns to the points (-1,1)"cR".For each point a=(a1,...,an)E(-1,1) the indicator polynomial 1a6)=(*2+22(25) takes value 1 when x=a and value 0 when xe(-1,1)\(a).Thus f has the polynomial representation fx)=∑f(a)la(x. ae{-1,1m Illustrating with the f=max2 example again,we have max2(x) =(+受 +学到受 (1.3) +( (-1受罗)=壹+1+2-12 Let us make two remarks about this interpolation procedure.First,it works equally well in the more general case of real-valued Boolean functions,f: {-1,1)2-R.Second,since the indicator polynomials are multilinear when expanded out,the interpolation always produces a multilinear polynomial. Indeed,it makes sense that we can represent functions f:{-1,1)n-R with multilinear polynomials:since we only care about inputs x where xi=+1,any factor ofx can be replaced by 1. Copyright@Ryan O'Donnell,2014
1.2. The “Fourier expansion”: functions as multilinear polynomials 21 f = max2, the “maximum” function on 2 bits: max2(+1,+1) = +1, max2(−1,+1) = +1, max2(+1,−1) = +1, max2(−1,−1) = −1. Then max2 can be expressed as a multilinear polynomial, max2(x1, x2) = 1 2 + 1 2 x1 + 1 2 x2 − 1 2 x1x2; (1.1) this is the “Fourier expansion” of max2. As another example, consider the majority function on 3 bits, Maj3 : {−1,1} 3 → {−1,1}, which outputs the ±1 bit occurring more frequently in its input. Then it’s easy to verify the Fourier expansion Maj3 (x1, x2, x3) = 1 2 x1 + 1 2 x2 + 1 2 x3 − 1 2 x1x2x3. (1.2) The functions max2 and Maj3 will serve as running examples in this chapter. Let’s see how to obtain such multilinear polynomial representations in general. Given an arbitrary Boolean function f : {−1,1} n → {−1,1} there is a familiar method for finding a polynomial that interpolates the 2n values that f assigns to the points {−1,1} n ⊂ ❘n . For each point a = (a1,...,an) ∈ {−1,1} n the indicator polynomial 1{a} (x) = ³ 1+a1 x1 2 ´ ³ 1+a2 x2 2 ´ ··· ³ 1+an xn 2 ´ takes value 1 when x = a and value 0 when x ∈ {−1,1} n \ {a}. Thus f has the polynomial representation f (x) = X a∈{−1,1} n f (a)1{a} (x). Illustrating with the f = max2 example again, we have max2(x) = (+1) ³ 1+x1 2 ´ ³ 1+x2 2 ´ + (+1) ³ 1−x1 2 ´ ³ 1+x2 2 ´ (1.3) + (+1) ³ 1+x1 2 ´ ³ 1−x2 2 ´ + (−1) ³ 1−x1 2 ´ ³ 1−x2 2 ´ = 1 2 + 1 2 x1 + 1 2 x2 − 1 2 x1x2. Let us make two remarks about this interpolation procedure. First, it works equally well in the more general case of real-valued Boolean functions, f : {−1,1} n → ❘. Second, since the indicator polynomials are multilinear when expanded out, the interpolation always produces a multilinear polynomial. Indeed, it makes sense that we can represent functions f : {−1,1} n → ❘ with multilinear polynomials: since we only care about inputs x where xi = ±1, any factor of x 2 i can be replaced by 1. Copyright © Ryan O’Donnell, 2014
22 1.Boolean functions and the Fourier expansion We have illustrated that every f:{-1,1)"-R can be represented by a real multilinear polynomial;as we will see in Section 1.3,this representation is unique.The multilinear polynomial for f may have up to 2"terms,corre- sponding to the subsets S[n].We write the monomial corresponding to S as xs=Πx (with x=1 by convention), ieS and we use the following notation for its coefficient: f(S)=coefficient on monomial x in the multilinear representation of f. This discussion is summarized by the Fourier expansion theorem: Theorem 1.1.Every function f:{-1,1yn-R.can be uniquely expressed as a multilinear polynomial, fx)=∑fS)xs (1.4) SsIn] This expression is called the Fourier expansion of f,and the real number f(S) is called the Fourier coefficient of f on S.Collectively,the coefficients are called the Fourier spectrum of f. As examples,from(1.1)and(1.2)we obtain: max2(o)=,max2(《1》=,max2({2)=,max2(《1,2)=-i Majs((1)),Majs((2)),Maja((3))=,Maja((1,2,3))=-,Majs(S)=0 else. We finish this section with some notation.It is convenient to think of the monomial x5 as a function on x=(x1,...,)E R";we write it as Xs()=Πx ieS Thus we sometimes write the Fourier expansion of f:(-1,1)-R as f(x)=∑fS)xs(x). Ss[n] So far our notation makes sense only when representing the Hamming cube by (-1,1)"R".The other frequent representation we will use for the cube is F2.We can define the Fourier expansion for functions f:F2-R by "encoding"input bits 0,1E F2 by the real numbers-1,1E R.We choose the encoding x:F2-R defined by X(0)=+1,X(1)=-1. This encoding is not so natural from the perspective of Boolean logic;e.g.,it means the function max2 we have discussed represents logical AND.But it's mathematically natural because for beF2 we have the formula r(b)=(-1)5 We now extend the xs notation: Copyright Ryan O'Donnell,2014
22 1. Boolean functions and the Fourier expansion We have illustrated that every f : {−1,1} n → ❘ can be represented by a real multilinear polynomial; as we will see in Section 1.3, this representation is unique. The multilinear polynomial for f may have up to 2n terms, corresponding to the subsets S ⊆ [n]. We write the monomial corresponding to S as x S = Y i∈S xi (with x ; = 1 by convention), and we use the following notation for its coefficient: fb(S) = coefficient on monomial x S in the multilinear representation of f . This discussion is summarized by the Fourier expansion theorem: Theorem 1.1. Every function f : {−1,1} n → ❘ can be uniquely expressed as a multilinear polynomial, f (x) = X S⊆[n] fb(S) x S . (1.4) This expression is called the Fourier expansion of f , and the real number fb(S) is called the Fourier coefficient of f on S. Collectively, the coefficients are called the Fourier spectrum of f . As examples, from (1.1) and (1.2) we obtain: max 2(;) = 1 2 , max 2({1}) = 1 2 , max 2({2}) = 1 2 , max 2({1,2}) = − 1 2 ; Maj 3 ({1}), Maj 3 ({2}), Maj 3 ({3}) = 1 2 , Maj 3 ({1,2,3}) = − 1 2 , Maj 3 (S) = 0 else. We finish this section with some notation. It is convenient to think of the monomial x S as a function on x = (x1,..., xn) ∈ ❘n ; we write it as χS(x) = Y i∈S xi . Thus we sometimes write the Fourier expansion of f : {−1,1} n → ❘ as f (x) = X S⊆[n] fb(S)χS(x). So far our notation makes sense only when representing the Hamming cube by {−1,1} n ⊆ ❘n . The other frequent representation we will use for the cube is ❋n 2 . We can define the Fourier expansion for functions f : ❋n 2 → ❘ by “encoding” input bits 0,1 ∈ ❋2 by the real numbers −1,1 ∈ ❘. We choose the encoding χ : ❋2 → ❘ defined by χ(0❋2 ) = +1, χ(1❋2 ) = −1. This encoding is not so natural from the perspective of Boolean logic; e.g., it means the function max2 we have discussed represents logical AND. But it’s mathematically natural because for b ∈ ❋2 we have the formula χ(b) = (−1)b . We now extend the χS notation: Copyright © Ryan O’Donnell, 2014
1.3.The orthonormal basis of parity functions 23 Definition 1.2.For S[n]we define xs:F2-R by Xs(x)=ΠXx)=(-1)正es which satisfies xs(x+y)=xs(x)xs(y). (1.5) In this way,given any function f:F2R it makes sense to write its Fourier expansion as fx)=∑fS)xsx. SsIn] In fact,if we are really thinking of F2 the n-dimensional vector space over F2,it makes sense to identify subsets S[n]with vectorsr.This will be discussed in Chapter 3.2. 1.3.The orthonormal basis of parity functions For xE{-1,1”,the number xs(x)=Πiesxi is in{-l,1h.Thus xs:{-1,1”一 (-1,1)is a Boolean function;it computes the logical parity,or exclusive-or (XOR),of the bits (xi)ies.The parity functions play a special role in the analysis of Boolean functions:the Fourier expansion f=∑fS)xs (1.6) Ss[n] shows that any f can be represented as a linear combination of parity func- tions (over the reals). It's useful to explore this idea further from the perspective of linear alge- bra.The set of all functions f:{-1,1)"-R forms a vector space V,since we can add two functions(pointwise)and we can multiply a function by a real scalar.The vector space V is 2"-dimensional:if we like we can think of the functions in this vector space as vectors in R2,where we stack the 2 values f(x)into a tall column vector (in some fixed order).Here we illustrate the Fourier expansion(1.1)of the max2 function from this perspective: +1 +1] +1 +1 +1 +1 +1 +1 max2= (1/2) (1/2 (1/2 -1/2) +1 +1 (1.7) +1 More generally,the Fourier expansion(1.6)shows that every function f:(-1,1)"-R in V is a linear combination of the parity functions;i.e.,the parity functions are a spanning set for V.Since the number of parity functions is 2n=dimV,we can deduce that they are in fact a linearly independent basis for V.In particular this justifies the uniqueness of the Fourier expansion stated in Theorem 1.1. Copyright Ryan O'Donnell,2014
1.3. The orthonormal basis of parity functions 23 Definition 1.2. For S ⊆ [n] we define χS : ❋n 2 → ❘ by χS(x) = Y i∈S χ(xi) = (−1) P i∈S xi , which satisfies χS(x+ y) = χS(x)χS(y). (1.5) In this way, given any function f : ❋n 2 → ❘ it makes sense to write its Fourier expansion as f (x) = X S⊆[n] fb(S)χS(x). In fact, if we are really thinking of ❋n 2 the n-dimensional vector space over ❋2, it makes sense to identify subsets S ⊆ [n] with vectors γ ∈ ❋n 2 . This will be discussed in Chapter 3.2. 1.3. The orthonormal basis of parity functions For x ∈ {−1,1} n , the number χS(x) = Q i∈S xi is in {−1,1}. Thus χS : {−1,1} n → {−1,1} is a Boolean function; it computes the logical parity, or exclusive-or (XOR), of the bits (xi)i∈S. The parity functions play a special role in the analysis of Boolean functions: the Fourier expansion f = X S⊆[n] fb(S)χS (1.6) shows that any f can be represented as a linear combination of parity functions (over the reals). It’s useful to explore this idea further from the perspective of linear algebra. The set of all functions f : {−1,1} n → ❘ forms a vector space V, since we can add two functions (pointwise) and we can multiply a function by a real scalar. The vector space V is 2n -dimensional: if we like we can think of the functions in this vector space as vectors in ❘2 n , where we stack the 2n values f (x) into a tall column vector (in some fixed order). Here we illustrate the Fourier expansion (1.1) of the max2 function from this perspective: max2 = +1 +1 +1 −1 = (1/2) +1 +1 +1 +1 +(1/2) +1 −1 +1 −1 +(1/2) +1 +1 −1 −1 +(−1/2) +1 −1 −1 +1 . (1.7) More generally, the Fourier expansion (1.6) shows that every function f : {−1,1} n → ❘ in V is a linear combination of the parity functions; i.e., the parity functions are a spanning set for V. Since the number of parity functions is 2n = dimV, we can deduce that they are in fact a linearly independent basis for V. In particular this justifies the uniqueness of the Fourier expansion stated in Theorem 1.1. Copyright © Ryan O’Donnell, 2014
24 1.Boolean functions and the Fourier expansion We can also introduce an inner product on pairs of function f,g:(-1,1)"- Rin V.The usual inner product on R2"would correspond to Exe(-1.1)f(x)g(x), but it's more convenient to scale this by a factor of 2-",making it an average rather than a sum.In this way,a Boolean function f:{-1,1)n-(-1,1)will have(f,f)=l,i.e.,bea“unit vector” Definition 1.3.We define an inner product (,on pairs of function f,g: {-1,1-Rby (f:g)=2-n f(x)g(x)=E.[f(a)g(x)]. (1.8) xe{-1,1}m x-{-1,1n We also use the notation llfll2=v(f,f),and more generally, lIfllp E[lf(x)IP]VP Here we have introduced probabilistic notation that will be used heavily throughout the book: Notation 1.4.We writex~(-1,1)7 to denote that x is a uniformly chosen ran- dom string from(-1,1)".Equivalently,the n coordinatesxi are independently chosen to be +1 with probability 1/2 and-1 with probability 1/2.We always write random variables in boldface.Probabilities Pr and expectations E will always be with respect to a uniformly randomx~(-1,1)"unless otherwise specified.Thus we might write the expectation in (1.8)as Er[f(x)g(x)]or E[f(x)g(x)]or even E[fgl. Returning to the basis of parity functions for V,the crucial fact underlying all analysis of Boolean functions is that this is an orthonormal basis. Theorem 1.5.The 2"parity functions xs:(-1,1y"-(-1,1)form an orthonor- mal basis for the vector space V of functions (-1,1yn-R;i.e., (1 ifS=T, (XS,XT〉= 0f$≠T. Recalling the definition (xs,IT)=E[xs(x)xT(x)],Theorem 1.5 follows imme- diately from two facts: Fact 1.6.For xE(-1,1)n it holds that xs(x)xr(x)=XSAT(x),where SAT denotes symmetric difference. Proofs=n几n听-=aa Fact1.7.EIxs(*)]= -688 Copyright @Ryan O'Donnell,2014
24 1. Boolean functions and the Fourier expansion We can also introduce an inner product on pairs of function f , g : {−1,1} n → ❘ in V. The usual inner product on ❘2 n would correspond to P x∈{−1,1} n f (x)g(x), but it’s more convenient to scale this by a factor of 2−n , making it an average rather than a sum. In this way, a Boolean function f : {−1,1} n → {−1,1} will have 〈f , f〉 = 1, i.e., be a “unit vector”. Definition 1.3. We define an inner product 〈·,·〉 on pairs of function f , g : {−1,1} n → ❘ by 〈f , g〉 = 2 −n X x∈{−1,1} n f (x)g(x) = E x∼{−1,1} n [f (x)g(x)]. (1.8) We also use the notation kf k2 = p 〈f , f〉, and more generally, kf kp = E[|f (x)| p ] 1/p . Here we have introduced probabilistic notation that will be used heavily throughout the book: Notation 1.4. We write x ∼ {−1,1} n to denote that x is a uniformly chosen random string from {−1,1} n . Equivalently, the n coordinates xi are independently chosen to be +1 with probability 1/2 and −1 with probability 1/2. We always write random variables in boldface. Probabilities Pr and expectations E will always be with respect to a uniformly random x ∼ {−1,1} n unless otherwise specified. Thus we might write the expectation in (1.8) as Ex[f (x)g(x)] or E[f (x)g(x)] or even E[f g]. Returning to the basis of parity functions for V, the crucial fact underlying all analysis of Boolean functions is that this is an orthonormal basis. Theorem 1.5. The 2 n parity functions χS : {−1,1} n → {−1,1} form an orthonormal basis for the vector space V of functions {−1,1} n → ❘; i.e., 〈χS,χT〉 = ( 1 if S = T, 0 if S 6= T. Recalling the definition 〈χS,χT〉 = E[χS(x)χT(x)], Theorem 1.5 follows immediately from two facts: Fact 1.6. For x ∈ {−1,1} n it holds that χS(x)χT(x) = χS4T(x), where S4T denotes symmetric difference. Proof. χS(x)χT(x) = Y i∈S xi Y i∈T xi = Y i∈S4T xi Y i∈S∩T x 2 i = Y i∈S4T xi = χS4T(x). Fact 1.7. E[χS(x)] = E hY i∈S xi i = ( 1 if S = ;, 0 if S 6= ;. Copyright © Ryan O’Donnell, 2014.
1.4.Basic Fourier formulas 25 Proof.IfS=o then E[xs(x)]=E[1]=1.Otherwise, - ies because the random bits 1,...,n are independent.But each of the factors E[xi]in the above (nonempty)product is (1/2)(+1)+(1/2)(-1)=0. ▣ 1.4.Basic Fourier formulas As we have seen,the Fourier expansion of f:(-1,1)"-R can be thought of as the representation of f over the orthonormal basis of parity functions (xs)scin].In this basis,f has 2""coordinates",and these are precisely the Fourier coefficients of f.The“coordinate”of f in the xs“direction”isf,xs; i.e.,we have the following formula for Fourier coefficients: Proposition 1.8.For f:(-1,1yn-R and S[n],the Fourier coefficient of f on S is given by fS)=f,Xs〉=x-昌()xs6x We can verify this formula explicitly: f,Xs〉=( ∑fT)x红,xs〉=∑fT)x,Xs)=fS), (1.9) \TSIn] TsIn] where we used the Fourier expansion of f,the linearity of(,),and finally Theorem 1.5.This formula is the simplest way to calculate the Fourier coef- ficients of a given function;it can also be viewed as a streamlined version of the interpolation method illustrated in (1.3).Alternatively,this formula can be taken as the definition of Fourier coefficients. The orthonormal basis of parities also lets us measure the squared "length" (2-norm)of f:{-1,1)7-R efficiently:it's just the sum of the squares of f's "coordinates"-i.e.,Fourier coefficients.This simple but crucial fact is called Parseval's Theorem. Parseval's Theorem.For any f:(-1,1)n-R, f,=-fw=∑fS2. Ssin] In particular,if f:(-1,1yn-(-1,1)is Boolean-valued then fS)2=1. Sc[n] As examples we can recall the Fourier expansions of max2 and Majg: max2(x)=+2x1x2,Majg(x)=+x+xx1x2x3. In both cases the sum of squares of Fourier coefficients is 4 x(1/4)=1. Copyright@Ryan O'Donnell,2014
1.4. Basic Fourier formulas 25 Proof. If S = ; then E[χS(x)] = E[1] = 1. Otherwise, E hY i∈S xi i = Y i∈S E[xi] because the random bits x1,..., xn are independent. But each of the factors E[xi] in the above (nonempty) product is (1/2)(+1)+(1/2)(−1) = 0. 1.4. Basic Fourier formulas As we have seen, the Fourier expansion of f : {−1,1} n → ❘ can be thought of as the representation of f over the orthonormal basis of parity functions (χS)S⊆[n] . In this basis, f has 2n “coordinates”, and these are precisely the Fourier coefficients of f . The “coordinate” of f in the χS “direction” is 〈f ,χS〉; i.e., we have the following formula for Fourier coefficients: Proposition 1.8. For f : {−1,1} n → ❘ and S ⊆ [n], the Fourier coefficient of f on S is given by fb(S) = 〈f ,χS〉 = E x∼{−1,1} n [f (x)χS(x)]. We can verify this formula explicitly: 〈f ,χS〉 = * X T⊆[n] fb(T)χT,χS + = X T⊆[n] fb(T)〈χT,χS〉 = fb(S), (1.9) where we used the Fourier expansion of f , the linearity of 〈·,·〉, and finally Theorem 1.5. This formula is the simplest way to calculate the Fourier coef- ficients of a given function; it can also be viewed as a streamlined version of the interpolation method illustrated in (1.3). Alternatively, this formula can be taken as the definition of Fourier coefficients. The orthonormal basis of parities also lets us measure the squared “length” (2-norm) of f : {−1,1} n → ❘ efficiently: it’s just the sum of the squares of f ’s “coordinates” – i.e., Fourier coefficients. This simple but crucial fact is called Parseval’s Theorem. Parseval’s Theorem. For any f : {−1,1} n → ❘, 〈f , f〉 = E x∼{−1,1} n [f (x) 2 ] = X S⊆[n] fb(S) 2 . In particular, if f : {−1,1} n → {−1,1} is Boolean-valued then X S⊆[n] fb(S) 2 = 1. As examples we can recall the Fourier expansions of max2 and Maj3 : max2(x) = 1 2 + 1 2 x1 + 1 2 x2 − 1 2 x1x2, Maj3 (x) = 1 2 x1 + 1 2 x2 + 1 2 x3 − 1 2 x1x2x3. In both cases the sum of squares of Fourier coefficients is 4×(1/4) = 1. Copyright © Ryan O’Donnell, 2014.