Moitra's work relies on connections to the Lovasz local lemma.His work improves our A <k bound in Theorem 3 and removes the monotonicity assumption.Despite this recent progress, Theorem 2 remains the best algorithmic result for k=3. 1.2 Outline of Paper In Section 2,we first give some preliminaries.We give a formal definition of strong spatial mixing (Section 2.1)and a reformulation of #HyperlndSet(k,A)as the problem of counting satisfying assignments in monotone CNF formulas (Section 2.2).This will allow us to use the computation tree used by Liu and Lu [12].A formal description of the computation tree of [12]is given in Section 2.3. In Section 3,we give an overview of our proof approach,i.e.,the main idea behind our new amortisation technique.Section 3.2 concludes the proof of Theorems 2 and 3,using two (not yet proved)technical lemmas(Lemma 21 and 22)which solve a complicated multivariate optimisation problem,and represent the bulk of the technical work of the paper.Section 3.3 gives the proof of Corollary 4. In Section 4,we give the proof of Lemma 21,which applies to the large-A setting of Theorem 3 and is by far the technically simpler of the two lemmas.Section 5 contains the proof of the technically more challenging Lemma 22 which applies to the k=3,A=6 setting of Theorem 2. Section 6 gives the formal statements and proofs of the hardness results stated in the Introduction;Section 6.1 has the hardness results for independent sets in hypergraphs and Section 6.2 has the hardness results for dominating sets in graphs.Also,Section 7 studies the uniqueness threshold on the k-uniform A-regular hypertree (and gives the proofs of the uniqueness statements made in the Introduction).Finally,Section 8 gives the proof for several technical inequalities used in Section 5. Throughout the paper we use computer algebra to prove multivariate polynomial inequal- ities over the field of real numbers (the coefficients of the polynomials are rational).More specifically,we use the RESOLVE command in Mathematica.The underlying quantifier elim- ination algorithm(described in [21])provides a rigorous decision procedure that determines feasibility of a collection of polynomial inequalities. 2 Preliminaries 2.1 Strong Spatial Mixing For the purposes of this section,it will be convenient to view the independent set model as a 2-spin model.Namely,if H=(V,F)is a hypergraph,each independent set I can be viewed as a f0,1)-assignment o to the vertices in V,where a vertex v is assigned the spin 1 under o ifv∈Iand0 otherwise. We denote by H the set of all independent sets in H.The Gibbs distribution u()is the uniform distribution over H.The Gibbs distribution of H can clearly be viewed as the uniform distribution over those assignments o:V>{0,1}which encode a valid independent set of H.For an assignment o:V{0,1}and a subset A C V,we denote by o the restriction of a to the subset A. For a hypergraph H=(V,F)and a subset A C V,we denote by HA the subgraph of H induced by A,i.e.,HA:=(A,Ueer(enA)).Also,for a vertex vV and A CV,we denote 6
Moitra’s work relies on connections to the Lovász local lemma. His work improves our ∆ ≤ k bound in Theorem 3 and removes the monotonicity assumption. Despite this recent progress, Theorem 2 remains the best algorithmic result for k = 3. 1.2 Outline of Paper In Section 2, we first give some preliminaries. We give a formal definition of strong spatial mixing (Section 2.1) and a reformulation of #HyperIndSet(k, ∆) as the problem of counting satisfying assignments in monotone CNF formulas (Section 2.2). This will allow us to use the computation tree used by Liu and Lu [12]. A formal description of the computation tree of [12] is given in Section 2.3. In Section 3, we give an overview of our proof approach, i.e., the main idea behind our new amortisation technique. Section 3.2 concludes the proof of Theorems 2 and 3, using two (not yet proved) technical lemmas (Lemma 21 and 22) which solve a complicated multivariate optimisation problem, and represent the bulk of the technical work of the paper. Section 3.3 gives the proof of Corollary 4. In Section 4, we give the proof of Lemma 21, which applies to the large-∆ setting of Theorem 3 and is by far the technically simpler of the two lemmas. Section 5 contains the proof of the technically more challenging Lemma 22 which applies to the k = 3, ∆ = 6 setting of Theorem 2. Section 6 gives the formal statements and proofs of the hardness results stated in the Introduction; Section 6.1 has the hardness results for independent sets in hypergraphs and Section 6.2 has the hardness results for dominating sets in graphs. Also, Section 7 studies the uniqueness threshold on the k-uniform ∆-regular hypertree (and gives the proofs of the uniqueness statements made in the Introduction). Finally, Section 8 gives the proof for several technical inequalities used in Section 5. Throughout the paper we use computer algebra to prove multivariate polynomial inequalities over the field of real numbers (the coefficients of the polynomials are rational). More specifically, we use the Resolve command in Mathematica. The underlying quantifier elimination algorithm (described in [21]) provides a rigorous decision procedure that determines feasibility of a collection of polynomial inequalities. 2 Preliminaries 2.1 Strong Spatial Mixing For the purposes of this section, it will be convenient to view the independent set model as a 2-spin model. Namely, if H = (V, F) is a hypergraph, each independent set I can be viewed as a {0, 1}-assignment σ to the vertices in V , where a vertex v is assigned the spin 1 under σ if v ∈ I and 0 otherwise. We denote by ΩH the set of all independent sets in H. The Gibbs distribution µH(·) is the uniform distribution over ΩH. The Gibbs distribution of H can clearly be viewed as the uniform distribution over those assignments σ : V → {0, 1} which encode a valid independent set of H. For an assignment σ : V → {0, 1} and a subset Λ ⊂ V , we denote by σΛ the restriction of σ to the subset Λ. For a hypergraph H = (V, F) and a subset Λ ⊂ V , we denote by HΛ the subgraph of H induced by Λ, i.e., HΛ := (Λ, ∪ e∈F (e ∩ Λ)). Also, for a vertex v ∈ V and Λ ⊂ V , we denote 6
by dist(v,A)the length of the shortest path'between v and a vertex of A. Definition 5.Let 6:Z0,1].The independent set model exhibits strong spatial mixing on a hypergraph H=(V,F)with decay rate 6()iff for every vEV,for every A CV;for any two configurations n,n:A->10,1}encoding independent sets of HA,it holds that uH(a()=1|aA=)-H(a(u)=1|oA=7)≤6(dist(,A), where A'denotes the set of vertices in A such that n and n'differ 2.2 Reformulation in terms of Monotone CNF formulas The problem of counting the independent sets of a hypergraph has an equivalent formulation in terms of monotone CNF formulas.In order to describe the equivalent formulation,we first describe the problem of counting satisfying assignments of monotone CNF formulas. A monotone CNF formula C consists of a set of variables V and a set of clauses {ci,c2,...}. Each clause ci is associated with some subset S;of V and is the disjunction of all variables in Si.The arity of a clause ci,denoted cil,is defined to be Sil.For a variable x V,its degree d(C)is the number of clauses where z appears.The marimum degree of C is given by maxrev d(C). Definition 6.Let Ck.be the set of all monotone CNF formulas which have marimum degree at most A and whose clauses have arity at least k. Note that a formula in Ck.A may have some clauses with arbitrarily large arities.A satisfying assignment of the formula is an assignment of truth values to the variables which nakes the formula evaluate to“true”. Suppose that H =(V,F)is a hypergraph with maximum degree at most A where each hyperedge has arity at least k.Let C be the corresponding formula in Ck.with variable set V. The correspondence is that each hyperedge Si of H is associated with exactly one clause ci of C.Independent sets of H are in one-to-one correspondence with satisfying assignments of C-a variable is assigned value "true"in an assignment if and only if it is out of the corresponding independent set. Going the other direction,any monotone CNF formula can be viewed as a hypergraph. In the technical sections of this paper,we use the monotone CNF formulation. In this article,when we consider a monotone CNF formula C we will typically use n to denote VI.Variables in V will be denoted by x1,2,...When z and C are clear from context, we will sometimes use d to denote d(C).When C is clear from context,we will sometimes use A to denote maxrev d(C). 2.3 The Computation Tree In this section,we set up relevant notation and give an exposition of the computation tree of Liu and Lu [12]which will also be used in our proof(though our analysis will be different). The computation tree of Liu and Lu is given in terms of the monotone CNF version of the problem.Below we give the relevant definitions and notation;our notation aligns as much as possible with that of [12]. 5A path in a hypergraph with hyperedge set F is a sequence of edges eo,...,eF such that eine for all i=0,...,-1
by dist(v,Λ) the length of the shortest path5 between v and a vertex of Λ. Definition 5. Let δ : Z+ → [0, 1]. The independent set model exhibits strong spatial mixing on a hypergraph H = (V, F) with decay rate δ(·) iff for every v ∈ V , for every Λ ⊂ V , for any two configurations η, η′ : Λ → {0, 1} encoding independent sets of HΛ, it holds that µH(σ(v) = 1 | σΛ = η) − µH(σ(v) = 1 | σΛ = η ′ ) ≤ δ ( dist(v,Λ ′ ) ) , where Λ ′ denotes the set of vertices in Λ such that η and η ′ differ. 2.2 Reformulation in terms of Monotone CNF formulas The problem of counting the independent sets of a hypergraph has an equivalent formulation in terms of monotone CNF formulas. In order to describe the equivalent formulation, we first describe the problem of counting satisfying assignments of monotone CNF formulas. A monotone CNF formula C consists of a set of variables V and a set of clauses {c1, c2, . . .}. Each clause ci is associated with some subset Si of V and is the disjunction of all variables in Si . The arity of a clause ci , denoted |ci |, is defined to be |Si |. For a variable x ∈ V , its degree dx(C) is the number of clauses where x appears. The maximum degree of C is given by maxx∈V dx(C). Definition 6. Let Ck,∆ be the set of all monotone CNF formulas which have maximum degree at most ∆ and whose clauses have arity at least k. Note that a formula in Ck,∆ may have some clauses with arbitrarily large arities. A satisfying assignment of the formula is an assignment of truth values to the variables which makes the formula evaluate to “true”. Suppose that H = (V, F) is a hypergraph with maximum degree at most ∆ where each hyperedge has arity at least k. Let C be the corresponding formula in Ck,∆ with variable set V . The correspondence is that each hyperedge Si of H is associated with exactly one clause ci of C. Independent sets of H are in one-to-one correspondence with satisfying assignments of C — a variable is assigned value “true” in an assignment if and only if it is out of the corresponding independent set. Going the other direction, any monotone CNF formula can be viewed as a hypergraph. In the technical sections of this paper, we use the monotone CNF formulation. In this article, when we consider a monotone CNF formula C we will typically use n to denote |V |. Variables in V will be denoted by x1, x2, . . . When x and C are clear from context, we will sometimes use d to denote dx(C). When C is clear from context, we will sometimes use ∆ to denote maxx∈V dx(C). 2.3 The Computation Tree In this section, we set up relevant notation and give an exposition of the computation tree of Liu and Lu [12] which will also be used in our proof (though our analysis will be different). The computation tree of Liu and Lu is given in terms of the monotone CNF version of the problem. Below we give the relevant definitions and notation; our notation aligns as much as possible with that of [12]. 5A path in a hypergraph with hyperedge set F is a sequence of edges e0, . . . , eℓ ∈ F such that ei ∩ ei+1 ̸= ∅ for all i = 0, . . . , ℓ − 1. 7
Our goal is to approximately count the number of satisfying assignments of a formula C∈Ck.△,which we denote by Z(C).Since C is monotone,.an assignment o:V→{0,l}is satisfying if,for every clause in C,there is at least one variable x Ec with a(r)=1.Note that Z(C)>0 since the all-1 assignment satisfies every monotone CNF formula.For convenience, we will use the simplified notation "x=1"to denote (the set of)satisfying assignments of C in which z is set to 1,and we similarly use "x =0".We associate the formula C with a probability distribution in which each satisfying assignment has probability mass 1/Z(C). We will denote probabilities with respect to this distribution by Prc(). Let be avariable in V.Define R(C.),this is well-defined since Prc( 1)>0 by the monotonicity of C.In fact,the monotonicity of C also implies that 0< R(C,x)<1,where the upper bound follows from the fact that,for every satisfying assignment with x=0,flipping the assignment of x to 1 does not affect satisfiability.Our interest in the quantity R(C,x)stems from the following simple lemma from [12]. Lemma 7 ([12]).Let k and A be positive integers.Suppose that there is a polynomial-time algorithm (in n and 1/e)that takes an n-variable formula CECk.A,a variable r of C,and an >0 and computes a quantity R(C,x)satisfying R(C,)-R(C,)<Then,there is an FPTAS which approrimates Z(C)for every CECk.A. Proof.The proof is actually identical to the argument in [12,Appendix A].We include the proof for completeness,and also because an examination of the proof is necessary to check that the FPTAS for approximating Z(C)invokes the algorithm that computes R(C,x)only on formulas C whose clauses have arity at least k (and maximum degree A). Let e>0 and C be a monotone CNF formula C with maximum degree A whose clauses have arity at least k.Let 1,...,n be the variables in C.Let Ci be the formula obtained from C by setting 1=...=xi=1 and removing all the clauses that are satisfied (i.e.,all clauses that contain a variable from x1,...,xi).We have zZC=Pcm1==n=)=ⅡPrc(c+1=1|1=…=4=1) (1) -PIG(I=1)-1+R() 0 Note that every Ci is a monotone CNF formula with maximum degree A whose clauses have arity at least k.By the assumption in the lemma,we can compute (in poly(n,1/e)time) quantities R(Ci,+)such that R(C,xi+1)-R(C,x+1)≤e/(100m)for all i=0,,n-1. (2) Let Π(1+(C,x+1) (3) =( It is not hard to conclude from (1),(2)and (3)that (1-)Z(C)<Z(C)<(1+)Z(C).This completes the proof. ▣ Liu and Lu [12]established that a computation tree approach gives a recursive procedure for eractly calculating R(C,x)for any monotone CNF formula C and any variable x E C. 8
Our goal is to approximately count the number of satisfying assignments of a formula C ∈ Ck,∆, which we denote by Z(C). Since C is monotone, an assignment σ : V → {0, 1} is satisfying if, for every clause in C, there is at least one variable x ∈ c with σ(x) = 1. Note that Z(C) > 0 since the all-1 assignment satisfies every monotone CNF formula. For convenience, we will use the simplified notation “x = 1” to denote (the set of) satisfying assignments of C in which x is set to 1, and we similarly use “x = 0”. We associate the formula C with a probability distribution in which each satisfying assignment has probability mass 1/Z(C). We will denote probabilities with respect to this distribution by PrC(·). Let x be a variable in V . Define R(C, x) := PrC (x=0) PrC (x=1) , this is well-defined since PrC(x = 1) > 0 by the monotonicity of C. In fact, the monotonicity of C also implies that 0 ≤ R(C, x) ≤ 1, where the upper bound follows from the fact that, for every satisfying assignment with x = 0, flipping the assignment of x to 1 does not affect satisfiability. Our interest in the quantity R(C, x) stems from the following simple lemma from [12]. Lemma 7 ([12]). Let k and ∆ be positive integers. Suppose that there is a polynomial-time algorithm (in n and 1/ε) that takes an n-variable formula C ∈ Ck,∆, a variable x of C, and an ε > 0 and computes a quantity Rb(C, x) satisfying |Rb(C, x) − R(C, x)| ≤ ε. Then, there is an FPTAS which approximates Z(C) for every C ∈ Ck,∆. Proof. The proof is actually identical to the argument in [12, Appendix A]. We include the proof for completeness, and also because an examination of the proof is necessary to check that the FPTAS for approximating Z(C) invokes the algorithm that computes Rb(C, x) only on formulas C whose clauses have arity at least k (and maximum degree ∆). Let ε > 0 and C be a monotone CNF formula C with maximum degree ∆ whose clauses have arity at least k. Let x1, . . . , xn be the variables in C. Let Ci be the formula obtained from C by setting x1 = · · · = xi = 1 and removing all the clauses that are satisfied (i.e., all clauses that contain a variable from x1, . . . , xi). We have 1 Z(C) = PrC(x1 = . . . = xn = 1) = n∏−1 i=0 PrC(xi+1 = 1 | x1 = · · · = xi = 1) = n∏−1 i=0 PrCi (xi+1 = 1) = n∏−1 i=0 1 1 + R(Ci , xi+1) . (1) Note that every Ci is a monotone CNF formula with maximum degree ∆ whose clauses have arity at least k. By the assumption in the lemma, we can compute (in poly(n, 1/ε) time) quantities Rb(Ci , xi+1) such that Rb(Ci , xi+1) − R(Ci , xi+1) ≤ ε/(100n) for all i = 0, . . . , n − 1. (2) Let Zb(C) = n∏−1 i=0 ( 1 + Rb(Ci , xi+1) ) . (3) It is not hard to conclude from (1), (2) and (3) that (1−ε)Z(C) ≤ Zb(C) ≤ (1 +ε)Z(C). This completes the proof. Liu and Lu [12] established that a computation tree approach gives a recursive procedure for exactly calculating R(C, x) for any monotone CNF formula C and any variable x ∈ C. 8
We next give the details of this recursive procedure (see [12,Lemma 5]).First,we introduce the following definitions. Definition 8.Let C be a monotone CNF formula and let x be a variable in C.We call the variable r forced (in C)if x appears in a clause of arity 1 in C (note that in every satisfying assignment of C it must be the case that r=1 and hence R(C,x)=0).We call the variable x free if x does not appear in any clause of C (note that R(C,)=1 in this case). Definition 9.Let C be a monotone CNF formula and let c be a clause in C.We call the clause c redundant (in C)if there is a clause c in C such that c is a (strict)superset of c (note that removing c from C does not affect the set of satisfying assignments of C). We next give the details of the computation tree.The nodes in the computation tree will be pairs (C,z)such that C is a monotone CNF formula and x is a variable which is not forced in C. (4) Let (C,x)satisfy (4).We first perform a pre-processing step on C which involves (i)initially removing all of the redundant clauses,(ii)then,removing all clauses of arity 1.Note that part (ii)of the preprocessing step removes all forced variables that were present in C;at the time of the removal,forced variables appear only in clauses of arity 1 since part (i)of the preprocessing step has already removed all redundant clauses in C (and hence all clauses of arity greater than 1 that contain forced variables).Denote the formula after the completion of the preprocessing step by C.Note that every clause in C is also a clause in the initial formula C.It follows that x is not forced in C.Further,since removing redundant clauses does not change the set of satisfying assignments of C and x is not forced in C,we have that R(C,x)=R(C,E). If z is free in C(the formula after the pre-processing step),then the start node (C,z)is (declared)a leaf of the computation tree (note that in this case R(C,x)=1).In the sequel, we assume that z is not free in C.Denote by {cilield the clauses where x occurs in C and let wi=cil-1 (note that d >1).We will use w to denote the vector (w1,...,wd).The variables in clause ci other than will be denoted byi.For the pair (C,),we next construct pairs(Cii,ij)for i[d and j [wi],where Cij is an appropriate subformula obtained from C,roughly,by hard-coding (some of)the occurrences of the variables in C to either 1 or 0(this will be explained below and will henceforth be referred to as pinning)6. Precisely,for i [d,let Ci be the formula obtained from C by removing clauses c1,...,ci-1 (note that this has the same effect as pinning the occurrences of r in these clauses to 1)and pinning the occurrences of x in ci+1,...,ca to 0(this corresponds to removing x from these clauses,and thus reducing their arities).For j [wil,the formula Cii is obtained from Ci by further removing clause ci and pinning all the occurrences of xi1,...,xij-1 to 0. Before proceeding,let us argue that the pairs (Ci.j,xij)satisfy (4)for all ie[d and j[wi].For such i,j,we first prove that Cii is a (satisfiable)monotone CNF formula. That is,we prove that the various pinnings in the construction of Cii from C do not pin all variables of some clause of C to 0.For the sake of contradiction,assume otherwise.Observe that Ci;is obtained from C by either removing some clauses or by pinning some occurrences of the variables to 0.Clearly,removal of clauses does not affect satisfiability,so we may focus on the effect of pinning.For i[d and j [wil,the only variables whose (some of the) 6Note that our notation for i,j is different from the one in [12];there,the roles of i,j are interchanged. 9
We next give the details of this recursive procedure (see [12, Lemma 5]). First, we introduce the following definitions. Definition 8. Let C be a monotone CNF formula and let x be a variable in C. We call the variable x forced (in C) if x appears in a clause of arity 1 in C (note that in every satisfying assignment of C it must be the case that x = 1 and hence R(C, x) = 0). We call the variable x free if x does not appear in any clause of C (note that R(C, x) = 1 in this case). Definition 9. Let C be a monotone CNF formula and let c be a clause in C. We call the clause c redundant (in C) if there is a clause c ′ in C such that c is a (strict) superset of c ′ (note that removing c from C does not affect the set of satisfying assignments of C). We next give the details of the computation tree. The nodes in the computation tree will be pairs (C, x) such that C is a monotone CNF formula and x is a variable which is not forced in C. (4) Let (C, x) satisfy (4). We first perform a pre-processing step on C which involves (i) initially removing all of the redundant clauses, (ii) then, removing all clauses of arity 1. Note that part (ii) of the preprocessing step removes all forced variables that were present in C; at the time of the removal, forced variables appear only in clauses of arity 1 since part (i) of the preprocessing step has already removed all redundant clauses in C (and hence all clauses of arity greater than 1 that contain forced variables). Denote the formula after the completion of the preprocessing step by Ce. Note that every clause in Ce is also a clause in the initial formula C. It follows that x is not forced in Ce. Further, since removing redundant clauses does not change the set of satisfying assignments of C and x is not forced in C, we have that R(C, x e ) = R(C, x). If x is free in Ce (the formula after the pre-processing step), then the start node (C, x) is (declared) a leaf of the computation tree (note that in this case R(C, x) = 1). In the sequel, we assume that x is not free in Ce. Denote by {ci}i∈[d] the clauses where x occurs in Ce and let wi = |ci | − 1 (note that d ≥ 1). We will use w to denote the vector (w1, . . . , wd). The variables in clause ci other than x will be denoted by xi,1, . . . , xi,wi . For the pair (C, x), we next construct pairs (Ci,j , xi,j ) for i ∈ [d] and j ∈ [wi ], where Ci,j is an appropriate subformula obtained from Ce, roughly, by hard-coding (some of) the occurrences of the variables in Ce to either 1 or 0 (this will be explained below and will henceforth be referred to as pinning)6 . Precisely, for i ∈ [d], let Ci be the formula obtained from Ce by removing clauses c1, . . . , ci−1 (note that this has the same effect as pinning the occurrences of x in these clauses to 1) and pinning the occurrences of x in ci+1, . . . , cd to 0 (this corresponds to removing x from these clauses, and thus reducing their arities). For j ∈ [wi ], the formula Ci,j is obtained from Ci by further removing clause ci and pinning all the occurrences of xi,1, . . . , xi,j−1 to 0. Before proceeding, let us argue that the pairs (Ci,j , xi,j ) satisfy (4) for all i ∈ [d] and j ∈ [wi ]. For such i, j, we first prove that Ci,j is a (satisfiable) monotone CNF formula. That is, we prove that the various pinnings in the construction of Ci,j from Ce do not pin all variables of some clause of Ce to 0. For the sake of contradiction, assume otherwise. Observe that Ci,j is obtained from Ce by either removing some clauses or by pinning some occurrences of the variables to 0. Clearly, removal of clauses does not affect satisfiability, so we may focus on the effect of pinning. For i ∈ [d] and j ∈ [wi ], the only variables whose (some of the) 6Note that our notation for i, j is different from the one in [12]; there, the roles of i, j are interchanged. 9
occurrences in C get pinned to 0 are i1,...,ij-1.Since we assumed (for contradiction) that Cii is unsatisfiable,it must be the case that there exists a clause c in C all of whose variables are (a subset of),.1,...1.It follows that ci is redundant in C since it is a strict superset of clause c.This gives a contradiction,since the pre-processing operation ensures that C has no redundant clauses.Thus,Cij is satisfiable as wanted.Next,we show that rij is not forced in Cij.First,observe that xij is not forced in C since the second part of the preprocessing step ensures that C does not contain forced variables.Thus,the only way that zij can be forced in Cij is if there existed a clause d in C whose variables were ij together with a subset of i,1,...,ij-1.Since C includes ci and C does not have redundant clauses,it must be the case that c'=ci.It remains to observe that Cij does not include (any subclause of)ci,from which it follows that rij is not forced in Cij. We are now ready to state the relation between R(C,z)and the quantities R(Cij,ij) with i∈[d and j∈[ul. Lemma 10 ([12,Lemma 5).It holds that R(C,x - R(Cjxi】 1+R(Cij,ij) (5) Proof.The proof is identical to the proof of [12,Lemma 5](which in turn builds on the technique of [22]),we give the proof for completeness. Recall that C is the formula after the preprocessing step and that R(C,)=R(C,z).We may assume that x is not free in C(otherwise,it holds that R(C,r)=1,which coincides with the evaluation of the right hand side of (5)under the standard convention that the empty product evaluates to 1). Equation(5)follows immediately from the following two equalities. R(C,x)=II R(Ci,)R(Ci,x)=1- R(Ci,c) (6) ield j=1 1+R(Ci.j;Tij) Prc(x=0) The first equality in (6)is a consequence of a telescoping expansion ofTo see this, let C be the formula obtained from C by replacing,for all i [d,the occurrence of the variable x in clause ci by a new variable z.We have that Prc(x=0)Prc(=0,.,4=0) C,=Pre=司=m=1%可 Pra(=1,-1=1,2=0…4=0) -Ⅱ8-14-141-0…4=0 = Prc;(=0) ield ΠPcx=D iel which yields the first equlty (6)after substitutingR(C) PrC;(z=0) For the second equality in(6),observe that x appears only in clause ci of the formula Ci, and thus (denoting by Ci\ci the formula which is obtained from Ci by deleting clause ci) R(Ci,) PCz=1=1-PrCe(1=0,4,=0)=】 PrC(x=0) IIPrc.=0), j=1 which proves the desired equality after substituting Prc(0)= R(C,x,) 10
occurrences in Ce get pinned to 0 are x, xi,1, . . . , xi,j−1. Since we assumed (for contradiction) that Ci,j is unsatisfiable, it must be the case that there exists a clause c ′ in Ce all of whose variables are (a subset of) x, xi,1, . . . , xi,j−1. It follows that ci is redundant in Ce since it is a strict superset of clause c ′ . This gives a contradiction, since the pre-processing operation ensures that Ce has no redundant clauses. Thus, Ci,j is satisfiable as wanted. Next, we show that xi,j is not forced in Ci,j . First, observe that xi,j is not forced in Ce since the second part of the preprocessing step ensures that Ce does not contain forced variables. Thus, the only way that xi,j can be forced in Ci,j is if there existed a clause c ′ in Ce whose variables were xi,j together with a subset of x, xi,1, . . . , xi,j−1. Since Ce includes ci and Ce does not have redundant clauses, it must be the case that c ′ = ci . It remains to observe that Ci,j does not include (any subclause of) ci , from which it follows that xi,j is not forced in Ci,j . We are now ready to state the relation between R(C, x) and the quantities R(Ci,j , xi,j ) with i ∈ [d] and j ∈ [wi ]. Lemma 10 ([12, Lemma 5]). It holds that R(C, x) = ∏ d i=1 ( 1 − ∏wi j=1 R(Ci,j , xi,j ) 1 + R(Ci,j , xi,j ) ) . (5) Proof. The proof is identical to the proof of [12, Lemma 5] (which in turn builds on the technique of [22]), we give the proof for completeness. Recall that Ce is the formula after the preprocessing step and that R(C, x e ) = R(C, x). We may assume that x is not free in Ce (otherwise, it holds that R(C, x e ) = 1, which coincides with the evaluation of the right hand side of (5) under the standard convention that the empty product evaluates to 1). Equation (5) follows immediately from the following two equalities. R(C, x e ) = ∏ i∈[d] R(Ci , x), R(Ci , x) = 1 − ∏wi j=1 R(Ci,j , xi,j ) 1 + R(Ci,j , xi,j ) . (6) The first equality in (6) is a consequence of a telescoping expansion of PrCe(x=0) PrCe(x=1) . To see this, let Ce′ be the formula obtained from Ce by replacing, for all i ∈ [d], the occurrence of the variable x in clause ci by a new variable x ′ i . We have that R(C, x e ) = PrCe(x = 0) PrCe(x = 1) = PrCe′(x ′ 1 = 0, . . . , x′ d = 0) PrCe′(x ′ 1 = 1, . . . , x′ d = 1) = ∏ i∈[d] PrCe′(x ′ 1 = 1, . . . , x′ i−1 = 1, x′ i = 0, . . . x′ d = 0) PrCe′(x ′ 1 = 1, . . . , x′ i = 1, x′ i+1 = 0, . . . x′ d = 0) = ∏ i∈[d] PrCi (x = 0) PrCi (x = 1), which yields the first equality in (6) after substituting R(Ci , x) = PrCi (x=0) PrCi (x=1) . For the second equality in (6), observe that x appears only in clause ci of the formula Ci , and thus (denoting by Ci\ci the formula which is obtained from Ci by deleting clause ci) R(Ci , x) = PrCi (x = 0) PrCi (x = 1) = 1 − PrCi\ci (xi,1 = 0, . . . , xi,wi = 0) = 1 − ∏wi j=1 PrCi,j (xi,j = 0), which proves the desired equality after substituting PrCi,j (xi,j = 0) = R(Ci,j ,xi,j ) 1+R(Ci,j ,xi,j ) . 10