the two chains is small, in the sense that they have almost the same execution logs except for about $O(TL/n)$ steps, where $L$ is the difference between $\mathcal{I}$ and $\mathcal{I}'$. To simplify the exposition of such a coupling, for now we restrict ourselves to the cases where the update to the instance $\mathcal{I}$ does not change the set of variables. Without loss of generality, we only consider the following two basic update operations that modify $\mathcal{I}$ to $\mathcal{I}'$.

• Graph update. The update only adds or deletes some edges, while all vertex potentials and the potentials of unaffected edges remain unchanged.
• Hamiltonian update. The update changes (possibly all) potentials of vertices and edges, while the underlying graph remains unchanged.

A general update of the graphical model can be obtained by combining these two basic operations.

Then the new chain $(Y_t)_{t=0}^{T}$ can be coupled with $(X_t)_{t=0}^{T}$ by using the same initial configuration $Y_0 = X_0$ and the same sequence $v_1, v_2, \ldots, v_T \in V$ of randomly picked vertices. For $t = 1, 2, \ldots, T$, the transition $\langle v_t, Y_t(v_t)\rangle$ of the new chain is generated using the same vertex $v_t$ as in the original chain $(X_t)_{t=0}^{T}$, and a random $Y_t(v_t)$ drawn according to a coupling of the marginal distributions of $X_t(v_t)$ and $Y_t(v_t)$, conditioning respectively on the current states of the neighborhood of $v_t$ in $(X_t)_{t=0}^{T}$ and $(Y_t)_{t=0}^{T}$. Note that these two marginal distributions must be identical unless (I) $X_{t-1}$ and $Y_{t-1}$ differ from each other over the neighborhood of $v_t$, or (II) $v_t$ itself is incident to where the models $\mathcal{I}$ and $\mathcal{I}'$ differ. The event (II) occurs rarely, for the following reasons.

• For a graph update, the event (II) occurs only if $v_t$ is incident to an updated edge. Since only $L$ edges are updated, the event occurs in at most $O(TL/n)$ steps.
• For a Hamiltonian update, all the potentials of vertices and edges can be changed, so $\mathcal{I}$ and $\mathcal{I}'$ may differ everywhere. The key observation is that, as the total difference between the current and updated potentials is bounded by $L$, we can apply a filter to first select all candidate steps where the coupling may actually fail due to the difference between $\mathcal{I}$ and $\mathcal{I}'$, whose number can be as small as $O(TL/n)$, and the actual coupling between $(X_t)_{t=0}^{\infty}$ and $(Y_t)_{t=0}^{\infty}$ is constructed with such a prior.

Finally, when $\mathcal{I}$ and $\mathcal{I}'$ both satisfy the Dobrushin–Shlosman condition, the percolation of disagreements between $(X_t)_{t=0}^{T}$ and $(Y_t)_{t=0}^{T}$ is bounded, and we show that the two chains are almost always identically coupled as $\langle v_t, X_t(v_t)\rangle = \langle v_t, Y_t(v_t)\rangle$, with exceptions at only $O(TL/n)$ steps. The original chain $(X_t)_{t=0}^{T}$ can then be updated to the new chain $(Y_t)_{t=0}^{T}$ by editing only these $O(TL/n)$ local transitions $\langle v_t, Y_t(v_t)\rangle$ that differ from $\langle v_t, X_t(v_t)\rangle$. This is aided by the dynamic data structure for the execution log of the chain, which is of independent interest.

6. Dynamic Gibbs sampling

In this section, we give the dynamic sampling algorithm that updates the sample sequences. In the following theorem, we use $\mathcal{I} = (V, E, Q, \Phi)$, where $n = |V|$, to denote the current MRF instance and $\mathcal{I}' = (V', E', Q, \Phi')$, where $n' = |V'|$, to denote the updated MRF instance, and define
$$d_{\mathrm{graph}}(\mathcal{I}, \mathcal{I}') \triangleq |V \oplus V'| + |E \oplus E'|,$$
$$d_{\mathrm{Hamil}}(\mathcal{I}, \mathcal{I}') \triangleq \sum_{v \in V \cap V'} \|\phi_v - \phi'_v\|_1 + \sum_{e \in E \cap E'} \|\phi_e - \phi'_e\|_1.$$
Note that $d(\mathcal{I}, \mathcal{I}') = d_{\mathrm{graph}}(\mathcal{I}, \mathcal{I}') + d_{\mathrm{Hamil}}(\mathcal{I}, \mathcal{I}')$, where $d(\mathcal{I}, \mathcal{I}')$ is defined in (2).

Theorem 6.1 (dynamic sampling algorithm).
Let $N: \mathbb{N}^+ \to \mathbb{N}^+$ and $\epsilon: \mathbb{N}^+ \to (0,1)$ be two functions satisfying the bounded difference condition in Definition 2.3. Assume that $\mathcal{I}$ and $\mathcal{I}'$ both satisfy the Dobrushin–Shlosman condition, $d_{\mathrm{graph}}(\mathcal{I}, \mathcal{I}') \le L_{\mathrm{graph}} = o(n)$ and $d_{\mathrm{Hamil}}(\mathcal{I}, \mathcal{I}') \le L_{\mathrm{Hamil}}$. There is an algorithm that maintains a sequence of $N(n)$ independent samples $X^{(1)}, \ldots, X^{(N(n))} \in Q^V$ with $d_{\mathrm{TV}}\big(\mu_{\mathcal{I}}, X^{(i)}\big) \le \epsilon(n)$ for all $1 \le i \le N(n)$, using $O(nN(n)\log n)$ memory words, each of $O(\log n)$ bits, such that when $\mathcal{I}$ is updated to $\mathcal{I}'$, the algorithm updates the sequence to $N(n')$ independent samples $Y^{(1)}, \ldots, Y^{(N(n'))} \in Q^{V'}$ with $d_{\mathrm{TV}}\big(\mu_{\mathcal{I}'}, Y^{(i)}\big) \le \epsilon(n')$ for all $1 \le i \le N(n')$, within expected time cost
$$O\left(\Delta^2 \left(L_{\mathrm{graph}} + L_{\mathrm{Hamil}}\right) N(n) \log^3 n + \Delta n\right), \tag{5}$$
where $\Delta = \max\{\Delta_G, \Delta_{G'}\}$, and $\Delta_G, \Delta_{G'}$ denote the maximum degrees of $G = (V, E)$ and $G' = (V', E')$.
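To make the two distance measures above concrete, here is a minimal Python sketch computing $d_{\mathrm{graph}}$ and $d_{\mathrm{Hamil}}$ for two instances. The `MRF` container and the dict-based representation of potentials are our own illustrative assumptions, not part of the paper's data structures.

```python
from dataclasses import dataclass

@dataclass
class MRF:
    """A hypothetical container for an MRF instance I = (V, E, Q, Phi).

    phi maps each vertex v, and each edge given as a frozenset {u, v},
    to its potential, stored as a dict from values (or value pairs) to floats.
    """
    V: set       # vertices
    E: set       # edges, each a frozenset {u, v}
    Q: list      # the spin domain
    phi: dict    # vertex/edge -> {value(s): float}

def l1(phi_a, phi_b):
    """||phi_a - phi_b||_1, summed over the union of supports."""
    keys = set(phi_a) | set(phi_b)
    return sum(abs(phi_a.get(k, 0.0) - phi_b.get(k, 0.0)) for k in keys)

def d_graph(I, J):
    """|V (+) V'| + |E (+) E'|: size of the symmetric differences."""
    return len(I.V ^ J.V) + len(I.E ^ J.E)

def d_hamil(I, J):
    """Sum of L1 potential differences over shared vertices and edges."""
    shared = (I.V & J.V) | (I.E & J.E)
    return sum(l1(I.phi[a], J.phi[a]) for a in shared)
```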
Our algorithm is based on the Gibbs sampling algorithm. Let $N: \mathbb{N}^+ \to \mathbb{N}^+$ and $\epsilon: \mathbb{N}^+ \to (0,1)$ be the two functions in Theorem 6.1. We first give the single-sample dynamic Gibbs sampling algorithm (Algorithm 2) that maintains a single sample $X \in Q^V$ for the current MRF instance $\mathcal{I} = (V, E, Q, \Phi)$, where $n = |V|$, such that $d_{\mathrm{TV}}(X, \mu_{\mathcal{I}}) \le \epsilon(n)$. We then use this algorithm to obtain the multi-sample dynamic Gibbs sampling algorithm that maintains $N(n)$ independent samples for the current instance.

Given the error function $\epsilon: \mathbb{N}^+ \to (0,1)$, suppose that $T(\mathcal{I})$ is an easy-to-compute integer-valued function that upper bounds the mixing time on instance $\mathcal{I}$, such that
$$T(\mathcal{I}) \ge \tau_{\mathrm{mix}}(\mathcal{I}, \epsilon(n)), \tag{6}$$
where $\tau_{\mathrm{mix}}(\mathcal{I}, \epsilon(n))$ denotes the mixing rate of the Gibbs sampling chain $(X_t)_{t \ge 0}$ on instance $\mathcal{I}$. By Proposition 4.3, if the Dobrushin–Shlosman condition is satisfied, we can set
$$T(\mathcal{I}) = \left\lceil \frac{n}{\delta} \log \frac{n}{\epsilon(n)} \right\rceil. \tag{7}$$

Our algorithm for single-sample dynamic Gibbs sampling maintains a random process $(X_t)_{t=0}^{T}$, which is a Gibbs sampling chain on instance $\mathcal{I}$ of length $T = T(\mathcal{I})$, where $T(\mathcal{I})$ satisfies (6). Clearly $X_T$ is a sample for $\mu_{\mathcal{I}}$ with $d_{\mathrm{TV}}(X_T, \mu_{\mathcal{I}}) \le \epsilon(n)$.

When the current instance $\mathcal{I}$ is updated to a new instance $\mathcal{I}' = (V', E', Q, \Phi')$, where $n' = |V'|$, the original process $(X_t)_{t=0}^{T}$ is transformed into a new process $(Y_t)_{t=0}^{T'}$ such that the following holds as an invariant: $(Y_t)_{t=0}^{T'}$ is a Gibbs sampling chain on $\mathcal{I}'$ with $T' = T(\mathcal{I}')$. Hence $Y_{T'}$ is a sample for the new instance $\mathcal{I}'$ with $d_{\mathrm{TV}}(Y_{T'}, \mu_{\mathcal{I}'}) \le \epsilon(n')$. This is achieved through the following two steps:
(1) We construct couplings between $(X_t)_{t=0}^{T}$ and $(Y_t)_{t=0}^{T'}$, so that the new process $(Y_t)_{t=0}^{T'}$ for $\mathcal{I}'$ can be obtained by making small changes to the original process $(X_t)_{t=0}^{T}$ for $\mathcal{I}$.
(2) We give a data structure which represents $(X_t)_{t=0}^{T}$ incrementally and supports various updates and queries to $(X_t)_{t=0}^{T}$, so that the above coupling can be generated efficiently.

6.1. Coupling for dynamic instances. The Gibbs sampling chain $(X_t)_{t=0}^{T}$ can be uniquely and fully recovered from: the initial state $X_0 \in Q^V$, and the pairs $\langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ that record the transitions. We call $\langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ the execution log for the chain $(X_t)_{t=0}^{T}$, and denote it by
$$\text{Exe-Log}(\mathcal{I}, T) \triangleq \langle v_t, X_t(v_t)\rangle_{t=1}^{T}.$$
The following invariants are assumed for the random execution log with an initial state.

Condition 6.2 (invariants for Exe-Log). Fix an initial state $X_0 \in Q^V$. The following hold for the random execution log $\text{Exe-Log}(\mathcal{I}, T) = \langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ for the Gibbs sampling chain $(X_t)_{t=0}^{T}$ on instance $\mathcal{I} = (V, E, Q, \Phi)$:
• $T = T(\mathcal{I})$, where $T(\mathcal{I})$ satisfies (6);
• the random process $(X_t)_{t=0}^{T}$, uniquely recovered from the transitions $\langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ and the initial state $X_0$, is identically distributed as the Gibbs sampling chain (Algorithm 1) on instance $\mathcal{I}$ starting from initial state $X_0$ with $v_t$ as the vertex picked at the $t$-th step.

Such invariants guarantee that $X_T$ provides a sample for $\mu_{\mathcal{I}}$ with $d_{\mathrm{TV}}(X_T, \mu_{\mathcal{I}}) \le \epsilon(|V|)$.

Suppose the current instance $\mathcal{I}$ is updated to a new instance $\mathcal{I}'$. We construct couplings between the execution log $\text{Exe-Log}(\mathcal{I}, T) = \langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ with initial state $X_0 \in Q^V$ for $\mathcal{I}$ and the execution log $\text{Exe-Log}(\mathcal{I}', T') = \langle v'_t, Y_t(v'_t)\rangle_{t=1}^{T'}$ with initial state $Y_0 \in Q^{V'}$ for $\mathcal{I}'$. Our goal is as follows: assuming Condition 6.2 for $X_0$ and $\text{Exe-Log}(\mathcal{I}, T)$, the same condition should hold invariantly for $Y_0$ and $\text{Exe-Log}(\mathcal{I}', T')$.
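As a concrete illustration of this representation, the following Python sketch shows how a chain is recovered from its initial state and execution log, and how the length $T(\mathcal{I})$ in (7) is computed. The names are illustrative; the actual data structure of the paper supports these operations incrementally rather than by full replay.

```python
import math

def chain_length(n, delta, eps):
    """T(I) = ceil((n / delta) * log(n / eps)), an integer upper bound on
    the mixing time under the Dobrushin-Shlosman condition, as in (7)."""
    return math.ceil(n / delta * math.log(n / eps))

def replay(X0, exe_log):
    """Recover the final state X_T from the initial state X0 and the
    execution log <v_t, X_t(v_t)>_{t=1..T}. Each transition overwrites a
    single coordinate, so X0 plus the log determines the whole chain."""
    X = dict(X0)                  # X0 maps each vertex to a value in Q
    for v_t, value_t in exe_log:  # transitions in chronological order
        X[v_t] = value_t
    return X
```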
Unlike traditional couplings of Markov chains for the analysis of mixing time, where the two chains start from arbitrarily distinct initial states but proceed by the same transition rule, here the two chains $(X_t)_{t=0}^{T}$ and $(Y_t)_{t=0}^{T}$ start from similar states but have to obey different transition rules, due to the differences between the instances $\mathcal{I}$ and $\mathcal{I}'$.
For technical reasons, we divide the update from $\mathcal{I} = (V, E, Q, \Phi)$ to $\mathcal{I}' = (V', E', Q, \Phi')$ into two steps: we first update $\mathcal{I} = (V, E, Q, \Phi)$ to
$$\mathcal{I}_{\mathrm{mid}} = (V, E, Q, \Phi^{\mathrm{mid}}), \tag{8}$$
where the potentials $\Phi^{\mathrm{mid}} = (\phi^{\mathrm{mid}}_a)_{a \in V \cup E}$ in the middle instance $\mathcal{I}_{\mathrm{mid}}$ are defined as
$$\forall a \in V \cup E, \quad \phi^{\mathrm{mid}}_a \triangleq \begin{cases} \phi'_a & \text{if } a \in V' \cup E', \\ \phi_a & \text{if } a \notin V' \cup E'; \end{cases}$$
then we update $\mathcal{I}_{\mathrm{mid}} = (V, E, Q, \Phi^{\mathrm{mid}})$ to $\mathcal{I}' = (V', E', Q, \Phi')$. In other words, the update from $\mathcal{I}$ to $\mathcal{I}_{\mathrm{mid}}$ is caused only by updating the potentials of vertices and edges, while the underlying graph remains unchanged; and the update from $\mathcal{I}_{\mathrm{mid}}$ to $\mathcal{I}'$ is caused only by updating the underlying graph, i.e. adding vertices, deleting vertices, adding edges, and deleting edges.

The dynamic Gibbs sampling algorithm can be outlined as follows.
• UpdateHamiltonian: update $X_0$ and $\langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ to a new initial state $Z_0$ and a new execution log $\text{Exe-Log}(\mathcal{I}_{\mathrm{mid}}, T) = \langle u_t, Z_t(u_t)\rangle_{t=1}^{T}$ such that the random process $(Z_t)_{t=0}^{T}$ is the Gibbs sampling chain on instance $\mathcal{I}_{\mathrm{mid}}$.
• UpdateGraph: update $Z_0$ and $\langle u_t, Z_t(u_t)\rangle_{t=1}^{T}$ to a new initial state $Y_0$ and a new execution log $\text{Exe-Log}(\mathcal{I}', T) = \langle v'_t, Y_t(v'_t)\rangle_{t=1}^{T}$ such that the random process $(Y_t)_{t=0}^{T}$ is the Gibbs sampling chain on instance $\mathcal{I}'$.
• LengthFix: change the length of the execution log $\langle v'_t, Y_t(v'_t)\rangle_{t=1}^{T}$ from $T$ to $T'$, where $T' = T(\mathcal{I}')$ and $T(\mathcal{I}')$ satisfies (6). A runnable sketch of this subroutine follows the pseudocode below.

The dynamic Gibbs sampling algorithm is given in Algorithm 2.

Algorithm 2: Dynamic Gibbs sampling
  Data: $X_0 \in Q^V$ and $\text{Exe-Log}(\mathcal{I}, T) = \langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ for the current $\mathcal{I} = (V, E, Q, \Phi)$.
  Update: an update that modifies $\mathcal{I}$ to $\mathcal{I}' = (V', E', Q, \Phi')$.
  1 compute $T' = T(\mathcal{I}')$ satisfying (6) and construct $\mathcal{I}_{\mathrm{mid}} = (V, E, Q, \Phi^{\mathrm{mid}})$ as in (8);
  2 $\big(Z_0, \langle u_t, Z_t(u_t)\rangle_{t=1}^{T}\big) \leftarrow \text{UpdateHamiltonian}\big(\mathcal{I}, \mathcal{I}_{\mathrm{mid}}, X_0, \langle v_t, X_t(v_t)\rangle_{t=1}^{T}\big)$;  // update the potentials: $\mathcal{I} \to \mathcal{I}_{\mathrm{mid}}$
  3 $\big(Y_0, \langle v'_t, Y_t(v'_t)\rangle_{t=1}^{T}\big) \leftarrow \text{UpdateGraph}\big(\mathcal{I}_{\mathrm{mid}}, \mathcal{I}', Z_0, \langle u_t, Z_t(u_t)\rangle_{t=1}^{T}\big)$;  // update the underlying graph: $\mathcal{I}_{\mathrm{mid}} \to \mathcal{I}'$
  4 $\big(Y_0, \langle v'_t, Y_t(v'_t)\rangle_{t=1}^{T'}\big) \leftarrow \text{LengthFix}\big(\mathcal{I}', Y_0, \langle v'_t, Y_t(v'_t)\rangle_{t=1}^{T}, T'\big)$, where $T' = T(\mathcal{I}')$;  // change the length of the execution log from $T$ to $T' = T(\mathcal{I}')$
  5 update the data to $Y_0$ and $\text{Exe-Log}(\mathcal{I}', T') = \langle v'_t, Y_t(v'_t)\rangle_{t=1}^{T'}$;

Algorithm 3: LengthFix$\big(\mathcal{I}, X_0, \langle v_t, X_t(v_t)\rangle_{t=1}^{T}, T'\big)$
  Data: $X_0 \in Q^V$ and $\text{Exe-Log}(\mathcal{I}, T) = \langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ for the current $\mathcal{I} = (V, E, Q, \Phi)$.
  Input: the new length $T' > 0$.
  1 if $T' < T$ then
  2   truncate $\langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ to $\langle v_t, X_t(v_t)\rangle_{t=1}^{T'}$;
  3 else
  4   extend $\langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ to $\langle v_t, X_t(v_t)\rangle_{t=1}^{T'}$ by simulating the Gibbs sampling chain on $\mathcal{I}$ for $T' - T$ more steps;
  5 update the data to $X_0$ and $\text{Exe-Log}(\mathcal{I}, T') = \langle v_t, X_t(v_t)\rangle_{t=1}^{T'}$;
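The following is a minimal Python rendering of Algorithm 3, assuming the log is stored as a plain list; the helper `gibbs_step` is an assumed interface to one step of Algorithm 1, not part of the paper's pseudocode.

```python
def length_fix(I, X0, exe_log, T_new, gibbs_step):
    """Algorithm 3 (sketch): change the log length from T = len(exe_log) to T_new.

    gibbs_step(I, X) -> (v, value) is an assumed single-step Gibbs update on
    instance I at state X: pick a uniform random vertex v and resample X(v)
    from its conditional marginal.
    """
    T = len(exe_log)
    if T_new < T:
        exe_log = exe_log[:T_new]      # truncate <v_t, X_t(v_t)> to T_new steps
    else:
        X = dict(X0)                   # replay the log to recover X_T
        for v, val in exe_log:
            X[v] = val
        for _ in range(T_new - T):     # extend by simulating T_new - T steps
            v, val = gibbs_step(I, X)
            X[v] = val
            exe_log.append((v, val))
    return X0, exe_log
```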
The subroutine LengthFix is given in Algorithm 3. We next describe UpdateHamiltonian (Section 6.1.1) and UpdateGraph (Section 6.1.2).

6.1.1. Coupling for Hamiltonian update. We consider the update that changes the potentials of vertices and edges. The update does not change the underlying graph. Let $\mathcal{I} = (V, E, Q, \Phi)$ be the current MRF instance. Let $X_0$ and $\langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ be the current initial state and execution log, such that the random process $(X_t)_{t=0}^{T}$ is the Gibbs sampling chain on instance $\mathcal{I}$. Upon such an update, the new instance becomes $\mathcal{I}' = (V, E, Q, \Phi')$. The algorithm $\text{UpdateHamiltonian}\big(\mathcal{I}, \mathcal{I}', X_0, \langle v_t, X_t(v_t)\rangle_{t=1}^{T}\big)$ updates the data to $Y_0$ and $\langle v_t, Y_t(v_t)\rangle_{t=1}^{T}$ such that the random process $(Y_t)_{t=0}^{T}$ is the Gibbs sampling chain on instance $\mathcal{I}'$.

We transform the pair of $X_0 \in Q^V$ and $\langle v_t, X_t(v_t)\rangle_{t=1}^{T}$ into a new pair of $Y_0 \in Q^V$ and $\langle v_t, Y_t(v_t)\rangle_{t=1}^{T}$ for $\mathcal{I}'$. This is achieved as follows: the vertex sequence $(v_t)_{t=1}^{T}$ is identically coupled, and the chain $(X_t)_{t=0}^{T}$ is transformed into $(Y_t)_{t=0}^{T}$ by the following one-step local coupling between $X$ and $Y$.

Definition 6.3 (one-step local coupling for Hamiltonian update). The two chains $(X_t)_{t=0}^{\infty}$ on instance $\mathcal{I} = (V, E, Q, \Phi)$ and $(Y_t)_{t=0}^{\infty}$ on instance $\mathcal{I}' = (V, E, Q, \Phi')$ are coupled as follows:
• initially $X_0 = Y_0 \in Q^V$;
• for $t = 1, 2, \ldots$, the two chains $X$ and $Y$ jointly do:
  (1) pick the same $v_t \in V$, and let $(X_t(u), Y_t(u)) \leftarrow (X_{t-1}(u), Y_{t-1}(u))$ for all $u \in V \setminus \{v_t\}$;
  (2) sample $(X_t(v_t), Y_t(v_t))$ from a coupling $\mathcal{D}^{\sigma,\tau}_{\mathcal{I}_{v_t}, \mathcal{I}'_{v_t}}(\cdot,\cdot)$ of the marginal distributions $\mu_{v_t,\mathcal{I}}(\cdot \mid \sigma)$ and $\mu_{v_t,\mathcal{I}'}(\cdot \mid \tau)$, with $\sigma = X_{t-1}(\Gamma_G(v_t))$ and $\tau = Y_{t-1}(\Gamma_G(v_t))$, where $G = (V, E)$.

The local coupling $\mathcal{D}^{\sigma,\tau}_{\mathcal{I}_v, \mathcal{I}'_v}(\cdot,\cdot)$ for the Hamiltonian update is specified as follows.

Definition 6.4 (local coupling $\mathcal{D}^{\sigma,\tau}_{\mathcal{I}_v, \mathcal{I}'_v}(\cdot,\cdot)$ for Hamiltonian update). Let $v \in V$ be a vertex and $\sigma, \tau \in Q^{\Gamma_G(v)}$ two configurations, where $G = (V, E)$. We say a random pair $(c, c') \in Q^2$ is drawn from the coupling $\mathcal{D}^{\sigma,\tau}_{\mathcal{I}_v, \mathcal{I}'_v}(\cdot,\cdot)$ if $(c, c')$ is generated by the following two steps:
• sampling step: sample $(c, c') \in Q^2$ jointly from an optimal coupling $\mathcal{D}^{\sigma,\tau}_{\mathrm{opt},\mathcal{I}_v}$ of the marginal distributions $\mu_{v,\mathcal{I}}(\cdot \mid \sigma)$ and $\mu_{v,\mathcal{I}}(\cdot \mid \tau)$, such that $c \sim \mu_{v,\mathcal{I}}(\cdot \mid \sigma)$ and $c' \sim \mu_{v,\mathcal{I}}(\cdot \mid \tau)$;
• resampling step: flip a coin independently with the probability of HEADS being
$$p^{\tau}_{\mathcal{I}_v, \mathcal{I}'_v}(c') \triangleq \begin{cases} 0 & \text{if } \mu_{v,\mathcal{I}}(c' \mid \tau) \le \mu_{v,\mathcal{I}'}(c' \mid \tau), \\[2pt] \dfrac{\mu_{v,\mathcal{I}}(c' \mid \tau) - \mu_{v,\mathcal{I}'}(c' \mid \tau)}{\mu_{v,\mathcal{I}}(c' \mid \tau)} & \text{otherwise}; \end{cases} \tag{9}$$
if the outcome of the coin flip is HEADS, resample $c'$ from the distribution $\nu^{\tau}_{\mathcal{I}_v, \mathcal{I}'_v}$ independently, where the distribution $\nu^{\tau}_{\mathcal{I}_v, \mathcal{I}'_v}$ is defined as
$$\forall b \in Q: \quad \nu^{\tau}_{\mathcal{I}_v, \mathcal{I}'_v}(b) \triangleq \frac{\max\left\{0,\, \mu_{v,\mathcal{I}'}(b \mid \tau) - \mu_{v,\mathcal{I}}(b \mid \tau)\right\}}{\sum_{x \in Q} \max\left\{0,\, \mu_{v,\mathcal{I}}(x \mid \tau) - \mu_{v,\mathcal{I}'}(x \mid \tau)\right\}}. \tag{10}$$

Lemma 6.5. $\mathcal{D}^{\sigma,\tau}_{\mathcal{I}_v, \mathcal{I}'_v}(\cdot,\cdot)$ in Definition 6.4 is a valid coupling between $\mu_{v,\mathcal{I}}(\cdot \mid \sigma)$ and $\mu_{v,\mathcal{I}'}(\cdot \mid \tau)$.

By Lemma 6.5, the resulting $(Y_t)_{t=0}^{T}$ is a faithful copy of the Gibbs sampling chain on instance $\mathcal{I}'$, assuming that $(X_t)_{t=0}^{T}$ is such a chain on instance $\mathcal{I}$.

Next we give an upper bound on the probability $p^{\tau}_{\mathcal{I}_v, \mathcal{I}'_v}(\cdot)$ defined in (9).

Lemma 6.6. For any two instances $\mathcal{I} = (V, E, Q, \Phi)$ and $\mathcal{I}' = (V, E, Q, \Phi')$ of the MRF model, and any $v \in V$, $c \in Q$ and $\tau \in Q^{\Gamma_G(v)}$, it holds that
$$p^{\tau}_{\mathcal{I}_v, \mathcal{I}'_v}(c) \le 2\left(\|\phi_v - \phi'_v\|_1 + \sum_{e = \{u,v\} \in E} \|\phi_e - \phi'_e\|_1\right), \tag{11}$$
where $\|\phi_v - \phi'_v\|_1 = \sum_{c \in Q} |\phi_v(c) - \phi'_v(c)|$ and $\|\phi_e - \phi'_e\|_1 = \sum_{c,c' \in Q} |\phi_e(c,c') - \phi'_e(c,c')|$.
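The two-step construction of Definition 6.4 admits a direct implementation once the relevant conditional marginals at $v$ are available as explicit distributions over $Q$. Below is a hedged Python sketch: `mu_sigma`, `mu_tau`, and `mu_tau_new` stand for $\mu_{v,\mathcal{I}}(\cdot \mid \sigma)$, $\mu_{v,\mathcal{I}}(\cdot \mid \tau)$, and $\mu_{v,\mathcal{I}'}(\cdot \mid \tau)$ given as dicts, and the optimal (maximal) coupling is realized in the standard way by matching the common mass; the paper does not prescribe this particular realization.

```python
import random

def sample_from(dist):
    """Draw from a finite distribution given as {value: probability}."""
    r, acc = random.random(), 0.0
    for x, px in dist.items():
        acc += px
        if r <= acc:
            return x
    return x  # guard against floating-point slack

def optimal_coupling(p, q):
    """Maximal coupling over a finite domain: equal with prob 1 - d_TV(p, q)."""
    keys = set(p) | set(q)
    overlap = {x: min(p.get(x, 0.0), q.get(x, 0.0)) for x in keys}
    w = sum(overlap.values())
    if random.random() < w:
        c = sample_from({x: m / w for x, m in overlap.items()})
        return c, c
    # disagreement: sample the two residual distributions independently
    c = sample_from({x: (p.get(x, 0.0) - overlap[x]) / (1 - w) for x in keys})
    cp = sample_from({x: (q.get(x, 0.0) - overlap[x]) / (1 - w) for x in keys})
    return c, cp

def local_coupling(mu_sigma, mu_tau, mu_tau_new):
    """Definition 6.4 (sketch): returns (c, c') with c ~ mu_{v,I}(.|sigma)
    and c' ~ mu_{v,I'}(.|tau)."""
    # sampling step: optimally couple mu_{v,I}(.|sigma) with mu_{v,I}(.|tau)
    c, cp = optimal_coupling(mu_sigma, mu_tau)
    # resampling step (9): correct c' from mu_{v,I}(.|tau) to mu_{v,I'}(.|tau)
    p_head = max(0.0, (mu_tau[cp] - mu_tau_new.get(cp, 0.0)) / mu_tau[cp])
    if random.random() < p_head:
        # resample c' from nu in (10): the normalized surplus of I' over I
        keys = set(mu_tau) | set(mu_tau_new)
        surplus = {x: max(0.0, mu_tau_new.get(x, 0.0) - mu_tau.get(x, 0.0))
                   for x in keys}
        z = sum(surplus.values())
        cp = sample_from({x: s / z for x, s in surplus.items()})
    return c, cp
```

By Lemma 6.5, feeding the exact marginals at $v_t$ into such a routine at each step yields a pair with the correct marginals, and by (11) the resampling coin fires with probability at most $2\big(\|\phi_v - \phi'_v\|_1 + \sum_{e \ni v} \|\phi_e - \phi'_e\|_1\big)$, which is what makes the filtering of candidate steps cheap.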