Markoy Random Fields (MRF) Gibbs distribution u:Voe[g] network GE): H(o)Ae(ou,)b(o) e=(u,w)∈E ●proper q-coloring: X∈[q] ⑨bv 10 1 independent set: 4-且司-日 X∈[aly follows u local conflict colorings: [Fraigniaud,Heinrich,Kosowski FOCS'16] Ae∈{0,1}9×9,bm∈{0,1}9
Markov Random Fields network G(V,E): Ae bv Xv∈[q] u v X ~ 2 [q] V follows µ (MRF) • Gibbs distribution µ : ∀σ∈[q]V µ() / Y e=(u,v)2E Ae(u, v) Y v2V bv(v) • proper q-coloring: Ae = 2 6 6 6 4 0 0 ... 0 3 7 7 7 5 1 1 bv = 2 6 4 1 . . . 1 3 7 5 • independent set: bv = 1 1 Ae = 1 1 1 0 • local conflict colorings: [Fraigniaud, Heinrich, Kosowski FOCS’16] Ae 2 {0, 1}q⇥q, bv 2 {0, 1}q
A Motivation: Distributed Machine Learning ●Data are stored in a distributed system. 0 ·Sampling from a probabilistic graphical model (e.g.the Markov random field))by distributed algorithms
A Motivation: Distributed Machine Learning • Data are stored in a distributed system. • Sampling from a probabilistic graphical model (e.g. the Markov random field) by distributed algorithms
Glauber Dynamics starting from an arbitrary Xo E [g] G(V,E) transition for XX+1: pick a uniform random vertex v; resample X(v)according to the marginal distribution induced by u at vertex v conditioning on X(N(v)); marginal distribution: bo()IIEN()A(.)(Xu:) PrlX,=z|Xwol=ehb.IexoAeX, MRF:o∈g', u(o)xΠAe(o,o)Πb(a) v∈V stationary distribution:u e=(u,v)∈E mixing time::Tmix=max min{t|drv(Xt,))≤2e} Xo
Glauber Dynamics G(V,E): pick a uniform random vertex v; resample X(v) according to the marginal distribution induced by µ at vertex v conditioning on Xt(N(v)); starting from an arbitrary X0 ∈ [q]V transition for Xt → Xt+1 : marginal distribution: Pr[Xv = x | XN(v)] = bv(x) Q u2N(v) A(u,v)(Xu, x) P y2[q] bv(y) Q u2N(v) A(u,v)(Xu, y) Ae bv v µ() / Y e=(u,v)2E Ae(u, v) Y v2V bv(v) MRF: 8 2 [q] V , stationary distribution: µ mixing time: ⌧mix = max X0 min t | dTV(Xt, µ) 1 2e
Mixing of Glauber Dynamics influence matrix [Po,u,uEv: Pv.u:max discrepancy (in total variation distance)of marginal distributions at v caused by any pair o,t of boundary conditions that differ only at u Dobrushin's condition: contraction of one-step Ipl=∑Pee≤1-e optimal coupling in the worst u∈V case w.r.t.Hamming distance Theorem (Dobrushin'70:Salas,Sokal'97): Dobrushin's Tmix =O(nlogn) condition for Glauber dynamics for g-coloring: Dobrushin's 92(2+ε)△ condition △=max-degree
Mixing of Glauber Dynamics for q-coloring: Dobrushin’s q≥(2+ε)Δ condition Δ = max-degree u v influence matrix {⇢v,u }v,u2V : ρv,u: max discrepancy (in total variation distance) of marginal distributions at v caused by any pair σ,τ of boundary conditions that differ only at u Dobrushin’s condition: k⇢k1 = max v2V X u2V ⇢v,u 1 ✏ contraction of one-step optimal coupling in the worst case w.r.t. Hamming distance Theorem (Dobrushin ’70; Salas, Sokal ’97): Dobrushin’s condition for Glauber dynamics ⌧mix = O (n log n)
Parallelization Glauber dynamics: starting from an arbitrary Xo E [g] G(V,E): transition for XX+1: pick a uniform random vertex v; resample X(v)according to the marginal distribution induced by u at vertex v conditioning on Xi(N(v)); Parallelization: ● Chromatic scheduler [folklore][Gonzalez et al.,AISTAT'11]: Vertices in the same color class are updated in parallel. "Hogwild!"[Niu,Recht,Re,Wright,NIPS'11]De Sa,Olukotun,Re,ICML16]: All vertices are updated in parallel,ignoring concurrency issues
Parallelization G(V,E): v Glauber dynamics: Parallelization: • Chromatic scheduler [folklore] [Gonzalez et al., AISTAT’11]: Vertices in the same color class are updated in parallel. • “Hogwild!” [Niu, Recht, Ré, Wright, NIPS’11][De Sa, Olukotun, Ré, ICML’16]: All vertices are updated in parallel, ignoring concurrency issues. pick a uniform random vertex v; resample X(v) according to the marginal distribution induced by µ at vertex v conditioning on Xt(N(v)); starting from an arbitrary X0 ∈ [q]V transition for Xt → Xt+1 :