THE GEOMETRIC FOUNDATIONS OF HAMILTONIAN MONTE CARLO

This differential structure allows us to define calculus on manifolds by applying concepts from real analysis in each chart. The differential properties of a function $f : Q \to \mathbb{R}$, for example, can be studied by considering the entirely-real functions, $f \circ \psi_{\alpha}^{-1} : \mathbb{R}^{n} \to \mathbb{R}$; because the charts are smooth in their overlap, these local properties define a consistent global definition of smoothness.

Ultimately these properties manifest as geometric objects on $Q$, most importantly vector fields and differential $k$-forms. Informally, vector fields specify directions and magnitudes at each point in the manifold while $k$-forms define multilinear, antisymmetric maps of $k$ such vectors to $\mathbb{R}$. If we consider $n$ linearly-independent vector fields as defining infinitesimal parallelepipeds at every point in space, then the action of $n$-forms provides a local sense of volume and, consequently, integration. In particular, when the manifold is orientable we can define $n$-forms that are everywhere positive and a geometric notion of a measure.

Here we consider the probabilistic interpretation of these volume forms, first on smooth manifolds in general and then on smooth manifolds with additional structure: fiber bundles, Riemannian manifolds, and symplectic manifolds. Symplectic manifolds will be particularly important as they naturally provide measure-preserving flows. Proofs of intermediate lemmas are presented in Appendix A.

2.1 Smooth Measures on Generic Smooth Manifolds

Formally, volume forms are defined as positive, top-rank differential forms,
\[
\mathcal{M}(Q) \equiv \left\{ \mu \in \Omega^{n}(Q) \mid \mu_{q} > 0, \; \forall q \in Q \right\},
\]
where $\Omega^{n}(Q)$ is the space of $n$-forms on $Q$. By leveraging the local equivalence to Euclidean space, we can show that these volume forms satisfy all of the properties of $\sigma$-finite measures on $Q$ (Figure 6).

Lemma 1. If $Q$ is a positively-oriented, smooth manifold then $\mathcal{M}(Q)$ is non-empty and its elements are $\sigma$-finite measures on $Q$.
We will refer to elements of $\mathcal{M}(Q)$ as smooth measures on $Q$. Because of the local compactness of $Q$, the elements of $\mathcal{M}(Q)$ are not just measures but also Radon measures. As expected from the Riesz Representation Theorem (Folland, 1999), any such element also serves as a linear functional via the usual geometric definition of integration,
\[
\mu : L^{1}(Q, \mu) \to \mathbb{R}, \qquad f \mapsto \int_{Q} f \, \mu .
\]
BETANCOURT ET AL.

Fig 6. [Panels: (a) $U_{2} \subset Q$; (b) $\psi_{2}(U_{2}) \subset \mathbb{R}^{2}$.] (a) In the neighborhood of a chart, any top-rank differential form is specified by its density, $\mu(q^{1}, \ldots, q^{n})$, with respect to the coordinate volume, $\mu = \mu(q^{1}, \ldots, q^{n}) \, dq^{1} \wedge \ldots \wedge dq^{n}$, (b) which pushes forward to a density with respect to the Lebesgue measure in the image of the corresponding chart. By smoothly patching together these equivalences, Lemma 1 demonstrates that these forms are in fact measures.

Consequently, $(Q, \mathcal{M}(Q))$ is also a Radon space, which guarantees the existence of various probabilistic objects such as disintegrations as discussed below.

Ultimately we are not interested in the whole of $\mathcal{M}(Q)$ but rather $\mathcal{P}(Q)$, the subset of volume forms with unit integral,
\[
\mathcal{P}(Q) = \left\{ \varpi \in \mathcal{M}(Q) \;\middle|\; \int_{Q} \varpi = 1 \right\},
\]
which serve as probability measures. Because we can always normalize finite measures, $\mathcal{P}(Q)$ is equivalent to the finite elements of $\mathcal{M}(Q)$,
\[
\mathcal{M}_{f}(Q) = \left\{ \varpi \in \mathcal{M}(Q) \;\middle|\; \int_{Q} \varpi < \infty \right\},
\]
modulo their normalizations.

Corollary 2. If $Q$ is a positively-oriented, smooth manifold then $\mathcal{M}_{f}(Q)$, and hence $\mathcal{P}(Q)$, is non-empty.

Proof. Because the manifold is paracompact, the prototypical measure constructed in Lemma 1 can always be chosen such that the measure of the entire manifold is finite.
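The passage from a finite smooth measure to a probability measure can be sketched numerically. The example below is only an illustration under stated assumptions: the manifold is taken to be the circle with a single angle chart on $(0, 2\pi)$ (which omits one point of measure zero), the positive density `mu`, the constant `KAPPA`, and the midpoint quadrature in `integrate_on_circle` are all hypothetical choices that do not appear in the text.

```python
import math


def integrate_on_circle(density, n=2000):
    """Midpoint-rule integral of density(theta) dtheta over the chart (0, 2*pi)."""
    h = 2.0 * math.pi / n
    return sum(density((i + 0.5) * h) for i in range(n)) * h


KAPPA = 2.0  # hypothetical concentration parameter, chosen only for illustration


def mu(theta):
    """Everywhere-positive chart density of an unnormalized smooth measure."""
    return math.exp(KAPPA * math.cos(theta))


# The total mass is finite, so mu is an element of M_f(S^1).
total_mass = integrate_on_circle(mu)


def varpi(theta):
    """Normalized density: the corresponding element of P(S^1)."""
    return mu(theta) / total_mass


unit_mass = integrate_on_circle(varpi)
print(total_mass, unit_mass)
```

Dividing by the total mass is exactly the "modulo their normalizations" identification: every positive multiple of `mu` yields the same `varpi`.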
2.2 Smooth Measures on Fiber Bundles

Although conditional probability measures are ubiquitous in statistical methodology, they are notoriously subtle objects to rigorously construct in theory (Halmos, 1950). Formally, a conditional probability measure appeals to a measurable function between two generic spaces, $F : R \to S$, to define measures on $R$ for some subsets of $S$ along with an abundance of technicalities. It is only when $S$ is endowed with the quotient topology relative to $F$ (Folland, 1999; Lee, 2011) that we can define regular conditional probability measures that shed many of the technicalities and align with common intuition. In practice, regular conditional probability measures are most conveniently constructed as disintegrations (Chang and Pollard, 1997; Leão Jr, Fragoso and Ruffino, 2004).

Fiber bundles are smooth manifolds endowed with a canonical map and the quotient topology necessary to admit canonical disintegrations and, consequently, the geometric equivalent of conditional and marginal probability measures.

2.2.1 Fiber Bundles

A smooth fiber bundle, $\pi : Z \to Q$, combines an $(n + k)$-dimensional total space, $Z$, an $n$-dimensional base space, $Q$, and a smooth projection, $\pi$, that submerses the total space into the base space. We will refer to a positively-oriented fiber bundle as a fiber bundle in which both the total space and the base space are positively-oriented and the projection operator is orientation-preserving.

Each fiber,
\[
Z_{q} = \pi^{-1}(q),
\]
is itself a $k$-dimensional manifold isomorphic to a common fiber space, $F$, and is naturally immersed into the total space,
\[
\iota_{q} : Z_{q} \hookrightarrow Z,
\]
where $\iota_{q}$ is the inclusion map. We will make heavy use of the fact that there exists a trivializing cover of the base space, $\{U_{\alpha}\}$, along with subordinate charts and a partition of unity, where the corresponding total space is isomorphic to a trivial product (Figures 7, 8),
\[
\pi^{-1}(U_{\alpha}) \approx U_{\alpha} \times F.
\]
Vector fields on $Z$ are classified by their action under the projection operator. Vertical vector fields, $Y_{i}$, lie in the kernel of the projection operator,
\[
\pi_{*} Y_{i} = 0,
\]
while horizontal vector fields, $\tilde{X}_{i}$, push forward to the tangent space of the base space,
\[
\pi_{*} \tilde{X}_{i}(z) = X_{i}(\pi(z)) \in T_{\pi(z)} Q,
\]
where $z \in Z$ and $\pi(z) \in Q$. Horizontal forms are forms on the total space that vanish when contracted against one or more vertical vector fields.
Fig 7. In a local neighborhood, the total space of a fiber bundle, $\pi^{-1}(U_{\alpha}) \subset Z$, is equivalent to attaching a copy of some common fiber space, $F$, to each point of the base space, $q \in U_{\alpha} \subset Q$. Under the projection operator each fiber, $Z_{q} = \pi^{-1}(q) \approx F$, projects back to the point at which it is attached.

Fig 8. [Panels: (a) $S^{1} \times \mathbb{R}$; (b) $\pi^{-1}(U_{\alpha}) \sim U_{\alpha} \times \mathbb{R}$, $U_{\alpha} \subset S^{1}$.] (a) The canonical projection, $\pi : S^{1} \times \mathbb{R} \to S^{1}$, gives the cylinder the structure of a fiber bundle with fiber space $F = \mathbb{R}$. (b) The domain of each chart becomes isomorphic to the product of a neighborhood of the base space, $U_{\alpha} \subset S^{1}$, and the fiber, $\mathbb{R}$.
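The cylinder of Figure 8 is simple enough to check the vertical and horizontal classification by hand. The sketch below assumes local coordinates $(\theta, r)$ on $S^{1} \times \mathbb{R}$, in which the Jacobian of the projection is the row matrix $[1, 0]$; the function names and the particular vectors are hypothetical and serve only as an illustration.

```python
import math


def projection(z):
    """pi : S^1 x R -> S^1, written in coordinates as (theta, r) -> theta."""
    theta, _r = z
    return theta % (2.0 * math.pi)


def pushforward(v):
    """pi_* applied to a tangent vector with components (v_theta, v_r).

    In these coordinates the Jacobian of pi is the 1 x 2 matrix [1, 0],
    so the pushforward keeps the theta component and drops the r component.
    """
    v_theta, _v_r = v
    return v_theta


# d/dr is tangent to the fibers, hence vertical: it lies in the kernel of pi_*.
vertical = (0.0, 1.0)

# Two different vectors on the total space with the same pushforward d/dtheta:
# adding any vertical component does not change the image under pi_*.
lift_a = (1.0, 0.0)
lift_b = (1.0, 3.5)

# Every point of the fiber over theta = 0.75 projects to the same base point.
fiber_points = [(0.75, r) for r in (-2.0, 0.0, 5.0)]

print(pushforward(vertical), pushforward(lift_a), pushforward(lift_b))
print([projection(z) for z in fiber_points])
```

That `lift_a` and `lift_b` share a pushforward shows concretely why a vector field on the base does not single out one horizontal field on the total space.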
Note that vector fields on the base space do not uniquely define horizontal vector fields on the total space; a choice of $\tilde{X}_{i}$ consistent with $X_{i}$ is called a horizontal lift of $X_{i}$. More generally we will refer to the lift of an object on the base space as the selection of some object on the total space that pushes forward to the corresponding object on the base space.

2.2.2 Disintegrating Fiber Bundles

Because $Z$ and $Q$ are both smooth manifolds, and hence Radon spaces, the structure of the fiber bundle guarantees the existence of disintegrations with respect to the projection operator (Leão Jr, Fragoso and Ruffino, 2004; Simmons, 2012; Censor and Grandini, 2014) and, under certain regularity conditions, regular conditional probability measures. A substantial benefit of working with smooth manifolds is that we can not only prove the existence of disintegrations but also explicitly construct their geometric equivalents and utilize them in practice.

Definition 1. Let $(R, \mathcal{B}(R))$ and $(S, \mathcal{B}(S))$ be two measurable spaces with the respective $\sigma$-finite measures $\mu_{R}$ and $\mu_{S}$, and a measurable map, $F : R \to S$, between them. A disintegration of $\mu_{R}$ with respect to $F$ and $\mu_{S}$ is a map,
\[
\nu : S \times \mathcal{B}(R) \to \mathbb{R}^{+},
\]
such that

i. $\nu(s, \cdot)$ is a $\mathcal{B}(R)$-finite measure concentrating on the level set $F^{-1}(s)$, i.e. for $\mu_{S}$-almost all $s$,
\[
\nu(s, A) = 0, \quad \forall A \in \mathcal{B}(R) \mid A \cap F^{-1}(s) = \emptyset,
\]

and for any positive, measurable function $f \in L^{1}(R, \mu_{R})$,

ii. $s \mapsto \int_{R} f(r) \, \nu(s, dr)$ is a measurable function for all $s \in S$.

iii. $\int_{R} f(r) \, \mu_{R}(dr) = \int_{S} \int_{F^{-1}(s)} f(r) \, \nu(s, dr) \, \mu_{S}(ds)$.

In other words, a disintegration is an unnormalized Markov kernel that concentrates on the level sets of $F$ instead of the whole of $R$ (Figure 9). Moreover, if $\mu_{R}$ is finite or $F$ is proper then the pushforward measure,
\[
\mu_{S} = F_{*} \mu_{R}, \qquad \mu_{S}(B) = \mu_{R}\left( F^{-1}(B) \right), \; \forall B \in \mathcal{B}(S),
\]
is $\sigma$-finite and known as the marginalization of $\mu_{R}$ with respect to $F$.
In this case the disintegration of $\mu_{R}$ with respect to its pushforward measure becomes a normalized kernel and exactly a regular conditional probability measure. The classic marginalization paradoxes of measure theory (Dawid, Stone and Zidek, 1973) occur when the pushforward of $\mu_{R}$ is not $\sigma$-finite and the corresponding disintegration, let alone a regular conditional probability measure, does not exist; we will be careful to explicitly exclude such cases here.

For the smooth manifolds of interest we do not need the full generality of disintegrations, and instead consider the equivalent object restricted to smooth measures.
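A small numerical sketch may help make the decomposition of an integral into a marginal and a normalized kernel concrete. Everything in it is an assumption made for illustration rather than part of the text: the space is the plane with the coordinate projection $F(x, y) = x$, the finite measure has the hypothetical density `g`, the positive test function is `f`, and integrals are approximated by a midpoint rule on the truncated box $[-8, 8]^{2}$.

```python
import math


def g(x, y):
    """Density of a finite smooth measure mu_R on R^2 (unnormalized, correlated)."""
    return math.exp(-(x * x - x * y + y * y))


def f(x, y):
    """A positive test function, integrable against g."""
    return 1.0 + x * x * y * y


# Midpoint grid on a truncated interval; g is negligible outside of it.
LO, HI, N = -8.0, 8.0, 200
H = (HI - LO) / N
GRID = [LO + (i + 0.5) * H for i in range(N)]


def marginal(x):
    """Density of the pushforward mu_S = F_* mu_R at x, where F(x, y) = x."""
    return sum(g(x, y) for y in GRID) * H


def kernel_mass(x):
    """Total mass of the normalized kernel nu(x, dy) = g(x, y) / marginal(x) dy."""
    m = marginal(x)
    return sum(g(x, y) / m for y in GRID) * H


def fiber_integral(x):
    """Integral of f over the level set F^{-1}(x) against nu(x, dy)."""
    m = marginal(x)
    return sum(f(x, y) * g(x, y) / m for y in GRID) * H


# Left side: integrate f directly against mu_R.
lhs = sum(f(x, y) * g(x, y) for x in GRID for y in GRID) * H * H

# Right side: integrate over each fiber first, then against the marginal.
rhs = sum(fiber_integral(x) * marginal(x) for x in GRID) * H

print(lhs, rhs, kernel_mass(0.3))
```

Because the kernel is taken with respect to the pushforward measure, each `kernel_mass(x)` is one up to rounding, which is the sense in which the disintegration becomes a regular conditional probability measure.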