structures and properties,will include many cross-terms (e.g.stretch-bend):MM311-13 and MM414-16 are examples.Quantum mechanical calculations may give a guide to the"best" molecular force-field;also comparison of simulation results with thermophysical proper- ties and vibration frequencies is invaluable in force-field development and refinement.A separate family of force fields,such as AMBER7.18,CHARMM19 and OPLS20 are geared more to larger molecules(proteins,polymers)in condensed phases;their functional form is simpler,closer to that of Eq.(5),and their parameters are typically determined by quan- tum chemical calculations combined with thermophysical and phase coexistence data.This field is too broad to be reviewed here;several molecular modelling texts21-23(albeit target- ted at biological applications)should be consulted by the interested reader.The modelling 200 150 FENE+WCA 50 FENE ··WCA --harmonic 0 0.5 1.5 r/σ Figure 4.The FENE+WCA potential,with its separate FENE (attractive)and WCA (repulsive)components, between bonded atoms in a coarse-grained polymer chain.Also shown is the equivalent harmonic potential. Unlike the harmonic spring,the FENE potential cannot be extended beyond a specified limit,here Ro=1.5. For more details see Ref.24. of long chain molecules will be of particular interest to us,especially as an illustration of the scope for progressively simplifying and"coarse-graining"the potential model.Var- ious explicit-atom potentials have been devised for the n-alkanes25.More approximate potentials have also been constructed26-28 in which the CH2 and CHa units are represented by single "united atoms".These potentials are typically less accurate and less transfer- able than the explicit-atom potentials,but significantly less expensive;comparisons have been made between the two approaches29.For more complicated molecules this approach may need to be modified.In the liquid crystal field,for instance,a compromise has been suggested30:use the united-atom approach for hydrocarbon chains,but model phenyl ring hydrogens explicitly. In polymer simulations,there is frequently a need to economize further and coarse- grain the interactions more dramatically:significant progress has been made in recent years in approaching this problem systematically31.32.Finally,the most fundamental prop- erties,such as the entanglement length in a polymer melt33,may be investigated using a 5
structures and properties, will include many cross-terms (e.g. stretch-bend): MM311–13 and MM414–16 are examples. Quantum mechanical calculations may give a guide to the “best” molecular force-field; also comparison of simulation results with thermophysical properties and vibration frequencies is invaluable in force-field development and refinement. A separate family of force fields, such as AMBER17, 18 , CHARMM19 and OPLS20 are geared more to larger molecules (proteins, polymers) in condensed phases; their functional form is simpler, closer to that of Eq. (5), and their parameters are typically determined by quantum chemical calculations combined with thermophysical and phase coexistence data. This field is too broad to be reviewed here; several molecular modelling texts21–23 (albeit targetted at biological applications) should be consulted by the interested reader. The modelling 0 0.5 1 1.5 r / σ 0 50 100 150 200 v / ε FENE+WCA FENE WCA harmonic Figure 4. The FENE+WCA potential, with its separate FENE (attractive) and WCA (repulsive) components, between bonded atoms in a coarse-grained polymer chain. Also shown is the equivalent harmonic potential. Unlike the harmonic spring, the FENE potential cannot be extended beyond a specified limit, here R0 = 1.5σ. For more details see Ref. 24. of long chain molecules will be of particular interest to us, especially as an illustration of the scope for progressively simplifying and “coarse-graining” the potential model. Various explicit-atom potentials have been devised for the n-alkanes25 . More approximate potentials have also been constructed26–28 in which the CH2 and CH3 units are represented by single “united atoms”. These potentials are typically less accurate and less transferable than the explicit-atom potentials, but significantly less expensive; comparisons have been made between the two approaches29 . For more complicated molecules this approach may need to be modified. In the liquid crystal field, for instance, a compromise has been suggested30: use the united-atom approach for hydrocarbon chains, but model phenyl ring hydrogens explicitly. In polymer simulations, there is frequently a need to economize further and coarsegrain the interactions more dramatically: significant progress has been made in recent years in approaching this problem systematically31, 32 . Finally, the most fundamental properties, such as the entanglement length in a polymer melt33 , may be investigated using a 5
simple chain of pseudo-atoms or beads(modelled using the WCA potential of Fig.2,and each representing several monomers),joined by an attractive finitely-extensible non-linear elastic (FENE)potential24 which is illustrated in Fig.4. FENE(r)= -kRo In(1-(r/Ro)2)r<Ro (6) T≥Ro The key feature of this potential is that it cannot be extended beyond r=Ro,ensuring(for suitable choices of the parameters k and Ro)that polymer chains cannot move through one another. 2.3 Force Calculation Having specified the potential energy function (rN).the next step is to calculate the atomic forces =-0uy For site-site potentials this is a simple exercise.For the intramolecular part of the potential, it is a little more involved,but still a relatively straightforward application of the chain rule.Examples of how to do it are given in appendix C of Ref.34.As a simple illustration, consider one of the bending potential terms for the polymer of Fig.3.supposing that it may be written v=-kc0s0234=-k(r23·r23)-1/2(r34·r34)-1/2(r23·r34) This will contribute to the forces on all three atoms.To calculate these,we need: 0 -(r23·r34)=T34 0r -(r23T34)=T23-T34 0 O -(r23·T34)=-T23 gr23=2r2 分r23·r23)=-2r23 mr23r2a)=0 元4r3=0 8r94·r34)=2r3别 [r94r3a)=-2r3 and hence cos0234=r23r3 T23·T3 0r T34- T -T23 0 ∂r c0s0234=r23r3 T23:T34r23 T T23T34r31+T23-T34 T34 COs0234=23 T34 T23·T34 0r4 2T34-T23 T34 A similar approach applied to the torsional potential gives the forces on all four involved atoms. 6
simple chain of pseudo-atoms or beads (modelled using the WCA potential of Fig. 2, and each representing several monomers), joined by an attractive finitely-extensible non-linear elastic (FENE) potential24 which is illustrated in Fig. 4. v FENE(r) = ( − 1 2 kR2 0 ln 1 − (r/R0) 2 r < R0 ∞ r ≥ R0 (6) The key feature of this potential is that it cannot be extended beyond r = R0, ensuring (for suitable choices of the parameters k and R0) that polymer chains cannot move through one another. 2.3 Force Calculation Having specified the potential energy function U(r N ), the next step is to calculate the atomic forces fi = − ∂ ∂ri U(r N ) For site-site potentials this is a simple exercise. For the intramolecular part of the potential, it is a little more involved, but still a relatively straightforward application of the chain rule. Examples of how to do it are given in appendix C of Ref. 34. As a simple illustration, consider one of the bending potential terms for the polymer of Fig. 3, supposing that it may be written v = −k cos θ234 = −k(r23 · r23) −1/2 (r34 · r34) −1/2 (r23 · r34) This will contribute to the forces on all three atoms. To calculate these, we need: ∂ ∂r2 (r23 · r34) = r34 ∂ ∂r3 (r23 · r34) = r23 − r34 ∂ ∂r4 (r23 · r34) = −r23 ∂ ∂r2 (r23 · r23) = 2r23 ∂ ∂r3 (r23 · r23) = −2r23 ∂ ∂r4 (r23 · r23) = 0 ∂ ∂r2 (r34 · r34) = 0 ∂ ∂r3 (r34 · r34) = 2r34 ∂ ∂r4 (r34 · r34) = −2r34 and hence ∂ ∂r2 cos θ234 = r −1 23 r −1 34 r34 − r23 · r34 r 2 23 r23 ∂ ∂r3 cos θ234 = r −1 23 r −1 34 r23 · r34 r 2 23 r23 − r23 · r34 r 2 34 r34 + r23 − r34 ∂ ∂r4 cos θ234 = r −1 23 r −1 34 r23 · r34 r 2 34 r34 − r23 A similar approach applied to the torsional potential gives the forces on all four involved atoms. 6
3 The MD Algorithm Solving Newton's equations of motion does not immediately suggest activity at the cutting edge of research.The molecular dynamics algorithm in most common use today may even have been known to Newton35.Nonetheless,the last decade has seen a rapid develop- ment in our understanding of numerical algorithms;a forthcoming review36 and a book37 summarize the present state of the field. Continuing to discuss,for simplicity,a system composed of atoms with coordinates rN(r2...N)and potential energy u().we introduce the atomic momenta pN-(p P2,...PN),in terms of which the kinetic energy may be written K(pN)- mThen the energy.or hamiltonian.may be written as a sum of kinetic and potential terms H=K+u.Write the classical equations of motion as ri=pi/mi and pi=fi (7) This is a system of coupled ordinary differential equations.Many methods exist to perform step-by-step numerical integration of them.Characteristics of these equations are:(a)they are 'stiff',i.e.there may be short and long timescales,and the algorithm must cope with both;(b)calculating the forces is expensive,typically involving a sum over pairs of atoms, and should be performed as infrequently as possible.Also we must bear in mind that the advancement of the coordinates fulfils two functions:(i)accurate calculation of dynamical properties,especially over times as long as typical correlation times Ta of properties a of interest (we shall define this later);(ii)accurately staying on the constant-energy hypersur- face,for much longer times Trun>Ta,in order to sample the correct ensemble. To ensure rapid sampling of phase space,we wish to make the timestep as large as possible consistent with these requirements.For these reasons,simulation algorithms have tended to be of low order (i.e.they do not involve storing high derivatives of positions, velocities etc.):this allows the time step to be increased as much as possible without jeop- ardizing energy conservation.It is unrealistic to expect the numerical method to accurately follow the true trajectory for very long times Tn.The 'ergodic'and'mixing'proper- ties of classical trajectories,i.e.the fact that nearby trajectories diverge from each other exponentially quickly,make this impossible to achieve. All these observations tend to favour the Verlet algorithm in one form or another,and we look closely at this in the following section.For historical reasons only,we mention the more general class of predictor-corrector methods which have been optimized for classical mechanical equations38.39;further details are available elsewhere34.40.41. 3.1 The Verlet Algorithm There are various,essentially equivalent,versions of the Verlet algorithm,including the original method7.42 and a'leapfrog'form43.Here we concentrate on the 'velocity Verlet' algorithm44,which may be written Pi(t+t)=pi(t)+tfi(t) (8a) ri(t+8t)=ri(t)+8tpi(t+t)/mi (8b) pP:(t+6t)=p:(t+6t)+6tf,(t+t) (8c) 7
3 The MD Algorithm Solving Newton’s equations of motion does not immediately suggest activity at the cutting edge of research. The molecular dynamics algorithm in most common use today may even have been known to Newton35 . Nonetheless, the last decade has seen a rapid development in our understanding of numerical algorithms; a forthcoming review 36 and a book37 summarize the present state of the field. Continuing to discuss, for simplicity, a system composed of atoms with coordinates r N = (r1, r2, . . . rN ) and potential energy U(r N ), we introduce the atomic momenta p N = (p1 , p2 , . . . pN ), in terms of which the kinetic energy may be written K(p N ) = PN i=1 pi 2 /2mi . Then the energy, or hamiltonian, may be written as a sum of kinetic and potential terms H = K + U. Write the classical equations of motion as r˙ i = pi/mi and p˙ i = fi (7) This is a system of coupled ordinary differential equations. Many methods exist to perform step-by-step numerical integration of them. Characteristics of these equations are: (a) they are ‘stiff’, i.e. there may be short and long timescales, and the algorithm must cope with both; (b) calculating the forces is expensive, typically involving a sum over pairs of atoms, and should be performed as infrequently as possible. Also we must bear in mind that the advancement of the coordinates fulfils two functions: (i) accurate calculation of dynamical properties, especially over times as long as typical correlation times τa of properties a of interest (we shall define this later); (ii) accurately staying on the constant-energy hypersurface, for much longer times τrun τa, in order to sample the correct ensemble. To ensure rapid sampling of phase space, we wish to make the timestep as large as possible consistent with these requirements. For these reasons, simulation algorithms have tended to be of low order (i.e. they do not involve storing high derivatives of positions, velocities etc.): this allows the time step to be increased as much as possible without jeopardizing energy conservation. It is unrealistic to expect the numerical method to accurately follow the true trajectory for very long times τrun. The ‘ergodic’ and ‘mixing’ properties of classical trajectories, i.e. the fact that nearby trajectories diverge from each other exponentially quickly, make this impossible to achieve. All these observations tend to favour the Verlet algorithm in one form or another, and we look closely at this in the following section. For historical reasons only, we mention the more general class of predictor-corrector methods which have been optimized for classical mechanical equations38, 39; further details are available elsewhere34, 40, 41 . 3.1 The Verlet Algorithm There are various, essentially equivalent, versions of the Verlet algorithm, including the original method7, 42 and a ‘leapfrog’ form43 . Here we concentrate on the ‘velocity Verlet’ algorithm44 , which may be written pi (t + 1 2 δt) = pi (t) + 1 2 δtfi (t) (8a) ri(t + δt) = ri(t) + δtpi (t + 1 2 δt)/mi (8b) pi (t + δt) = pi (t + 1 2 δt) + 1 2 δtfi (t + δt) (8c) 7
After step (8b).a force evaluation is carried out,to give fi(t+ot)for step (8c).This scheme advances the coordinates and momenta over a timestep 6t.A piece of pseudo-code illustrates how this works: do step =1,nstep pp 0.5*dt*f r =r dt*p/m f force(r) p =p +0.5*dt*f enddo As we shall see shortly there is an interesting theoretical derivation of this version of the algorithm.Important features of the Verlet algorithm are:(a)it is exactly time reversible; (b)it is symplectic (to be discussed shortly);(c)it is low order in time,hence permitting long timesteps:(d)it requires just one(expensive)force evaluation per step:(e)it is easy to program. 3.2 Constraints It is quite common practice in classical computer simulations not to attempt to represent intramolecular bonds by terms in the potential energy function,because these bonds have very high vibration frequencies (and arguably should be treated in a quantum mechanical way rather than in the classical approximation).Instead,the bonds are treated as being constrained to have fixed length.In classical mechanics,constraints are introduced through the Lagrangian5 or Hamiltonian46 formalisms.Given an algebraic relation between two atomic coordinates,for example a fixed bond length b between atoms 1 and 2,one may write a constraint equation,plus an equation for the time derivative of the constraint X(r1,T2)=(r1-T2)·(r1-T2)-b2=0 (9a) X(T1,T2)=2(v1-v2)·(r1-T2)=0. (9b) In the Lagrangian formulation,the constraint forces acting on the atoms will enter thus: miri=fi+Agi where A is the undetermined multiplier and 0y=-2(r1-r2) g1=-r1 _0X=2(r1-r2) 92=-0r2 It is easy to derive an exact expression for the multiplier A from the above equations; if several constraints are imposed,a system of equations(one per constraint)is obtained. However,this exact solution is not what we want:in practice,since the equations of motion are only solved approximately.in discrete time steps,the constraints will be increasingly violated as the simulation proceeds.The breakthrough in this area came with the pro- posal to determine the constraint forces in such a way that the constraints are satisfied exactly at the end of each time step47-49.For the original Verlet algorithm,this scheme is called SHAKE.The appropriate version of this scheme for the velocity Verlet algorithm is called RATTLES0.Formally,we wish to solve the following scheme,in which we combine 8
After step (8b), a force evaluation is carried out, to give f i (t + δt) for step (8c). This scheme advances the coordinates and momenta over a timestep δt. A piece of pseudo-code illustrates how this works: do step = 1, nstep p = p + 0.5*dt*f r = r + dt*p/m f = force(r) p = p + 0.5*dt*f enddo As we shall see shortly there is an interesting theoretical derivation of this version of the algorithm. Important features of the Verlet algorithm are: (a) it is exactly time reversible; (b) it is symplectic (to be discussed shortly); (c) it is low order in time, hence permitting long timesteps; (d) it requires just one (expensive) force evaluation per step; (e) it is easy to program. 3.2 Constraints It is quite common practice in classical computer simulations not to attempt to represent intramolecular bonds by terms in the potential energy function, because these bonds have very high vibration frequencies (and arguably should be treated in a quantum mechanical way rather than in the classical approximation). Instead, the bonds are treated as being constrained to have fixed length. In classical mechanics, constraints are introduced through the Lagrangian45 or Hamiltonian46 formalisms. Given an algebraic relation between two atomic coordinates, for example a fixed bond length b between atoms 1 and 2, one may write a constraint equation, plus an equation for the time derivative of the constraint χ(r1, r2) = (r1 − r2) · (r1 − r2) − b 2 = 0 (9a) χ˙(r1, r2) = 2(v1 − v2) · (r1 − r2) = 0 . (9b) In the Lagrangian formulation, the constraint forces acting on the atoms will enter thus: mir¨i = fi + Λgi where Λ is the undetermined multiplier and g1 = − ∂χ ∂r1 = −2(r1 − r2) g2 = − ∂χ ∂r2 = 2(r1 − r2) It is easy to derive an exact expression for the multiplier Λ from the above equations; if several constraints are imposed, a system of equations (one per constraint) is obtained. However, this exact solution is not what we want: in practice, since the equations of motion are only solved approximately, in discrete time steps, the constraints will be increasingly violated as the simulation proceeds. The breakthrough in this area came with the proposal to determine the constraint forces in such a way that the constraints are satisfied exactly at the end of each time step47–49 . For the original Verlet algorithm, this scheme is called SHAKE. The appropriate version of this scheme for the velocity Verlet algorithm is called RATTLE50 . Formally, we wish to solve the following scheme, in which we combine 8
(r1,r2)into r,(p,p2)into p,etc.for simplicity: p(t+8t)=p(t)+tf(t)+xg(t) r(t+6t)=r(t)8tp(t+t)/m choosing A such that:0=x(r(t +6t)) (10a) p(t+t)=p(t+号6t)+6tf(t+6t)+μg(t+6t) choosing u such that:0=(r(t+6t),p(t +6t)) (10b) Step(10a)may be implemented by defining unconstrained variables (t+26t)=p(t)+26tf(t),(t+6t)=r(t)+6tpt+2t)/m then solving the nonlinear equation for A X(t+6t)=x((t+t)+λ6tg(t)/m)=0 and substituting back p(t+t)=p(t+t)+xg(t);r(t+6t)=(t+6t)+6tAg(t)/m Step(10b)may be handled by defining p(t+t)=p(t+号t)+号tf(t+6t) solving the equation for the second Lagrange multiplier u X(t+6t)=(r(t+6t),(t+t)+g(t+6t))=0 (which is actually linear,since x(r,p)=-g(r).p/m)and substituting back p(t+6t)=(6t)+ug(t+t) In pseudo-code this scheme may be written do step 1,nstep pp (dt/2)*f r =r dt*p/m lambda_g shake(r) pp lambda g r=r dt*lambda_g/m f force(r) p =p (dt/2)*f mu_g rattle(r,p) p =p mug enddo The routine called shake here calculates the constraint forces Ag;necessary to ensure that the end-of-step positions ri satisfy Eq.(9a).For a system of many constraints,this calculation is usually performed in an iterative fashion,so as to satisfy each constraint in turn until convergence;the original SHAKE algorithm was framed in this way.These constraint forces are incorporated into both the end-of-step positions and the mid-step mo- menta.The routine called rattle calculates a new set of constraint forces ug;to ensure that the end-of-step momenta satisfy Eq.(9b).This also may be carried out iteratively
(r1, r2) into r, (p1 , p2 ) into p, etc. for simplicity: p(t + 1 2 δt) = p(t) + 1 2 δtf(t) + λg(t) r(t + δt) = r(t) + δtp(t + 1 2 δt)/m choosing λ such that: 0 = χ(r(t + δt)) (10a) p(t + δt) = p(t + 1 2 δt) + 1 2 δtf(t + δt) + µg(t + δt) choosing µ such that: 0 = χ˙(r(t + δt), p(t + δt)) (10b) Step (10a) may be implemented by defining unconstrained variables p¯(t + 1 2 δt) = p(t) + 1 2 δtf(t), r¯(t + δt) = r(t) + δtp¯(t + 1 2 δt)/m then solving the nonlinear equation for λ χ(t + δt) = χ r¯(t + δt) + λδtg(t)/m = 0 and substituting back p(t + 1 2 δt) = p¯(t + 1 2 δt) + λg(t), r(t + δt) = r¯(t + δt) + δtλg(t)/m Step (10b) may be handled by defining p¯(t + δt) = p(t + 1 2 δt) + 1 2 δtf(t + δt) solving the equation for the second Lagrange multiplier µ χ˙(t + δt) = χ˙ r(t + δt), p¯(t + δt) + µg(t + δt) = 0 (which is actually linear, since χ˙(r, p) = −g(r) · p/m) and substituting back p(t + δt) = p¯(δt) + µg(t + δt) In pseudo-code this scheme may be written do step = 1, nstep p = p + (dt/2)*f r = r + dt*p/m lambda_g = shake(r) p = p + lambda_g r = r + dt*lambda_g/m f = force(r) p = p + (dt/2)*f mu_g = rattle(r,p) p = p + mu_g enddo The routine called shake here calculates the constraint forces λgi necessary to ensure that the end-of-step positions ri satisfy Eq. (9a). For a system of many constraints, this calculation is usually performed in an iterative fashion, so as to satisfy each constraint in turn until convergence; the original SHAKE algorithm was framed in this way. These constraint forces are incorporated into both the end-of-step positions and the mid-step momenta. The routine called rattle calculates a new set of constraint forces µgi to ensure that the end-of-step momenta satisfy Eq. (9b). This also may be carried out iteratively. 9