Best Practices
Here, we provide an overview of some of what we believe are the best practices for free energy calculations, with references, wherever possible. This document assumes you already have a basic idea of what these calculations are and what they do (if not, start with our FreeEnergyIntroduction), and that you already have a working knowledge of molecular simulations and basic terms like convergence and equilibration (if not, start with a textbook like Leach's Molecular Modelling: Principles and Applications). Simulation-package specific issues will not be addressed here.
Feel free to edit this document or the attached discussion pages. We hope this page will serve as a record of the community consensus on these issues. Do back up any edits with appropriate references.
Introduction
Free energy calculations are appealing, in that, in principle, they allow calculation of rigorously correct free energies, given a particular set of parameters and physical assumptions. We firmly believe that the goal of these calculations, then, should be to obtain these correct free energies given the particular assumptions -- and not necessarily to match experiment. Only then can the underlying parameters and physical assumptions really be tested. In other words, there exists a single right answer for a free energy calculation, given this particular set of parameters and physical assumptions, and our goal is to obtain it. Here, we will call these free energies "correct" free energies. "Correct", then, implies at the very least that the computed free energies have converged, and that there are no underlying methodological problems.
It is also important to remember in what follows that free energy is a function of state, so there are many possible choices of pathway for a thermodynamic cycle connecting the same two endpoints. Some pathways may be more efficient than others (sometimes by many orders of magnitude) so methods which are in principle correct may not always be practical.
Here, we discuss best practices in the context of two practical examples: solvation free energies and binding free energies. Solvation free energies provide a basic starting point for considering a number of the key issues. For binding free energies, there are many more choices of pathway, which raises additional complications. In both cases, there are also different methods for computing the relevant free energy differences, which may differ in efficiency.
Guidelines for free energy calculation pathways
Alchemical free energy calculations almost always involve some insertion or deletion of atoms, both in hydration free energy calculations and in absolute and relative binding free energy calculations. By insertion and deletion, we mean decoupling or annihilation (see Decoupling and annihilation for definitions) of the interactions of the atoms in question. We believe that a basic list of rules and guidelines should be followed in any calculation that involves insertion or deletion:
- Rule 1: Always use soft-core potentials while decoupling or annihilating Lennard-Jones interactions
- Rule 2: Never leave a partial atomic charge on an atom while its Lennard-Jones interactions are being removed
- Guideline 3: It is usually more efficient to perform electrostatic and Lennard-Jones transformations separately
- Guideline 4: Inserting or deleting atoms is usually less efficient than mutating them, so transformations should involve as few insertions and deletions as possible.
- Guideline 5: Keep configuration space in mind and think about convergence.
It is worth looking at each of these in more detail.
Rule 1: Soft core potentials
Rule 1, that soft core potentials should always be used when turning on or off Lennard-Jones interactions, should in our view be observed in all free energy calculations. Lennard-Jones interactions between particles have a very steep ([math]\displaystyle{ 1/r^{12} }[/math]) rise in potential energy at short range. This prevents particles from overlapping. However, to delete particles (atoms or molecules) these interactions need to be turned off somehow, and this is not as straightforward as it might seem.
One relatively commonly used choice (Fowler et al., 2005, Chipot, Rozanska and Dixit, 2005, AMBER, and others) for turning these off is simple linear scaling of that term in the potential energy or Hamiltonian, that is, [math]\displaystyle{ V(\lambda) = (1 - \lambda) V_0 + \lambda V_1 }[/math], where [math]\displaystyle{ V_0 }[/math] is the potential energy with full Lennard-Jones interactions, and [math]\displaystyle{ V_1 }[/math] is the potential energy where the Lennard-Jones interactions have been turned off for the atoms which are being deleted. This means that, for the atoms being deleted, Lennard-Jones interactions scale at small [math]\displaystyle{ r }[/math] as [math]\displaystyle{ (1-\lambda)/r^{12} }[/math]. This has two unfortunate and interconnected consequences. First, there is a discontinuous change in the form of the interaction potential when going from [math]\displaystyle{ \lambda=1-\epsilon }[/math] (where [math]\displaystyle{ \epsilon }[/math] is a very small number) to [math]\displaystyle{ \lambda=1 }[/math], as the [math]\displaystyle{ 1/r^{12} }[/math] term is still fairly important even at [math]\displaystyle{ \lambda }[/math] very near 1, but is entirely turned off at [math]\displaystyle{ \lambda=1 }[/math]. Second, it leads to large forces, numerical instabilities, and other problems in simulations near [math]\displaystyle{ \lambda=1 }[/math]. Formally, it has been shown that this leads to a singularity in [math]\displaystyle{ dV/d\lambda }[/math] at [math]\displaystyle{ \lambda=1 }[/math], which means that computing correct free energies with this scheme using thermodynamic integration is impossible using numerical techniques (Mruzik et al., 1976, Mezei and Beveridge, 1986, Resat and Mezei, 1993 and especially Beutler et al., 1994, Pitera and van Gunsteren, 2002 and Steinbrecher et al., 2007 and references therein), and similar problems plague free energy perturbation schemes.
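As a short worked step, the endpoint problem can be read off the standard thermodynamic integration identity. The angle brackets, denoting an ensemble average at fixed [math]\displaystyle{ \lambda }[/math], and the symbol [math]\displaystyle{ \Delta G }[/math] are notation assumed here for illustration:

[math]\displaystyle{ \Delta G = \int_0^1 \left\langle \frac{dV}{d\lambda} \right\rangle_\lambda d\lambda, \qquad \frac{dV}{d\lambda} = V_1 - V_0 \quad \text{for} \quad V(\lambda) = (1 - \lambda) V_0 + \lambda V_1 }[/math]

Under linear scaling the integrand always contains the unscaled [math]\displaystyle{ V_0 }[/math], including its full [math]\displaystyle{ 1/r^{12} }[/math] term. As [math]\displaystyle{ \lambda }[/math] approaches 1, the scaled repulsion [math]\displaystyle{ (1-\lambda)/r^{12} }[/math] no longer keeps atoms apart, so configurations with very small [math]\displaystyle{ r }[/math] are sampled and contribute extremely large values of [math]\displaystyle{ V_0 }[/math] to the average.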
In an attempt to get around this, some have suggested scaling the potential energies with [math]\displaystyle{ (1-\lambda)^k }[/math], where [math]\displaystyle{ k }[/math] is an integer greater than 1. It can be shown that, for [math]\displaystyle{ k \geq 4 }[/math], this leads to an integrable singularity in [math]\displaystyle{ dV/d\lambda }[/math], so thermodynamic integration can in principle be done (Mezei and Beveridge, 1986, Beutler et al., 1994). But integrable singularities still pose very substantial problems for molecular simulation, and this approach can still lead to large forces, numerical instabilities and energy conservation problems (Beutler et al., 1994 and Steinbrecher et al., 2007) and make free energy differences extremely difficult to converge ([D. Mobley, unpublished data]).
Since free energies are path-independent, an elegant solution to this problem was developed (Beutler et al., 1994): modify the Lennard-Jones functional form to gradually smooth out the [math]\displaystyle{ 1/r^{12} }[/math] term as a function of [math]\displaystyle{ \lambda }[/math], rather than simply multiplying it by a prefactor. This removes problems with numerical instabilities and singularities, and improves convergence properties (Beutler et al., 1994, Zacharias et al., 1994, Pitera and van Gunsteren, 2002). The basic idea is that it allows particles to gradually begin to overlap as [math]\displaystyle{ \lambda }[/math] is changed, rather than saving a drastic change in interactions for the step from [math]\displaystyle{ \lambda=1-\epsilon }[/math] to [math]\displaystyle{ \lambda=1 }[/math]. This approach is known as soft core potentials (or, alternatively, "separation-shifted scaling"), and has subsequently been shown to be a nearly optimal path for modifying Lennard-Jones interactions (Blondel, 2004). In subsequent work, several groups have further tested this approach and found a slightly modified functional form and set of parameters, relative to that originally proposed (Beutler et al., 1994), which leads to improved efficiency for free energy calculations (Shirts and Pande, 2005, [D. Mobley, unpublished data]); we recommend that the soft core potentials and parameters of Shirts and Pande, 2005 be employed in all free energy calculations involving insertion or deletion of particles.
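The difference between the two schemes can be sketched numerically. The snippet below is a minimal illustration, not any package's implementation: it uses a commonly cited Beutler-style soft-core form written in this page's convention ([math]\displaystyle{ \lambda=0 }[/math] fully interacting, [math]\displaystyle{ \lambda=1 }[/math] fully off), in reduced units. The function names, the exponents on [math]\displaystyle{ \lambda }[/math], and the value alpha = 0.5 are assumptions chosen for illustration; consult the cited papers for the recommended form and parameters.

```python
def lj(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones energy."""
    x6 = (sigma / r) ** 6
    return 4.0 * epsilon * (x6 * x6 - x6)


def linear_scaled_lj(r, lam, epsilon=1.0, sigma=1.0):
    """Linear scaling: (1 - lam) * V_LJ(r). Still diverges as r -> 0 for lam < 1."""
    return (1.0 - lam) * lj(r, epsilon, sigma)


def soft_core_lj(r, lam, epsilon=1.0, sigma=1.0, alpha=0.5):
    """Illustrative Beutler-style soft-core form (alpha is an assumed value).

    The term alpha * lam**2 shifts the effective separation, so the energy
    stays finite at r = 0 for any lam > 0 and reduces to plain LJ at lam = 0.
    """
    s = alpha * lam ** 2 + (r / sigma) ** 6
    return 4.0 * epsilon * (1.0 - lam) * (1.0 / s ** 2 - 1.0 / s)


if __name__ == "__main__":
    lam = 0.99  # close to the fully decoupled end state
    for r in (0.1, 0.5, 1.0):
        print(f"r={r:4.1f}  linear={linear_scaled_lj(r, lam):12.4e}  "
              f"soft-core={soft_core_lj(r, lam):12.4e}")
```

Near the decoupled end state the linearly scaled energy at small separations remains many orders of magnitude larger than the soft-core energy, which is the source of the large forces and poor overlap described above.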
Some testing has suggested that the [math]\displaystyle{ (1-\lambda)^k }[/math] scaling approach may be essentially adequate for hydration free energy calculations ([D. Mobley, unpublished data], Steinbrecher et al., 2007), but it is still less efficient there than soft-core potentials, so this does not affect our recommendation.
In summary: Linearly scaling Lennard-Jones interactions back as a function of [math]\displaystyle{ \lambda }[/math] for insertion/deletion of particles is formally incorrect for numerical integration and leads to wrong estimates of free energy differences. While more complicated schemes involving [math]\displaystyle{ (1-\lambda)^k }[/math] scaling can be formally correct, there are serious concerns regarding their accuracy. Soft-core potentials provide a rigorously correct, efficient alternative to these and should be used whenever particles are inserted or deleted, preferably with the functional form and parameters of Shirts and Pande, 2005, unless future work finds a still more efficient set of parameters.
Rule 2: Turn off partial charges
Rule 2 states that a partial atomic charge should never be allowed to remain on an atom while its Lennard-Jones interactions are being removed. To understand the reason for this, consider two atoms of opposite charge, A and B. Lennard-Jones interactions of atom A are being scaled back. Regardless of the scaling scheme used, at some lambda value, atoms A and B will begin to overlap occasionally, since the final state allows A and B to overlap totally. If A has a remaining partial atomic charge when these overlaps become possible, the two point charges assigned to A and B can actually overlap as well. Since the potential energy of Coulomb interactions between point charges scales as [math]\displaystyle{ q_{A} q_{B}/r_{AB} }[/math], where [math]\displaystyle{ r_{AB} }[/math] is the distance between A and B, this presents significant problems when [math]\displaystyle{ q_A }[/math] and [math]\displaystyle{ q_B }[/math] have opposite signs. In particular, there is an infinite energy minimum at [math]\displaystyle{ r_{AB}=0 }[/math], so the two particles would in principle get trapped on top of one another.
In practice, what usually happens in molecular dynamics simulations in these circumstances is that the forces get extremely large as A and B begin to overlap, and the simulation will crash. Constraint algorithms are often the first to fail, so this may lead to a warning about constraints (e.g. LINCS or SHAKE) and then a crash. This issue is discussed briefly by Pitera and van Gunsteren and in more detail by Anwar and Heyes.
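The overlap problem can be illustrated with a small sketch: once the Lennard-Jones repulsion has been softened, nothing bounds the Coulomb attraction between opposite point charges from below. The soft-core form, the parameter values (epsilon, sigma, alpha, lam), the charges, and the function names are all assumptions for illustration; the Coulomb constant is the approximate value in kcal·Å/(mol·e²).

```python
COULOMB_CONSTANT = 332.06  # approximate, kcal * Angstrom / (mol * e^2)


def soft_core_lj(r, lam, epsilon=0.1, sigma=3.4, alpha=0.5):
    """Illustrative soft-core LJ (lam = 1 is fully off); finite at r = 0 for lam > 0."""
    s = alpha * lam ** 2 + (r / sigma) ** 6
    return 4.0 * epsilon * (1.0 - lam) * (1.0 / s ** 2 - 1.0 / s)


def coulomb(r, q_a, q_b):
    """Unscaled point-charge Coulomb energy q_a * q_b / r."""
    return COULOMB_CONSTANT * q_a * q_b / r


def total_energy(r, q_a, q_b, lam=0.9):
    """Pair energy when LJ is mostly removed but the charges are untouched."""
    return soft_core_lj(r, lam) + coulomb(r, q_a, q_b)


if __name__ == "__main__":
    for r in (1.0, 0.1, 0.01, 0.001):
        charged = total_energy(r, q_a=0.4, q_b=-0.4)
        uncharged = total_energy(r, q_a=0.0, q_b=-0.4)
        print(f"r={r:6.3f} A  charges on: {charged:12.2f}  "
              f"charge on A removed: {uncharged:8.4f}  (kcal/mol)")
```

With the charge on A left in place the pair energy falls without bound as the separation shrinks, whereas with the charge on A removed first it stays small and finite.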
In view of this problem, we recommend always turning off partial charges for any atoms for which Lennard-Jones interactions are being removed before doing the Lennard-Jones transformation. Additionally, when Lennard-Jones parameters for an atom are being substantially modified during a free energy calculation (e.g. for relative free energy calculations involving mutation of an atom) and soft-core potentials are employed, similar problems may arise, so it may be useful to remove partial charges on atoms which are being mutated, as well.
Several groups have developed modified electrostatics scaling methods in an attempt to bypass this problem and allow electrostatic interactions and Lennard-Jones interactions to be turned off in only one set of calculations (for example, Anwar and Heyes), but since electrostatics transformations are usually so smooth a function of [math]\displaystyle{ \lambda }[/math] and need only a few [math]\displaystyle{ \lambda }[/math] values for good overlap (Shirts et al., 2005; Mobley et al., 2007, and others), it is unclear that this results in any significant efficiency gain over performing the transformations separately.
In view of this, our recommendation is that either (a) partial charges on any particles being inserted or deleted be turned off prior to the use of soft core potentials for those particles, or (b) a soft core scheme for electrostatics be implemented to allow simultaneous changes.
Guideline 3: Perform electrostatics transformations separately from Lennard-Jones
As noted in Rule 2, above, electrostatics transformations are typically smooth functions of lambda with good phase-space overlap between even coarsely spaced lambda values (Shirts et al., 2005; Mobley et al., 2007, and others). As a consequence, these are quite efficient compared to Lennard-Jones calculations. As established above, when particles are being inserted or deleted, the electrostatic interactions of these particles should be set to zero before turning off their Lennard-Jones interactions. But what about electrostatic interactions on atoms which are merely being mutated (i.e. a change of partial charge and Lennard-Jones radius), as in relative free energy calculations?
We are not aware of any study which has looked at this in detail, but given the efficiency of free energy calculations modifying electrostatic interactions relative to those significantly modifying Lennard-Jones interactions, we believe it makes sense to perform the two sets of calculations separately. Given that the two transformations have different lambda-dependences, it might actually be less efficient to perform them together than separately. Performing them separately has an additional advantage as well: uncertainties in the two components can be assessed separately, and computational effort focused on reducing the largest uncertainty (e.g. by extending some simulations to get additional sampling).
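A two-stage deletion pathway of this kind can be sketched as a list of (electrostatics, Lennard-Jones) lambda pairs, with a few coarsely spaced electrostatics states followed by a denser set of Lennard-Jones states. The function name, the even spacing, and the window counts are assumptions for illustration only, not recommended values; the convention is that 0 means fully interacting and 1 means fully off.

```python
def two_stage_schedule(n_coul=5, n_lj=16):
    """Return (lambda_coul, lambda_lj) pairs for a two-stage deletion.

    Stage 1 removes the charges with LJ fully on; stage 2 removes LJ with
    the charges already off. Window counts are illustrative assumptions.
    """
    coul = [i / (n_coul - 1) for i in range(n_coul)]
    lj = [i / (n_lj - 1) for i in range(n_lj)]
    states = [(lam_c, 0.0) for lam_c in coul]      # stage 1: charges off
    states += [(1.0, lam_l) for lam_l in lj[1:]]   # stage 2: LJ off
    return states


if __name__ == "__main__":
    for index, (lam_c, lam_l) in enumerate(two_stage_schedule()):
        print(f"state {index:2d}: lambda_coul={lam_c:.3f}  lambda_lj={lam_l:.3f}")
```

By construction no state has a partially removed Lennard-Jones interaction while any charge remains, so the schedule also satisfies Rule 2, and the free energies of the two stages can be analyzed and extended independently.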
Further testing should be focused in this area, to determine whether alternative scaling approaches which can modify Lennard-Jones and electrostatic interactions simultaneously (Anwar and Heyes) are actually more efficient than the approach of separate modification that we propose.
Guideline 4: Use few insertions/deletions
Electrostatics transformations are usually smooth functions of lambda, and require few lambda values, while Lennard-Jones transformations (especially those involving insertions and deletions) are difficult transformations which require substantially more lambda values to obtain good phase-space overlap and accurate free energy differences (Shirts et al., 2005; Mobley et al., 2007, Mobley et al., 2007b, and others). Thus, insertions and deletions of particles can be thought of as “difficult” transformations (e.g. Jarzynski, 2006). Consequently, it is far more computationally efficient to modify existing particles (atoms) than to insert or delete atoms; this should be kept in mind when constructing mutation pathways for relative free energy calculations, since multiple choices of mutation pathways between a set of molecules are typically possible.
This guideline is not at all helpful for absolute free energy calculations, since these by design involve inserting or deleting entire molecules.
Guideline 5: Think about configuration space and convergence
Given that many choices of pathway are possible, it can often be helpful to think about whether a particular choice of pathway makes convergence easier or more difficult.
For example, in absolute binding free energies, one can incorporate the standard state using either simple distance restraints between the ligand and the protein, or by restraining the ligand orientation as well. At the fully noninteracting state, the amount of configuration space the ligand will need to sample is dictated by this choice. Hence, a ligand with only a single reference distance restrained relative to the protein will need to sample a spherical shell in configuration space, while a ligand with all six relative degrees of freedom restrained would need to sample only a very small region of configuration space. These two can take drastically different amounts of time, so in fact it can be much more efficient, at least in some cases, to use the additional restraints (Mobley et al., 2006, http://dx.doi.org/10.1063/1.2221683).
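A rough sense of the difference in translational volume can be obtained from a simple estimate. The sketch below compares the Boltzmann-weighted volume accessible under a single harmonic distance restraint (a spherical shell) with that under a hypothetical harmonic restraint on all three Cartesian components of the ligand position. The Cartesian restraint is a simplification standing in for a fully restrained ligand, and the force constant, reference distance, temperature, and function names are all assumed values for illustration. Orientational freedom, which adds a further factor of up to 8π² for the distance-only case, is not included.

```python
import math

KT = 0.593  # kcal/mol, approximately room temperature (assumed)


def shell_volume_numeric(r0, k, kT=KT, n=20000):
    """Midpoint-rule integral of 4*pi*r^2 * exp(-k (r - r0)^2 / (2 kT)) dr."""
    width = math.sqrt(kT / k)
    r_max = r0 + 10.0 * width
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += 4.0 * math.pi * r * r * math.exp(-k * (r - r0) ** 2 / (2.0 * kT)) * dr
    return total


def shell_volume_analytic(r0, k, kT=KT):
    """Gaussian-integral result, valid when r0 is many widths away from r = 0."""
    s2 = kT / k
    return 4.0 * math.pi * math.sqrt(2.0 * math.pi * s2) * (r0 * r0 + s2)


def cartesian_volume(k, kT=KT):
    """Volume under harmonic restraints on x, y and z (hypothetical simplification)."""
    return (2.0 * math.pi * kT / k) ** 1.5


if __name__ == "__main__":
    r0, k = 10.0, 10.0  # Angstrom, kcal/(mol A^2); illustrative values
    shell = shell_volume_numeric(r0, k)
    point = cartesian_volume(k)
    print(f"shell volume      : {shell:10.3f} A^3")
    print(f"restrained volume : {point:10.5f} A^3")
    print(f"ratio             : {shell / point:10.1f}")
```

Under these assumptions the ratio reduces to 2(r0² + kT/k)/(kT/k), so it grows rapidly with the reference distance and the stiffness of the restraint.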