Difference between revisions of "Test System Repository"

From AlchemistryWiki
 
(103 intermediate revisions by 6 users not shown)
Free Energy Benchmarks
+
=Purpose of Test Sets=
 +
One of the biggest challenges to carefully validating and comparing free energy methods is defining and sharing well-defined test cases (molecular systems and force field parameters) with reliably known numerical results. If one is not sure of the value of the free energy dictated by the energy model and other physical parameters, it is impossible to make fine comparisons among methods. Additionally, different programs with different bookkeeping, or parameters that have been rounded in some way, can cause legitimate small differences between computed free energies, obscuring differences in the methods. The goal of this Repository is to help define and disseminate a stable set of test systems of varied nature and complexity for use by the free energy simulation community. Note that the free energies provided by these systems may not agree particularly well with experiment, but this is not necessary, because the purpose here is to test the numerical performance of the methods.
  
One of the biggest problems in careful comparisons and validations of methods is the difficulty of agreeing on a single number for the free energy. If one is not sure of the value of the free energy, fine comparisons of methods are very difficult. Additionally, different programs with different bookkeeping, or parameters that have been rounded in some way, can produce legitimate small differences in the free energies, obscuring differences in the methods. Therefore, except for the simplest system mentioned here (UA methane in water), all of the test transformations have zero total free energy.
+
To join a mailing list for a discussion of protein-ligand binding benchmarks, email michael.shirts at virginia.edu. If you have signed up previously, you can log into the discussion (password protected) at https://collab.itc.virginia.edu/portal/xlogin
  
With a zero transformation, it is necessary to partially specify the path, so that particularly clever methods do not manage to solve the problem in ways that would not be valid in general simulations. Because of this, the zero transformations here are defined by their endpoints and one midpoint along the transformation. This ensures that a large transformation is performed, while still allowing these systems to be used to improve free energy pathways.
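As a minimal sketch of how a result for such a zero transformation might be checked, the two legs (endpoint to midpoint, midpoint back to endpoint) can be summed and compared with zero within their combined uncertainty. The function name, the 2-sigma threshold, and the numbers are illustrative assumptions, not part of the test set definition.

```python
import math

def null_cycle_check(dg_forward, err_forward, dg_reverse, err_reverse, n_sigma=2.0):
    """Return (total, total_err, passed) for a two-leg null transformation.

    dg_forward: free energy estimate for endpoint -> midpoint
    dg_reverse: free energy estimate for midpoint -> endpoint
    The two uncertainties are assumed independent and are added in quadrature.
    """
    total = dg_forward + dg_reverse
    total_err = math.sqrt(err_forward ** 2 + err_reverse ** 2)
    passed = abs(total) <= n_sigma * total_err
    return total, total_err, passed

# Illustrative numbers only (not results from any real simulation).
total, total_err, passed = null_cycle_check(3.42, 0.05, -3.37, 0.04)
print(f"cycle sum = {total:.3f} +/- {total_err:.3f}, consistent with zero: {passed}")
```

A cycle sum that differs from zero by much more than its uncertainty points to a problem in the method, the sampling, or the error estimate.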
+
= Specifications of the content of binding benchmark tests =
  
Problem 1) Is the method at all valid for molecular systems?
+
''Current standards version is 0.5, dated Sept 27, 2013''
  
* System: Simplest molecular free energy system = UA methane in TIP3P water.
+
There will be three types of depositions for the binding benchmark test sets:
  
* Notes: There are no bond, angle, or torsion terms, and no solute/solvent charge: this is the simplest system that can truly be described as realistic.
+
* [[System specifications]]
 +
* [[Potential energy results]]
 +
* [[Free energy results]]
  
Problem 2) Can the free energy method handle multiple atomic sites efficiently?
+
All tests consist of a system specification and at least one potential energy result from a specified software version.  After that, multiple people can contribute free energy results for the same system specification and potential energy result, or contribute potential energy results of the system for different simulation codes. They also might propose a new potential energy result based on their own preferences for simulations of the system (different cutoffs, etc).  Importantly, the "free energy results" should be an attempt to be independent of any such nonphysical approximations.
  
* System: Naphthalene null transform with benzene as intermediate, in TIP3P water
+
= Test Sets =
 +
== Small Molecule Solvation Benchmark Sets  ==
  
* Notes: No ligand degrees of freedom to complicate the analysis. We are deciding whether this could also be performed as the anthracene -> benzene -> anthracene transformation.
+
* [[The Simple Small Molecule Solvation Benchmark Test Set]]: This test set was designed to test methods for computing hydration free energies of small molecules. It comprises a series of small molecules, parameter sets for three different software codes, and reference energies {{Cite|Paliwal2011}}.
 +
* [http://www.escholarship.org/uc/item/6sd403pz FreeSolv (Mobley) Hydration Set]: This is an extensive (640+) molecule database of experimental and calculated hydration free energies for small neutral molecules. It includes GROMACS topology and coordinate files as well.
  
Problem 3) Can the method handle water rearrangement around charges?
+
== Host-Guest Binding ==
 +
* Cucurbit[7]uril with benzene (partial charges artificially set to zero).  This tests binding of a nonpolar guest that encounters little barrier to exiting a rigid host.
 +
* Cucurbit[7]uril with guest B5 {{Cite|Moghaddam2011}}. This tests binding of a bulky cationic guest that encounters a substantial energy barrier to exiting a rigid host.
 +
* Some guest binding beta-cyclodextrin.  This would test binding to a much more flexible host.
 +
* Octa-acid with benzoic acid guest derivatives (from SAMPL4 and SAMPL5 blind prediction challenge){{Cite|Olsson2016}}.
  
* System: two LJ spheres, tethered together, with +1/-1 charges, with the transformation reversing the charges.
+
== Protein-Ligand Binding ==
  
* Notes: This setup allows avoidance of computing free energies of ions directly, which is still not handled completely correctly in many codes.
+
The following test systems were proposed at the [[2012_Workshop_on_Free_Energy_Methods_in_Drug_Design| 2012 Workshop on Free Energy Methods in Drug Design]]. One proposal would be to include 5-10 ligands. However, we should discuss how many ligands are needed for numerical evaluation of methods.
  
Problem 4) Can the method handle torsional degrees of freedom?
+
* T4 Lysozyme, polar and apolar sites (methods should be able to get this). [[Media:Minimal.tar.gz|GROMACS format minimal set of input files]]{{Cite|Boyce2009}}. (A full set of topology/coordinate files for this set is also available, though the minimal set is probably adequate for most purposes. If desired the full set is available [https://dl.dropboxusercontent.com/u/3409095/paper_support/fullL99AM102Q.tar.gz here (511MB)])
 +
* FKBP (rock solid, well-studied). Files in both GROMACS format and DESMOND-compatible .cms files, validated to give equivalent energies (up to energy calculation method differences)
 +
**  [[Media:FKBP_AMBER_GAFF.tgz|AMBER parameterized input files in GROMACS format]]
 +
**  [[Media:FKBP_desmond.tgz|The same input parameters converted into DESMOND format]]
 +
* Trypsin (well studied, with potential sampling and charge issues that it would be good for people to take a swing at)
 +
* DNA gyrase (from Vertex's data collection curated by Richard Dixon).
 +
* CCP model binding site{{Cite|Rocklin2013}}. [[Media:CCP.zip|GROMACS format minimal set of input files]].
 +
* Absolute binding free energies of diverse ligands to bromodomain BRD4{{Cite|Aldeghi2016}}.  Download a complete zip from: [http://dx.doi.org/10.5281/zenodo.57131 http://dx.doi.org/10.5281/zenodo.57131].
  
* System: 1-octanol -> ethane -> 1-octanol in TIP3P water.
+
=References=
 
+
<references>
* Notes: Topologically, the system would be set up as HO-(CH2)_14-OH, with the middle two carbons remaining coupled to the system for the entire transformation. The hydrogen bonds between the alcohol groups and water may slow down the torsional sampling.
+
{{Cite|Paliwal2011|Paliwal, H. and Shirts, M. R. (2011) An efficient method for the calculation of quantum mechanics/molecular mechanics free energies. J. Chem. Theory Comput. 7(12):4115-4134.|http://www.citeulike.org/group/14929/article/10029023}}
 
+
{{Cite|Moghaddam2011|Moghaddam,S., Yang,C., Rekharsky,M., Ko,Y.H., Kim,K., Inoue,Y., and Gilson,M.K. (2011) New Ultrahigh Affinity Host - Guest Complexes of Cucurbit[7]uril with Bicyclo[2.2.2]octane and Adamantane Guests: Thermodynamic Analysis and Evaluation of M2 Affinity Calculations. J.Am.Chem.Soc. 133:3570-3581.}}
Problem 5) Can the method handle complications caused by putting it all together in complex systems?
+
{{Cite|Boyce2009|Boyce, S. E., Mobley, D. L., Rocklin, G. J., Graves, A. P., Dill, K. A. and Shoichet, B. K. (2009) Predicting ligand binding affinity with alchemical free energy methods in a polar model binding site. J. Mol. Biol. 394:747-763.}}
 
+
{{Cite|Rocklin2013|Rocklin, G. J., Boyce, S. E., Fischer, M., Fish, I, Mobley, D. L., Shoichet, B. K., Dill, K. A. (2013) Blind prediction of charged ligand binding affinities in a model binding site. J. Mol. Biol. 425:4569-4583.}}
* System: Complicated substituted aromatic like Imatinib, with three substituted positions, with the transformation to cycle the substituents to different positions along the aromatic with benzene as the intermediate.
+
{{Cite|Aldeghi2016|Aldeghi, M., Heifetz, A., Bodkin, M. J., Knapp, S., Biggin, P.C (2016).  Accurate calculation of the absolute free energy of binding for drug molecules.  Chem Sci. 7:207-218.}}
 
+
{{Cite|Olsson2016| Olsson, M. A., Söderhjelm, P., Ryde U. (2016). J. Comp. Chem. 37:1589-1600.}}
Remaining questions for setup:
+
</references>
 
 
* Size of system? As small as we can make it for each system. Provide 10 Å from the edge of the interacting molecules.
 
 
 
* Simulation parameters: what electrostatic and other cutoff parameter should be set? What temperature and pressure control methods should be used?
 
 
 
Estimators of the uncertainty should be validated against uncertainty generated directly from runs from independent configurations (at least 40), and should include the computation of the correlation time of the observable used to calculate the free energies (such as the potential energy differences or dH/dL).
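The validation described above can be sketched as follows: estimate the statistical inefficiency of the observable in each run, use it to predict the standard error of the mean, and compare that prediction with the scatter of the means over independent runs. This is only an illustration under stated assumptions: the data are a synthetic AR(1) series standing in for a correlated observable such as dH/dL, and all function names and parameters are hypothetical.

```python
import math
import random
import statistics

def statistical_inefficiency(series):
    """Estimate g = 1 + 2 * sum_t (1 - t/n) C(t), where C(t) is the normalized
    autocorrelation function; the sum stops at the first non-positive C(t)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    if var == 0.0:
        return 1.0
    g = 1.0
    for t in range(1, n - 1):
        c = sum(dev[i] * dev[i + t] for i in range(n - t)) / ((n - t) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - t / n)
    return max(g, 1.0)

def correlated_standard_error(series):
    """Standard error of the mean using n / g effective samples."""
    g = statistical_inefficiency(series)
    return math.sqrt(statistics.variance(series) * g / len(series))

def synthetic_run(rng, n_samples=1000, phi=0.8, burn_in=100):
    """Synthetic AR(1) series; a stand-in for a time-correlated observable."""
    x, out = 0.0, []
    for step in range(n_samples + burn_in):
        x = phi * x + rng.gauss(0.0, 1.0)
        if step >= burn_in:
            out.append(x)
    return out

rng = random.Random(2013)
runs = [synthetic_run(rng) for _ in range(40)]  # 40 independent runs

# Average error predicted from within each run vs. scatter across runs.
predicted = statistics.mean([correlated_standard_error(r) for r in runs])
observed = statistics.stdev([statistics.mean(r) for r in runs])
ratio = predicted / observed
print(f"predicted SE = {predicted:.3f}, observed scatter = {observed:.3f}, ratio = {ratio:.2f}")
```

A ratio well below 1 would suggest that the estimator underestimates the true uncertainty, for example because the correlation time has been underestimated.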
 
 
 
1) Input topology and parameter files in a number of different formats:
 
 
 
* GROMOS
 
 
 
* CHARMM
 
 
 
* GROMACS
 
 
 
* AMBER
 
 
 
* NAMD
 
 
 
* DESMOND
 
 
 
* DL_POLY
 
 
 
* TINKER
 
 
 
* LAMMPS
 
 
 
2) Independent pre-equilibrated starting configurations for each system (40-100). We will specify initial box size, velocities, and positions.
 
 
 
3) Exact energies of the starting configurations to make it easy to verify input files for additional programs.
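A minimal sketch of how the reference energies might be used to verify input files for another program: compute the potential energy of each starting configuration with the new code and flag any configuration whose energy differs from the reference by more than a tolerance. The function name, the tolerance, and the numbers are hypothetical; an appropriate tolerance depends on units and on how each code evaluates energies.

```python
def compare_energies(reference, candidate, abs_tol=1e-3):
    """Compare per-configuration potential energies (same units, same order).

    Returns a list of (index, reference, candidate, difference) tuples for
    every configuration whose difference exceeds abs_tol.
    """
    if len(reference) != len(candidate):
        raise ValueError("energy lists must have the same length")
    mismatches = []
    for i, (ref, cand) in enumerate(zip(reference, candidate)):
        diff = cand - ref
        if abs(diff) > abs_tol:
            mismatches.append((i, ref, cand, diff))
    return mismatches

# Illustrative numbers only (not energies from any real test system).
reference_energies = [-41234.512, -41198.037, -41250.884]
candidate_energies = [-41234.5121, -41198.0368, -41249.900]

mismatches = compare_energies(reference_energies, candidate_energies)
for index, ref, cand, diff in mismatches:
    print(f"configuration {index}: reference {ref}, candidate {cand}, diff {diff:+.4f}")
```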
 
 
 
4) Results from a number of different methods (TI, BAR, WHAM, Wang-Landau recursion).
 
 
 
* TI
 
 
 
* BAR, EXP, MBAR:
 
 
 
* Wang-Landau:
 
 
 
* Transition Matrix approaches
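For orientation, two of the estimators listed above can be sketched in a few lines. The example below implements EXP (exponential averaging) and BAR for work values in units of kT, and applies them to synthetic Gaussian data constructed so that the true free energy difference is known. It is an illustration under those assumptions, not the analysis code used for any deposited result; the function names and the bisection bracket are hypothetical choices.

```python
import math
import random

def fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x > 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))

def exp_estimate(w_forward):
    """EXP estimate: dF = -ln <exp(-W)>, with W in units of kT."""
    w_min = min(w_forward)  # shift by the minimum to avoid overflow
    mean_exp = sum(math.exp(-(w - w_min)) for w in w_forward) / len(w_forward)
    return w_min - math.log(mean_exp)

def bar_estimate(w_forward, w_reverse, n_iter=60):
    """BAR estimate: solve Bennett's self-consistent equation by bisection."""
    m = math.log(len(w_forward) / len(w_reverse))

    def imbalance(df):
        lhs = sum(fermi(m + w - df) for w in w_forward)
        rhs = sum(fermi(-m + w + df) for w in w_reverse)
        return lhs - rhs  # increases monotonically with df

    lo, hi = -1000.0, 1000.0  # assumed bracket, in kT
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic Gaussian work values consistent with a known dF (in kT).
rng = random.Random(7)
df_true, sigma, n = 2.0, 1.0, 2000
w_f = [rng.gauss(df_true + 0.5 * sigma ** 2, sigma) for _ in range(n)]
w_r = [rng.gauss(-df_true + 0.5 * sigma ** 2, sigma) for _ in range(n)]

df_exp = exp_estimate(w_f)
df_bar = bar_estimate(w_f, w_r)
print(f"true dF = {df_true:.2f}, EXP = {df_exp:.3f}, BAR = {df_bar:.3f}")
```

EXP uses samples from one state only, whereas BAR combines samples from both states.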
 
 
 
5) Optimization of variables
 
 
 
5a) Equilibrium methods
 
 
 
* For all: Spacing of states, pathway
 
 
 
** TI: numerical integration methods
 
 
 
** BAR: no others
 
 
 
** MBAR: no others
 
 
 
** DCMBAR: size of blocks, approximations in the dimension reduction of control variates.
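To illustrate why the numerical integration method is listed as a variable for TI, the sketch below integrates the same set of dH/dL averages with the trapezoid rule and with Simpson's rule. The integrand 3*lambda^2 is a made-up test function with a known integral of 1, chosen only so the two quadrature errors can be compared; real dH/dL curves would come from simulation averages at each lambda state.

```python
def trapezoid(lambdas, dhdl):
    """Trapezoid rule; works for arbitrary (non-uniform) lambda spacing."""
    return sum(
        0.5 * (lambdas[i + 1] - lambdas[i]) * (dhdl[i] + dhdl[i + 1])
        for i in range(len(lambdas) - 1)
    )

def simpson_uniform(lambdas, dhdl):
    """Composite Simpson's rule; requires uniform spacing and an even
    number of intervals (an odd number of lambda points)."""
    n_intervals = len(lambdas) - 1
    if n_intervals < 2 or n_intervals % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of intervals")
    h = (lambdas[-1] - lambdas[0]) / n_intervals
    total = dhdl[0] + dhdl[-1]
    total += 4.0 * sum(dhdl[1:-1:2])  # odd-indexed interior points
    total += 2.0 * sum(dhdl[2:-1:2])  # even-indexed interior points
    return total * h / 3.0

# Hypothetical test integrand <dH/dL> = 3 * lambda^2; the exact integral is 1.
lambdas = [i / 10 for i in range(11)]
dhdl = [3.0 * lam * lam for lam in lambdas]

ti_trapezoid = trapezoid(lambdas, dhdl)
ti_simpson = simpson_uniform(lambdas, dhdl)
print(f"trapezoid = {ti_trapezoid:.6f}, Simpson = {ti_simpson:.6f}, exact = 1.000000")
```

With the same 11 states, the trapezoid rule carries a quadrature bias on this curved integrand, while Simpson's rule integrates it exactly; with noisy simulation averages the trade-off between bias and variance can differ.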
 
 
 
5b) Equilibrium-at-limit methods
 
 
 
* For all: Spacing of states, pathway, MC or Gibbs sampling step type used
 
 
 
** Wang-Landau: The degree of flatness for decrementing the weight step, the magnitude of the weight step
 
 
 
** Transition matrix approaches: The transition kernel used
 

Latest revision as of 09:55, 10 August 2016

Purpose of Test Sets

One of the biggest challenges to carefully validating and comparing free energy methods is defining and sharing well-defined test cases (molecular systems and force field parameters) with reliably known numerical results. If one is not sure of the value of the free energy dictated by the energy model and other physical parameters, it is impossible to make fine comparisons among methods. Additionally, different programs with different bookkeeping, or parameters that have been rounded in some way, can cause legitimate small differences between computed free energies, obscuring differences in the methods. The goal of this Repository is to help define and disseminate a stable set of test systems of varied nature and complexity for use by the free energy simulation community. Note that the free energies provided by these systems may not agree particularly well with experiment, but this is not necessary, because the purpose here is to test the numerical performance of the methods.

To join a mailing list for a discussion of protein-ligand binding benchmarks, email michael.shirts at virginia.edu. If you have signed up previously, you can log into the discussion (password protected) at https://collab.itc.virginia.edu/portal/xlogin

Specifications of the content of binding benchmark tests

Current standards version is 0.5, dated Sept 27, 2013

There will be three types of depositions for the binding benchmark test sets:

All tests consist of a system specification and at least one potential energy result from a specified software version. After that, multiple people can contribute free energy results for the same system specification and potential energy result, or contribute potential energy results of the system for different simulation codes. They also might propose a new potential energy result based on their own preferences for simulations of the system (different cutoffs, etc). Importantly, the "free energy results" should be an attempt to be independent of any such nonphysical approximations.

Test Sets

Small Molecule Solvation Benchmark Sets

  • The Simple Small Molecule Solvation Benchmark Test Set: This test set was designed to test methods for computing hydration free energies of small molecules. It comprises a series of small molecules, parameter sets for three different software codes, and reference energies [1].
  • FreeSolv (Mobley) Hydration Set: This is an extensive (640+) molecule database of experimental and calculated hydration free energies for small neutral molecules. It includes GROMACS topology and coordinate files as well.

Host-Guest Binding

  • Cucurbit[7]uril with benzene (partial charges artificially set to zero). This tests binding of a nonpolar guest that encounters little barrier to exiting a rigid host.
  • Cucurbit[7]uril with guest B5 [2]. This tests binding of a bulky cationic guest that encounters a substantial energy barrier to exiting a rigid host.
  • Some guest binding beta-cyclodextrin. This would test binding to a much more flexible host.
  • Octa-acid with benzoic acid guest derivatives (from SAMPL4 and SAMPL5 blind prediction challenge)[3].

Protein-Ligand Binding

The following test systems were proposed at the 2012 Workshop on Free Energy Methods in Drug Design. One proposal would be to include 5-10 ligands. However, we should discuss how many ligands are needed for numerical evaluation of methods.

References

  1. Paliwal, H. and Shirts, M. R. (2011) An efficient method for the calculation of quantum mechanics/molecular mechanics free energies. J. Chem. Theory Comput. 7(12):4115-4134.
  2. Moghaddam,S., Yang,C., Rekharsky,M., Ko,Y.H., Kim,K., Inoue,Y., and Gilson,M.K. (2011) New Ultrahigh Affinity Host - Guest Complexes of Cucurbit[7]uril with Bicyclo[2.2.2]octane and Adamantane Guests: Thermodynamic Analysis and Evaluation of M2 Affinity Calculations. J.Am.Chem.Soc. 133:3570-3581.
  3. Olsson, M. A., Söderhjelm, P., Ryde U. (2016). J. Comp. Chem. 37:1589-1600.
  4. Boyce, S. E., Mobley, D. L., Rocklin, G. J., Graves, A. P., Dill, K. A. and Shoichet, B. K. (2009) Predicting ligand binding affinity with alchemical free energy methods in a polar model binding site. J. Mol. Biol. 394:747-763.
  5. Rocklin, G. J., Boyce, S. E., Fischer, M., Fish, I, Mobley, D. L., Shoichet, B. K., Dill, K. A. (2013) Blind prediction of charged ligand binding affinities in a model binding site. J. Mol. Biol. 425:4569-4583.
  6. Aldeghi, M., Heifetz, A., Bodkin, M. J., Knapp, S., Biggin, P.C (2016). Accurate calculation of the absolute free energy of binding for drug molecules. Chem Sci. 7:207-218.