Difference between revisions of "Test System Repository"

From AlchemistryWiki
 
(95 intermediate revisions by 6 users not shown)
Line 1: Line 1:
Direct editing is disabled for this page for clarity of presentation; please visit [http://www.alchemistry.org/wiki/index.php/Talk:Test_System_Repository the discussion page] to add comments.
+
=Purpose of Test Sets=
 +
One of the biggest challenges to carefully validating and comparing free energy methods is defining and sharing well-defined test cases (molecular systems and force field parameters) with reliably known numerical results. If one is not sure of the value of the free energy dictated by the energy model and other physical parameters, it is impossible to make fine comparisons among methods. Additionally, different programs with different bookkeeping, or parameters that have been rounded in some way, can cause legitimate small differences between computed free energies, obscuring differences in the methods. The goal of this Repository is to help define and disseminate a stable set of test systems of varied nature and complexity for use by the free energy simulation community. Note that the free energies provided by these systems may not agree particularly well with experiment, but this is not necessary, because the purpose here is to test the numerical performance of the methods.
  
One of the biggest problems in careful comparisons and validations of methods is the difficulty of trying to agree on a single number for the free energy. If one is not sure of the value of the free energy, fine comparisons of methods are very difficult. Additionally, different programs with different bookkeeping, or parameters that have been rounded in some way can have legitimate small differences in the free energies, obscuring differences in the methods.  
+
To join a mailing list for a discussion of protein-ligand binding benchmarks, email michael.shirts at virginia.edu. If you have signed up previously, you can log into the discussion (password protected) at https://collab.itc.virginia.edu/portal/xlogin
  
Ideally, test systems should have zero free energy, though this is not generally possible. With a zero transformation, it is necessary to partially specify the path, so that particularly clever methods do not solve the problem in ways that would not be valid in general simulations. Because of this, the zero transformations here are defined by the endpoints and one midpoint along the transformation. This ensures that a large transformation is performed, but allows these systems to be used to improve free energy pathways as well.
+
= Specifications of the content of binding benchmark tests =
  
Problem 1) Is the method at all valid for molecular systems?
+
''Current standards version is 0.5, dated Sept 27, 2013''
  
* System: Simplest molecular free energy system = UA methane in TIP3P water.
+
There will be three types of depositions for the binding benchmark test sets:
  
* Notes: There are no bond, angle, or torsion terms, and no solute/solvent charges: this is the simplest system that can be truly defined as realistic.
+
* [[System specifications]]
 +
* [[Potential energy results]]
 +
* [[Free energy results]]
  
Problem 2) Can the method handle water rearrangement around charges?
+
All tests consist of a system specification and at least one potential energy result from a specified software version. After that, multiple people can contribute free energy results for the same system specification and potential energy result, or contribute potential energy results of the system for different simulation codes. They might also propose a new potential energy result based on their own preferences for simulations of the system (different cutoffs, etc). Importantly, the "free energy results" should attempt to be independent of any such nonphysical approximations.
  
* System: Charged dipole on two LJ spheres tethered together. Currently testing +2/-2 charges, with 5 angstrom separation.  The intermediate state is neutral.
+
= Test Sets =
 +
== Small Molecule Solvation Benchmark Sets  ==
  
* Notes: This setup avoids computing free energies of ions directly, which many codes still do not handle completely correctly.
+
* [[The Simple Small Molecule Solvation Benchmark Test Set]]: This test set was designed to test methods for computing hydration free energies of small molecules. It comprises a series of small molecules, parameter sets for three different software codes, and reference energies {{Cite|Paliwal2011}}.
 +
* [http://www.escholarship.org/uc/item/6sd403pz FreeSolv (Mobley) Hydration Set]: This is an extensive database (640+ molecules) of experimental and calculated hydration free energies for small neutral molecules. It also includes GROMACS topology and coordinate files.
  
Problem 3) Can the free energy method handle multiple atomic sites efficiently?
+
== Host-Guest Binding ==
 +
* Cucurbit[7]uril with benzene (partial charges artificially set to zero).  This tests binding of a nonpolar guest that encounters little barrier to exiting a rigid host.
 +
* Cucurbit[7]uril with guest B5 {{Cite|Moghaddam2011}}. This tests binding of a bulky cationic guest that encounters a substantial energy barrier to exiting a rigid host.
 +
* Some guest binding beta-cyclodextrin.  This would test binding to a much more flexible host.
 +
* Octa-acid with benzoic acid guest derivatives (from the SAMPL4 and SAMPL5 blind prediction challenges){{Cite|Olsson2016}}.
  
* System: Anthracene solvation in TIP3P water
+
== Protein-Ligand Binding ==
  
* Notes: No ligand degrees of freedom to complicate the analysis. Null transforms are possible, but these have proven particularly difficult to implement in different simulation packages. We are deciding whether this system could also be used for the anthracene->benzene->anthracene transformation.
+
The following test systems were proposed at the [[2012_Workshop_on_Free_Energy_Methods_in_Drug_Design| 2012 Workshop on Free Energy Methods in Drug Design]]. One proposal would be to include 5-10 ligands. However, we should discuss how many ligands are needed for numerical evaluation of methods.
  
 +
* T4 Lysozyme, polar and apolar sites (methods should be able to get this). [[Media:Minimal.tar.gz|GROMACS format minimal set of input files]]{{Cite|Boyce2009}}. (The minimal set is probably adequate for most purposes; a full set of topology/coordinate files for this set is available [https://dl.dropboxusercontent.com/u/3409095/paper_support/fullL99AM102Q.tar.gz here (511MB)].)
 +
* FKBP (rock solid, well-studied). Files in both GROMACS format and DESMOND-compatible .cms files, validated to give equivalent energies (up to energy calculation method differences)
 +
**  [[Media:FKBP_AMBER_GAFF.tgz|AMBER parameterized input files in GROMACS format]]
 +
**  [[Media:FKBP_desmond.tgz|The same input parameters converted into DESMOND format]]
 +
* Trypsin (well studied, with potential issues in sampling and charges that it would be good for people to take a swing at)
 +
* DNA gyrase (from Vertex's data collection curated by Richard Dixon).
 +
* CCP model binding site{{Cite|Rocklin2013}}. [[Media:CCP.zip|GROMACS format minimal set of input files]].
 +
* Absolute binding free energies of diverse ligands to the bromodomain BRD4{{Cite|Aldeghi2016}}. Download a complete zip from: [http://dx.doi.org/10.5281/zenodo.57131 http://dx.doi.org/10.5281/zenodo.57131].
  
Future problems to tackle:
+
=References=
 
+
<references>
Problem 4) Can the method handle long time scale barriers along torsional degrees of freedom?
+
{{Cite|Paliwal2011|Paliwal, H. and Shirts, M. R. (2011) A benchmark test set for alchemical free energy transformations and its use to quantify error in common free energy methods. J. Chem. Theory Comput. 7(12):4115-4134.|http://www.citeulike.org/group/14929/article/10029023}}
 
+
{{Cite|Moghaddam2011|Moghaddam,S., Yang,C., Rekharsky,M., Ko,Y.H., Kim,K., Inoue,Y., and Gilson,M.K. (2011) New Ultrahigh Affinity Host - Guest Complexes of Cucurbit[7]uril with Bicyclo[2.2.2]octane and Adamantane Guests: Thermodynamic Analysis and Evaluation of M2 Affinity Calculations. J.Am.Chem.Soc. 133:3570-3581.}}
* Potential future system: 1-octanol -> ethane -> 1-octanol in TIP3P water.
+
{{Cite|Boyce2009|Boyce, S. E., Mobley, D. L., Rocklin, G. J., Graves, A. P., Dill, K. A. and Shoichet, B. K. (2009) Predicting ligand binding affinity with alchemical free energy methods in a polar model binding site. J. Mol. Biol. 394:747-763.}}
 
+
{{Cite|Rocklin2013|Rocklin, G. J., Boyce, S. E., Fischer, M., Fish, I, Mobley, D. L., Shoichet, B. K., Dill, K. A. (2013) Blind prediction of charged ligand binding affinities in a model binding site. J. Mol. Biol. 425:4569-4583.}}
* Notes: Topologically, the system would be set up as HO-(CH2)14-OH, with the middle two carbons remaining coupled to the environment for the entire transformation. The h-bonds between the alcohols and water might slow down the torsional sampling.
+
{{Cite|Aldeghi2016|Aldeghi, M., Heifetz, A., Bodkin, M. J., Knapp, S., Biggin, P.C (2016).  Accurate calculation of the absolute free energy of binding for drug molecules.  Chem Sci. 7:207-218.}}
 
+
{{Cite|Olsson2016| Olsson, M. A., Söderhjelm, P., Ryde U. (2016). J. Comp. Chem. 37:1589-1600.}}
Problem 5) Can the method handle the complications caused by putting everything together in complex systems?
+
</references>
 
 
* Potential Future System: A complicated substituted aromatic such as Imatinib, with three substituted positions; the transformation would cycle the substituents to different positions around the aromatic ring, with benzene as the intermediate.
 
 
 
Estimators of the uncertainty should be validated against the uncertainty computed directly from runs started from independent configurations (100), and should include computation of the correlation time of the observable (such as the potential energy differences or dH/dL) used to select the uncorrelated samples that enter the free energy calculation.
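One way such a validation could look is sketched below. It is only an illustration under stated assumptions: the observable is replaced by a synthetic AR(1) time series, the statistical inefficiency g = 1 + 2*tau is estimated by summing the autocorrelation function up to its first non-positive value, and the function names are hypothetical rather than taken from any package. The predicted standard error of each run is then compared with the spread of the means over 100 independent runs.

```python
import math
import random
import statistics


def statistical_inefficiency(series, max_lag=100):
    """Estimate g = 1 + 2*tau from the normalized autocorrelation function.

    The sum is truncated at the first non-positive autocorrelation value
    (a common heuristic, assumed here) or at max_lag.
    """
    n = len(series)
    mean = statistics.fmean(series)
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    if var == 0.0:
        return 1.0
    g = 1.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        c = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / ((n - lag) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - lag / n)
    return max(g, 1.0)


def estimated_standard_error(series, max_lag=100):
    """Standard error of the mean, corrected for time correlation."""
    g = statistical_inefficiency(series, max_lag)
    return math.sqrt(g * statistics.pvariance(series) / len(series))


def ar1_series(n, phi, rng):
    """Synthetic correlated observable (stand-in for dH/dL or energy differences)."""
    x = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - phi * phi))
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


rng = random.Random(2016)
runs = [ar1_series(1000, 0.8, rng) for _ in range(100)]

# Uncertainty observed directly from the 100 independent runs.
observed = statistics.stdev([statistics.fmean(r) for r in runs])
# Average uncertainty predicted from within each single run.
predicted = statistics.fmean([estimated_standard_error(r) for r in runs])
ratio = predicted / observed
print(f"observed {observed:.4f}  predicted {predicted:.4f}  ratio {ratio:.2f}")
```

A ratio close to 1 indicates that the single-run estimator is consistent with the run-to-run spread; a ratio well below 1 would suggest that the correlation time is being underestimated.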
 
 
 
1. Input topology and parameter files in a number of different formats:

* GROMACS
* AMBER
* DESMOND

We are interested in getting validated comparisons with the following simulation packages:

* GROMOS
* CHARMM
* NAMD
* DL_POLY
* TINKER
* LAMMPS
 
 
 
For each system, we have posted 100 starting configurations, specifying the initial box size and positions. We also list the exact energies of the starting configurations to make it easy to verify input files for additional programs.
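A check of this kind could be scripted roughly as follows. This is a minimal sketch: the function name, the tolerances, and the energy values are all illustrative assumptions and are not taken from the posted files.

```python
import math


def verify_energies(reference, computed, rel_tol=1e-6, abs_tol=1e-3):
    """Return (index, reference, computed) for every configuration that disagrees.

    reference: energies listed with the test system (same units as computed)
    computed:  energies recomputed by the program being validated
    The tolerances are hypothetical defaults; appropriate values depend on
    the precision and energy bookkeeping of the codes being compared.
    """
    if len(reference) != len(computed):
        raise ValueError("reference and computed must have the same length")
    return [
        (i, ref, comp)
        for i, (ref, comp) in enumerate(zip(reference, computed))
        if not math.isclose(ref, comp, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Made-up example values for three starting configurations.
reference = [-31250.125, -31198.750, -31305.500]
computed = [-31250.125, -31198.751, -31302.100]

mismatches = verify_energies(reference, computed)
for index, ref, comp in mismatches:
    print(f"configuration {index}: reference {ref}, computed {comp}")
```

Reporting the index of each failing configuration, rather than a single pass/fail flag, makes it easier to tell a systematic difference (for example a different cutoff treatment) from a problem with one input file.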
 
 
 
4) Results from a number of different methods (TI, BAR, WHAM, Wang-Landau recursion).

* TI
* BAR, EXP, MBAR:
* Wang-Landau:
* Transition Matrix approaches
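To make two of the estimators listed above concrete, the sketch below implements EXP (exponential averaging) and BAR for reduced energy differences in units of kT. It is an illustration under assumptions, not a reference implementation: the function names are hypothetical, the BAR self-consistent equation is solved by simple bisection, and the test data are synthetic Gaussian samples constructed so that the exact answer is known.

```python
import math
import random


def exp_estimate(w_forward):
    """EXP (Zwanzig): dF = -ln < exp(-w) >, using a log-sum-exp for stability."""
    m = min(w_forward)
    total = sum(math.exp(-(w - m)) for w in w_forward)
    return m - math.log(total / len(w_forward))


def _fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x >= 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar_estimate(w_forward, w_reverse, iterations=100):
    """BAR: solve the self-consistent equation for dF by bisection."""
    m = math.log(len(w_forward) / len(w_reverse))

    def residual(df):
        lhs = sum(_fermi(m + w - df) for w in w_forward)
        rhs = sum(_fermi(-m + w + df) for w in w_reverse)
        return lhs - rhs  # monotonically increasing in df

    lo, hi = -1.0, 1.0
    while residual(lo) > 0.0:
        lo *= 2.0
    while residual(hi) < 0.0:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic Gaussian data with a known answer (assumed test case):
# forward ~ N(dF + s^2/2, s^2), reverse ~ N(-dF + s^2/2, s^2).
rng = random.Random(7)
true_df, sigma, n = 3.0, 1.0, 5000
forward = [rng.gauss(true_df + 0.5 * sigma**2, sigma) for _ in range(n)]
reverse = [rng.gauss(-true_df + 0.5 * sigma**2, sigma) for _ in range(n)]

df_exp = exp_estimate(forward)
df_bar = bar_estimate(forward, reverse)
print(f"true {true_df:.3f}  EXP {df_exp:.3f}  BAR {df_bar:.3f}")
```

Because the exact free energy difference is known by construction, a data set like this plays the same role as the test systems on this page: any deviation reflects the estimator and the sampling, not the energy model.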
 
 
 
5) Optimization of variables

5a) Equilibrium methods

* For all: Spacing of states, pathway
** TI: numerical integration methods
** BAR: no others
** MBAR: no others
** DCMBAR: size of blocks, approximations in the dimension reduction of control variates.
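As an illustration of why the numerical integration method is a variable worth optimizing for TI, the sketch below integrates the same set of dH/dL averages with the trapezoid rule and with Simpson's rule. The integrand is an assumed analytic stand-in, 4*L^3, whose exact integral from 0 to 1 is 1; the function names are hypothetical.

```python
def trapezoid(lambdas, dhdl):
    """Trapezoid rule; works for arbitrarily spaced lambda values."""
    return sum(
        0.5 * (dhdl[i] + dhdl[i + 1]) * (lambdas[i + 1] - lambdas[i])
        for i in range(len(lambdas) - 1)
    )


def simpson_uniform(lambdas, dhdl):
    """Composite Simpson's rule; requires an odd number of evenly spaced points."""
    n = len(lambdas)
    if n < 3 or n % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number (>= 3) of points")
    h = (lambdas[-1] - lambdas[0]) / (n - 1)
    total = dhdl[0] + dhdl[-1]
    total += 4.0 * sum(dhdl[1:-1:2])  # odd interior points
    total += 2.0 * sum(dhdl[2:-1:2])  # even interior points
    return total * h / 3.0


# Five evenly spaced states and the assumed integrand <dH/dL> = 4 * L**3.
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
dhdl = [4.0 * lam**3 for lam in lambdas]

dg_trapezoid = trapezoid(lambdas, dhdl)
dg_simpson = simpson_uniform(lambdas, dhdl)
print(f"trapezoid {dg_trapezoid:.4f}  Simpson {dg_simpson:.4f}  exact 1.0000")
```

With identical samples, the trapezoid rule gives 1.0625 while Simpson's rule recovers the exact value for this cubic integrand, so the choice of quadrature alone can shift the reported free energy.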
 
 
 
5b) Equilibrium-at-limit methods

* For all: Spacing of states, pathway, MC or Gibbs sampling step type used
** Wang-Landau: The degree of flatness required before decrementing the weight step, the magnitude of the weight step
** Transition state approaches: The transition kernel used
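The two Wang-Landau variables named above (the flatness criterion and the magnitude of the weight step) can be seen in the toy sketch below. It is a simplified illustration under assumptions: the walker moves over a handful of discrete states whose reduced free energies are given directly, whereas in a real simulation the acceptance ratio would come from configuration energies. All names and default values are hypothetical.

```python
import math
import random


def wang_landau_weights(f_true, flatness=0.8, delta0=1.0, delta_min=1e-3,
                        check_every=1000, max_steps=5_000_000, seed=0):
    """Toy Wang-Landau recursion over discrete states.

    State k is sampled with probability proportional to exp(g[k] - f_true[k]).
    Each visit lowers g[k] by delta; when the visit histogram is flat enough
    (min >= flatness * mean), delta is halved and the histogram is reset.
    At convergence g[k] - g[0] approximates f_true[k] - f_true[0].
    """
    rng = random.Random(seed)
    n = len(f_true)
    g = [0.0] * n
    hist = [0] * n
    delta = delta0
    k = 0
    for step in range(1, max_steps + 1):
        j = k + rng.choice((-1, 1))  # propose a neighboring state
        if 0 <= j < n:
            log_ratio = (g[j] - f_true[j]) - (g[k] - f_true[k])
            if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
                k = j
        g[k] -= delta  # penalize the state just visited
        hist[k] += 1
        if step % check_every == 0:
            mean = sum(hist) / n
            if min(hist) >= flatness * mean:
                delta *= 0.5
                hist = [0] * n
                if delta < delta_min:
                    break
    return [gi - g[0] for gi in g]


f_true = [0.0, 1.5, 3.0, 2.0, 4.0]  # assumed reduced free energies (kT)
f_est = wang_landau_weights(f_true)
for k, (exact, est) in enumerate(zip(f_true, f_est)):
    print(f"state {k}: exact {exact:.2f}  estimated {est:.2f}")
```

Tightening the flatness criterion or lowering delta_min reduces the residual error in the weights at the cost of more sampling, which is the trade-off this list item refers to.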
 

Latest revision as of 09:55, 10 August 2016

Purpose of Test Sets

One of the biggest challenges to carefully validating and comparing free energy methods is defining and sharing well-defined test cases (molecular systems and force field parameters) with reliably known numerical results. If one is not sure of the value of the free energy dictated by the energy model and other physical parameters, it is impossible to make fine comparisons among methods. Additionally, different programs with different bookkeeping, or parameters that have been rounded in some way, can cause legitimate small differences between computed free energies, obscuring differences in the methods. The goal of this Repository is to help define and disseminate a stable set of test systems of varied nature and complexity for use by the free energy simulation community. Note that the free energies provided by these systems may not agree particularly well with experiment, but this is not necessary, because the purpose here is to test the numerical performance of the methods.

To join a mailing list for a discussion of protein-ligand binding benchmarks, email michael.shirts at virginia.edu. If you have signed up previously, you can log into the discussion (password protected) at https://collab.itc.virginia.edu/portal/xlogin

Specifications of the content of binding benchmark tests

Current standards version is 0.5, dated Sept 27, 2013

There will be three types of depositions for the binding benchmark test sets:

  • System specifications
  • Potential energy results
  • Free energy results

All tests consist of a system specification and at least one potential energy result from a specified software version. After that, multiple people can contribute free energy results for the same system specification and potential energy result, or contribute potential energy results of the system for different simulation codes. They might also propose a new potential energy result based on their own preferences for simulations of the system (different cutoffs, etc). Importantly, the "free energy results" should attempt to be independent of any such nonphysical approximations.

Test Sets

Small Molecule Solvation Benchmark Sets

  • The Simple Small Molecule Solvation Benchmark Test Set: This test set was designed to test methods for computing hydration free energies of small molecules. It comprises a series of small molecules, parameter sets for three different software codes, and reference energies [1].
  • FreeSolv (Mobley) Hydration Set: This is an extensive database (640+ molecules) of experimental and calculated hydration free energies for small neutral molecules. It also includes GROMACS topology and coordinate files.

Host-Guest Binding

  • Cucurbit[7]uril with benzene (partial charges artificially set to zero). This tests binding of a nonpolar guest that encounters little barrier to exiting a rigid host.
  • Cucurbit[7]uril with guest B5 [2]. This tests binding of a bulky cationic guest that encounters a substantial energy barrier to exiting a rigid host.
  • Some guest binding beta-cyclodextrin. This would test binding to a much more flexible host.
  • Octa-acid with benzoic acid guest derivatives (from the SAMPL4 and SAMPL5 blind prediction challenges)[3].

Protein-Ligand Binding

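The following test systems were proposed at the 2012 Workshop on Free Energy Methods in Drug Design. One proposal would be to include 5-10 ligands. However, we should discuss how many ligands are needed for numerical evaluation of methods.

  • T4 Lysozyme, polar and apolar sites (methods should be able to get this). GROMACS format minimal set of input files [4]. (The minimal set is probably adequate for most purposes; a full set of topology/coordinate files for this set is also available.)
  • FKBP (rock solid, well-studied). Files in both GROMACS format and DESMOND-compatible .cms files, validated to give equivalent energies (up to energy calculation method differences)
      • AMBER parameterized input files in GROMACS format
      • The same input parameters converted into DESMOND format
  • Trypsin (well studied, with potential issues in sampling and charges that it would be good for people to take a swing at)
  • DNA gyrase (from Vertex's data collection curated by Richard Dixon).
  • CCP model binding site [5]. GROMACS format minimal set of input files.
  • Absolute binding free energies of diverse ligands to the bromodomain BRD4 [6]. Download a complete zip from: http://dx.doi.org/10.5281/zenodo.57131.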

References

  1. Paliwal, H. and Shirts, M. R. (2011) A benchmark test set for alchemical free energy transformations and its use to quantify error in common free energy methods. J. Chem. Theory Comput. 7(12):4115-4134.
  2. Moghaddam,S., Yang,C., Rekharsky,M., Ko,Y.H., Kim,K., Inoue,Y., and Gilson,M.K. (2011) New Ultrahigh Affinity Host - Guest Complexes of Cucurbit[7]uril with Bicyclo[2.2.2]octane and Adamantane Guests: Thermodynamic Analysis and Evaluation of M2 Affinity Calculations. J.Am.Chem.Soc. 133:3570-3581.
  3. Olsson, M. A., Söderhjelm, P., Ryde U. (2016). J. Comp. Chem. 37:1589-1600.
  4. Boyce, S. E., Mobley, D. L., Rocklin, G. J., Graves, A. P., Dill, K. A. and Shoichet, B. K. (2009) Predicting ligand binding affinity with alchemical free energy methods in a polar model binding site. J. Mol. Biol. 394:747-763.
  5. Rocklin, G. J., Boyce, S. E., Fischer, M., Fish, I, Mobley, D. L., Shoichet, B. K., Dill, K. A. (2013) Blind prediction of charged ligand binding affinities in a model binding site. J. Mol. Biol. 425:4569-4583.
  6. Aldeghi, M., Heifetz, A., Bodkin, M. J., Knapp, S., Biggin, P.C (2016). Accurate calculation of the absolute free energy of binding for drug molecules. Chem Sci. 7:207-218.