Difference between revisions of "Test System Repository"

From AlchemistryWiki
 
(79 intermediate revisions by 6 users not shown)
Line 1: Line 1:
Direct editing is disabled for this page for clarity of presentation; please visit [http://www.alchemistry.org/wiki/index.php/Talk:Test_System_Repository the discussion page] to add comments:
+
=Purpose of Test Sets=
 +
One of the biggest challenges to carefully validating and comparing free energy methods is defining and sharing well-defined test cases (molecular systems and force field parameters) with reliably known numerical results. If one is not sure of the value of the free energy dictated by the energy model and other physical parameters, it is impossible to make fine comparisons among methods. Additionally, different programs with different bookkeeping, or parameters that have been rounded in some way, can cause legitimate small differences between computed free energies, obscuring differences in the methods. The goal of this Repository is to help define and disseminate a stable set of test systems of varied nature and complexity for use by the free energy simulation community. Note that the free energies provided by these systems may not agree particularly well with experiment, but this is not necessary, because the purpose here is to test the numerical performance of the methods.
  
One of the biggest problems in careful comparisons and validations of methods is the difficulty of trying to agree on a single number for the free energy. If one is not sure of the value of the free energy, fine comparisons of methods are very difficult. Additionally, different programs with different bookkeeping, or parameters that have been rounded in some way can have legitimate small differences in the free energies, obscuring differences in the methods.  
+
To join a mailing list for a discussion of protein-ligand binding benchmarks, email michael.shirts at virginia.edu. If you have signed up previously, you can log into the discussion (password protected) at https://collab.itc.virginia.edu/portal/xlogin
  
Ideally, test systems should have zero free energy, though this is not generally possible. With a zero transformation, it is necessary to partially specify the path, so that particularly clever methods do not solve the problem in ways that would not be valid in general simulations. Because of this, the zero transformations here are defined by the endpoints and one midpoint along the transformation. This ensures that a large transformation is performed, while still allowing these systems to be used to improve free energy pathways.
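A zero transformation defined through a midpoint can be checked by summing the free energies of its legs and comparing the total with the combined uncertainty. The following is a minimal sketch under stated assumptions: the function name, the (value, uncertainty) tuple layout, and the two-sigma threshold are hypothetical choices for illustration, and the legs are assumed to be statistically independent so that their uncertainties add in quadrature.

```python
import math


def cycle_closure(legs, n_sigma=2.0):
    """Check whether a closed cycle of free energy legs sums to zero.

    legs    : list of (free_energy, uncertainty) tuples, one per leg,
              e.g. endpoint -> midpoint and midpoint -> endpoint.
    n_sigma : hypothetical acceptance threshold in units of the
              combined uncertainty.
    Returns (total, combined_uncertainty, closes).
    """
    total = sum(dg for dg, _ in legs)
    # Independent legs are assumed, so uncertainties add in quadrature.
    sigma = math.sqrt(sum(err ** 2 for _, err in legs))
    return total, sigma, abs(total) <= n_sigma * sigma


# Illustrative numbers only, not results from this repository.
total, sigma, closes = cycle_closure([(3.10, 0.05), (-3.02, 0.05)])
print(f"cycle sum = {total:.3f} +/- {sigma:.3f}, closes: {closes}")
```

A cycle that fails this check indicates either a biased method or an underestimated uncertainty; the check alone cannot tell the two apart.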
+
= Specifications of the content of binding benchmark tests =
  
= Benchmark Test Set, v1.0 =
+
''Current standards version is 0.5, dated Sept 27, 2013''
  
Problem 1) Is the method at all valid for molecular systems?
+
There will be three types of depositions for the binding benchmark test sets:
  
* System: Simplest molecular free energy system = UA methane in TIP3P water.
+
* [[System specifications]]
 +
* [[Potential energy results]]
 +
* [[Free energy results]]
  
* Notes: There are no bond, angle, or torsion terms, and no solute-solvent charge interactions; this is the simplest system that can be truly defined as realistic.
+
All tests consist of a system specification and at least one potential energy result from a specified software version. After that, multiple people can contribute free energy results for the same system specification and potential energy result, or contribute potential energy results of the system for different simulation codes. They also might propose a new potential energy result based on their own preferences for simulations of the system (different cutoffs, etc). Importantly, the "free energy results" should aim to be independent of any such nonphysical approximations.
  
Problem 2) Can the method handle water rearrangement around charges?
+
= Test Sets =
 +
== Small Molecule Solvation Benchmark Sets  ==
  
* System: Charged dipole on two LJ spheres tethered together. The Lennard-Jones and bond length terms are taken from UA ethane, with +/- 1 charges.
+
* [[The Simple Small Molecule Solvation Benchmark Test Set]]: This test set was designed to test methods for computing hydration free energies of small molecules. It comprises a series of small molecules, parameter sets for three different software codes, and reference energies {{Cite|Paliwal2011}}.
 +
* [http://www.escholarship.org/uc/item/6sd403pz FreeSolv (Mobley) Hydration Set]: This is an extensive (640+) molecule database of experimental and calculated hydration free energies for small neutral molecules. It includes GROMACS topology and coordinate files as well.
  
* Notes: This setup allows avoidance of computing free energies of ions directly, which is still not handled completely correctly in many codes.
+
== Host-Guest Binding ==
 +
* Cucurbit[7]uril with benzene (partial charges artificially set to zero).  This tests binding of a nonpolar guest that encounters little barrier to exiting a rigid host.
 +
* Cucurbit[7]uril with guest B5 {{Cite|Moghaddam2011}}. This tests binding of a bulky cationic guest that encounters a substantial energy barrier to exiting a rigid host.
 +
* Some guest binding beta-cyclodextrin.  This would test binding to a much more flexible host.
 +
* Octa-acid with benzoic acid guest derivatives (from SAMPL4 and SAMPL5 blind prediction challenge){{Cite|Olsson2016}}.
  
Problem 3) Can the free energy method handle multiple atomic sites efficiently?
+
== Protein-Ligand Binding ==
  
* System: Anthracene solvation in TIP3P water
+
The following test systems were proposed at the [[2012_Workshop_on_Free_Energy_Methods_in_Drug_Design| 2012 Workshop on Free Energy Methods in Drug Design]]. One proposal would be to include 5-10 ligands. However, we should discuss how many ligands are needed for numerical evaluation of methods.
  
* Notes: No ligand degrees of freedom to complicate the analysis. Null transforms are possible, but these have proven particularly difficult to implement in different simulation packages. We are deciding whether this system could also be used for the anthracene -> benzene -> anthracene transformation.
+
* T4 Lysozyme, polar and apolar sites (methods should be able to get this). [[Media:Minimal.tar.gz|GROMACS format minimal set of input files]]{{Cite|Boyce2009}}. (A full set of topology/coordinate files for this set is also available, though the minimal set is probably adequate for most purposes. If desired the full set is available [https://dl.dropboxusercontent.com/u/3409095/paper_support/fullL99AM102Q.tar.gz here (511MB)])
 +
* FKBP (rock solid, well-studied). Files in both GROMACS format and DESMOND-compatible .cms files, validated to give equivalent energies (up to energy calculation method differences)
 +
**  [[Media:FKBP_AMBER_GAFF.tgz|AMBER parameterized input files in GROMACS format]]
 +
**  [[Media:FKBP_desmond.tgz|The same input parameters converted into DESMOND format]]
 +
* Trypsin (well studied, with potential sampling and charge issues that would be good for people to take a swing at)
 +
* DNA gyrase (from Vertex's data collection curated by Richard Dixon).
 +
* CCP model binding site{{Cite|Rocklin2013}}. [[Media:CCP.zip|GROMACS format minimal set of input files]].
 +
* Absolute free energies - Diverse-ligands to bromodomain BRD4{{Cite|Aldeghi2016}}.  Download a complete zip from: [http://dx.doi.org/10.5281/zenodo.57131 http://dx.doi.org/10.5281/zenodo.57131].
  
== Available Files ==
+
=References=
 
+
<references>
In each case, we have included 100 starting configurations for each system, specifying initial box size and positions, and all other input files. We also list the exact energies of the starting configurations to make it easy to verify input files for additional programs.
+
{{Cite|Paliwal2011|Paliwal, H. and Shirts, M. R. (2011) A benchmark test set for alchemical free energy transformations and its use to quantify error in common free energy methods. J. Chem. Theory Comput. 7(12): 4115-4134.|http://www.citeulike.org/group/14929/article/10029023}}
 
+
{{Cite|Moghaddam2011|Moghaddam,S., Yang,C., Rekharsky,M., Ko,Y.H., Kim,K., Inoue,Y., and Gilson,M.K. (2011) New Ultrahigh Affinity Host - Guest Complexes of Cucurbit[7]uril with Bicyclo[2.2.2]octane and Adamantane Guests: Thermodynamic Analysis and Evaluation of M2 Affinity Calculations. J.Am.Chem.Soc. 133:3570-3581.}}
* README File: [[File:README.pdf|README]]
+
{{Cite|Boyce2009|Boyce, S. E., Mobley, D. L., Rocklin, G. J., Graves, A. P., Dill, K. A. and Shoichet, B. K. (2009) Predicting ligand binding affinity with alchemical free energy methods in a polar model binding site. J. Mol. Biol. 394:747-763.}}
 
+
{{Cite|Rocklin2013|Rocklin, G. J., Boyce, S. E., Fischer, M., Fish, I, Mobley, D. L., Shoichet, B. K., Dill, K. A. (2013) Blind prediction of charged ligand binding affinities in a model binding site. J. Mol. Biol. 425:4569-4583.}}
* Comparisons of energies evaluated: [[File:energy_comparisons.tgz|Energy Comparisons]]
+
{{Cite|Aldeghi2016|Aldeghi, M., Heifetz, A., Bodkin, M. J., Knapp, S., Biggin, P.C (2016).  Accurate calculation of the absolute free energy of binding for drug molecules.  Chem Sci. 7:207-218.}}
 
+
{{Cite|Olsson2016| Olsson, M. A., Söderhjelm, P., Ryde U. (2016). J. Comp. Chem. 37:1589-1600.}}
* GROMACS Files [[File:GROMACS.tgz]]
+
</references>
 
 
* AMBER Files [[File:AMBER.tgz]]
 
 
 
* DESMOND Files [[File:DESMOND.tgz]]
 
 
 
= Improvements to the Benchmark Test Set =
 
 
 
== Extensions to other programs ==
 
 
 
We are interested in getting validated comparisons with the following programs:
 
 
 
* GROMOS
 
 
 
* CHARMM
 
 
 
* NAMD
 
 
 
* DL_POLY
 
 
 
* TINKER
 
 
 
* LAMMPS
 
 
 
== Future problems to tackle: Benchmark set v2.0 ==
 
 
 
Additional Problem 1) Can the method handle long time scale barriers along torsional degrees of freedom?
 
 
 
* Potential future system: 1-octanol -> ethane -> 1-octanol in TIP3P water.
 
 
 
* Notes: Topologically, the system would be set up as HO-(CH2)14-OH, with the middle two carbons remaining coupled to the environment for the entire transformation. The hydrogen bonds between the alcohols and water might slow down the torsional sampling.
 
 
 
Additional Problem 2) Can the method handle complications caused by putting everything together in complex systems?
 
 
 
* Potential future system: A complicated substituted aromatic such as imatinib, with three substituted positions, where the transformation cycles the substituents to different positions around the aromatic ring, with benzene as the intermediate.
 
 
 
Estimators of the uncertainty should be validated against the uncertainty obtained directly from runs started from independent configurations (100), and should include computation of the correlation time of the observable used to select the uncorrelated samples that enter the free energy calculation (such as the potential energy differences or dH/dL).
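The validation described above can be sketched as follows. This is a minimal illustration rather than a prescribed procedure: it assumes the observable (for example dH/dL) is available as a NumPy array, the function names are hypothetical, and the autocorrelation sum is truncated at the first non-positive value, which is one common heuristic among several.

```python
import numpy as np


def statistical_inefficiency(x):
    """Estimate g = 1 + 2*tau for a time series x.

    The normalized autocorrelation function is summed until it first
    becomes non-positive (a heuristic truncation, assumed here).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    dx = x - x.mean()
    var = np.dot(dx, dx) / n
    if var == 0.0:
        return 1.0  # a constant series carries no correlation information
    g = 1.0
    for t in range(1, n - 1):
        c = np.dot(dx[:-t], dx[t:]) / ((n - t) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - t / n)
    return max(g, 1.0)


def corrected_standard_error(x):
    """Standard error of the mean using n/g effectively uncorrelated samples."""
    x = np.asarray(x, dtype=float)
    g = statistical_inefficiency(x)
    return x.std(ddof=1) * np.sqrt(g / x.size)


def spread_of_independent_runs(run_means):
    """Sample standard deviation of estimates from independent runs."""
    return np.std(np.asarray(run_means, dtype=float), ddof=1)
```

If the estimator is reliable, `corrected_standard_error` evaluated on a single run should be comparable to `spread_of_independent_runs` evaluated over the estimates from the independent starting configurations.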
 

Latest revision as of 09:55, 10 August 2016

Purpose of Test Sets

One of the biggest challenges to carefully validating and comparing free energy methods is defining and sharing well-defined test cases (molecular systems and force field parameters) with reliably known numerical results. If one is not sure of the value of the free energy dictated by the energy model and other physical parameters, it is impossible to make fine comparisons among methods. Additionally, different programs with different bookkeeping, or parameters that have been rounded in some way, can cause legitimate small differences between computed free energies, obscuring differences in the methods. The goal of this Repository is to help define and disseminate a stable set of test systems of varied nature and complexity for use by the free energy simulation community. Note that the free energies provided by these systems may not agree particularly well with experiment, but this is not necessary, because the purpose here is to test the numerical performance of the methods.

To join a mailing list for a discussion of protein-ligand binding benchmarks, email michael.shirts at virginia.edu. If you have signed up previously, you can log into the discussion (password protected) at https://collab.itc.virginia.edu/portal/xlogin

Specifications of the content of binding benchmark tests

Current standards version is 0.5, dated Sept 27, 2013

There will be three types of depositions for the binding benchmark test sets:

All tests consist of a system specification and at least one potential energy result from a specified software version. After that, multiple people can contribute free energy results for the same system specification and potential energy result, or contribute potential energy results of the system for different simulation codes. They also might propose a new potential energy result based on their own preferences for simulations of the system (different cutoffs, etc). Importantly, the "free energy results" should aim to be independent of any such nonphysical approximations.
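When a potential energy result is contributed for a different simulation code, it can be compared component by component against the deposited reference. The sketch below assumes energies are stored as plain dictionaries keyed by component name; the function name, the component names, and the tolerances are hypothetical and would need to be chosen per system.

```python
import math


def compare_energies(reference, candidate, rel_tol=1e-5, abs_tol=1e-3):
    """Return the component names that disagree beyond tolerance.

    reference, candidate : dicts mapping component name -> energy,
                           in the same units.
    A component missing from the candidate counts as a mismatch.
    """
    mismatches = []
    for name, ref_value in reference.items():
        if name not in candidate:
            mismatches.append(name)
        elif not math.isclose(ref_value, candidate[name],
                              rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append(name)
    return mismatches


# Illustrative values only, not energies from this repository.
reference = {"bond": 10.0, "lj": -5.0}
candidate = {"bond": 10.00001, "lj": -5.2}
print(compare_energies(reference, candidate))
```

Comparing individual components rather than only the total makes it easier to trace a disagreement to a specific term, such as a cutoff treatment or a rounded parameter.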

Test Sets

Small Molecule Solvation Benchmark Sets

  • The Simple Small Molecule Solvation Benchmark Test Set: This test set was designed to test methods for computing hydration free energies of small molecules. It comprises a series of small molecules, parameter sets for three different software codes, and reference energies [1].
  • FreeSolv (Mobley) Hydration Set: This is an extensive (640+) molecule database of experimental and calculated hydration free energies for small neutral molecules. It includes GROMACS topology and coordinate files as well.

Host-Guest Binding

  • Cucurbit[7]uril with benzene (partial charges artificially set to zero). This tests binding of a nonpolar guest that encounters little barrier to exiting a rigid host.
  • Cucurbit[7]uril with guest B5 [2]. This tests binding of a bulky cationic guest that encounters a substantial energy barrier to exiting a rigid host.
  • Some guest binding beta-cyclodextrin. This would test binding to a much more flexible host.
  • Octa-acid with benzoic acid guest derivatives (from SAMPL4 and SAMPL5 blind prediction challenge)[3].

Protein-Ligand Binding

The following test systems were proposed at the 2012 Workshop on Free Energy Methods in Drug Design. One proposal would be to include 5-10 ligands. However, we should discuss how many ligands are needed for numerical evaluation of methods.

References

  1. Paliwal, H. and Shirts, M. R. (2011) A benchmark test set for alchemical free energy transformations and its use to quantify error in common free energy methods. J. Chem. Theory Comput. 7(12): 4115-4134.
  2. Moghaddam,S., Yang,C., Rekharsky,M., Ko,Y.H., Kim,K., Inoue,Y., and Gilson,M.K. (2011) New Ultrahigh Affinity Host - Guest Complexes of Cucurbit[7]uril with Bicyclo[2.2.2]octane and Adamantane Guests: Thermodynamic Analysis and Evaluation of M2 Affinity Calculations. J.Am.Chem.Soc. 133:3570-3581.
  3. Olsson, M. A., Söderhjelm, P., Ryde U. (2016). J. Comp. Chem. 37:1589-1600.
  4. Boyce, S. E., Mobley, D. L., Rocklin, G. J., Graves, A. P., Dill, K. A. and Shoichet, B. K. (2009) Predicting ligand binding affinity with alchemical free energy methods in a polar model binding site. J. Mol. Biol. 394:747-763.
  5. Rocklin, G. J., Boyce, S. E., Fischer, M., Fish, I, Mobley, D. L., Shoichet, B. K., Dill, K. A. (2013) Blind prediction of charged ligand binding affinities in a model binding site. J. Mol. Biol. 425:4569-4583.
  6. Aldeghi, M., Heifetz, A., Bodkin, M. J., Knapp, S., Biggin, P.C (2016). Accurate calculation of the absolute free energy of binding for drug molecules. Chem Sci. 7:207-218.