GROMACS 4.6 example: Direct ethanol solvation free energy
Free Energy Fundamentals |
---|
Methods of Free Energy Simulations
|
Free Energy How-to's |
---|
Setting up the calculation with Gromacs
There are nine intermediate states defined to perform the calculation of the solvation of ethanol. You will need to run nine different simulations. The will use the SAME .gro file (File:Ethanol.gro) , the SAME .top file, (File:Ethanol.mdp) and nine different .mdp (File:Ethanol.mdp) files. But the .mdp files will only be different by one line,
init_fep_state = X
Where X is 0 through 8, inclusive, because there are 9 states.
Running the calculation with Gromacs
Run grompp and mdrun as normal. Specifically:
grompp -f ethanol.X.mdp -c ethanol.gro -p ethanol.top -o ethanol.X.tpr -p $top -o $tpr -maxwarn 10
There may be some warnings, and you'll need to override, hence -maxwarn. X runs from 0 to 8.
The number of threads can be anything you want (less than the number run on the cluster) If running on the cluster, you will also want to set the -nopin flag. If you don't set the dhdl file independently, it will be saved to ethanol.X.xvg, and might get written over accidentally if you run g_energy. X again runs 0-8
mdrun -nt 1 -deffnm ethanol.X -dhdl ethanol.X.dhdl.xvg
Strictly, you will only need the dhdl files, but looking at the others output files can be useful to determine
Analyze the calculation with Gromacs
After you have generated the dhdl files, you can then analyze them using pymbar with the dhdl output.
The correct invocation is:
python alchemical-gromacs.py -f directory/prefix -t 300 -p 1 -n 2500 -v > outpufile
-t is temperature -p is pressure -f is (prefix of the files, including directory)
It will use all files it finds with the given prefix, and it will
assume they are numbered in order. We omit the first 500 ps (0.2 ps
per sample) for equilibration (likely overkill for this problem, but
better save than sorry).
You may see the following warnings:
Warning on use of the timeseries module: If the inherent timescales of the system are long compared to those being analyzed, this statistical inefficiency may be an underestimate. The estimate presumes the use of many statistically independent samples Tests should be performed to assess whether this condition is satisfied. Be cautious in the interpretation of the data
This is a standard warning -- the correlation time code will be an
underestimate if there are correlation times that are long with
respect to the simulation time. In this case, there aren't. With
proteins, there probably are.
/Users/mrshirts/work/papers/MBARFORDUMMIES/pymbar/trunk/pymbar/pymbar.py:308: RuntimeWarning: overflow encountered in exp log_f_R = - max_arg_R - numpy.log( numpy.exp(-max_arg_R) + numpy.exp(exp_arg_R - max_arg_R) ) - w_R /Users/mrshirts/work/papers/MBARFORDUMMIES/pymbar/trunk/pymbar/pymbar.py:494: RuntimeWarning: overflow encountered in exp fR = 1/(1+numpy.exp(w_R - C))
These warnings can pop up because we are using BAR initialization in the algorithm, which is a bit faster (not much). It will default to zeros, which is fine. As long as the MBAR calculation converges OK, these warnings can be ignored.