2014 Workshop Tuesday Lunchtime Discussion

Tuesday Afternoon Discussion: Making free energy calculations robust.
Discussion Leaders: John Chodera and Robert Abel
Can we develop post-simulation health reports? Did things work the way we thought? Are we getting the right answers for the right reasons?
- We should get the same answers when we run the same calculation in different programs
- Deviations from crystal structures should be examined
  - There is a database of bound ligands, but crystal structure quality (especially the ligand density) needs to be checked first
  - Although this gets ahead of a more basic question: did we check that the simulation worked correctly in the first place?
- For example, if you run with very few counterions (just enough to neutralize the system), you will probably undersample ion positions
  - Solution: don't run with only the minimal number of ions; use more
Should we worry that the protein is at a much higher concentration in the simulation than in solution experiments? (See the sketch after this list.)
- This is related to box-size dependence
- The ligand is also affected, since we are not truly at infinite dilution
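To make this concrete, here is a minimal sketch (plain Python; the function name is ours) of the effective concentration implied by one solute copy in a typical periodic box:

```python
# Minimal sketch: effective concentration of a solute in a periodic box.
AVOGADRO = 6.02214076e23  # molecules per mole

def effective_concentration_molar(n_copies: int, box_edge_nm: float) -> float:
    """Concentration (mol/L) of n_copies solute molecules in a cubic box."""
    box_volume_liters = (box_edge_nm ** 3) * 1e-24  # 1 nm^3 = 1e-24 L
    return n_copies / (AVOGADRO * box_volume_liters)

# One protein in a 10 nm cubic box sits at ~1.7 mM, orders of magnitude
# above typical micromolar-or-tighter assay conditions.
print(f"{effective_concentration_molar(1, 10.0) * 1e3:.2f} mM")
```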
We have a draft standards page on Alchemistry.org
- This was a brain dump, and we encourage people to contribute
- We should script this process and run the checks automatically (see the sketch after this list)
- We might think about generating a pretty report, similar to a medical health report
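As a hedged illustration of what such a scripted health report could look like, here is a minimal Python scaffold; the check names and thresholds are invented placeholders, not an existing tool:

```python
# Minimal sketch of a scripted post-simulation health report. The checks
# are placeholders; a real report would implement the diagnostics from
# the Alchemistry.org draft standards page.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str

def run_health_report(checks: List[Callable[[], CheckResult]]) -> None:
    results = [check() for check in checks]
    for r in results:
        status = "PASS" if r.passed else "FAIL"
        print(f"[{status}] {r.name}: {r.detail}")
    print(f"{sum(r.passed for r in results)}/{len(results)} checks passed")

# Example placeholder check: energy drift below some tolerance.
def energy_drift_check() -> CheckResult:
    drift_kj_per_ns = 0.02  # would be measured from the trajectory
    ok = abs(drift_kj_per_ns) < 0.1  # illustrative tolerance
    return CheckResult("energy drift", ok, f"{drift_kj_per_ns} kJ/mol/ns")

run_health_report([energy_drift_check])
```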
Before you can do good FEP, you need to do good simulations
A health check was built at one company, where people eyeballed the results
- Human judgment improved after the start of the project
How can we use previous results to get a sense of what is wrong?
Which parts of the active site move for ligands in a congeneric series?
If a part of the protein that we had not seen move before suddenly changes conformation, this is a red flag
- We would like to have simulations where we see events happen many times or never
  - If we see something slow occur only infrequently, such as a loop opening, this should raise a red flag
  - Look at the number of rotamer transitions: many is good, one or a few is bad (see the sketch after this list)
  - There might still be events that are NOT coupled to the binding site, so are they important, or can they be ignored?
  - We could use a Markov state model to detect the slow degrees of freedom
  - We could also look at the correlation between slow degrees of freedom and dU/dλ
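A minimal sketch of the rotamer-transition count (pure NumPy on synthetic data; the bin edges and the placeholder dU/dλ series are illustrative assumptions):

```python
import numpy as np

def count_rotamer_transitions(chi_deg: np.ndarray) -> int:
    """Coarse-grain a side-chain torsion into three rotamer bins and
    count bin-to-bin transitions along the trajectory."""
    states = np.digitize(np.mod(chi_deg, 360.0), [120.0, 240.0])
    return int(np.sum(states[1:] != states[:-1]))

rng = np.random.default_rng(0)
# Synthetic torsion trajectory: mostly trans with one brief excursion.
chi = np.concatenate([rng.normal(180, 10, 500),
                      rng.normal(60, 10, 20),
                      rng.normal(180, 10, 500)])
print(count_rotamer_transitions(chi), "transitions")  # one or a few: red flag

# The same state series can be correlated against dU/dlambda samples to ask
# whether a slow degree of freedom is coupled to the alchemical work:
du_dl = rng.normal(0, 1, chi.size)  # placeholder dU/dlambda time series
states = np.digitize(np.mod(chi, 360.0), [120.0, 240.0])
print(np.corrcoef(states, du_dl)[0, 1])
```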
Detecting ergodicity problems in the simulations we have run vs. gross errors vs. setup errors
- A pipeline could take a PDB file, spin up short or single-frame simulations in multiple packages, and compare the results; this would catch roughly 80% of errors
People can look for consistency across thermodynamic pathways (cycle closure)
- Build redundancy into workflows to add another layer of error checking
- Replicate the same simulation while varying what should not matter (e.g., different initial velocities); we should get the same result, but we often don't
  - Initial binding modes, water placement
  - Danny used slightly different preps of the same protein
- Run multiple replicates of the simulation (3-5 times) to get error bars (see the sketch after this list)
  - Use different starting points
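A minimal sketch of two of these consistency checks, cycle closure and replicate-based error bars (pure NumPy; the numbers are illustrative):

```python
import numpy as np

def cycle_closure_error(edge_ddgs: list) -> float:
    """Sum of relative free energies around a closed thermodynamic
    cycle; should be ~0 within statistical error."""
    return float(np.sum(edge_ddgs))

def replicate_estimate(dgs: list) -> tuple:
    """Mean and standard error from independent replicates (3-5 runs
    with different initial velocities/starting points)."""
    a = np.asarray(dgs, dtype=float)
    return float(a.mean()), float(a.std(ddof=1) / np.sqrt(a.size))

# Illustrative numbers (kcal/mol):
print(f"cycle closure: {cycle_closure_error([-1.2, 0.7, 0.6]):.2f}")
dg, err = replicate_estimate([-6.3, -5.8, -6.6, -6.1])
print(f"dG = {dg:.2f} +/- {err:.2f} kcal/mol")
```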
How much does this actually block or interrupt people?
- Small typos can cause half of one's simulations to fail
- There are an infinite number of ways things can go wrong
- Problems are clear to people who are at this for the tenth time, but harder for first-time users who are still learning
- This is also a problem for experienced users who, much later on, get bored and try to tweak something
- e.g., a solution-to-solution transformation should give ΔF = 0, but a bad starting point can fail even this check
- Having students fail actually teaches them, so failure may be useful
- But this may dissuade the broader community, who don't want to deal with it
- Even with checks in the software, what happens when a user sets a parameter to something physically wrong that the software still accepts?
  - There is a very subtle line between how much checking the software should do and how much feel the user should be expected to have
Maybe set an "expert flag"? (A minimal sketch follows this list.)
- Do we want the broader community to fiddle with fine-grained parameters like the barostat settings?
- Consider default settings, as in docking programs (or TurboTax)
  - TurboTax style: you enter a number, and it suggests what most users with that number do
  - A similar prompting system might be helpful for people
  - Some kind of default progression path
- Maybe tiers of interfaces, like Phenix for crystallography
- Tooltips would be useful during setup
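A minimal sketch of how such an expert flag might gate advanced parameters (all parameter names and defaults here are invented for illustration; no existing package is implied):

```python
# Minimal sketch: refuse non-default advanced parameters unless the user
# explicitly opts in. All names and defaults are invented placeholders.
ADVANCED_DEFAULTS = {"barostat_interval": 25, "constraint_tolerance": 1e-6}

def build_settings(expert: bool = False, **overrides):
    settings = dict(ADVANCED_DEFAULTS)
    for key, value in overrides.items():
        if key not in settings:
            raise KeyError(f"unknown parameter: {key}")
        if not expert:
            raise ValueError(
                f"{key} is an advanced parameter; pass expert=True "
                f"to override the default ({settings[key]})")
        settings[key] = value
    return settings

print(build_settings())                                   # defaults, fine
print(build_settings(expert=True, barostat_interval=5))   # explicit opt-in
```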
Equilibration and correlated samples
- Running longer: you may be biased toward stopping when you see an estimate you expect
Checks people should do (and it is annoying when they don't):
- Vary the initial conditions
  - Bound vs. unbound starting structure
  - Different binding modes
- Check the reverse-cumulative convergence of ΔG with time (a sketch of these checks follows this list)
- Check for energy/volume drift and the size of energy fluctuations
  - Use the right estimator (e.g., you can't use the Zwanzig formula if the work distribution is too broad)
- Check the RMSD of the protein and ligand
  - Is some value bad? Is a plateau okay? We don't know.
- Check whether the motions we expected to sample were actually sampled (e.g., loops)
  - Replica-exchange convergence
  - If each individual replica (which should visit all the states) converges to the same answer as all the data pooled together, that indicates convergence
  - Compute the time autocorrelation of the property we are interested in
    - Without it, we have no idea of our statistical hygiene
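Minimal sketches of three of the checks above: forward/reverse cumulative estimates, a width check on the Zwanzig (EXP) estimator, and an energy-drift fit. Pure NumPy on synthetic data; the ~2 kT width threshold is a common rule of thumb, not a hard limit:

```python
import numpy as np

KT = 0.596  # kcal/mol at ~300 K

def exp_free_energy(work_kcal: np.ndarray) -> float:
    """Zwanzig/EXP estimator: dG = -kT ln <exp(-W/kT)>. Unreliable when
    std(W) is much larger than ~1-2 kT (common rule of thumb)."""
    w = np.asarray(work_kcal, dtype=float)
    if w.std() > 2.0 * KT:
        print(f"WARNING: std(W) = {w.std():.2f} kcal/mol; EXP may be biased")
    x = -w / KT
    # log-sum-exp for numerical stability
    return float(-KT * (np.logaddexp.reduce(x) - np.log(w.size)))

def forward_reverse_curves(samples, fractions=(0.25, 0.5, 0.75, 1.0)):
    """Estimates from growing prefixes (forward) and suffixes (reverse);
    the two curves should meet and plateau if the data are converged."""
    n = samples.size
    fwd = [exp_free_energy(samples[: max(2, int(f * n))]) for f in fractions]
    rev = [exp_free_energy(samples[-max(2, int(f * n)):]) for f in fractions]
    return fwd, rev

def energy_drift(time_ns: np.ndarray, energy: np.ndarray) -> float:
    """Slope of a linear fit to total energy vs. time (should be ~0)."""
    return float(np.polyfit(time_ns, energy, 1)[0])

rng = np.random.default_rng(1)
work = rng.normal(2.0, 0.5, 1000)  # narrow work distribution (std ~ kT)
print(forward_reverse_curves(work))
t = np.linspace(0.0, 10.0, 200)
print(energy_drift(t, -5000.0 + 0.01 * t + rng.normal(0, 0.5, t.size)))
```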
We need a common place to converge on these practices.
We can also check our converged results against experiment if we have the resources for it. We need best practices that we all check, and things we need to check for our users.
Are we getting the right answers for the right reasons?
- do we get the same answer from different programs? different people? different protocols?
- deviation from X-ray structures may indicate problems? how to quantify?
- evaluate crystal quality beforehand, especially ligand density; automated?
- insufficient number of counterions; poor counterion sampling/decorrelation
- what parts of active site move for ligands in a congeneric series? if a part of the protein we hadn’t seen move before changes conformation, that is potentially a red flag
- infrequent events are bad for convergence; how can we detect them? number of rotamer transitions? MSMs? correlations with dV/dlambda?
- detecting ergodicity problems with simulations we’ve run vs. gross errors vs. setup errors
- compare different community simulation pipelines for same input?
- consistency in thermodynamic pathways / cycle closure / redundancy
- replication of the same simulation, varying things we think don't matter (e.g. velocities, initial binding modes, water placement, slightly different preps of same protein, bound vs unbound structure); check consistency
- error bar estimates from multiple replicates of same simulation
- “TurboTax” style warnings and sensible default recommendations
- “Phenix”-style choices for modeling, tailoring options to user
- tooltips are useful for users in setup
- reverse convergence of DeltaG with time
- energy drift or large fluctuations; check energy conservation
- using right method (e.g. can’t use Zwanzig formula if work distribution too large)
- RMSD of protein and ligand
- were motions we expected to sample actually sampled? (e.g. loops)
- time autocorrelation of property you are interested in (simple statistical hygiene)
- need at least N independent data points to have some idea of how things are behaving (a minimal sketch of this check follows)
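A minimal sketch of that statistical-hygiene check: estimate the statistical inefficiency g of a time series and report the effective number of independent samples (pure NumPy; truncating at the first non-positive correlation is one common convention):

```python
import numpy as np

def statistical_inefficiency(x: np.ndarray) -> float:
    """g = 1 + 2 * sum_t C(t): correlated samples per effectively
    independent one. Sum truncated at the first non-positive C(t)."""
    x = np.asarray(x, dtype=float)
    dx = x - x.mean()
    n = x.size
    g = 1.0
    for t in range(1, n - 1):
        c = np.dot(dx[:-t], dx[t:]) / ((n - t) * dx.var())
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - t / n)
    return g

rng = np.random.default_rng(2)
# Synthetic correlated series: AR(1) with correlation 0.9 per step.
x = np.zeros(5000)
for i in range(1, x.size):
    x[i] = 0.9 * x[i - 1] + rng.normal()
g = statistical_inefficiency(x)
print(f"g = {g:.1f}, N_eff = {x.size / g:.0f} of {x.size}")
```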