Commit e0bb471

IAlibay and atravitz authored
Adding details about --n-protocol-repeats to CLI tutorial (#213)
* Start adding details about --n-protocol-repeats
* Update rbfe_tutorial/cli_tutorial.md
* Apply suggestions from code review
* update result tree
* update language
* updating output example
* =1 -> 1

---------

Co-authored-by: Alyssa Travitz <31974495+atravitz@users.noreply.github.com>
Co-authored-by: Alyssa Travitz <alyssa.travitz@omsf.io>
1 parent cdc7c72 commit e0bb471

1 file changed

+67
-42
lines changed

rbfe_tutorial/cli_tutorial.md

Lines changed: 67 additions & 42 deletions
@@ -38,7 +38,7 @@ running the simulations and gathering the results are the same.
3838
With the single command:
3939

4040
```bash
41-
openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup
41+
openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1
4242
```
4343

4444
we do the following:
@@ -49,6 +49,11 @@ we do the following:
4949
- Pass a PDB of the protein target (TYK2) with `-p tyk2_protein.pdb`.
5050
- Instruct `openfe` to output files into a directory called `network_setup`
5151
with the `-o network_setup` option.
52+
- Instruct `openfe` to only run one repeat of the alchemical simulation per
53+
`quickrun` call using `--n-protocol-repeats 1`.
54+
**Note:** `openfe`'s default behaviour is to use three
55+
repeats to calculate the uncertainty (i.e. standard deviation) in an estimate. When
56+
setting `--n-protocol-repeats 1`, you must execute the transformation multiple times - at minimum 2, but best practice is 3 independent repeats.
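
To see why a single execution is not enough, consider how a standard deviation is obtained from independent repeats. The sketch below is illustrative only: the three estimates are made-up numbers in arbitrary units, and the sample standard deviation (dividing by n - 1) is an assumption here, not necessarily the exact estimator `openfe` uses. With only one value the divisor would be zero, so no spread can be computed.

```bash
# Hypothetical free energy estimates from three independent repeats
# (made-up values, arbitrary units).
estimates="-1.20 -0.90 -1.05"

# Compute the mean and the sample standard deviation (n - 1 divisor).
stats=$(printf '%s\n' $estimates | awk '
  { sum += $1; sumsq += $1 * $1; n++ }
  END {
    if (n < 2) { print "need at least 2 repeats"; exit 1 }
    mean = sum / n
    var = (sumsq - n * mean * mean) / (n - 1)
    if (var < 0) var = 0   # guard against tiny negative rounding error
    printf "%.3f %.3f", mean, sqrt(var)
  }')

echo "mean and sample std dev: $stats"
```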
5257

5358
Planning the campaign may take some time due to the complex series of tasks involved:
5459

@@ -61,7 +66,7 @@ The partial charge generation can take advantage of multiprocessing which offers
6166
the number of processors available using the `-n` flag:
6267

6368
```bash
64-
openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup -n 4
69+
openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1 -n 4
6570
```
6671

6772
This will result in a directory called `network_setup/`, which is structured like this:
@@ -80,7 +85,7 @@ network_setup
8085
├── rbfe_lig_ejm_31_complex_lig_ejm_50_complex.json
8186
├── rbfe_lig_ejm_31_solvent_lig_ejm_42_solvent.json
8287
├── rbfe_lig_ejm_31_solvent_lig_ejm_46_solvent.json
83-
[continues]
88+
...
8489
```
8590

8691
The `ligand_network.graphml` file describes the atom mappings between the
@@ -146,7 +151,7 @@ partial_charge:
146151
2. Plan your rbfe network with an additional `-s` flag for passing the settings:
147152

148153
```bash
149-
openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup -s settings.yaml
154+
openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1 -s settings.yaml
150155
```
151156

152157
3. The output of the CLI program will now reflect the changes made:
@@ -156,17 +161,19 @@ RBFE-NETWORK PLANNER
156161
______________________
157162
158163
Parsing in Files:
159-
Got input:
160-
Small Molecules: SmallMoleculeComponent(name=lig_ejm_54) SmallMoleculeComponent(name=lig_jmc_23) SmallMoleculeComponent(name=lig_ejm_47) SmallMoleculeComponent(name=lig_jmc_27) SmallMoleculeComponent(name=lig_ejm_46) SmallMoleculeComponent(name=lig_ejm_31) SmallMoleculeComponent(name=lig_ejm_42) SmallMoleculeComponent(name=lig_ejm_50) SmallMoleculeComponent(name=lig_ejm_45) SmallMoleculeComponent(name=lig_jmc_28) SmallMoleculeComponent(name=lig_ejm_55) SmallMoleculeComponent(name=lig_ejm_43) SmallMoleculeComponent(name=lig_ejm_48)
161-
Protein: ProteinComponent(name=)
162-
Cofactors: []
163-
Solvent: SolventComponent(name=O, Na+, Cl-)
164+
Got input:
165+
Small Molecules: SmallMoleculeComponent(name=lig_ejm_31) SmallMoleculeComponent(name=lig_ejm_42) SmallMoleculeComponent(name=lig_ejm_43) SmallMoleculeComponent(name=lig_ejm_46) SmallMoleculeComponent(name=lig_ejm_47) SmallMoleculeComponent(name=lig_ejm_48) SmallMoleculeComponent(name=lig_ejm_50) SmallMoleculeComponent(name=lig_jmc_23) SmallMoleculeComponent(name=lig_jmc_27) SmallMoleculeComponent(name=lig_jmc_28)
166+
Protein: ProteinComponent(name=)
167+
Cofactors: []
168+
Solvent: SolventComponent(name=O, Na+, Cl-)
164169
165170
Using Options:
166-
Mapper: <kartograf.atom_mapper.KartografAtomMapper object at 0x7fea079de790>
167-
Mapping Scorer: <function default_lomap_score at 0x7fea1b423d80>
168-
Networker: functools.partial(<function generate_maximal_network at 0x7fea18371260>)
169-
Partial Charge Generation: am1bccelf10
171+
Mapper: <LomapAtomMapper (time=20, threed=True, max3d=1.0, element_change=True, seed='', shift=False)>
172+
Mapping Scorer: <function default_lomap_score at 0x166bc5300>
173+
Network Generation: <function generate_minimal_spanning_network at 0x16a413e20>
174+
Partial Charge Generation: am1bcc
175+
176+
n_protocol_repeats=1 (1 simulation repeat(s) per transformation)
170177
```
171178

172179
That concludes the straightforward process of tailoring your OpenFE setup to your specifications.
@@ -214,7 +221,7 @@ where `path/to/transformation.json` is the path to one of the files created abov
214221

215222
When running a complete network of simulations, it is important to ensure that
216223
the file name for the result JSON and name of the working directory are
217-
different for each leg, otherwise you'll overwrite results. We recommend doing
224+
different for each leg and each repeat, otherwise you'll overwrite results. We recommend doing
218225
that with something like the following, which uses the fact that the JSON files
219226
in `network_setup/transformations/` have unique names, and creates directories
220227
and result JSON files based on those names. To run all legs sequentially (not
@@ -225,7 +232,10 @@ recommended) you could do something like:
225232
for file in network_setup/transformations/*.json; do
226233
relpath=${file:30} # strip off "network_setup/transformations/"
227234
dirpath=${relpath%.*} # strip off final ".json"
228-
openfe quickrun $file -o results/$relpath -d results/$dirpath
235+
# loop over three repeats
236+
for repeat in {1..3}; do
237+
openfe quickrun $file -o results/repeat${repeat}/$relpath -d results/repeat${repeat}/$dirpath
238+
done
229239
done
230240
```
231241

@@ -241,10 +251,12 @@ and submit a job script for the simplest SLURM use case:
241251
for file in network_setup/transformations/*.json; do
242252
relpath=${file:30} # strip off "network_setup/transformations/"
243253
dirpath=${relpath%.*} # strip off final ".json"
244-
jobpath="network_setup/transformations/${dirpath}.job"
245-
cmd="openfe quickrun $file -o results/$relpath -d results/$dirpath"
246-
echo -e "#!/usr/bin/env bash\n${cmd}" > $jobpath
247-
sbatch $jobpath
254+
for repeat in {1..3}; do
255+
jobpath="network_setup/transformations/${dirpath}_${repeat}.job"
256+
cmd="openfe quickrun $file -o results/repeat${repeat}/$relpath -d results/repeat${repeat}/$dirpath"
257+
echo -e "#!/usr/bin/env bash\n${cmd}" > $jobpath
258+
sbatch $jobpath
259+
done
248260
done
249261
```
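
Before submitting a whole network it can help to generate one job script and inspect it. The following is a minimal sketch under stated assumptions: the temporary output directory and the choice of a single transformation file and repeat are illustrative, and `${file##*/}` is swapped in for the tutorial's `${file:30}` so the result does not depend on the length of the directory prefix (for this path both give the same name). `printf` is used instead of `echo -e` because its escape handling is consistent across shells. Nothing is submitted.

```bash
# Illustrative scratch location for the generated job script (assumption).
outdir=$(mktemp -d)

# One transformation file, named as in the tutorial's example trees.
file="network_setup/transformations/rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json"
relpath=${file##*/}     # strip everything up to the last "/"
dirpath=${relpath%.*}   # strip off final ".json"
repeat=1

jobpath="${outdir}/${dirpath}_${repeat}.job"
cmd="openfe quickrun $file -o results/repeat${repeat}/$relpath -d results/repeat${repeat}/$dirpath"

# Write a two-line job script: shebang, then the quickrun command.
printf '#!/usr/bin/env bash\n%s\n' "$cmd" > "$jobpath"

# Inspect the script; once it looks right it could be passed to sbatch.
cat "$jobpath"
```

Because the repeat number appears in the job file name, the result JSON path, and the working directory, no two repeats write to the same location.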
250262

@@ -273,29 +285,42 @@ openfe
273285

274286
```text
275287
results
276-
├── rbfe_lig_ejm_31_complex_lig_ejm_42_complex
277-
│   ├── shared_RelativeHybridTopologyProtocolUnit-3ea82011-75f0-4bb6-b415-e7d05bd012f6
278-
│   │   ├── checkpoint.nc
279-
│   │   └── simulation.nc
280-
│   ├── shared_RelativeHybridTopologyProtocolUnit-5262feb6-cb50-4bb2-90a2-359810c2bb9c
281-
│   │   ├── checkpoint.nc
282-
│   │   └── simulation.nc
283-
│   └── shared_RelativeHybridTopologyProtocolUnit-7a6def34-2967-4452-8d47-483bc7219c06
284-
│   ├── checkpoint.nc
285-
│   └── simulation.nc
286-
├── rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json
287-
├── rbfe_lig_ejm_31_complex_lig_ejm_46_complex
288-
│   ├── shared_RelativeHybridTopologyProtocolUnit-ad113e55-5636-474e-9be3-ee77fe887e77
289-
│   │   ├── checkpoint.nc
290-
│   │   └── simulation.nc
291-
│   ├── shared_RelativeHybridTopologyProtocolUnit-ca74ad3c-2ac8-4961-be7c-fa802a1ec76b
292-
│   │   ├── checkpoint.nc
293-
│   │   └── simulation.nc
294-
│   └── shared_RelativeHybridTopologyProtocolUnit-f848e671-fdd3-4b8d-8bd2-6eb5140e3ed3
295-
│   ├── checkpoint.nc
296-
│   └── simulation.nc
297-
├── rbfe_lig_ejm_31_complex_lig_ejm_46_complex.json
298-
[continues]
288+
├── replicate_0
289+
│   ├── rbfe_lig_ejm_31_complex_lig_ejm_42_complex
290+
│   │   ├── shared_RelativeHybridTopologyProtocolUnit-79c279f04ec84218b7935bc0447539a9_attempt_0
291+
│   │   │   ├── checkpoint.nc
292+
│   │   │   ├── db.json
293+
│   │   │   ├── simulation_real_time_analysis.yaml
294+
│   │   │   └── simulation.nc
295+
│   │   ├── shared_RelativeHybridTopologyProtocolUnit-a3cef34132aa4e9cbb824fcbcd043b0e_attempt_0
296+
│   │   │   ├── checkpoint.nc
297+
│   │   │   ├── db.json
298+
│   │   │   ├── simulation_real_time_analysis.yaml
299+
│   │   │   └── simulation.nc
300+
│   │   └── shared_RelativeHybridTopologyProtocolUnit-abb2b104151c45fc8b0993fa0a7ee0af_attempt_0
301+
│   │       ├── checkpoint.nc
302+
│   │       ├── db.json
303+
│   │       ├── simulation_real_time_analysis.yaml
304+
│   │       └── simulation.nc
305+
│   ├── rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json
306+
│   ├── rbfe_lig_ejm_31_complex_lig_ejm_46_complex
307+
│   │   ├── shared_RelativeHybridTopologyProtocolUnit-361500fe831c431aa830efd207db0955_attempt_0
308+
│   │   │   ├── checkpoint.nc
309+
│   │   │   ├── db.json
310+
│   │   │   ├── simulation_real_time_analysis.yaml
311+
│   │   │   └── simulation.nc
312+
│   │   ├── shared_RelativeHybridTopologyProtocolUnit-5a6176cfbf074f92bc76caac91b1c1bf_attempt_0
313+
│   │   │   ├── checkpoint.nc
314+
│   │   │   ├── db.json
315+
│   │   │   ├── simulation_real_time_analysis.yaml
316+
│   │   │   └── simulation.nc
317+
│   │   └── shared_RelativeHybridTopologyProtocolUnit-e16de73f07964e9096f34611e0c874ca_attempt_0
318+
│   │       ├── checkpoint.nc
319+
│   │       ├── db.json
320+
│   │       ├── simulation_real_time_analysis.yaml
321+
│   │       └── simulation.nc
322+
│   ├── rbfe_lig_ejm_31_complex_lig_ejm_46_complex.json
323+
...
299324
```
300325

301326
The JSON results file contains not only the calculated $\Delta G$, and
