From ba8add62b3e0d99467bca0bf355c02a0304a04db Mon Sep 17 00:00:00 2001 From: Anna Kravchenko Date: Tue, 13 Jan 2026 08:30:21 +0100 Subject: [PATCH 1/4] upd index, md and docking --- education/molmod_online/docking.md | 35 ++----------------------- education/molmod_online/index.md | 2 +- education/molmod_online/simulation.md | 37 +++++++++++++++++---------- 3 files changed, 26 insertions(+), 48 deletions(-) diff --git a/education/molmod_online/docking.md b/education/molmod_online/docking.md index 21d233fd..c067aa6c 100644 --- a/education/molmod_online/docking.md +++ b/education/molmod_online/docking.md @@ -204,39 +204,8 @@ solely on the evolutionary conservation analysis? ### Predicting interface residues -Besides sequence conservation, other features can be used to predict possible interfaces on protein -structures. For example, certain residues tend to be overrepresented at protein-protein interfaces. -This information, combined with evolutionary conservation and with a surface clustering algorithm -that finds groups of surface residues meeting both the previous criteria results in reasonably -accurate predictions. This is the basis of the -[WHISCY](https://wenmr.science.uu.nl/whiscy/){:target="_blank"} server. A more advanced -predictor, the [CPORT](https://alcazar.science.uu.nl/services/CPORT/){:target="_blank"} web server, judiciously -combines (up to) 6 different predictors to provide a consensus prediction that is more robust and -more reliable than any of the individual predictors alone. CPORT was designed to provide -predictions for HADDOCK. The server also returns a PDB file of the -original structure loaded with the predictions in the temperature factor column. This is extremely -helpful to visualize the predictions in PyMOL. - - - Submit the homology model of mouse MDM2 to the CPORT web server and load the resulting PDB file -in Pymol. - - - spectrum b, cyan_red, cport - - - Do the predictions highlight a particular region of the homology model? 
- - - Note down the list of residues predicted by CPORT to be part of an interface. - - -Many tools in science are developed by dedicated PhD students and postdocs. Unfortunately, over time, some of these tools may become unavailable as maintaining and supporting them requires significant time and effort. In such cases, it may be necessary to transition to alternative tools. - -### Obtain known interfaces of homologous proteins - -Another way to obtain information about possible interface residues is by analysing known interfaces found in **homologous** proteins. -This can easily be performed by [ARCTIC-3D](https://wenmr.science.uu.nl/arctic3d/){:target="_blank"}, a [new tool](https://www.nature.com/articles/s42003-023-05718-w){:target="_blank"} dedicated to an automatic retrieval and clustering of interfaces in complexes from 3D structural information. +Besides sequence conservation, another way to obtain information about possible interface residues is by analysing known interfaces found in **homologous** proteins. +This can easily be performed by [ARCTIC-3D](https://wenmr.science.uu.nl/arctic3d/){:target="_blank"}, a [tool](https://www.nature.com/articles/s42003-023-05718-w){:target="_blank"} dedicated to an automatic retrieval and clustering of interfaces in complexes from 3D structural information. As structural information of the human MDM2 interacting with other partners is available, ARCTIC-3D will extract interacting residues and cluster them into binding surfaces. Not all residues of a binding surface are relevant, as some amino acids may be rarely present among the interfaces that define that patch. Wisely define a probability threshold and note down the residue indices, as you will need them to define *active* residues in HADDOCK. 
diff --git a/education/molmod_online/index.md b/education/molmod_online/index.md index 785094b0..00e08e39 100644 --- a/education/molmod_online/index.md +++ b/education/molmod_online/index.md @@ -51,7 +51,7 @@ By the end of this tutorial, you should know the steps involved in: ### Part 3: [Protein-peptide data-driven docking](/education/molmod_online/docking) The third module introduces protein-peptide docking using the [HADDOCK2.4 web server](https://wenmr.science.uu.nl/haddock2.4/). -It also introduces the CPORT web server for interface prediction, based on evolutionary conservation and other biophysical properties. +It also introduces the ARCTIC-3D web server for interface prediction, based on clustering of structural protein interface information. By the end of this tutorial, you should know how to: * setup a HADDOCK run diff --git a/education/molmod_online/simulation.md b/education/molmod_online/simulation.md index 281731a4..e0710402 100644 --- a/education/molmod_online/simulation.md +++ b/education/molmod_online/simulation.md @@ -155,15 +155,19 @@ Take your time to know your system and what particularities its simulation entai To run the actual simulation, you will need access to a computing cluster. Running on your laptop is likely to take far too long. In our hands, the simulations of this system take ~2 full days on 18 CPU cores in our dedicated cluster. + +You may have noticed that NMRBox is in the process of migrating its virtual machines from Ubuntu 20 to Ubuntu 24. The “Selecting an initial structure” section of this course was developed with Ubuntu 20 in mind and is currently not functional under Ubuntu 24. +However, Ubuntu 24 can be used for the rest of this part of the course. + In NMRBox, after you open the terminal prompt you notice `username@machine`, where your username is the same as the NMRbox username. -You will find your own copy of the course material in `~/EVENTS/2025-struct-bioinfo-uu/` directory. 
+You will find your own copy of the course material in the `~/EVENTS/2026-struct-bioinfo-uu/` directory. You can store your data in your `home` directory but we recommend creating a new directory where you will store your data and work in. __Note__: The data are automatically copied to your home directory under the `EVENTS` directory provided you have registered for this event on NMRBox. The event can be found at [https://nmrbox.nmrhub.org/events](https://nmrbox.nmrhub.org/events){:target="_blank"}. In order to register for the course you need to have an NMRBox account. -__Note__: In case you are following this tutorial on your own, you will have to manually copy all the required data and edit possibly some files to correct the paths (e.g. the `setup.sh` and the `bashrc` scripts). The data for the course can be found once logged in into a VM in the following directory: `/public/EVENTS/2025-struct-bioinfo-uu/`.This directory will however automatically be copied to your home directory when you register for the course on NMRBox +__Note__: In case you are following this tutorial on your own, you will have to manually copy all the required data and possibly edit some files to correct the paths (e.g. the `setup.sh` and the `bashrc` scripts). The data for the course can be found, once logged into a VM, in the following directory: `/public/EVENTS/2026-struct-bioinfo-uu/`. This directory will, however, be copied automatically to your home directory when you register for the course on NMRBox. Open the terminal and create a directory where you will work in with name of your choice: @@ -351,25 +355,29 @@ these atoms when reading the structure and (re)generate their coordinates using parameters defined in the force field. Also, the program allows the user to define the status of the termini of the molecule through the `-ter` flag. Termini can be either charged (e.g. NH3+ and COO-), uncharged (e.g. NH2 and COOH), or -capped by an additional chemical group (e.g. 
N-terminal acetyl and C-terminal amide). This is very -important since leaving the termini charged (default) can lead to artificial charge-charge -interactions, particular in small molecules. If a peptide is part of a larger structure, then it -makes sense to cap the termini in order to neutralize their charge, as it would happen in reality. +capped by an additional chemical group (e.g. N-terminal acetyl and C-terminal amide). + + +This is very important since leaving the termini charged (default) can lead to artificial charge-charge +interactions, particularly in small-sized molecules. + + +If a peptide is part of a larger structure, then it makes sense to cap the termini in order to neutralize their charge, +as it would happen in reality. Terminal capping should be performed prior to topology generation using the `pdb_cap.py` script. This script replaces the first residue with an ACE cap and the last residue with an NME cap by modifying atom and residue names in the PDB file, making them compatible with the CHARMM36m force field. -For capping to work correctly, the input structure must include one additional residue at both the N- and C-termini +For capping to work correctly, **the input structure must include one additional residue** at both the N- and C-termini (i.e. residues *−1* and *N+1* relative to the peptide of interest). These residues act as placeholders and will be converted into caps. In practice, we add two glycine residues, one at each end of the peptide sequence, before capping. -Capping is performed with: - +Capping is performed with a python script `$MOLMOD_BIN/pdb_cap.py`, read it help message to learn how to use it: -python3.10 $MOLMOD_BIN/pdb_cap.py --pdb peptide_helix.pdb --cap +python3.10 $MOLMOD_BIN/pdb_cap.py -h -The script produces a new file named peptide_helix_capped.pdb, which should then be used as input for pdb2gmx. -Once capped, pdb2gmx will recognize the ACE and NME residues automatically when using the CHARMM36m force field. 
+The script will produce a new file, which should then be used as input for pdb2gmx. +Once capped, `pdb2gmx` will recognize the ACE and NME residues automatically when using the CHARMM36m force field. Read through the output of `pdb2gmx` and check the choices the program made for histidine protonation states and the resulting charge of the peptide. @@ -420,7 +428,8 @@ Protein 3 -Look at the partial charge that each atom carries (column 7) and note the differences between different types of atom. +Look at the partial charge that each atom carries (column 7) and note the differences between different types of atom. +Note that the displayed file was generated using the default settings. @@ -718,7 +727,7 @@ Despite dissipating most of the strain in the system, energy minimization does n temperature, and therefore velocities and kinetic energy. When first running molecular dynamics, the algorithm assigns velocities to the atoms, which again stresses the system and might cause the simulation to become unstable. To avoid possible instabilities, the preparation setup here -described includes several stages of molecular dynamics that progressively remove constraints on +describes several stages of molecular dynamics that progressively remove constraints on the system and as such, let it slowly adapt to the conditions in which the production simulation will run. From d749863005d847595da7d96ecd5cbf0860c1e289 Mon Sep 17 00:00:00 2001 From: Anna Kravchenko Date: Tue, 13 Jan 2026 14:48:31 +0100 Subject: [PATCH 2/4] Alex's comments --- education/molmod_online/docking.md | 8 +++++++- education/molmod_online/simulation.md | 6 +++--- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/education/molmod_online/docking.md b/education/molmod_online/docking.md index c067aa6c..dd2d9d81 100644 --- a/education/molmod_online/docking.md +++ b/education/molmod_online/docking.md @@ -204,7 +204,13 @@ solely on the evolutionary conservation analysis? 
### Predicting interface residues -Besides sequence conservation, another way to obtain information about possible interface residues is by analysing known interfaces found in **homologous** proteins. +Besides sequence conservation, other features can be used to predict possible interfaces on protein structures. For example, certain residues tend to be overrepresented at protein-protein interfaces. This information, combined with evolutionary conservation and with a surface clustering algorithm that finds groups of surface residues meeting both the previous criteria results in reasonably accurate predictions. This is the basis of the [WHISCY](https://wenmr.science.uu.nl/whiscy/){:target="_blank"} server. A more advanced predictor, the [CPORT](https://alcazar.science.uu.nl/services/CPORT/){:target="_blank"} web server, judiciously combines (up to) 6 different predictors to provide a consensus prediction that is more robust and more reliable than any of the individual predictors alone. + +Many tools in science are developed by dedicated PhD students and postdocs. Unfortunately, over time, some of these tools may become unavailable as maintaining and supporting them requires significant time and effort. In such cases, it may be necessary to use alternative tools. + +### Obtain known interfaces of homologous proteins + +Another way to obtain information about possible interface residues is by analysing known interfaces found in **homologous** proteins. This can easily be performed by [ARCTIC-3D](https://wenmr.science.uu.nl/arctic3d/){:target="_blank"}, a [tool](https://www.nature.com/articles/s42003-023-05718-w){:target="_blank"} dedicated to an automatic retrieval and clustering of interfaces in complexes from 3D structural information. As structural information of the human MDM2 interacting with other partners is available, ARCTIC-3D will extract interacting residues and cluster them into binding surfaces. 
Not all residues of a binding surface are relevant, as some amino acids may be rarely present among the interfaces that define that patch. Wisely define a probability threshold and note down the residue indices, as you will need them to define *active* residues in HADDOCK. diff --git a/education/molmod_online/simulation.md b/education/molmod_online/simulation.md index e0710402..721601be 100644 --- a/education/molmod_online/simulation.md +++ b/education/molmod_online/simulation.md @@ -157,7 +157,7 @@ Take your time to know your system and what particularities its simulation entai You may have noticed that NMRBox is in the process of migrating its virtual machines from Ubuntu 20 to Ubuntu 24. The “Selecting an initial structure” section of this course was developed with Ubuntu 20 in mind and is currently not functional under Ubuntu 24. -However, Ubuntu 24 can be used for the rest of this part of the course. +However, Ubuntu 24 can be used for the remainder of this part of the course. In NMRBox, after you open the terminal prompt you notice `username@machine`, where your username is the same as the NMRbox username. @@ -371,7 +371,7 @@ For capping to work correctly, **the input structure must include one additional (i.e. residues *−1* and *N+1* relative to the peptide of interest). These residues act as placeholders and will be converted into caps. In practice, we add two glycine residues, one at each end of the peptide sequence, before capping. -Capping is performed with a python script `$MOLMOD_BIN/pdb_cap.py`, read it help message to learn how to use it: +Capping is performed with a Python script `$MOLMOD_BIN/pdb_cap.py`; read its help message to learn how to use it: python3.10 $MOLMOD_BIN/pdb_cap.py -h @@ -727,7 +727,7 @@ Despite dissipating most of the strain in the system, energy minimization does n temperature, and therefore velocities and kinetic energy. 
When first running molecular dynamics, the algorithm assigns velocities to the atoms, which again stresses the system and might cause the simulation to become unstable. To avoid possible instabilities, the preparation setup here -describes several stages of molecular dynamics that progressively remove constraints on +describes by includes several stages of molecular dynamics that progressively remove constraints on the system and as such, let it slowly adapt to the conditions in which the production simulation will run. From 1760e68604a53312eba8bc4e2e83e8f4f70c1b69 Mon Sep 17 00:00:00 2001 From: Anna Kravchenko Date: Tue, 13 Jan 2026 15:21:57 +0100 Subject: [PATCH 3/4] Danai's comment --- education/molmod_online/simulation.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/education/molmod_online/simulation.md b/education/molmod_online/simulation.md index 721601be..4ec12c20 100644 --- a/education/molmod_online/simulation.md +++ b/education/molmod_online/simulation.md @@ -726,8 +726,8 @@ gmx mdrun -v -deffnm peptide-EM-solvated Despite dissipating most of the strain in the system, energy minimization does not consider temperature, and therefore velocities and kinetic energy. When first running molecular dynamics, the algorithm assigns velocities to the atoms, which again stresses the system and might cause the -simulation to become unstable. To avoid possible instabilities, the preparation setup here -describes by includes several stages of molecular dynamics that progressively remove constraints on +simulation to become unstable. To avoid possible instabilities, the preparation setup described +here includes several stages of molecular dynamics that progressively remove constraints on the system and as such, let it slowly adapt to the conditions in which the production simulation will run. 
From 15447335da66cfbfaf557a9a27d00008b2f6eb10 Mon Sep 17 00:00:00 2001 From: Anna Kravchenko Date: Wed, 14 Jan 2026 16:18:56 +0100 Subject: [PATCH 4/4] info-prompt to check CPU load --- education/molmod_online/simulation.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/education/molmod_online/simulation.md b/education/molmod_online/simulation.md index 4ec12c20..aefacc9d 100644 --- a/education/molmod_online/simulation.md +++ b/education/molmod_online/simulation.md @@ -952,7 +952,11 @@ with your name or initials. - Run the production MD! This will take some time, from a few hours to a few days - depending on the amount of computing resources available. + Run the production MD! This will take some time, from a few hours to a few days - depending on the amount of computing resources available. + + + + MD simulations will run faster if a VM has all or most of its CPU available. You can check the CPU load for all NMRBox VMs at [https://nmrbox.org/user-dashboard](https://nmrbox.org/user-dashboard){:target="_blank"} by looking at the “CPU Load” column.