Week 12 Hands-On: Running an AMBER Simulation of Alanine Dipeptide on Pete

5.4.15. Week 12 Hands-On: Running an AMBER Simulation of Alanine Dipeptide on Pete#

This notebook is a Pete-adapted version of the AMBER alanine dipeptide tutorial:

AMBER tutorial: Simple Simulation of Alanine Dipeptide
https://ambermd.org/tutorials/basic/tutorial0/index.php

ADP

5.4.15.1. Goals#

By the end of this hands-on lesson, you should be able to:

create a working directory for Week 12 on Pete
load the AMBER environment
build alanine dipeptide with tleap (not xleap)
generate parm7 and rst7 files
prepare AMBER input files for minimization, heating, and production
run minimization and MD
do a small amount of post-processing with cpptraj

5.4.15.2. Important adaptation for Pete#

The original tutorial creates a generic Tutorial folder.
For this course, start in:

/projects/tia001/$USER/week12

Note: your prompt said /projects/tia001/$SUER, which appears to be a typo.
In the commands below I use the standard environment variable $USER.

5.4.15.3. Set up your Week 12 working directory on Pete#

Create a week12 directory in our shared project space.

mkdir -p /projects/tia001/$USER/week12

Move into that directory

cd /projects/tia001/$USER/week12

Load the amber module so that your shell has access to the amber executables

module load amber

Check that the amber executables are available

echo $AMBERHOME
which tleap

5.4.15.4. Build alanine dipeptide with `tleap`#

This follows the AMBER tutorial closely:

load the ff14SB protein force field
build alanine dipeptide as ACE-ALA-NME
load the TIP3P water model
solvate in an octahedral box with a 8 Å buffer
save parm7 and rst7

We will do this starting tleap and then executing a few commands in tleap:

tleap

You should see the following (or comparable) output

-I: Adding /opt/amber/18/amber18/dat/leap/prep to search path.
-I: Adding /opt/amber/18/amber18/dat/leap/lib to search path.
-I: Adding /opt/amber/18/amber18/dat/leap/parm to search path.
-I: Adding /opt/amber/18/amber18/dat/leap/cmd to search path.

Welcome to LEaP!
(no leaprc in search path)

We start by source the parameters (force constants, charges, etc - called a force field) that we will use for this tutorial

> source leaprc.protein.ff14SB

You should see the following output:

----- Source: /opt/amber/18/amber18/dat/leap/cmd/leaprc.protein.ff14SB
----- Source of /opt/amber/18/amber18/dat/leap/cmd/leaprc.protein.ff14SB done
Log file: ./leap.log
Loading parameters: /opt/amber/18/amber18/dat/leap/parm/parm10.dat
Reading title:
PARM99 + frcmod.ff99SB + frcmod.parmbsc0 + OL3 for RNA
Loading parameters: /opt/amber/18/amber18/dat/leap/parm/frcmod.ff14SB
Reading force field modification type file (frcmod)
Reading title:
ff14SB protein backbone and sidechain parameters
Loading library: /opt/amber/18/amber18/dat/leap/lib/amino12.lib
Loading library: /opt/amber/18/amber18/dat/leap/lib/aminoct12.lib
Loading library: /opt/amber/18/amber18/dat/leap/lib/aminont12.lib

Now create a diala object in tleap by using the sequence command to connect the ACE residue to an ALA residue to a NME residue.

> diala = sequence { ACE ALA NME }

Now we will add water around our solute. We start by loading the water parameters

> source leaprc.water.tip3p

Add a box of TIP3P water around the solute with a buffer of alteast 8 Angstroms on all sides of the solute.

> solvatebox diala TIP3PBOX 8.0

Save the amber parameter/topology file (parm7) and input coordinates (rst7):

> saveamberparm diala diala_solv.parm7 diala_solv.rst7

Save a pdb:

> savepdb diala diala_solv.pdb

Quit tleap:

> quit

You should now have the following files in your week12 folder:

ls

diala_solv.parm7  diala_solv.pdb  diala_solv.rst7  leap.log

You can also achieve this by copying and pasting the following into the terminal directly (from your week12 folder)

cat > build_diala.in <<'EOF'
source leaprc.protein.ff14SB
diala = sequence { ACE ALA NME }
source leaprc.water.opc
solvateoct diala OPCBOX 10.0
saveamberparm diala parm7 rst7
savepdb diala diala_solvated.pdb
quit
EOF

echo "Contents of build_diala.in:"
cat build_diala.in
echo
echo "Running tleap..."
tleap -f build_diala.in

5.4.15.5. Create AMBER MD input files#

We will create three files:

min.in for minimization
heat.in for heating from 0 K to 300 K
prod_short.in for a short production run suitable for in-class activity

5.4.15.5.1. `min.in` - Minimization input file#

The first stage of performing a molecular dynamics simulation is to make sure your starting structure has a reasonable potential energy. To achieve this, we need to create a file called min.in to specify to amber that we want to do an energy minimization. Create this file by

vi min.in

then, within vi, enter input mode by typing i. Then copy and past the following information into the file (Then save and quit).

Minimize
 &cntrl
  imin=1,
  ntx=1,
  irest=0,
  maxcyc=2000,
  ncyc=1000,
  ntpr=100,
  ntwx=0,
  cut=8.0,
 /

To save and quit, first hit esc to exit input mode, then type :wq and hit enter/return.

If you are unable to get vi to work properly, this can all be achieved by copying and pasting the following into a shell command line and then hitting return.

cat > min.in <<'EOF'
Minimize
 &cntrl
  imin=1,
  ntx=1,
  irest=0,
  maxcyc=2000,
  ncyc=1000,
  ntpr=100,
  ntwx=0,
  cut=8.0,
 /
EOF

Make sure you have properly created a file called min.in in your /projects/tia001/$USER/week12 folder.

5.4.15.5.2. `heat.in` - Heating input file#

The second stage of a typical MD simulation is to heat the system. After minimization, the system is basically at 0K, so we need to heat it to the temperature of interest. We start by creating a heat.in file. This will look similar to the min.in file but with important differences.

vi heat.in

Within vi, enter input mode by typing i then paste the following

Heat
 &cntrl
  imin=0,
  ntx=1,
  irest=0,
  nstlim=10000,
  dt=0.002,
  ntf=2,
  ntc=2,
  tempi=0.0,
  temp0=300.0,
  ntpr=100,
  ntwx=100,
  cut=8.0,
  ntb=1,
  ntp=0,
  ntt=3,
  gamma_ln=2.0,
  nmropt=1,
  ig=-1,
 /
&wt type='TEMP0', istep1=0, istep2=9000, value1=0.0, value2=300.0 /
&wt type='TEMP0', istep1=9001, istep2=10000, value1=300.0, value2=300.0 /
&wt type='END'

Exit input mode by hitting the esc key and then typing :wq and hitting return.

If you cannot get vi to work, you can also achieve this by putting this all in a command line and hitting return

cat > heat.in <<'EOF'
Heat
 &cntrl
  imin=0,
  ntx=1,
  irest=0,
  nstlim=10000,
  dt=0.002,
  ntf=2,
  ntc=2,
  tempi=0.0,
  temp0=300.0,
  ntpr=100,
  ntwx=100,
  cut=8.0,
  ntb=1,
  ntp=0,
  ntt=3,
  gamma_ln=2.0,
  nmropt=1,
  ig=-1,
 /
&wt type='TEMP0', istep1=0, istep2=9000, value1=0.0, value2=300.0 /
&wt type='TEMP0', istep1=9001, istep2=10000, value1=300.0, value2=300.0 /
&wt type='END' /
EOF

5.4.15.5.3. `prod.in` - Production run input file#

The next phase of MD simulation is the production (or sometimes equilibration) run. This will generate the data we will analyze.

Again, vi and put in the following:

vi prod.in

Production
 &cntrl
  imin=0,
  ntx=5,
  irest=1,
  nstlim=50000,
  dt=0.002,
  ntf=2,
  ntc=2,
  temp0=300.0,
  ntpr=500,
  ntwx=500,
  cut=8.0,
  ntb=2,
  ntp=1,
  ntt=3,
  barostat=1,
  gamma_ln=2.0,
  ig=-1,
 /

At dt = 0.002 ps, the run length is:

time (ps) = nstlim × 0.002

So:

nstlim = 50000 gives 100 ps
nstlim = 5000000 gives 10 ns (tutorial-scale)

5.4.15.6. Review the input files#

cd /projects/tia001/$USER/week12

echo "===== 01_Min.in ====="
cat min.in
echo
echo "===== 02_Heat.in ====="
cat heat.in
echo
echo "===== 03_Prod_short.in ====="
cat prod.in

5.4.15.7. Run minimization#

Create the a job submit script called min.bash or something similar. Put the following into the file using vi (or cat if you need to).

#!/bin/bash
#SBATCH --job-name=min
#SBATCH --output=min_job.out
#SBATCH --error=min_job.err
#SBATCH --time=00:30:00
#SBATCH --nodes=1
#SBATCH --ntasks=1

module load amber

sander -O -i min.in -o min.out -p diala_solv.parm7 -c diala_solv.rst7 -r min.ncrst -inf min.mdinfo

Once you have created min.bash with that info, submit the job:

sbatch min.bash

5.4.15.8. Inspect the minimization output#

You should see the energy decrease during minimization.

echo "Last 40 lines of min.out:"
tail -n 40 min.out

5.4.15.9. Run heating#

Create a submit script for the heating step called heat.bash. This submit script reads heat.in and uses the output coordinates from the minimization step (min.ncrst).

#!/bin/bash
#SBATCH --job-name=heat
#SBATCH --output=heat_job.out
#SBATCH --error=heat_job.err
#SBATCH --time=00:30:00
#SBATCH --nodes=1
#SBATCH --ntasks=1

module load amber

sander -O \
  -i heat.in \
  -o heat.out \
  -p diala_solv.parm7 \
  -c min.ncrst \
  -r heat.ncrst \
  -x heat.nc \
  -inf heat.mdinfo

Submit the job using the command

sbatch heat.bash

5.4.15.10. Inspect the heating output#

Look for temperature increasing during the run.

grep -n "NSTEP" heat.out | head
echo
tail -n 60 heat.out

5.4.15.11. Run production MD#

Create a job submit script called prod.bash that contains the following

#!/bin/bash
#SBATCH --job-name=prod
#SBATCH --output=prod_job.out
#SBATCH --error=prod_job.err
#SBATCH --time=00:30:00
#SBATCH --nodes=1
#SBATCH --ntasks=1

module load amber

pmemd -O \
  -i prod.in \
  -o prod.out \
  -p diala_solv.parm7 \
  -c heat.ncrst \
  -r prod.ncrst \
  -x prod.nc \
  -inf prod.mdinfo

submit the job with the command

sbatch prod.bash

5.4.15.12. Check the production outputs#

ls prod.*

5.4.15.13. Basic trajectory analysis with `cpptraj`#

The original tutorial includes basic post-processing.
Here we will do two simple analyses:

RMSD relative to the first production frame
backbone dihedral angles phi and psi for alanine dipeptide

For alanine dipeptide, phi/psi are especially useful because they show the conformational sampling of the molecule.

The cpptraj input file to calcualte the RMSD relative to the initial frame can be created and run using the following bash command (copy and paste directly into the command line):

cat > cpptraj_rmsd.in <<'EOF'
parm diala_solv.parm7
trajin prod.nc
rms first out rmsd.dat @C,CA,N
run
quit
EOF

cpptraj -i cpptraj_rmsd.in

The cpptraj file to calcualte $\phi$ and $\psi$ from the trajectory can be created and run using the following:

cat > cpptraj_phipsi.in <<'EOF'
parm diala_solv.parm7
trajin prod.nc
dihedral phi  :1@C :2@N :2@CA :2@C out phi.dat
dihedral psi  :2@N :2@CA :2@C :3@N out psi.dat
run
quit
EOF

cpptraj -i cpptraj_phipsi.in

5.4.16. Plot RMSD and the `phi`/`psi` time series in Python#

You will need to transfer rmsd.dat, phi.dat and psi.dat to your local machine (laptop/desktop) and put in the folder containing this jupyter notebook in order to run the following plotting code.

import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path

work = Path(f"/projects/tia001/{Path.home().name}/week12")

rmsd = np.loadtxt("rmsd.dat", comments="#")
phi = np.loadtxt("phi.dat", comments="#")
psi = np.loadtxt("psi.dat", comments="#")

fig = plt.figure(figsize=(6,4))
plt.plot(rmsd[:,0], rmsd[:,1], '-o')
plt.xlabel("Frame")
plt.ylabel("RMSD (Å)")
plt.title("Alanine dipeptide production RMSD")
plt.tight_layout()
plt.show()

# plot phi psi scatter plot
fig, ax = plt.subplots(figsize=(4,4))

ax.scatter(phi[:,1], psi[:,1], s=3)

ax.set_xlabel(r"$\phi$")
ax.set_ylabel(r"$\psi$")
ax.set_title("Alanine dipeptide backbone dihedrals")

# Exact domain
ax.set_xlim(-180, 180)
ax.set_ylim(-180, 180)

# Remove padding
ax.margins(0)
# for tick marks
ax.set_xticks([-180, -90, 0, 90, 180])
ax.set_yticks([-180, -90, 0, 90, 180])
# grid
ax.grid(True, alpha=0.3)
ax.tick_params(labelsize=10)
# THIS is the key fix (stronger than set_aspect alone)
ax.set_box_aspect(1)

plt.show()

../../_images/e73b08d13d9ac22291532e43c75de43e40853e16b7148c34950468b20a96818e.png

../../_images/e899d0a11a616cc72adf088591423d87448829ab9150464ac13263fc81987d3c.png

5.4.16.1. Short assignment#

Submit answers to the following questions in a word or pdf document to Canvas

What files are produced by tleap, and what is the role of each?
What is the purpose of the minimization stage before heating?
Why do we heat gradually instead of starting immediately at 300 K?
How would you change nstlim to run a 1 ns simulation at dt = 0.002 ps?
Make a plot of RMSD vs trajectory time for your trajectory and describe what you see.
Make a scatter plot of phi and psi. Based on the phi and psi plot, does alanine dipeptide appear to remain in one conformational region or sample multiple regions?