Temperature Replica Exchange

Table of Contents

1 Introduction

This example shows how to use UWHAM and RE-SWHAM to analyze the data generated by temperature replica exchange (RE) simulations.

1.1 System

The goal of this practice is to construct the free energy profile (the Ramachandran plot) of alanine dipeptide (AlaD) in implicit solvent. This time the advanced sampling algorithm that we used is the temperature replica exchange method. Like the second example, the simulations were preformed by using the GROMACS 5.1.2 simulation package with the Amber99SB force field and the OBC GB implicit solvent model. In this example, the replica exchange simulations were run at \(10\) temperatures for \(800\ ns\). The temperatures of the RE simulations are \(300\ K\), \(317.52\ K\), \(336.063\ K, 355.689\ K\), \(376.462\ K\), \(398.447\ K\), \(421.716\ K\), \(446.345\ K\), \(472.411\ K\), \(500\ K\). The configurations of AlaD molecules were recorded every \(1\ ps\). Therefore, there are \(10\) \(\lambda\)-states in this system, and the simulation at each \(\lambda\)-state generated \(800,001\) data points (including the initial structure).

2 Files

Please download all these files into a same directory.

2.1 Data Files

  • data.tgz: Unzipping this file generates a directory called "Data" which contains \(10\) data files. Each data file contains four columns of data. The numbers in the first column are the indexes of data points. The data in the second column and the third columns are the time sequence of the \(\phi\) and \(\psi\) dihedral angles of AlaD respectively. The last column is the total potential energy of the system in unit of \(kJ/mol\).
  • info.dat: This file contains two columns of data. The first column lists the relative path of each data file. The second column lists the temperature of each \(\lambda\)-state.

2.2 Python Scripts

  • prepare.py: Python script generates the input file for UWHAM.
  • 2DHistogram.py: Python script constructs a two dimensional histogram.

2.3 Others

  • AlaD_re_uwham.gnu: input file for gnuplot to make pictures based on the UWHAM analysis.
  • AlaD_re_swham.gnu: input file for gnuplot to make pictures based on the SWHAM analysis.

3 UWHAM Analysis

Please read the manual of the UWHAM program before practicing.

  1. The first step is to construct the biasing matrix and create the input file of UWHAM. As we explained previously, there are \(10\) \(\lambda\)-states. Therefore, the input file contains \(10\) data blocks. The first line of each block is a comment line starting with a "#" sign. After the comment line, there is a matrix contains \(800,001\) rows (corresponding to the \(800,001\) data points observed at this \(\lambda\)-state) and \(10\) columns (corresponding to the total number of \(\lambda\)-states). The matrix element at the \(j^{th}\) row and the \(k^{th}\) column represents the potential energy at the \(k^{th}\) \(\lambda\)-state for the \(j^{th}\) observation in unit of \(k_B T\)

    \begin{equation} U_b[j][k] = \frac{E(u_j)}{k_B T[k]} \,, \end{equation}

    where \(E(u_j)\) is the potential energy of the \(j^{th}\) observation, and \(T[k]\) is the temperature of the \(k^{th}\) \(\lambda\)-state. The input for UWHAM can be generated by running the command

    ./prepare.py -f info.dat > input4uwham.dat 
    
  2. Run UWHAM

    /path of UWHAM/uwham.o -d input4uwham.dat -q 10 -m 1 -e 1.0e-10 -w 1
    

    The explanations for options are as follows:

    -d input4uwham.dat read the file "input4uwham.dat" for the biasing matrix
    -q 10 -m 1 run the self-consistent iteration 10 times, then switch to the Convex Newton-Raphson iteration
    -e 1.0e-10 stop after the relative changes of the unitless free energy estimates are smaller than 1.0e-10
    -w 1 write the weight of each data point at the first \(\lambda\)-state (\(T=300K\)) into a file called "weight.data"
  3. After we obtained the probability of observing each observation at the first \(\lambda\)-state, we need to construct a file called "phipsi.dat" which contains the \(\phi\) and \(\psi\) values of each observation. This can be done by running

    for i in {01..10}
    do
    perl -wnal -e ' printf("%12g %12g\n", $F[1]*3.1415926/180.0, $F[2]*3.1415926/180.0) ' Data/state$i.dat >> phipsi.dat
    done
    
  4. Plot the potential energy
    • First we construct a two dimensional histogram based on the results of the UWHAM analysis and \(\phi\) and \(\psi\) values of each observation by running command line

      ./2DHistogram.py -f phipsi.dat -i 1,2 -w weights.data -j 1 -n -3.1415926,-3.1415926 -x 3.1415926,3.1415926 -b 90,90 -e -o 1 > AlaD_re_uwham.hist
      

      The explanation for each option is:

      -f phipsi.dat -i 1,2 read the first and the second columns of the file "phipsi.dat" as the coordinates of data points
      -w weight.data -j 1 read the third column of the file "weights.data" as the weights for data points
      -n -3.1415926,-3.1415926 the minimum of each dimension is -3.1415926
      -x 3.1415926,3.1415926 the maximum of each dimension is 3.1415926
      -b 90,90 totally 90 bins in each dimension of the histogram
      -e calculate the potential energy for each bin
      -o 1 change the unit of free energy estimates from \(k_B T\) to \(kcal/mol\)

      The output file "AlaD_re_uwham.hist" contains four columns of data. The first and the second columns are the \(\phi\) and \(\psi\) values which together decide the position of each bin (or grid); the third column is the height of each bin; and the fourth column is the free energy estimate of each bin in units of \(kcal/mol\). Note this histogram has been normalized. In other words, the sum of the volume of all bins equals one.

    • You can plot the two dimensional free energy surface using the first, the second and the fourth columns of file "AlaD_re_uwham.hist". This can be done by running gnuplot (>v5.0) and loading the file "AlaD_re_uwham.gnu". The picture should look like

      alad_re_uwham.jpg

4 RE-SWHAM Analysis

Since the total number of \(\lambda\)-states is small in this problem, there is no advantage to apply SWHAM to analyze the data. However, this is a good example to learn and understand the RE-SWHAM algorithm. Please read the manual of the SWHAM program before practicing.

  1. First we construct the input file which provides the information of observations. Run the following command

    for i in {01..10}
    do
    echo "# state" >> input4swham.dat
    perl -wnal -e ' printf("%12g %12g %12g\n", $F[3], $F[1]*3.1415926/180.0, $F[2]*3.1415926/180.0) ' Data/state$i.dat >> input4swham.dat
    done
    

    This short shell script will generate a file called "input4swham.dat". In this "input4swham.dat" file, there are \(11\) blocks corresponding to \(11\) \(\lambda\)-states. Each blocks contains \(800001\) lines, and each line corresponds to an observation. The first (float) number of each line is the potential energy of that observation in units of \(kJ/mol\). The second and the third numbers are the \(\phi\) and \(\psi\) values of that observation in units of radian. We also need to create a file called "stateinfo.dat", which lists the temperatures of all \(\lambda\)-states.

    perl -wnal -e ' printf("%12g\n", $F[1]) ' info.dat > stateinfo.dat
    
  2. Run SWHAM

    /PATH of SWHAM/swham.o -d input4swham.dat -f stateinfo.dat -u 1 -i 2 -q 1 -n 10 -x 500 -s 1 -p 2,3 -m > swham_results.dat
    

    The explanation for each option is:

    -d input4swham.dat read the file "input4swham.dat" for the information of observations
    -f stateinfo.dat read the file "stateinfo.dat" for the information of \(\lambda\)-states
    -u 1 use the type 1 potential function implemented in the SWHAM program
    -i 2 the potential energies in the "input4swham.dat" are in units of "kJ/mol"
    -q 1 -n 10 -m equilibrate the system for 1,000,000 cycles first; then run swham for 10,000,000 cycles
    -x 500 500 exchange attempts per cycle to reach the infinite swapping limit
    -s 1 -p 2,3 print out the second and third properties (\(\phi\) and \(\psi\)) of each observation sampled at the first \(\lambda\)-state (t=300K)
  3. Then we construct a two dimensional histogram based on the results of the SWHAM analysis by running the command line

    ./2DHistogram.py -f swham_results.dat -i 1,2 -n -3.1415926,-3.1415926 -x 3.1415926,3.1415926 -b 90,90 -e -o 1 > AlaD_re_swham.hist
    

    The explanation for each option is:

    -f results.dat -i 1,2 read the first and the second columns of the file "swham_results.dat" as the coordinates of data points
    -n -3.1415926,-3.1415926 the minimum of each dimension is -3.1415926
    -x 3.1415926,3.1415926 the maximum of each dimension is 3.1415926
    -b 90,90 totally 90 bins in each dimension of the histogram
    -e calculate the estimate of free energy for each bin
    -o 1 change the unit of free energy estimates from \(k_B T\) to \(kcal/mol\)

    The output file "AlaD_re_swham.hist" contains four columns of data. The first and the second columns are the \(\phi\) and \(\psi\) values which together decide the position of each bin; the third column is the height of each bin; and the fourth column is the estimate of free energy in units of \(kcal/mol\). Note this histogram has been normalized.

  4. You can plot the two dimensional free energy surface by running gnuplot (>v5.0) and loading file "AlaD_re_swham.gnu". The picture should look like

    alad_re_swham.jpg

    As can be seen, it agrees with the UWHAM result but missing the data points where the free energy is higher than \(\sim 6\ kcal/mol\). Because RE-SWHAM resample the data points according to their UWHAM weights, if the total number of the configurations of the output of RE-SWHAM is \(N_{re}\), the data points whose UWHAM weights are smaller than \(1/N_{re}\) are most likely absent in the ensemble of the RE-SWHAM output. The following picture shows the estimate of the free energy surface based on an RE-SWHAM analysis that generate \(500,000,000\) configurations at each \(\lambda\)-state.

    alad_re_swham_500M.jpg

Author: Bin Zhang

Created: 2018-06-07 Thu 16:36

Validate