Hamiltonian Replica Exchange

Table of Contents

1 Introduction

This example shows how to use UWHAM and SWHAM to analyze the data generated by Hamiltonian replica exchange simulations.

1.1 System

We study the binding affinity of a guest molecule (heptanoate) to a host molecule (\(\beta\)-cyclodextrin) in implicit solvent (OPLA-AA/AGBNP2). Here we apply the binding energy distribution analysis method (BEDAM) to obtain the binding free energy and binding energy distributions of this complex. BEDAM is a free energy method based on the Hamiltonian replica exchange (RE) algorithm in which the interaction between the ligand and acceptor is scaled by the factor \(\lambda\) changing gradually from \(0\) to \(1\). Suppose there are \(M\) parallel simulations in BEDAM, the Hamiltonian (potential) function of the \(i^{th}\) \(\lambda\)-state is

\begin{equation} H[i] = V_0 + \lambda[i] u\,, \end{equation}

where \(V_0\) is the effective potential energy of the complex without the direct and solvent-mediated ligand-receptor interactions, and \(u\) is the binding energy. In this example, we ran BEDAM at \(300\ K\) by using \(16\) \(\lambda\)-states. The chosen \(\lambda\) values are \(0.0\), \(0.001\), \(0.002\), \(0.004\), \(0.01\), \(0.04\), \(0.07\), \(0.1\), \(0.2\), \(0.4\), \(0.6\), \(0.7\), \(0.8\), \(0.9\), \(0.95\), \(1.0\). The BEDAM simulation lasted \(72\ ns\). And the values of \(V_0\) and \(u\) were recorded very \(0.5\ ps\). Therefore, this system has \(16\) \(\lambda\)-states; the simulation at each \(\lambda\)-state generated \(144,000\) data points.

2 Files

Please download all these files into a same directory.

2.1 Data Files

  • data.tgz: Unzipping this file generates a directory called "Data" which contains \(10\) data files. Each data file contains three columns of data. The numbers in the first column are the indexes of data points. The second column is the binding energy \(u\), and the third column is the effective potential energy \(V_0\) of each observation. Both are in units of "\(kcal/mol\)".
  • info.dat: This file contains two columns of data. The first column shows the relative path of each data file. The second column lists the \(\lambda\) value of each \(\lambda\)-state.

2.2 Python Scripts

  • prepare.py: Python script generates the input file for UWHAM.
  • Histogram.py: Python script constructs a one dimensional histogram.

2.3 Others

3 UWHAM Analysis

Please read the manual of the UWHAM and SWHAM program before practicing.

  1. The first step is to construct the biasing matrix and create the input file for UWHAM. As we explained previously, there are \(16\) \(\lambda\)-states in this system. Therefore, the input file contains \(16\) data blocks. The first line of each block is a comment line starting with a "#" sign. After the comment line, there is a matrix contains \(144,000\) rows (corresponding to the \(144,000\) data points observed at this \(\lambda\)-state) and \(16\) columns (corresponding to the total number of \(\lambda\)-states). The matrix element at the \(j^{th}\) row and the \(k^{th}\) column in the \(i^{th}\) block represents the bias potential for the \(j^{th}\) observation (observed at the \(i^{th}\) \(\lambda\)-state) at the \(k^{th}\) \(\lambda\)-state

    \begin{equation} U_b[j][k] = u[j] \times \lambda[k] \,, \end{equation}

    where \(u[j]\) is the binding energy of the \(j^{th}\) observation, and \(\lambda[k]\) is the \(\lambda\) value of the \(k^{th}\) \(\lambda\)-state. The input for UWHAM can be generated by running

    ./prepare.py -f info.dat > input4uwham.dat 
    
  2. Run UWHAM

    /path of UWHAM/uwham.o -d input4uwham.dat -i 1 -o 1 -t 300 -q 5 -m 1 -e 1.0e-10 -w 16
    

    The explanation for each option is listed as follows:

    -d input.dat read the file "input.dat" for the biasing matrix
    -i 1 -o 1 the input and output are in units of "\(kcal/mol\)"
    -t 300 the simulations are run at \(300\ K\)
    -q 5 -m 1 run the self-consistent iteration 5 times, then switch to the Convex Newton-Raphson iteration
    -e 1.0e-10 stop when the relative changes of the unitless free energy estimates are smaller than 1.0e-10
    -w 16 write the weight of each data point at the \(16^{th}\) (\(\lambda=1\)) state

    The output of the UWHAM program will show these lines:

    The free energy differences in unit of kcal/mol: 
    0  0.898076 1.57834 2.07108 2.31315 2.7656 3.01242 3.18749 3.49453 3.38743 2.78044 2.35484 1.81594 0.95743 0.275247 -0.602911 
    

    The binding free energy is \(-0.602911+G_{vsite}\) where \(G_{vsite}\) is a correction because of the restraint applied to the ligand.

  3. Plot the binding energy distribution at the \(\lambda=1\) state

    • Let's construct a one dimensional histogram for the binding energy at the \(\lambda=1\) state based on the results of the UWHAM analysis. First run

      cat Data/* > all.dat 
      

      to combine all the raw data together. Next run the command

      ./Histogram.py -f all.dat -i 2 -w weights.data -j 1 -b 100 -n -45 -x 5 -a > bindingE_uwham.hist
      

      The explanations for options are:

      -f all.dat -i 2 read the second column of the file "all.dat" as the binding energy of data points
      -w weights.data -j 1 read the first column of the file "weights.data" as the weights for data points
      -b 100 totally 100 bins in the histogram
      -n -45 -x 5 the first bin is at \(-45.0\ kcal/mol\); and the last bin is at \(5.0\ kcal/mol\)
      -a the probability of data points \(<-45.0\ kcal/mol\) and \(> 5.0\ kcal/mol\) are included in the first bin and the last bin respectively.
    • You can plot the distribution of binding energies at the \(\lambda=1\) state by using the first and the second columns of the file "bindingE_uwham.hist". This can be done by running gnuplot (>v5.0) and loading the file "bindingE_uwham.gnu". The resulting plot should look like this

    bindingE_uwham.jpg

4 SWHAM Analyses

  1. First, run the following command lines to generate the input file for SWHAM

    for i in {01..16}
    do
    echo "# state $i" >> input4swham.dat
    perl -wnal -e ' printf("%12g\n", $F[1]) ' Data/lambda_$i.dat >> input4swham.dat
    done
    

    As can be seen, in the "input4swham.dat" file, there are \(16\) blocks which correspond to \(16\) \(\lambda\)-states. Each line in "input4swham.dat" corresponds to one observation. For each observation, only one property, the binding energy, is required for our analysis. Then run command line

    perl -wnal -e ' printf("%12g\n", $F[1]) ' info.dat > stateinfo.dat
    

    to generate the file contains the information of \(\lambda\)-states.

  2. To obtain the binding free energy, run ST-SWHAM

    /path of SWHAM/swham.o -d input4swham.dat -f stateinfo.dat -u 2 -t 300 -i 1 -o 1 -q 3 -n 15 -g 3 -s 1-16 -m
    

    The explanations for options are:

    -d input4swham.dat read the file "input4swham.dat" for the information of observations
    -f stateinfo.dat read the file "stateinfo.dat" for the information of \(\lambda\)-states
    -u 2 use the type 2 potential function implemented in SWHAM program
    -t 300 all the simulations were run at \(300\ k\).
    -i 1 -o 1 the input and out data are in units of \(kcal/mol\)
    -q 3 -n 15 -m equilibrate the system for 3,000,000 cycles first ; then run ST-SWHAM for 15,000,000 cycles
    -g 3 -m print out the free energy estimates every 3,000,000 cycle
    -s 1-16 print out free energy estimates for all \(\lambda\)-states

    Do the ST-SWHAM results agree with the previous UWHAM results?

  3. To obtain the equilibrium distribution at the \(\lambda=1\) state, run RE-SWHAM

    /path of SWHAM/swham.o -d input4swham.dat -f stateinfo.dat -u 2 -t 300 -i 1 -o 1 -q 1 -n 3 -x 500 -s 16 -p 1 -m > reswham.out
    

    The explanations for options are:

    -d swham_input.dat read the file "input4swham.dat" for the information of observations
    -f stateinfo.dat read the file "stateinfo.dat" for the information of \(\lambda\)-states
    -u 2 use the type 2 potential function implemented in SWHAM program
    -t 300 all the simulations were run at \(300\ k\).
    -i 1 -o 1 the input and out data are in units of \(kcal/mol\)
    -q 1 -n 3 -m equilibrate the system for 1,000,000 cycles first ; then run RE-SWHAM for 3,000,000 cycles
    -x 500 500 exchange attempts per RE cycle to reach the infinite swapping limit
    -s 16 -p 1 print out the first property (binding energy) of each observation sampled at the \(\lambda=1.0\) state

    Then run the command

    ./Histogram.py -f reswham.out -i 1 -b 100 -n -45 -x 5 -a > bindingE_swham.hist
    

    to obtain the binding energy distribution based on the output of RE-SWHAM.

    -f reswham.out -i 1 read the first column of the file "reswham.out" as the data points
    -b 100 totally 100 bins in the histogram
    -n -45 -x 5 the minimum of the histogram is \(-45\ kcal/mol\) and the maximum is \(5\ kcal/mol\)
    -a the probability of data points \(< -45.0\ kcal/mol\) and \(> 5.0\ kcal/mol\) are included in the first bin and the last bin respectively.

    Then we can plot the distribution of binding energies estimated by RE-SWHAM and compare it with the UWHAM estimate. This can be done by running gnuplot (>v5.0) and loading the file "bindingE_swham.gnu". The resulting plot should look like this

    bindingE_swham.jpg

Author: Bin Zhang

Created: 2018-05-02 Wed 13:10

Validate