Two Binding Modes of the \(\beta\)-cyclodextrin Heptanoate Complex

Table of Contents

1 Introduction

This example shows how to apply Stratified RE-SWHAM to analyze the data of a multicanonical ensemble when a subset of the parallel simulations is far from being equilibrated because of barriers between free energy basins which are only rarely (or never) crossed at some states.

1.1 System

Again, we study the binding affinity of a guest molecule (heptanoate) to a host molecule (\(\beta\)-cyclodextrin) in implicit solvent (OPLA-AA/AGBNP2). Although we already used this system in the previous examples, we have not explained the challenge of studying the binding affinity of this system.

The host, \(\beta\)-cyclodextrin (\(\beta\)CD), is a frustum-shaped molecule with a hydrophobic interior core. The narrow opening end of \(\beta\)CD is laced with 7 primary hydroxyls, and the wide opening end is laced with \(14\) secondary hydroxyls. Because of its chemical nature, \(\beta\)CD can bind with a number of ligands, therefore serving as a classic “host” for the study of molecular recognition phenomena. The guest molecule, heptanoate, consists of a hydrophilic carboxylate group and hydrophobic alkyl groups. As the hydrophobic alkyl groups of heptanoate are nested in the cavity of \(\beta\)CD, the carboxylate group of heptanoate can form hydrogen bonds with either the primary or the secondary hydroxyls of \(\beta\)CD depending on the orientation of the heptanoate molecule. As shown in the following Figure, the \(\beta\)-cyclodextrin heptanoate complex has two binding states, which will be referred to as the UP and DOWN macrostates.

BCD_and_Hep.jpg

Figure 1: \(\beta\)-Cyclodextrin heptanoate binding complex. Left: UP macrostate. Right: DOWN macrostate.

In our previous example we have studied the binding affinity of this host−guest system by using BEDAM free energy method that is based on replica exchange simulations. In BEDAM simulations, an additional parameter \(\lambda\) is introduced to scale the interaction between the host and the guest molecules from none to full interaction. The features of \(\beta\)-cyclodextrin heptanoate binding obtained using replica exchange serve as the benchmark for this test case where we employ Stratified RE-SWHAM to combine and analyze the results of independent (uncoupled) MD simulations at each of the \(\lambda\) Hamiltonian states.

We ran two sets of \(72\ ns\) independent MD simulations at \(300\ K\) of the \(\beta\)-cyclodextrin heptanoate complex in implicit solvent (AGBNP GB model48) at 16 \(\lambda\)-states: (\(0.0\), \(0.001\), \(0.002\), \(0.004\), \(0.01\), \(0.04\), \(0.07\), \(0.1\), \(0.2\), \(0.4\), \(0.6\), \(0.7\), \(0.8\), \(0.9\), \(0.95\), \(1.0\)). The \(\lambda\)-states are chosen to be the same as those in the previous BEDAM simulations. At the \(\lambda = 0.0\) state, there is no interaction between the ligand and the receptor, and the interaction is fully turned on at the \(\lambda = 1.0\) state. However, there is no replica exchange coupling among different \(\lambda\)-states. We note that simulations which use computational grids typically do not employ replica exchange; this observation serves to motivate this example. One set of independent simulations was started from the UP macrostate, and the other set was started from the DOWN macrostate. At the seven \(\lambda\)-states with the largest \(\lambda\) values, because of the strong interaction between heptanoate and \(\beta\)CD molecules, it is difficult for the binding complex to switch between the UP and DOWN macrostates. During the \(72\ ns\) simulations, no transitions between the UP and DOWN macrostates were observed at the \(\lambda = 1.0, 0.95, 0.9\) states, and only one or two transitions were observed at the \(\lambda = 0.8, 0.7, 0.6\) states. However, when the interaction between the ligand and the receptor is further reduced (for \(\lambda\) values smaller than or equal to \(0.2\)), multiple transitions occurred. Therefore, there are \(16\) \(\lambda\)-states in this system. The \(\lambda\)-states with the largest seven \(\lambda\) values (\(\lambda = 1.0\), \(0.95\), \(0.9\), \(0.8\), \(0.7\), \(0.6\), \(0.4\)) are considered as the partially connected states; the other nine \(\lambda\)-states are the fully connected states. Each \(\lambda\)-state has \(288,000\) data points. Here, the goal is to obtain the equilibrium distribution at the \(\lambda=1\) state.

2 Files

  • data.tgz: Unzipping this file generates a directory called "Data" which contains \(16\) data files. Each data file corresponds to one \(\lambda\)-state and contains three columns of data. the first column is the the binding energy \(u\); the second column is effective potential energy \(V_0\) of each observation. Both are in unit of "\(kcal/mol\)". The last column is the tag that marks the orientation of the heptanoate molecule relative to \(\beta\)CD. "1" represents the UP state, and "2" represents the DOWN state.
  • stateinfo.dat: This file contains the information of \(\lambda\)-states. There are \(16\) lines in this file. Each line corresponds to a \(\lambda\)-state. The first column is the \(\lambda\) value of each \(\lambda\)-state. The second column is the tag represents whether this \(\lambda\)-state is fully connected, where "1" represents fully connected and "0" represents partially connected.

2.1 Python Scripts:

  • Histogram.py: Python script constructs a one dimensional histogram.

2.2 Others

3 Stratified RE-SWHAM Analysis

Please read the manual of the Stratified RE-SWHAM program before practicing.

3.1 Run Stratified RE-SWHAM

  1. First we construct the input file for Stratified RE-SWHAM by running

    for i in {01..16}
    do
    echo "# State $i" >> input.dat
    cat Data/lambda_$i.dat >> input.dat
    done
    

    This shell script simply combines the raw data of each \(\lambda\)-state into one file "input.dat". The data points of different \(\lambda\)-states are separated by a line beginning with "#".

  2. Run Stratified RE-SWHAM

    /path of Stratified RE-SWHAM/sreswham.o -d input.dat -f stateinfo.dat -a 2 -u 2 -t 300 -i 1 -o 1 -q 1 -n 10 -x 100 -s 1-16 -p 1,3 -m > sreswham.dat 
    

    The explanation for each option is:

    -d input.dat read the file "input.dat" for the information of observations
    -f stateinfo.dat read the file "stateinfo.dat" for the information of states
    -a 2 two macrostate clusters
    -u 2 use the type 2 potential function implemented in the Stratified SWHAM program
    -t 300 all the simulations were run at \(300\ K\)
    -i 1 -o 1 the input and out data are in units of \(kcal/mol\)
    -q 1 -n 10 -m equilibrate the system for 1,000,000 cycles first; then run stratified RE-SWHAM for 10,000,000 cycles
    -x 100 100 exchange attempts per RE cycle to reach the infinite swapping limit
    -s 1-16 -p 1,3 print out the first and the third property of each observation sampled at all \(\lambda\)-states.

3.2 Plot the binding energy distribution at the \(\lambda=1\) state

  • Let's construct a one dimensional histogram for the binding energy at the \(\lambda=1\) state based on the results of the Stratified RE-SWHAM analysis by running

    ./Histogram.py -f sreswham.dat -i 31 -b 100 -n -45 -x 5 -a > bindingE.hist
    

    The explanations for options are:

    -f sreswham.dat -i 31 read the 31st column of the file "sreswham.dat" as the binding energy of data points
    -b 100 totally 100 bins in the histogram
    -n -45 -x 5 the first bin is at \(-45.0\ kcal/mol\); and the last bin is at \(5.0\ kcal/mol\)
    -a the probability of data points \(<-45.0\ kcal/mol\) and \(>5.0\ kcal/mol\) are included in the first bin and the last bin respectively.
  • You can plot the distribution of binding energies at the \(\lambda=1\) state by using the first and the second columns of the file "bindingE.hist". This can be done by running gnuplot (>v5.0) and loading the file "bindingE.gnu". Note that it will also plot the benchmark results to compared with. The plot should look like this

    bindingE.jpg

3.3 Population percentage of the configurations in the DOWN macrostate

  • The population percentage of the configuration in the DOWN macrostate1 can be calculated by running

    for i in {1..16}
    do
    let j=$i*2-1
    perl -wnal -e " BEGIN{\$sum=0;} /#/ or \$sum+=\$F[$j]-1; END{printf(\"%12g\\n\", \$sum/\$.);} " sreswham.dat >> orientation.dat
    done
    
  • You can plot the population percentage of the configuration in the DOWN macrostate by running gnuplot (>v5.0) and loading file "orientation.gnu". The plot should look like

    orientation.jpg

Footnotes:

1

Stratified UWHAM and Its Stochastic Approximation for Multicanonical Simulations Which Are Far from Equilibrium (link)

Author: Bin Zhang

Created: 2018-05-02 Wed 12:30

Validate