One Dimensional Umbrella Sampling

Table of Contents

1 Introduction

This example shows how to apply UWHAM and ST-SWHAM to analyze the raw data generated by a one dimensional Umbrella Sampling and construct the free energy profile.

1.1 System

The potential energy function of the system studied in this example is a one dimensional double well potential

\begin{equation} U(x) = \frac{H}{W^4} (x^2-W^2)^2 \,, \label{eq:DW} \end{equation}

where \(H=20 k_B T\) is the height of the barrier between the two wells; \(k_{B}\) is the Boltzmann's constant; \(T\) is the temperature; and \(W=1\) is the half width between the two minima of the potential. This one dimensional space can be explored by a Brownian particle simulated with the over-damped Langevin dynamics

\begin{equation} \frac{dx}{dt}=\frac{F(x)}{\gamma}+R(t)\,, \label{eq:Langevin} \end{equation}

where \(F(x)\) is the force acting on the particle based on the potential energy \(U(x)\), and \(\gamma\) is the friction constant. The noise, \(R(t)\), is taken to be Gaussian and white with zero mean and correlation

\begin{equation} \langle R(t)R(t')\rangle=\left(\frac{2k_{B}T}{\gamma}\right)\delta(t-t')\,. \label{eq:noise} \end{equation}

The straightforward way to explore this one dimensional potential is to run a long MD simulation. However, it takes quite a while to obtain a converged result because of the high barrier between the two wells. Here we applied \(31\) parabolic potentials \(U_u[i](x)\) distributed in the region between \(x=-3\) to \(x=3\) to perform Umbrella Sampling. The parabolic potentials are

\begin{equation} U_u[i](x) = \frac{1}{2} k_u[i] (x - x_0[i])^2 \,, \label{eq:umbrella} \end{equation}

where \(k_u[i] = 100 k_B T\) are the force constants of the restraint potentials; and the the minimum positions of the restraint potentials \(x_0[i] = -3.0 + 0.2(i-1)\) are defined by the index \(i\) which ranges from \(1\) to \(31\). For each umbrella, a \(1.0e8\) steps long simulation was run. During the simulations, the position of the Brownian particle was recorded every \(1000\) steps. Therefore, there are totally \(31\) \(\lambda\)-states in this example, and for each \(\lambda\)-states, \(1.0e5\) data points were generated by an independent MD simulation governed by the total potential energy \(U(x) + U_u[i](x)\).

2 Files

Please download all these files into a same directory.

2.1 Data Files

  • data.tgz: Unzipping this file generates a directory called "Data" which contains \(31\) data files. Each file contains two columns of data. The numbers in the first column are the indexes of data points. The data in the second column are the time sequence of the positions of the Brownian particle.
  • info.dat: This file contains three columns of data. The first column lists the relative path of each data file; the second column shows the minimum position of each parabolic potential, \(x_0[i]\); and the third column is the force constant of each parabolic potential \(k_u[i]\). Note that in this example all the \(k_u[i]\) are the same, but they can be different if necessary.

2.2 Python Scripts

  • prepare.py: Python script that generates the input file for UWHAM.
  • reweight.py: Python script that removes the bias of the output of UWHAM.
  • Histogram.py: Python script that constructs a one dimensional histogram.

2.3 Others

  • dw_us_uwham.gnu: input file for gnuplot to make pictures based on the UWHAM analysis.
  • dw_us_swham.gnu: input file for gnuplot to make pictures based on the SWHAM analysis.

3 UWHAM Analysis

Please read the manual of UWHAM before practicing.

  1. The first step is to construct the biasing matrix and create the input file for UWHAM. As we explained previously, there are \(31\) \(\lambda\)-states. Therefore, the input file contains \(31\) data blocks. The first line of each block is a comment line starting with a "#" sign. After the comment line, there is a matrix contains \(1.0e5\) rows (corresponding to the \(1.0e5\) data points observed at this \(\lambda\)-state) and \(31\) columns (corresponding to the 31 \(\lambda\)-states). The matrix element at the \(j^{th}\) row and the \(k^{th}\) column represents the bias potential for the \(j^{th}\) observation at the \(k^{th}\) \(\lambda\)-state

    \begin{equation} U_b[j][k] = \frac{1}{2} k_u[k] (x[j] - x_0[k])^2 \,. \label{eq:bias} \end{equation}

    The input for UWHAM can be generated by running the command

    ./prepare.py -f info.dat > input4uwham.dat 
    
  2. Run UWHAM

    /path of UWHAM/uwham.o -d input4uwham.dat -q 10 -m 1 -e 1.0e-10 -w 0
    

    The explanations for options are as follows:

    -d input4uwham.dat read the file "input4uwham.dat" for the biasing matrix
    -q 10 -m 1 run the self-consistent iteration 10 times, then switch to the Convex Newton-Raphson iteration
    -e 1.0e-10 stop when the relative changes of the unitless free energy estimates are smaller than 1.0e-10
    -w 0 write the weight of each observation at the \(\lambda\)-state at which it was observed into the "weight.data" file
  3. Reweighting
    • The file "weight.data" contains a \(3.1e6 \times 1\) matrix (\(3.1e6\) rows and one column). Each row corresponding to one observations. For example, for the \(n^{th}\) data point, which is the \(j^{th}\) observation observed at the \(i^{th}\) \(\lambda\)-state and \(n=1.0e5*(i-1)+j\), the unnormalized probability of observing this observation at the \(i^{th}\) state are listed in the \(n^{th}\) row

      \begin{equation} w(u_n, i) = \Omega(u_n) \exp\{-\beta U_b[j][i]\} \,. \label{eq:weight} \end{equation}

      Here \(U_b[j][i]\) is the element at the \(j^{th}\) row and \(i^{th}\) column of the \(i^{th}\) biasing matrix (block); \(\Omega(u_n)\) is the effective density of state. In this example

      \begin{equation} \Omega(u_n) \propto \Omega_0(u_n) \exp\{-\beta U(u_n)\} \,, \label{eq:eDoS} \end{equation}

      where \(\Omega_0(u_n)\) is the (traditional) density of state, and \(U(u_n)\) is the double well potential defined by Eq.[\ref{eq:DW}].

    • To construct the one dimensional potential energy function, we need to calculate the probability of observing each data point when there is no bias applied to the system, namely, \(\Omega(u_k)\) according to Eq.[\ref{eq:eDoS}]. For the \(n^{th}\) observation, namely the \(j^{th}\) observation observed at the \(i^{th}\) \(\lambda\)-state, this probability is calculated by

      \begin{equation} p(u_n) \propto \Omega(u_n) = w(u_n, i)/\exp\{-\beta U_b[j][i]\} \,. \label{eq:reweighting} \end{equation}

      This reweighting procedure is done by running

      ./reweight.py -w weights.data -f info.dat > uwham_results.dat
      
    • The file "uwham_results.dat" contains the final results of the UWHAM analysis. The first column in the file "uwham_results.dat" is the coordinate of each observation, and the second column is the unnormalized probability of observing this observation when there is no bias applied to the system.
  4. Plot the potential energy
    • First we construct a one dimensional histogram based on the results of the UWHAM analysis by running the command line

      ./Histogram.py -f uwham_results.dat -i 1 -w uwham_results.dat -j 2 -b 100 -e > dw_us_uwham.hist
      

      The explanations for options are:

      -f uwham_results.dat -i 1 using the first column of the file "uwham_results.dat" as the coordinate of data points
      -w uwham_results.dat -j 2 using the second column of the file "uwham_results.dat" as the weights for data points
      -b 100 totally 100 bins in the histogram
      -e calculate the potential energy for each bin

      The output file "dw_us_uwham.hist" contains three columns of data. The first column is the position of each bin; the second column is the height of each bin of the one dimensional histogram; and the third column is the potential energy estimate. Note this histogram has been normalized. In other words, the sum of the area of all bins equals one.

    • You can plot the estimate of the one dimensional potential using the first and the third columns of file "dw_us_uwham.hist" and compare it with the analytic function Eq.[\ref{eq:DW}]. This can be done by running gnuplot (>v5.0) and loading file "dw_us_uwham.gnu". If nothing goes wrong, the resulting plot should look like this

      dw_us_uwham.jpg

      As you can see, the UWHAM estimates and the curve based on the analytic function agree with each other very well.

4 ST-SWHAM Analysis

We can try ST-SWHAM to analyze the data although there are only \(31\) \(\lambda\)-states in this example.

  1. First, run the following shell script to generate the input file containing all observations for ST-SWHAM:

    for i in {01..31}
    do
    echo "# state" >> input4swham.dat
    perl -wnal -e ' printf("%12g\n", $F[1]) ' Data/state_$i.dat >> input4swham.dat
    done
    

    As can be seen, this script simply combines all the observations together, and separate them into \(31\) blocks by using lines that begin with a "#" character. For each observation, there is only one property — the position of the Brownian particle — recorded in the "input4swham.dat" file. We also need to change the "info.dat" a little bit to prepare an input file containing information of all \(\lambda\)-states:

    perl -wnal -e 'printf("%12g %12g\n", $F[2], $F[1])' info.dat > stateinfo.dat
    
  2. Run ST-SWHAM:

    mv weights.data uwham_weights.data	 
    
    /path of SWHAM/swham.o -d input4swham.dat -f stateinfo.dat -u 4 -t 300 -i 0 -o 0 -q 3 -n 15 -g 3 -s 1-31 -m -w
    

    The explanations for options are:

    -d input4swham.dat read the file "input4swham.dat" for the information of observations
    -f stateinfo.dat read the file "stateinfo.dat" for the information of states
    -u 4 use the type 4 potential function implemented in the SWHAM program
    -i 0 -o 0 the potential energies in the "input4swham.dat" and the output are in the unit of \(k_B T\)
    -g 3 -m run ST-SWHAM and print out the estimates of free energies every \(3,000,000\) cycles
    -q 3 -n 15 -m equilibrate the system for \(3,000,000\) cycles first ; then run ST-SWHAM for \(15,000,000\) cycles
    -s 1-31 print out the free energy estimate for each \(\lambda\)-state
    -w write the weight of each data point at the \(\lambda\)-state at which it was observed into the "weight.data" file

    Note that we renamed the "weights.data" generated by UWHAM so that it will not be overwritten by the "weights.data" file generated by ST-SWHAM.

  3. Then we need to reweight the probability of each observation. This step is as the same as the one in the UWHAM analysis

    ./reweight.py -w weights.data -f info.dat > swham_results.dat
    
  4. Next we construct a one dimensional histogram based on the results of the ST-SWHAM analysis by running

    ./Histogram.py -f swham_results.dat -i 1 -w swham_results.dat -j 2 -b 100 -e > dw_us_swham.hist
    

    Finally, let's compare the estimates of ST-SWHAM with the estimates of UWHAM and the the analytic function Eq.[\ref{eq:DW}]. This can be done by running gnuplot (>5.0) and loading file "dw_us_swham.gnu". The resulting plot should look like this

    dw_us_swham.jpg

    As you can see, the SWHAM estimates and the curve based on the analytic function also agree with each other very well.

Author: Bin Zhang

Created: 2018-05-02 Wed 12:50

Validate