INTRODUCTION TO CURRENT RESEARCH IN THE LEVY GROUP: 2018
Research in the Levy group is focused on the development and application of
computational methods for studying the structure, function, and dynamics of
proteins. We work on problems involving the interplay between computational
models in structural biology and experiments at different levels of resolution and
different time scales. Using a statistical mechanics framework, we are mapping
conformational free energy landscapes that determine the statistical
thermodynamic basis for protein-ligand binding and protein allostery. We are
surveying corresponding fitness landscapes in sequence space, and developing
new statistical methods to analyze how correlated mutations evolve under drug
selection pressure, leading to resistance.
Protein Dynamics, Folding, and Misfolding
In recent work we have coupled replica exchange simulations with kinetic network models to elucidate the heterogeneity of the pathways by which small proteins fold. In a 2015 report we explained why relaxation within the protein unfolded basin can be very fast even though state-to-state first passage times within that basin can be very slow, thus resolving a paradox the field was grappling with. Our work also explains why many small proteins fold with single-exponential kinetics even though the folding pathways are diverse and have different barriers.
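The separation between fast relaxation and slow state-to-state first passage can be illustrated with a toy kinetic network. The sketch below is an illustrative model, not the network used in the cited work: it assumes N mutually connected states with a uniform jump rate k, and all function names are hypothetical. In that model the relaxation time is 1/(Nk) while the mean first passage time to any one chosen state is 1/k, so the two differ by a factor of N.

```python
import numpy as np

def rate_matrix(n_states, k):
    """Rate matrix K with K[i, j] = rate of jumping from state j to state i."""
    K = np.full((n_states, n_states), float(k))
    np.fill_diagonal(K, 0.0)
    # Each column must sum to zero so that total probability is conserved.
    np.fill_diagonal(K, -K.sum(axis=0))
    return K

def relaxation_time(K):
    """Slowest relaxation time: -1 / (largest nonzero eigenvalue of K)."""
    eigenvalues = np.sort(np.linalg.eigvals(K).real)
    return -1.0 / eigenvalues[-2]  # eigenvalues[-1] is the stationary mode (~0)

def mean_first_passage_times(K, target):
    """Mean first passage time to `target` from every other state."""
    keep = [s for s in range(K.shape[0]) if s != target]
    # Backward equation restricted to non-target states: Q_sub @ tau = -1,
    # where Q = K.T is the generator with rows indexed by the starting state.
    Q_sub = K.T[np.ix_(keep, keep)]
    return np.linalg.solve(Q_sub, -np.ones(len(keep)))

# Toy unfolded basin: 100 states, unit jump rate (assumed values).
K = rate_matrix(n_states=100, k=1.0)
tau_relax = relaxation_time(K)
tau_fp = mean_first_passage_times(K, target=0)
print(f"relaxation time: {tau_relax:.4f}")
print(f"mean first passage time to state 0: {tau_fp.mean():.4f}")
```

With many states the population equilibrates quickly because every state feeds every other, yet reaching one particular state remains slow.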
Levy, Ronald M., Wei Dai, Nan-Jie Deng, and
Dmitrii E. Makarov (2013). How long does it take to equilibrate
the unfolded state of a protein? Protein Science, 22(11),
1459-1465. DOI: 10.1002/pro.2335. PMCID: PMC3831662.
Deng, Nan-jie, Wei Dai, and Ronald M. Levy (2013). How Kinetics within the Unfolded State Affects Protein Folding: An Analysis Based on Markov State Models and an Ultra-Long MD Trajectory. J. Phys. Chem. B, 117(42), 12787-12799. DOI: 10.1021/jp401962k. PMCID: PMC3808496.
Xia, Junchao and Ronald M. Levy (2014). Molecular Dynamics of the Proline Switch and Its Role in Crk Signaling. J. Phys. Chem. B, 118(17), 4535-4545. DOI: 10.1021/jp5013297. PMCID: PMC4007982.
Dai, Wei, Anirvan M. Sengupta, and Ronald M. Levy (2015). First Passage Times, Lifetimes, and Relaxation Times of Unfolded Proteins. Phys. Rev. Lett., 115(4), 048101. DOI: 10.1103/PhysRevLett.115.048101. PMCID: PMC4531052.
Solvation Thermodynamics in Biophysics and Structural Biology
It is widely believed that the displacement of water from binding sites at protein receptor surfaces plays a key role in determining protein-ligand binding affinity. The loosely expressed idea is that by displacing a water molecule whose thermodynamic signature is “unfavorable” relative to the bulk, a ligand may gain extra binding affinity. While this suggestion appears frequently in the literature, the idea is not well formulated when stated this way, because the chemical potential of the solvent is constant throughout the solution. However, the excess chemical potential varies throughout the solution, and knowledge of its direct and indirect parts can be used to inform the ligand design process. Earlier efforts to take the thermodynamic signatures of interfacial waters into account for the ligand design process have been based on inhomogeneous solvation theory (IST). We have questioned whether IST is the best framework for attacking this problem and have proposed an alternative approach based on classical density functional theory (DFT). We plan to use this new statistical thermodynamic framework to analyze the role of interfacial water in protein-ligand binding and protein stability.
Levy, Ronald M., Di Cui, Bin W. Zhang, and Nobuyuki Matubayasi (2017). The Relationship Between Solvation Thermodynamics from IST and DFT Perspectives. The Journal of Physical Chemistry B, 121(15), 3825-3841. DOI: 10.1021/acs.jpcb.6b12889. PMCID: PMC5869707.
Cui, Di, Bin W. Zhang, Nobuyuki Matubayasi, and Ronald M. Levy (2017). The Role of Interfacial Water in Protein–Ligand Binding: Insights from the Indirect Solvent Mediated Potential of Mean Force. Journal of Chemical Theory and Computation, 14(2), 512–526. DOI: 10.1021/acs.jctc.7b01076. PMCID: PMC5897112.
Zhang, Bin W., Di Cui, Nobuyuki Matubayasi, and Ronald M. Levy (2018). The Excess Chemical Potential of Water at the Interface with a Protein from End Point Simulations. The Journal of Physical Chemistry B, 122(17), 4700-4707. DOI: 10.1021/acs.jpcb.8b02666. PMCID: PMC5939383.
Mapping Free Energy Landscapes for Protein-Ligand Binding and Allostery, and the Design of Inhibitors of HIV-1 Proteins
Conformational dynamics plays a fundamental role in the regulation of molecular recognition, and statistical mechanics provides the framework to derive a comprehensive theory for the binding of a ligand to a protein. Our goal is to develop models that are accurate enough to predict thermodynamic and kinetic properties and, equally important, that generate qualitative insights into the molecular mechanisms of binding and allosteric conformational transitions. We have long-standing, strong collaborations with structural biologists, biochemists, and virologists working on the design of inhibitors of HIV-1 proteins. Our model of the structure of the ALLINI-induced IN multimer was validated by the first reported crystal structure, published in 2016.
Vijayan, R. S. K., Peng He, Vivek Modi, Krisna C. Duong-Ly, Haiching Ma, Jeffrey R. Peterson, Roland L. Dunbrack, and Ronald M. Levy (2015). Conformational Analysis of the DFG-Out Kinase Motif and Biochemical Profiling of Structurally Validated Type II Inhibitors. Journal of Medicinal Chemistry, 58(1), 466-479. DOI: 10.1021/jm501603h. PMCID: PMC4326797.
Gallicchio, Emilio, Nanjie Deng, Peng He, Lauren Wickstrom, Alexander L. Perryman, Daniel N. Santiago, Stefano Forli, Arthur J. Olson, and Ronald M. Levy (2014). Virtual screening of integrase inhibitors by large scale binding free energy calculations: the SAMPL4 challenge. Journal of Computer-Aided Molecular Design, 28(4), 475-490. DOI: 10.1007/s10822-014-9711-9. PMCID: PMC4137862.
Mentes, Ahmet, Nan-Jie Deng, R. S. K. Vijayan, Junchao Xia, Emilio Gallicchio, and Ronald M. Levy (2016). Binding Energy Distribution Analysis Method: Hamiltonian Replica Exchange with Torsional Flattening for Binding Mode Prediction and Binding Free Energy Estimation. Journal of Chemical Theory and Computation, 12(5), 2459-2470. DOI: 10.1021/acs.jctc.6b00134. PMCID: PMC4862910.
Deng, Nanjie, Ashley Hoyte, Yara E Mansour, Mosaad S. Mohamed, James R. Fuchs, Alan N. Engelman, Mamuka Kvaratskhelia, and Ronald M. Levy (2016). Allosteric HIV-1 integrase inhibitors promote aberrant protein multimerization by directly mediating inter-subunit interactions: Structural and thermodynamic modeling studies. Protein Science, 25(11), 1911-1917. DOI: 10.1002/pro.2997. PMCID: PMC5079246.
Zhang, Bin W., Nanjie Deng, Zhiqiang Tan, and Ronald M. Levy (2017). Stratified UWHAM and Its Stochastic Approximation for Multicanonical Simulations Which are Far from Equilibrium. Journal of Chemical Theory and Computation, 13, 4660-4674. DOI: 10.1021/acs.jctc.7b00651. PMCID: PMC5897113.
Mapping the Fitness Landscapes of Proteins and the Evolution of Drug Resistance
We are pioneering new sequence based statistical inference methods to analyze correlated mutations that arise from evolutionary constraints and from drug selection pressure. We are using both sequence based and structure based approaches to map the conformational and fitness landscapes of kinase family proteins. We have constructed Potts Hamiltonian models based on multiple sequence alignments of HIV-1 proteins, and used these models to study the evolution of HIV-1 under drug selection pressure which leads to entrenchment of primary mutations through epistatic interactions with the sequence background.
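A Potts Hamiltonian assigns each sequence a statistical energy built from single-site fields and pairwise couplings. The sketch below assumes the convention E(S) = sum_i h_i(S_i) + sum_{i<j} J_ij(S_i, S_j), with P(S) proportional to exp(-E(S)); sign conventions vary between papers, and the toy parameters are invented for illustration rather than inferred from any alignment. It shows how the effect of one mutation depends on the sequence background, which is the kind of epistasis that underlies entrenchment.

```python
import numpy as np

def potts_energy(seq, h, J):
    """Statistical energy of an integer-encoded sequence (assumed sign convention)."""
    n = len(seq)
    fields = sum(h[i, seq[i]] for i in range(n))
    couplings = sum(J[i, j, seq[i], seq[j]]
                    for i in range(n) for j in range(i + 1, n))
    return float(fields + couplings)

def mutation_effect(seq, pos, new_residue, h, J):
    """Energy change caused by substituting `new_residue` at `pos` in `seq`."""
    mutant = list(seq)
    mutant[pos] = new_residue
    return potts_energy(mutant, h, J) - potts_energy(seq, h, J)

# Hypothetical toy model: 3 sites, 2-letter alphabet, one nonzero coupling.
n_sites, n_letters = 3, 2
h = np.zeros((n_sites, n_letters))
J = np.zeros((n_sites, n_sites, n_letters, n_letters))
J[0, 1, 1, 1] = -2.0  # favorable when sites 0 and 1 both carry residue 1

# The same substitution (site 0: 0 -> 1) in two different backgrounds.
effect_with_partner = mutation_effect([0, 1, 0], 0, 1, h, J)
effect_without_partner = mutation_effect([0, 0, 0], 0, 1, h, J)
print(effect_with_partner, effect_without_partner)
```

In this toy model the substitution lowers the energy only when its coupled partner is already present, so reverting it in that background would be penalized.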
Haldane, Allan, William F. Flynn, Peng He, R. S. K. Vijayan, and Ronald M. Levy (2016). Structural propensities of kinase family proteins from a Potts model of residue co-variation. Protein Science, 25(8), 1378-1384. DOI: 10.1002/pro.2954. PMCID: PMC4972195.
Levy, Ronald M., Allan Haldane, and William F. Flynn (2017). Potts Hamiltonian models of protein co-variation, free energy landscapes, and evolutionary fitness. Current Opinion in Structural Biology, 43, 55-62. DOI: 10.1016/j.sbi.2016.11.004. PMCID: PMC5869684.
Flynn, William F., Allan Haldane, Bruce E. Torbett, and Ronald M. Levy (2017). Inference of epistatic effects leading to entrenchment and drug resistance in HIV-1 protease. Molecular Biology and Evolution, 34(6), 1291. DOI: 10.1093/molbev/msx095. PMCID: PMC5435099.
Haldane, Allan, William F. Flynn, Peng He, and Ronald M. Levy (2018). Coevolutionary Landscape of Kinase Family Proteins: Sequence Probabilities and Functional Motifs. Biophysical Journal, 114(1), 21-31. DOI: 10.1016/j.bpj.2017.10.028. PMCID: PMC5773752.
INTRODUCTION TO RESEARCH IN THE LEVY GROUP: 2012
Molecular simulations have come to play a central role in structural biology
and biophysics; they are beginning to be used in cell and systems biology as
well. Simulations help us develop our intuition about the behavior of models
which link biological structures to function. Protein folding, molecular
recognition and ligand binding, and the biological machines used for transport
and signaling are some of the research areas that have been greatly
enriched by computational approaches based on molecular simulations. The
design of effective potentials for modeling solvation effects implicitly, the
development of new sampling methods based on replica exchange molecular
dynamics, and the exploration of rare events using network models are themes
in the Levy group which run through our current research highlighted below.
New multi-scale models, effective potentials, and sampling methods for molecular simulations
Implicit Solvent Models
Water plays a fundamental role in virtually all biological processes.
The accurate modeling of hydration thermodynamics is therefore
essential for studying protein conformational equilibria, aggregation,
and binding. Explicit solvent models provide the most detailed
description of hydration phenomena, but they have inherent limitations
that motivate the search for other ways to represent solvation.
Implicit solvent models, which are based on the statistical mechanics
concept of the solvent potential of mean force, are very useful
alternatives to explicit solvation for modeling protein folding and
binding.
We have developed an implicit solvent effective potential (AGBNP)
that is suitable for molecular dynamics simulations and high-
resolution modeling. It is based on a novel implementation of the
pairwise descreening Generalized Born model for the electrostatic
component and a nonpolar hydration free energy estimator. The model
is fully analytical with first derivatives and is computationally
efficient and has been incorporated into the IMPACT molecular
simulation program.
Gallicchio, E., K. Paris, and R.M. Levy
(2009). The AGBNP2 Implicit Solvent Model. J. Chem. Theory and
Comput., 5, 2544-2564. DOI: 10.1021/ct900234u. PMCID:
PMC2857935.
Replica Exchange Molecular Dynamics
One of the key challenges in the computer simulation of proteins at
the atomic level is the sampling of conformational space. The
efficiency of many common sampling protocols such as Monte Carlo (MC)
and Molecular Dynamics (MD) is limited by the need to cross high
free-energy barriers on rugged energy landscapes. In the Replica Exchange
(RE) algorithm many coupled simulations are run in parallel; the
coupling is achieved by exchanging either thermodynamic or
Hamiltonian parameters. We are exploring novel implementations of
replica exchange to accelerate the convergence of protein folding
simulations, and the calculation of protein-ligand binding free
energies. In the latter example, the REMD coupling parameter is the
protein-ligand interaction energy.
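The exchange step can be sketched for the temperature case. The functions below use the standard Metropolis criterion for swapping two replicas at inverse temperatures beta_i and beta_j whose current configurations have potential energies e_i and e_j. The names are illustrative and not taken from the group's software. A Hamiltonian exchange follows the same pattern, with the exponent built from each configuration's energy evaluated under both Hamiltonians.

```python
import math
import random

def exchange_probability(beta_i, beta_j, e_i, e_j):
    """Metropolis acceptance probability for swapping two temperature replicas."""
    # Log ratio of the joint Boltzmann weights after and before the swap.
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

def attempt_exchange(beta_i, beta_j, e_i, e_j, rng=random):
    """Return True if the proposed swap is accepted."""
    return rng.random() < exchange_probability(beta_i, beta_j, e_i, e_j)

# The colder replica (larger beta) holds the higher energy: always accepted.
p_downhill = exchange_probability(2.0, 1.0, 5.0, 3.0)
# The colder replica holds the lower energy: accepted with probability exp(-1).
p_uphill = exchange_probability(2.0, 1.0, 3.0, 4.0)
print(p_downhill, p_uphill)
```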
Network Models, Rare Events, and Transition Paths
We can think of the data generated by replica exchange simulations of
protein folding and protein-ligand binding as generating a trace
through phase space. We can organize the data using graph theoretic
ideas, where the nodes are conformations and the edges represent
possible jumps between closely related nodes. We use the magic of
histogram re-weighting to assign relative probabilities to the jumps
(edges) which connect the nodes. Furthermore, we can construct
transition paths on the graphs which correspond to rare transitions
between stable states, and then use the tools of transition path
theory to analyze the transition path ensemble, in order to determine
the number of truly different important paths and their fluxes. This
is a form of multi-scale modeling where the underlying data is
derived from atomic simulations, while the transition paths are
constructed ex post facto.
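As a minimal illustration of reweighting, the sketch below handles the simplest case: samples drawn at one inverse temperature, beta_sim, are reweighted to another, beta_target, with weights proportional to exp(-(beta_target - beta_sim) E). This single-ensemble form is a simplification chosen for brevity; the weighted histogram analysis cited in the references that follow combines data from many temperatures. All names and numbers are illustrative.

```python
import numpy as np

def reweight(energies, beta_sim, beta_target):
    """Normalized weights that map samples from beta_sim to beta_target."""
    energies = np.asarray(energies, dtype=float)
    log_w = -(beta_target - beta_sim) * energies
    log_w -= log_w.max()  # shift for numerical stability before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

def reweighted_average(observable, weights):
    """Weighted ensemble average of an observable over the samples."""
    return float(np.dot(weights, np.asarray(observable, dtype=float)))

# Hypothetical energies of four sampled conformations (arbitrary units).
energies = np.array([0.0, 0.0, 1.0, 2.0])
weights = reweight(energies, beta_sim=1.0, beta_target=2.0)
mean_energy_cold = reweighted_average(energies, weights)
print(weights, mean_energy_cold)
```

Reweighting to a lower temperature shifts probability toward the low-energy samples, so the reweighted mean energy falls below the unweighted mean.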
Gallicchio, E., M. Andrec, A.K. Felts, and R.M.
Levy (2005). Temperature Weighted Histogram Analysis Method, Replica
Exchange, and Transition Paths. J. Phys. Chem. B, 109, 6722-6731.
DOI: 10.1021/jp045294f.
Gallicchio, E., R.M. Levy, and M. Parashar
(2008). Asynchronous Replica Exchange for Molecular Simulations. J.
Comput. Chem., 29, 788-794. DOI: 10.1002/jcc.20839. PMCID:
PMC2977925.
Zheng, W., M. Andrec, E. Gallicchio, and R.M.
Levy (2007). Simulating replica exchange simulations of protein
folding with a kinetic network model. Proc. Natl. Acad. Sci. USA,
104, 15340-15345. DOI: 10.1073/pnas.0704418104. PMCID: PMC2000486.
Zheng, W., M. Andrec, E. Gallicchio, and R.M.
Levy (2008). Simple Continuous and Discrete Models for Simulating
Replica Exchange Simulations of Protein Folding. J. Phys. Chem. B,
112, 6083-6093. DOI: 10.1021/jp076377+. PMCID: PMC2978075.
Protein folding and misfolding: fundamental physics, health-related applications
Kinetic Network Models and Protein Folding
Protein folding is a fundamental problem in modern molecular
biophysics and is an example of a slow process occurring via rare
events in a high-dimensional configurational space. For
this reason, it is difficult for an all-atom simulation to obtain
meaningful information on the kinetics and pathways of such
processes. A number of strategies for addressing this problem have
been proposed over the years that involve focusing on the
important slow processes while neglecting the less interesting
rapid kinetics by simplification of the state space or reduction
of dimensionality. Generalized ensemble methods such as replica
exchange molecular dynamics (REMD) have been developed that
enhance the ability to obtain accurate canonical populations in
complex systems by increasing sampling efficiency. However, since
REMD involves temperature swaps between MD trajectories, it is not
straightforward to obtain kinetic information from such
simulations. We have chosen to make use of a kinetic network model
in which the nodes correspond to discrete molecular conformations
from REMD simulation trajectories (rather than macrostates), and
the edges are derived from an ansatz based on structural
similarity. Since the network is a discretized representation of
the system and does not require additional energy and force
evaluations, there is a considerable gain in efficiency, allowing
us to study much slower kinetic processes than would be accessible
using conventional MD. Since the network topology can be
constructed based on virtually all degrees of freedom, this allows
for multiple pathways and complex transition states. Application
of the kinetic network model to study the folding of mini-proteins
such as Trp-Cage, and to the conformational reorganization of the
HIV-1 Protease receptor pocket, provides us with detailed
information about the rich variety of transition pathways, both
their structural characteristics and their kinetics.
Andrec, M., A.K. Felts, E. Gallicchio, and R.M.
Levy (2005). Protein Folding Pathways from Replica Exchange
Simulations and a Kinetic Network Model. Proc. Natl. Acad. Sci.
USA, 102, 6801-6806. DOI: 10.1073/pnas.0408970102. PMCID:
PMC1100763.
Zheng, W., M. Andrec, E. Gallicchio, and R.M.
Levy (2009). Recovering Kinetics from a Simplified Protein Folding
Model Using Replica Exchange Simulations: A Kinetic Network and Effective
Stochastic Dynamics. J. Phys. Chem. B, 113, 11702-11709. DOI:
10.1021/jp900445t. PMCID: PMC2975981.
Health-related applications: α-synuclein and Parkinson's disease
Unlike globular proteins with stable native structures, partially
folded and natively unfolded proteins are best characterized
as conformational ensembles of rapidly interconverting structures.
Structural characterization of these dynamic systems is critical to
understanding the basis for protein misfolding that results in protein
aggregation and disease. In collaboration with Jean Baum's research
group we are integrating experiments and computational models to
develop a molecular picture of the steps involved in the aggregation of
α-synuclein, an intrinsically disordered protein that appears in
aggregated forms in the brains of patients with Parkinson's disease.
The conversion from the monomeric disordered state to the highly
ordered aggregate fibril form is both complex and not well understood.
Aggregation rates are sensitive to changes in amino acid sequence and
environmental conditions. We are focusing on the early steps in the
aggregation process.
Weinstock, D.S., C. Narayanan, J. Baum, and R.M.
Levy (2008). Correlation Between 13C-alpha Chemical Shifts and Helix
Content of Peptide Ensembles. Protein Science, 17, 950-954. DOI:
10.1110/ps.073365408. PMCID: PMC2327285.
Wu, K-P, D.S. Weinstock, C. Narayanan, R.M. Levy,
and J. Baum (2009). Structural reorganization of α-synuclein at low
pH observed by NMR and REMD simulations. J. Mol. Biol., 391,
784-796. DOI: 10.1016/j.jmb.2009.06.063. PMCID: PMC2766395.
The energy landscape for ligand binding to protein receptors and drug design
Understanding how to accurately predict binding free energies of small molecule
ligands to protein receptor targets is a very active area of research in our
lab. The statistical mechanics of binding can be formulated in many alternative
ways; we are pursuing approaches particularly well suited to modern cluster
computing environments with highly parallelized computation.
Gallicchio E., and R.M. Levy (2012). Prediction
of SAMPL3 Host-Guest Affinities with the Binding Energy Distribution Analysis
Method (BEDAM). J Comp Aided Mol Design, 26, 505-516. DOI:
10.1007/s10822-012-9552-3. PMCID: PMC3383899.
Statistical mechanics and the free energy of binding
We have developed a new method for estimating protein-ligand affinities with
implicit solvation, the Binding Energy Distribution Analysis Method (BEDAM).
BEDAM is a statistical mechanics formulation of the free energy of binding
based on the distribution of binding energies between the ligand and the
protein receptor, obtained in a fictitious ensemble in which the ligand resides
in the binding pocket without interacting with the receptor. The framework
combines implicit solvation, Hamiltonian replica exchange parallel molecular
dynamics sampling, and reweighting techniques. The
formulation takes into account ligand-receptor interactions, multiple binding
modes, and conformational entropy losses and is currently being applied to a
series of important medicinal targets.
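The central quantity can be sketched numerically. The example assumes the standard binding free energy is written as an ideal term, -kT ln(C0 V_site), plus an excess term, -kT ln of the average of exp(-u/kT) over binding energies u sampled in the ensemble where ligand and receptor do not interact. This is an illustration only: V_site, the sample values, and the function name are hypothetical. Plain exponential averaging is known to converge poorly for realistic systems, which is why intermediate coupling states and reweighting matter in practice.

```python
import numpy as np

KT = 0.593          # kcal/mol, approximately kT near 298 K
C0 = 1.0 / 1660.0   # standard concentration of 1 M in molecules per cubic angstrom

def binding_free_energy(u_decoupled, v_site, kt=KT):
    """Direct estimate of the standard binding free energy (kcal/mol).

    u_decoupled: binding energies sampled with the ligand in the site but
                 not interacting with the receptor (kcal/mol).
    v_site:      volume of the binding-site definition (cubic angstroms).
    """
    x = -np.asarray(u_decoupled, dtype=float) / kt
    # log of the sample mean of exp(-u/kT), computed stably in log space
    log_mean_exp = np.logaddexp.reduce(x) - np.log(x.size)
    excess = -kt * log_mean_exp
    ideal = -kt * np.log(C0 * v_site)
    return float(ideal + excess)

# Sanity check on invented data: every sample has u = -5 kcal/mol and the
# site volume cancels the standard concentration, so the estimate is -5.
dg = binding_free_energy([-5.0, -5.0, -5.0, -5.0], v_site=1660.0)
print(f"{dg:.3f} kcal/mol")
```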
Receptor reorganization in protein-ligand binding: HIV-1 Protease and
Reverse Transcriptase
We are simultaneously investigating computational approaches to
calculate receptor strain energies that are a component of binding free
energies. The overall affinity of a ligand for a receptor can be expressed as a
balance between the strength of the interactions of the ligand for a particular
binding-competent conformation of the receptor and the probability of
occurrence of that conformation in the absence of the ligand. For receptors
that do not experience much conformational change upon binding, the affinity is
primarily based on the interaction between the ligand and receptor. However,
for flexible receptors such as HIV-1 reverse transcriptase (RT), HIV-1 protease
and a number of kinases, the conformational changes in the receptor may require
inclusion of receptor reorganization or strain energy to properly model the
binding of ligands to that protein. We are exploring the use of bioinformatic
techniques and simulations to model the receptor strain energy for flexible
receptors.
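The balance described above can be made concrete with a two-term decomposition. The sketch assumes the binding free energy is the sum of a reorganization term, -kT ln p, where p is the population of the binding-competent conformation in the absence of ligand, and an interaction term for binding to that conformation. The function names and numbers are invented for illustration.

```python
import math

KT = 0.593  # kcal/mol, approximately kT near 298 K

def reorganization_free_energy(p_competent, kt=KT):
    """Free energy cost of selecting the binding-competent receptor conformation."""
    if not 0.0 < p_competent <= 1.0:
        raise ValueError("p_competent must lie in (0, 1]")
    return -kt * math.log(p_competent)

def binding_free_energy(dg_interaction, p_competent, kt=KT):
    """Interaction free energy plus the receptor reorganization penalty."""
    return dg_interaction + reorganization_free_energy(p_competent, kt)

# Same ligand-receptor interaction strength for a rigid and a flexible receptor.
dg_rigid = binding_free_energy(-10.0, p_competent=1.0)
dg_flexible = binding_free_energy(-10.0, p_competent=0.01)
print(f"rigid: {dg_rigid:.2f} kcal/mol, flexible: {dg_flexible:.2f} kcal/mol")
```

A binding-competent conformation populated only 1% of the time in the unliganded receptor costs about 2.7 kcal/mol of affinity at this temperature.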
We are also working with collaborators at Rutgers (Eddy Arnold group) and the
University of Pittsburgh School of Medicine (Mike Parniak group) to identify
lead compounds that will inhibit the ribonuclease H activity of HIV-1 RT.
Structure based design of inhibitors for the RNase H function of RT has lagged
behind inhibitor design for the polymerase function of RT due to limited
structural information about the binding modes of inhibitors to the RNase H
active site. Our current strategy attempts to leverage experimental information
generated by our collaborators which has identified active compounds in large
libraries by comparing this information with the predictions of enrichment for
the same compounds when they are docked to many putative receptor sites on the
RNase H domain.
Paris, K.A., O. Haq, A.K. Felts, K. Das, E.
Arnold, and R.M. Levy (2009). Conformational Landscape of the Human
Immunodeficiency Virus Type 1 Reverse Transcriptase Non-Nucleoside
Inhibitor Binding Pocket: Lessons for Inhibitor Design from a Cluster
Analysis of Many Crystal Structures. J. Med. Chem., 52,
6413-6420. DOI: 10.1021/jm900854h. PMCID: PMC3182518.
Lapelosa, M., G. Ferstandig Arnold, E. Gallicchio,
E. Arnold, and R.M. Levy (2010). Antigenic Characteristics of
Rhinovirus Chimeras Designed in silico for Enhanced Presentation of HIV-1
gp41 Epitopes. J. Mol. Biol., 397, 752-766. DOI:
10.1016/j.jmb.2010.01.064. PMCID: PMC2940250.
Lapelosa, M., E. Gallicchio, G. Ferstandig Arnold,
E. Arnold and R.M. Levy (2009). In silico Vaccine Design Based on
Molecular Simulations of Rhinovirus Chimeras Presenting HIV-1 gp41
Epitopes. J. Mol. Biol., 385, 675-691. DOI:
10.1016/j.jmb.2008.10.089. PMCID: PMC2649764.
Protein stability and the evolution of drug resistance: bioinformatics and biophysics
Mutational patterns in HIV-1 Protease
Proteins evolve through random mutagenesis and their evolutionary selection is
constrained by structural, functional and environmental limitations.
Thermodynamic stability is by far the most important structural factor, as most
proteins need to be folded in order to function. The stability range of
proteins is narrow, and is estimated experimentally to be ~ 10 kcal/mol or
less. Proteins operate "on a knife's edge," whereby a single highly deleterious
mutation could potentially lead to an unfolded protein. Drug resistance is
acquired through mutations. Primary mutations are typically destabilizing, and
must be accompanied by or correlated with compensatory stabilizing mutations.
We are developing statistical methods to model observed mutational patterns
within proteins evolving in response to drug pressure. The system we are
currently working on is HIV-1 protease for which we have access to over 40,000
publicly available amino acid sequences, isolated from patients undergoing
chemotherapy. The mutational patterns in HIV protease are highly non-trivial
and involve primary mutations that confer drug resistance, compensatory
stabilizing effects, and viral evolvability. Our aim is to rationalize the
mutational patterns using coarse grained biophysical models and to thereby
relate them to the structure and energetics of the enzyme.
Computational mutagenesis methods, ranging from statistical and empirical to
physical approaches, have been useful for understanding and predicting protein
stabilities. Most existing methods are limited because they do not reproduce
the magnitude of the free energy change (ddG), the difference in free energy
of unfolding between wild-type and mutant proteins: deviations from
experimental ddG values exceed 1.0 kcal/mol. In order to overcome this
problem, we are developing a computational approach that uses sampling combined
with a linear interaction energy (LIE) method to predict the changes in the
free energy of the native state induced by mutations. Initial tests of the
method have shown an unsigned error between calculated and experimental values
of <1 kcal/mol. Further development of the LIE functional form in combination
with sampling techniques is an active, promising project which will allow for
more accurate prediction of mutational effects on protein stability which are
important to the understanding of drug-resistant mutations that are formed in
many protein targets including HIV-1 protease.