Ron Levy group

Exploring molecular landscapes

INTRODUCTION TO CURRENT RESEARCH IN THE LEVY GROUP: 2018

Research in the Levy group is focused on the development and application of computational methods for studying the structure, function, and dynamics of proteins. We work on problems involving the interplay between computational models in structural biology and experiments at different levels of resolution and different time scales. Using a statistical mechanics framework, we are mapping conformational free energy landscapes that determine the statistical thermodynamic basis for protein-ligand binding and protein allostery. We are surveying corresponding fitness landscapes in sequence space, and developing new statistical methods to analyze how correlated mutations evolve under drug selection pressure, leading to resistance.

Protein Dynamics, Folding, and Misfolding

In recent work we have coupled replica exchange simulations with kinetic network models to elucidate the heterogeneity of the pathways by which small proteins fold. In a 2015 report we explained why relaxation within the protein unfolded basin can be very fast even though state-to-state first passage times within that basin can be very slow, thus resolving a paradox the field was grappling with. Our work also explains why many small proteins fold with single-exponential kinetics even though the folding pathways are diverse and have different barriers.

Levy, Ronald M., Wei Dai, Nan-Jie Deng, and Dmitrii E. Makarov (2013). How long does it take to equilibrate the unfolded state of a protein? Protein Science, 22(11), 1459-1465. DOI: 10.1002/pro.2335. PMCID: PMC3831662.
Deng, Nan-jie, Wei Dai, and Ronald M. Levy (2013). How Kinetics within the Unfolded State Affects Protein Folding: An Analysis Based on Markov State Models and an Ultra-Long MD Trajectory. J. Phys. Chem. B, 117(42), 12787-12799. DOI: 10.1021/jp401962k. PMCID: PMC3808496.
Xia, Junchao and Ronald M. Levy (2014). Molecular Dynamics of the Proline Switch and Its Role in Crk Signaling. J. Phys. Chem. B, 118(17), 4535-4545. DOI: 10.1021/jp5013297. PMCID: PMC4007982.
Dai, Wei, Anirvan M. Sengupta, and Ronald M. Levy (2015). First Passage Times, Lifetimes, and Relaxation Times of Unfolded Proteins. Phys. Rev. Lett., 115(4), 048101. DOI: 10.1103/PhysRevLett.115.048101. PMCID: PMC4531052.

Solvation Thermodynamics in Biophysics and Structural Biology

It is widely believed that the displacement of water from binding sites at protein receptor surfaces plays a key role in determining protein-ligand binding affinity. The loosely expressed idea is that by displacing a water molecule whose thermodynamic signature is “unfavorable” relative to the bulk, a ligand may gain extra binding affinity. While this suggestion appears frequently in the literature, the idea is not clearly formulated in this way, as the chemical potential of the solvent is constant throughout the solution. However, the excess chemical potential varies throughout the solution, and knowledge of its direct and indirect parts can be used to inform the ligand design process. Earlier efforts to take the thermodynamic signatures of interfacial waters into account in ligand design have been based on inhomogeneous solvation theory (IST). We have questioned whether IST is the best framework for attacking this problem and have proposed an alternative approach based on classical density functional theory (DFT). We plan to use this new statistical thermodynamic framework to analyze the role of interfacial water in protein-ligand binding and protein stability.

Levy, Ronald M., Di Cui, Bin W. Zhang, and Nobuyuki Matubayasi (2017). The Relationship Between Solvation Thermodynamics from IST and DFT Perspectives. The Journal of Physical Chemistry B, 121(15), 3825-3841. DOI: 10.1021/acs.jpcb.6b12889. PMCID: PMC5869707.
Cui, Di, Bin W. Zhang, Nobuyuki Matubayasi, and Ronald M. Levy (2017). The Role of Interfacial Water in Protein–Ligand Binding: Insights from the Indirect Solvent Mediated Potential of Mean Force. Journal of Chemical Theory and Computation, 14(2), 512–526. DOI: 10.1021/acs.jctc.7b01076. PMCID: PMC5897112.
Zhang, Bin W., Di Cui, Nobuyuki Matubayasi, and Ronald M. Levy (2018). The Excess Chemical Potential of Water at the Interface with a Protein from End Point Simulations. The Journal of Physical Chemistry B, 122(17), 4700-4707. DOI: 10.1021/acs.jpcb.8b02666. PMCID: PMC5939383.

Mapping Free Energy Landscapes for Protein-Ligand Binding and Allostery, and the Design of Inhibitors of HIV-1 Proteins

Conformational dynamics plays a fundamental role in the regulation of molecular recognition, and statistical mechanics provides the framework to derive a comprehensive theory for the binding of a ligand to a protein. Our goal is to develop models that are accurate enough to predict thermodynamic and kinetic properties and, just as importantly, to generate qualitative insights about the molecular mechanisms of binding and allosteric conformational transitions. We have long-standing, strong collaborations with structural biologists, biochemists, and virologists working on the design of inhibitors of HIV-1 proteins. Our model of the structure of the ALLINI-induced IN multimer was validated by the first reported crystal structure, published in 2016.

Vijayan, R. S. K., Peng He, Vivek Modi, Krisna C. Duong-Ly, Haiching Ma, Jeffrey R. Peterson, Roland L. Dunbrack, and Ronald M. Levy (2015). Conformational Analysis of the DFG-Out Kinase Motif and Biochemical Profiling of Structurally Validated Type II Inhibitors. Journal of Medicinal Chemistry, 58(1), 466-479. DOI: 10.1021/jm501603h. PMCID: PMC4326797.
Gallicchio, Emilio, Nanjie Deng, Peng He, Lauren Wickstrom, Alexander L. Perryman, Daniel N. Santiago, Stefano Forli, Arthur J. Olson, and Ronald M. Levy (2014). Virtual screening of integrase inhibitors by large scale binding free energy calculations: the SAMPL4 challenge. Journal of Computer-Aided Molecular Design, 28(4), 475-490. DOI: 10.1007/s10822-014-9711-9. PMCID: PMC4137862.
Mentes, Ahmet, Nan-Jie Deng, R. S. K. Vijayan, Junchao Xia, Emilio Gallicchio, and Ronald M. Levy (2016). Binding Energy Distribution Analysis Method: Hamiltonian Replica Exchange with Torsional Flattening for Binding Mode Prediction and Binding Free Energy Estimation. Journal of Chemical Theory and Computation, 12(5), 2459-2470. DOI: 10.1021/acs.jctc.6b00134. PMCID: PMC4862910.
Deng, Nanjie, Ashley Hoyte, Yara E Mansour, Mosaad S. Mohamed, James R. Fuchs, Alan N. Engelman, Mamuka Kvaratskhelia, and Ronald M. Levy (2016). Allosteric HIV-1 integrase inhibitors promote aberrant protein multimerization by directly mediating inter-subunit interactions: Structural and thermodynamic modeling studies. Protein Science, 25(11), 1911-1917. DOI: 10.1002/pro.2997. PMCID: PMC5079246.
Zhang, Bin W., Nanjie Deng, Zhiqiang Tan, and Ronald M. Levy (2017). Stratified UWHAM and Its Stochastic Approximation for Multicanonical Simulations Which are Far from Equilibrium. Journal of Chemical Theory and Computation, 13, 4660-4674. DOI: 10.1021/acs.jctc.7b00651. PMCID: PMC5897113.

Mapping the Fitness Landscapes of Proteins and the Evolution of Drug Resistance

We are pioneering new sequence-based statistical inference methods to analyze correlated mutations that arise from evolutionary constraints and from drug selection pressure. We are using both sequence-based and structure-based approaches to map the conformational and fitness landscapes of kinase family proteins. We have constructed Potts Hamiltonian models based on multiple sequence alignments of HIV-1 proteins, and used these models to study the evolution of HIV-1 under drug selection pressure, which leads to entrenchment of primary mutations through epistatic interactions with the sequence background.
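
As a rough illustration of how a Potts model scores sequences and captures background dependence, the sketch below evaluates a toy two-position model. It assumes the convention E(S) = sum of single-site fields plus sum of pairwise couplings, with P(S) proportional to exp(-E(S)); sign conventions vary between papers. The function names and all parameter values are hypothetical and are not inferred from any alignment.

```python
def potts_energy(seq, fields, couplings):
    """Statistical energy E(S) = sum_i h_i(s_i) + sum_{i<j} J_ij(s_i, s_j).

    Assumed convention: lower energy means higher probability, P(S) ~ exp(-E(S)).
    Missing entries in the parameter tables are treated as zero.
    """
    energy = sum(fields[i].get(res, 0.0) for i, res in enumerate(seq))
    for (i, j), table in couplings.items():
        energy += table.get((seq[i], seq[j]), 0.0)
    return energy


def mutation_cost(background, position, new_residue, fields, couplings):
    """Energy change for a single substitution in a given sequence background."""
    mutant = background[:position] + new_residue + background[position + 1:]
    return (potts_energy(mutant, fields, couplings)
            - potts_energy(background, fields, couplings))


# Hypothetical toy parameters (illustration only, not fitted to data).
fields = [{"A": 0.0, "V": 1.0}, {"L": 0.0, "M": 0.5}]
couplings = {(0, 1): {("V", "M"): -2.0}}

# The same A->V substitution at position 0, scored in two different backgrounds.
cost_in_L = mutation_cost("AL", 0, "V", fields, couplings)
cost_in_M = mutation_cost("AM", 0, "V", fields, couplings)
print(cost_in_L, cost_in_M)
```

In this toy case the substitution is penalized in one background and favored in the other, which is the kind of epistatic background dependence that can entrench a primary mutation.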

Haldane, Allan, William F. Flynn, Peng He, R. S. K. Vijayan, and Ronald M. Levy (2016). Structural propensities of kinase family proteins from a Potts model of residue co-variation. Protein Science, 25(8), 1378-1384. DOI: 10.1002/pro.2954. PMCID: PMC4972195.
Levy, Ronald M., Allan Haldane, and William F. Flynn (2017). Potts Hamiltonian models of protein co-variation, free energy landscapes, and evolutionary fitness. Current Opinion in Structural Biology, 43, 55-62. DOI: 10.1016/j.sbi.2016.11.004. PMCID: PMC5869684.
Flynn, William F., Allan Haldane, Bruce E. Torbett, and Ronald M. Levy (2017). Inference of epistatic effects leading to entrenchment and drug resistance in HIV-1 protease. Molecular Biology and Evolution, 34(6), 1291. DOI: 10.1093/molbev/msx095. PMCID: PMC5435099.
Haldane, Allan, William F. Flynn, Peng He, and Ronald M. Levy (2018). Coevolutionary Landscape of Kinase Family Proteins: Sequence Probabilities and Functional Motifs. Biophysical Journal, 114(1), 21-31. DOI: 10.1016/j.bpj.2017.10.028. PMCID: PMC5773752.

INTRODUCTION TO RESEARCH IN THE LEVY GROUP: 2012

Molecular simulations have come to play a central role in structural biology and biophysics; they are beginning to be used in cell and systems biology as well. Simulations help us develop our intuition about the behavior of models which link biological structures to function. Protein folding, molecular recognition and ligand binding, and biological machines used for transport and signaling are some of the research areas that have been greatly enriched by computational approaches based on molecular simulations. The design of effective potentials for modeling solvation effects implicitly, the development of new sampling methods based on replica exchange molecular dynamics, and the exploration of rare events using network models are themes in the Levy group which run through our current research highlighted below.

New multi-scale models, effective potentials, and sampling methods for molecular simulations

Implicit Solvent Models

Water plays a fundamental role in virtually all biological processes. The accurate modeling of hydration thermodynamics is therefore essential for studying protein conformational equilibria, aggregation, and binding. Explicit solvent models provide the most detailed description of hydration phenomena, but they have inherent limitations that motivate the search for other ways to represent solvation. Implicit solvent models, which are based on the statistical mechanics concept of the solvent potential of mean force, are very useful alternatives to explicit solvation for modeling protein folding and binding.

We have developed an implicit solvent effective potential (AGBNP) that is suitable for molecular dynamics simulations and high-resolution modeling. It is based on a novel implementation of the pairwise descreening Generalized Born model for the electrostatic component and a nonpolar hydration free energy estimator. The model is fully analytical with first derivatives, is computationally efficient, and has been incorporated into the IMPACT molecular simulation program.
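
For orientation, the electrostatic term of a generic Generalized Born model can be sketched as follows. This is the textbook Still-type pairwise expression with Born radii supplied as inputs; it is not the AGBNP implementation. The pairwise descreening step that computes the radii and the nonpolar estimator are omitted, and the function name and default dielectric constants are assumptions.

```python
import math

COULOMB = 332.0637  # Coulomb constant in kcal*Angstrom/(mol*e^2)


def gb_polar_energy(charges, coords, born_radii, eps_in=1.0, eps_solvent=80.0):
    """Generalized Born polarization energy in kcal/mol (Still-type formula).

    charges in units of e, coords and born_radii in Angstrom.
    The double sum includes the i == j self terms, hence the factor 1/2.
    """
    prefactor = -0.5 * COULOMB * (1.0 / eps_in - 1.0 / eps_solvent)
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            bb = born_radii[i] * born_radii[j]
            # Interpolates between the Born self term (r = 0) and Coulomb's law.
            f_gb = math.sqrt(r2 + bb * math.exp(-r2 / (4.0 * bb)))
            total += charges[i] * charges[j] / f_gb
    return prefactor * total


# A single monovalent ion with a 2 Angstrom Born radius (Born-equation limit).
ion_energy = gb_polar_energy([1.0], [(0.0, 0.0, 0.0)], [2.0])
print(round(ion_energy, 2))
```

Because every term is an analytical function of the coordinates and radii, first derivatives are available in closed form, which is what makes this family of models practical for molecular dynamics.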

Gallicchio, E., K. Paris, and R.M. Levy (2009). The AGBNP2 Implicit Solvent Model. J. Chem. Theory and Comput., 5, 2544-2564. DOI: 10.1021/ct900234u. PMCID: PMC2857935.

Replica Exchange Molecular Dynamics

One of the key challenges in the computer simulation of proteins at the atomic level is the sampling of conformational space. The efficiency of many common sampling protocols such as Monte Carlo (MC) and Molecular Dynamics (MD) is limited by the need to cross high free-energy barriers and rugged energy landscapes. In the Replica Exchange (RE) algorithm many coupled simulations are run in parallel; the coupling is achieved by exchanging either thermodynamic or Hamiltonian parameters. We are exploring novel implementations of replica exchange to accelerate the convergence of protein folding simulations, and the calculation of protein-ligand binding free energies. In the latter example, the REMD coupling parameter is the protein-ligand interaction energy.
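
The exchange step can be sketched with the standard Metropolis criterion for temperature replica exchange. This is a minimal illustration, assuming energies in kcal/mol and temperatures in kelvin; the function names are hypothetical, and the Hamiltonian exchange variant used for binding calculations is not shown.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def swap_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis acceptance probability for swapping two temperature replicas.

    Replica i currently has potential energy energy_i at temperature temp_i,
    and likewise for j. Acceptance is min(1, exp[(beta_i - beta_j)(E_i - E_j)]).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    log_ratio = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def attempt_swap(energy_i, energy_j, temp_i, temp_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_probability(energy_i, energy_j, temp_i, temp_j)


# The colder replica holding the higher energy is always swapped;
# the reverse situation is accepted only part of the time.
p_favorable = swap_probability(-100.0, -110.0, 300.0, 310.0)
p_unfavorable = swap_probability(-110.0, -100.0, 300.0, 310.0)
print(p_favorable, p_unfavorable)
```

Accepting swaps with this probability preserves the canonical distribution at every temperature while letting configurations diffuse up to temperatures where barriers are easier to cross.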

Network Models, Rare Events, and Transition Paths

We can think of the data generated by replica exchange simulations of protein folding and protein-ligand binding as generating a trace through phase space. We can organize the data using graph theoretic ideas, where the nodes are conformations and the edges represent possible jumps between closely related nodes. We use the magic of histogram re-weighting to assign relative probabilities to the jumps (edges) which connect the nodes. Furthermore, we can construct transition paths on the graphs which correspond to rare transitions between stable states, and then use the tools of transition path theory to analyze the transition path ensemble, in order to determine the number of truly different important paths and their fluxes. This is a form of multi-scale modeling where the underlying data is derived from atomic simulations, while the transition paths are constructed ex post facto.
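
One ingredient of transition path theory, the forward committor, can be sketched on a small graph as follows. This is a minimal illustration, assuming a row-stochastic matrix of jump probabilities between nodes has already been assigned (for example from reweighted populations); the function name and the four-node chain are hypothetical.

```python
def forward_committor(transition, source, sink, tol=1e-12, max_sweeps=100000):
    """Probability of reaching `sink` before `source`, starting from each node.

    transition[i][j] is the jump probability i -> j (each row sums to 1).
    Solves q_i = sum_j T_ij q_j on interior nodes by Gauss-Seidel sweeps,
    with boundary values q = 0 on source nodes and q = 1 on sink nodes.
    Assumes no interior node is absorbing (T_ii < 1).
    """
    n = len(transition)
    q = [1.0 if i in sink else 0.0 for i in range(n)]
    interior = [i for i in range(n) if i not in source and i not in sink]
    for _ in range(max_sweeps):
        change = 0.0
        for i in interior:
            off_diag = sum(transition[i][j] * q[j] for j in range(n) if j != i)
            new_value = off_diag / (1.0 - transition[i][i])
            change = max(change, abs(new_value - q[i]))
            q[i] = new_value
        if change < tol:
            break
    return q


# Hypothetical four-node linear chain: node 0 is the source, node 3 the sink.
chain = [
    [0.50, 0.50, 0.00, 0.00],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.00, 0.00, 0.50, 0.50],
]
committor = forward_committor(chain, source={0}, sink={3})
print(committor)
```

Reactive fluxes along individual edges are built from the committor together with the stationary probabilities of the nodes, which is how distinct pathways and their relative weights can be compared.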

Gallicchio, E., M. Andrec, A.K. Felts, and R.M. Levy (2005). Temperature Weighted Histogram Analysis Method, Replica Exchange, and Transition Paths. J. Phys. Chem. B, 109, 6722-6731. DOI: 10.1021/jp045294f.
Gallicchio, E., R.M. Levy, and M. Parashar (2008). Asynchronous Replica Exchange for Molecular Simulations. J. Comput. Chem., 29, 788-794. DOI: 10.1002/jcc.20839. PMCID: PMC2977925.
Zheng, W., M. Andrec, E. Gallicchio, and R.M. Levy (2007). Simulating replica exchange simulations of protein folding with a kinetic network model. Proc. Natl. Acad. Sci. USA, 104, 15340-15345. DOI: 10.1073/pnas.0704418104. PMCID: PMC2000486.
Zheng, W., M. Andrec, E. Gallicchio, and R.M. Levy (2008). Simple Continuous and Discrete Models for Simulating Replica Exchange Simulations of Protein Folding. J. Phys. Chem. B, 112, 6083-6093. DOI: 10.1021/jp076377+. PMCID: PMC2978075.

Protein folding and misfolding: fundamental physics, health-related applications

Kinetic Network Models and Protein Folding

Protein folding is a fundamental problem in modern molecular biophysics and is an example of a slow process occurring via rare events in a high-dimensional configurational space. For this reason, it is difficult for an all-atom simulation to obtain meaningful information on the kinetics and pathways of such processes. A number of strategies for addressing this problem have been proposed over the years; they focus on the important slow processes and neglect the less interesting rapid kinetics by simplifying the state space or reducing its dimensionality. Generalized ensemble methods such as replica exchange molecular dynamics (REMD) have been developed that enhance the ability to obtain accurate canonical populations in complex systems by increasing sampling efficiency. However, since REMD involves temperature swaps between MD trajectories, it is not straightforward to obtain kinetic information from such simulations. We have chosen to make use of a kinetic network model in which the nodes correspond to discrete molecular conformations from REMD simulation trajectories (rather than macrostates), and the edges are derived from an ansatz based on structural similarity. Since the network is a discretized representation of the system and does not require additional energy and force evaluations, there is a considerable gain in efficiency, allowing us to study much slower kinetic processes than would be accessible using conventional MD. Because the network topology can be constructed based on virtually all degrees of freedom, the model allows for multiple pathways and complex transition states. Application of the kinetic network model to the folding of mini-proteins such as Trp-Cage, and to the conformational reorganization of the HIV-1 protease receptor pocket, provides us with detailed information about the rich variety of transition pathways, both their structural characteristics and their kinetics.
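
The construction can be sketched in a few lines. This is a minimal illustration, assuming a precomputed matrix of pairwise structural distances between sampled conformations (for example RMSD), a hypothetical similarity cutoff, and one uniform rate for every edge; the edge ansatz and rate assignment used in the published work may differ. The walk also assumes the target is reachable from the start node.

```python
import random


def build_network(distances, cutoff):
    """Connect conformations i and j when their structural distance is within cutoff."""
    n = len(distances)
    return {
        i: [j for j in range(n) if j != i and distances[i][j] <= cutoff]
        for i in range(n)
    }


def first_passage_time(network, start, target, rate=1.0, rng=None):
    """One kinetic Monte Carlo (Gillespie) trajectory on the network.

    Every edge carries the same hypothetical rate; returns the time at which
    the walker first reaches `target`. No energies or forces are evaluated.
    """
    rng = rng or random.Random()
    state, elapsed = start, 0.0
    while state != target:
        neighbors = network[state]
        if not neighbors:
            raise ValueError("node %d has no edges" % state)
        # Waiting time is exponential in the total escape rate from this node.
        elapsed += rng.expovariate(rate * len(neighbors))
        state = rng.choice(neighbors)
    return elapsed


# Hypothetical toy data: four conformations whose distances form a linear chain.
distances = [[abs(i - j) for j in range(4)] for i in range(4)]
network = build_network(distances, cutoff=1.0)

rng = random.Random(0)
samples = [first_passage_time(network, 0, 3, rng=rng) for _ in range(1000)]
print(network, sum(samples) / len(samples))
```

Each step costs only a random number draw and a table lookup, which is why such a network can reach time scales far beyond those of the molecular dynamics runs that supplied the nodes.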

Andrec, M., A.K. Felts, E. Gallicchio, and R.M. Levy (2005). Protein Folding Pathways from Replica Exchange Simulations and a Kinetic Network Model. Proc. Natl. Acad. Sci. USA, 102, 6801-6806. DOI: 10.1073/pnas.0408970102. PMCID: PMC1100763.
Zheng, W., M. Andrec, E. Gallicchio, and R.M. Levy (2009). Recovering Kinetics from a Simplified Protein Folding Model Using Replica Exchange Simulations: A Kinetic Network and Effective Stochastic Dynamics. J. Phys. Chem. B, 113, 11702-11709. DOI: 10.1021/jp900445t. PMCID: PMC2975981.

Health-related applications: α-synuclein and Parkinson's disease

Unlike globular proteins with stable native structures, partially folded proteins and natively unfolded proteins are best characterized as conformational ensembles of rapidly interconverting structures. Structural characterization of these dynamic systems is critical to understanding the basis for protein misfolding that results in protein aggregation and disease. In collaboration with Jean Baum's research group we are integrating experiments and computational models to develop a molecular picture of the steps involved in the aggregation of α-synuclein, an intrinsically disordered protein that appears in aggregated forms in the brains of patients with Parkinson's disease. The conversion from the monomeric disordered state to the highly ordered aggregate fibril form is both complex and not well understood. Aggregation rates are sensitive to changes in amino acid sequence and environmental conditions. We are focusing on the early steps in the aggregation process.

Weinstock, D.S., C. Narayanan, J. Baum, and R.M. Levy (2008). Correlation Between 13C-alpha Chemical Shifts and Helix Content of Peptide Ensembles. Protein Science, 17, 950-954. DOI: 10.1110/ps.073365408. PMCID: PMC2327285.
Wu, K-P, D.S. Weinstock, C. Narayanan, R.M. Levy, and J. Baum (2009). Structural reorganization of α-synuclein at low pH observed by NMR and REMD simulations. J. Mol. Biol., 391, 784-796. DOI: 10.1016/j.jmb.2009.06.063. PMCID: PMC2766395.

The energy landscape for ligand binding to protein receptors and drug design

Understanding how to accurately predict binding free energies of small molecule ligands to protein receptor targets is a very active area of research in our lab. The statistical mechanics of binding can be formulated in many alternative ways; we are pursuing approaches particularly well suited to modern cluster computing environments with highly parallelized computation.

Gallicchio, E., and R.M. Levy (2012). Prediction of SAMPL3 Host-Guest Affinities with the Binding Energy Distribution Analysis Method (BEDAM). J Comp Aided Mol Design, 26, 505-516. DOI: 10.1007/s10822-012-9552-3. PMCID: PMC3383899.

Statistical mechanics and the free energy of binding

A new method for estimating protein-ligand affinities with implicit solvation has been developed: the Binding Energy Distribution Analysis Method (BEDAM). BEDAM is a statistical mechanics formulation of the free energy of binding based on the distribution of binding energies between the ligand and the protein receptor, obtained in a fictitious ensemble in which the ligand resides in the binding pocket without interacting with the receptor. The framework combines implicit solvation, Hamiltonian replica exchange parallel molecular dynamics sampling, and reweighting techniques. The formulation takes into account ligand-receptor interactions, multiple binding modes, and conformational entropy losses, and is currently being applied to a series of important medicinal targets.
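
The central identity can be sketched as an exponential average of binding energies u sampled in the non-interacting ensemble, dG = -kT ln < exp(-u/kT) >. The code below is a minimal illustration of that average only, assuming energies in kcal/mol; the function name is hypothetical, the standard-state concentration term is omitted, and the sample values are invented. In practice a direct average of this kind converges poorly, which is why the method relies on Hamiltonian replica exchange and reweighting.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def exponential_average_free_energy(binding_energies, temperature=300.0):
    """Return -kT ln < exp(-u / kT) > over the supplied binding energies.

    The energies are assumed to be sampled with the ligand confined to the
    binding site but not interacting with the receptor. A log-sum-exp
    shift keeps the exponentials from overflowing.
    """
    beta = 1.0 / (K_B * temperature)
    exponents = [-beta * u for u in binding_energies]
    shift = max(exponents)
    mean_exp = sum(math.exp(x - shift) for x in exponents) / len(exponents)
    return -(shift + math.log(mean_exp)) / beta


# Hypothetical samples: most poses clash (positive u), a few bind favorably.
samples = [25.0, 40.0, 12.0, -6.0, 80.0, -8.5, 30.0, 55.0]
estimate = exponential_average_free_energy(samples)
print(estimate)
```

The estimate is dominated by the rare favorable tail of the distribution and is less negative than the best single energy, reflecting the entropic cost of restricting the ligand to the few poses that bind well.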

Receptor reorganization in protein-ligand binding: HIV-1 Protease and Reverse Transcriptase

We are simultaneously investigating computational approaches to calculate receptor strain energies that are a component of binding free energies. The overall affinity of a ligand for a receptor can be expressed as a balance between the strength of the interactions of the ligand for a particular binding-competent conformation of the receptor and the probability of occurrence of that conformation in the absence of the ligand. For receptors that do not experience much conformational change upon binding, the affinity is primarily based on the interaction between the ligand and receptor. However, for flexible receptors such as HIV-1 reverse transcriptase (RT), HIV-1 protease and a number of kinases, the conformational changes in the receptor may require inclusion of receptor reorganization or strain energy to properly model the binding of ligands to that protein. We are exploring the use of bioinformatic techniques and simulations to model the receptor strain energy for flexible receptors.
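
The balance described above can be written, for a single binding-competent conformation, as the sum of an interaction term and a reorganization penalty of -kT ln p, where p is the population of that conformation in the unbound receptor. The sketch below assumes this single-conformation picture; the function names and numerical values are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def reorganization_free_energy(probability, temperature=300.0):
    """Free energy cost of selecting a conformation with the given apo population."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -K_B * temperature * math.log(probability)


def total_binding_free_energy(interaction_free_energy, probability, temperature=300.0):
    """Interaction free energy with the competent conformation plus the reorganization cost."""
    return interaction_free_energy + reorganization_free_energy(probability, temperature)


# A conformation populated 1% of the time without ligand costs about 2.7 kcal/mol at 300 K.
penalty = reorganization_free_energy(0.01)
net = total_binding_free_energy(-10.0, 0.01)
print(round(penalty, 2), round(net, 2))
```

For a rigid receptor p is close to 1 and the penalty vanishes, which recovers the case where affinity is set by the ligand-receptor interaction alone.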

We are also working with collaborators at Rutgers (Eddy Arnold group) and the University of Pittsburgh School of Medicine (Mike Parniak group) to identify lead compounds that will inhibit the ribonuclease H activity of HIV-1 RT. Structure-based design of inhibitors for the RNase H function of RT has lagged behind inhibitor design for the polymerase function of RT because of limited structural information about the binding modes of inhibitors to the RNase H active site. Our current strategy leverages experimental data from our collaborators, who have identified active compounds in large libraries: we compare these data with the predicted enrichment of the same compounds when they are docked to many putative receptor sites on the RNase H domain.

Paris, K.A., O. Haq, A.K. Felts, K. Das, E. Arnold, and R.M. Levy (2009). Conformational Landscape of the Human Immunodeficiency Virus Type 1 Reverse Transcriptase Non-Nucleoside Inhibitor Binding Pocket: Lessons for Inhibitor Design from a Cluster Analysis of Many Crystal Structures. J. Med. Chem., 52, 6413-6420. DOI: 10.1021/jm900854h. PMCID: PMC3182518.
Lapelosa, M., G. Ferstandig Arnold, E. Gallicchio, E. Arnold, and R.M. Levy (2010). Antigenic Characteristics of Rhinovirus Chimeras Designed in silico for Enhanced Presentation of HIV-1 gp41 Epitopes. J. Mol. Biol., 397, 752-766. DOI: 10.1016/j.jmb.2010.01.064. PMCID: PMC2940250.
Lapelosa, M., E. Gallicchio, G. Ferstandig Arnold, E. Arnold and R.M. Levy (2009). In silico Vaccine Design Based on Molecular Simulations of Rhinovirus Chimeras Presenting HIV-1 gp41 Epitopes. J. Mol. Biol., 385, 675-691. DOI: 10.1016/j.jmb.2008.10.089. PMCID: PMC2649764.

Protein stability and the evolution of drug resistance: bioinformatics and biophysics

Mutational patterns in HIV-1 Protease

Proteins evolve through random mutagenesis and their evolutionary selection is constrained by structural, functional and environmental limitations. Thermodynamic stability is by far the most important structural factor, as most proteins need to be folded in order to function. The stability range of proteins is narrow, and is estimated experimentally to be ~10 kcal/mol or less. Proteins operate "on a knife's edge," whereby a single highly deleterious mutation could potentially lead to an unfolded protein. Drug resistance is acquired through mutations. Primary mutations are typically destabilizing, and must be accompanied by or correlated with compensatory stabilizing mutations. We are developing statistical methods to model observed mutational patterns within proteins evolving in response to drug pressure. The system we are currently working on is HIV-1 protease, for which we have access to over 40,000 publicly available amino acid sequences isolated from patients undergoing chemotherapy. The mutational patterns in HIV-1 protease are highly non-trivial and involve primary mutations that confer drug resistance, compensatory stabilizing effects, and viral evolvability. Our aim is to rationalize the mutational patterns using coarse-grained biophysical models and thereby to relate them to the structure and energetics of the enzyme.

Computational mutagenesis methods, ranging from statistical to empirical to physical approaches, have been useful for understanding and predicting protein stabilities. Most existing methods are limited because they cannot correctly reproduce the magnitude of ddG, the difference in the free energy of unfolding between wild-type and mutant proteins: they deviate from experimental ddG values by more than 1.0 kcal/mol. To overcome this problem, we are developing a computational approach that uses sampling combined with a linear interaction energy (LIE) method to predict the changes in the free energy of the native state induced by mutations. Initial tests of the method have shown an unsigned error between calculated and experimental values of <1 kcal/mol. Further development of the LIE functional form in combination with sampling techniques is an active, promising project that should allow more accurate prediction of mutational effects on protein stability, which are important for understanding the drug-resistance mutations that arise in many protein targets, including HIV-1 protease.
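
A generic linear interaction energy estimate can be sketched as a weighted sum of changes in ensemble-averaged interaction energies. The functional form used in the project described above is not given here, so the sketch assumes the common two-term form (van der Waals and electrostatic) with an optional constant; the function names are hypothetical, the coefficients are fitting parameters that must be calibrated against experiment, and the numbers are invented.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of sampled energies."""
    return sum(values) / len(values)


def lie_free_energy_change(vdw_mutant, elec_mutant, vdw_wild, elec_wild,
                           alpha, beta, gamma=0.0):
    """Generic LIE estimate: alpha * d<V_vdw> + beta * d<V_elec> + gamma.

    Each argument list holds interaction energies (kcal/mol) of the mutated
    residue with its surroundings, sampled from simulation snapshots.
    alpha, beta, and gamma are empirical coefficients (assumed, not published values).
    """
    d_vdw = mean(vdw_mutant) - mean(vdw_wild)
    d_elec = mean(elec_mutant) - mean(elec_wild)
    return alpha * d_vdw + beta * d_elec + gamma


# Hypothetical sampled energies and placeholder coefficients.
estimate = lie_free_energy_change(
    vdw_mutant=[-10.0, -12.0], elec_mutant=[-20.0, -22.0],
    vdw_wild=[-14.0, -16.0], elec_wild=[-25.0, -27.0],
    alpha=0.5, beta=0.2,
)
print(estimate)
```

Only end-state simulations of the wild-type and mutant proteins are needed, which keeps the cost well below that of alchemical free energy calculations.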