Ron Levy group

Exploring molecular landscapes

Structural Bioinformatics I

Biology 5411

Fall 2015

Time and Location:

  • 5:30PM, SERC 456

Instructors:

Description

This course covers the basic concepts of structural bioinformatics. It provides a broad qualitative overview of macromolecular structure and protein folding, including sequence alignment, secondary structure calculation, tertiary structure prediction, and biological databases. It also introduces the programming languages, data mining methods, and algorithms used in bioinformatics, to build competence in handling large and complex biological data.

Reference Texts

  • Arthur M. Lesk, Introduction to Protein Science: Architecture, Function, and Genomics, 2nd ed. Oxford University Press, 2010
  • Gregory A. Petsko, Dagmar Ringe, Protein Structure and Function, New Science Press, 2004
  • Carl Branden and John Tooze, Introduction to Protein Structure, 2nd ed. Garland Science, 1998
  • Arthur M. Lesk, Introduction to Bioinformatics, 3rd ed. Oxford University Press, 2008
  • R. Durbin, S. Eddy, A. Krogh, Biological Sequence Analysis, Cambridge University Press, 1998
  • Jenny Gu, Philip E. Bourne, Structural Bioinformatics, 2nd ed. Wiley, 2009
  • David Mount, Bioinformatics: Sequence and Genome Analysis, Cold Spring Harbor New York, 2013
  • Neil C. Jones and Pavel A. Pevzner, An Introduction to Bioinformatics Algorithms, MIT Press, 2004
  • Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006

Syllabus:

pdf

I. Introduction to Protein Structure

Dates | Topics | Lecturer | Lecture Slides
8/24 | Structure and chemistry of amino acids; Basic structural features of polypeptides; Primary structure; Secondary structure | Vijayan Ramaswamy | ppt
8/31 | Tertiary and quaternary structure; Protein main-chain conformation and Ramachandran plots; Sidechain conformation and rotamer libraries; Protein folding patterns, structural classification, and structural superposition | Vijayan Ramaswamy |

II. Protein Folding and Design

Dates | Topics | Lecturer | Lecture Slides
9/14 | Basic concepts: stability of the native state, kinetics of protein folding; Experimental characterization of events in protein folding; Thermodynamics of protein folding, hydrophobic collapse and molten globule; Impact of free energy landscapes on folding kinetics: folding funnel | Vincenzo Carnevale | ppt
9/21 | Effect of denaturants on folding/unfolding equilibrium; Relationship between native structure and folding; The hierarchical model; Protein engineering and design | Vincenzo Carnevale |

III. Bioalgorithms

Dates | Topics | Lecturer | Lecture Slides
9/28 | Introduction to UNIX and the command line; Shell commands, file system; Process management; Text editing | Vincent Voelz | pdf
10/05 | Introduction to scripting with Python; The interpreter; Data types: strings, lists, tuples, and dictionaries; Looping and control flow; Writing Python scripts; Functions, classes, modules; Scripting example: mc.py | Vincent Voelz | pdf
10/05 | Making plots with matplotlib; Scientific computing with Python; Searching and sorting; Graph theory; Depth-first vs. breadth-first searches; Computational complexity | Vincent Voelz | pdf
10/12 | Data clustering and classification; Distance metrics; The RMSD; Clustering; k-centers, k-means clustering; Hierarchical clustering; Support Vector Machines, regression | Vincent Voelz | pdf
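
For orientation on the 10/12 topic "The RMSD", the root-mean-square deviation between two equal-length sets of 3D coordinates can be sketched as below. This is a minimal illustration and is not taken from the lecture slides: the function name `rmsd` and the sample coordinates are hypothetical, and no structural superposition is performed before the distance is computed.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Assumption: the two structures are already aligned; this sketch
    does not superimpose them first.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Accumulate the squared Euclidean distance for each atom pair.
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical two-atom example: only the second atom moves, by 2 units in z.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(structure_a, structure_b))
```

A function like this can serve as the distance metric for the clustering methods listed in the same lecture, since k-centers and hierarchical clustering only need pairwise distances.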

IV. Databases

Dates | Topics | Lecturer | Lecture Slides
10/19 | Repositories and information retrieval; Nucleotide sequence databases; Protein sequence databases; Sequence motif databases | Vijayan Ramaswamy | ppt
10/26 | Protein structure databases; Small molecule databases; Protein structure repositories and visualization tools | Vijayan Ramaswamy |

V. Bioinformatics of Protein Sequences and Structure

Dates | Topics | Lecturer | Lecture Slides
11/02 | Sequence alignments; Measures of sequence similarity; Computing the alignment of two sequences (Smith-Waterman); The dynamic programming algorithm | Vincenzo Carnevale | TBD
11/09 | Statistical significance of alignments; Multiple sequence alignments; Structural inferences from multiple sequence alignments; Markov chains and Hidden Markov Models | Vincenzo Carnevale | TBD
11/16 | Formal definition of HMMs; Most probable state path: the Viterbi algorithm; The forward algorithm; Posterior decoding; Parameter estimation for HMMs; HMM model structure: choice of topology; Probabilistic modeling of sequence ensembles | Vincenzo Carnevale | TBD
12/02 | The direct problem; Statistical models and observables; Entropy and Kullback-Leibler divergence; The inverse problem; Statement of the inverse problem; Bayesian formulation; Maximum likelihood criteria | Vincenzo Carnevale | TBD
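
As a rough illustration of the 11/02 topics (Smith-Waterman and the dynamic programming algorithm), the sketch below computes only the best local alignment score of two sequences. It is not the course's implementation: the function name `smith_waterman_score`, the scoring values (match +2, mismatch -1, linear gap -1), and the example sequences are all assumptions chosen for brevity, and no traceback is performed to recover the alignment itself.

```python
def smith_waterman_score(seq_a, seq_b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score under a simple linear gap penalty.

    Assumption: the scoring parameters are illustrative defaults,
    not values from the course material.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    # H[i][j] holds the best score of a local alignment ending at
    # seq_a[i-1] and seq_b[j-1]; row 0 and column 0 stay at zero.
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            diag = H[i - 1][j - 1] + substitution  # align the two residues
            up = H[i - 1][j] + gap                 # gap in seq_b
            left = H[i][j - 1] + gap               # gap in seq_a
            # Clamping at zero is what makes the alignment local.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


print(smith_waterman_score("ACGT", "ACGT"))
print(smith_waterman_score("AAAA", "TTTT"))
```

The table is filled in time proportional to the product of the two sequence lengths, which connects this topic to the computational complexity material in the Bioalgorithms section.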