1 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu BIOINFORMATICS Structures Mark Gerstein, Yale University bioinfo.mbb.yale.edu/mbb452a (last edit.

Slides:



Advertisements
Similar presentations
Global Sequence Alignment by Dynamic Programming.
Advertisements

Alignment methods Introduction to global and local sequence alignment methods Global : Needleman-Wunch Local : Smith-Waterman Database Search BLAST FASTA.
Lecture 8 Alignment of pairs of sequence Local and global alignment
1 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu BIOINFORMATICS Introduction Mark Gerstein, Yale University bioinfo.mbb.yale.edu/mbb452a.
C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Alignments 1 Sequence Analysis.
CENG 789 – Digital Geometry Processing 06- Rigid-Body Alignment Asst. Prof. Yusuf Sahillioğlu Computer Eng. Dept,, Turkey.
Introduction to Bioinformatics Burkhard Morgenstern Institute of Microbiology and Genetics Department of Bioinformatics Goldschmidtstr. 1 Göttingen, March.
1 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu BIOINFORMATICS Structures Mark Gerstein, Yale University bioinfo.mbb.yale.edu/mbb452a.
Proteins  Proteins control the biological functions of cellular organisms  e.g. metabolism, blood clotting, immune system amino acids  Building blocks.
Alignment methods and database searching April 14, 2005 Quiz#1 today Learning objectives- Finish Dotter Program analysis. Understand how to use the program.
©CMBI 2002 Homology modelling ? X-ray ? NMR ? Intro Proteins Modelling 8 Steps Detect Threading Alignment Template Side chain Indels Optimize Validate.
1 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Several motifs (  -sheet, beta-alpha-beta, helix-loop-helix) combine to form a compact globular.
Protein threading Structure is better conserved than sequence
Alignment methods June 26, 2007 Learning objectives- Understand how Global alignment program works. Understand how Local alignment program works.
Pairwise Alignment Global & local alignment Anders Gorm Pedersen Molecular Evolution Group Center for Biological Sequence Analysis.
Recap 3 different types of comparisons 1. Whole genome comparison 2. Gene search 3. Motif discovery (shared pattern discovery)
Sequence analysis of nucleic acids and proteins: part 1 Based on Chapter 3 of Post-genome Bioinformatics by Minoru Kanehisa, Oxford University Press, 2000.
Structure Alignment in Polynomial Time Rachel Kolodny Stanford University Nati Linial The Hebrew University of Jerusalem.
Alignment methods II April 24, 2007 Learning objectives- 1) Understand how Global alignment program works using the longest common subsequence method.
Protein Structures.
IBGP/BMI 705 Lab 4: Protein structure and alignment TA: L. Cooper.
Bioiformatics I Fall Dynamic programming algorithm: pairwise comparisons.
BIOMETRICS Module Code: CA641 Week 11- Pairwise Sequence Alignment.
CSE554AlignmentSlide 1 CSE 554 Lecture 8: Alignment Fall 2014.
Chapter 9 Superposition and Dynamic Programming 1 Chapter 9 Superposition and dynamic programming Most methods for comparing structures use some sorts.
1 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu BIOINFORMATICS Structures Mark Gerstein, Yale University bioinfo.mbb.yale.edu/mbb452a.
Practical session 2b Introduction to 3D Modelling and threading 9:30am-10:00am 3D modeling and threading 10:00am-10:30am Analysis of mutations in MYH6.
COMPARATIVE or HOMOLOGY MODELING
CISC667, S07, Lec5, Liao CISC 667 Intro to Bioinformatics (Spring 2007) Pairwise sequence alignment Needleman-Wunsch (global alignment)
Structure superposition ≠ Structure alignment Lecture 11 Chapter 16, Du and Bourne “Structural Bioinformatics”
PDBe-fold (SSM) A web-based service for protein structure comparison and structure searches Gaurav Sahni, Ph.D.
1 Randomized Algorithms for Three Dimensional Protein Structures Comparison Yaw-Ling Lin Dept Computer Sci and Info Engineering, Providence University,
CSE554AlignmentSlide 1 CSE 554 Lecture 5: Alignment Fall 2011.
Sequence Analysis CSC 487/687 Introduction to computing for Bioinformatics.
Lecture 6. Pairwise Local Alignment and Database Search Csc 487/687 Computing for bioinformatics.
Jinxiang Chai CSCE441: Computer Graphics 3D Transformations 0.
Protein Structure Comparison. Sequence versus Structure The protein sequence is a string of letters: there is an optimal solution (DP) to the problem.
Protein Classification II CISC889: Bioinformatics Gang Situ 04/11/2002 Parts of this lecture borrowed from lecture given by Dr. Altman.
We want to calculate the score for the yellow box. The final score that we fill in the yellow box will be the SUM of two other scores, we’ll call them.
DALI Method Distance mAtrix aLIgnment
Multiple Alignment and Phylogenetic Trees Csc 487/687 Computing for Bioinformatics.
CSE554AlignmentSlide 1 CSE 554 Lecture 8: Alignment Fall 2013.
A data-mining approach for multiple structural alignment of proteins WY Siu, N Mamoulis, SM Yiu, HL Chan The University of Hong Kong Sep 9, 2009.
3D Transformation A 3D point (x,y,z) – x,y, and z coordinates
MINRMS: an efficient algorithm for determining protein structure similarity using root-mean-squared-distance Andrew I. Jewett, Conrad C. Huang and Thomas.
Pairwise sequence alignment Lecture 02. Overview  Sequence comparison lies at the heart of bioinformatics analysis.  It is the first step towards structural.
V diagonal lines give equivalent residues ILS TRIVHVNSILPSTN V I L S T R I V I L P E F S T Sequence A Sequence B Dot Plots, Path Matrices, Score Matrices.
EMBL-EBI Eugene Krissinel SSM - MSDfold. EMBL-EBI MSDfold (SSM)
CENG 789 – Digital Geometry Processing 07- Rigid-Body Alignment Asst. Prof. Yusuf Sahillioğlu Computer Eng. Dept,, Turkey.
Lab Meeting 10/08/20041 SuperPose: A Web Server for Automated Protein Structure Superposition Gary Van Domselaar October.
Find the optimal alignment ? +. Optimal Alignment Find the highest number of atoms aligned with the lowest RMSD (Root Mean Squared Deviation) Find a balance.
1 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Several motifs (  -sheet, beta-alpha-beta, helix-loop-helix) combine to form a compact globular.
Structural Bioinformatics Elodie Laine Master BIM-BMC Semester 3, Genomics of Microorganisms, UMR 7238, CNRS-UPMC e-documents:
Protein Structure Comparison
Multiple sequence alignment (msa)
The ideal approach is simultaneous alignment and tree estimation.
What is Bioinformatics?
Introduction to bioinformatics 2007
Sequence Alignment 11/24/2018.
Bioinformatics Biological Data Computer Calculations +
Protein Folding and Protein Threading
Large-Scale Genomic Surveys
Protein Structures.
BIOINFORMATICS Introduction
BIOINFORMATICS Summary
Protein structure prediction.
BIOINFORMATICS Sequence Comparison
Presentation transcript:

1 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu BIOINFORMATICS Structures Mark Gerstein, Yale University bioinfo.mbb.yale.edu/mbb452a (last edit in Fall '05)

2 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Contents: Structures What Structures Look Like? Structural Alignment by Iterated Dynamic Programming  RMS Superposition  Rotating and Translating Structures Scoring Structural Similarity

3 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Other Aspects of Structure, Besides just Comparing Atom Positions Atom Position, XYZ triplets Lines, Axes, Angles Surfaces, Volumes

4 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu What is Protein Geometry? Coordinates (X, Y, Z’s) Derivative Concepts  Distance, Surface Area, Volume, Cavity, Groove, Axes, Angle, &c Relation to  Function, Energies (E(x)), Dynamics (dx/dt)

5 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Depicting Protein Structure: Sperm Whale Myoglobin

6 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Incredulase

7 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Sperm Whale Myoglobin

8 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Structural Alignment of Two Globins

9 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Automatic Alignment to Build Fold Library Hb Mb Alignment of Individual Structures Fusing into a Single Fold “Template” Hb VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF-DLS-----HGSAQVKGHGKKVADALTNAV |||.. | |.|| |. |. | | | | | | |.|.| || | ||. Mb VLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRFKHLKTEAEMKASEDLKKHGVTVLTALGAIL Hb AHVD-DMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR | |. || |....|.. | |..|.. | |. ||. Mb KK-KGHHEAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSRHPGDFGADAQGAMNKALELFRKDIAAKYKELGYQG Elements: Domain definitions; Aligned structures, collecting together Non-homologous Sequences; Core annotation Previous work: Remington, Matthews ‘80; Taylor, Orengo ‘89, ‘94; Artymiuk, Rice, Willett ‘89; Sali, Blundell, ‘90; Vriend, Sander ‘91; Russell, Barton ‘92; Holm, Sander ‘93 ; Godzik, Skolnick ‘94; Gibrat, Madej, Bryant ‘96; Falicov, F Cohen, ‘96; Feng, Sippl ‘96; G Cohen ‘97; Singh & Brutlag, ‘98

10 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Automatically Comparing Protein Structures Given 2 Structures (A & B), 2 Basic Comparison Operations 1Given an alignment optimally SUPERIMPOSE A onto B Find Best R & T to move A onto B 2Find an Alignment between A and B based on their 3D coordinates Cor e Given a alignment, find a superposition Given a superposition, find an alignment

11 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu RMS Superposition

12 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu RMS Superposition (1) A B Cor e

13 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu RMS Superposition (2): Distance Between an Atom in 2 Structures Cor e

14 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu RMS Superposition (3): RMS Distance Between Aligned Atoms in 2 Structures Cor e

15 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu RMS Superposition (4): Rigid-Body Rotation and Translation of One Structure (B) Cor e

16 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Rotation Matrices 2D3D

17 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu RMS Superposition (5): Optimal Movement of One Structure to Minimize the RMS Methods of Solution: springs (F ~ kx) SVD Kabsch Cor e

18 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Detail on spring minimization

19 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Alignment

20 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Structural Alignment (1c*) Similarity Matrix for Structural Alignment Structural Alignment  Similarity Matrix S(i,J) depends on the 3D coordinates of residues i and J  Distance between CA of i and J  M(i,j) = 100 / (5 + d 2 )

21 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Alignment (1) Make a Similarity Matrix (Like Dot Plot)

22 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Alignment (2): Dynamic Programming, Start Computing the Sum Matrix new_value_cell(R,C) <= cell(R,C) { Old value, either 1 or 0 } + Max[ cell (R+1, C+1), { Diagonally Down, no gaps } cells(R+1, C+2 to C_max),{ Down a row, making col. gap } cells(R+2 to R_max, C+2) { Down a col., making row gap } ]

23 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Alignment (3):Dynamic Programming, Keep Going

24 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Alignment (4): Dynamic Programming, Sum Matrix All Done

25 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Alignment (5): Traceback Find Best Score (8) and Trace Back A B C N Y - R Q C L C R - P M A Y C - Y N R - C K C R B P

26 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Generalized Similarity Matrix PAM(A,V) = 0.5  Applies at every position i, J)  Specific Matrix for each pair of residues i in protein 1 and J in protein 2  Example is Y near N-term. matches any C-term. residue (Y at J=2) S(i,J)  Doesn’t need to depend on a.a. identities at all!  Just need to make up a score for matching residue i in protein 1 with residue J in protein 2 J i

27 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Use of Generalized Similarity Matrix in Sequence Comparison, Structure Comparison, & Threading

28 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Seq. Alignment, Struc. Alignment, Threading Cor e

29 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Iteration and Dyn. Prog. Issues

30 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu In Structural Alignment, Not Yet Done (Step 6*) Use Alignment to LSQ Fit Structure B onto Structure A  However, movement of B will now change the Similarity Matrix This Violates Fundamental Premise of Dynamic Programming  Way Residue at i is aligned can now affect previously optimal alignment of residues (from 1 to i-1) ACSQRP--LRV-SH -R SENCV A-SNKPQLVKLMTH VK DFCV-

31 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu How central idea of dynamic programming is violated in structural alignment Cor e Same theme as the Sec. Struc. alignment

32 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Structural Alignment (7*), Iterate Until Convergence 1Compute Sim. Matrix 2Align via Dyn. Prog. 3RMS Fit Based on Alignment 4Move Structure B 5 Re-compute Sim. Matrix 6If changed from #1, GOTO #2

33 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu Score S at End Just Like SW Score, but also have final RMS S = Total Score S(i,j) = similarity matrix score for aligning i and j Sum is carried out over all aligned i and j n = number of gaps (assuming no gap ext. penalty) G = gap penalty Score the same way as for sequences with EVD

34 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu End of class 2005,11.16 (m-10)