Modelling Genome Structure and Function Ram Samudrala University of Washington.

Presentation transcript:

Modelling Genome Structure and Function Ram Samudrala University of Washington

Rationale for understanding protein structure and function

Protein sequence
- large numbers of sequences, including whole genomes

Protein structure
- three dimensional
- complicated
- mediates function

Protein function
- rational drug design and treatment of disease
- protein and genetic engineering
- build networks to model cellular pathways
- study organismal function and evolution

Diagram labels: ?, structure determination, structure prediction, homology, rational mutagenesis, biochemical analysis, model studies

Comparative modelling of protein structure

KDHPFGFAVPTKNPDGTMNLMNWECAIP
KDPPAGIGAPQDN----QNIMLWNAVIP
** * *   *  *     * * *   **

- scan
- align
- build initial model: minimum perturbation
- construct non-conserved side chains and main chains: graph theory, semfold; de novo simulation
- refine: physical functions
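The asterisks under the alignment mark identical residues. As a minimal sketch, the percent identity of the two aligned sequences on the slide can be counted as below. The function name and the choice of denominator (aligned, non-gap columns) are assumptions made for illustration; other conventions divide by the full alignment length or by the shorter sequence.

```python
def percent_identity(a, b, gap="-"):
    """Count identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        aligned += 1
        if x == y:
            identical += 1
    pct = 100.0 * identical / aligned if aligned else 0.0
    return identical, aligned, pct


# The two aligned sequences shown on the slide.
seq_a = "KDHPFGFAVPTKNPDGTMNLMNWECAIP"
seq_b = "KDPPAGIGAPQDN----QNIMLWNAVIP"

identical, aligned, pct = percent_identity(seq_a, seq_b)
print(f"{identical} identical of {aligned} aligned columns ({pct:.1f}%)")
```

The count of identical columns agrees with the number of asterisks on the slide.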

Comparative modelling at CASP

         alignment    side chain    short loops    longer loops
BC       excellent    ~80%          1.0 Å          2.0 Å
CASP1    poor         ~50%          ~3.0 Å         >5.0 Å
CASP2    fair         ~75%          ~1.0 Å         ~3.0 Å
CASP3    fair         ~75%          ~1.0 Å         ~2.5 Å
CASP4    fair         ~75%          ~1.0 Å         ~2.0 Å

CASP4: overall model accuracy ranging from 1 Å to 6 Å for 50-10% sequence identity

**T128/sodm – 1.0 Å (198 residues; 50%)
**T111/eno – 1.7 Å (430 residues; 51%)
**T122/trpa – 2.9 Å (241 residues; 33%)
**T125/sp18 – 4.4 Å (137 residues; 24%)
**T112/dhso – 4.9 Å (348 residues; 24%)
**T92/yeco – 5.6 Å (104 residues; 12%)
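The slide does not name the accuracy measure behind the Å values. Assuming they are root-mean-square deviations (RMSD) between matched atoms of the model and the experimental structure, a minimal sketch of the calculation is shown below. The coordinates are invented, and the optimal superposition step that normally precedes the calculation is omitted.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) points.

    Assumes the two structures are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates: every model atom is displaced by 1 Å along z.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
native = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

deviation = rmsd(model, native)
print(f"RMSD = {deviation:.1f} Å")
```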

Ab initio prediction of protein structure

sample
- sample conformational space such that native-like conformations are found
- astronomically large number of conformations: 5 states/100 residues = 5^100 ≈ 10^70

select
- hard to design functions that are not fooled by non-native conformations ("decoys")
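The size of the search space follows directly from the slide's numbers: with 5 states available to each of 100 residues, the number of conformations is 5 raised to the power 100. A short check of the arithmetic (variable names are illustrative):

```python
import math

states_per_residue = 5
residues = 100

# Each residue independently takes one of the states, so the counts multiply.
conformations = states_per_residue ** residues

# Order of magnitude: log10(5^100) = 100 * log10(5).
exponent = residues * math.log10(states_per_residue)

print(f"5^100 has {len(str(conformations))} digits, about 10^{exponent:.1f}")
```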

Semi-exhaustive segment-based folding

EFDVILKAAGANKVAVIKAVRGATGLGLKEAKDLVESAPAALKEGVSKDDAEALKKALEEAGAEVEVK

- generate: fragments from database; 14-state φ,ψ model
- minimise: Monte Carlo with simulated annealing; conformational space annealing, GA
- filter: all-atom pairwise interactions, bad contacts; compactness, secondary structure
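The minimisation step names Monte Carlo with simulated annealing. The sketch below shows that general technique on a toy problem and is not the method used in the talk: each residue takes one of 14 discrete states, and the energy function, target conformation, cooling schedule, and all parameter values are hypothetical stand-ins chosen only to make the example runnable.

```python
import math
import random

N_STATES = 14  # discrete states per residue, as in a 14-state model


def toy_energy(conf, target):
    """Hypothetical energy: number of residues not in their target state."""
    return sum(1 for s, t in zip(conf, target) if s != t)


def anneal(target, steps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Metropolis Monte Carlo with a geometric cooling schedule."""
    rng = random.Random(seed)
    conf = [rng.randrange(N_STATES) for _ in target]
    energy = toy_energy(conf, target)
    start_energy = energy
    best_conf, best_energy = conf[:], energy
    for step in range(steps):
        # Temperature falls geometrically from t_start to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose a move: reassign the state of one random residue.
        i = rng.randrange(len(conf))
        old_state = conf[i]
        conf[i] = rng.randrange(N_STATES)
        delta = toy_energy(conf, target) - energy
        # Metropolis criterion: always accept downhill, sometimes uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            energy += delta
            if energy < best_energy:
                best_conf, best_energy = conf[:], energy
        else:
            conf[i] = old_state  # reject the move
    return start_energy, best_energy, best_conf


# Hypothetical 20-residue target conformation.
target = [i % N_STATES for i in range(20)]
start_energy, best_energy, best_conf = anneal(target)
print(f"energy: start {start_energy}, best found {best_energy}")
```

Accepting some uphill moves at high temperature lets the search escape local minima; as the temperature drops, the walk becomes increasingly greedy.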

Ab initio prediction at CASP

Before CASP (BC): "solved" (biased results)
CASP1: worse than random
CASP2: worse than random with one exception
CASP3: consistently predicted correct topology - ~6.0 Å for 60+ residues
CASP4: consistently predicted correct topology - ~4-6.0 Å for residues

Targets shown:
**T110/rbfa – 4.0 Å (80 residues; 1-80)
*T114/afp1 – 6.5 Å (45 residues; 36-80)
**T97/er29 – 6.0 Å (80 residues; 18-97)
**T106/sfrp3 – 6.2 Å (70 residues; 6-75)
*T98/sp0a – 6.0 Å (60 residues; )
**T102/as48 – 5.3 Å (70 residues; 1-70)

Application of prediction methods to Invb

Computational aspects of structural genomics

A. sequence space
B. comparative modelling
C. fold recognition
D. ab initio prediction
E. target selection (targets)
F. analysis

(Figure idea by Steve Brenner.)

Computational aspects of functional genomics

structure based methods
- microenvironment analysis (zinc binding site?)
- structure comparison (homology, function?)

sequence based methods
- sequence comparison
- motif searches
- phylogenetic profiles
- domain fusion analyses

+ experimental data

G. assign function: assign function to entire protein space

Modelling structure and function of the Oryza sativa (rice) genome

6813 coding sequences
- 3149 without a product annotation
- 816 classified as hypothetical protein
- 1187 with a hypothetical function
- ~30% with known homologs in PDB

Most common functions (from PROSITE)
- ATP/GTP-binding site motif A (P loop)
- Serine/Threonine protein kinase active site
- EF-hand (Calcium binding)
- Cytochrome C Heme binding site

Most common functions (from annotations)
- Reverse transcriptase
- Nucleotide Binding Site (NBS)
- Serine/Threonine protein kinase
- Chitinase
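PROSITE functions such as the P loop are assigned by matching sequence patterns. As a minimal sketch, the PROSITE pattern for ATP/GTP-binding site motif A, [AG]-x(4)-G-K-[ST], can be written as a regular expression and scanned against a sequence. The function name and the short test fragment are illustrative assumptions, not data from the rice analysis.

```python
import re

# PROSITE pattern [AG]-x(4)-G-K-[ST] written as a regular expression.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(sequence):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in P_LOOP.finditer(sequence)]


# Illustrative fragment used only as test input.
fragment = "MTEYKLVVVGAGGVGKSALTIQ"

hits = find_p_loops(fragment)
for start, text in hits:
    print(f"P loop candidate at position {start + 1}: {text}")
```

Short patterns like this one match by chance fairly often, so a hit is evidence for a function rather than proof of it.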

Bioverse webserver

Screenshot labels: sequence; structure summary; function summary; see another variant; open/close subgroup list; links (or follow); mapping to sequence

Bioverse webserver

Screenshot labels: sequence; structure summary; secondary structure; tertiary structure summary; sequence; evidence for sequence; evidence for tertiary structure; structural similarity to another protein; evidence for similarity

Bioverse webserver

Screenshot labels: sequence; structure summary; function summary; function 1; function 2; evidence for function 2; functional similarity to another protein; evidence for similarity

Take home message

Prediction of protein structure and function can be used to model whole genomes to understand organismal function and evolution.

Acknowledgements

Jason McDermott, Yi-Ling Chen, Levitt and Moult groups