Workshop on Biological Macromolecular Structure Models, RCSB PDB, Piscataway, NJ, November 19-20, 2005. Topic 3: Structural Genomics and Models

Workshop on Biological Macromolecular Structure Models
RCSB PDB, Piscataway, NJ
November 19-20, 2005
Topic 3: Structural Genomics and Models
Contributors: S.K. Burley, A. Fiser, A. Godzik, A. Joachimiak, J. Markley, G. Montelione, C. Orengo, A. Sali, and M. Sauder
Discussion Leader: Stephen K. Burley

Role of Comparative Protein Structure Modeling in Structural Genomics

Protein Structure Initiative 2: Need for Large-Scale Homology Modeling
- PSI-2 will yield 3,000-4,000 protein structures, most at coarse granularity
- Each structure will represent a large number of sequence homologues
- Homology modeling must provide “useful” models for distant (15-30% sequence identity) homologues → protein function assignment and evolutionary insights
- Models should guide functional characterization
- Models must be readily accessible
- Models must be subject to rigorous peer review

Issues Addressed in Contributed Slides
- Current limitations of homology modeling
- Role of homology modeling in target selection/execution
- Role of homology modeling in structure determination
- Homology modeling pipelines

Current Limitations of Homology Modeling
Input from: Joachimiak (MCSG), Sali (NYSGXRC)

Issues with Homology Modeling for Structural Genomics
- Models for distant (15-30% sequence identity) homologues are of poor quality
- For very large families, only a small fraction of sequences (<10%) can be reliably modeled
- Modeling must guide target selection for fine coverage of protein families
- Domain parsing needs improvement
- We should be able to model multi-domain proteins from structures of individual domains
- We should be able to model side chains and important structural and functional features that are currently difficult to assign and predict correctly
- We need methods to predict unusual features and departures from the structure that is used for modeling
- Modeling of loops and high B-factor regions needs improvement

Models Based on NYSGXRC Target Structures
[Figure: models grouped by sequence identity to the target structure (>30% seq. id. and <30% seq. id.). Labeled regions: "Good Models" and "Scope for further improvement (significant E-value, bad model score)".]
- Good models: E-value ≤ 1.0e-4, GAScore ≥ 0.7
- Only 363 bad models at ≥30% sequence identity
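The thresholds on this slide can be read as a simple filter. The following is a minimal sketch, not the NYSGXRC pipeline itself. It assumes that both criteria must hold for a model to count as "good" and that the 30% boundary is inclusive; the slide states neither explicitly. The `Model` record, its field names, and the `classify` function are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Thresholds as stated on the slide.
E_VALUE_MAX = 1.0e-4    # "E-Value <= 1.0e-4"
GA_SCORE_MIN = 0.7      # "GAScore >= 0.7"
SEQ_ID_SPLIT = 30.0     # figure panels split at 30% sequence identity


@dataclass(frozen=True)
class Model:
    """Hypothetical record for one comparative model (field names are assumptions)."""
    target_id: str
    e_value: float       # significance of the target-template alignment
    ga_score: float      # model score, called "GAScore" on the slide
    seq_identity: float  # percent identity between target and template


def classify(model: Model) -> tuple[str, str]:
    """Return (quality label, sequence-identity band) for a model.

    Assumption: "good" requires BOTH thresholds to be met. A significant
    E-value with a low model score is reported as scope for improvement,
    mirroring the figure label on the slide.
    """
    significant = model.e_value <= E_VALUE_MAX
    good_score = model.ga_score >= GA_SCORE_MIN

    if significant and good_score:
        quality = "good"
    elif significant:
        quality = "scope for improvement"
    else:
        quality = "not significant"

    # Assumption: exactly 30% falls in the upper band.
    band = ">=30% seq. id." if model.seq_identity >= SEQ_ID_SPLIT else "<30% seq. id."
    return quality, band
```

Applied to a list of models, grouping the returned pairs would give counts per region of the figure, for example the number of significant but poorly scoring models at ≥30% sequence identity.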

Questions for Homology Modeling Community
- Should models be stored in archives or calculated “on the fly”?
- Should models from pipeline approaches be centrally accessible?
- Should the output of pipeline approaches be made interoperable with the PDB?
- Should there be a publicly available model database for storage of modeling results to facilitate peer review?
- Should models currently on deposit in the PDB be moved elsewhere? If so, where?