Presentation is loading. Please wait.

Presentation is loading. Please wait.

Modelling Genome Structure and Function Ram Samudrala University of Washington.

Similar presentations


Presentation on theme: "Modelling Genome Structure and Function Ram Samudrala University of Washington."— Presentation transcript:

1 Modelling Genome Structure and Function Ram Samudrala University of Washington

2 Rationale for understanding protein structure and function Protein sequence -large numbers of sequences, including whole genomes Protein function - rational drug design and treatment of disease - protein and genetic engineering - build networks to model cellular pathways - study organismal function and evolution ? structure determination structure prediction homology rational mutagenesis biochemical analysis model studies Protein structure - three dimensional - complicated - mediates function

3 Comparative modelling of protein structure KDHPFGFAVPTKNPDGTMNLMNWECAIP KDPPAGIGAPQDN----QNIMLWNAVIP ** * * * * * * * ** …… scan align refine physical functions build initial model minimum perturbation construct non-conserved side chains and main chains graph theory, semfold de novo simulation

4 CASP4: overall model accuracy ranging from 1 Å to 6 Å for 50-10% sequence identity **T112/dhso – 4.9 Å (348 residues; 24%)**T92/yeco – 5.6 Å (104 residues; 12%) **T128/sodm – 1.0 Å (198 residues; 50%) **T125/sp18 – 4.4 Å (137 residues; 24%) **T111/eno – 1.7 Å (430 residues; 51%)**T122/trpa – 2.9 Å (241 residues; 33%) Comparative modelling at CASP CASP2 fair ~ 75% ~ 1.0 Å ~ 3.0 Å CASP3 fair ~75% ~ 1.0 Å ~ 2.5 Å CASP4 fair ~75% ~ 1.0 Å ~ 2.0 Å CASP1 poor ~ 50% ~ 3.0 Å > 5.0 Å BC excellent ~ 80% 1.0 Å 2.0 Å alignment side chain short loops longer loops

5 Ab initio prediction of protein structure sample conformational space such that native-like conformations are found astronomically large number of conformations 5 states/100 residues = 5 100 = 10 70 select hard to design functions that are not fooled by non-native conformations (“decoys”)

6 Semi-exhaustive segment-based folding EFDVILKAAGANKVAVIKAVRGATGLGLKEAKDLVESAPAALKEGVSKDDAEALKKALEEAGAEVEVK generate fragments from database 14-state ,  model …… minimise monte carlo with simulated annealing conformational space annealing, GA …… filter all-atom pairwise interactions, bad contacts compactness, secondary structure

7 Ab initio prediction at CASP CASP1: worse than random CASP2: worse than random with one exception CASP4: consistently predicted correct topology - ~4-6.0 A for 60-80+ residues CASP3: consistently predicted correct topology - ~ 6.0 Å for 60+ residues **T110/rbfa – 4.0 Å (80 residues; 1-80)*T114/afp1 – 6.5 Å (45 residues; 36-80) **T97/er29 – 6.0 Å (80 residues; 18-97) **T106/sfrp3 – 6.2 Å (70 residues; 6-75) *T98/sp0a – 6.0 Å (60 residues; 37-105)**T102/as48 – 5.3 Å (70 residues; 1-70) Before CASP (BC): “solved” (biased results)

8 Application of prediction methods to Invb

9 Computational aspects of structural genomics D. ab initio prediction C. fold recognition * * * * * * * * * * B. comparative modelling A. sequence space * * * * * * * * * * * * E. target selection targets F. analysis * * (Figure idea by Steve Brenner.)

10 Computational aspects of functional genomics structure based methods microenvironment analysis zinc binding site? structure comparison homology function? sequence based methods sequence comparison motif searches phylogenetic profiles domain fusion analyses + experimental data + * * * * G. assign function * * assign function to entire protein space

11 Modelling structure and function of the Oryza sativa (rice) genome Most common functions (from PROSITE) ATP/GTP-binding site motif A (P loop) Serine/Threonine protein kinase active site EF-hand (Calcium binding) Cytochrome C Heme binding site Most common functions (from annotations) Reverse transcriptase Nucleotide Binding Site (NBS) Serine/Threonine protein kinase Chitinase ~30 % with known homologs in PDB 6813 coding sequences 3149 without a product annotation 816 classified as hypothetical protein 1187 with a hypothetical function

12 Bioverse webserver sequence structuresummary functionsummary see another variant open/close subgroup list links (or follow) mapping to sequence http://bioverse.compbio.washington.edu

13 Bioverse webserver sequence structuresummary secondary structure tertiary structure summary sequence evidence for sequence evidence for tertiary structure structural similarity to another protein evidence for similarity

14 Bioverse webserver sequence structuresummary functionsummary function 1 function 2 evidence for function 2 functional similarity to another protein evidence for similarity

15 Take home message Prediction of protein structure and function can be used to model whole genomes to understand organismal function and evolution Jason McDermott Yi-Ling Chen Levitt and Moult groups Acknowledgements


Download ppt "Modelling Genome Structure and Function Ram Samudrala University of Washington."

Similar presentations


Ads by Google