Identifying Functional signatures in Proteins - a computational design approach David Bernick Rohl group 16-Mar-2005.


1 Identifying Functional signatures in Proteins - a computational design approach David Bernick Rohl group 16-Mar-2005

2 The big picture
what is function?
  - hinges
  - substrate/DNA/protein binding/alignment/recognition
  - catalytic sites
what isn’t function? (structure)
  - secondary structures
  - fold architecture
  - thermodynamically required elements
nature selects for function (structure is implicit)
computational methods select for structure
can we predict…quickly?

3 Some terms
pssm - position specific score matrix
  - a [20 x length] model of residue frequencies for every position of a sequence family
homolog - natural sequences evolved from a common parent
morpholog - computationally derived sequence generated from a parent structure
ortholog - common ancestor, derived by speciation (constrained functional divergence)
paralog - common ancestor, same species (unconstrained functional divergence)

4 pssm from an alignment
[Figure: residue counts per alignment column over the 20-letter alphabet ACDEFGHIKLMNPQRSTVWY; the counts did not survive extraction]
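A minimal sketch of how a frequency-style pssm could be built from an alignment, as slides 3 and 4 describe. The function name `build_pssm`, the toy alignment, and the choice to skip gaps are illustrative assumptions, not details from the talk. The matrix is stored position-major (length x 20) rather than 20 x length.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # column order shown on slide 4


def build_pssm(alignment):
    """Return per-position residue frequencies for equal-length aligned sequences.

    Hypothetical helper: gaps and non-standard symbols are skipped (an assumption),
    and each position is normalized by the number of residues counted there.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    pssm = []
    for pos in range(length):
        counts = [0] * len(AMINO_ACIDS)
        for seq in alignment:
            idx = AMINO_ACIDS.find(seq[pos].upper())
            if idx >= 0:  # -1 means gap or unknown symbol
                counts[idx] += 1
        total = sum(counts)
        pssm.append([c / total if total else 0.0 for c in counts])
    return pssm


# Toy alignment (made up for illustration): 4 sequences, 3 columns.
toy_alignment = ["ACD", "ACE", "GCD", "A-D"]
pssm = build_pssm(toy_alignment)
```

Here `pssm[0]` holds the frequencies for the first column, indexed in the order of `AMINO_ACIDS`. A production pssm would normally add pseudocounts and sequence weighting, which this sketch omits.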

5 structure ensembles
Larson (2003) - Improved homology searches
Pei (2003) - Homology detection and active site searches
Kuhlman (2000) - Structural optimality of natural sequences

6 Results - SH3 domain
11 structures
62 additional sequences

7 Results - S100 domain
11 structures
30 additional sequences
Ca++ loop1 not detected
  - backbone coordinated residues
Ca++ loop2 not detected
  - insufficient homolog depth

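8 the protocol
[Flowchart; labels recovered from the slide, connections not recoverable]
  - stages: sequence homolog, alignment, paralog structures, representative structure, pssmH, pssmM, score
  - sequence sources: cogs, pfam, reverse blast, blast
  - alignment: geometric, statistical
  - structure sources: CE+SCOP, TaylorDoms
  - design: flexible design, fixed design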

9 genome scale
high cost step - producing pssmM
precalculate pssmM for every domain

10 morpholog pssms genome scale
Data Sources
  - Taylor parsed Domain database
  - CE all-to-all + SCOP
Precompute pssms for every domain
  - ~8000 domains
  - 100 sequences: ~90% diversity
  - 1000 sequences: ~99% diversity
  - ~4-8 wks, 70p cluster for initial set

11 scoring
compare PSSMh to PSSMm
  - PSSMm contains only structure signal
  - PSSMh contains both function and structure
each position represents a count-normalized position in 20-space (H or M)
R-position -- average aa position
RH and RM define 20-space vectors
‘function vector’, ‘structure vector’
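The slide does not give the scoring formula, so the following is only one possible reading, sketched under stated assumptions: take RM as the structure vector at a position, take the difference RH - RM as the function vector, and use its Euclidean length as a per-position score. The names `function_vectors` and `position_scores` and the toy vectors are hypothetical.

```python
import math


def function_vectors(pssm_h, pssm_m):
    """Per-position difference RH - RM between two position-major pssms.

    Assumption: the 'function vector' is what remains of the homolog signal
    (function + structure) after subtracting the morpholog signal (structure only).
    """
    if len(pssm_h) != len(pssm_m):
        raise ValueError("pssms must cover the same number of positions")
    return [
        [h - m for h, m in zip(row_h, row_m)]
        for row_h, row_m in zip(pssm_h, pssm_m)
    ]


def position_scores(pssm_h, pssm_m):
    """Euclidean length of each function vector (an assumed score, not the talk's)."""
    return [
        math.sqrt(sum(d * d for d in vec))
        for vec in function_vectors(pssm_h, pssm_m)
    ]


def one_hot(index, size=20):
    """Toy 20-space vector with all frequency mass on one residue."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Toy data: position 0 agrees in both pssms, position 1 disagrees completely.
toy_h = [one_hot(0), one_hot(3)]
toy_m = [one_hot(0), one_hot(7)]
scores = position_scores(toy_h, toy_m)
```

Under this reading, positions where the homolog and morpholog profiles agree score near zero, and positions that nature conserves but design does not predict score high, which would mark them as candidate functional sites.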

12 next steps
complete this set of domains - verification
full domain pssmM generation

13 acknowledgements
Carol Rohl, Kevin Karplus, Craig Lowe, Rohl group, HP

