1
Identifying Functional Signatures in Proteins - a computational design approach
David Bernick
Rohl group
16-Mar-2005
2
The big picture
what is function?
  hinges
  substrate/DNA/protein binding/alignment/recognition
  catalytic sites
what isn’t function? (structure)
  secondary structures, fold architecture
  thermodynamically required elements
nature selects for function (structure is implicit)
computational methods select for structure
can we predict…quickly?
3
Some terms
pssm - position specific score matrix: a [20 x length] model of residue frequencies for every position of a sequence family
homolog - natural sequences evolved from a common parent
morpholog - computationally derived sequence generated from a parent structure
ortholog - common ancestor, derived by speciation (constrained functional divergence)
paralog - common ancestor, same species (unconstrained functional divergence)
4
pssm from an alignment
[Figure: residue counts for an alignment tabulated against the 20-letter amino-acid alphabet ACDEFGHIKLMNPQRSTVWY]
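A minimal sketch of how a frequency pssm could be built from an alignment, assuming equal-length aligned sequences with `-` for gaps. The function name `pssm_from_alignment`, the `pseudocount` parameter, and the toy sequences are hypothetical illustrations, not taken from the slides; the slides do not say whether pseudocounts or sequence weighting were used.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20-letter alphabet shown on the slide


def pssm_from_alignment(alignment, pseudocount=0.0):
    """Return one {residue: frequency} dict per alignment column.

    Hypothetical helper: gaps and non-standard letters are ignored,
    and every sequence is weighted equally (an assumption).
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    pssm = []
    for col in range(length):
        # Count the standard residues observed in this column.
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        if total == 0:
            # Column holds only gaps: no information, all frequencies zero.
            pssm.append({aa: 0.0 for aa in AMINO_ACIDS})
        else:
            pssm.append(
                {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
            )
    return pssm


# Toy alignment (made-up sequences, for illustration only).
toy_alignment = ["ACDK", "ACEK", "GCDR", "ACD-"]
toy_pssm = pssm_from_alignment(toy_alignment)
print(toy_pssm[0]["A"], toy_pssm[0]["G"])
```

Each column sums to 1 over the 20 residues, which gives the [20 x length] shape described under "Some terms".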
5
structure ensembles
Larson (2003) - Improved homology searches
Pei (2003) - Homology detection and active site searches
Kuhlman (2000) - Structural optimality of natural sequences
6
Results - SH3 domain
11 structures
62 additional sequences
7
Results - S100 domain
11 structures
30 additional sequences
Ca++ loop1 not detected
  backbone coordinated residues
Ca++ loop2 not detected
  insufficient homolog depth
8
the protocol
[Flow diagram; labels: Sequence, homolog, Alignment, paralog, structures, representative structure, pssmH, pssmM, score; cogs, pfam, reverse blast, blast; geometric, statistical; CE+SCOP, TaylorDoms; Flexible Design, fixed design]
9
genome scale
high cost step - producing pssmM
precalculate pssmM for every domain
10
morpholog pssms - genome scale
Data Sources
  Taylor parsed Domain database
  CE all-to-all + SCOP
Precompute pssms for every domain
  ~8000 domains
  100 sequences ~90% diversity
  1000 sequences ~99% diversity
  ~4-8 wks, 70p cluster for initial set
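The cost estimate above can be turned into a rough per-domain figure. This is a back-of-the-envelope sketch that assumes "70p" means 70 processors and that the cluster runs at full utilization for the whole period; both are assumptions, and the function name `cpu_hours_per_domain` is hypothetical.

```python
HOURS_PER_WEEK = 7 * 24  # 168 wall-clock hours


def cpu_hours_per_domain(weeks, processors=70, domains=8000):
    """CPU-hours spent per domain, assuming every processor is busy throughout."""
    return weeks * HOURS_PER_WEEK * processors / domains


low = cpu_hours_per_domain(4)   # lower bound of the ~4-8 week estimate
high = cpu_hours_per_domain(8)  # upper bound of the ~4-8 week estimate
print(f"{low:.2f} to {high:.2f} CPU-hours per domain")
```

Under those assumptions the initial set works out to roughly 6 to 12 CPU-hours per domain.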
11
scoring
compare PSSMh to PSSMm
  PSSMm contains only structure signal
  PSSMh contains both function and structure
each position represents a count-normalized position in 20-space (H or M)
R-position - average aa position
RH and RM define 20 space vectors
  ‘function vector’
  ‘structure vector’
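The slide does not give the scoring formula, so the following is only one possible reading: normalize the homolog and morpholog counts at a position into points in 20-space, then measure how far apart they are. The Euclidean distance, the helper names `normalize` and `position_divergence`, and the toy counts are all assumptions made for illustration, not the author's method.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def normalize(counts):
    """Scale 20 residue counts so they sum to 1 (a point in 20-space)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("column has no counts")
    return [c / total for c in counts]


def position_divergence(counts_h, counts_m):
    """Compare one PSSMh column with the matching PSSMm column.

    Returns (distance, difference_vector). The difference R_H - R_M is
    used here as a stand-in for signal present in homologs but not in
    morphologs; this interpretation is an assumption.
    """
    r_h = normalize(counts_h)
    r_m = normalize(counts_m)
    diff = [h - m for h, m in zip(r_h, r_m)]
    return math.sqrt(sum(d * d for d in diff)), diff


# Toy columns (made-up counts): homologs always have His at this position,
# while designed sequences split evenly between Ala and His.
col_h = [0] * 20
col_h[AMINO_ACIDS.index("H")] = 10
col_m = [0] * 20
col_m[AMINO_ACIDS.index("A")] = 5
col_m[AMINO_ACIDS.index("H")] = 5

dist, diff = position_divergence(col_h, col_m)
print(round(dist, 4))
```

A position where the two columns agree scores 0; a position conserved in homologs but variable in morphologs scores high, which is the kind of position the talk proposes as a candidate functional site.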
12
next steps
complete this set of domains - verification
full domain pssmM generation
13
acknowledgements
Carol Rohl
Kevin Karplus
Craig Lowe
Rohl group
HP