1
Centre for Integrative Bioinformatics VU
Master Course Sequence Alignment
Lecture 11: Database searching issues (2)
4
Example. You can also look at superposed structures.

C; family: zinc finger -- CCHH-type
C; class: small
C; reordered by kitschorder 1.0a
C; reordered by kitschorder 1.0a
C; last update 7/9/98
>P1;1zaa1
structureX:1zaa: 3 :C: 33 :C:zinc-finger (ZIF268, domain 1):Mus musculus:2.10:18.20
------RPYACPVESCDRRFSRSDELTRHI-RI-HTGQK*
>P1;1zaa2
structureX:1zaa: 34 :C: 61 :C:zinc-finger (ZIF268, domain 2):Mus musculus:2.10:18.20
-------PFQCRI--CMRNFSRSDHLTTHI-RT-HTGEK*
>P1;1zaa3
structureX:1zaa: 62 :C: 87 :C:zinc-finger (ZIF268, domain 3):Mus musculus:2.10:18.20
-------PFACDI--CGRKFARSDERKRHT-KI-HLR--*
>P1;1ard
structureN:1ard: 102 : : 130 : :zinc-finger (transcription factor ADR1):Saccharomyces cerevisiae:-1.00:-1.00
------RSFVCEV--CTRAFARQEHLKRHY-RS-HTNEK*
>P1;1znf
structureN:1znf: 1 : : 25 : :zinc-finger (XFIN, 31st domain):Xenopus laevis:-1.00:-1.00
--------YKCGL--CERSFVEKSALSRHQ-RV-HKN--*
>P1;2drp2
structureX:2drp: 137 :A: 165:A:zinc-finger (tramtrack, domain 2):Drosophila melanogaster:2.80:19.30
----NVKVYPCPF--CFKEFTRKDNMTAHV-KIIHK---*
>P1;3znf
structureN:3znf: 1 : : 30 : :zinc-finger (enhancer binding protein):Homo sapiens:-1.00:-1.00
------RPYHCSY--CNFSFKTKGNLTKHMKSKAHSKK-*
>P1;5znf
structureN:5znf: 1 : : 30 : :zinc-finger (ZFY-6T):Homo sapiens:-1.00:-1.00
------KTYQCQY--CEYRSADSSNLKTHIKTK-HSKEK*
15
Sensitivity and specificity – medical world

                     Disease present (+)          Disease absent (-)            Total
Test positive (+)    9,990 true positives (TP)    990 false positives (FP)      TP+FP = 10,980
Test negative (-)    10 false negatives (FN)      989,010 true negatives (TN)   FN+TN = 989,020
Total                10,000                       990,000                       TP+FP+FN+TN = 1,000,000

Positive predictive value = TP/(TP+FP) = 9,990/(9,990+990) = 91%
Negative predictive value = TN/(FN+TN) = 989,010/(10+989,010) = 99.999%
Sensitivity = TP/(TP+FN) = 9,990/(9,990+10) = 99.9%
Specificity = TN/(FP+TN) = 989,010/(989,010+990) = 99.9%
Pre-test probability = (TP+FN)/(TP+FP+FN+TN) = 10,000/1,000,000 = 1% (in this case equal to the prevalence)
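The arithmetic on this slide can be checked with a few lines of Python. This is only a minimal sketch: the function name `diagnostic_metrics` and the dictionary keys are illustrative choices, not part of the lecture; the four counts are the ones given in the table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return the standard 2x2 confusion-table ratios as fractions in [0, 1]."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),    # fraction of diseased people detected
        "specificity": tn / (fp + tn),    # fraction of healthy people cleared
        "ppv": tp / (tp + fp),            # positive predictive value
        "npv": tn / (fn + tn),            # negative predictive value
        "prevalence": (tp + fn) / total,  # pre-test probability in this example
    }


# Counts taken from the slide.
metrics = diagnostic_metrics(tp=9990, fp=990, fn=10, tn=989010)
for name, value in metrics.items():
    print(f"{name}: {value:.3%}")
```

Even with sensitivity and specificity both at 99.9%, roughly one in eleven positive tests is a false positive because the condition is rare. The same effect applies to database searching, where true homologues are a tiny fraction of the database.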
19
Structure-based function prediction SCOP (http://scop.berkeley.edu/) is a protein structure classification database in which proteins are grouped into a hierarchy of families, superfamilies, folds and classes, based on their structural and functional similarities.
20
Structure-based function prediction SCOP hierarchy – the top level: 11 classes
21
Structure-based function prediction All-alpha protein; Coiled-coil protein; All-beta protein; Alpha-beta protein; Membrane protein
22
Structure-based function prediction SCOP hierarchy – the second level: 800 folds
23
Structure-based function prediction SCOP hierarchy - third level: 1294 superfamilies
24
Structure-based function prediction SCOP hierarchy - fourth level: 2327 families
25
Structure-based function prediction
Using a sequence-structure alignment method, one can predict that a protein belongs to a SCOP family, superfamily or fold.
Proteins predicted to be in the same SCOP family are orthologous.
Proteins predicted to be in the same SCOP superfamily are homologous.
Proteins predicted to be in the same SCOP fold are structurally analogous.
(Figure: nested levels, from folds to superfamilies to families.)
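The rule of thumb on this slide can be written as a small lookup. This is a sketch under stated assumptions: it assumes each protein has already been assigned a dotted classification string of the form class.fold.superfamily.family, the identifiers used below are hypothetical placeholders, and the function names are invented for illustration. The relationship labels are the ones given on the slide.

```python
# Levels of the hierarchy, from most general to most specific.
LEVELS = ("class", "fold", "superfamily", "family")

# Relationship the slide attaches to the deepest level two proteins share.
RELATIONSHIP = {
    "family": "orthologous",
    "superfamily": "homologous",
    "fold": "structurally analogous",
}


def deepest_shared_level(id_a, id_b):
    """Return the most specific level at which two identifiers agree, or None."""
    parts_a, parts_b = id_a.split("."), id_b.split(".")
    if len(parts_a) != len(LEVELS) or len(parts_b) != len(LEVELS):
        raise ValueError("expected identifiers of the form class.fold.superfamily.family")
    shared = None
    for level, a, b in zip(LEVELS, parts_a, parts_b):
        if a != b:
            break
        shared = level
    return shared


def inferred_relationship(id_a, id_b):
    """Map the deepest shared level to the relationship named on the slide."""
    return RELATIONSHIP.get(deepest_shared_level(id_a, id_b), "no relationship inferred")


# Hypothetical identifiers, for illustration only.
print(inferred_relationship("x.1.1.1", "x.1.1.2"))
```

Sharing only the class, or nothing at all, is treated here as giving no basis for inference, which is a design choice of the sketch rather than a statement from the lecture.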
26
Note: the numbers do not add up in every profile column because only a selection of the sequences in the MSA and of the amino acids represented in the profile is shown.
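To see why a complete profile column would add up, here is a minimal sketch of how per-column residue counts are derived from a multiple sequence alignment. The toy alignment and the function name `column_counts` are hypothetical, and the sketch uses raw counts only; it does not reproduce the weighting or scoring of the profile shown in the lecture.

```python
from collections import Counter


def column_counts(msa, gap="-"):
    """Count residues per alignment column, ignoring gap characters."""
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")
    return [
        Counter(seq[i] for seq in msa if seq[i] != gap)
        for i in range(length)
    ]


# Toy alignment, for illustration only.
toy_msa = ["PFQC", "PFAC", "-YKC"]
counts = column_counts(toy_msa)
for position, column in enumerate(counts, start=1):
    print(position, dict(column))
```

With every sequence and every residue type included, each column sums to the number of sequences that have no gap at that position. Showing only some sequences or some amino acids, as on the slide, breaks that sum.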
35
Conserved hypotheticals
>P00001 Conserved hypothetical
A substantial fraction of genes in sequenced genomes encodes 'conserved hypothetical' proteins, i.e. those that are found in organisms from several phylogenetic lineages but have not been functionally characterized.