1
Phytome: A Data Analysis Pipeline, presented by Jason Phillips
2
High Level Flow Chart
Retrieve Unigenes → Translate Unigenes → Families
3
Main Outline
● Unigenes (Where'd they come from, where'd they go?)
● Translation (methods and procedures)
● Building Families (the power of together-ness)
4
phytome » Unigene
● What are?
● Where from?
● Nine Species
● Arabidopsis, a special case
● Storage
5
phytome » Unigene » What Are?
Combined ESTs that overlap
6
phytome » Unigene » Where From?
● TIGR
● Other sources?
7
phytome » Unigene » Nine Species
8
phytome » Unigene » Arabidopsis Highly annotated... Highly sequenced... Highly translated...
9
phytome » Unigene » Storage
species   count
-------------------
ghir      24350
mcry       8455
osat      60778
hann      20520
mtru      36976
lesc      31012
ljap      11025
lsat      21960
atha      27170
-------------------
total:   242246
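As a quick check on the storage table, the per-species counts can be summed and compared with the reported total. This is only a sketch: the name `unigene_counts` is an assumption, while the species codes and counts are copied from the slide.

```python
# Unigene counts per species code, copied from the storage slide.
# The dictionary name is hypothetical; it is not part of the pipeline.
unigene_counts = {
    "ghir": 24350, "mcry": 8455, "osat": 60778,
    "hann": 20520, "mtru": 36976, "lesc": 31012,
    "ljap": 11025, "lsat": 21960, "atha": 27170,
}

# Sum the counts so the result can be compared with the slide's total line.
total = sum(unigene_counts.values())
print(f"{len(unigene_counts)} species, {total} unigenes")
```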
10
phytome » Translation
● Methods
  – Estwise
  – Estscan
  – FrameFinder
● Procedure
● Numbers
11
phytome » Translation » methods
● EST-WISE: homologies via BLAST (sprot + trembl)
● ESTSCAN: ab initio
● FRAMEFINDER: ab initio
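To make the idea of translating a unigene without a homolog concrete, here is a deliberately naive sketch: translate all six reading frames and keep the longest stretch free of stop codons. This is a stand-in for illustration only. ESTScan and FrameFinder use statistical models and are not implemented this way, and the function names below are assumptions.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq: str) -> str:
    """Translate a nucleotide string in frame 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf_peptide(seq: str) -> str:
    """Return the longest stop-free peptide found in any of the six frames."""
    seq = seq.upper()
    strands = (seq, seq.translate(COMPLEMENT)[::-1])  # forward, reverse complement
    best = ""
    for strand in strands:
        for frame in range(3):
            # Split the frame's translation at stop codons and keep the longest piece.
            for segment in translate(strand[frame:]).split("*"):
                if len(segment) > len(best):
                    best = segment
    return best


print(longest_orf_peptide("ATGGCCTAA"))
```

Note that on short sequences the reverse strand can easily win, which is one reason real tools score frames with codon-usage statistics rather than length alone.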
12
phytome » Translation » procedure
● EST-WISE (Mac OS X cluster)
  – BLAST vs. Swiss-Prot: 10.3 hours, 35 nodes (~15 days)
  – BLAST vs. TrEMBL: 35.7 hours, 35 nodes (~52 days)
● ESTSCAN (Mustard)
● FrameFinder (Mustard)
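The figures in parentheses are consistent with wall-clock time multiplied by node count, i.e. the equivalent time on a single node. A small sketch of that arithmetic, assuming this reading is the intended one (the helper name `node_days` is hypothetical):

```python
def node_days(wall_hours: float, nodes: int) -> float:
    """Convert wall-clock hours on a cluster into single-node days."""
    return wall_hours * nodes / 24.0


# Wall-clock hours and node counts taken from the slide.
swissprot_days = node_days(10.3, 35)
trembl_days = node_days(35.7, 35)
print(f"Swiss-Prot: ~{swissprot_days:.0f} days, TrEMBL: ~{trembl_days:.0f} days")
```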
13
phytome » Translation » numbers
242,246 Unigenes
method        translated   not translated
-----------   ----------   --------------
ESTWISE          151,830           90,416
ESTSCAN          226,988           15,258
FRAMEFINDER      242,242                4
14
phytome » Families
● Relationships
● Clustering
● Numbers
15
phytome » Families » Relationships
Blast everything against everything
sequences → blastable db of sequences
query       sbjct       e-value
---------   ---------   -------
mtru302     ljap4523    1 29
mtru302     lesc25072   1 26
mtru302     hann20270   5 24
osat59606   osat59606   1 157
osat59606   osat4002    1 96
osat59606   atha25166   1 88
...
16
phytome » Families » Relationships
But we have 4 sets of sequences!
tblastx   242,246   nucleotides
blastp    151,830   estwise
blastp    226,988   estscan
blastp    242,242   framefinder
Which method do we trust?
17
phytome » Families » Relationships
4 data sets... 4 family interpretations
tb  ew  es  ff
~3 days, 28 nodes (~84 days)
~1/4 day, 21 nodes (~5 days)
BLAST OFF!
18
phytome » Families » Relationships
BLAST RESULTS
Method   size     no blast   no trans   attrition
------   ------   --------   --------   ---------
tb       242246        153          0         153
ew       151830         22      90416       90438
ff       242242      24563          4       24567
es       226988       1345      15258       16603
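The columns of the BLAST results table appear to be related in two ways: attrition looks like the sum of "no blast" and "no trans", and size plus "no trans" gives back the 242,246 starting unigenes. A sketch that checks both readings against the slide's numbers (the variable names are assumptions):

```python
TOTAL_UNIGENES = 242246

# (method, size, no blast, no trans, attrition) copied from the slide.
rows = [
    ("tb", 242246, 153, 0, 153),
    ("ew", 151830, 22, 90416, 90438),
    ("ff", 242242, 24563, 4, 24567),
    ("es", 226988, 1345, 15258, 16603),
]

checks = {}
for method, size, no_blast, no_trans, attrition in rows:
    checks[method] = (
        no_blast + no_trans == attrition,   # sequences lost at either step
        size + no_trans == TOTAL_UNIGENES,  # translated + untranslated = input
    )
    print(method, checks[method])
```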
19
phytome » Families » Clustering TRIBE MCL evalue gene
21
phytome » Families » Clustering
fam id   member
------   ---------
...
4035     atha7499
4035     atha7503
4035     atha8483
4036     atha10704
4036     osat23081
4036     osat36667
4037     atha1072
4037     atha5059
4037     lsat15421
4037     lsat21190
...

query       sbjct       evalue
---------   ---------   ------
atha7499    atha8483    6 78
atha7499    atha7503    4 90
osat23081   atha10704   8 78
osat23081   osat36667   8 78
atha1072    atha5059    2 68
atha1072    lsat15421   2 60
atha1072    lsat21190   1 102
atha1072    atha5059    9 54
...

tribe mcl
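To show how pairwise hits turn into families, the sketch below groups the query/sbjct pairs from this slide by connected components, using a small union-find. This is a much simpler technique than TRIBE MCL, which runs Markov clustering on e-value-weighted edges and can split loosely connected groups. It is a stand-in for illustration, not the pipeline's method, and all names are assumptions.

```python
from collections import defaultdict

# Query/sbjct pairs copied from the slide (e-values ignored in this sketch).
hits = [
    ("atha7499", "atha8483"), ("atha7499", "atha7503"),
    ("osat23081", "atha10704"), ("osat23081", "osat36667"),
    ("atha1072", "atha5059"), ("atha1072", "lsat15421"),
    ("atha1072", "lsat21190"),
]


def connected_components(pairs):
    """Group sequence ids that are linked by any chain of hits."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return [sorted(members) for members in groups.values()]


families = connected_components(hits)
for members in families:
    print(members)
```

On this excerpt the three groups happen to coincide with the members listed for families 4035, 4036 and 4037; on the full data set the two approaches would generally differ.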
22
phytome » Families » Clustering
blast results (tb, ff, es, ew) → TRIBE MCL → families (tb, ff, es, ew)
23
phytome » Families » Clustering
Let's look at some histograms!
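The histograms themselves are not part of this transcript. One plausible view is the distribution of family sizes, which can be sketched from (family id, member) rows like those on the earlier clustering slide. The rows below are only the excerpt shown there, so the sizes are illustrative and not actual results; the variable names are assumptions.

```python
from collections import Counter

# (family id, member) rows from the excerpt on the clustering slide.
family_rows = [
    (4035, "atha7499"), (4035, "atha7503"), (4035, "atha8483"),
    (4036, "atha10704"), (4036, "osat23081"), (4036, "osat36667"),
    (4037, "atha1072"), (4037, "atha5059"),
    (4037, "lsat15421"), (4037, "lsat21190"),
]

# Members per family, then number of families per size.
family_sizes = Counter(fam for fam, _ in family_rows)
size_histogram = Counter(family_sizes.values())

# Plain-text histogram: one '#' per family of that size.
for size in sorted(size_histogram):
    print(f"{size:>3} | {'#' * size_histogram[size]}")
```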
24
What should we do next round?