Published by Willow Linscott. Modified over 9 years ago.
1
BIOINFORMÁTICA UFMG A T G C
2
Genomics and Bioinformatics. ESTs, even if redundant. Complete genome or death! (Timeline: 1995 to 2000.)
3
The end of an EST
[Figure: an mRNA with an AUG start codon and an (A)200 poly(A) tail; first-strand cDNA (minus strand) primed with (T)18; second-strand cDNA (plus strand) carrying ATG and (A)18. The 5' EST and 3' EST reads are drawn against the cDNA, with the read ends trailing off into x's. Example reads: ATCATGACTTACGGGCGCGCGATxxxxxx and GGCGCGCGATATCCxxxx.]
A snapshot of a new transcriptome: [otorrin...] [...damonh...] start end
4
Life after PHRED 15

Query: 469  TTAGGAGGATCGTTTTTAGAATCCCCTGCAACGTTACCACGGTGGATTTCACTGACTGCG 528
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1038 ttaggaggatcgtttttagaatcccctgcaacgttaccacggtggatttcactgactgcg 979

Query: 529  ACGTTCTTAACGTTGAATCCAACGTTGCTACCAgggagagcctcagtaagtgcttcatga 588
            ||||||||||||||||| || |||||||||||||||||| ||||||||||||||||||||
Sbjct: 978  acgttcttaacgttgaagcccacgttgctaccagggagaccctcagtaagtgcttcatga 919

Query: 589  tgcatttcgacagaattgacttcagtcgacaaaccttgcggagcaaaagtgacgaccata 648
            |||||||||||||| |||||||||| |||| ||||||||||| |||||||||||||||||
Sbjct: 918  tgcatttcgacagacttgacttcagccgaccaaccttgcggaccaaaagtgacgaccata 859

Query: 649  ccaggcttgatgataccagtttcaacgc 676
            ||||||||||||||||||||||||||||
Sbjct: 858  ccaggcttgatgataccagtttcaacgc 831

Query: untrimmed read. Sbjct: published sequence.
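Comparing the untrimmed read with the published sequence comes down to counting the positions where the two rows disagree. The following is a minimal Python sketch of that count, assuming the rows are already aligned without gaps, as in the alignment on this slide. The function name `count_mismatches` is illustrative and is not part of BLAST.

```python
def count_mismatches(query: str, subject: str) -> int:
    """Count mismatched positions between two gap-free aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    # The slide mixes upper- and lowercase letters, so compare case-insensitively.
    return sum(q.upper() != s.upper() for q, s in zip(query, subject))


# Second row of the alignment on the slide (query 529-588 vs. subject 978-919).
query_row = "ACGTTCTTAACGTTGAATCCAACGTTGCTACCAgggagagcctcagtaagtgcttcatga"
sbjct_row = "acgttcttaacgttgaagcccacgttgctaccagggagaccctcagtaagtgcttcatga"

mismatches = count_mismatches(query_row, sbjct_row)
identity = 100.0 * (len(query_row) - mismatches) / len(query_row)
print(mismatches, f"{identity:.1f}%")
```

For that row the count is 3 mismatches in 60 bases, which matches the three blanks in the match line.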
5
When PHRED meets BLAST
pUC18 (published sequence)
Sequencing reaction: single pool distributed over three 96-well plates; 3 MegaBACE, 3 reads each; 846 reads total
Processing: MegaBLAST (BLASTn, SWAT)
Phred
–trim: a chromatogram analyzer
–trim_alt: increasing trim_cutoff from 1% up to 25%
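To make the effect of raising trim_cutoff concrete, here is a minimal Python sketch of cutoff-based trimming in the spirit of the modified Mott algorithm that Phred's documentation associates with -trim_alt. It is an illustration under stated assumptions, not Phred's own code: the function name `trim_by_error_cutoff` and the quality values are hypothetical.

```python
def trim_by_error_cutoff(qualities, cutoff=0.05):
    """Return (start, end) of the kept region as a half-open interval.

    Each base scores (cutoff - error_probability); the kept region is the
    contiguous run with the highest total score (maximum-sum subarray).
    """
    best_sum, best_start, best_end = 0.0, 0, 0
    running, start = 0.0, 0
    for i, q in enumerate(qualities):
        p_error = 10 ** (-q / 10.0)  # Phred quality -> error probability
        running += cutoff - p_error
        if running <= 0:
            # A non-positive prefix can never help; restart after this base.
            running, start = 0.0, i + 1
        elif running > best_sum:
            best_sum, best_start, best_end = running, start, i + 1
    return best_start, best_end


# Hypothetical per-base quality values, not data from the slides.
quals = [4, 6, 15, 30, 30, 25, 12, 9, 6, 4]
for cutoff in (0.01, 0.05, 0.25):
    print(cutoff, trim_by_error_cutoff(quals, cutoff))
```

With these made-up qualities, a 1% cutoff keeps only the three best bases, while a 25% cutoff extends the kept region into the lower-quality tip. That trade-off is what the experiment on this slide measures.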
6
The end of an EST. PHRED 10 (10% error): only losses
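The link between a PHRED value and an error percentage follows from the standard definition of the Phred quality score, Q = -10 log10(P), where P is the probability that the base call is wrong. A short Python sketch of the conversion in both directions (the function names are illustrative):

```python
import math


def phred_to_error(q: float) -> float:
    """Error probability for a Phred quality value Q."""
    return 10 ** (-q / 10.0)


def error_to_phred(p: float) -> float:
    """Phred quality value for an error probability P (0 < P <= 1)."""
    return -10.0 * math.log10(p)


for q in (8, 10, 15):
    print(f"PHRED {q}: {100 * phred_to_error(q):.1f}% error")
```

This gives 15.8% for PHRED 8, 10.0% for PHRED 10, and 3.2% for PHRED 15.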
7
8
Error occurrence
[Chart: miscall rate (y-axis, 0% to 30%) against cutoff values from 1% to 25% (x-axis) for trimmed reads. Series: total miscall and stepwise miscall. Annotations: % error in sequence, 3%; % error in the tip (added bases), 16% to 17%.]
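One way to read the two series, stated here as an assumption rather than the authors' definition: "total miscall" is the error rate over all bases kept at a given cutoff, and "stepwise miscall" is the error rate over only the bases added since the previous cutoff. A minimal Python sketch under that reading, with a hypothetical function name and made-up counts:

```python
def miscall_rates(steps):
    """steps: list of (cutoff_percent, bases_kept, miscalls), increasing cutoff.

    Returns (cutoff_percent, total_rate, stepwise_rate) per step. The stepwise
    rate is None for the first step and whenever no bases were added.
    """
    rates = []
    prev_bases = prev_miscalls = None
    for cutoff, bases, miscalls in steps:
        total = miscalls / bases if bases else None
        stepwise = None
        if prev_bases is not None and bases > prev_bases:
            # Errors among only the bases gained by loosening the cutoff.
            stepwise = (miscalls - prev_miscalls) / (bases - prev_bases)
        rates.append((cutoff, total, stepwise))
        prev_bases, prev_miscalls = bases, miscalls
    return rates


# Hypothetical counts for illustration only; not the values measured on the slides.
example = [(1, 1000, 10), (2, 1100, 20), (3, 1150, 28)]
rates = miscall_rates(example)
for cutoff, total, stepwise in rates:
    print(cutoff, total, stepwise)
```

The stepwise rate can be far higher than the total rate, because the added bases are few but error-prone while the total is averaged over the whole read.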
9
Virtual pUC18 protein: STOP = *

>protein_puc18
RQGFPSHDVVKRRPVPSLHACRSTLEDPRVPSSNS*SWS*LFPV*NCYPLTIPHNIRAGS
IKCKAWGA**VS*LTLIALRSLPAFQSGNLSCQLH**IGQRAGRGGLRIGRSSASSLTDS
LRSVVRLRRAVSAHSKAVIRLSTESGDNAGKNM*AKGQQKARNRKKAALLAFFHRLRPPD
EHHKNRRSSQRWRNPTGL*RYQAFPPGSSLVRSPVPTLPLTGYLSAFLPSGSVALSHSSR
CRYLSSV*VVRSKLGCVHEPPVQPDRCALSGNYRLESNPVRHDLSPLAAATGNRISRARY
VGGATEFLKWWPNYGYTRRTVFGICALLKPVTFGKRVGSS*SGKQTTAGSGGFFVCKQQI
TRRKKGSQEDPLIFSTGSDAQWNENSR*GILVMRLSKRIFT*ILLN*K*SFKSI*SIYE*
TWSDSYQCLISEAPISAICLFRSSIVA*LPVV*ITTIREGLPSGPSAAMIPRDPRSPAPD
LSAINQPAGRAERRSGPATLSASIQSINCCREARVSSSPVNSLRNVVAIATGIVVSRSSF
GMASFSSGSQRSRRVT*SPMLCKKAVSSFGPPIVVRSKLAAVLSLMVMAALHNSLTVMPS
VRCFSVTGEYSTKSF*E*CMRRPSCSCPASIRDNTAPHSRTLKVLIIGKRSSGRKLSRIL
PLLRSSSM*PTRAPN*SSASFTFTSVSG*AKTGRQNAAKKGIRATRKC*ILILFLFQYY*
SIYQGYCLMSGYIFECI*KNKQIGVPRTFPRKVPPDV*ETIIIMTLTYKNRRITRPFRLA
RFGDDGENL*HMQLPETVTACL*ADAGSRQARQGASAGVGGCRGWLNYAASEQIVLRVHH
MRCEIPHRCVRRKYRIRRHSPFRLRNCWEGRSVRASSLLRQLAKGGCAARRLSWV
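A "virtual protein" like this one can be produced by translating a nucleotide sequence in a single reading frame, reading through stop codons and writing each as `*`. The sketch below does that in plain Python with the standard genetic code. It is an illustration, not the authors' script: the function name `translate`, the choice of frame, and the use of `X` for codons containing ambiguous bases are assumptions.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one reading frame (0, 1 or 2) without stopping at stop codons.

    Stop codons are written as '*'; codons with non-ACGT symbols become 'X'.
    """
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


print(translate("ATGGCCTAAGGG"))
```

Because translation continues past `*`, a frame that is not a real coding frame produces many stops, as in the sequence on this slide.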
10
tBLASTn (BLASTx) scores are maximized with PHRED 8
[Chart: BLASTx score (y-axis) against trim_cutoff parameter value in % (x-axis); marked values: 15 and 8.]
11
Summarizing
PHRED meets BLAST as:
–errors in the tip are 16%
–molecules carry 3% global error
–scores for EST vs. amino acid comparisons are maximized
Real life: crossmatch ends with X's
Authors:
–Fabiano Peixoto (CENAPAD)
–Francisco Prosdocimi (Lab Biodiversidade)
–Maurício Mudado (Lab Biodados)
12
Virtual pUC18 protein