Welcome to Introduction to Bioinformatics Wednesday, 28 February 2007 - Introduction to Viral Metagenome Project - Discussion of Edwards & Rohwer (2005)* - Exam retrospective (Problem 12) - Other matters? *Unless otherwise noted, all figures herein are from: Edwards RA, Rohwer F (2005). Viral metagenomics. Nature Rev Microbiol 3:
Edwards & Rohwer (2005) Phage phylogeny and taxonomy Placement of unknown phage into phylogeny SQ11. How to test? Result of test? ~50,000 nt Blast ~500 nt
Edwards & Rohwer (2005) The proviral metagenome SQ11. What's a provirus or prophage? Why would a virus do such a thing?
Infection Phage Bacterial chromosome Phage genome Lysogenic pathway Phage genome Death General transduction Edwards & Rohwer (2005) The proviral metagenome Lytic pathway
Infection Phage Bacterial chromosome Phage genome Life! Lytic pathway Lysogenic pathway Edwards & Rohwer (2005) The proviral metagenome
Edwards & Rohwer (2005) Viral community structure and ecology SQ14. What means ~10^12 viruses but only ~1000 viral genotypes? Two scenarios?
Edwards & Rohwer (2005) Viral community structure and ecology SQX. How to measure complexity? - Sample: How many counted once? How many counted twice? How many counted zero times? - Model the process: Use different numbers of types
Edwards & Rohwer (2005) Viral community structure and ecology SQX. How to measure complexity? [Plot: Probability vs. Times encountered; 200 types]
Edwards & Rohwer (2005) Viral community structure and ecology SQX. How to measure complexity? [Plot: Probability vs. Times encountered; 200 types and 5000 types]
Edwards & Rohwer (2005) Viral community structure and ecology SQX. How to measure complexity? [Plot: Probability vs. Times encountered]
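The sample-and-count idea on these slides can be sketched as a small calculation. This is only an illustration under simplifying assumptions that the slides do not state: every viral type is taken to be equally abundant and draws are independent, so the number of times one given type is encountered follows a binomial distribution. The sample size of 500 fragments is a hypothetical value chosen for the example.

```python
from math import comb

def encounter_distribution(n_types, n_samples, max_count=3):
    """Probability that one given type is seen exactly k times (k = 0..max_count).

    Assumes every type is equally abundant and draws are independent,
    so the count for a single type is Binomial(n_samples, 1/n_types).
    """
    p = 1.0 / n_types
    return [
        comb(n_samples, k) * p**k * (1 - p) ** (n_samples - k)
        for k in range(max_count + 1)
    ]

if __name__ == "__main__":
    n_samples = 500  # hypothetical number of sequenced fragments
    for n_types in (200, 5000):
        probs = encounter_distribution(n_types, n_samples)
        formatted = ", ".join(f"P({k})={q:.3f}" for k, q in enumerate(probs))
        print(f"{n_types} types: {formatted}")
```

With more types in the community, the chance of never meeting a given type rises and repeat encounters become rare, which is why the observed counts of singletons and doubles carry information about the number of types.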
Edwards & Rohwer (2005) Bioinformatics and viral metagenomics 1. How to identify genes? 2. How to identify genes' viruses?
Edwards & Rohwer (2005) Bioinformatics and viral metagenomics How to identify genes? Sequence Open reading frames Sequence 151 TATTTCGTAG TTATGTTGAA CCGATGAAAC TTGTTTGTTC TCAAATTGAG Translation-Frame Y F V V M L N R * N L F V L K L S Translation-Frame I S * L C * T D E T C L F S N * A Translation-Frame F R S Y V E P M K L V C S Q I E Complement 151 ATAAAGCATC AATACAACTT GGCTACTTTG AACAAACAAG AGTTTAACTC Translation-Frame I E Y N H Q V S S V Q K N E F Q Translation-Frame Y K T T I N F R H F K N T R L N L Translation-Frame T N R L * T S G I F S T Q E * I S Sequence 201 CTCAATACAG CTCTTCAACT AGTTAGTAGA GCTGTAGCCA CTAGGCCTTC Translation-Frame S I Q L F N * L V E L * P L G L R Translation-Frame Q Y S S S T S * * S C S H * A F Translation-Frame L N T A L Q L V S R A V A T R P S Complement 201 GAGTTATGTC GAGAAGTTGA TCAATCATCT CGACATCGGT GATCCGGAAG Translation-Frame A * Y L E E V L * Y L Q L W * A K Translation-Frame E I C S K L * N T S S Y G S P R Translation-Frame S L V A R * S T L L A T A V L G E Open reading frame finder + ORF characteristics E.g. GeneMark
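The six-frame translation shown on this slide can be reproduced with a short sketch. It uses the standard genetic code and simply reports the longest stop-free stretch in each frame; that is a naive stand-in for illustration, not how GeneMark works (GeneMark scores ORF characteristics with statistical models). The function names are chosen for this example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq, frame=0):
    """Translate seq starting at offset frame (0, 1, or 2); '*' marks a stop."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)

def six_frames(seq):
    """Translations of the three forward and three reverse frames."""
    rc = reverse_complement(seq)
    return {
        **{f"+{f + 1}": translate(seq, f) for f in range(3)},
        **{f"-{f + 1}": translate(rc, f) for f in range(3)},
    }

def longest_open_stretch(protein):
    """Longest run of residues uninterrupted by a stop codon."""
    return max(protein.split("*"), key=len)

if __name__ == "__main__":
    # First 50 nt shown on the slide (positions 151-200).
    seq = "TATTTCGTAGTTATGTTGAACCGATGAAACTTGTTTGTTCTCAAATTGAG"
    for name, protein in six_frames(seq).items():
        print(name, protein, longest_open_stretch(protein))
```

The three forward-frame translations match the residues printed under the sequence on the slide, apart from codons that run over the end of the 50-nt line.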
Edwards & Rohwer (2005) Bioinformatics and viral metagenomics How to identify genes? Sequence Open reading frames Predicted function BlastP
Edwards & Rohwer (2005) Bioinformatics and viral metagenomics How to identify genes? Sequence Open reading frames Predicted function BlastN? SQ16. Other Blasts? TBlastX? Why so much time?
Edwards & Rohwer (2005) Bioinformatics and viral metagenomics How to identify genes' viruses?
Codon usage in different organisms SQ16. What means "codon usage"? How useful?
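As a concrete reading of "codon usage", the sketch below tabulates how often each codon appears in a coding sequence. It assumes the input is already in frame (starts at the first base of a codon); the function name and the toy sequence are illustrative only.

```python
from collections import Counter

def codon_usage(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}

if __name__ == "__main__":
    # Toy sequence: ATG GCT GCT GCA TAA
    for codon, freq in sorted(codon_usage("ATGGCTGCTGCATAA").items()):
        print(codon, round(freq, 2))
```

Different organisms favor different synonymous codons (here GCT over GCA for alanine), so such a table can serve as a fingerprint when comparing sequences.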
GC content in different organisms SQ18. GC/AT differences in cyanobacterial genomes?
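GC content is simply the fraction of bases that are G or C. A minimal sketch follows; ignoring ambiguous symbols such as N is a choice made for this example.

```python
def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (seq.count("G") + seq.count("C")) / acgt

if __name__ == "__main__":
    print(gc_content("GGCCAATT"))
```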
GC content in different organisms [Figure: data points labeled S, S, P, S, Npun, A, Tery, PRO, S, Gvi, TeBP, PMED, Cwat, A]
Constancy of sequence characteristics - GC content - Codon frequencies - Dinucleotide frequencies DNA sequence
Constancy of sequence characteristics DNA sequence - GC content - Codon frequencies - Dinucleotide frequencies
Constancy of sequence characteristics Karlin S (2001). Trends Microbiol 9:
Edwards & Rohwer (2005) Bioinformatics and viral metagenomics How to identify genes' viruses? - GC content - Codon frequencies - Dinucleotide frequencies Virus #1 Virus #2 Virus #3 Virus #4 Virus #5 Virus #6... Viral fragment
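One way to picture the matching step on this slide is to compare a fragment's dinucleotide frequencies with those of each candidate virus and pick the closest. The sketch below is an illustration under assumptions, not the method of Edwards & Rohwer: it uses plain dinucleotide frequencies and a sum-of-absolute-differences distance, and the reference names and sequences are made-up toy data.

```python
from itertools import product

DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]

def dinucleotide_profile(seq):
    """Frequencies of the 16 dinucleotides, counted over overlapping windows."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip windows containing N or other symbols
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no countable dinucleotides in sequence")
    return [counts[d] / total for d in DINUCLEOTIDES]

def profile_distance(p, q):
    """Sum of absolute differences between two frequency profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))

def closest_reference(fragment, references):
    """Name of the reference whose profile is nearest to the fragment's."""
    frag = dinucleotide_profile(fragment)
    return min(
        references,
        key=lambda name: profile_distance(frag, dinucleotide_profile(references[name])),
    )

if __name__ == "__main__":
    # Hypothetical toy genomes: one AT-rich, one GC-rich.
    references = {
        "virus_1": "ATATATTTAAATATTAATATAT",
        "virus_2": "GCGCGGCCGCGCCGGCGCGC",
    }
    print(closest_reference("TTATATAAATAT", references))
```

The same structure works with GC content or codon frequencies as the profile; real fragments are short, so such signatures are noisy and the nearest match should be treated as a hint rather than an identification.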