Joanna Klein, Ph.D. Northwestern Scholarship Symposium May 4, 2012
What are Bacteria? Single-celled microorganisms Friend or Foe? Friend: health, environment, industry Foe: cause a variety of infectious diseases
Cellulophaga lytica Marine bacterium Isolated from beach mud near Limon, Costa Rica
Cellulophaga lytica Gram negative Filamentous Yellow pigmentation Exhibits gliding motility
Gliding Motility
Cellulophaga lytica Member of the Cytophaga-Flavobacterium-Bacteroides (CFB) group of bacteria Poorly characterized branch
Phylogenetic tree of Bacteria Proteobacteria E. coli, Salmonella, Bordetella, Helicobacter, Vibrio Firmicutes Staphylococcus, Streptococcus, Lactobacillus, Clostridium
Cellulophaga lytica Target organism in the Genomic Encyclopedia of Bacteria and Archaea (GEBA) Research Program of the Department of Energy/Joint Genome Institute GEBA organisms 100 representative organisms from each of the branches Organisms with potential energy applications
Biofuel production C. lytica produces a variety of enzymes that may have applications in biotechnology and biofuel production
Deconstruction by C. lytica C. lytica contains many polysaccharide-degrading enzymes Polysaccharides Large molecules that store energy or provide structure Carbohydrates/starches Cellulose in plant cell walls Enzymes break down polysaccharides into simple sugars that can be fermented to produce energy Polysaccharide-degrading enzymes 3 cellulases 3 fucoidases 1 xylosidase
Plant Cell Wall
Ethanol production Ethanol produced as a byproduct of starch degradation and subsequent fermentation Well developed technology Enzymes digest starch into simple sugars which are readily fermented by known microorganisms to produce ethanol Issues…
Cellulosic ethanol production Goal is to use the cellulose biomass found in plant cell walls of leaves and wood to produce ethanol Problems to overcome: Lignin, also found in the cell wall, hinders digestion of cellulose from wood Enzymes that digest cellulose into simple sugars are poorly understood Organisms that ferment these simple sugars to produce ethanol are poorly understood Can C. lytica help achieve this goal?
Why study C. lytica? Model organism to understand the CFB group better Contribute to biofuel research and applications
Genome Annotation of Cellulophaga lytica One way to understand more about the life processes of C. lytica is through a study of its genome. Genome All of the genetic material, DNA, of an organism DNA is made up of 4 smaller molecules known as the bases A, C, G & T
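Because a genome is ultimately a long string over the four bases, even very simple programs can summarize it. The following is a minimal Python sketch, not part of the original project; the function names `base_composition` and `gc_content` are illustrative assumptions.

```python
from collections import Counter


def base_composition(dna: str) -> dict:
    """Return the fraction of each base (A, C, G, T) in a DNA string."""
    dna = dna.upper()
    # Ignore anything that is not one of the four bases (e.g. N or whitespace).
    counts = Counter(base for base in dna if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return {base: counts[base] / total for base in "ACGT"}


def gc_content(dna: str) -> float:
    """Return the fraction of bases that are G or C."""
    composition = base_composition(dna)
    return composition["G"] + composition["C"]
```

For example, `gc_content("AACG")` counts one G and one C among four bases.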
Sequencing genomes We can easily determine the entire DNA sequence of an organism: its genome. DNA sequencing technology has developed rapidly since the Human Genome Project, completed in 2003 Took 13 years to complete, involved hundreds of researchers around the globe, and cost a total of $2.7 billion Entire 3 billion base-pair sequence is available in a public database
Genome projects Currently, there are more than 3000 complete or nearly complete genome sequences of microbes available. Over 1200 genome sequencing projects in higher organisms (plants, animals, fungi, protists) The complete genome of Cellulophaga lytica was sequenced by the DOE and published in ,765,936 bases
Genome Projects Determine the genome sequence Annotate genome: Process of attaching biological information to sequences Study function of genes
Computer annotation of C. lytica Number of genes and predicted function of each gene product.
TATCAAAGAGATGATTGAGAACTGGTACGGAGGGAGTCGAGCCGGGCTCACTTAAGGGCTACGACTTAAC GGGCCGCGTCACTCAATGGCGCGGACACGCCTCTTTGCCCGGGCAGAGGCATGTACAGCGCATGCCCACA ACGGCGGAGGCCGCCGGGTTCCCTGACGTGCCAGTCAGGCCTTCTCCTTTTCCGCAGACCGTGTGTTTCT TTACCGCTCTCCCCCGAGACCTTTTAAGGGTTGTTTGGAGTGTAAGTGGAGGAATATACGTAGTGTTGTC TTAATGGTACCGTTAACTAAGTAAGGAAGCCACTTAATTTAAAATTATGTATGCAGAACATGCGAAGTTA AAAGATGTATAAAAGCTTAAGATGGGGAGAAAAACCTTTTTTCAGAGGGTACTGTGTTACTGTTTTCTTG CTTTTCATTCATTCCAGAAATCATCTGTTCACATCCAAAGGCACAATTCATTTTGAGTTTCTTTCAAAAC AAATCGTTTGTAGTTTTAGGACAGGCTGATGCACTTTGGGCTTGACTTCTGATTACCCTATTGTTAAATT AGTGACCCCTCTTAGTGTTTTCCTGTCCTTTATTTCGGAGGACGCACTTCGAAGATACCAGATTTTATGG GTCATCCTTGGATTTTGAAGCTTATAACTGTGACAAAAAATGTGAAGGGAAGAGATTTGAAACATGTGGA AGGAAAAGTGAGTGCAGACTATAAACTTCCAAAAAGACAAGCCCAAAATACACCTAAACGTTATGTCAGA TTATTTTGTTAAAATCAGTTGTTAGTGACGTCCGTACGTTAATAGAAAAAAGAATGCTTCAGTTTGGAGT GGTAGGTTTCTAGAGGGATTTATTGTGAAAGTATAAACTATTCAGGGCAATGGGACTGAGAGAACAGTGG GTAGAAAGGACCACTGAAGGAAAGGAAGAGAATTGGAAGGTAGATGAAAGAAGGAGCAAGAACCTGGGG TGTTTTTTCCTTTTCACTTGTAATAGTAGTAACAGAAGCAATGGCAGACTGGCTTTTGTTTCTACTGTGT TAGAATGAATTGACAGGACAACTGGGCCTATTATTGTACTGTGCCAGAATACTGTAAAACAAAACTAAAC ATACTAGCTTGGTGGCTTGTAATTAATTACTTAAGTGGAGATTTTTATTTTTTTTTTATTTTTTTTTTAG ACGGAGTCTCACTTTGTCACCCAGGCTGGAGTGCAGTGGCGCGATCTCAGCTGACTGCAACCTCCTCCTC Cellulase
Process of annotation Automatic annotation – done automatically using computer software 35% of computer-generated annotations are wrong or are missing information due to limitations of computer algorithms Manual annotation – humans analyze the information generated by computers and make corrections as necessary. Labor intensive and time consuming Solution: Train students to participate in the process
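To give a feel for what automatic annotation software does at its simplest, here is a toy open reading frame (ORF) finder in Python. It is only a sketch under stated assumptions: it scans the forward strand only, treats ATG as the sole start codon, and uses a hypothetical `min_codons` cutoff. Real gene-calling software relies on statistical models and much more evidence, which is part of why its output still needs human review.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 30) -> list:
    """Return (start, end) index pairs of forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon.
    `end` is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):  # three forward reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # remember the first start codon seen
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if it has enough codons before the stop.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs
```

A short sequence such as `"CCATGAAATTTTAGCC"` contains one small ORF (ATG AAA TTT TAG) that is reported when `min_codons` is lowered to 2.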
IMG-ACT is a toolkit of online gene and genome analysis programs. Using IMG-ACT, students annotate genomes and provide the human expertise necessary for accurate, up-to-date, reliable annotation Students contribute to the scientific community and learn biological concepts through participating in original research
JGI Genome Annotation Workshop Walnut Creek, CA January 2011
Genome annotation of C. lytica at NWC 39 NWC students have participated in this research endeavor Science Research Institute, Summer 2011 Genetics, Fall 2011 Microbiology, Spring genes have been fully annotated 10 genes have been partially annotated
Restriction endonuclease type I What is the amino acid sequence of the protein encoded by this gene? Used Integrated Microbial Genomes (IMG) database Amy Knight and Allison Lothe
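The question on this slide, going from a gene's DNA sequence to its amino acid sequence, can be illustrated with a short Python sketch. This is not how the IMG database works internally; it simply applies the standard genetic code, and the names `CODON_TABLE` and `translate` are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, listed with bases in the order T, C, A, G
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence into one-letter amino acids.

    Translation starts at the first base and stops at the first stop codon.
    Codons containing unknown characters are reported as "X".
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For instance, `translate("ATGGCCTAA")` reads the codons ATG, GCC, and TAA and stops at the final stop codon.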
DNA topoisomerase III How does this protein compare to the sequence of other proteins? Used BLAST program Libby Nelson and Chelsey Fiecke
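One number commonly used when comparing two proteins is percent identity. The Python sketch below is not BLAST itself, which also searches a database and builds the alignment; it only assumes the two sequences have already been aligned (with "-" marking gaps) and counts identical columns over the alignment length, which is one common convention. The function name is an illustrative assumption.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences have the same residue.

    Both inputs must be rows of the same alignment, so they have equal length.
    Columns containing a gap ("-") count toward the length but never as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)
```

For example, the aligned pair `"MKVL"` and `"MKAL"` agrees at three of four columns.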
RNA polymerase sigma subunit 24 What are key functional amino acid residues in the protein? Used WebLogo program Silas Baalke and Laura Torgerson
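A sequence logo highlights conserved positions by plotting how much information each column of a multiple alignment carries. The following Python sketch shows the basic calculation (maximum possible entropy minus observed Shannon entropy) without the small-sample correction that logo software typically applies. The function names and the `threshold_bits` cutoff are illustrative assumptions.

```python
import math
from collections import Counter


def column_information(column, alphabet_size: int = 20) -> float:
    """Information content (in bits) of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # A fully conserved protein column reaches log2(20), about 4.32 bits.
    return math.log2(alphabet_size) - entropy


def conserved_positions(alignment, threshold_bits: float = 4.0) -> list:
    """Return 0-based column indices whose information meets the threshold."""
    return [
        index
        for index, column in enumerate(zip(*alignment))
        if column_information(column) >= threshold_bits
    ]
```

With the toy alignment `["MKA", "MRA", "MKG"]`, only the first column is fully conserved, so it is the only one reported at the 4-bit threshold.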
DNA Replication Protein A What enzymatic pathway is the protein involved in? Used KEGG Pathway database Marie Abeler and Gabe Jefferson
-galactosidase What pathway is this enzyme found in? KEGG database Daniel Plack, Michael Lowry
Prolyl-tRNA synthetase What is the 3D structure of similar proteins? Protein Data Bank (PDB) Sarah Ivanca and Victoria Hanson
NusA, B, G anti-termination factors Where is the gene in relation to other genes? Used Gene neighborhood feature of IMG Matt Takata and Zach Fredman
RNase H What reaction does the enzyme catalyze? Used MetaCyc database Chelsey Fiecke
Elongation Factor Ts How closely related is this protein to proteins in other bacteria? Used Phylogeny.fr program Ellen Chae, Holly Tomaz
Cytochrome C oxidase subunit 3 Where is this protein located in the cell? TMHMM algorithm Alannah Pratt, Michael Lowry, SRI high school students
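Membrane-spanning segments tend to be long stretches of hydrophobic amino acids. TMHMM detects them with a hidden Markov model; the Python sketch below swaps in a much simpler, older idea, a sliding-window average of Kyte-Doolittle hydropathy values, purely to illustrate the principle. The window of 19 residues and the 1.6 cutoff are the usual rule-of-thumb values for this method and are treated here as assumptions, as are the function names.

```python
# Kyte-Doolittle hydropathy scale (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list:
    """Average hydropathy of every window of consecutive residues."""
    # Unknown residue letters are scored as 0.0 (neutral).
    scores = [KYTE_DOOLITTLE.get(aa, 0.0) for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def has_candidate_tm_segment(protein: str, window: int = 19,
                             threshold: float = 1.6) -> bool:
    """True if any window is hydrophobic enough to suggest a membrane helix."""
    return any(avg >= threshold for avg in hydropathy_profile(protein, window))
```

A protein shorter than the window yields an empty profile, so the check returns False rather than raising an error.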
MutS Are there paralogs of this gene? IMG database query Ryan Bradbury and Luke Delain
RNA polymerase sigma-70 factor Was this gene named properly? Multiple lines of evidence used to change name to RNA Polymerase anti-sigma 70 factor Camaren Terrill and Ben Sorenson
Future work 3,348 genes left to annotate! Special interest in: Polysaccharide degrading enzymes Motility proteins Proteins with unknown function Study the function of interesting genes in the lab
Genome Projects Determine the genome sequence Annotate genome: Process of attaching biological information to sequences Study function of genes
Acknowledgements NWC students who have participated in this research. Genetics, Microbiology and SRI courses Research students Steven Erickson and Andy Jaeger Northwestern College for providing the opportunity and support for the sabbatical during which this project was initiated. Additional funding received from a Faculty Development Grant