Bioinformatics in the Dynamic Genome Course Introducing Freshmen to computational biology University of California, Riverside
Dynamic Genome Course Give Freshmen a taste of research: First Half Core Biological Concepts Key molecular biology skills Basic Bioinformatics Second Half Open ended guided research project
Some history…DG at UGA Research Project: Identify transposable elements in newly sequenced genomes by homology Steps: BLAST exemplar(s) against genome Align top scoring “hits” Build gene trees of newly identified elements
History cont'd…
>Copia_RT
WVYRVKHKQDGSIDRYKARLVAKGYTQVEGLDYLDTFSP
VAKTTTLRLLLALAASQGWFLHQLDVDNAFLHGTLDEEI
YMRLPPGVSSPRPNQVCLLQKSLYGLK
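The Copia_RT query above is a FASTA record: a header line beginning with `>` followed by one or more lines of sequence. As a minimal sketch of how such a record is read, the Python below joins the sequence lines under each header. The function name `parse_fasta` is illustrative and is not part of the course materials or of TARGeT.

```python
def parse_fasta(text):
    """Return {header: sequence} parsed from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header line starts a new record; drop the leading ">".
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            # Sequence lines are collected until the next header.
            records[name].append(line)
    return {key: "".join(chunks) for key, chunks in records.items()}


# The exemplar query shown on the slide.
example = """>Copia_RT
WVYRVKHKQDGSIDRYKARLVAKGYTQVEGLDYLDTFSP
VAKTTTLRLLLALAASQGWFLHQLDVDNAFLHGTLDEEI
YMRLPPGVSSPRPNQVCLLQKSLYGLK
"""

records = parse_fasta(example)
for record_name, sequence in records.items():
    print(record_name, len(sequence), "residues")
```

The line breaks inside the sequence carry no meaning; only the header line separates one record from the next.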
Next, design primers and verify location in genome. [Figure label: POL] Simple, right…?
Students hated it File Formats BLAST FASTA Command Line Clunky web-based tools (2007) It took several weeks to get to the gene tree Solution…
TARGeT: Tree Analysis of Related Genes and TEs. Graduate student wrote a script First chapter in thesis Yujun Han, James Burnette, and Susan Wessler (2009). TARGeT: a web-based pipeline for retrieving and characterizing gene and transposable element families from genomic sequences. Nucleic Acids Research Hosted by CyVerse (a.k.a. iPlant Collaborative) target.iplantcollaborative.org
Quick TARGeT Demo
TARGeT Recap Removed tedium Results mostly within attention span Spend more time on biology in class…
Ping protein query against Rice and Soybean
Ping Transposase query against Rice genome Protein query Nucleotide query Ask students: Why does the protein query find more putative homologs?
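One part of the usual answer to the student question is codon degeneracy: several codons encode the same amino acid, so two genes can drift apart at the nucleotide level while still encoding similar proteins. The Python sketch below illustrates this with two made-up DNA sequences (`seq_a` and `seq_b` are hypothetical, not taken from Ping) that differ at a third of their positions yet translate to the same peptide.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in reading frame 1, ignoring any partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical sequences that use different codons for the same residues.
seq_a = "TGGGTTTATCGTGTTAAA"
seq_b = "TGGGTCTACAGAGTGAAG"

print("nucleotide identity:", round(identity(seq_a, seq_b), 2))
print("protein identity:   ", identity(translate(seq_a), translate(seq_b)))
print("peptide:            ", translate(seq_a))
```

A nucleotide search scores the mismatched positions as differences, whereas a protein-level comparison sees the two sequences as identical.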
Gene Families: Actin Rice Maize Query: Rice Actin
Actin compared with Ping Tpase
An aside: This summer 17 rising sophomores Investigating genetic variation within gene families in Citrus and related genera
Transpose to Riverside Quarter system 20 class meetings of three hours each 4-5 weeks for background 5-6 weeks for project As of fall 2016, 6 sections per quarter Neil A. Campbell Science Learning Laboratory Most diverse R1 University 60% First to college Must provide the technology Be very careful with terminology
Module 1: Genetic Information Flow Students review central dogma outside of class Review in class with concept maps Experiment: Amplify the Actin gene from gDNA and cDNA
Module 1: Locating Introns: Step 1 1. BLAST gDNA sequence vs cDNA sequence using BLAST2Sequences
Step 2: Find locations
Step 3: Draw gene structure This analysis can be done on tablets!!
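Steps 1 to 3 can be sketched in a few lines of Python. The cDNA aligns to the gDNA as a series of blocks (exons), and the stretches of gDNA between consecutive blocks are the introns. The coordinates below are invented for illustration and are not the Actin results from the course; the function names are likewise hypothetical.

```python
def gene_structure(hsps):
    """Turn aligned blocks into an ordered list of exon and intron features.

    Each block is (gdna_start, gdna_end, cdna_start, cdna_end), 1-based and
    inclusive, as read off a pairwise BLAST report.
    """
    blocks = sorted((g_start, g_end) for g_start, g_end, _c_start, _c_end in hsps)
    features = []
    for i, (start, end) in enumerate(blocks):
        if i > 0:
            prev_end = blocks[i - 1][1]
            # gDNA between two aligned blocks has no cDNA match: an intron.
            if start > prev_end + 1:
                features.append(("intron", prev_end + 1, start - 1))
        features.append(("exon", start, end))
    return features


def draw(features):
    """Render the gene structure as a one-line text diagram."""
    return "".join(
        f"[exon {s}-{e}]" if kind == "exon" else f"--intron {s}-{e}--"
        for kind, s, e in features
    )


# Hypothetical alignment blocks: (gDNA start, gDNA end, cDNA start, cDNA end).
hsps = [
    (401, 560, 121, 280),
    (1, 120, 1, 120),
    (901, 1000, 281, 380),
]

features = gene_structure(hsps)
introns = [(s, e) for kind, s, e in features if kind == "intron"]
print(draw(features))
print("introns:", introns)
```

Note that the cDNA coordinates are contiguous (120 is followed by 121) while the gDNA coordinates jump; that jump is what students locate in Step 2.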
Module 2: DNA Sequence Polymorphism Experiment: Amplify a locus from many strains of maize Introduce idea of reference genome (B73) Sometimes introduce genome browsers, PCR primer design
Sequence Analysis Multiple Sequence Alignment Burnette and Wessler Genetics 2013
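Once the sequences are aligned, finding polymorphisms amounts to scanning the alignment column by column and reporting the columns where the strains disagree. The Python sketch below assumes the alignment has already been produced by an alignment tool; the sequences are invented, and the strain names other than the B73 reference are placeholders.

```python
def polymorphic_sites(alignment):
    """Return [(position, {strain: character})] for columns that vary.

    `alignment` maps strain names to aligned sequences of equal length;
    "-" marks a gap. Positions are 1-based alignment columns.
    """
    sequences = list(alignment.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("aligned sequences must all have the same length")
    sites = []
    for index in range(length):
        column = {name: seq[index] for name, seq in alignment.items()}
        if len(set(column.values())) > 1:
            sites.append((index + 1, column))
    return sites


# Hypothetical aligned sequences; B73 is treated as the reference.
alignment = {
    "B73":      "ACGTACGTAC",
    "strain_1": "ACGTACGTAC",
    "strain_2": "ACCTACGTAC",
    "strain_3": "ACGTAC-TAC",
}

sites = polymorphic_sites(alignment)
for position, column in sites:
    print(position, column)
```

In this toy alignment, column 3 is a single-base substitution in one strain and column 7 is a one-base gap in another.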
Electronic Laboratory Notebook The Dynamic Genome Laboratory Notebook is completely electronic We developed our own: FERPA compliant FREE Allows combining bioinformatics and wet lab data Allows collecting “big data” Robb et al. Course Source
Demo eNotebook, data collection Robb et al. Course Source
Example Research Projects Verify predicted TE insertions in rice and maize Phenotypes of transcription factor knock-outs in planaria and C. elegans. Verify knock out with PCR Characterize Ruby alleles in Citrus Polyembryony in Citrus and Poncirus (if time show data collection 3326)
Challenges Students are not as computer literate as we are led to believe. Simple curiosity – “Did you Google it?” “Did you PubMed it?” Resistance to anything non-Facebook Good interfaces: DNA Subway Lots of support Diversity of examples TERMINOLOGY Three vocabularies Biological Words (transcription, translation) (gene, locus, ROI) Laboratory Words (PCR, gel, mini-prep) Computer Words (parameter, input, format, file type)
Acknowledgments Dr. Sofia Robb Dr. Matthew Collin Dr. Yujun Han Alex Cortez Rochelle Campbell