The iPlant Collaborative Vision www.iPlantCollaborative.org Enable life science researchers and educators to use and extend cyberinfrastructure.

1 The iPlant Collaborative Vision www.iPlantCollaborative.org Enable life science researchers and educators to use and extend cyberinfrastructure

2 Important Dates in Genomics
1986 DOE announces Human Genome Initiative: $5.3 million to develop technology.
1990 DOE & NIH present their HGP plan to Congress.
1997 Escherichia coli genome published.
1997 Yeast genome published.
2000 Fruit fly (Drosophila) genome published.
2000 Working draft of the human genome announced.
2000 Thale cress (Arabidopsis) genome published (2x).
2002 Rice genome published (2x).
2003 Human genome published.
2006 First tree genome published in Science.
2007 First metagenomics study published.

3 Economics of Scale [Chart, 1989-2007: sequence production (billions of bases/month) plotted against cost (cents per base), with "Human Genome launched" and "Human Genome completed" marked.] Slide: JGI, 2009

4 Another angle Slide: Stein, 2010

5 What is sequencing? Just as computer software is rendered in long strings of 0s and 1s, the GENOME or “software” of life is represented by a string of the four nucleotides, A, G, C, and T. To understand the software of either - a computer or a living organism - we must know the order, or sequence, of these informative bits. Slide: JGI, 2009

6 What is a genome? A GENOME is all of a living thing’s genetic material. The genetic material is DNA (deoxyribonucleic acid). DNA, a double-helical molecule, is made up of four nucleotide “letters”: A (adenine), G (guanine), T (thymine), and C (cytosine). Slide: JGI, 2009
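
Because DNA is written in a four-letter alphabet, a sequence can be handled as an ordinary string. The minimal Python sketch below is not part of the original slides; it counts each nucleotide and derives the opposite strand of the double helix. The function names are illustrative assumptions, and the short example string is simply the first 14 bases of the mouse_ear_cress_1080 record shown on the next slide.

```python
from collections import Counter

# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def base_counts(seq):
    """Count how often each of the four nucleotide letters occurs."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def reverse_complement(seq):
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


example = "GAAATAATCAATGG"  # first 14 bases of the slide 7 record
print(base_counts(example))
print(reverse_complement(example))
```

Taking the reverse complement twice returns the original sequence, which is a quick sanity check on any such helper.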

7 Exciting? >mouse_ear_cress_1080 GAAATAATCAATGGAATATGTAGAGGTCTCCTGTACCTTCACAGAGATTCTAGGCTGAGAGCAGTGCATATAGATATCTTT CGTACTCATCTGCTTTTTCTGGTCTCCATCACAAAAGCCAACTAGGTAATCATATCAATCTCTCTTTACCGTTTACTCGAC CTTTTCCAATCAGGTGCT TCTGGTGTGTCTACTACTATCAGTTTTAGGTCTTTGTATACCTGATCTTATCTGCTACTG AGGCTTGTAAAAGTGATTAAAACTGTGACATTTACTCTAAGAGAAGTAACCTGTTTGATGCATTTCCCTAATATACCGGTG TGGAAAAGTGTAGGTATCTGTACTCAGCTGAAATGGTGGACGATTTTGAAGAAGATGAACTCTCATTGACTGAAAGCGGGT TGAAGAGTGAAGATGGCGTTATTATCGAGATGAATGTCTCCTGGATGCTTTTATTATCATGTTTGGGAATTTACCAAGGGA GAGGTATCAGAATCTATCTTAGAAGGTTACATTTAGCTCAAGCTTGCATCAACATCTTTACTTAGAGCTCTACGGGTTTTA GTGTGTTTGAAGTTTCTTAACTCCTAGTATAATTAGAATCTTCTGCAGCAGACTTTAGAGTTTTGGGATGTAGAGCTAACC AGAGTCGGTTTGTTTAAACTAGAATCTTTTTATGTAGCAGACTTGTTCAGTACCTGAATACCAGTTTTAAATTACCGTCAG ATGTTGATCTTGTTGGTAATAATGGAGAAACGGAAGAATAATTAGACGAAACAAACTCTTTAAGAACGTATCTTTCAGTTT TCCATCACAAATTTTCTTACAAGCTACAAAAATCGAACTATATATAACTGAACCGAATTTAAACCGGAGGGAGGGTTTGAC TTTGGTCAATCACATTTCCAATGATACCGTCGTTTGGTTTGGGGAAGCCTCGTCGTACAAATACGACGTCGTTTAAGGAAA GCCCTCCTTAACCCCAGTTATAAGCTCAAAGTTGTACTTGACCTTTTTAAAGAAGCACGAAACGAAAAACCCTAAAATTCC CAAGCAGAGAAAGAGAGACAGAGCAAGTACAGATTTCAACTAGCTCAAGATGATCATCCCTGTTCGTTGCTTTACTTGTGG AAAGGTTGATATTTTCCCCTTCGCTTTGGTCTTATTTAGGGTTTTACTCCGTCTTTATAGGGTTTTAGTTACTCCAAATTT GGCTAAGAAGAGATCTTTACTCTCTGTATTTGACACGAATGTTTTTAATCGGTTGGATACATGTTGGGTCGATTAGAGAAA TAAAGTATTGAGCTTTACTAAGCTTTCACCTTGTGATTGGTTTAGGTGATTGGAAACAAATGGGATCAGTATCTTGATCTT CTCCAGCTCGACTACACTGAAGGGTAAGCTTACAATGATTCTCACTTCTTGCTGCTCTAATCATCATACTTTGTGTCAAAA AGAGAGTAATTGCTTTGCGTTTTAGAGAAATTAGCCCAGATTTCGTATTGGGTCTGTGAAGTTTCATATTAGCTAACACAC TTCTCTAATTGATAACAGAAGCTATAAAATAGATTTGCTGATGAAGGAGTTAGCTTTTTATAATCTTCTGTGTTTGTGTTT TACTGTCTGTGTCATTGGAAGAGACTATGTCCTGCCTATATAATCTCTATGTGCCTATCTAGATTTTCTATACAATTGATA TTTGATAGAAGTAGAAAGTAAGACTTAAGGTCTTTTGATTAGACTTGTGCCCATCTACATGATTCTTATTGGACTAATCAT TCTTTGTGTGAAAATAGAATACTTTGTCTGAACATGAGAGAATGGTTCATAATACGTGTGAAGTATGGGATTAGTTCAACA ATTTCGCTATTGGAGAAGCAAACCAAGGGTTAATCGTTTATAGGGTTAAGCTAATGCTCTGCTCTTTATATGTTATTGGAA 
CAGACTATTGTTGTGCCTATCTTGTTTAGTTGTAGATTCTATCTCGACTGTTATAAGTATGACTGAAGGCTTGATGACTTA TGATTCTCTTTACACCTGTAGAAGGATTTAAGCTTGGTGTCTAGATATTCAATCTGTGTTGGTTTTGTCTTTCTTTTGGCT CTTAGTGTTGTTCAATCTCCTCAATAGGTATGAAGTTACAATATCCTTATTATTTTGCAGGGACGCACTTGATGCACTCCA GCTAGTCAGATACTGCTGCAGGCGTATGCTAATGACCTTGCATCAACATCTTTACTTAGAGCTCTACGGGTTTTAGTGTGT

8 This better?

9 Using Plants to Explore Genomics A large number of genomes are publicly and freely available for analysis.

10 Annotation workflow [Diagram; steps shown: Get DNA sequence, Generate mathematical evidence, Gather biological evidence, Build gene models, Browse in context, Find gene families, Analyze large data amounts.]

11 Walk or…

12 Early concept (2009)

13 DNA Subway 2014

14 Coming into the Genome Age For the first time in the history of science, students can work with the same data and tools that are used by researchers. Learning by asking and answering questions. Students generate new knowledge.

15 Workshop Objectives
Illustrate the evolving concept of “gene.”
Conceptualize a “big picture” of complex, dynamic genomes.
Guide students to address real problems through modern genome science.
Use educational and research interfaces for bioinformatics.
Work with “real” genome sequences gathered by students – in the lab or online.

16 Molecular biology and bioinformatics concepts
RepeatMasker:
Eukaryotic genomes contain large amounts of repetitive DNA.
Transposons can be located anywhere; they can mutate like any other DNA.
FGenesH Gene Predictor:
Protein-coding information begins with a start codon, continues through a series of codons, and ends with a stop codon.
Codons in mRNA (AUG, UAA,…) have sequence equivalents in DNA (ATG, TAA,…).
Most eukaryotic introns have “canonical splice sites,” GT---AG (mRNA: GU---AG).
Gene prediction programs search for patterns to predict genes and their structure.
Different gene prediction programs may predict different genes and/or structures.
Multiple Gene Predictors:
The protein-coding sequence of an mRNA is flanked by untranslated regions (UTRs).
UTRs hold regulatory information.
BLAST Searches:
Gene or protein homologs share similarities due to common ancestry.
Biological evidence is needed to curate gene models predicted by computers.
mRNA transcripts and protein sequence data provide “hard” evidence for genes.
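
The pattern-searching idea behind these concepts can be illustrated with a toy example. The Python sketch below is an assumption-laden illustration, not a description of how FGenesH or any other gene predictor works; real predictors use far more elaborate models. It only scans the forward strand for the first ATG that is followed in frame by a stop codon, converts DNA codons to their mRNA equivalents, and tests whether a candidate intron has the canonical GT...AG ends. All function names and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # DNA equivalents of UAA, UAG, UGA


def first_orf(dna):
    """Return the first ATG...stop open reading frame on the forward strand.

    Returns None if no start codon is followed by an in-frame stop codon.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon, in frame with this ATG.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)
    return None


def to_mrna(dna):
    """Write a DNA coding sequence in the mRNA alphabet (T becomes U)."""
    return dna.upper().replace("T", "U")


def has_canonical_splice_sites(intron):
    """True if the intron starts with GT and ends with AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


orf = first_orf("CCATGGCTTAAGG")  # made-up sequence for illustration
print(orf, to_mrna(orf))
print(has_canonical_splice_sites("GTAAGTTTTCAG"))
```

A scan this simple finds many false positives in real genomes, which is one reason the slide stresses that different prediction programs can disagree and that biological evidence is needed to curate their output.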

17 What is a gene? Can we define a gene? Has the definition of a gene changed? How can we find genes?

18 An Evolution of Sorts…
Genes as “independent hereditary units” (1866), Mendel
Genes as “beads on strings” (1926), Morgan
One gene, one enzyme (1941), Beadle & Tatum
DNA is the molecule of heredity (1944), Avery
DNA > RNA > Protein (1953), Crick, Watson, Wilkins
Transposons (1940s-50s), McClintock
Reverse transcription (1970), Temin & Baltimore
Split genes (1977), Roberts & Sharp
RNA interference (1998), Fire and Mello

19 Sequence & course material repository
http://gfx.dnalc.org/files/evidence
Don’t open items; save them to your computer!
Annotation (sequences & evidence)
Manuals (DNA Subway, Apollo, JalView)
Presentations (.ppt files)
Prospecting (sequences)
Readings (Bioinformatics tools, splicing, etc.)
Worksheets (Word docs, handouts, etc.)
BCR-ABL (temporary; not course-related)
