DNA Subway Green Line Overview. Growth of Sequence Read Archive (SRA) 2.2 Quadrillion bases Log Scale!

1 DNA Subway Green Line Overview

2 Growth of Sequence Read Archive (SRA)
  - 2.2 quadrillion bases
  - Log scale!
  - http://www.ncbi.nlm.nih.gov/Traces/sra/

3 DNA Subway Green Line: Transcriptome analysis
  Green Line
  - Examine RNA-Seq data for differential expression or annotate sequenced genome
  - Use high-performance computing to analyze complete datasets
  - Generate lists of genes and fold-changes; add results to Red Line projects
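As a rough illustration of what a "list of genes and fold-changes" looks like, here is a minimal sketch. The gene names and expression values are hypothetical, and the pseudocount of 1 is an assumption made only to avoid division by zero; the Green Line's own tools may compute fold-change differently.

```python
import math

def log2_fold_change(control: float, treated: float, pseudocount: float = 1.0) -> float:
    """Log2 ratio of treated to control expression.

    The pseudocount is an illustrative choice (an assumption), added to both
    values so that genes with zero expression do not cause a division by zero.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))

# Hypothetical expression values: gene -> (control, treated)
expression = {
    "geneA": (10.0, 40.0),
    "geneB": (25.0, 25.0),
    "geneC": (63.0, 7.0),
}

fold_changes = {
    gene: log2_fold_change(control, treated)
    for gene, (control, treated) in expression.items()
}

# Rank genes by the magnitude of change, largest first.
ranked = sorted(fold_changes.items(), key=lambda item: abs(item[1]), reverse=True)

for gene, lfc in ranked:
    print(f"{gene}\t{lfc:+.2f}")
```

A positive value means higher expression in the treated sample, a negative value means lower expression, and zero means no change.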

4 RNA-Seq Overview
  Green Line: Differential expression
  - RNA collected from multiple samples/time points
  - Library prep and sequencing
  - QC of reads
  - Assembly and mapping
  - Abundance estimation

5 Next Generation RNA-Sequencing for Undergraduates
  1. Design experiment (differential expression or genome annotation)
  2. Isolate RNA
  3. Sequence RNA
  4. Analyze RNA sequence datasets using the Green Line of DNA Subway and other bioinformatics tools
  5. Follow-up validations

6 Working Group Faculty Projects 2014
  Faculty | Institution | Project
  Agnes Ayme-Southgate | College of Charleston, SC | Gene expression changes in Apis mellifera flight muscle during life-stage transitions
  Judy Brusslan | California State University, Long Beach, CA | Gene expression changes during leaf development and senescence in Arabidopsis thaliana
  Raymond Enke | James Madison University, VA | Gene expression changes during retina development in Gallus gallus
  Shaye Lewis | Prairie View A&M University, TX | Gene expression in caprine testes during juvenile development to puberty
  Irina Makarevitch | Hamline University, MN | Gene expression changes in maize in response to cold stress
  Judith Ogilvie | Saint Louis University, MO | Gene expression changes in the retinas of mice with retinitis pigmentosa
  Jeremy Seto | New York City College of Technology – CUNY, NY | Gene expression changes during differentiation of rat pheochromocytoma line cells (PC12) to a neuronal-like phenotype
  Carrie Thurber | Abraham Baldwin Agricultural College, IL | Gene expression changes during seed abscission in Sorghum bicolor
  George Ude | Bowie State University, MD | Transcriptome analysis of floral inflorescence genes in banana/plantains
  Deirdre Vaden | Prairie View A&M University, TX | Gene expression changes in peripheral blood mononuclear cells from hypertensive rats treated with captopril
  Scott Woody | University of Wisconsin, WI | Gene expression changes upon gibberellic acid exposure in Brassica rapa (Fast Plants, self-compatible) gibberellic acid (gad) mutants

7 RNA-Seq Overview
  Green Line: Differential expression and genome annotation
  - Biologically rich
  - Technical challenge (bottleneck)
  - Molecular techniques
  - Sequencing technologies
  - Expression plots
  - Validation experiments
  Image from Advanced Sequencing Technologies & Applications http://meetings.cshl.edu/courses.html

8 RNA-Seq Overview Green Line: Technical challenges
  1. Crowded field of choices

9 RNA-Seq Overview Green Line: Technical challenges
  2. Bioinformatics skill level
  @SRR070570.4 HWUSI-EAS455:3:1:1:1096 length=41
  CAAGGCCCGGGAACGAATTCACCGCCGTATGGCTGACCGGC
  +
  BA?39AAA933BA05>A@A=?4,9#################
  @SRR070570.12 HWUSI-EAS455:3:1:2:1592 length=41
  GAGGCGTTGACGGGAAAAGGGATATTAGCTCAGCTGAATCT
  +
  @=:9>5+.5=?@ A?@6+2?:,%1/=0/7/>48##
  @SRR070570.13 HWUSI-EAS455:3:1:2:869 length=41
  TGCCAGTAGTCATATGCTTGTCTCAAAGATTAAGCCATGCA
  +
  Bioinformatician 0 1 0 0 1 1 0 1 0 1
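The reads on this slide are in FASTQ format: each record is four lines (an `@` header, the bases, a `+` separator, and one quality character per base). A minimal parsing sketch follows, using the first record from the slide. The helper names are hypothetical, and the Phred+33 quality offset is an assumption about how this file was encoded.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO

class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str

def parse_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield four-line FASTQ records from a text handle."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)

def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode quality characters to integer scores (offset 33 is assumed)."""
    return [ord(char) - offset for char in quality]

# First record from the slide; the trailing run of '#' is written as "#" * 17.
SAMPLE = (
    "@SRR070570.4 HWUSI-EAS455:3:1:1:1096 length=41\n"
    "CAAGGCCCGGGAACGAATTCACCGCCGTATGGCTGACCGGC\n"
    "+\n"
    "BA?39AAA933BA05>A@A=?4,9" + "#" * 17 + "\n"
)

records = list(parse_fastq(io.StringIO(SAMPLE)))
for record in records:
    scores = phred_scores(record.quality)
    print(record.name, len(record.sequence), min(scores), max(scores))
```

Under the assumed offset, the run of `#` characters at the end of the read decodes to a score of 2, which marks the low-quality tail of the read.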

10 RNA-Seq Overview Green Line: Technical challenges
  3. Data and computation

11 DNA Subway Green Line: Transcriptome analysis
  - Tuxedo Workflow
  - Simple layout
  - HPC powered
  - Integrated with iPlant Data Store

12 DNA Subway Green Line: Tuxedo Workflow
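The transcript keeps only the title of this slide. As a hedged sketch, the "Tuxedo" protocol is conventionally a chain of TopHat, Cufflinks, Cuffmerge, and Cuffdiff; the function below only builds the command lines and does not run them. All file and directory names are hypothetical, and DNA Subway may call these tools with different options.

```python
from typing import Dict, List, Tuple

def tuxedo_commands(
    genome_index: str,
    genome_fasta: str,
    annotation_gtf: str,
    samples: Dict[str, str],
) -> Tuple[List[List[str]], List[str]]:
    """Build (not run) command lines for a basic Tuxedo-style analysis.

    samples maps a condition label to a reads file. Returns the commands and
    the per-sample transcript GTF paths, which the caller would write, one per
    line, into the 'assemblies.txt' file that cuffmerge reads.
    """
    commands: List[List[str]] = []
    bam_files: List[str] = []
    gtf_files: List[str] = []
    for label, reads in samples.items():
        # Align reads to the genome; TopHat writes accepted_hits.bam.
        commands.append(["tophat", "-o", f"{label}_tophat", genome_index, reads])
        bam = f"{label}_tophat/accepted_hits.bam"
        bam_files.append(bam)
        # Assemble transcripts; Cufflinks writes transcripts.gtf.
        commands.append(["cufflinks", "-o", f"{label}_cufflinks", bam])
        gtf_files.append(f"{label}_cufflinks/transcripts.gtf")
    # Merge per-sample assemblies into one annotation.
    commands.append([
        "cuffmerge", "-g", annotation_gtf, "-s", genome_fasta,
        "-o", "merged_asm", "assemblies.txt",
    ])
    # Test for differential expression between the labeled conditions.
    commands.append([
        "cuffdiff", "-o", "diff_out", "-L", ",".join(samples),
        "merged_asm/merged.gtf", *bam_files,
    ])
    return commands, gtf_files

commands, gtf_paths = tuxedo_commands(
    "genome_index", "genome.fa", "genes.gtf",
    {"control": "control.fastq", "cold": "cold.fastq"},
)
for command in commands:
    print(" ".join(command))
```

Building the commands as lists keeps the sketch free of side effects; a caller could pass each list to `subprocess.run` once the inputs exist.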

13 DNA Subway Green Line: HPC through iPlant Agave API
  Stampede
  Base Cluster (Dell/Intel/Mellanox):
  - Intel Sandy Bridge processors
  - Dell dual-socket nodes w/32GB RAM (2GB/core)
  - 6,400 nodes
  - 56 Gb/s Mellanox FDR InfiniBand interconnect
  - More than 100,000 cores, 2.2 PF peak performance
  Co-Processors:
  - Intel Xeon Phi “MIC” Many Integrated Core processors
  - Special release of “Knights Corner” (61 cores)
  - All MIC cards are on site at TACC
    o more than 6000 installed
  - 7+ PF peak performance
  Max Total Concurrency:
  - exceeds 500,000 cores
  - 1.8M threads
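The Stampede figures on this slide can be cross-checked with a little arithmetic. The sketch below uses only numbers stated on the slide; deriving cores per node from the RAM figures, and reading "exceeds 500,000 cores" as a floor of 500,000, are assumptions made for the check.

```python
import math

# Figures taken from the slide.
NODES = 6_400
RAM_PER_NODE_GB = 32
RAM_PER_CORE_GB = 2
MIC_CORES_PER_CARD = 61
MAX_CONCURRENCY_CORES = 500_000  # slide says "exceeds 500,000 cores"

# 32 GB per node at 2 GB per core implies 16 host cores per node.
cores_per_node = RAM_PER_NODE_GB // RAM_PER_CORE_GB

# Host cores in the base cluster; the slide says "more than 100,000".
host_cores = NODES * cores_per_node

# Coprocessor cores needed to reach the stated total concurrency,
# and the number of 61-core cards that implies.
mic_cores_needed = MAX_CONCURRENCY_CORES - host_cores
min_mic_cards = math.ceil(mic_cores_needed / MIC_CORES_PER_CARD)

print(f"cores per node:        {cores_per_node}")
print(f"host cores:            {host_cores:,}")
print(f"MIC cards needed (>=): {min_mic_cards:,}")
```

The result is consistent with the slide: the base cluster alone supplies just over 100,000 cores, and reaching 500,000 requires somewhat more than 6,000 coprocessor cards.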

14 DNA Subway Green Line: iPlant Data Store
  - Initial 100 GB allocation – TB allocations available
  - Automatic data backup
  - Easy upload/download and sharing

15 DNA Subway
  Public Maize RNA-Seq Dataset
  http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?study=SRP010680

16-25 DNA Subway Green Line: Differential expression

26 DNA Subway “Power Desktop”
  - Intuitive interface to support seamless genome “round trip” for eukaryote of choice
  - Access high performance computing to analyze whole genome data (RNA-seq, initially)
  - Scaffold data to sequenced genomes available in iPlant Data Store
  - Directly upload RNA-seq reads as biological evidence for genome annotation using Red Line

27 The iPlant Collaborative is funded by a grant from the National Science Foundation Plant Cyberinfrastructure Program (#DBI-0735191).
