Published by Jeffery Newton. Modified over 9 years ago.
1
Data Workflow Overview

Genomics High-Throughput Facility: Genome Analyzer IIx
Institute for Genomics and Bioinformatics

Computation Resources
● ~800 processors
● Sun Grid Engine

Storage Capacity
● ~100 TB (secured)
● Fast drives
● 30 TB for HTS

Public Web Servers
● HTTP, FTP
● Dedicated hosts
● User accounts

HTS: 700 GB/day
Bandwidth: 10 Gb/s

USER
● Sample Analysis Requests (via web interface)
● Analysis Results (FTP server)
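As a rough back-of-the-envelope check of the figures on this slide, the sketch below estimates how long the 30 TB HTS allocation lasts at 700 GB/day, and how long one day of output takes to move over a 10 Gb/s link. It assumes decimal units (1 TB = 1000 GB), a fully utilized link, and no cleanup of old runs; none of these assumptions come from the slide, and the function names are illustrative only.

```python
# Figures taken from the slide.
HTS_OUTPUT_GB_PER_DAY = 700
HTS_STORAGE_TB = 30
BANDWIDTH_GBIT_PER_S = 10

# Assumption (not stated on the slide): decimal units.
GB_PER_TB = 1000
BITS_PER_BYTE = 8


def days_until_full(storage_tb: float, output_gb_per_day: float) -> float:
    """Days of sequencing output that fit in the storage allocation."""
    return storage_tb * GB_PER_TB / output_gb_per_day


def transfer_seconds(size_gb: float, bandwidth_gbit_per_s: float) -> float:
    """Ideal time to move size_gb over the link, ignoring protocol overhead."""
    return size_gb * BITS_PER_BYTE / bandwidth_gbit_per_s


days = days_until_full(HTS_STORAGE_TB, HTS_OUTPUT_GB_PER_DAY)
seconds = transfer_seconds(HTS_OUTPUT_GB_PER_DAY, BANDWIDTH_GBIT_PER_S)
print(f"HTS storage lasts about {days:.1f} days at full output")
print(f"One day of output transfers in about {seconds / 60:.1f} minutes")
```

Under these assumptions the storage covers roughly six weeks of continuous sequencing, which suggests that intermediate files would need to be pruned or archived regularly.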
2
Data Analysis Workflow

IMAGES (2-4 TB)
  -> Image Analysis (Firecrest)
INTENSITIES (100-200 GB)
  -> Base Calling (Bustard)
BASE CALLS (50-100 GB)
  -> Synthesis (Gerald)
SEQUENCES + SCORES (20/30 GB): FASTQ files, downloadable files for HTS users
  -> Alignment (ELAND + Reference Genome)
GENOME ALIGNMENT (>100 GB)
  -> Read Counting (Casava VDC)
READ COUNTS
  -> Sample-Specific Analysis, Visualization…
     e.g. Genome alignment, RNAseq, CHIPseq analysis
3
Sequences, Scores (FASTQ)

@HWUSI-EAS1562_0001:8:1:1119:18138#0/1
ATATTCTTATATAAAAATATAATTATTTTAATATTTGGTCCTTTCGTACTAAAATAT
+HWUSI-EAS1562_0001:8:1:1119:18138#0/1
aaY`_aaY^a``[[`a\\\\aaa_^[aaZZWaaaXXY[VYaW^aaaa[aaa]a[a`

@HWUSI-EAS1562_0001:8:1:1119:13476#0/1
AGAAAGCTTTGAAAATTATGTATACGCCTCGTAAGCCCAGTCCAAAGTCAAGACCA
+HWUSI-EAS1562_0001:8:1:1119:13476#0/1
a_^`a`_a[[NOONN__V__`Y^`^X]R[]]]]]Q```Y````__`^W`YVUPR]]

● Sequence identifier
● Raw sequence
● Phred base calling quality scores (0 to 62, encoded using ASCII 64 to 126)
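The quality encoding described on this slide (scores 0 to 62 stored as ASCII 64 to 126) can be decoded by subtracting 64 from each character code. The following is a minimal sketch of that decoding, assuming four-line FASTQ records as shown on the slide; the function names `decode_quality` and `read_fastq` are illustrative and not part of any pipeline tool.

```python
from typing import Iterable, Iterator, List, Tuple

# Per the slide: Phred scores 0..62 are stored as ASCII 64..126.
PHRED_OFFSET = 64


def decode_quality(quality: str, offset: int = PHRED_OFFSET) -> List[int]:
    """Convert a quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (identifier, sequence, scores) from four-line FASTQ records."""
    stripped = [ln.rstrip("\n") for ln in lines if ln.strip()]
    for i in range(0, len(stripped) - 3, 4):
        header, seq, plus, qual = stripped[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record at line {i + 1}")
        yield header[1:], seq, decode_quality(qual)


# Second record from the slide.
example = [
    "@HWUSI-EAS1562_0001:8:1:1119:13476#0/1",
    "AGAAAGCTTTGAAAATTATGTATACGCCTCGTAAGCCCAGTCCAAAGTCAAGACCA",
    "+HWUSI-EAS1562_0001:8:1:1119:13476#0/1",
    "a_^`a`_a[[NOONN__V__`Y^`^X]R[]]]]]Q```Y````__`^W`YVUPR]]",
]

records = list(read_fastq(example))
read_id, sequence, scores = records[0]
print(read_id, len(sequence), min(scores), max(scores))
```

Note that this offset applies to the encoding stated on the slide; FASTQ files from other sources may use a different offset, so the `offset` parameter is left adjustable.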
4
Genome Alignment (ELAND)

HWUSI-EAS1562_0001:8:1:1119:18138#0/1 ATATTCTTATATAAAAATATAATTATTTTAATATTTGGTCCTTTCGTACTAAAATAT U1 0 147 255 chr1.fa 26532086 F 23G

HWUSI-EAS1562_0001:8:1:1119:13476#0/1 AGAAAGCTTTGAAAATTATGTATACGCCTCGTAAGCCCAGTCCAAAGTCAAGACCA U0 1 0 0 chr12.fa 90535786 F

Fields, in order:
● Sequence identifier
● Raw sequence
● Type of match
● Number of exact / 1-error / 2-error matches
● Chromosome / Position / Direction
● Substitution
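The field layout labeled on this slide can be read with a simple whitespace split. The sketch below assumes the columns appear exactly in the order the slide lists them, with the sequence on a single line and the substitution column optional; the `ElandHit` class and `parse_eland_line` function are hypothetical names, and real ELAND output may contain additional columns or unmatched reads that this sketch does not handle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ElandHit:
    read_id: str
    sequence: str
    match_type: str          # e.g. "U0", "U1"
    exact_matches: int
    one_error_matches: int
    two_error_matches: int
    chromosome: str          # reference file name, e.g. "chr12.fa"
    position: int
    direction: str           # "F" as shown on the slide
    substitution: Optional[str]


def parse_eland_line(line: str) -> ElandHit:
    """Parse one aligned read, assuming the column order shown on the slide."""
    f = line.split()
    if len(f) < 9:
        raise ValueError(f"expected at least 9 fields, got {len(f)}")
    return ElandHit(
        read_id=f[0],
        sequence=f[1],
        match_type=f[2],
        exact_matches=int(f[3]),
        one_error_matches=int(f[4]),
        two_error_matches=int(f[5]),
        chromosome=f[6],
        position=int(f[7]),
        direction=f[8],
        substitution=f[9] if len(f) > 9 else None,
    )


# The two records from the slide.
line_with_substitution = (
    "HWUSI-EAS1562_0001:8:1:1119:18138#0/1 "
    "ATATTCTTATATAAAAATATAATTATTTTAATATTTGGTCCTTTCGTACTAAAATAT "
    "U1 0 147 255 chr1.fa 26532086 F 23G"
)
line_exact = (
    "HWUSI-EAS1562_0001:8:1:1119:13476#0/1 "
    "AGAAAGCTTTGAAAATTATGTATACGCCTCGTAAGCCCAGTCCAAAGTCAAGACCA "
    "U0 1 0 0 chr12.fa 90535786 F"
)

hit_sub = parse_eland_line(line_with_substitution)
hit_exact = parse_eland_line(line_exact)
print(hit_exact.chromosome, hit_exact.position, hit_exact.match_type)
print(hit_sub.chromosome, hit_sub.position, hit_sub.substitution)
```

Keeping each hit in a small typed record makes later steps, such as grouping reads by chromosome for read counting, easier to express.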
5
Read Counts (Casava VDC)

● Matches with genes, exons, splice junctions
● Table columns: Chromosome, Gene, Matches
● Files for visualization (GenomeStudio)
● Genome alignment, gene expression, RNAseq and CHIPseq analysis