Ion Torrent Semiconductor Sequencing


Ion Torrent Semiconductor Sequencing
Mike Lelivelt, Ph.D., Director of Bioinformatics
The content provided herein may relate to products that have not been officially released and is subject to change without notice.

Confidential and Proprietary—DO NOT DUPLICATE
Who am I? – Mike Lelivelt
Ph.D. from Univ of N Carolina in Microbial Genetics
Post-doc at Univ of WI Madison in Yeast Genomics
9 years at Affymetrix – software developer outreach
2 years at Partek – data analysis for arrays & NGS
3 years at Ion Torrent/Life Tech in bioinformatics
Familiar with the challenges of turning genomic-scale assays into discrete, actionable decisions via software.
I’m here to educate about semiconductor sequencing.
I’m here to listen to your needs.

Opening thoughts…
"The wonderful thing about standards is that there are so many of them to choose from." –Andrew Tanenbaum
Driven more by the technology than we’d like to admit
Each technology platform serves multiple applications
A data standard implies a file format, but it’s really more about understanding data process flows
The broad scope of NGS will drive multi-marker haplotypes and introduce allele frequency measurements into the decision process.
Software is a tough business model. We’ll need to work together on this.

Simple Natural Chemistry
Eliminate sources of error: modified bases, fluorescent bases, laser detection
Eliminate read length limitations: unnatural bases, protect/de-protect, slow cycle time
Sequence is determined by measuring the hydrogen ions (H+) released (1 per base added per DNA strand) during 2nd strand synthesis, when complementary bases (A, C, G or T) are sequentially incorporated by DNA polymerase.
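The flow-by-flow readout described on this slide can be sketched in a few lines of Python. This is a minimal, noise-free illustration and not Ion Torrent's signal model: the function name `simulate_flowgram`, the repeating flow order "TACG", and the example reads are assumptions made for this sketch. Each flow offers one nucleotide, and the ideal signal is the number of consecutive bases of the synthesized strand that match it, so a homopolymer produces a proportionally larger signal in a single flow.

```python
def simulate_flowgram(read, flow_order="TACG", n_flows=16):
    """Return ideal incorporation counts per flow for a synthesized strand.

    read       -- bases of the strand being synthesized (hypothetical input)
    flow_order -- nucleotides offered in turn; "TACG" is an assumed order
    n_flows    -- number of nucleotide flows to simulate
    """
    counts = []
    pos = 0  # index of the next base still to be incorporated
    for i in range(n_flows):
        nuc = flow_order[i % len(flow_order)]
        n = 0
        # Every consecutive matching base is incorporated in this one flow,
        # releasing one hydrogen ion per base.
        while pos < len(read) and read[pos] == nuc:
            n += 1
            pos += 1
        counts.append(n)
    return counts


print(simulate_flowgram("TTACGG", n_flows=4))  # [2, 1, 1, 2]
print(simulate_flowgram("GA", n_flows=8))      # [0, 0, 0, 1, 0, 1, 0, 0]
```

A flow that offers a non-matching nucleotide yields a zero, which is why real flowgrams contain many near-zero values.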

Massively Parallel Post-Light Sequencing
The chip that sees chemistry
Read clockwise from wafer.
The Ion Torrent chip manufacturing process leverages the cumulative investment in semiconductor manufacturing over the last 30 years. Chips are produced in standard semiconductor factories (a.k.a. foundries) and are similar in design to the CMOS image sensor chips found in iPhones and BlackBerrys. Whereas image sensors collect photons, Ion Torrent chips collect protons.
The cross-sectional view shows the ion-sensitive layer in green, with the microwells (3 µm) on the top surface and the transistor stack underneath.

Torrent Browser runs on Torrent Server
Local compute and storage with an integrated web interface
Torrent Server – hardware appliance
Torrent Browser – easy web access to Ion data
Plugins for secondary analysis, e.g. variant calling
For Research Use Only. Not intended for any animal or human therapeutic or diagnostic use.

Data Flow Leverages Several Formats
DAT: incorporation for 1 flow; incorporation over many flows
WELLS: raw signals per flow
0.1 1.2 0.3 2.1 0.1 0.2 2.1 3.1 0.0 0.2 2.1 3.1 0.0 0.1 1.2 0.3 2.1 0.0 0.0 0.0 3.2 1.4 0.1 1.3 1.0 0.2 0.1
SFF: processed incorporations, but moving to unmapped BAM
0 1 0 2 0 0 3 0 0 1 0 4 0 1 0 3 0 2 2 0 0 1 0 0 0 3 0 4 0 1 1 0 2 0 0 3 0 0 3 0 4 0 4 0 1 0 3 0 2 0 0 0 1 1
FASTQ: flow space converted to base space
@7D8NM:4:9
GGGATCAGGCTGTCGAACGCGTGATTACATCTAGCTA
+
AA*ABBBB?BBBBBBBABBB@@@BB?BABABCDA!@$
TMAP: aligns reads and writes BAM (binary)
TVC: calls variants and writes Variant Call Format (VCF)
##FORMAT=<ID=DP,Number=1
##FORMAT=<ID=HQ,Number=2
#CHROM POS ID REF ALT QUAL FILTER
20 14370 rs6054257 G A 29 PASS
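The step from flow space to base space on this slide can be sketched as follows. This is a rough illustration under stated assumptions, not the Torrent Suite base caller: the function names `flows_to_bases` and `decode_phred33` are hypothetical, the repeating flow order "TACG" is assumed, signals are simply rounded to the nearest whole number of incorporations, and the FASTQ quality string is assumed to use the common Phred+33 encoding.

```python
def flows_to_bases(signals, flow_order="TACG"):
    """Convert per-flow signals to a base string by naive rounding.

    signals    -- raw or processed incorporation values, one per flow
    flow_order -- nucleotides offered in turn; "TACG" is an assumed order
    """
    bases = []
    for i, signal in enumerate(signals):
        # Round half up and clamp at zero; real callers model noise instead.
        n = max(0, int(signal + 0.5))
        bases.append(flow_order[i % len(flow_order)] * n)
    return "".join(bases)


def decode_phred33(quality):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality]


# First four raw signals shown on the slide: T=0.1, A=1.2, C=0.3, G=2.1
print(flows_to_bases([0.1, 1.2, 0.3, 2.1]))  # AGG
# First four quality characters of the slide's FASTQ record
print(decode_phred33("AA*A"))                # [32, 32, 9, 32]
```

Rounding is the simplest possible caller and degrades for long homopolymers, where neighbouring integer signal levels become harder to separate.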

What is raw data? Do you really want it?
Process Description    File Type  314 chip  316 chip  318 chip
Raw Voltage Data       DAT        40 GB     180 GB    320 GB
Signal Processing      WELLS      1 GB      8 GB      12 GB
Base Calls - Flow      SFF/BAM    1 GB      5 GB      8 GB
Base Calls - Base      FASTQ      0.3 GB    1.5 GB    2 GB
Base Calls - Aligned   BAM        0.1 GB    0.6 GB    3.5 GB
*1.5 v run 200bp runs (440 flows, 110 cycles), Nov 2011
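To put the table in perspective, the short Python sketch below computes how much smaller each downstream file is than the raw DAT data. The sizes are copied from the table; the dictionary layout and the function name `reduction_factor` are assumptions made for illustration, and the flow-level row is keyed as "SFF/BAM".

```python
# Approximate file sizes in GB per chip type, taken from the table above.
SIZES_GB = {
    "314": {"DAT": 40, "WELLS": 1, "SFF/BAM": 1, "FASTQ": 0.3, "BAM": 0.1},
    "316": {"DAT": 180, "WELLS": 8, "SFF/BAM": 5, "FASTQ": 1.5, "BAM": 0.6},
    "318": {"DAT": 320, "WELLS": 12, "SFF/BAM": 8, "FASTQ": 2, "BAM": 3.5},
}


def reduction_factor(chip, file_type):
    """Return how many times smaller file_type is than the raw DAT data."""
    sizes = SIZES_GB[chip]
    return sizes["DAT"] / sizes[file_type]


if __name__ == "__main__":
    for chip in SIZES_GB:
        for file_type in ("WELLS", "SFF/BAM", "FASTQ", "BAM"):
            factor = reduction_factor(chip, file_type)
            print(f"{chip} chip: DAT is {factor:.1f}x larger than {file_type}")
```

On these figures the raw voltage data is more than a hundred times larger than the FASTQ output for every chip, which is the practical point behind the slide's question.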

Questions to Address
Are allele calls alone sufficient to call HLA types? Likely not. More data is usually better.
Should HLA software be required to call novel alleles? Speak no evil. See no evil. Hear no evil. But software will serve the market.
Should novel alleles be submitted to IMGT/HLA? Balance between social curation & data security. More than just allele info?
How should data be formatted to handle NGS richness? Format is a snapshot in time.

All products mentioned in this presentation are for Research Use Only, not intended for any animal or human therapeutic or diagnostic use.