SeqWare for NGS analysis MGI meeting, 12/17/2012 Jianying Li.


Keeping track of NGS pipeline analysis

A sample tracking and laboratory assay tracking system
– Clinical samples and their associated clinical data
– Library process, DNA/RNA extraction, etc.
– Aliquot
– Sample label, storage, etc.

Sequencing run
– Platform: SOLiD, 454, Illumina, etc.
– Library preparation
– Exome, genome, RNA-seq, ChIP-seq, etc.
– SE vs PE
– HiSeq: multiplexing

Analytical processes
– Software and version, dependency
– Aligner: BWA, Bowtie, BFAST, etc.
– Reference DB: hg18/19, mm9/10
– Data QC
– Alignment, variant calls, other processes
– Analysis log
– Results

Use of the analytical results
– Variant call results
– Data sharing
– Further statistical analysis
– Data mining
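To make the tracking list above concrete, here is a minimal Python sketch of a record that ties a sample, a sequencing run, and the analysis steps together. It is not SeqWare's data model; every class name, field name, and value below is a hypothetical placeholder chosen for illustration.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class Sample:
    label: str      # sample label
    aliquot: str    # aliquot identifier
    storage: str    # where the sample is stored


@dataclass
class SequencingRun:
    platform: str       # e.g. SOLiD, 454, Illumina
    library_type: str   # e.g. exome, genome, RNA-seq, ChIP-seq
    paired_end: bool    # SE (False) vs PE (True)
    multiplexed: bool


@dataclass
class AnalysisStep:
    software: str
    version: str
    reference: str                              # e.g. hg19, mm10
    log: list = field(default_factory=list)     # free-text analysis log


@dataclass
class AnalysisRecord:
    sample: Sample
    run: SequencingRun
    steps: list = field(default_factory=list)


# All values here are made up for the example.
record = AnalysisRecord(
    sample=Sample(label="S-001", aliquot="A1", storage="freezer-2"),
    run=SequencingRun(platform="Illumina", library_type="exome",
                      paired_end=True, multiplexed=True),
    steps=[AnalysisStep(software="BWA", version="example-version",
                        reference="hg19", log=["alignment finished"])],
)

# asdict() recurses into nested dataclasses, so the record serializes directly.
print(json.dumps(asdict(record), indent=2))
```

Keeping software, version, and reference build on each step is what lets a result be traced back to exactly how it was produced.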

SeqWare pipeline
– A collection of sequence analysis tools
– A collection of third-party analysis tools
– A programmatic interface to wrap SeqWare Pipeline and third-party tools
– A mechanism to run these tools in a consistent way interactively
– A mechanism to string these tools together and execute them on any cluster
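The idea of wrapping tools behind one interface and stringing them together can be sketched in a few lines of Python. This is not SeqWare's programmatic interface: the `Step` class and `run_pipeline` function are hypothetical, the steps run locally rather than on a cluster, and the commands are stand-ins that only invoke the Python interpreter.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    command: list  # argv-style command, run without a shell


def run_pipeline(steps):
    """Run steps in order; stop at the first non-zero exit code.

    Returns a list of (step name, return code) for the steps that ran.
    """
    log = []
    for step in steps:
        result = subprocess.run(step.command, capture_output=True, text=True)
        log.append((step.name, result.returncode))
        if result.returncode != 0:
            break
    return log


# Stand-in commands: a real pipeline would call an aligner, a variant caller, etc.
demo_steps = [
    Step("qc", [sys.executable, "-c", "pass"]),
    Step("align", [sys.executable, "-c", "import sys; sys.exit(3)"]),
    Step("call_variants", [sys.executable, "-c", "pass"]),
]
demo_log = run_pipeline(demo_steps)
print(demo_log)
```

Because every tool is invoked the same way, the returned log doubles as a simple record of which steps ran and how they exited.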

SeqWare pipeline/workflow layout

Workflow convention

# key=input_file:type=file:display=F:file_meta_type=text/plain
input_file=${workflow_bundle_dir}/bundle_hello_world/0.9.0/data/input.txt
# key=greeting:type=text:display=T:display_name=Greeting
greeting=Testing
# this is just a comment, the output directory is a convention and used in many workflows to specify a relative output path
output_dir=results
# the output_prefix is a convention and used to specify the root of the absolute output path or an S3 bucket name
output_prefix=./

Running Plugin: net.sourceforge.seqware.pipeline.plugins.BundleManager
Setting Up Plugin:
=====================================================
===============INSTALLED WORKFLOWS===================
=====================================================
Name Version Creation Date SeqWare Accession Bundle Location
FileImport Wed Jan 04 13:51:00 EST null
First Fri Nov 30 16:17:31 EST null
HelloWorldWorkflow 1.0 Wed Aug 15 19:00:11 EDT “/home/seqware/SeqWare/released-bundles/Workflow_Bundle_HelloWorldWorkflow_1.0_SeqWare_ zip”
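The configuration shown on this slide pairs each setting with an annotation comment of the form `# key=...:type=...:display=...`. The Python sketch below reads that layout as it appears on the slide. It is an illustration only, not SeqWare's own parser; the function names and the bundle directory passed to `resolve` are assumptions.

```python
from string import Template

SAMPLE_INI = """\
# key=input_file:type=file:display=F:file_meta_type=text/plain
input_file=${workflow_bundle_dir}/bundle_hello_world/0.9.0/data/input.txt
# key=greeting:type=text:display=T:display_name=Greeting
greeting=Testing
# this is just a comment
output_dir=results
output_prefix=./
"""


def parse_workflow_ini(text):
    """Return (values, annotations) from annotated key=value text."""
    values, annotations = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            body = line[1:].strip()
            # Only comments that begin with key= carry metadata;
            # any other comment is ignored.
            if body.startswith("key="):
                fields = dict(part.split("=", 1)
                              for part in body.split(":") if "=" in part)
                annotations[fields["key"]] = fields
            continue
        if "=" in line:
            key, value = line.split("=", 1)
            values[key.strip()] = value.strip()
    return values, annotations


def resolve(value, variables):
    """Expand ${name} placeholders; unknown names are left untouched."""
    return Template(value).safe_substitute(variables)


values, annotations = parse_workflow_ini(SAMPLE_INI)
# "/tmp/bundles" is a hypothetical location for the workflow bundle.
input_path = resolve(values["input_file"], {"workflow_bundle_dir": "/tmp/bundles"})
print(input_path)
print(annotations["greeting"])
```

Separating parsing from placeholder expansion keeps the raw file contents available for logging alongside the resolved paths.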

QC modules: prior to alignment
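The slide does not say which checks the QC modules run. As one common example of a pre-alignment check, the Python sketch below computes the mean base quality of each read in FASTQ text and flags low-quality reads. It assumes Phred+33 quality encoding and four-line FASTQ records; the function names, the threshold, and the two reads are made up for illustration.

```python
def mean_phred(quality_line, offset=33):
    """Mean Phred score of one quality string (Phred+33 assumed)."""
    scores = [ord(ch) - offset for ch in quality_line]
    return sum(scores) / len(scores)


def parse_fastq(text):
    """Return (read id, sequence, quality) tuples from four-line FASTQ records."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    records = []
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = lines[i:i + 4]
        records.append((header[1:], seq, qual))  # drop the leading '@'
    return records


def flag_low_quality(records, min_mean=20):
    """Return ids of reads whose mean quality falls below min_mean."""
    return [rid for rid, _seq, qual in records if mean_phred(qual) < min_mean]


# Two made-up reads: 'I' encodes Phred 40 and '!' encodes Phred 0.
EXAMPLE_FASTQ = """\
@read1
ACGT
+
IIII
@read2
ACGT
+
!!!I
"""

example_records = parse_fastq(EXAMPLE_FASTQ)
print(flag_low_quality(example_records))
```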

Let’s take a look at a VERY brief example on my SeqWare virtual machine.