Cloud Implementation of GT-FAR (Genome and Transcriptome-Free Analysis of RNA-Seq) University of Southern California.



GT-FAR Pipeline

GT-FAR Components

1. Read quality control and adaptor trimming for the input read file
2. Sequential ungapped mapping to reference gene models/genome*
3. Gapped alignment to reference gene models/genome to facilitate splice variant prediction*
4. Sample quantification
   a) A reference-based version covering gene/junction/exon/pre-mRNA expression*
   b) A reference-free quantification of read/k-mer sequences
5. Output
   a) Quantification data, visualization, and an alignment SAM file for further analysis
   b) Capable of including >99% in reference-based output in high-quality human samples

* When a reference genome and GTF file are available. If one is not available, only the sequence/k-mer based analysis (4b) is performed.
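The reference-free quantification in step 4b can be pictured as counting k-mers directly from the reads. The following is a minimal sketch under stated assumptions, not GT-FAR's actual implementation: the function name `count_kmers`, the choice to skip k-mers containing ambiguous bases, and the toy reads are all hypothetical.

```python
from collections import Counter

VALID_BASES = set("ACGT")

def count_kmers(reads, k):
    """Count every length-k substring across an iterable of read sequences."""
    counts = Counter()
    for read in reads:
        seq = read.strip().upper()
        # A read shorter than k yields an empty range and contributes nothing.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: k-mers with ambiguous bases (e.g. N) are skipped.
            if set(kmer) <= VALID_BASES:
                counts[kmer] += 1
    return counts

# Toy reads, invented for illustration only.
reads = ["ACGTAC", "CGTACG", "ACNTAC"]
kmer_counts = count_kmers(reads, k=3)
```

A table of this kind needs no genome or GTF file, which is why this mode remains available when no reference exists.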

Pegasus WMS on the Cloud

- Allows scientists to design an analysis at a high level without worrying about how to invoke or execute it
- Provides Python, Java, and Perl APIs for workflow creation
- Automatically executes computations on computational resources available to the community or to an individual
- When failures occur, it tries to recover from them using a variety of mechanisms
- Records provenance
- Used in a number of domains: astronomy, bioinformatics, earthquake science, helioseismology, gravitational-wave physics, seismology, etc.
- Detailed documentation on workflow design and execution is available
- Pegasus tutorial on Amazon AWS
- User support available
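The ideas in this list (a high-level description of dependent tasks, automatic ordering, retry on failure, and a provenance record) can be illustrated with plain Python. This sketch deliberately does not use the Pegasus API; `run_workflow`, the task names, and the simulated failure are hypothetical, and a real workflow manager does considerably more (data staging, remote execution, checkpointing).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def run_workflow(tasks, dependencies, max_attempts=3):
    """Run tasks in dependency order, retrying each task that fails.

    tasks: dict mapping task name -> zero-argument callable
    dependencies: dict mapping task name -> set of prerequisite task names
    Returns a provenance log of (task, attempt, status) tuples.
    """
    provenance = []
    # Include tasks that have no listed prerequisites.
    graph = {name: set(dependencies.get(name, ())) for name in tasks}
    for name in TopologicalSorter(graph).static_order():
        task = tasks[name]
        for attempt in range(1, max_attempts + 1):
            try:
                task()
                provenance.append((name, attempt, "ok"))
                break
            except Exception:
                provenance.append((name, attempt, "failed"))
        else:
            # The loop finished without a successful attempt.
            raise RuntimeError(f"task {name!r} failed after {max_attempts} attempts")
    return provenance

# Hypothetical three-stage pipeline; the mapping step fails once, then succeeds.
attempts = {"map": 0}

def flaky_map():
    attempts["map"] += 1
    if attempts["map"] == 1:
        raise IOError("simulated transient failure")

tasks = {"trim": lambda: None, "map": flaky_map, "quantify": lambda: None}
dependencies = {"map": {"trim"}, "quantify": {"map"}}
log = run_workflow(tasks, dependencies)
```

The returned log plays the role of a very small provenance record: it shows which task ran, on which attempt, and with what outcome.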

GT-FAR Cloud Based Pipeline

The GT-FAR pipeline is available as a cloud-based solution hosted on Amazon EC2. The pipeline is executed on distributed resources using the Pegasus Workflow Management System.

Capabilities
- Investigators can start an EC2 instance with a GUI/GT-FAR
- Users can upload input files (FastQ file in gzip format) using a web browser
- Tracks running workflows
- Users can download the outputs to their local laptops
- Outputs are also made available in Amazon S3
- Allows for error reporting and debugging
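Because the pipeline expects a gzip-compressed FastQ file, it can be useful to check a file locally before uploading it. The sketch below is an assumption-laden illustration, not part of GT-FAR: `count_fastq_records` is a hypothetical helper, it assumes the common four-line record layout (no wrapped sequences), and the demo file is written to a temporary directory only so the example is self-contained.

```python
import gzip
import os
import tempfile

def count_fastq_records(path):
    """Count records in a gzip-compressed FASTQ file, checking basic structure."""
    records = 0
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate blank lines between or after records
            seq = handle.readline().rstrip("\n")
            plus = handle.readline()
            qual = handle.readline().rstrip("\n")
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed record {records + 1}")
            if len(seq) != len(qual):
                raise ValueError(f"sequence/quality length mismatch in record {records + 1}")
            records += 1
    return records

# Demo with two invented records.
demo = "@r1\nACGT\n+\nIIII\n@r2\nGGCA\n+\nIIII\n"
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.fastq.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write(demo)
    n_records = count_fastq_records(demo_path)
```

A file that is not actually gzip-compressed makes `gzip.open` raise an `OSError` on the first read, so the same call also catches the most common upload mistake.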

GTFAR Success

GTFAR Failure

Expression of APOL1

APOL1 has moderate expression.
- We can notice that it all comes from a few exons and matching junctions
- Hence, it is driven by a single transcript

RNA-seq Analysis Workflows

GT-FAR (Read-based RNA-seq Analysis)
- New Functions: Novel Splice Junctions, Reference-free analysis
- Pegasus WMS:
- Pegasus GT-FAR (genome and transcriptome free analysis of RNA):
- Pegasus tutorial on Amazon AWS
- GitHub:

RseqFlow (Standard RNA-seq Analysis)
- Command line based
- Functions: RPKM, Differential Expression, Variants
- Google:
- GitHub:
- SourceForge:
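RPKM, listed among the RseqFlow functions, is the standard "reads per kilobase of feature per million mapped reads" normalization. The sketch below shows the usual formula; it is not RseqFlow's code, and the function name `rpkm` and the example numbers are invented for illustration.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads.

    RPKM = read_count * 1e9 / (feature_length_bp * total_mapped_reads)
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("feature length and total mapped reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)

# Invented example: 500 reads on a 2,000 bp gene in a 10-million-read library.
value = rpkm(read_count=500, feature_length_bp=2000, total_mapped_reads=10_000_000)
```

Dividing by both feature length and library size is what makes values comparable across genes of different lengths and across samples sequenced to different depths.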