Oleksiy Karpenko, Neil J. Bahroos Galaxy Community Conference Chicago

Presentation transcript:

Galaxy Pipeline for Faster Whole Genome Genotype Calling on the GeneTitan Platform
Oleksiy Karpenko, Neil J. Bahroos
Galaxy Community Conference, Chicago, July 26, 2012

Overview
What is Affymetrix GeneTitan?
Axiom® Genome-Wide Population-Optimized Human Arrays
Affymetrix Power Tools (APT) vs Affymetrix Genotyping Console
APT in Galaxy at UIC
Galaxy workflow for Axiom® arrays
Future work

Affymetrix GeneTitan
Automated high-throughput solution for monitoring gene expression and genome-wide SNP genotyping
16-, 24- and 96-format array plates
On campus at RRC Core Genomics Facility, College of Medicine West (CMW), Room A-301

AXIOM® GENOME-WIDE POPULATION-OPTIMIZED HUMAN ARRAYS
One Axiom Genome-Wide PanAFR Plate for 96 samples
~700K SNPs per plate
One Axiom GW (~2.1M) PanAFR Array Set is a set of 3 plates
Also available: CEU (European), ASI (Asian), CHB (Chinese Beijing), and custom myDesign plates

POWER TOOLS VS GENOTYPING CONSOLE Affymetrix Power Tools Genotyping Console

AFFYMETRIX POWER TOOLS IN GALAXY
apt-geno-qc for quality control
apt-probeset-genotype for genotype calling
Write calls in PLINK format
SNP metrics

MINOR CUSTOMIZATIONS
Registered the ".CEL.TAR" file format for batch uploads
Modified the Galaxy upload tool for better handling of custom binary files
Registered APT ".calls.txt", ".report.txt", "snp-posteriors.txt" file formats
Registered PLINK formats ".tped", "tmap"

#1. Upload a tarball of .CEL files
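A minimal sketch of what unpacking such a batch upload might look like, using only Python's standard `tarfile` module. The helper name `extract_cel_files` and the choice to flatten everything into one directory are assumptions for illustration; the slides do not describe how the Galaxy tool handles the archive internally.

```python
import os
import tarfile


def extract_cel_files(tar_path, out_dir):
    """Extract only the .CEL members of a tarball into out_dir (flattened).

    Hypothetical helper; returns the sorted list of extracted file paths.
    """
    os.makedirs(out_dir, exist_ok=True)
    extracted = []
    with tarfile.open(tar_path) as tar:  # compression is auto-detected
        for member in tar.getmembers():
            name = os.path.basename(member.name)
            # Skip directories, links, and anything that is not a .CEL file.
            if not member.isfile() or not name.upper().endswith(".CEL"):
                continue
            target = os.path.join(out_dir, name)
            # Copy the bytes ourselves so archive paths cannot escape out_dir.
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                dst.write(src.read())
            extracted.append(target)
    return sorted(extracted)
```

Matching the extension case-insensitively is a design choice here, so that both `.CEL` and `.cel` files are picked up.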

#2. Perform QC for each .CEL file

#3, #4. Keep only .CEL files with DQC above 0.82
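The DQC filter can be sketched in Python as follows. This assumes the QC report is a tab-separated table with `#`-prefixed comment lines, and that the columns are named `cel_files` and `axiom_dishqc_DQC`; both names are assumptions about the report layout and are exposed as parameters so they can be changed. Whether a value exactly at the threshold passes is also an assumption.

```python
import csv
import io


def cels_passing_dqc(report_text, threshold=0.82,
                     name_col="cel_files", dqc_col="axiom_dishqc_DQC"):
    """Return CEL file names whose DQC is at or above threshold.

    Column names are assumed, not taken from the slides.
    """
    # Drop blank lines and '#' comment lines, then parse the rest as TSV.
    lines = [ln for ln in report_text.splitlines()
             if ln and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    return [row[name_col] for row in reader
            if float(row[dqc_col]) >= threshold]


# Made-up example data, for illustration only.
sample = "#%comment\ncel_files\taxiom_dishqc_DQC\na.CEL\t0.95\nb.CEL\t0.70\n"
print(cels_passing_dqc(sample))  # ['a.CEL']
```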

#5. First run of genotyping

#6, #7. Keep only .CEL files with call rate above 97%
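A sketch of how the samples that pass the call-rate filter might be handed to the second genotyping run. It assumes call rates are percentages keyed by CEL file name, and that the list file begins with a `cel_files` header line; that layout is my understanding of what APT expects for a CEL list and should be treated as an assumption. The function name `write_cel_list` is hypothetical.

```python
def write_cel_list(call_rates, path, min_call_rate=97.0):
    """Write CEL names with call rate >= min_call_rate (percent) to path.

    call_rates: dict mapping CEL file name -> call rate in percent.
    Returns the sorted list of names that were kept.
    """
    # Treating a value exactly at the threshold as passing is an assumption.
    kept = sorted(name for name, rate in call_rates.items()
                  if rate >= min_call_rate)
    with open(path, "w") as fh:
        fh.write("cel_files\n")  # assumed header line for the list file
        for name in kept:
            fh.write(name + "\n")
    return kept
```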

#8. Second run of genotyping

#9. Calculate different SNP metrics
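The slide does not say which SNP metrics are calculated. As an illustration only, the sketch below computes two common ones, per-SNP call rate and minor allele frequency, from integer call codes. The encoding (-1 = no call, 0 = AA, 1 = AB, 2 = BB) is an assumption about the calls file, and `snp_metrics` is a hypothetical name.

```python
def snp_metrics(calls):
    """Compute call rate and minor allele frequency for one SNP.

    calls: iterable of integer codes, assumed to mean
           -1 = no call, 0 = AA, 1 = AB, 2 = BB.
    """
    calls = list(calls)
    called = [c for c in calls if c in (0, 1, 2)]
    call_rate = len(called) / len(calls) if calls else 0.0
    if not called:
        return {"call_rate": call_rate, "maf": None}
    # The code equals the number of B alleles; each sample carries 2 alleles.
    b_freq = sum(called) / (2 * len(called))
    return {"call_rate": call_rate, "maf": min(b_freq, 1 - b_freq)}


print(snp_metrics([0, 0, 1, 0]))  # {'call_rate': 1.0, 'maf': 0.125}
```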

#10. Upload phenotypes

#11. Output genotypes in PLINK format
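A sketch of the conversion to one row of a PLINK transposed pedigree (.tped) file: chromosome, SNP identifier, genetic distance, base-pair position, then two alleles per sample, with `0 0` for a missing genotype. The call-code encoding (-1 = no call, 0 = AA, 1 = AB, 2 = BB) and the availability of chromosome, position, and allele annotation for each SNP are assumptions; `tped_line` is a hypothetical helper, not the tool used in the workflow.

```python
def tped_line(chrom, snp_id, position, allele_a, allele_b, calls, cm=0):
    """Build one PLINK .tped row from per-sample call codes."""
    genotype = {
        0: (allele_a, allele_a),
        1: (allele_a, allele_b),
        2: (allele_b, allele_b),
    }
    fields = [str(chrom), snp_id, str(cm), str(position)]
    for code in calls:
        # Any other code (e.g. -1) is written as the missing genotype "0 0".
        fields.extend(genotype.get(code, ("0", "0")))
    return " ".join(fields)


print(tped_line(1, "rs1", 1000, "A", "G", [0, 1, 2, -1]))
# 1 rs1 0 1000 A A A G G G 0 0
```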

Future work
Integrate a few more Affymetrix Power Tools into Galaxy
Publish the APT tools in Galaxy Tool Shed
Better support for binary/compressed data files
Publish the APT genotyping workflow

Acknowledgements
Research Resources Center and Center for Clinical and Translational Science at UIC
Dr. Damir Herman, Affymetrix
Morris Chukhman, Tommie Jackson, and the team at the Center for Research Informatics at UIC
Dr. Rick Kittles, UIC Institute of Human Genetics
Dr. Zarema Arbieva, UIC Core Genomics Facility