Computational Infrastructure for Systems Genetics Analysis Brian Yandell, UW-Madison high-throughput analysis of systems data enable biologists & analysts.

Slides:



Advertisements
Similar presentations
Obesity e-Lab Enabling obesity research using the Health Surveys for England: The Obesity e-Lab project Dexter Canoy The University of Manchester
Advertisements

The use of Ontology in Organising and Managing Protein Family Resources Katy Wolstencroft, University Of Manchester.
Copyright © 2008, SAS Institute Inc. All rights reserved. Discovering Meaningful Patterns in Genomics Data with JMP Genomics Jordan Hiller JMP Genomics.
Statistical methods and tools for integrative analysis of perturbation signatures Mario Medvedovic Laboratory for Statistical Genomics and Systems Biology.
1 Harvard Medical School Mapping Transcription Mechanisms from Multimodal Genomic Data Hsun-Hsien Chang, Michael McGeachie, and Marco F. Ramoni Children.
Oncomine Database Lauren Smalls-Mantey Georgia Institute of Technology June 19, 2006 Note: This presentation contains animation.
Gene Set Enrichment Analysis (GSEA)
Inferring Causal Phenotype Networks Elias Chaibub Neto & Brian S. Yandell UW-Madison June 2010 QTL 2: NetworksSeattle SISG: Yandell ©
Expression Modules Brian S. Yandell (with slides from Steve Horvath, UCLA, and Mark Keller, UW-Madison)
Computational Infrastructure for Systems Genetics Analysis Brian Yandell, UW-Madison high-throughput analysis of systems data enable biologists & analysts.
Asa MacWilliams Lehrstuhl für Angewandte Softwaretechnik Institut für Informatik Technische Universität München Dec Software.
Summary Role of Software (1 slide) ARCS Software Architecture (4 slides) SNS -- Caltech Interactions (3 slides)
Teresa Przytycka NIH / NLM / NCBI RECOMB 2010 Bridging the genotype and phenotype.
Software Architecture Patterns (2). what is architecture? (recap) o an overall blueprint/model describing the structures and properties of a "system"
Seattle Summer Institute : Systems Genetics for Experimental Crosses Brian S. Yandell, UW-Madison Elias Chaibub Neto, Sage Bionetworks
Copyright 2006 Prentice-Hall, Inc. Essentials of Systems Analysis and Design Third Edition Joseph S. Valacich Joey F. George Jeffrey A. Hoffer Chapter.
Bayesian causal phenotype network incorporating genetic variation and biological knowledge Brian S Yandell, Jee Young Moon University of Wisconsin-Madison.
Cancer is heterogeneous disease! -> enabled characterization of new tumor subtypes for improving personalized treatment and ultimately achieving better.
Characterizing the role of miRNAs within gene regulatory networks using integrative genomics techniques Min Wenwen
Data Management Subsystem: Data Processing, Calibration and Archive Systems for JWST with implications for HST Gretchen Greene & Perry Greenfield.
Quantile-based Permutation Thresholds for QTL Hotspots Brian S Yandell and Elias Chaibub Neto 17 March © YandellMSRC5.
Beyond the Human Genome Project Future goals and projects based on findings from the HGP.
EGAN: Exploratory Gene Association Networks by Jesse Paquette Biostatistics and Computational Biology Core Helen Diller Family Comprehensive Cancer Center.
Professor Brian S. Yandell joint faculty appointment across colleges: –50% Horticulture (CALS) –50% Statistics (Letters & Sciences) Biometry Program –MS.
Monsanto: Yandell © Building Bridges from Breeding to Biometry and Biostatistics Brian S. Yandell Professor of Horticulture & Statistics Chair of.
Expression Modules Brian S. Yandell (with slides from Steve Horvath, UCLA, and Mark Keller, UW-Madison)
Breaking Up is Hard to Do: Migrating R Project to CHTC Brian S. Yandell UW-Madison 22 November 2013.
EADGENE and SABRE Post-Analyses Workshop 12-14th November 2008, Lelystad, Netherlands 1 François Moreews SIGENAE, INRA, Rennes Cytoscape.
Computational Infrastructure for Systems Genetics Analysis Brian Yandell, UW-Madison high-throughput analysis of systems data enable biologists & analysts.
Taverna Workflows for Systems Biology Katy Wolstencroft School of Computer Science University of Manchester.
Developed at the Broad Institute of MIT and Harvard Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, and Mesirov JP. GenePattern 2.0. Nature Genetics 38.
Analysis of GEO datasets using GEO2R Parthav Jailwala CCR Collaborative Bioinformatics Resource CCR/NCI/NIH.
Any data..! Any where..! Any time..! Linking Process and Content in a Distributed Spatial Production System Pierre Lafond HydraSpace Solutions Inc
Near Real-Time Verification At The Forecast Systems Laboratory: An Operational Perspective Michael P. Kay (CIRES/FSL/NOAA) Jennifer L. Mahoney (FSL/NOAA)
GeWorkbench John Watkinson Columbia University. geWorkbench The bioinformatics platform of the National Center for the Multi-scale Analysis of Genomic.
Expression Modules Brian S. Yandell (with slides from Steve Horvath, UCLA, and Mark Keller, UW-Madison) Modules/Pathways1SISG (c) 2012 Brian S Yandell.
NOVA A Networked Object-Based EnVironment for Analysis “Framework Components for Distributed Computing” Pavel Nevski, Sasha Vanyashin, Torre Wenaus US.
Yandell © June Bayesian analysis of microarray traits Arabidopsis Microarray Workshop Brian S. Yandell University of Wisconsin-Madison
October 2008BMI Chair Talk © Brian S. Yandell1 networking in biochemistry: building a mouse model of diabetes Brian S. Yandell, UW-Madison October 2008.
Causal Network Models for Correlated Quantitative Traits Brian S. Yandell UW-Madison October Jax SysGen: Yandell.
High Risk 1. Ensure productive use of GRID computing through participation of biologists to shape the development of the GRID. 2. Develop user-friendly.
13 October 2004Statistics: Yandell © Inferring Genetic Architecture of Complex Biological Processes Brian S. Yandell 12, Christina Kendziorski 13,
Gene Mapping for Correlated Traits Brian S. Yandell University of Wisconsin-Madison Correlated TraitsUCLA Networks (c)
JAX: Exploring The Galaxy Glen Beane, Senior Software Engineer.
Causal Network Models for Correlated Quantitative Traits Brian S. Yandell UW-Madison October Jax SysGen: Yandell.
J2EE Platform Overview (Application Architecture)
Quantile-based Permutation Thresholds for QTL Hotspots
Quantile-based Permutation Thresholds for QTL Hotspots
Professor Brian S. Yandell
A web portal for management of biological data and applications
Computational Infrastructure for Systems Genetics Analysis
Making “Open Data” Work: Challenges for Data Integration in Genomics Research
Attie Bioinformatics Server Redesign
Data challenges in the pharmaceutical industry
Graphical Diagnostics for Multiple QTL Investigation Brian S
Inferring Genetic Architecture of Complex Biological Processes BioPharmaceutical Technology Center Institute (BTCI) Brian S. Yandell University of Wisconsin-Madison.
Causal Network Models for Correlated Quantitative Traits
Inferring Genetic Architecture of Complex Biological Processes Brian S
Inferring Causal Phenotype Networks
Causal Network Models for Correlated Quantitative Traits
Causal Network Models for Correlated Quantitative Traits
High Throughput Gene Mapping Brian S. Yandell
Linking Genetic Variation to Important Phenotypes
Inferring Causal Phenotype Networks Driven by Expression Gene Mapping
What's New in eCognition 9
Inferring Genetic Architecture of Complex Biological Processes Brian S
R/qtlbim Software Demo
The first two principal components for the islet gene expression data for the 181 microarray probes that map to the chromosome 6 trans-eQTL hotspot with.
What's New in eCognition 9
eQTL Tools a collaboration in progress
Presentation transcript:

Computational Infrastructure for Systems Genetics Analysis Brian Yandell, UW-Madison high-throughput analysis of systems data enable biologists & analysts to share tools eQTL ToolsSeattle SISG: Yandell © 20121

UW-Madison –Alan Attie –Christina Kendziorski –Karl Broman –Mark Keller –Andrew Broman –Aimee Broman –YounJeong Choi –Elias Chaibub Neto –Jee Young Moon –John Dawson –Ping Wang –NIH Grants DK58037, DK66369, GM74244, GM69430, EY18869 Jackson Labs (HTDAS) –Gary Churchill –Ricardo Verdugo –Keith Sheppard UC-Denver (PhenoGen) –Boris Tabakoff –Cheryl Hornbaker –Laura Saba –Paula Hoffman Labkey Software –Mark Igra U Groningen (XGA) –Ritsert Jansen –Morris Swertz –Pjotr Pins –Danny Arends Broad Institute –Jill Mesirov –Michael Reich eQTL ToolsSeattle SISG: Yandell © 20122

eQTL ToolsSeattle SISG: Yandell © 2012 experimental context B6 x BTBR obese mouse cross –model for diabetes and obesity –500+ mice from intercross (F2) –collaboration with Rosetta/Merck genotypes –5K SNP Affymetrix mouse chip –care in curating genotypes! (map version, errors, …) phenotypes –clinical phenotypes (>100 / mouse) –gene expression traits (>40,000 / mouse / tissue) –other molecular phenotypes 3

eQTL ToolsSeattle SISG: Yandell © 2012 how does one filter traits? want to reduce to “manageable” set –10/100/1000: depends on needs/tools –How many can the biologist handle? how can we create such sets? –data-driven procedures correlation-based modules –Zhang & Horvath 2005 SAGMB, Keller et al Genome Res –Li et al Hum Mol Gen mapping-based focus on genome region –function-driven selection with database tools GO, KEGG, etc Incomplete knowledge leads to bias –random sample 4

eQTL ToolsSeattle SISG: Yandell © 2012 why build Web eQTL tools? common storage/maintainence of data –one well-curated copy –central repository –reduce errors, ensure analysis on same data automate commonly used methods –biologist gets immediate feedback –statistician can focus on new methods –codify standard choices 5

eQTL ToolsSeattle SISG: Yandell © 2012 how does one build tools? no one solution for all situations use existing tools wherever possible –new tools take time and care to build! –downloaded databases must be updated regularly human component is key –need informatics expertise –need continual dialog with biologists build bridges (interfaces) between tools –Web interface uses PHP –commands are created dynamically for R continually rethink & redesign organization 6

perspectives for building a community where disease data and models are shared Benefits of wider access to datasets and models: 1- catalyze new insights on disease & methods 2- enable deeper comparison of methods & results Lessons Learned: 1- need quick feedback between biologists & analysts 2- involve biologists early in development 3- repeated use of pipelines leads to documented learning from experience increased rigor in methods Challenges Ahead: 1- stitching together components as coherent system 2- ramping up to ever larger molecular datasets eQTL ToolsSeattle SISG: Yandell © 20127

eQTL ToolsSeattle SISG: Yandell © 2012 Swertz & Jansen (2007) 8

view results (R graphics, GenomeSpace tools) systems genetics portal (PhenoGen ) collaborative portal (LabKey) iterate many times get data (GEO, Sage) run pipeline (CLIO,XGAP,HTD AS) eQTL ToolsSeattle SISG: Yandell © 20129

analysis pipeline acts on objects (extends concept of GenePattern) pipeline check s inpu t outpu t setting s eQTL ToolsSeattle SISG: Yandell ©

pipeline is composed of many steps A I B C D E’ O’O’ D’ E O compare methods alternative path I’ combine datasets A’ eQTL ToolsSeattle SISG: Yandell ©

causal model selection choices in context of larger, unknown network focal trait target trait focal trait target trait focal trait target trait focal trait target trait causal reactive correlated uncorrelated eQTL ToolsSeattle SISG: Yandell ©

BxH ApoE-/- chr 2: causal architecture hotspot 12 causal calls eQTL ToolsSeattle SISG: Yandell ©

BxH ApoE-/- causal network for transcription factor Pscdbp causal trait work of Elias Chaibub Neto eQTL ToolsSeattle SISG: Yandell ©

view results (R graphics, GenomeSpace tools) systems genetics portal (PhenoGen ) collaborative portal (LabKey) iterate many times get data (GEO, Sage) develop analysis methods & algorithms run pipeline (CLIO,XGAP,HTD AS) update periodically eQTL ToolsSeattle SISG: Yandell ©

pipeline check s inpu t outpu t setting s raw code preserv e history R&Dpackage eQTL ToolsSeattle SISG: Yandell ©

Model/View/Controller (MVC) software architecture isolate domain logic from input and presentation permit independent development, testing, maintenance Controller Input/response View render for interaction Model domain-specific logic user changes system actions eQTL ToolsSeattle SISG: Yandell ©

eQTL ToolsSeattle SISG: Yandell ©

eQTL ToolsSeattle SISG: Yandell ©

eQTL ToolsSeattle SISG: Yandell ©

eQTL ToolsSeattle SISG: Yandell © 2012 automated R script library('B6BTBR07') out <- multtrait(cross.name='B6BTBR07', filename = 'scanone_ csv', category = 'islet', chr = c(17), threshold.level = 0.05, sex = 'both',) sink('scanone_ txt') print(summary(out)) sink() bitmap('scanone_ %03d.bmp', height = 12, width = 16, res = 72, pointsize = 20) plot(out, use.cM = TRUE) dev.off() 21

eQTL ToolsSeattle SISG: Yandell ©

eQTL ToolsSeattle SISG: Yandell ©

eQTL ToolsSeattle SISG: Yandell ©

eQTL ToolsSeattle SISG: Yandell ©

eQTL ToolsSeattle SISG: Yandell ©