Computational Infrastructure for Systems Genetics Analysis Brian Yandell, UW-Madison high-throughput analysis of systems data enable biologists & analysts.

Slides:



Advertisements
Similar presentations
Panel 2 – Promoting Re-Use of Scientific Collections John Harrison SHAMAN Project University of Liverpool
Advertisements

Working Together to Find Solutions NISO E-Resource Management Forum Sandy Hurd Innovative Interfaces Director of Strategic Markets September 25, 2007.
Enerjetic Strengths Enerjetic is not marketed as a technology company, we are a data company. One that identifies business value through data and delivers.
Unveiling ProjectWise V8 XM Edition. ProjectWise V8 XM Edition An integrated system of collaboration servers that enable your AEC project teams, your.
Inferring Causal Phenotype Networks Elias Chaibub Neto & Brian S. Yandell UW-Madison June 2010 QTL 2: NetworksSeattle SISG: Yandell ©
Summary Role of Software (1 slide) ARCS Software Architecture (4 slides) SNS -- Caltech Interactions (3 slides)
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
© Copyright 2011 John Wiley & Sons, Inc.
The Cactus Portal A Case Study in Grid Portal Development Michael Paul Russell Dept of Computer Science The University of Chicago
Office of Science U.S. Department of Energy Grids and Portals at NERSC Presented by Steve Chan.
Software Architecture Patterns (2). what is architecture? (recap) o an overall blueprint/model describing the structures and properties of a "system"
Computational Infrastructure for Systems Genetics Analysis Brian Yandell, UW-Madison high-throughput analysis of systems data enable biologists & analysts.
Web Application Architecture: multi-tier (2-tier, 3-tier) & mvc
UNIT-V The MVC architecture and Struts Framework.
Annual SERC Research Review - Student Presentation, October 5-6, Extending Model Based System Engineering to Utilize 3D Virtual Environments Peter.
LIGO-G E ITR 2003 DMT Sub-Project John G. Zweizig LIGO/Caltech Argonne, May 10, 2004.
Bayesian causal phenotype network incorporating genetic variation and biological knowledge Brian S Yandell, Jee Young Moon University of Wisconsin-Madison.
Data Management Subsystem: Data Processing, Calibration and Archive Systems for JWST with implications for HST Gretchen Greene & Perry Greenfield.
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Collaborating with iPlant.
Miguel Branco CERN/University of Southampton Enabling provenance on large-scale e-Science applications.
Breaking Up is Hard to Do: Migrating R Project to CHTC Brian S. Yandell UW-Madison 22 November 2013.
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Collaborating with iPlant.
ARCH-4: The Presentation Layer in the OpenEdge® Reference Architecture Frank Beusenberg Senior Technical Consultant.
Plan  Introduction  What is Cloud Computing?  Why is it called ‘’Cloud Computing’’?  Characteristics of Cloud Computing  Advantages of Cloud Computing.
© 2004, The Trustees of Indiana University Kuali Project Development Methodology, Architecture, and Standards James Thomas, Kuali Project Manager Brian.
Computational Infrastructure for Systems Genetics Analysis Brian Yandell, UW-Madison high-throughput analysis of systems data enable biologists & analysts.
© 2012 xtUML.org Bill Chown – Mentor Graphics Model Driven Engineering.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Code Applications Tamas Kiss Centre for Parallel.
May 7, 2003 Command and Control Visualization NAVCIITI Tasks 2.1b.
CLUSTER COMPUTING TECHNOLOGY BY-1.SACHIN YADAV 2.MADHAV SHINDE SECTION-3.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
Computing Ontology Part II. So far, We have seen the history of the ACM computing classification system – What have you observed? – What topics from CS2013.
Open Source Software This permits users to use, change, and improve the software, and to redistribute it in modified or unmodified forms. It is very often.
Developed at the Broad Institute of MIT and Harvard Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, and Mesirov JP. GenePattern 2.0. Nature Genetics 38.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Applications.
GCRC Meeting 2004 BIRN Coordinating Center Software Development Vicky Rowley.
Near Real-Time Verification At The Forecast Systems Laboratory: An Operational Perspective Michael P. Kay (CIRES/FSL/NOAA) Jennifer L. Mahoney (FSL/NOAA)
Having a Blast! on DiaGrid Carol Song Rosen Center for Advanced Computing December 9, 2011.
A scalable and flexible platform to run various types of resource intensive applications on clouds ISWG June 2015 Budapest, Hungary Tamas Kiss,
© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 1 Automate your way to.
Software Engineering Chapter: Computer Aided Software Engineering 1 Chapter : Computer Aided Software Engineering.
NeuroLOG ANR-06-TLOG-024 Software technologies for integration of process and data in medical imaging A transitional.
CLOUD COMPUTING RICH SANGPROM. What is cloud computing? “Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a.
Causal Network Models for Correlated Quantitative Traits Brian S. Yandell UW-Madison October Jax SysGen: Yandell.
Document Name CONFIDENTIAL Version Control Version No.DateType of ChangesOwner/ Author Date of Review/Expiry The information contained in this document.
CARBOOCEAN Data management and SOCAT Benjamin Pfeil, Are Olsen, Jeremy Malzcyk, Steve Hanhin, Alex Kozyr and many others Partner 16 WDC-MARE Partner 19.
Grid Execution Management for Legacy Code Architecture Exposing legacy applications as Grid services: the GEMLCA approach Centre.
Gene Mapping for Correlated Traits Brian S. Yandell University of Wisconsin-Madison Correlated TraitsUCLA Networks (c)
VIEWS b.ppt-1 Managing Intelligent Decision Support Networks in Biosurveillance PHIN 2008, Session G1, August 27, 2008 Mohammad Hashemian, MS, Zaruhi.
CINET Registry Aditya Agashe, Harshal Ganpatrao Hayatnagarkar and Sarang Joshi Mentored by Dr. Keith Bisset (NDSSL) CS 6604 – Digital Libraries Virginia.
Causal Network Models for Correlated Quantitative Traits Brian S. Yandell UW-Madison October Jax SysGen: Yandell.
J2EE Platform Overview (Application Architecture)
Quantile-based Permutation Thresholds for QTL Hotspots
PLM, Document and Workflow Management
VLEEM Software project
Computational Infrastructure for Systems Genetics Analysis
The Open Grid Service Architecture (OGSA) Standard for Grid Computing
Attie Bioinformatics Server Redesign
Systems Analysis and Design 5th Edition Chapter 8. Architecture Design
Causal Network Models for Correlated Quantitative Traits
Inferring Causal Phenotype Networks
Causal Network Models for Correlated Quantitative Traits
Inferring Causal Phenotype Networks Driven by Expression Gene Mapping
Integration of EGA secure data access into Galaxy
What's New in eCognition 9
Inferring Genetic Architecture of Complex Biological Processes Brian S
Overview Activities from additional UP disciplines are needed to bring a system into being Implementation Testing Deployment Configuration and change management.
What's New in eCognition 9
eQTL Tools a collaboration in progress
What's New in eCognition 9
Presentation transcript:

Computational Infrastructure for Systems Genetics Analysis Brian Yandell, UW-Madison high-throughput analysis of systems data enable biologists & analysts to share tools UW-Madison: Yandell,Attie,Broman,Kendziorski Jackson Labs: Churchill U Groningen: Jansen,Swertz UC-Denver: Tabakoff LabKey: Igra 1

typical workflow (from Mark Igra, LabKey) 3 File Storage Database Storage Cluster 1 Transfer files. 2 Configure Analysis 3 Bio-Web Server Run analysis and load results 4 Review results, repeat steps 2-4 as required

view results (R graphics, GenomeSpace tools) systems genetics portal (PhenoGen ) collaborative portal (LabKey) iterate many times get data (GEO, Sage) run pipeline (CLIO,XGAP,HTDAS) 4

data access model closedlimitedopen published shared collaboration in progress patent issues proprietary internal Sage Commons Company X

analysis pipeline acts on objects (extends concept of GenePattern) pipeline checks input output settings 7

pipeline is composed of many steps A i B C DE o D’ 8

causal model selection choices in context of larger, unknown network focal trait target trait focal trait target trait focal trait target trait focal trait target trait causal reactive correlated uncorrelated 9

BxH ApoE-/- chr 2: causal architecture hotspot 12 causal calls 10

BxH ApoE-/- causal network for transcription factor Pscdbp causal trait 11

view results (R graphics, GenomeSpace tools) systems genetics portal (PhenoGen ) collaborative portal (LabKey) iterate many times get data (GEO, Sage) develop analysis models & algorithms run pipeline (CLIO,XGAP,HTDAS) update periodically 12

platform for biologists and analysts create and extend pipeline steps share algorithms – public library – private authentication compare methods on one platform combine data from multiple studies 13

pipeline checks input output settings raw code version control system R&D 14

Model/View/Controller (MVC) software architecture isolate domain logic from input and presentation permit independent development, testing, maintenance 15 Controller Input/response View render for interaction Model domain-specific logic user changes system actions

perspectives for building a community where disease data and models are shared Benefits of wider access to datasets and models: 1- catalyze new insights on disease & methods 2- enable deeper comparison of methods & results Lessons Learned: 1- need quick feedback between biologists & analysts 2- involve biologists early in development 3- repeated use of pipelines leads to documented learning from experience increased rigor in methods Challenges Ahead: 1- stitching together components as coherent system 2- ramping up to ever larger molecular datasets 16

UW-Madison – Alan Attie – Christina Kendziorski – Karl Broman – Mark Keller – Andrew Broman – Aimee Broman – YounJeong Choi – Elias Chaibub Neto – Jee Young Moon – John Dawson – Ping Wang – NIH Grants DK58037, DK66369, GM74244, GM69430, EY18869 Jackson Labs (HTDAS) – Gary Churchill – Ricardo Verdugo – Keith Sheppard UC-Denver (PhenoGen) – Boris Tabakoff – Cheryl Hornbaker – Laura Saba – Paula Hoffman Labkey Software – Mark Igra U Groningen (XGA) – Ritsert Jansen – Morris Swertz – Pjotr Pins – Danny Arends Broad Institute – Jill Mesirov – Michael Reich 17

Systems Genetics Analysis Platform Brian Yandell, UW-Madison high-throughput analysis of systems data enable biologists & analysts to share tools UW-Madison: Attie, Broman,Kendziorski Jackson Labs: Churchill U Groningen: Jansen, Swertz UC-Denver: Tabakoff LabKey: Igra hotspot causal trait 18