SCIENCE-DRIVEN INFORMATICS FOR PCORI PPRN Kristen Anton UNC Chapel Hill/ White River Computing Dan Crichton White River Computing February 3, 2014.

Slides:



Advertisements
Similar presentations
CVRG Presenter Disclosure Information Tahsin Kurc, PhD Center for Comprehensive Informatics Emory University CardioVascular Research Grid Core Infrastructure.
Advertisements

ASCR Data Science Centers Infrastructure Demonstration S. Canon, N. Desai, M. Ernst, K. Kleese-Van Dam, G. Shipman, B. Tierney.
Presentation at WebEx Meeting June 15,  Context  Challenge  Anticipated Outcomes  Framework  Timeline & Guidance  Comment and Questions.
Information Systems Analysis and Design
National Aeronautics and Space Administration Jet Propulsion Laboratory California Institute of Technology Pasadena, California Facilitating Distributed.
Aug. 20, JPL, SoCalBSI '091 The power of bioinformatics tools in cancer research Early Detection Research Network, JPL Mentors: Dr. Chris Mattmann,
Managing Data Resources
Modified from Sommerville’s originalsSoftware Engineering, 7th edition. Chapter 8 Slide 1 System models.
Modified from Sommerville’s originalsSoftware Engineering, 7th edition. Chapter 8 Slide 1 System models.
Copyright 2004 Prentice-Hall, Inc. Essentials of Systems Analysis and Design Second Edition Joseph S. Valacich Joey F. George Jeffrey A. Hoffer Chapter.
Lecture Nine Database Planning, Design, and Administration
The Software Product Life Cycle. Views of the Software Product Life Cycle  Management  Software engineering  Engineering design  Architectural design.
Architectural Design Establishing the overall structure of a software system Objectives To introduce architectural design and to discuss its importance.
LEVERAGING THE ENTERPRISE INFORMATION ENVIRONMENT Louise Edmonds Senior Manager Information Management ACT Health.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
System Design/Implementation and Support for Build 2 PDS Management Council Face-to-Face Mountain View, CA Nov 30 - Dec 1, 2011 Sean Hardman.
Development Principles PHIN advances the use of standard vocabularies by working with Standards Development Organizations to ensure that public health.
©Ian Sommerville 2000 Software Engineering, 6th edition. Chapter 7 Slide 1 System models l Abstract descriptions of systems whose requirements are being.
What is Software Architecture?
Copyright 2001 Prentice-Hall, Inc. Essentials of Systems Analysis and Design Joseph S. Valacich Joey F. George Jeffrey A. Hoffer Chapter 1 The Systems.
The Design Discipline.
CceHUB A Knowledge Discovery Environment for Cancer Care Engineering Research Ann Christine Catlin HUBzero Workshop November 7, 2008.
Department of Biomedical Informatics Service Oriented Bioscience Cluster at OSC Umit V. Catalyurek Associate Professor Dept. of Biomedical Informatics.
Database System Concepts and Architecture
Using the Open Metadata Registry (openMDR) to create Data Sharing Interfaces October 14 th, 2010 David Ervin & Rakesh Dhaval, Center for IT Innovations.
1 Technologies for distributed systems Andrew Jones School of Computer Science Cardiff University.
The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Dataset Caitlin Minteer & Kelly Clynes.
Introduction to Apache OODT Yang Li Mar 9, What is OODT Object Oriented Data Technology Science data management Archiving Systems that span scientific.
Page 1 Informatics Pilot Project EDRN Knowledge System Working Group San Antonio, Texas January 21, 2001 Steve Hughes Thuy Tran Dan Crichton Jet Propulsion.
By Xiangzhe Li Thanh Nguyen.  Introduction  Terminology  Architecture  Component  Connector  Configuration  Architectural Style  Architectural.
1 A National Virtual Specimen Database for Early Cancer Detection June 26, 2003 Daniel Crichton NASA Jet Propulsion Laboratory Sean Kelly NASA Jet Propulsion.
EDRN Biomarker Review Process Heather Kincaid, Andrew Hart, Kristen Anton, Mark Thornquist, Dan Crichton and Christos Patriotis.
1 Schema Registries Steven Hughes, Lou Reich, Dan Crichton NASA 21 October 2015.
XML Web Services Architecture Siddharth Ruchandani CS 6362 – SW Architecture & Design Summer /11/05.
©Ian Sommerville 2000 Software Engineering, 6th edition. Chapter 10Slide 1 Architectural Design l Establishing the overall structure of a software system.
Chapter 6 Supporting Knowledge Management through Technology
5 - 1 Copyright © 2006, The McGraw-Hill Companies, Inc. All rights reserved.
©Ferenc Vajda 1 Semantic Grid Ferenc Vajda Computer and Automation Research Institute Hungarian Academy of Sciences.
Grid Computing & Semantic Web. Grid Computing Proposed with the idea of electric power grid; Aims at integrating large-scale (global scale) computing.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
CIS/SUSL1 Fundamentals of DBMS S.V. Priyan Head/Department of Computing & Information Systems.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
Information Architecture WG: Report of the Spring 2004 Meeting May 13, 2004 Dan Crichton, NASA/JPL.
CSC480 Software Engineering Lecture 10 September 25, 2002.
MODEL-BASED SOFTWARE ARCHITECTURES.  Models of software are used in an increasing number of projects to handle the complexity of application domains.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 8 Slide 1 System models.
1 Registry Services Overview J. Steven Hughes (Deputy Chair) Principal Computer Scientist NASA/JPL 17 December 2015.
Architecture View Models A model is a complete, simplified description of a system from a particular perspective or viewpoint. There is no single view.
Children’s Health Exposure Analysis Resource (CHEAR) CHEAR Center for Data Science Susan Teitelbaum, PhD November 4, 2015.
EDRN Biomarker Database Curation Web Interface and Model.
System/SDWG Update Management Council Face-to-Face Flagstaff, AZ August 22-23, 2011 Sean Hardman.
Object storage and object interoperability
Providing web services to mobile users: The architecture design of an m-service portal Minder Chen - Dongsong Zhang - Lina Zhou Presented by: Juan M. Cubillos.
E ARTHCUBE C ONCEPTUAL D ESIGN A Scalable Community Driven Architecture Overview PI:
CIMA and Semantic Interoperability for Networked Instruments and Sensors Donald F. (Rick) McMullen Pervasive Technology Labs at Indiana University
ECAS Curation John J. Tran. DATA “BLOB” –staging area ALVIN LIU WHI COLORADO SELDI, MALDI, & MISC PI BILL GRIZZLE Current Curation Process & Status *
Cyberinfrastructure Overview of Demos Townsville, AU 28 – 31 March 2006 CREON/GLEON.
Data Management: Data Processing Types of Data Processing at USGS There are several ways to classify Data Processing activities at USGS, and here are some.
Informatics and the caTissue Wrapper for the Early Detection Research Network Chris A. Mattmann, Ph.D. Senior Computer Scientist Instrument Software/ Science.
IPDA Architecture Project International Planetary Data Alliance IPDA Architecture Project Report.
Informatics for Scientific Data Bio-informatics and Medical Informatics Week 9 Lecture notes INF 380E: Perspectives on Information.
1 Advanced Software Architecture Muhammad Bilal Bashir PhD Scholar (Computer Science) Mohammad Ali Jinnah University.
Managing Data Resources File Organization and databases for business information systems.
Semantic Web - caBIG Abstract: 21st century biomedical research is driven by massive amounts of data: automated technologies generate hundreds of.
EOSC MODEL Pasquale Pagano CNR - ISTI
Bio68: Bioinformatics Databases
Chapter 1 Database Systems
Design Model Like a Pyramid Component Level Design i n t e r f a c d s
Chapter 1 Database Systems
Presentation transcript:

SCIENCE-DRIVEN INFORMATICS FOR PCORI PPRN Kristen Anton UNC Chapel Hill/ White River Computing Dan Crichton White River Computing February 3, 2014

White River Computing / UNC Chapel Hill From Information Design, Nathan Shedroff

Process Architecture – describes the core processes for the system Data Architecture – describes the information models and data standards for the system Application Architecture – Portals, tools, etc. Technology Architecture – Infrastructure elements White River Computing / UNC Chapel Hill Architecture – what is it? The fundamental organization of a system embodied in its components, their relationships to each other and to the environment, and the principles guiding its design and evolution (ANSI/IEEE Std ) Architecture: Architecture is decomposed into four core pieces :

Identify the drivers and requirements Create an architectural description of the system – identified stakeholders, concerns and associated models Identify core architectural principals Separate the architecture into key viewpoints Create a decomposition of the system identifying the elements and mapping to the requirements Identified the high-level flows and analyze from the rpocess, information and application/technology perspectives Generate the architectural models White River Computing / UNC Chapel Hill Architecture Development Approach

One of the major challenges is communicating an architecture Who are the PCORI stakeholders that care about the architecture? How do we communicate their care-abouts? White River Computing / UNC Chapel Hill Communicating an Architecture Determine a useful view of the system for the stakeholder Projects have suffered because a useful view wasn’t provided The viewpoint is where you look from The view is what you see (Stakeholders)

The organization, implementation and deployment of the software should follow the identification of an architecture which aligns with the principles and needs of the stakeholders The separation of the architecture into concerns will let us determine what capabilities exist and what capabilities need to be developed Ultimately this will help to ensure that a system is deployed which will integrate White River Computing / UNC Chapel Hill Software Development

White River Computing / UNC Chapel Hill Recommended Software Development Approach Project Formulation System Formulation/ Architecture Site Development Project Organization, Objectives, High Level Schedule and Project Plan High-Level Architecture for System and Data, Architecture, Data Flows, Initial Data Structure, etc Development and deployment of the infrastructure and architecture; development of the core data model/ consistent with PCORnet “universal” data model? Jan 2014 – Mar 2014 Feb 2014 – June 2014 June 2014 – June 2015

White River Computing / UNC Chapel Hill Supporting science-driven research needs: Case Study – Early Detection Research Network (EDRN) Research network of collaborating scientists from more than 40 institutions – international network of networks Focus on identifying and validating biomarkers of cancer at early stage/ preclinical Bioinformatics challenges in EDRN: Developing computing infrastructure that is “biomarker-centric.” Improve research capability by enabling real-time access to a variety of information that crosses institutional boundaries.

Coordinated discovery and validation of biomarkers across cancer research centers to increase accuracy of the results of studies Accommodating various data types Facilitation of analytics through data integration and single-point access Support workflows associated with various types of information Encouraging and supporting collaboration White River Computing / UNC Chapel Hill Bioinformatics – Goals Supporting science-driven research needs

White River Computing / UNC Chapel Hill Bioinformatics – Goals Supporting science-driven research needs Linking highly diverse systems together to integrate and present data for analytics Defining a comprehensive information model for describing the problem space/ ontology Providing software interfaces for capture, discovery, and access of data resources Providing a secure transfer and distribution infrastructure Enabling all data sources to be heterogeneous and distributed Providing integrated portal for access to distributed data Providing bioinformatics tools/ pipelines for uniform data processing

White River Computing / UNC Chapel Hill Bioinformatics EDRN Knowledge Environment Functional architecture: Services Data capture Data discovery Data access Data retrieval Data processing Data distribution

Biomarkers Studies Participants Organs Data generated from instruments (e.g. mass spec, arrays) White River Computing / UNC Chapel Hill Bioinformatics EDRN Knowledge Environment Information architecture: Data Model across EDRN projects (“universal” data model) Representation of information associated with data objects managed within the knowledge system Models for:

Relationships between and among objects Standard set of metadata elements that can be used for annotating objects Multiple metadata schemata for machine usable explanations of the metadata descriptions Metadata descriptions describe the inception and composition of data Common language for describing data and associated attributes: Common Data Elements (CDEs) CDE has a Uniform Resource Identifier (URI) – URL form points to CDE definition page – used in XML standards White River Computing / UNC Chapel Hill Bioinformatics EDRN Knowledge Environment Information architecture: Data Model

eCAS Science Warehouse CDE Repository ERNE VSIMS Participant DB Protocol DB Public Portal Distributed Specimen Databases EDRN science data results (local, distributed and varying degrees of validation) Descriptions of biomarkers and their use (protocol_id) Descriptions of EDRN studies -Participants -Specimen tracking, etc Protocols and their descriptions Data elements and their descriptions BIOINFORMATICS TOOLS EDRN science data results (protocol_id, participant_id) (protocol_id, participant_id) (protocol_id, participant_id) Biomarker_DB (protocol_id) Participants and their characteristics EDRN Knowledge Environment

Biomarker Database holds 850 curated biomarkers, including panels/ signatures of biomarkers Biomarker Database modeled to reflect the data model: activity in multiple organs, protocols, data files – facilitate single-point data access eSIS contains 165 protocols eCAS holds 56 data sets, with many files in each set, and more added daily – standard metadata around each set and each product Two bioinformatics tools implemented: Proteomics “pipeline” (generating standardized biomarker identification files); REDCap (standardized data definition and capture at the project level) – additional in progress Common Data Elements (CDEs) contributed to the NCI repository CDE has a Uniform Resource Identifier (URI) – URL form points to CDE definition page – used in XML standards Portal facilitates authorized access to almost 200,000 specimens Publications and Resources White River Computing / UNC Chapel Hill EDRN Knowledge Environment Success?

White River Computing / UNC Chapel Hill EDRN Knowledge Environment Technology Iterative development Open Source philosophy and tools Apache OODT (Object Oriented Data Technology) Software components developed independent of any data model: EDRN’s computing infrastructure can be replicated

White River Computing / UNC Chapel Hill EDRN Knowledge Environment Technology

White River Computing / UNC Chapel Hill Bioinformatics – Goals Supporting science-driven research needs: SHARE

Geisel School of Medicine at Dartmouth / UNC Chapel Hill Bioinformatics – Goals Supporting science-driven research needs: SHARE

Geisel School of Medicine at Dartmouth / UNC Chapel Hill Supporting science-driven research needs: PCORI PPRN

Geisel School of Medicine at Dartmouth / UNC Chapel Hill Opportunity to offer our architecture to PCORnet? Synergy in data model Query across CCFA PPRN network …network of networks?