Page 1 Informatics Pilot Project EDRN Knowledge System Working Group San Antonio, Texas January 21, 2001 Steve Hughes Thuy Tran Dan Crichton Jet Propulsion.

Presentation transcript:

Page 1 Informatics Pilot Project
EDRN Knowledge System Working Group
San Antonio, Texas, January 21, 2001
Steve Hughes, Thuy Tran, Dan Crichton
Jet Propulsion Laboratory, California Institute of Technology, National Aeronautics and Space Administration

Page 2 Problem Definition and Proposal
- Overview
  - Problem: Specimen data is geographically distributed across heterogeneous data systems making the location, retrieval and use of this data difficult.
  - Solution: Build a “data architecture” for the EDRN network
    - Use “metadata” as a key to interoperability
    - Provide services for data sharing, archiving and distribution
    - Provide a software framework that allows analysis tools to be plugged into the EDRN data enterprise
  - Benefit: Correlating data across multiple centers affords an opportunity to create new data sets and data awareness
    - Example: Find all prostate tissue samples for men ages 70 and older collected before 1980 from databases across the EDRN

Page 3 EDRN Data Architecture Evolution
Data System Evolution
- Local Database
  - Local Tools
  - No Data Sharing between Centers
  - No Common Data Elements
- Limited Data Sharing
  - Manual Data Sharing
  - Manual Correlation
  - Export/Import Data
  - Limited CDEs
- Full Data Sharing
  - Location Independence
  - Data Interchange
  - Data Sharing
  - Common CDEs between centers
  - Heterogeneous Systems
Diagram labels: Locally Centralized Data; Interoperable & Distributed Databases

Page 4 Completed Steps for the Mockup Implementation
- Extracted Data from Partner Centers
  - Moffitt and San Antonio provided sample data sets to the DMCC and JPL
  - Used “synthesized” data in lieu of “sensitive” data
  - Preserved the original data structures provided by the centers
- Mapped Data Dictionary Terms
  - Mapped common models between the EDRN CDE, Moffitt and San Antonio for correlating data sets
  - Developed “Profiles” that represent data resources for San Antonio, Moffitt, DMCC, EDRN and NCI
- Hosted data and metadata “profiles” at JPL
- Integrated with an existing data sharing software framework developed by JPL called “OODT” or Object Oriented Data Technology
  - Framework developed to share space science datasets across NASA’s distributed Planetary Data System
- Built a user interface to demonstrate a use case scenario for interoperability and data sharing between the databases
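The "Mapped Data Dictionary Terms" step can be illustrated with a minimal Python sketch. The slides do not show the actual center schemas or CDE names, so every field name below (`Organ`, `PtAge`, `organ_site`, and so on) is a hypothetical stand-in, and the renaming approach is only one plausible way to relate local terms to common ones.

```python
# Hypothetical local-to-common field maps; the real center schemas and
# EDRN CDE names are not given in the slides.
MOFFITT_TO_CDE = {
    "Organ": "organ_site",
    "PtAge": "age_at_collection",
    "CollYear": "collection_year",
}
SAN_ANTONIO_TO_CDE = {
    "tissue_site": "organ_site",
    "age": "age_at_collection",
    "year_collected": "collection_year",
}


def to_common(record, mapping):
    """Rename a center-specific record's fields to common element names.

    Fields without a mapping are dropped rather than guessed at.
    """
    return {mapping[name]: value for name, value in record.items() if name in mapping}


moffitt_row = {"Organ": "Prostate", "PtAge": 72, "CollYear": 1978, "LocalNote": "n/a"}
san_antonio_row = {"tissue_site": "Prostate", "age": 68, "year_collected": 1985}

common_rows = [
    to_common(moffitt_row, MOFFITT_TO_CDE),
    to_common(san_antonio_row, SAN_ANTONIO_TO_CDE),
]
```

Because the original records are only read, never modified, this kind of mapping is compatible with preserving the data structures provided by the centers.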

Page 5 Goals for the Mockup Implementation
- Demonstrate the Return on Investment (ROI) achieved in “federating” (or linking) laboratory data systems together
  - Identify a scenario that demonstrates usability such as providing generic support for specimen data location and retrieval
- Use metadata (or profiles)
  - “Recipes” to describe what data (specimen) and resources are available
  - Communicate across systems
- Adoption of EDRN CDEs
  - Look for common models between systems
  - Understand how to relate center-specific metadata models
- Look for “low hanging” fruit
  - Centers with similar databases and data models

Page 6 EDRN Knowledge Architecture Mockup Implementation at JPL
[Architecture diagram. Components: EDRN “Mock” Query Interface; Query Manager; Metadata Profiles; San Antonio Product Exchange Server; Moffitt Product Exchange Server; San Antonio and Moffitt EDRN mock databases hosted at JPL; OODT middleware hosted at JPL. Interface labels: "In: Query, Out: Identified Resources" (once) and "In: Query, Out: Data Products" (three times).]
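The two-stage flow suggested by the diagram labels (a query first yields identified resources, then data products) can be sketched in Python as follows. This is an illustration under stated assumptions, not the OODT API: the `Profile` class, the in-memory `PRODUCT_SERVERS` dictionary, the record fields, and the organ sites listed for Moffitt are all hypothetical. Only the server names `edrn.moffitt` and `edrn.sanantonio` come from the deck.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass(frozen=True)
class Profile:
    """Hypothetical profile: which server holds data for which organ sites."""
    server: str
    organ_sites: frozenset


# Assumed coverage; the slides only state that San Antonio holds prostate data.
PROFILES = [
    Profile("edrn.moffitt", frozenset({"prostate", "breast", "lung"})),
    Profile("edrn.sanantonio", frozenset({"prostate"})),
]

# Stand-in for the product servers: server name -> mock records.
PRODUCT_SERVERS: Dict[str, List[Record]] = {
    "edrn.moffitt": [
        {"organ_site": "prostate", "age": 72},
        {"organ_site": "breast", "age": 55},
    ],
    "edrn.sanantonio": [{"organ_site": "prostate", "age": 66}],
}


def identify_resources(organ_site: str) -> List[str]:
    """Stage 1 (In: query, Out: identified resources)."""
    return [p.server for p in PROFILES if organ_site in p.organ_sites]


def query(organ_site: str, keep: Callable[[Record], bool] = lambda r: True) -> List[Record]:
    """Stage 2 (In: query, Out: data products), asking only identified servers."""
    products = []
    for server in identify_resources(organ_site):
        for record in PRODUCT_SERVERS[server]:
            if record["organ_site"] == organ_site and keep(record):
                products.append({**record, "server": server})
    return products
```

For example, `query("breast")` consults only `edrn.moffitt`, because no San Antonio profile advertises breast specimens.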

Page 7 Profile CDE Integration
- Describe specimen data, data servers, and other resources using metadata “profiles”
  - Use Common Data Element (CDE) set for specimen description and search attributes
  - Use industry standard metadata terminology such as Dublin Core
- Example Metadata Profiles:
  - Mockup EDRN H. Lee Moffitt Cancer Center Product Server
  - Mockup EDRN University of Texas, San Antonio Product Server
  - Mockup EDRN DMCC Fred Hutchinson Cancer Research Center Query Interface
  - Mockup EDRN DMCC Fred Hutchinson Cancer Research Center Web Site
  - Early Detection Research Network Web Site
  - EDRN Data Management and Coordinating Center Data Dictionary
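As an illustration of describing a resource with Dublin Core terminology, the Python sketch below serializes one profile to XML with the standard library. The `title`, `identifier`, and `description` elements are genuine Dublin Core element names; the `profile` and `searchableElements` wrappers, the example identifier, and the CDE names are assumptions made for this sketch and do not reproduce the OODT profile format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_profile(title, identifier, description, searchable_elements):
    """Build a hypothetical profile element describing one resource."""
    root = ET.Element("profile")  # wrapper name is an assumption
    for name, value in (
        ("title", title),
        ("identifier", identifier),
        ("description", description),
    ):
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    # List the common data elements this resource can be searched on.
    elements = ET.SubElement(root, "searchableElements")
    for cde in searchable_elements:
        ET.SubElement(elements, "element").text = cde
    return root


profile = build_profile(
    title="Mockup EDRN H. Lee Moffitt Cancer Center Product Server",
    identifier="urn:example:edrn:moffitt-product-server",  # made-up identifier
    description="Serves synthesized specimen records for the mockup.",
    searchable_elements=["organ_site", "age_at_collection", "collection_year"],
)
xml_text = ET.tostring(profile, encoding="unicode")
```

Keeping the descriptive fields in a shared vocabulary means a client can search profiles from different centers without knowing each center's local schema.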

Page 8 Data Element Comparison Chart * As of 12/5/2000

Page 9 User Interface
- Provide a user interface to support various queries related to cancer specimen data:
  - Find all prostate tissue samples for all men collected from San Antonio and Moffitt databases
  - Find all prostate tissue samples for men ages 70 and older collected before 1980 from San Antonio and Moffitt databases sorted by Grade, Age, and Site
  - Find all breast tissue samples from women ages 50 and older from San Antonio* and Moffitt databases
  - Find all lung tissue samples from San Antonio* and Moffitt databases
* San Antonio database contains just prostate
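The second query above can be expressed as a filter followed by a multi-key sort. The Python sketch below assumes records already share common field names; those names, the numeric grade, and the reading of "Site" as the contributing center are assumptions, and the rows are invented for illustration.

```python
def prostate_men_70_plus_before_1980(rows):
    """Filter to the query's criteria, then sort by grade, age, and site."""
    hits = [
        r for r in rows
        if r["organ_site"] == "prostate"
        and r["sex"] == "male"
        and r["age"] >= 70
        and r["collection_year"] < 1980
    ]
    # Tuples compare element by element, giving the Grade, Age, Site ordering.
    return sorted(hits, key=lambda r: (r["grade"], r["age"], r["site"]))


# Invented rows standing in for the merged San Antonio and Moffitt results.
rows = [
    {"site": "Moffitt", "organ_site": "prostate", "sex": "male", "age": 74, "collection_year": 1975, "grade": 3},
    {"site": "San Antonio", "organ_site": "prostate", "sex": "male", "age": 71, "collection_year": 1979, "grade": 2},
    {"site": "San Antonio", "organ_site": "prostate", "sex": "male", "age": 80, "collection_year": 1985, "grade": 2},
    {"site": "Moffitt", "organ_site": "breast", "sex": "female", "age": 75, "collection_year": 1970, "grade": 1},
    {"site": "Moffitt", "organ_site": "prostate", "sex": "male", "age": 69, "collection_year": 1972, "grade": 1},
]

hits = prostate_men_70_plus_before_1980(rows)
```

Of the five rows, three are excluded: one was collected after 1980, one is breast tissue, and one donor is under 70.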

Page 10 Key Challenges
- Local data dictionaries and associated data models
  - Different terms, data types, enumerated values, etc.
  - Different meanings and interpretations
- Different database product implementations
  - FileMaker Pro and Microsoft Access
  - Maintain the structural integrity of the data models
- EDRN CDEs exist for demographic data, but not specimen data*
  - JPL developed common CDEs between the two databases for the specimen data
* As of 12/5/2000
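The "different data types, enumerated values" challenge can be made concrete with a small Python sketch that normalizes values before records from two centers are compared. The code table and the accepted spellings are hypothetical; the slides do not list the values actually used at either center.

```python
# Hypothetical local encodings of the same enumerated value.
SEX_CODES = {
    "m": "male", "male": "male", "1": "male",
    "f": "female", "female": "female", "2": "female",
}


def normalize_sex(raw):
    """Map a center-specific code (string or number) to one shared value."""
    key = str(raw).strip().lower()
    if key not in SEX_CODES:
        # Fail loudly: silently guessing would corrupt cross-center queries.
        raise ValueError(f"unrecognized sex code: {raw!r}")
    return SEX_CODES[key]


def normalize_age(raw):
    """Accept an int or a numeric string such as '72 ' and return an int."""
    return int(str(raw).strip())
```

Rejecting unknown codes, rather than passing them through, surfaces differences in meaning between dictionaries early instead of hiding them in query results.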

Page 11 Next Steps
- Focus the implementation of data sharing on defining a robust metadata infrastructure
  - Complete the EDRN CDE effort and begin a process of mapping the CDEs to the center databases
  - Reuse this mockup experience as an example!
- Incorporate feedback from mockup presentation
- Address IRB and security requirements related to data sharing
  - Encrypted and de-identified keys
  - Network and computer security access
- Connect to databases physically located at the centers
  - Implement data system interfaces to the remote databases
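One common way to produce de-identified keys is a keyed hash of the local identifier, sketched below in Python with the standard `hmac` and `hashlib` modules. The slides do not say which technique EDRN planned to use, so this is only an illustration; the function name and sample identifiers are made up, and whether such a scheme satisfies a given IRB requirement is outside what the sketch can show.

```python
import hashlib
import hmac


def deidentified_key(local_id: str, secret: bytes) -> str:
    """Derive a stable pseudonymous key from a center's local identifier.

    The same (local_id, secret) pair always yields the same key, so records
    can still be linked, but the key cannot be reversed without the secret.
    """
    return hmac.new(secret, local_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Illustrative values only; a real secret would be generated and stored securely.
center_secret = b"example-secret-held-by-the-center"
key = deidentified_key("MRN-000123", center_secret)
```

Using a secret held by each center means that someone who sees only the shared keys cannot recompute them from guessed identifiers.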

Page 12 Acknowledgements
- Lynn Anderson, H. Lee Moffitt Cancer Center
- Betsy Higgins, University of Texas, San Antonio
- Heather Kincaid, Data Management and Coordinating Center, Fred Hutchinson Cancer Research Center
- Mark Thornquist, Data Management and Coordinating Center, Fred Hutchinson Cancer Research Center
- Ziding Feng, Data Management and Coordinating Center, Fred Hutchinson Cancer Research Center
- Greg Downing, Office of Science Policy, Office of the Director, National Institutes of Health
- Sudhir Srivastava, National Cancer Institute

Page 13 Backup Slides

Page 14 EDRN Mockup Query Example

Page 15 EDRN Mockup Results – Query 1

Page 16 EDRN Mockup Results – Query 3

Page 17 EDRN Mockup Results – Query 4

Page 18 Detailed Search of Profiles

Page 19 Profiles of EDRN Resources
[Diagram relating resource profiles to EDRN resources. Labeled items: EDRN Website, DMCC Website, DMCC Sample Interface, Moffitt Product Server, San Antonio Product Server (each appears twice), Moffitt Mockup DB, San Antonio Mockup DB.]

Page 20 EDRN Mockup Data Flow
[Data flow diagram. Components: User; Web server (search.jsp, Search Web Page); QueryClient; Query Server; Profile Server jpl.edrn with Profile DB; Product Server edrn.moffitt with Moffitt “Mock” Database; Product Server edrn.sanantonio with San Antonio “Mock” Database; Web; EDRN/NCI Resources. Flow labels: query; XSL (profiles or data products formatted); XMLQuery/IIOP (product search); XMLQuery/IIOP (profiles of resources to handle query); XMLQuery/IIOP (data results); XMLQuery/IIOP (profiles or data results as requested); XMLQuery/IIOP (no results), twice.]