Information Integration José Luis Ambite, Ph.D. Project Leader, Information Sciences Institute Research Assistant Professor, Computer Science University of Southern California

Presentation transcript:

Information Integration José Luis Ambite, Ph.D. Project Leader, Information Sciences Institute Research Assistant Professor, Computer Science University of Southern California

Team
Information Integration Infrastructure: Jose Luis Ambite, Craig Knoblock, Maria Muslea, Gowri Kumaraguruparan (USC/ISI)
Domain Collaborators:
– FBIRN: Naveen Ashish (UCI), Jessica Turner (MRN), Karl Helmer (MGH), Tim Olsen (WUSTL), Dingying Wei (UCI)
– NHPRC: John Nylander, Dave Brink, Liz Moran (NHPRC)
– CVRG: Naveen Ashish (UCI), Steve Granite (JHU)
– NeuroDev: Dobyns, Paciorkowski (UW), Sherr (UCSF), …
– UCI CTSI: Ashish, Keator (UCI), …
Security: Rachana Ananthakrishnan (UC), Laura Pearlman (USC/ISI)
Data Management: Robert Schuler, Ann Chervenak (USC/ISI)
Knowledge Engineering: Gully Burns (USC/ISI), Naveen Ashish (UCI), Jessica Turner (MRN)
User Interfaces: Naveen Ashish (UCI), Jose Luis Ambite, Pedro Szekely, Craig Rogers, Gowri Kumaraguruparan, Maria Muslea (USC/ISI)

Information Integration
Problem: consistent view of heterogeneous, distributed data
Challenges:
– Syntactic heterogeneity: formats, data models
– Semantic heterogeneity: names, structure, viewpoint
– Efficiency: query execution
– Scalability: ease of adding new sources
Approaches:
– Warehouse/ETL
– Common-schema federation
– Virtual Integration/Mediator
BIRN supports deep integration across complex data sources
– Heterogeneous sources: Relational, XML DBs, Web Services, HTML, files
– Structured queries
– Secure, Efficient Query Execution
[Figure labels: Decision Support; Application Programs, Workflows; Mediator; Knowledge Bases; Databases; Computer Programs; Web; BIRN]

Information Mediator
Virtual Integration Architecture:
– Virtual organization: providers and consumers sharing data for a specific purpose
– Autonomous sources: data and control remain at the sources; no changes to sources
– Mediator: defines the domain schema and describes source contents
   Domain schema: view of the domain agreed upon by the virtual organization
   Source descriptions: declarative logical formulas relating source and domain schemas
Query Answering
– User writes query in the domain schema
– Mediator:
   Determines sources relevant to the query
   Rewrites the query in the sources' schemas
   Breaks the query into sub-queries for the sources
   Optimizes the query evaluation plan
   Combines answers from the sources
Declarative → Easy to add new sources
EZ-config: Automatic configuration for single-schema federations
[Figure labels: User queries; Mediator (Domain Schema, Reformulation, Optimizer, Execution Engine); Logical Source Descriptions; Sources schemas; Wrapper; Data Source (x3)]
[VLDBJ 2005, Frontiers NeuroScience 2010, JAMIA 2011]
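The query-answering steps on this slide can be illustrated with a minimal sketch. It is not the BIRN Mediator: source descriptions are reduced here to plain attribute-name mappings instead of logical formulas, there is no optimizer, and every name (`Source`, `answer`, `site_a`, `site_b`, the attribute names) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Source:
    """A wrapped data source plus a (much simplified) source description."""
    name: str
    mapping: Dict[str, str]                      # domain attribute -> source attribute
    fetch: Callable[[Dict[str, object]], List[Row]]  # wrapper: runs a source-level query


def relevant_sources(sources: List[Source], attrs: set) -> List[Source]:
    # A source is relevant only if it can supply every attribute the query mentions.
    return [s for s in sources if all(a in s.mapping for a in attrs)]


def rewrite(source: Source, filters: Dict[str, object]) -> Dict[str, object]:
    # Translate domain-schema filters into the source's own attribute names.
    return {source.mapping[a]: v for a, v in filters.items()}


def answer(sources: List[Source], select: List[str], filters: Dict[str, object]) -> List[Row]:
    """Answer a domain-schema query: pick sources, rewrite, query, combine."""
    needed = set(select) | set(filters)
    results: List[Row] = []
    for s in relevant_sources(sources, needed):
        for row in s.fetch(rewrite(s, filters)):
            # Map each source row back into the domain schema before combining.
            results.append({a: row[s.mapping[a]] for a in select})
    return results


def table_wrapper(rows: List[Row]) -> Callable[[Dict[str, object]], List[Row]]:
    # Toy wrapper over an in-memory table; supports equality filters only.
    def fetch(filters: Dict[str, object]) -> List[Row]:
        return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]
    return fetch


# Two hypothetical sources with different local schemas.
site_a = Source("site_a", {"subject_id": "subj", "sex": "gender"},
                table_wrapper([{"subj": "A1", "gender": "M"},
                               {"subj": "A2", "gender": "F"}]))
site_b = Source("site_b", {"subject_id": "id", "sex": "sex", "diagnosis": "dx"},
                table_wrapper([{"id": "B7", "sex": "M", "dx": "control"}]))
SOURCES = [site_a, site_b]

males = answer(SOURCES, ["subject_id"], {"sex": "M"})
with_dx = answer(SOURCES, ["subject_id", "diagnosis"], {"sex": "M"})
```

In this sketch, adding a source means appending one `Source` entry with its mapping and wrapper; the query code does not change, which is the point of keeping source descriptions declarative.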

FBIRN Data Integration Use Case: HID and XNAT
Sources:
– HID: Human Imaging Database(s), Oracle DB
– XNAT: EXtensible Neuroimaging Archive Toolkit, Web service API
User query: find all male patients over 50 with t1 scans
Results integrated from XNAT and HID
[Figure labels: Domain query; BIRN Mediator; Logical Source descriptions; SQL query; XML query; HID results; XNAT results (XML); Integrated results]
[Front. NeuroScience 2010]
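The example query on this slide can be mimicked end to end with standard-library stand-ins. This is only a sketch under loud assumptions: an in-memory SQLite database plays the role of the relational HID source (the slide says HID is an Oracle DB), a literal XML string plays the role of an XNAT web-service response, and all table, element, and attribute names are invented for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_hid() -> sqlite3.Connection:
    """Build a toy relational source (hypothetical schema, not the real HID)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE subject (id TEXT, sex TEXT, age INTEGER)")
    conn.execute("CREATE TABLE scan (subject_id TEXT, scan_type TEXT)")
    conn.executemany("INSERT INTO subject VALUES (?, ?, ?)",
                     [("H1", "M", 62), ("H2", "F", 70), ("H3", "M", 45)])
    conn.executemany("INSERT INTO scan VALUES (?, ?)",
                     [("H1", "t1"), ("H2", "t1"), ("H3", "t1")])
    return conn


# Toy XML source (hypothetical layout, not the real XNAT response format).
XNAT_XML = """<subjects>
  <subject ID="X1" gender="male" age="55"><scan type="T1"/></subject>
  <subject ID="X2" gender="male" age="58"><scan type="T2"/></subject>
  <subject ID="X3" gender="female" age="66"><scan type="T1"/></subject>
</subjects>"""


def query_hid(conn, sex, min_age, scan_type):
    # Sub-query rewritten into the relational source's vocabulary (M/F codes, lowercase types).
    sql = ("SELECT DISTINCT s.id, s.age FROM subject s "
           "JOIN scan c ON c.subject_id = s.id "
           "WHERE s.sex = ? AND s.age > ? AND c.scan_type = ? ORDER BY s.id")
    sex_code = {"male": "M", "female": "F"}[sex]
    rows = conn.execute(sql, (sex_code, min_age, scan_type.lower()))
    return [{"subject_id": r[0], "age": r[1], "source": "HID"} for r in rows]


def query_xnat(xml_text, sex, min_age, scan_type):
    # Sub-query evaluated against the XML source's vocabulary (spelled-out gender, any-case types).
    out = []
    for subj in ET.fromstring(xml_text).findall("subject"):
        has_scan = any(sc.get("type", "").lower() == scan_type.lower()
                       for sc in subj.findall("scan"))
        if subj.get("gender") == sex and int(subj.get("age")) > min_age and has_scan:
            out.append({"subject_id": subj.get("ID"), "age": int(subj.get("age")),
                        "source": "XNAT"})
    return out


def male_over_50_with_t1(conn, xml_text):
    # Domain query: one question, two source-specific sub-queries, one combined answer.
    return query_hid(conn, "male", 50, "t1") + query_xnat(xml_text, "male", 50, "t1")


integrated = male_over_50_with_t1(make_hid(), XNAT_XML)
```

Both sub-queries return rows in the same domain-level shape (`subject_id`, `age`, `source`), so the final step is a simple concatenation; differences in coding (`M` versus `male`, `t1` versus `T1`) are absorbed inside each source-specific function.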

CardioVascular Research Grid
Sources:
– ECG_Mesa (MySQL DB)
– Chesnokov Analysis (eXistDB XML DBMS)
– Image Metadata: dcm4che PACS (MySQL DB)
– WaveformDB (eXistDB XML DBMS)
– DICOM Image Files (file system)
– Waveform Files (file system)
Use mediator to identify subjects and files of interest
Same BIRN mediator: just plug in CVRG source descriptions and an additional wrapper for eXistDB (XML/XQuery database)
[Figure labels: Domain query; BIRN Mediator; Logical Source descriptions; Integrated results]

Neuro Developmental Disorders
Sources: LISDB, SherrDB
User query: find all white females with Aicardi syndrome
Results integrated from LISDB and SherrDB
Same BIRN mediator, with NeuroDev source descriptions
[Figure labels: Domain query; BIRN Mediator; Logical Source descriptions; SQL query (to each source); LISDB results; SherrDB results; Integrated results]

Non-Human Primate Research Consortium
Provide data integration infrastructure for NHPRC:
– Colony management, genetics, pathology, …
BIRN NHPRC Activities:
– BIRN/ISI demonstrated Colony Management integration prototype
– NHPRC team developed DNA Banking application using BIRN mediator
– Collaborated on NHPRC Pathology Project

Integration of multiple clinical data sources
– Relational databases: UCSD, UCI, RAND, Brigham & Women’s Hospital, …
– EMR system → Relational export
Domain model based on the OMOP common data model
– OMOP: Observational Medical Outcomes Partnership
[Figure labels: BIRN Mediator (OMOP Model); Custom Interface; RAND; BWH; UCSD; UCI; Scanner mediator]
[Ashish (UCI), Boxwala (UCSD), …]

Cross-CTSI Data Integration: Oxytocin Study
UCSD-UCI cross-CTSI Oxytocin study:
– Mediated solution
– BIRN Mediator
Data from neurological assessment scales
– PANSS, STM, SCID, …
[Figure labels: Custom Interface; BIRN Mediator; REDCap; HID]
[Ashish, Keator, Potkin, Fiefel, …]


BIRN Information Integration
General information integration infrastructure
– BIRN Mediator
   Bridge semantics across data sources
   Provide integrated data for analysis and visualization
– Domain model development and curation process
   Balance bottom-up/top-down domain model/ontology development and reuse
– Security and user data access control built-in
Approach
– Engage research communities: NHPRC, FBIRN, CVRG, NeuroDev, Radiation Oncology, CTSIs, ...
– Build applications incrementally
– Enhance capabilities while providing useful tools