Semantic Web use cases in outcomes research Experiences from building a patient repository and developing standards Chimezie Ogbuji Metacognition Inc.

Presentation transcript:

Semantic Web use cases in outcomes research Experiences from building a patient repository and developing standards Chimezie Ogbuji Metacognition Inc. (Owner)

Outline Me Semantic Web and Semantic Web technologies RDF, GRDDL, OWL, RIF, and SPARQL Cleveland Clinic Semantic DB project Content repository Data collection workflow Quality and outcomes reporting Cohort identification Use of the system

Me and Semantic Web I’ve been developing software using standards of the Semantic Web since 2001 Began working on Cleveland Clinic SemanticDB project in 2003 Began working in the World Wide Web Consortium (W3C), developing the SPARQL and GRDDL standards in 2007 and 2006, respectively I contribute to and maintain several open source software projects related to Semantic Web technologies: RDFLib, FuXi, and Akamu

The Semantic Web A vision of how the existing WWW can be extended such that machines can interpret the meaning of data involved in protocol interactions A vision of the founder of the World Wide Web Consortium (W3C) and inventor of the World Wide Web (Tim Berners-Lee) Semantic Web technologies / standards A technological roadmap that attempts to realize this Layers of W3C standards (“Layer cake”)

“Focus” standards Resource Description Framework (RDF) Gleaning Resource Descriptions from Dialects of Languages (GRDDL) SPARQL Protocol and RDF Query Language (SPARQL) Web Ontology Language (OWL)

RDF A framework for representing information in the Web. Motivation machine interpretable metadata about web resources mashup of application data automated processing of web information by software agents Graph data model (directed, labeled graph) Nodes and links are labeled with URIs Some nodes are not labeled (Blank nodes) Links are called RDF sentences or triples
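The graph data model on this slide can be illustrated with a minimal sketch in plain Python. It does not use RDFLib or any other RDF library; the `http://example.org/` namespace, the property names, and the sample values are all hypothetical and chosen only for illustration.

```python
from itertools import count

# Hypothetical namespace, used only for this illustration.
EX = "http://example.org/"

_blank_ids = count(1)

def blank_node():
    """Return a fresh label for an unlabeled (blank) node."""
    return f"_:b{next(_blank_ids)}"

def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# A graph is a set of (subject, predicate, object) triples.
# Subjects and predicates are URIs; the operation node is blank.
operation = blank_node()
graph = {
    (EX + "patient/1", EX + "hasOperation", operation),
    (operation, EX + "procedureType", "valve repair"),
    (EX + "patient/1", EX + "name", "Jane Doe"),
}

# Follow the link from the patient to its (blank) operation node.
ops = match(graph, s=EX + "patient/1", p=EX + "hasOperation")
```

Each tuple is one RDF sentence (triple): a directed, labeled link from a subject node to an object node.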

GRDDL A protocol for sowing semantics in structured (XML) web content for harvest Vast amount of latent semantics in web documents Web content today is primarily built for human consumption

Faithful Rendition “By specifying a GRDDL transformation, the author of a document states that the transformation will provide a faithful rendition in RDF of information (or some portion of the information) expressed through the XML dialect used in the source document.” Licenses an interpretation of an XML document that is certified by the author

GRDDL Transformations Functions that take an XML source document and return an RDF graph Transformations can be written in any particular language The “reference” transformation language is XSLT Transformations can be associated with an entire XML dialect that shares a common XML namespace
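Since a transformation is just a function from an XML source document to an RDF graph, it can be sketched in a few lines. The example below is written in Python rather than the reference language XSLT, and it is not a conforming GRDDL implementation. The `<record>` dialect, its namespace, and the `EX` vocabulary are hypothetical.

```python
import xml.etree.ElementTree as ET

EX = "http://example.org/vocab#"          # hypothetical RDF vocabulary
DIALECT = "http://example.org/dialect"    # hypothetical XML namespace

SOURCE = """<record xmlns="http://example.org/dialect" id="rec-1">
  <patient>p-42</patient>
  <operation date="2009-03-01">valve-repair</operation>
</record>"""

def transform(xml_text, base="http://example.org/"):
    """Map the hypothetical <record> dialect to a set of triples."""
    ns = {"d": DIALECT}
    root = ET.fromstring(xml_text)
    record = base + root.get("id")
    triples = {(record, EX + "type", EX + "PatientRecord")}
    patient = root.find("d:patient", ns)
    if patient is not None:
        triples.add((record, EX + "patient", base + patient.text))
    # Each <operation> element becomes its own node linked to the record.
    for i, op in enumerate(root.findall("d:operation", ns), start=1):
        op_node = f"{record}#op{i}"
        triples.add((record, EX + "hasOperation", op_node))
        triples.add((op_node, EX + "procedure", op.text))
        triples.add((op_node, EX + "date", op.get("date")))
    return triples

triples = transform(SOURCE)
```

Because the function is keyed to the dialect's namespace, the same transformation could in principle be applied to every document that uses that namespace.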

Architectural value XML is well suited for messaging, data collection, and structural validation RDF is well suited for expressive logical assertions, querying, and inference RDF graphs can be created, updated, deleted, etc. (managed) using a particular XML vocabulary The vocabulary can be specific to a particular purpose GRDDL facilitates mutually beneficial use of XML and RDF processing and representation

SPARQL The query language for RDF content It operates over an RDF dataset Comprised of named RDF graphs and a single RDF graph without a name Operationally and structurally similar to SQL Many implementations (including the one we used) build on existing relational database management systems Translate SPARQL queries into SQL queries Elliott et al. A complete translation from SPARQL into efficient SQL
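The idea of translating SPARQL into SQL can be sketched for the simplest case: a basic graph pattern evaluated against a single three-column table. This is a naive illustration only, not the translation described by Elliott et al. and not the one used in SemanticDB. The `triples` table layout, the `bgp_to_sql` helper, and the `ex:` terms are assumptions, and the helper expects at least one variable in the pattern.

```python
import sqlite3

def bgp_to_sql(patterns):
    """Translate a list of (s, p, o) patterns into one SQL self-join.

    Terms starting with '?' are variables; anything else is a constant.
    Returns (sql, params, variable_names).
    """
    select, where, params, seen = [], [], [], {}
    for i, pattern in enumerate(patterns):
        for col, term in zip(("s", "p", "o"), pattern):
            ref = f"t{i}.{col}"
            if term.startswith("?"):
                if term in seen:
                    # A repeated variable becomes a join condition.
                    where.append(f"{ref} = {seen[term]}")
                else:
                    seen[term] = ref
                    select.append(f"{ref} AS {term[1:]}")
            else:
                # Constants are bound as SQL parameters.
                where.append(f"{ref} = ?")
                params.append(term)
    tables = ", ".join(f"triples AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {', '.join(select)} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params, [v[1:] for v in seen]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
db.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:rec1", "ex:patient", "ex:p1"),
    ("ex:rec1", "ex:operation", "ex:op1"),
    ("ex:rec2", "ex:patient", "ex:p2"),
])

# Roughly: SELECT ?rec ?pt ?op WHERE { ?rec ex:patient ?pt ; ex:operation ?op }
sql, params, names = bgp_to_sql([
    ("?rec", "ex:patient", "?pt"),
    ("?rec", "ex:operation", "?op"),
])
rows = db.execute(sql, params).fetchall()
```

Each triple pattern maps to one alias of the table, and shared variables turn into equality joins, which is why SPARQL is described above as structurally similar to SQL.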

OWL Language for describing and constraining the semantics of an RDF vocabulary Such constraints (often hierarchical) are called ontologies An ontology specifies a conceptualization of a particular domain as categories, relationships between them, and constraints on both. By defining an OWL document for the terms in an RDF graph, additional RDF sentences can be inferred Additionally, an RDF graph can be determined to be consistent or inconsistent with respect to the ontology Both tasks can be done by a logical reasoning engine
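The two reasoning tasks named on this slide, inferring additional sentences and checking consistency, can be sketched with a toy forward-chaining loop. It covers only subclass propagation and class disjointness, a tiny fragment of what an OWL reasoner does. The class `ex:ValveRepair` and the specific axioms are hypothetical.

```python
TYPE, SUBCLASS, DISJOINT = "rdf:type", "rdfs:subClassOf", "owl:disjointWith"

def infer(graph):
    """Forward-chain two rules to a fixed point:
    subclass transitivity and type propagation along subClassOf."""
    closure = set(graph)
    while True:
        new = set()
        for (a, p1, b) in closure:
            if p1 != SUBCLASS:
                continue
            for (x, p2, y) in closure:
                if p2 == SUBCLASS and x == b:
                    new.add((a, SUBCLASS, y))   # a < b and b < y, so a < y
                if p2 == TYPE and y == a:
                    new.add((x, TYPE, b))       # x is an a and a < b, so x is a b
        if new <= closure:
            return closure
        closure |= new

def is_consistent(closure):
    """Inconsistent if any individual belongs to two disjoint classes."""
    types = {(s, o) for (s, p, o) in closure if p == TYPE}
    return not any(
        (s, a) in types and (s, b) in types
        for (a, p, b) in closure if p == DISJOINT
        for (s, _) in types
    )

ontology = {
    ("ex:ValveRepair", SUBCLASS, "ex:Operation"),
    ("ex:Operation", SUBCLASS, "ex:MedicalEvent"),
    ("ex:Patient", DISJOINT, "ex:MedicalEvent"),
}
data = {("ex:op1", TYPE, "ex:ValveRepair")}

closure = infer(ontology | data)
inferred = ("ex:op1", TYPE, "ex:MedicalEvent") in closure
ok = is_consistent(closure)
bad = is_consistent(infer(ontology | data | {("ex:op1", TYPE, "ex:Patient")}))
```

Here `inferred` and `ok` come out true, while `bad` is false: asserting that `ex:op1` is also a patient contradicts the disjointness axiom.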

Semantic Database (SDB) Cleveland Clinic’s Heart and Vascular Institute (HVI) Challenges: fragmented gathering and storing of clinical research data compartmentalization of medical science and practice clinical knowledge is typically expressed in ambiguous, idiosyncratic terminology problematic for longitudinal patient data that can feasibly span multiple, geographically separated sources and disciplines Longitudinal patient record: patient records from different times, providers, and sites of care that are linked to form a lifelong view of a patient’s health care experience

Project goals Create a framework for context-free data management Usable for any domain with nothing (or little) assumed about the domain Expert-provided, domain-specific knowledge is used to control most aspects of Data entry Storage Display Retrieval Formatting for external systems

Components Content repository supports data collection, document management, and knowledge representation for use in managing longitudinal clinical data manages patient record documents as XML and converts them to RDF graphs for downstream semantic processing Data collection workflow process of transcribing details of a heart procedure from the EHR into a registry RDF used as the state machine of a workflow engine Pierce et al. SemanticDB: A Semantic Web Infrastructure for Clinical Research and Quality Reporting Ogbuji. A Role for Semantic Web Technologies in Patient Record Data Collection. 2009

Workflow State as RDF Dataset Each task is an XML document in a content repository Mirrored into a named RDF graph that shares a web location (the name) with the document A SPARQL query is dispatched against a workflow dataset to find tasks in particular states or assigned to particular people Applications interact with task information and fetch: JSON and XML representations (for client-side web applications) XHTML documents that render as faceted views of a collection of tasks The faceted view includes links to subsequent stages in the workflow and into other web applications on the server

Reporting challenges Reporting places a heavy burden on institutions to produce data in specific formats with precise definitions Definitions vary across reports makes it difficult to use the same source data for all reports Institutions are typically forced to manually abstract the data for each report This is done separately to conform to the requirements for each report Pierce et al. SemanticDB: A Semantic Web Infrastructure for Clinical Research and Quality Reporting. 2012

Components: reporting Quality and outcomes reporting generate outcomes reports both for internal and external consumption internal reports are generated monthly and external reports are generated quarterly quarterly reports submitted to Society of Thoracic Surgeons (STS) Adult Cardiac Surgery National Database and American College of Cardiology (ACC) CathPCI Database submissions are required for certification Pierce et al. SemanticDB: A Semantic Web Infrastructure for Clinical Research and Quality Reporting. 2012

Cohort identification SPARQL and RDF datasets are well-suited as infrastructure for a longitudinal patient record data warehouse HVI software development team partnered with Cycorp to build a cohort identification interface called the Semantic Research Assistant (SRA) Based on the Cyc inference engine, a powerful reasoning system and knowledge base with built-in capability for natural language (NL) processing, forward-chaining inference, and backward-chaining inference The SRA incorporates Cyc's NL processing to permit a user to compose a cohort selection query by typing an English sentence or sentence fragment Lenat et al. Harnessing Cyc to Answer Clinical Researchers' Ad Hoc Queries

RDF dataset warehouse CycL to SPARQL domain-specific medical ontologies in conjunction with the Cyc general ontology are used to convert the NL query into a formal representation and then into SPARQL queries. SPARQL queries are submitted to the SemanticDB RDF store for execution Cleveland Clinic’s registry of 200,000 patient records comprises an RDF graph of roughly 80 million RDF assertions

Dataset topology An RDF dataset with no default graph and one named graph per patient record (a patient record graph) Beyond identifying the cohort, most subsequent query processing happens within a single patient record graph In our vocabulary, there are instances of PatientRecord, Operation, Patient, MedicalEvent, HospitalEpisode, etc. PatientRecord resources share a URI with their containing graph

GRAPH operator can be used to optimize the search space Optimal for the following cohort querying paradigm Constraints in the first part of query are cross-graph and the second part are intra-graph
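The two-part query paradigm can be sketched against a dataset with no default graph and one named graph per patient record. The SPARQL text below only suggests the general shape such a query might take, and the Python functions emulate the two phases over plain dictionaries. The prefix, the `ex:date` property, and the sample records are hypothetical, and this is not the SemanticDB query engine.

```python
# Hypothetical query shape: ?record is bound across graphs first, then the
# remaining patterns are evaluated inside that one record graph.
QUERY = """
PREFIX ex: <http://example.org/vocab#>
SELECT ?record ?date WHERE {
  GRAPH ?record {
    ?record a ex:PatientRecord .
    ?op a ex:Operation ; ex:date ?date .
  }
}
"""

# Dataset: graph name -> set of triples (no default graph).
dataset = {
    "ex:rec1": {
        ("ex:rec1", "rdf:type", "ex:PatientRecord"),
        ("ex:op1", "rdf:type", "ex:Operation"),
        ("ex:op1", "ex:date", "2009-03-01"),
    },
    "ex:rec2": {
        ("ex:rec2", "rdf:type", "ex:PatientRecord"),
        ("ex:ev1", "rdf:type", "ex:MedicalEvent"),
    },
}

def find_cohort(dataset, cls):
    """Cross-graph phase: names of record graphs holding an instance of cls."""
    return sorted(
        name for name, graph in dataset.items()
        if any(p == "rdf:type" and o == cls for (_, p, o) in graph)
    )

def values_in_record(dataset, record, prop):
    """Intra-graph phase: look only inside one patient record graph."""
    return sorted(o for (_, p, o) in dataset[record] if p == prop)

cohort = find_cohort(dataset, "ex:Operation")
dates = {rec: values_in_record(dataset, rec, "ex:date") for rec in cohort}
```

Once the cohort is fixed, each follow-up lookup touches a single record graph instead of the whole store, which is the search-space reduction the slide attributes to the GRAPH operator.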

Use of system From 2009 through June of 2011 over 200 clinical investigations utilized SemanticDB to identify study cohorts and retrieve appropriate data for analysis studies ranged from relatively simple feasibility assessments to extremely complex investigations of time-related events and competing risks of the patient experiencing a certain outcome after treatment prior cohort identification and data export queries for studies would have been performed by a skilled database administrator (DBA) interpreting instructions from domain experts Using SemanticDB and the SRA, a non-technical domain expert performed most of the queries