Applications of Semantic Technologies to Patient Medical Data at Cleveland Clinic
Christopher Pierce and Chimezie Ogbuji
WHIT 3.0, December 11, 2007
Current Challenges
  Explosion of need for structured clinical data
    Mandated and voluntary reporting
    Research, quality management, and decision support
  No hospital-wide system for capturing structured data
    EMRs are primarily narrative accounts rather than structured data
  Notes: Although most clinical information is documented in unstructured narrative descriptions of medical encounters, increases in mandated reporting and the use of routine clinical data in outcomes research have led to an explosion of structured data captured as values for variables.
Current Challenges
  Structured data captured in numerous single-purpose applications
    Cleveland Clinic has more than 500 IRB-approved database applications, and the number is growing
    Just the tip of the iceberg
  Islands of data
    Idiosyncratic terms and definitions
    Overlapping content
    Incomplete and inconsistent patient data
Consequences
  Data and process fragmentation
  Data fidelity issues
  Semantic dissonance
  Poor accessibility and usability across data silos
Semantic Solutions (listed in order of decreasing cost and brittleness)
  Point-to-Point Mappings
    Hard-coded by human or ontology
    No a priori agreement required
  Interlingua Ontologies
    Mediated by one or more ontologies
    Requires agreement
  Semantic Database
    No mediation required
    Works best with agreement, but agreement is not required
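One way to see why point-to-point mappings cost more than an interlingua is to count the mappings each approach needs as the number of data sources grows. The sketch below is illustrative only: the counting model (one directed mapping per ordered pair of sources, versus one mapping to and one from a shared ontology per source) is an assumption and does not come from the slides.

```python
def point_to_point_mappings(n_sources: int) -> int:
    """Directed mappings needed so every source can be translated to every other one."""
    return n_sources * (n_sources - 1)


def interlingua_mappings(n_sources: int) -> int:
    """Mappings needed when each source maps to and from a single shared ontology."""
    return 2 * n_sources


if __name__ == "__main__":
    # 500 is the number of IRB-approved database applications cited in the talk.
    for n in (5, 50, 500):
        print(n, point_to_point_mappings(n), interlingua_mappings(n))
```

Under this model the pairwise count grows quadratically while the interlingua count grows linearly, which is one reading of the cost ordering on the slide.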
Cleveland Clinic Applications
  SemanticDB™ Database/Knowledgebase
    For patient clinical data collection and storage
  Semantic Query
    To help investigators identify patient cohorts for clinical research and reporting
  Semantic Search Engine
    For intelligent, common-sense search of healthcare information by patients
SemanticDB
SemanticDB Features
  Extensible - Accepts any kind of data without refactoring of the data store. The store has no knowledge of content.
  Expressive - Formal knowledge representation with transforms between KR dialects
  Automated - Model- and metadata-driven
  Accessible - Highly distributable
  Scalable - Handles enterprise-scale data management needs
  Standard - Based on emerging W3C standards
SemanticDB Overview
SemanticDB Architecture (diagram labels)
  Ad Hoc Query: SPARQL & Natural Language with Cyc
  Facilitated Data Entry: Data Entry Screen Compiler, User Interface Plan
  Domain Model: Templates, XML Data Dictionary, Stored Queries, Report Templates
  Data Mason
  Domain RDF Store: CCF SQL DB, RDF Triple-store, SPARQL Interface
  Domain Instance: XML Schema, OWL Ontology, XML, XSLT
  'Throttled' Dual Representation
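The dual representation in the architecture (an XML instance alongside RDF triples) can be illustrated with a toy conversion. This is a minimal sketch under stated assumptions: the slides list XSLT as the transform, whereas plain Python stands in for it here, and the namespace, element names, and subject identifier are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace; the real SemanticDB vocabulary is not shown in the slides.
BASE = "http://example.org/clinical#"


def xml_to_triples(xml_text, subject):
    """Turn a flat XML record into (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    # The root element name becomes the record's type.
    triples = [(subject, "rdf:type", BASE + root.tag)]
    # Each child element becomes one property with a literal value.
    for child in root:
        triples.append((subject, BASE + child.tag, (child.text or "").strip()))
    return triples


if __name__ == "__main__":
    record = "<Operation><procedure>AVR</procedure><date>2003-05-14</date></Operation>"
    for triple in xml_to_triples(record, "ex:op1"):
        print(triple)
```

A real pipeline would need nested elements, typed literals, and a mapping driven by the data dictionary; the sketch only shows the shape of the XML-to-triples step.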
Timeline Browse View
Semantic Query
Complex Query for Outcomes
  IDENTIFY PATIENT POPULATION
  FIND all native aortic valve replacements performed at CCF between January 1, 2000 and December 31, 2004 with a pre-operative diagnosis, as determined by echocardiogram, of moderately severe or severe aortic stenosis and moderate to severe left ventricular impairment.
  INCLUDE operations in which concomitant primary CABG or concomitant mitral or tricuspid valve repair was performed.
  EXCLUDE all patients with any prior valve repair or replacement; or with concomitant pulmonary valve repair; or with concomitant mitral, tricuspid, or pulmonary valve replacement; or with aortic regurgitation greater than moderate degree.
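The FIND / INCLUDE / EXCLUDE criteria above can be read as a single predicate over operation records. The following is a minimal sketch, not the SemanticDB implementation: the `Operation` fields, the five-step severity scale, and the string labels are hypothetical, all records are assumed to come from CCF, and concomitant procedures that the query does not mention are assumed not to disqualify an operation.

```python
from dataclasses import dataclass, field
from datetime import date

# Assumed ordinal severity scale; the slides do not define one.
GRADES = ["none", "mild", "moderate", "moderately severe", "severe"]

# Concomitant procedures named in the EXCLUDE clause.
EXCLUDED_CONCOMITANT = {
    "pulmonary valve repair",
    "mitral valve replacement",
    "tricuspid valve replacement",
    "pulmonary valve replacement",
}


@dataclass
class Operation:
    procedure: str
    performed: date
    aortic_stenosis: str        # pre-operative echo grade
    lv_impairment: str          # pre-operative echo grade
    aortic_regurgitation: str   # pre-operative echo grade
    concomitant: set = field(default_factory=set)
    prior_valve_surgery: bool = False


def grade_at_least(value, threshold):
    """True if `value` is at or above `threshold` on the assumed scale."""
    return GRADES.index(value) >= GRADES.index(threshold)


def in_cohort(op):
    """Apply the FIND and EXCLUDE criteria; INCLUDE procedures simply do not disqualify."""
    return (
        op.procedure == "native aortic valve replacement"
        and date(2000, 1, 1) <= op.performed <= date(2004, 12, 31)
        and grade_at_least(op.aortic_stenosis, "moderately severe")
        and grade_at_least(op.lv_impairment, "moderate")
        and not op.prior_valve_surgery
        and not (op.concomitant & EXCLUDED_CONCOMITANT)
        and GRADES.index(op.aortic_regurgitation) <= GRADES.index("moderate")
    )
```

Writing the criteria out this way makes the ambiguities in the English query visible, such as where "moderate to severe" begins on a grading scale, which is the kind of decision a formal query has to make explicit.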
Semantic Query Architecture (diagram labels)
  Query Formulation; Reasoning Modules; Answer Exploration
  Cyc Ontology & Knowledge Base
  Semantic Knowledge-Source Integration
  CCF SemanticDB™: SPARQL Interface, RDF Triple-store, CCF SQL DB
  Registry Data: Refine, Convert, Integrate
The Analytic Environment: Simple English sentences are typed into the query search box. The system extracts entities, concepts, and relations from the text and instantiates them according to rules and constraints placed on the concepts and relations.
The Analytic Environment: The user selects the relevant query fragments, then uses a menu option to combine the fragments automatically into a single query.
The system combines the fragments into a single query, and the full query appears in the query construction screen.
Terms that can be temporally qualified are referenced here.
The user can drag and drop boxes representing various events to form temporal sequences.
Here the user has specified that the pericardial window procedure precedes the infection.
At that point, the constraint is automatically added to the query.
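A temporal ordering such as "the procedure precedes the infection" can be expressed in SPARQL as a FILTER comparing two date variables. The sketch below shows one way a query builder might append that constraint; the `ex:` prefix, property names, and variable names are hypothetical and are not taken from the SemanticDB schema or from the query shown in the talk.

```python
PREFIX = "PREFIX ex: <http://example.org/clinical#>"  # hypothetical vocabulary

BASE_PATTERNS = [
    "?patient ex:hadProcedure ?proc .",
    "?proc a ex:PericardialWindow ; ex:date ?procDate .",
    "?patient ex:hadEvent ?inf .",
    "?inf a ex:Infection ; ex:date ?infDate .",
]


def add_precedes_constraint(patterns, earlier_var, later_var):
    """Return a new pattern list with a FILTER requiring earlier_var < later_var."""
    return patterns + [f"FILTER (?{earlier_var} < ?{later_var})"]


def build_query(patterns):
    """Assemble a SELECT query from the graph patterns and filters."""
    body = "\n  ".join(patterns)
    return f"{PREFIX}\nSELECT ?patient WHERE {{\n  {body}\n}}"


if __name__ == "__main__":
    constrained = add_precedes_constraint(BASE_PATTERNS, "procDate", "infDate")
    print(build_query(constrained))
```

A date-range restriction could be added the same way, with two further FILTER comparisons against literal dates.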
The user can also specify a range of dates within which a condition or procedure must occur.
This is the SPARQL query that is dispatched to the SemanticDB service.
Answers are displayed. When the answers come back, we paraphrase the event and procedure nodes with information from the justification.
A full English justification of why the system returned an answer can be generated for any answer, through any of several "drill down" gestures by the user.
Semantic Technology Implementation Model (diagram labels)
  Patient-centric: Enterprise-wide patient-centric clinical information supporting patient care
    EMR User Interface; Demographic data
    Patient treatment specific departmental system; Patient Registry data source
    Patient treatment specific departmental module; Patient Registry data source
    Departmental patient-centric database
  Application Development & Integration: Web 2.0 (SaaS, SOA), Web 3.0 (Semantic Web)
  SemanticDB
  Population-centric: Enterprise-wide population-centric clinical data supporting research, outcomes, marketing, reporting …
    Population-centric Clinical Data; Demographic Data