INNOVATION IN HEALTHCARE IT STANDARDS: THE PATH TO BIG DATA INTERCHANGE LUCIANA TRICAI CAVALINI, MD, PHD TIMOTHY WAYNE COOK, MSC.


1 INNOVATION IN HEALTHCARE IT STANDARDS: THE PATH TO BIG DATA INTERCHANGE LUCIANA TRICAI CAVALINI, MD, PHD TIMOTHY WAYNE COOK, MSC

2 BIG DATA IN HEALTHCARE MYTHS (AND FACTS)

3 MYTH #1: "BIG DATA" HAS A UNIVERSALLY ACCEPTED, CLEAR DEFINITION There is no consensus in the scientific literature or in the specialized blogosphere on the definition of Big Data. The various definitions have the 3 Vs in common. Volume: the existence of gigantic amounts of data. Variability: the coexistence of structured, unstructured, machine-generated and other kinds of data. Velocity: data is produced, and has to be processed and consumed, very fast. Two of these aspects are a particular concern in healthcare: Variability and Velocity.

4 MYTH #2: BIG DATA IS NEW Collecting, processing and analyzing large amounts of data is not a new activity for mankind. Example: medieval monks and their concordances (cross-references of every single word in the Bible). What is new is the volume of data and the speed at which it can be processed and analyzed.

5 MYTH #3: BIGGER DATA IS BETTER In biomedical science, this is partly true: the bigger the sample size, the more precise the estimates. However, large samples of bad-quality data are dangerously misleading. In healthcare, precision and reliability are equally important.
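The trade-off on this slide can be made concrete with a minimal sketch. It assumes a simple random sample and a normal approximation for the 95% confidence interval, and every number in it is hypothetical (a true mean of 120, a systematic measurement bias of +5, a standard deviation of 15); none of these come from the presentation.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of a sample mean: it shrinks as n grows."""
    return sd / math.sqrt(n)


def ci95(mean: float, sd: float, n: int) -> tuple:
    """Approximate 95% confidence interval for a sample mean."""
    half_width = 1.96 * standard_error(sd, n)
    return (mean - half_width, mean + half_width)


# Hypothetical scenario: the true value is 120, but a miscalibrated
# device adds +5 to every reading, so the sample mean sits near 125.
TRUE_MEAN = 120.0
BIASED_MEAN = 125.0
SD = 15.0

for n in (25, 100, 10_000):
    low, high = ci95(BIASED_MEAN, SD, n)
    covers_truth = low <= TRUE_MEAN <= high
    print(f"n={n:>6}  CI=({low:.2f}, {high:.2f})  covers true mean: {covers_truth}")
```

Under these assumptions, a larger sample narrows the interval around the biased value, so the estimate becomes more precise without becoming more reliable.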

6 MYTH #4: BIG DATA MEANS BIG MARKETING There is no evidence that analyzing Big Data increases the number of customers. Customer acquisition has little relevance in healthcare anyway, especially in universal healthcare systems. Big Data is useful when it helps actionable insights emerge (e.g., an unknown relationship between a gene and a disease).

7 HOW TO GET RELIABLE BIG DATA? TRADITIONAL STANDARDS VS. INNOVATION

8 THE TRADITIONAL HEALTHCARE IT STANDARDS
HL7, openEHR, ISO 13606: Primary focus on message exchange among EMRs. All of them predate the emergence of Big Data and the Semantic Web. Top-down data modeling approach: not prepared to deal with the 3 Vs of Big Data.
SNOMED-CT, LOINC, ICD: Controlled vocabularies. They also predate Big Data and the Semantic Web. Main focus on pre-coordination (top-down approach).
In other words: the traditional healthcare IT standards are not prepared to deal with Big Data.

9 A DEVELOPMENT IN OPENEHR The current version of the Archetype Definition Language (ADL) is 1.4. It requires an archetype to be the maximal data set for a given concept. By the book, this means that there can be just one archetype for each concept in the whole world. Even so, several archetypes are being developed in isolation, without being submitted to the proper governance tool (the CKM). The ADL 1.5 spec promises to remove the “maximal data model” requirement.

10 BIG DATA IS BEING PRODUCED: Now, Everywhere, Locally

11 A BIG DATA-AWARE HEALTHCARE IT STANDARD IS:
Compliant with Semantic Web technologies
Respectful of the different points of view coming from different medical schools
Welcoming to all healthcare professionals (and their concepts)
Not limited to EMR data modeling
Prepared to deal with the emerging mHealth and the Internet of Things

12 MULTILEVEL HEALTHCARE INFORMATION MODELING (MLHIM) AN INNOVATION IN HEALTHCARE IT STANDARDS

13 THE BACKGROUND - 1 The typical application design locks up semantics in the database structure and the application source code. When the semantics are missing, different use cases in different scenarios often interpret seemingly similar data differently. Multilevel modelling provides a way to share the semantics of any medical (healthcare) concept between distributed and independent applications.

14 THE BACKGROUND - 2 MLHIM is based on the core modelling concepts of openEHR to provide semantics external to applications. From openEHR, MLHIM inherited the multilevel model principles. MLHIM also uses certain conceptual principles from HL7 v3. From HL7, MLHIM inherited the XML-based implementation.

15 THE IMPLEMENTATION MLHIM simplifies the openEHR Reference Model; it is called a ‘minimalistic’ multilevel model. MLHIM uses XML instead of ADL so that ubiquitous tooling and training are available. The whole Semantic Web is based on XML technologies. Because MLHIM is based on the XML Schema data model, there is no loss of information between model semantics and serialization in XML instances. Such loss is a problem when serializing ADL into XML (see next).

16 A NOTE ON ADL VS. XML There is a loss of information when moving between an object model (AOM) and XML Schema. dADL is the proper instance serialization for the AOM. However, in practice implementers are serializing openEHR/ISO 13606 data in XML.

17 ADL VS. XML: A COMPARISON
Test suites
ADL: The openEHR test suite includes approximately 1600 total files, with known independent validations of its files.
XML: The XML Schema test suite contains more than 40,000 independently validated tests.
Tools
ADL: openEHR tools are developed by one company and there is one open source reference model.
XML: There are more than 30 XML editors, open source and proprietary, from as many companies. There are additional tools in the XML family: XSLT, XQuery, XLink and XProc.
Parsers and validators
ADL: The FOSS Java RM has not been thoroughly tested and validated.
XML: There are at least 3 widely used XML parser/validators, open source and proprietary, from different companies and communities.
Training
ADL: The only ADL courses are from Ocean Informatics, plus a few startup courses taught by non-experts.
XML: XML is taught in all computer science courses as well as online.
Books
ADL: There are zero books on ADL.
XML: O'Reilly has 54 books on XML; Amazon has 11,890 results for Books: "xml".

18 QUESTION BREAK

19 MODELING CLINICAL MODELS IN MLHIM THE HEART OF HEALTHCARE IT STANDARDIZATION

20 CLINICAL KNOWLEDGE MODELING: FUNDAMENTALS Modeling clinical data is a complex task. It requires deep knowledge of the specific clinical domain. It requires at least an intermediate understanding of data types. Modeling clinical data is a core activity in healthcare IT. It is the only way to produce Big Data in healthcare responsibly. Even well-designed clinical data models in conventional software are not interoperable. Multilevel model software is interoperable, and it requires thoughtful clinical knowledge modeling.

21 CLINICAL MODELS IN MULTILEVEL MODELING
In multilevel modeling, the information ecosystem is structured in (at least) two levels:
The Reference Model: generic information model shared by the ecosystem
The Domain Model: definition of constraints to the Reference Model for each medical concept
Multilevel Model | openEHR | MLHIM
Domain Model | Archetype | Concept Constraint Definition (CCD)
Language | ADL | XML Schema 1.1
# of DM/concept | 1 | n
Governance | Top-down, consensus | Bottom-up, merit

22 CONCEPT CONSTRAINT DEFINITION (CCD) In MLHIM, CCDs are XML Schemas that define constraints on the Reference Model in order to model clinical concepts. CCDs can be validated against the corresponding MLHIM Reference Model by third-party applications. The CCD Schema informs the application developer of the structure of a valid data instance for each concept modeled for that system. If the CCD is made public, any recipient of a data instance coming from this application can store, validate, query, etc. that data instance.
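The two-level idea (a generic reference model plus per-concept constraints) can be sketched in a few lines. This is an illustration only, not MLHIM code: real CCDs are XML Schemas, and the class names `Quantity` and `QuantityConstraint`, the concept label, the unit string and the numeric range are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Reference-model level: a generic data type with no clinical meaning."""
    magnitude: float
    units: str


@dataclass(frozen=True)
class QuantityConstraint:
    """Domain-model level: restrictions on Quantity for one clinical concept."""
    concept: str
    allowed_units: frozenset
    minimum: float
    maximum: float

    def validate(self, value: Quantity) -> list:
        """Return a list of error messages; an empty list means valid."""
        errors = []
        if value.units not in self.allowed_units:
            errors.append(f"units {value.units!r} not allowed for {self.concept}")
        if not (self.minimum <= value.magnitude <= self.maximum):
            errors.append(
                f"magnitude {value.magnitude} outside "
                f"[{self.minimum}, {self.maximum}] for {self.concept}"
            )
        return errors


# Hypothetical constraint; the range is illustrative, not clinical guidance.
HEART_RATE = QuantityConstraint("heart rate", frozenset({"/min"}), 0, 400)

print(HEART_RATE.validate(Quantity(72, "/min")))
print(HEART_RATE.validate(Quantity(72, "kg")))
```

Because the constraint lives outside the application logic, any recipient holding the same constraint definition can run the same check on a received instance, which is the property the slide attributes to public CCDs.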

23 CCD HIGH LEVEL STRUCTURE [Diagram labels, in transcribed order: CCD; Care, Demographic or Admin Entry; Cluster; DvAdapter (or Cluster); DataType]

24 MLHIM DATATYPES FOR CCDs
Ordered: Quantified (DvCount, DvQuantity, DvRatio), DvOrdinal, DvTemporal
Unordered: DvString (with enumeration or without enumeration), DvCodedString, DvMedia, DvParsable
DvInterval, ReferenceRange

25 MLHIM ELEMENTS: PRINCIPLES The element names of a CCD do not carry any semantics. Since element names are structural identifiers, this is in keeping with the best practices for healthcare knowledge artifact identifiers, as first proposed by Dr. James Cimino (circa 1988).
Characteristic #3 - Dumb Identifiers: An identifier itself should not have meaning. If an identifier is comprised of other identifiers that have been combined, then the composite identifier is inherently unstable. If the circumstances that related the composite identifiers together in the first place change, the resulting identifier must also change.

26 MLHIM CCDs: TECHNICAL ASPECTS CCDs are the equivalent of an archetype in CEN 13606 and openEHR. They may be defined at any level, for any application use. complexType definitions may be reused in multiple CCDs. CCDs persist for all time and are not versioned; this is essential for data integrity across time. All element names are unique identifiers (Type 4 UUIDs). With the exceptions:
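A minimal sketch of what a semantics-free, UUID-based element name could look like follows. XML names cannot begin with a digit, so a raw UUID needs some letter prefix; the `el-` prefix and both helper functions here are assumptions made for illustration, and the exceptions mentioned on the slide are not covered because the transcript does not list them.

```python
import uuid

# Assumed prefix for illustration only: a raw UUID may start with a digit,
# which is not allowed at the start of an XML element name.
PREFIX = "el-"


def new_element_name() -> str:
    """Build an element name from a random (type 4) UUID; it carries no meaning."""
    return PREFIX + str(uuid.uuid4())


def is_valid_element_name(name: str) -> bool:
    """Check that a name is the assumed prefix followed by a type 4 UUID."""
    if not name.startswith(PREFIX):
        return False
    try:
        parsed = uuid.UUID(name[len(PREFIX):])
    except ValueError:
        return False
    return parsed.version == 4


print(new_element_name())
```

Since nothing about the concept is encoded in the name, renaming or reclassifying the concept never forces the identifier to change, which is the stability argument made for "dumb identifiers".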

27 CCD GOVERNANCE MODEL Artifact governance in MLHIM consists of maintaining a copy of the CCDs and Reference Models. The copy can be on the web at the specified location, or held locally and referenced using the standard XML Catalog tools. Because of the naming conventions, changes to the MLHIM Reference Model do not impact previously defined CCDs or data. This maintains accurate semantics for all time.

28 MLHIM RESOURCES PUTTING INNOVATION INTO PRACTICE

29 MLHIM REFERENCE MODEL
The release version is available at www.mlhim.org
The development version is available at www.github.com/mlhim

30 CCD GENERATOR (CCD-GEN)
CCD editor maintained by the MLHIM Laboratory at www.ccdgen.com
Produces CCDs according to the corresponding MLHIM Reference Model
CCDs are automatically validated
Other products include: a sample data instance; a JSON serialization of the data instance; a sample HTML form; modules for the R programming language to pull MLHIM data into R data frames for processing and analysis

31 OTHER MLHIM TOOLS
MLHIM Application Platform & Learning Environment (MAPLE): An MLHIM repository using an SQL DB for persistence, with a browser and a REST interface.
MLHIM XML Instance Converter (MXIC): Utility to convert MLHIM CCD XML instances to use shortuuids and to convert them to JSON and back again to XML. It is intended to demonstrate how mobile apps can pass smaller data files over the wire to an API that expects these formats and can convert them back to full XML instances for validation.
Form2CCD: Web application to build a form and create a CCD from it (work in progress).
Constraint Definition Designer (CDD): FOSS CCD editor (work in progress).
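The XML-to-JSON-and-back conversion described for MXIC can be illustrated with a generic round trip. This sketch is not the MXIC implementation: it uses only the Python standard library, handles just element tags, text and children (no attributes, namespaces, mixed content or short UUIDs), and the sample element names are made up.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem: ET.Element) -> dict:
    """Convert an element (tag, text, children only) into a plain dict."""
    return {
        "tag": elem.tag,
        "text": elem.text,
        "children": [element_to_dict(child) for child in elem],
    }


def dict_to_element(node: dict) -> ET.Element:
    """Rebuild an element from the dict produced by element_to_dict."""
    elem = ET.Element(node["tag"])
    elem.text = node["text"]
    for child in node["children"]:
        elem.append(dict_to_element(child))
    return elem


def xml_to_json(xml_text: str) -> str:
    return json.dumps(element_to_dict(ET.fromstring(xml_text)))


def json_to_xml(json_text: str) -> str:
    return ET.tostring(dict_to_element(json.loads(json_text)), encoding="unicode")


# Made-up instance; real MLHIM instances use UUID-based element names.
SAMPLE = "<pulse><rate>72</rate><units>/min</units></pulse>"

as_json = xml_to_json(SAMPLE)
print(as_json)
print(json_to_xml(as_json) == SAMPLE)
```

A server receiving the JSON form can rebuild the XML instance and then validate it against the schema, which is the workflow the slide describes for mobile clients.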

32 IN BRIEF CONCLUSIONS AND THE VIEW TO THE FUTURE

33 MLHIM IS BIG DATA READY MLHIM uses standard XML technologies and embedded RDF to define the syntax and semantics. The semantics are in the CCD and can be easily exchanged or referenced via the web. Their RDF can be queried, analyzed and linked using standard tools. MLHIM data can be stored in SQL or NoSQL databases. Examples are on GitHub for eXist-DB (XML) and SQLite3 (which can easily be ported to PostgreSQL, MySQL, Oracle, etc.). We also have experience with MLHIM data in a MarkLogic NoSQL cloud cluster environment. In addition to native XML DBs, the small, document-oriented nature of MLHIM data is a perfect fit for document databases such as MongoDB and CouchDB. MLHIM XML data can easily be round-trip converted to JSON for permanent storage and/or as an exchange serialization via REST APIs.
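Storing small XML documents in an SQL database, as mentioned above for SQLite3, might look like the following sketch. It is not taken from the GitHub examples the slide refers to: the table layout, column names, function names and sample document are all hypothetical, and only the Python standard library is used.

```python
import sqlite3
import xml.etree.ElementTree as ET


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create a hypothetical table for XML instances."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS instances ("
        "instance_id TEXT PRIMARY KEY, "
        "ccd_id TEXT NOT NULL, "
        "xml TEXT NOT NULL)"
    )
    return conn


def save_instance(conn: sqlite3.Connection, instance_id: str, ccd_id: str, xml_text: str) -> None:
    """Store one document; reject text that is not well-formed XML."""
    ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO instances (instance_id, ccd_id, xml) VALUES (?, ?, ?)",
            (instance_id, ccd_id, xml_text),
        )


def load_instances(conn: sqlite3.Connection, ccd_id: str) -> list:
    """Return parsed documents for one CCD identifier, ordered by instance id."""
    rows = conn.execute(
        "SELECT xml FROM instances WHERE ccd_id = ? ORDER BY instance_id",
        (ccd_id,),
    )
    return [ET.fromstring(row[0]) for row in rows]


conn = open_store()
save_instance(conn, "i1", "ccd-demo", "<pulse><rate>72</rate></pulse>")
for doc in load_instances(conn, "ccd-demo"):
    print(doc.find("rate").text)
```

This only checks well-formedness on write; validating each instance against its CCD schema would need an XML Schema validator, which the standard library does not provide.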

34 OUR VISION OF THE FUTURE There is already a sense within the healthcare IT world that conventional EMRs are inadequate for collecting reliable data at the point of care. The real Big Data in healthcare will come from purpose-specific applications modeled by the domain experts. The hardware platform of choice for those apps is mobile computing. The other source of Big Data in healthcare will be the Internet of Things. All MLHIM-compliant data will participate in a semantically interoperable health information ecosystem.

35 lutricav@mlhim.org tim@mlhim.org THANK YOU! /mlhim2 http://gplus.to/MLHIMComm @mlhim2 https://www.youtube.com/user/MLHIMdotORG

