CIMI / FHIR and Shape Expressions
Local DB … …
Preferred Strategy – Full Interoperability
[Diagram labels: Local databases, CDA, HL7 V.2, etc. / Term and Structure Translators / Application / Standard Structure AND Standard Terms (as defined by CIMI Models) / Application and User Requirements]
Addressing Instances
Semantics
– A combination of identifiers and structure
– Have to determine whether two columns / elements / attributes / classes / … reference the same thing
– (Potentially) have to split or join multiple elements to achieve the same level of granularity (e.g. by microparsing or exploiting contextual information)
– Have to recognize and make explicit tacit information (units / data referent / workflow state / …)
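A minimal sketch of the "microparsing" and "tacit information" points: one free-text field is split into a value and a unit, and a unit the source system left implicit is supplied explicitly. The names `Quantity`, `microparse`, and `default_unit` are hypothetical, and the pattern handles only simple "number plus unit" strings.

```python
import re
from typing import NamedTuple, Optional


class Quantity(NamedTuple):
    value: float
    unit: str


# A signed decimal number followed by an optional unit such as "mg" or "mg/dL".
_PATTERN = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?)\s*([A-Za-z/%]*)\s*$")


def microparse(text: str, default_unit: Optional[str] = None) -> Quantity:
    """Split a combined field like '120 mg' into an explicit value and unit."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"cannot parse quantity: {text!r}")
    number, unit = match.groups()
    if not unit:
        # The unit is tacit in the source; it must be supplied by the caller.
        if default_unit is None:
            raise ValueError(f"no unit in {text!r} and no default supplied")
        unit = default_unit
    return Quantity(float(number), unit)


print(microparse("120 mg"))
print(microparse("72", default_unit="kg"))
```

The caller, not the parser, decides what the tacit unit is, which keeps that piece of contextual knowledge visible in the mapping rather than buried in the data.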
Addressing Instances
Syntax
– With the exception of granularity / compositional issues, should be orthogonal to semantics
– As such, it would be really handy to arrive at a representational form (syntax) that:
  – Is readily transformed to and from multiple “native” syntaxes
  – Has minimal restrictions on what can be said in the syntax
Proposal
RDF is close to ideal for a “neutral” syntax
– Triples provide minimal (absolute minimum?) restrictions on semantic aspects
– Semantic identifiers have a single form (URI)
– Links to ontologies and terminologies provide a key need for mapping
– Tools exist today (any23, …) for mapping any syntax to RDF and vice versa
What is Missing?
Schema                            Instance
XML Schema                        XML
Java Class / Interface            Java Object
(… under construction …)          JSON
DDL                               SQL Tables
UML                               (XMI -- …?)
OWL (?) (not really a schema)     RDF
RDF Data Shapes
Define a schema for RDF
RDF – a set of triples. Constraints:
– Subject must be an IRI or a Blank Node
– Predicate must be an IRI
– Object must be an IRI, a Blank Node or a Literal
Any set of triples that meets the constraints above is valid
– Even basic structures like lists, reification, etc. are not constraints…
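The three constraints above are all that plain RDF enforces. A small sketch of them as a check, using hypothetical stand-in classes (`IRI`, `BlankNode`, `Literal`) rather than any particular RDF library:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str = "http://www.w3.org/2001/XMLSchema#string"


def is_valid_triple(subject, predicate, obj) -> bool:
    """Apply the three positional constraints on an RDF triple."""
    return (
        isinstance(subject, (IRI, BlankNode))
        and isinstance(predicate, IRI)
        and isinstance(obj, (IRI, BlankNode, Literal))
    )


def is_valid_graph(triples) -> bool:
    """A graph is valid when every triple is; no further structure is required."""
    return all(is_valid_triple(*triple) for triple in triples)
```

Nothing in this check looks at which predicates appear or how often, which is the gap a shape language is meant to fill.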
RDF Data Shapes
Constraints on a set of triples:
– Triples that MUST exist
– Triples that MAY NOT exist
– Triple references
– Object values
– …
A schema language for RDF
Data Shape Example
start = <IssueShape>                      # Issue validation starts with <IssueShape>
<IssueShape> {                            # An <IssueShape> has:
  ex:state (ex:unassigned ex:assigned),   # ex:state predicate with a target of either
                                          #   ex:unassigned or ex:assigned
  ex:reportedBy @<UserShape>,             # ex:reportedBy predicate whose target matches
                                          #   <UserShape>
  ex:reportedOn xsd:dateTime,             # ex:reportedOn predicate whose object is
                                          #   a valid xsd:dateTime
  (                                       # optionally
    ex:reproducedBy @<UserShape>,         # an ex:reproducedBy predicate w/
                                          #   target URI that references a valid UserShape
    ex:reproducedOn xsd:dateTime          # and an ex:reproducedOn predicate w/ date time
  )?,
  ex:related @<IssueShape>*               # 0 or more ex:related predicates whose objects
                                          #   are the subject of a valid issue shape
}
<UserShape> { … }
RDF Data Shapes
Previous example is in a specific grammar (ShEx)
– W3C is working on other representational forms under the “SHACL” rubric
RDF Data Shapes do not specify semantics (!!)
This is a valid “Issue” shape … ex:state ex:unassigned ; ex:reportedBy ; ex:reportedOn " T10:18:00"^^xsd:dateTime. foaf:name "Bob" ; foaf:mbox.
… but so is this foaf:firstName ex:state ex:unassigned ; ex:reportedBy ex:cornflakes; ex:reportedOn " T10:18:00"^^xsd:dateTime. ex:cornflakes foaf:name "Kellogg’s Corn Flakes" ; foaf:mbox.
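The point of the two instances can be sketched with a toy structural checker. This is not a ShEx implementation: it covers only the three required predicates of the Issue shape, skips the optional group and ex:related, and assumes a user shape that merely requires a name and a mailbox. The node names, date, and mailbox values are invented for illustration.

```python
from datetime import datetime


def is_datetime(value):
    """True if value parses as an ISO 8601 date-time (stand-in for xsd:dateTime)."""
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def conforms(graph, node, shape):
    """Check that node has exactly one acceptable object per predicate in shape."""
    for predicate, check in shape.items():
        objects = [o for s, p, o in graph if s == node and p == predicate]
        if len(objects) != 1 or not check(graph, objects[0]):
            return False
    return True


# Assumed stand-in for the elided user shape: any name and any mailbox.
USER_SHAPE = {
    "foaf:name": lambda graph, o: True,
    "foaf:mbox": lambda graph, o: True,
}

# Required part of the Issue shape only.
ISSUE_SHAPE = {
    "ex:state": lambda graph, o: o in {"ex:unassigned", "ex:assigned"},
    "ex:reportedBy": lambda graph, o: conforms(graph, o, USER_SHAPE),
    "ex:reportedOn": lambda graph, o: is_datetime(o),
}

# Hypothetical data mirroring the two instances.
graph = [
    ("ex:issue1", "ex:state", "ex:unassigned"),
    ("ex:issue1", "ex:reportedBy", "ex:user2"),
    ("ex:issue1", "ex:reportedOn", "2015-01-01T10:18:00"),
    ("ex:user2", "foaf:name", "Bob"),
    ("ex:user2", "foaf:mbox", "mailto:bob@example.org"),
    ("ex:issue2", "ex:state", "ex:unassigned"),
    ("ex:issue2", "ex:reportedBy", "ex:cornflakes"),
    ("ex:issue2", "ex:reportedOn", "2015-01-01T10:18:00"),
    ("ex:cornflakes", "foaf:name", "Kellogg's Corn Flakes"),
    ("ex:cornflakes", "foaf:mbox", "mailto:cornflakes@example.org"),
]

print(conforms(graph, "ex:issue1", ISSUE_SHAPE))
print(conforms(graph, "ex:issue2", ISSUE_SHAPE))
```

Both calls report conformance: the checker sees that something with a name and a mailbox reported the issue, and has no way to know that a box of cereal is an implausible reporter.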
Why RDF Data Shapes?
Tools exist (or soon will) to transform XML Schema, UML Models, FHIR models, CIMI models, SQL DDL, … into ShEx
– XML Schema → ShEx
– XML → RDF
– …
RDF can be validated / queried using ShEx
Why ShEx?
ShEx is based on parser semantics
– Essentially a grammar with a fixed RDF lexer
– ShEx includes the notion of “semantic annotations” (!!!)
ShEx can be used as:
– A validation tool … “Are these triples a valid instance of X?”
– A query tool … “Find all subjects whose triples are valid instances of X”
– A transformation tool (!) … “Every time you see a valid instance of X, emit the triple (subj) rdf:type ex:X”
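The three uses can be sketched with one matching function reused three ways. The "shape" here is deliberately reduced to a set of required predicates, which is far weaker than a real ShEx shape; the function names and the `type_iri` parameter are hypothetical.

```python
RDF_TYPE = "rdf:type"


def matches(graph, subject, required_predicates):
    """Validation: does the subject carry every predicate the shape requires?"""
    present = {p for s, p, _ in graph if s == subject}
    return required_predicates <= present


def find_matching(graph, required_predicates):
    """Query: all subjects whose triples match the shape, in sorted order."""
    subjects = {s for s, _, _ in graph}
    return sorted(s for s in subjects if matches(graph, s, required_predicates))


def annotate(graph, required_predicates, type_iri):
    """Transformation: emit (subject, rdf:type, type_iri) for each match."""
    return [(s, RDF_TYPE, type_iri) for s in find_matching(graph, required_predicates)]


# Hypothetical data: ex:a satisfies the shape, ex:b does not.
graph = [
    ("ex:a", "ex:state", "ex:assigned"),
    ("ex:a", "ex:reportedOn", "2015-01-01T10:18:00"),
    ("ex:b", "ex:state", "ex:unassigned"),
]
shape = {"ex:state", "ex:reportedOn"}

print(matches(graph, "ex:a", shape))
print(find_matching(graph, shape))
print(annotate(graph, shape, "ex:X"))
```

Query and transformation are both built on the validation step, which is the sense in which a single shape definition can drive all three.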
ShEx as a transformation tool Shape expression serves the role of a digital “ribosome”, crawling sets of related triples – Can be used to copy and modify RDF – Can emit other languages (XML, TSV, …) – Can generate code – Can create forms – …
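As one illustration of emitting another language, the sketch below walks the triples of selected subjects and writes them out as TSV, one row per subject and one column per predicate. The function name `emit_tsv` and the sample data are hypothetical, and the selection of subjects is assumed to have been done by a prior shape match.

```python
import csv
import io


def emit_tsv(graph, subjects, columns):
    """Write one TSV row per subject, with one column per listed predicate."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["subject", *columns])
    for subject in subjects:
        row = [subject]
        for predicate in columns:
            values = [o for s, p, o in graph if s == subject and p == predicate]
            # Use the first value if present; leave the cell empty otherwise.
            row.append(values[0] if values else "")
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical data for illustration.
graph = [
    ("ex:issue1", "ex:state", "ex:unassigned"),
    ("ex:issue1", "ex:reportedOn", "2015-01-01T10:18:00"),
]

print(emit_tsv(graph, ["ex:issue1"], ["ex:state", "ex:reportedOn"]))
```

The same traversal could just as well build XML elements or form fields; only the emit step changes.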
Transformation Model
[Diagram labels: Source Schema in ShEx / Target Schema in ShEx / Source Instance / Any23 transform / “Native” RDF / Common Model]
ShEx Example
DNA Translation
Data Translation
[Diagram labels: Any23 transform / RDF Triples / ShEx Process / Visitor / Listeners / Data Synthesis / Generated Data/Code]
Current Projects
– FHIR → RDF
– FHIR Schema → ShEx
– UML → ShEx
– AML / ADL → ShEx
ShEx Processors
JavaScript Implementation
– JISON
Haskell Implementation
Scala Implementation
Python Implementation
– (ANTLR)
Additional Work
W3C SHACL Group
– RDF Representation
– ShEx compatible (?)
– Currently spearheaded by Holger Knublauch (TopQuadrant) w/ SPIN focus
ShEx Group
– Face to Face in Lille, August 17-21
Links
– any23: “Anything to triples”
– demo (note examples bar hidden on LHS)
– BNF for ShEx
– ANTLR for ShEx
– SHACL working group
– intro
– general repository