Making Terminologies useful and usable: Clinical Terminologies in the 21st Century: What are they for? What might they look like? Alan Rector Bio.

Presentation transcript:

Making Terminologies useful and usable: Clinical Terminologies in the 21st Century: What are they for? What might they look like? Alan Rector Bio and Health Informatics Forum/ Medical Informatics Group Department of Computer Science University of Manchester rector@cs.man.ac.uk www.cs.man.ac.uk/mig img.man.ac.uk www.clinical-escience.org mygrid.man.ac.uk

An Old Problem “On those remote pages it is written that animals are divided into: a. those that belong to the Emperor b. embalmed ones c. those that are trained d. suckling pigs e. mermaids f. fabulous ones g. stray dogs h. those that are included in this classification i. those that tremble as if they were mad j. innumerable ones k. those drawn with a very fine camel's hair brush l. others m. those that have just broken a flower vase n. those that resemble flies from a distance" From The Celestial Emporium of Benevolent Knowledge, Borges

But why in healthcare? What’s it for? What’s the purpose? Terminologies are of little use in themselves How will it make care better? new things possible? How will it make information systems better? Painful experience of 20 years of over-selling and under-performance Do we need it: Clinically? Technically? If we need it what is ‘it’? Is ‘it’ one thing or many? How will we know if we have ‘it’? How will we know if ‘it’ is fit for purpose?

Why Now? What’s different now? Web, E-Science, Grids Web speed New technologies – OWL, new DLs, hybrid frame-DL environments www.semanticweb.org Post genomic medicine – personalised medicine Joining up Healthcare Medical and Bioscience research – CLEF Systemisation of healthcare Clinical error reduction, clinical governance, evidence based medicine, … Does anybody else have similar problems? Ontologies are ‘flavour of the month’ in E-Science & Web Bioinformatics is building them very rapidly What can we learn from them?

A Convergence of Need Post genomic research Safe, high quality, evidence based health care Need more and better clinical information Which scales In Size In Complexity Knowledge is Fractal

The requirements & Tools chain Clinical users with needs to improve care / clinical knowledge Applications for clinical users that meet those needs Developers’ needs for terminology to build those applications Terminologies which fit the applications’ builders’ needs to meet the clinical users’ needs

Who is it for? (Useful & usable to whom?) Clinical users Carers - prospective Reviewers – retrospective Researchers, managers, assessors, … The community – how it shares its knowledge Knowledge creators / distributors Application developers Easier to re-use what exists than to build new Re-use or bust Terminology authors Quick responsive evolution

Useful and Usable Useful – for what? Supports needed applications Purpose Does it well Quality Usable – by whom? Intuitive / understandable Handy What you need is “to hand” Timely

Preview of Arguments The priorities are clinical needs supported by applications supported by terminology Clinical quality is critical Useful and usable to: clinical users, developers, ‘reviewers’, authors In an open evolving world, open managed evolution is the only plausible way forward Current technology gives us the opportunity to cope Tools and environments are as important as content

Where we come from [Diagram labels: Best Practice; Clinical Terminology; Data Entry; PEN&PAD; Clinical Record; Decision Support; GALEN; Language Technology: CLEF; Electronic Health Records: CLEF; Decision Support & Aggregated Data; a sample HealthCard for a fictitious patient]

Terminology is Now Middleware human-machine / machine-machine Explicit Machines can only manipulate what is represented explicitly More re-use → more manipulation → more explicitness Understandable People can only build, maintain and use it if they can understand it Adequate Expressive enough to do the job but still computationally tractable Reliable People can use it consistently Scalable and maintainable

Where we think we are going: Pre-1980: paper Application specific retrospective human oriented systems ICD, early SNOMED, CPT, OPCS, … Mid 1980s – 1990s: “electronic paper” Retrospective reporting + Prospective collection ICPC Read I, II Mid 1990s – mid 2000s: Centralised computer based Retrospective reporting + Prospective collection OpenGALEN, Read III, SNOMED-RT… PEN&PAD Mid 2000s – ?: Web based open managed evolution ???? – but see the Semantic Web, Gene Ontology, etc.

How we will know when we get there Criteria for success Re-use A recognised growing library of common decision support modules Stop starting from scratch! Integration 2+ independently developed DSSs integrated with 2+ independently developed EPRs without exponentially increasing effort.

Criteria for success Authoring No individual invests in their own terminology enterprise-wide terminology servers Indexing Simplification of systems a sharp drop in special cases and exceptions a sharp increase in authors’ productivity

Criteria for success User interfaces Real systems in real use with real patients by real clinicians transparent systems

Stones in the Road Why are we not there yet? Some background definitions Some hypotheses

Clinical quality & logical quality Clinical quality – do users put in the right things? Repeatability of information capture (inter rater reliability) For decision support in prospective use For retrieval in retrospective use Salience Relevance to clinical decisions for prospective use Significance to questions for retrospective use A better measure than “coverage” Logical quality – do systems give the right responses? Correct organisation (classification) Correct inferences given correct input

Hypothesis 1 Most computer oriented terminology development ignores clinical quality … The EHR as black hole Bigger is not necessarily better …although clinical quality was the primary concern of traditional paper/human oriented terminologies (and there are honourable exceptions – e.g. ICPC). Evidence: High variability in recorded use Systematic failure to use data from GP systems in clinical studies (despite PRIMIS) Our own & colleagues’ experience in repeated studies Current planned cost of cohort ‘post genomic’ studies

Three models Meaning - ontologies Can I depend on the answers? “Dyspnoea is a respiratory problem” Clinical significance – decision support What should I think of / how does it affect decisions “Dyspnoea can be a symptom of congestive heart failure” Model of use – EHR/human factors Is what I want ‘to hand’ – is it ‘handy’? “Dyspnoea should be a question on a cardiac history”

Hypothesis 2 Early terminologies emphasised models of use and significance and failed for lack of model of meaning “Heart diseases” are in 13 Chapters of ICD9 Recent terminologies emphasise model of meaning and fail for lack of models of use and significance Evidence: User dissatisfaction, non-use, and poor quality data The few systems based on models of use have been surprisingly popular with doctors, e.g. MedCin, ORCA But hard to use for retrieval We have fewer formal models of use than of meaning We have almost no models of ‘significance’

Grounding cost vs Clean-up cost (with thanks to Enrico Coiera) The cost of establishing a given quality of communication How much French do you need to order a meal? “Clean up cost” The cost of fixing miscommunication How many surprises will you accept? of what kind?

Special purpose vs Re-usable Multipurpose Special purpose terminologies Almost all retrospective Reporting for remuneration – ICD9-CM, CPT Reporting for epidemiology - ICD10, OPCS Multipurpose re-usable terminologies Aspire to be the glue for ‘Patient centred systems’ & ‘Personalised Medicine’ Decision support Electronic Health Records Research Integration with Bioscience … But too often ‘multipurpose’ means ‘no purpose’ ‘multiapplication’ means ‘no application’

Need “Multipurpose” mean “no purpose”? Multiple purposes held by multiple groups Multiple sources of expertise & authority One size does not fit all Multiple collaborations Multiple legacies Multiple purposes use multiple applications Applications are the point of interaction Applications make needs concrete & testable

Multipurpose means interacting with others It’s a big open world out there… Bioscience Gene Ontology, National Cancer Institute Center for Bioinformatics (NCICB), The Digital Anatomist/ Mouse Anatomy/Mammalian Anatomy, BioJava, PRINTS, EMBL, Microarrays, Proteomics, Metabolomics, Systems Biology… Medicine meets bioscience Cancer therapeutics, New imaging, … E-Health: sharing and pooling data: “Collections based research” BioBank, NTRAC, NCRI, NCTR, CLEF, … “Health Intelligence” MRC policy on data sharing …

Hypothesis 3 Grounding costs can be delimited for special purpose terminologies Grounding costs are indefinite for re-usable terminologies (& are historically high) Without purposes testable through applications there is a danger of the escalating deadly embrace “Must have terminology to build applications; but Must have applications before terminology” Evolutionary approach the only exit

Central Control vs Open managed evolution Académie française vs Oxford English Dictionary Scholasticism vs Empiricism The ‘arrogance of the a priori’ People don’t know what they do Look to see what is actually used Language technology shows time and again that our predictions are faulty Command economy vs Social Market Participation is the issue rather than money Somebody will still have to pay But at least they might pay for something useful

Central management Owned by one “Authority” Coupling tight / autonomy low/ participation low “Grounding costs” high / “Clean up costs” low? must have everything before you can do anything Change slow & lockstep A product

Open managed evolution “Owned” by the community – multiple “authorities” Coupling loose/ autonomy high / participation high To be useful & usable involve users using systems “Grounding costs” low / “Clean up costs” high? “Just in time” “Just enough” Agree where it counts Change quick and local - “threaded with annealing” A process

Hypothesis 4 Single purpose clinical terminologies can be best managed centrally By definition are developed in conjunction with an application Re-usable terminologies can only succeed by open managed evolution Many purposes require many contributors Evidence: Speed of uptake of HL7/LOINC W3C & the evolution of the Web Re-usable terminologies can only be developed in open collaboration with applications Otherwise “multipurpose” becomes “no purpose”

Hypothesis 5 Modern technology provides the means to support open managed evolution without compromising clinical quality or technical stability Trade lower grounding cost for greater clean up cost Focus on minimal stable core. Defer commitments. Evidence: OpenGALEN, Gene Ontology Utilise Web/Grid technologies for rapid dissemination and coordination Evidence: Current developments at Mayo clinic using LDAP Distribute terminology like domain names

The technologies Applications centric development Decoupled development Special purpose languages / “Intermediate Representations” Deferred commitment Clinical before technical Logic based ontologies + Models of clinical significance Models of clinical use Models of EHRs Web services & Grid technology Authentication/authorisation/accounting Distributed directories & LDAP Service discovery

Decoupled development using “Conceptual Lego” If we manage the connectors and the pieces the users can build most things for themselves Without compromising quality

Applications centric Development Meta-authoring Common Terminology/ Ontology clinical applications authoring environments Intermediate Representations clinicians / Applications builders Empowered Authors templates/ views

Loosely Coupled Development Local author uses resources & templates to formulate definition templates Worldwide Resources problems Server validates & organises Central Ontology Central Gurus integrate & fix problems Local Author needs new terms for application Local author checks Local Ontology updates

The templates are more important than the underlying formalism… "Open fixation of a fracture of the neck of the left femur" MAIN fixing ACTS_ON fracture HAS_LOCATION neck of long bone IS_PART_OF femur HAS_LATERALITY left HAS_APPROACH open “Intermediate Representations” are critical

…complex underpinnings can & will change (‘SurgicalProcess’ which isMainlyCharacterisedBy (performance which isEnactmentOf (‘SurgicalFixing’ which hasSpecificSubprocess (‘SurgicalAccessing’ hasSurgicalOpenClosedness (SurgicalOpenClosedness which hasAbsoluteState surgicallyOpen)) actsSpecificallyOn (PathologicalBodyStructure which < involves Bone hasUniqueAssociatedProcess FracturingProcess hasSpecificLocation (Collum which isSpecificSolidDivisionOf (Femur which hasLeftRightSelector leftSelection))>))))
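The flat template for "Open fixation of a fracture of the neck of the left femur" can be held as a small tree that authors edit, independently of whatever formal expression it is later expanded into. The following is a minimal Python sketch of that idea, not GALEN's actual tooling: the `Node` class and `render` function are hypothetical names, and the nesting of the links is an assumption, since the slide text has lost its indentation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One concept in the intermediate representation, plus its outgoing links."""
    concept: str
    links: list = field(default_factory=list)  # list of (link_name, Node) pairs

    def add(self, link_name, child):
        """Attach a linked child and return self so calls can be chained."""
        self.links.append((link_name, child))
        return self


def render(node, depth=0):
    """Return the links below `node` as indented lines, one link per line."""
    lines = []
    for link_name, child in node.links:
        lines.append("  " * (depth + 1) + f"{link_name} {child.concept}")
        lines.extend(render(child, depth + 1))
    return lines


# Assumed nesting: the fracture is located in the neck, which is part of the left femur.
femur = Node("femur").add("HAS_LATERALITY", Node("left"))
neck = Node("neck of long bone").add("IS_PART_OF", femur)
fracture = Node("fracture").add("HAS_LOCATION", neck)
procedure = Node("fixing").add("ACTS_ON", fracture).add("HAS_APPROACH", Node("open"))

text = "\n".join(["MAIN " + procedure.concept] + render(procedure))
print(text)
```

Because authors only ever touch the six template lines, the long formal expansion can be regenerated by a different translator when the underlying model changes.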

Decoupling & Flexibility Use formality to permit flexibility Change need not mean instability Formality means effects can be predicted Most users only need change in tightly controlled areas Lesson from the Semantic Web: “Forking” a natural part of development Harmless if strictly local Manageable if controlled from standard “Lego” & templates “Clean up cost” 10%-20% central effort is a reasonable target Necessary to cope with change and ignorance Evolution by “annealing”

Scalable models of use

Scalable models of Use: PEN&PAD Structured Data Entry File Edit Help FRACTURE SURGERY Reduction Fixation Fixation Open Closed Open Tibia Fibula Ankle More... Radius Ulna Wrist Humerus Femur Femur Left Left Right More... Gt Troch Shaft Neck Neck 250,000 forms from 10,000 Facts “Fractal tailoring”

Scalable models of use: Fractal tailoring forms for clinical trials Hypertension Hypertension Idiopathic Hypertension Idiopathic Hypertension In our company’s studies In our company’s studies Idiopathic Hypertension in Study a phase 2 Idiopathic Hypertension in our co’s phase 2 study a In Phase 2 studies In Phase 2 studies

It can work The Lessons of GALEN Loosely coupled development based on formal ontologies works “Coherence without uniformity” 90% of work done locally Ontologies can be modular rather than monolithic “Plug and play” terminology development The Lessons of PEN&PAD Models of use based on formal ontologies scale 250,000+ forms from 10,000 ‘facts’ The Lessons of the Semantic Web It works for knowledge management Growing user community outside of medicine No longer “rocket science”

So what are “Logic based ontologies”

Logic-based Ontologies: Conceptual Lego “SNPolymorphism of CFTRGene causing Defect in MembraneTransport of ChlorideIon causing Increase in Viscosity of Mucus in CysticFibrosis…” “Hand which is anatomically normal”

Logic based ontologies A formalisation of semantic nets, frame systems, and object hierarchies via KL-ONE and KRL “is-kind-of” = “implies” (“logical subsumption”) “Dog is a kind of wolf” means “All dogs are wolves” Modern examples: (DAML+OIL / “OWL”?) Older variants LOOM, CLASSIC, BACK, GRAIL, K-REP, …
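Reading “is-kind-of” as “implies” means that subsumption is transitive and that anything asserted to be of one kind is implied to be of every more general kind. A minimal Python sketch of that reading follows; it is a toy, not a description-logic reasoner. The `IS_KIND_OF` table, the function names, and the extra `Mammal` level are assumptions added for illustration; only the Dog and Wolf pair comes from the slide.

```python
# Hypothetical toy hierarchy: concept -> set of direct "is-kind-of" parents.
IS_KIND_OF = {
    "Dog": {"Wolf"},     # the slide's own example
    "Wolf": {"Mammal"},  # assumed extra level, for illustration only
    "Mammal": set(),
}


def ancestors(concept):
    """All concepts that `concept` is a kind of, directly or indirectly."""
    seen = set()
    stack = list(IS_KIND_OF.get(concept, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(IS_KIND_OF.get(parent, ()))
    return seen


def subsumes(general, specific):
    """True if being a `specific` implies being a `general`."""
    return general == specific or general in ancestors(specific)


def implied_types(asserted):
    """Everything an individual of type `asserted` is implied to be."""
    return {asserted} | ancestors(asserted)


print(subsumes("Wolf", "Dog"), subsumes("Dog", "Wolf"))
print(sorted(implied_types("Dog")))
```

The implication runs one way only: under this table every Dog is a Wolf, but nothing follows about an arbitrary Wolf being a Dog.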

Logic Based Ontologies: The basics Validating (constraining cross products) Primitives Descriptions Definitions Reasoning Thing Feature pathological red Structure Encrustation + involves: MitralValve Thing + feature: pathological Structure + involves: Heart Heart MitralValve Encrustation MitralValve * ALWAYS partOf: Heart Encrustation * ALWAYS feature: pathological red + partOf: Heart red + partOf: Heart + (feature: pathological)

Building with Conceptual Lego Species Genes CFTRGene in humans Protein Protein coded by (CFTRgene & in humans) Function Membrane transport mediated by (Protein coded by (CFTRgene in humans)) Disease Disease caused by (abnormality in (Membrane transport mediated by (Protein coded by (CFTR gene & in humans))))
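Each level in this chain is built by wrapping the previous expression in one more link, so the disease description never has to be authored as a single monolithic term. A rough Python sketch of that composition, using plain strings, is shown here; the `compose` helper is a hypothetical name, and the wording of each piece follows the slide rather than any formal syntax.

```python
def compose(head, link, filler):
    """Build a new expression by restricting `head` through `link` to `filler`."""
    return f"{head} {link} ({filler})"


# Each piece re-uses the one before it.
gene = "CFTRGene in humans"
protein = compose("Protein", "coded by", gene)
function = compose("Membrane transport", "mediated by", protein)
abnormality = compose("abnormality", "in", function)
disease = compose("Disease", "caused by", abnormality)

print(disease)
```

In a real ontology the pieces would be structured objects that a reasoner can classify, not strings; the sketch only shows how a small set of connectors and pieces yields arbitrarily deep descriptions.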

Avoiding combinatorial explosions The “Exploding Bicycle” From “phrase book” to “dictionary + grammar” 1980 - ICD-9 (E826) 8 1990 - READ-2 (T30..) 81 1995 - READ-3 87 1996 - ICD-10 (V10-19 Australian) 587 V31.22 Occupant of three-wheeled motor vehicle injured in collision with pedal cycle, person on outside of vehicle, nontraffic accident, while working for income and meanwhile elsewhere in ICD-10 W65.40 Drowning and submersion while in bath-tub, street and highway, while engaged in sports activity X35.44 Victim of volcanic eruption, street and highway, while resting, sleeping, eating or engaging in other vital activities
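The growth in code counts follows from the arithmetic of pre-coordination: a “phrase book” needs one entry per combination of axis values, while a “dictionary + grammar” stores each value once and combines them on demand. The Python sketch below illustrates the difference under stated assumptions: the four axes and their values are invented for the example (loosely echoing the wording of the codes quoted above) and are not the actual ICD-10 axes.

```python
from itertools import product
from math import prod

# Hypothetical axes; the values are illustrative and not taken from any coding scheme.
AXES = {
    "victim": ["pedal cyclist", "pedestrian", "car occupant",
               "occupant of three-wheeled motor vehicle"],
    "counterpart": ["pedal cycle", "car", "fixed object"],
    "place": ["street and highway", "home", "sports area"],
    "activity": ["working for income", "sports activity", "resting"],
}


def phrase_book(axes):
    """Pre-coordination: one stored entry for every combination of values."""
    return list(product(*axes.values()))


def dictionary_size(axes):
    """Post-coordination: each value is stored once and combined when needed."""
    return sum(len(values) for values in axes.values())


entries = phrase_book(AXES)
print(len(entries), "phrase-book entries vs", dictionary_size(AXES), "dictionary entries")
```

Adding one value to any axis adds a single dictionary entry but multiplies the phrase book, which is the pattern behind the rising counts on the slide.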

The Cost: Normalising (untangling) Ontologies Structure Function Part-whole Structure Function Part-whole

The Cost: Normalising (untangling) Ontologies Making each meaning explicit and separate PhysSubstance Protein ‘ProteinHormone’ Insulin ‘Enzyme’ Steroid ‘SteroidHormone’ ‘Hormone’ ‘ProteinHormone’ Insulin^ ‘SteroidHormone’ ‘Catalyst’ ‘Enzyme’ PhysSubstance Protein ProteinHormone Insulin Enzyme Steroid SteroidHormone Hormone ProteinHormone^ Insulin^ SteroidHormone^ Catalyst Enzyme^ …build it all by combining simple trees … ActionRole PhysiologicRole HormoneRole CatalystRole … … Substance BodySubstance Protein Insulin Steroid … Hormone = Substance & playsRole-HormoneRole ProteinHormone = Protein & playsRole-HormoneRole SteroidHormone = Steroid & playsRole-HormoneRole Catalyst = Substance & playsRole-CatalystRole Insulin → playsRole-HormoneRole Enzyme ?=? Protein & playsRole-CatalystRole
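The point of normalising is that authors maintain only simple single-parent trees plus definitions, and the multiple parents (Insulin as both a Protein and a Hormone) are inferred rather than asserted. The Python sketch below shows this with a deliberately naive classifier; it is an illustration, not how a description-logic reasoner works. The table and function names are hypothetical, the shape of the substance tree is an assumption because the slide's indentation is lost, and Enzyme is left out because the slide marks its definition as uncertain.

```python
# Asserted primitive trees: every primitive has exactly one parent (assumed shape).
PARENT = {
    "BodySubstance": "Substance",
    "Protein": "BodySubstance",
    "Steroid": "BodySubstance",
    "Insulin": "Protein",
    "HormoneRole": "PhysiologicRole",
    "CatalystRole": "PhysiologicRole",
}

# Asserted necessary facts: Insulin plays the hormone role.
PLAYS_ROLE = {"Insulin": {"HormoneRole"}}

# Defined concepts: name -> (base substance, role that must be played).
DEFINED = {
    "Hormone": ("Substance", "HormoneRole"),
    "ProteinHormone": ("Protein", "HormoneRole"),
    "SteroidHormone": ("Steroid", "HormoneRole"),
    "Catalyst": ("Substance", "CatalystRole"),
}


def is_a(concept, ancestor):
    """Walk up the single-parent tree looking for `ancestor`."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


def roles_of(concept):
    """Roles asserted on the concept or inherited from its ancestors."""
    roles = set()
    while concept is not None:
        roles |= PLAYS_ROLE.get(concept, set())
        concept = PARENT.get(concept)
    return roles


def classify(primitive):
    """Defined concepts whose definition the primitive satisfies."""
    return sorted(
        name for name, (base, role) in DEFINED.items()
        if is_a(primitive, base) and role in roles_of(primitive)
    )


print(classify("Insulin"))
```

Here Insulin is asserted under Protein only, yet it is classified under both Hormone and ProteinHormone; Steroid, which has no asserted role, falls under none of the defined concepts.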

But none of it works without tools None of it works without communication & cooperation

Communicating software environments “Environments” rather than “servers” Clinical users - care and review Environments for entering & retrieving information Methodologies for measuring and monitoring quality of information Human factors, language technology, fractal tailoring to needs Application developers Configuration tools – much more than “terminology servers” The key to success Ontology authors Tools for distributed loosely coupled authoring Ontology managers (the “gurus”) Tools for reconciliation, change management, & meta-authoring of templates

Summary of Arguments The priorities are clinical needs supported by applications supported by terminology Unless they serve clinical needs, applications are useless Unless they serve applications, terminologies are useless Unless used reliably, terminologies are meaningless “Meaning is a social construct” Clinical quality should be our watchword Useful and usable to: clinical users, developers, ‘reviewers’, authors Requires models of use & clinical significance Requires tools and environments In an open evolving world, open managed evolution is the only plausible way forward Participation and control are the issues – not money Current technology gives us the opportunity to cope If we let development follow need If we use them to the full 19th century methods won’t cope with 21st century problems
