Presentation is loading. Please wait.

Presentation is loading. Please wait.

Developing Biomedical Ontologies in OWL Alan Rector School of Computer Science / Northwest Institute of Bio-Health Informatics with.

Similar presentations


Presentation on theme: "Developing Biomedical Ontologies in OWL Alan Rector School of Computer Science / Northwest Institute of Bio-Health Informatics with."— Presentation transcript:

1 Developing Biomedical Ontologies in OWL Alan Rector School of Computer Science / Northwest Institute of Bio-Health Informatics rector@cs.man.ac.uk with special acknowledgement to Jeremy Rogers www.co-ode.org www.clinical-escience.org www.opengalen.org www.clinical-escience.org www.opengalen.org © University of Manchester 2007 Licensed under Creative Commons Attribution Non-commercial License http://creativecommons.org/licenses/by-nc/2.0/uk/

2 2 Tools and downloads Protege4Alpha –protege.stanford.edu –http://protege.stanford.edu/download/prerelease-alpha/protege-bin- 4.0-alpha.ziphttp://protege.stanford.edu/download/prerelease-alpha/protege-bin- 4.0-alpha.zip GraphViz - required for OWLViz –http://www.graphviz.org/http://www.graphviz.org Tutorial handouts and Ontologies –http://www.cs.man.ac.uk/~rector/tutorials/Medinfo-2007http://www.cs.man.ac.uk/~rector/tutorials/Medinfo-2007 Preparatory material –If you haven’t done so already, please read the introduction to OWL and Protege-OWL –http://www.co-ode.org/resources/tutorials/protege-owl-tutorial.phphttp://www.co-ode.org/resources/tutorials/protege-owl-tutorial.php 2

3 3 Goals for Today Improved practical understanding of OWl as it applies to biomedical and clinical ontologies Basic notions from Ontology as well as OWL proper –Standard notions from common “upper ontologies” Used rather than explained in detail –but we will be happy to discuss Basic notions of ontology modularisation Some useful patterns for biomedical problems 3

4 Topics for today Very Brief Review of OWL - naive versions of “infection” and “drugs” to illustrate principles of OWL Motivation for broader approach Normalisation & Why Classify? –Why use a computable subset of logic? Modularisation - doing it in layers Anatomy, parts and Disorders - a less naive version of “pneumonia” Quantities and Units - if time Normal, NonNormal & Pathological –Using negation –“Situations” –OWL, XML & UML Summary

5 © University of Manchester 5 Now: How to do it in OWL - A quick review ►Existential qualifiers (SOME) ►Universal qualifiers (ONLY) ►Open World Assumption ►Negation as inconsistency (”unsatisfiability”) ►What is neither provable nor provably false is unspecified (”unknown”) ►Load infections-only.owl ►Trivially simply model for illustration only ►Define a bacterial infection ►Define a viral infection ►Define a mixed (bacterial-viral) infection 5

6 © University of Manchester 6 Basic OWL (In Manchester Syntax) ►Classes: B subclassOf A ►B --> A ►All Bs are As ►All pneumococci are bacteria ►without exception. ►Restrictions - “links between classes” ►SOME (Existential / someValuesFrom) All As have property P with value from V ►A --> have_p SOME V ►All As have_p some kind of V ► Infection --> has_cause SOME Micro_organism ►ONLY (Universal, allValuesFrom) ►A --> have_p ONLY V ►All as have_p only kinds of V ►Infection --> has_cause ONLY Micro_organism 6

7 © University of Manchester 7 Basic OWL ►Individuals ►Individual mike TYPE Person ►“Mike” is a member of the class Person ►“Mike is a person” ►Facts ►mike has_brother VALUE john ►“Mike has a brother John” ►Restrictions on individuals ►make TYPE (has_brother SOME Person) ►“Mike has some brother(s) (but we may not know who they are)” 7

8 © University of Manchester 8 OWL-DL is disguised first order logic ►All statements that appear to be about classes in OWL are really disguised statements about all members of those classes. ►B subclassOf A means “All Bs are As” ► ∀ x. B(x) ➔ A(x) ►A has SOME B means “All As have some B” ► ∀ x.A(x) ➔∃ y. B(y) & has(x,y) ►A has ONLY B means “All As have only B” ► ∀ xy. A(x) & has(x,y) ➔ B(y) ►Cannot talk about classes directly ►(No meta-knowledge in OWL-DL) 8

9 © University of Manchester 9 Primitive and Defined classes ►Primitives ►Named things that are described ►Lung ➔ is_contained_in SOME Chest ►but so are lots of other things ►Often related to “natural kinds” ►Defined classes ►Things for which we have a sufficient (and necessary) definition ►Pneumococcal_pneumonia ⇔ Pneumonia AND is_caused_by SOME Pneumococcus ►“ANYTHING that is a pneumonia and is caused by pneumoccocus is a ‘pneumococcal pneumonia’

10 © University of Manchester 10 Domain and Range constraints ►Domain and range constraints are really disguised universals ►has_part DOMAIN Anatomical_structure RANGE Anatomical_structure means ►Thing has_part ONLY Anatomical_structure Thing is_part_of ONLY Anatomical_structure ►BEWARE: Domain and Range constraints can effect classification - ►A common source of errors

11 © University of Manchester 11 Lack of Unique Name Assumption ►In OWL, two entities may be the same until we distinguish them ►say they are disjoint or different ►For classes ►Micro-organism Bacterium Virus ►DISJOINT( Bacteria, Virus) ►Otherwise may overlap ►For individuals ►allDifferent( male, female ) 11

12 © University of Manchester 12 Open World Reasoning ►Everything is true unless it can be proved false ►False means “probably false” ►“Open World Assumption” ►All OWL definitions and descriptions implicitly contain “amongst other things” ►“Pneumococcal pneumonia is a pneumonia that, amongst other things, is caused by pneumococcus” ►Most other systems use “negation as failure” ►If you can’t find it, it must be false ►“Closed World Assumption” 12

13 © University of Manchester 13 The OWL Reasoner ►Organises the subclass ( “subsumption”) hierarchy according to the definitions (and other axioms) ►A specialised theorem prover ►AKA “classifier” ►Several available ►FaCT++ and Pellet are built into Protege4Alpha ►FaCT++ is (usually) faster ►Pellet is more robust ►Pellet will soon have better debugging facilities 13

14 © University of Manchester 14 Part 1B: Review of building a simple ontology in OWL 14

15 © University of Manchester 15 Basic steps in building an ontology:1 Gather your terms Lung Pneumonitis Pneumonia Chest Rales Bacteria Virus Pneumococcus Haemophylus Antibiotic Infection

16 © University of Manchester 16 Basic steps in building an ontology 2: Organise your terms 53 LungPneumonitis Pneumonia Chest Rales Bacteria Virus Pneumococcus Haemophylus Antibiotic Micro-organism Infection

17 © University of Manchester 17 Basic steps in building an ontology 3 Represent your terms ►Perhaps first using mind maps or CMAPS ►Then when you understand them in OWL ►Including definitions ►And not throwing away the original source material ►... So now on to OWL ►First, A simplistic version of Infections 17

18 © University of Manchester 18 Very simple version ►Open Infections_first_start_here_first.owl ►One process: ►Infection ►A few organisms ►kinds of Bacterium and Virus ►One property and its inverse ►causes/is_caused_by ►Task ►Define different kinds of infections 18

19 © University of Manchester 19 Bacterial infection ►Any infection caused by some bacterium ►Note that it is a defined (equivalent) class.

20 © University of Manchester 20 Viral infection

21 © University of Manchester 21 Mixed infection ►How will bacterial infection, viral infection, and mixed infection classify? ►Run the classifier and see

22 © University of Manchester 22 After classification Why?

23 © University of Manchester 23 A “Pure bacterial infection” ►A pneumococcal infection ►How will these classify ►Is a pneumococcal and haemophylus infection a kind of pure bacterial infection? - Both are bacteria.

24 © University of Manchester 24 Why not? 24

25 © University of Manchester 25 Closure Axioms

26 © University of Manchester 26 Trivial satisfiability Two common errors ►How will these classify? Why? 26

27 © University of Manchester 27 After classification (Explain it)

28 © University of Manchester 28 Trivial satisfaction ►Bacterium and Virus are disjoint ►Nothing is both a bacterium and a virus ►owl:Nothing = (Bacterium AND Bottom) ►ONLY NOTHING ↔ NOT SOME THING ►Infection THAT is_caused_by ONLY Nothing = Infection THAT NOT (is_caused_by SOME Thing) ►ONLY does not mean SOME ►Infection THAT is_caused_by ONLY Bacterium = Infection THAT NOT (is_caused_by SOME NOT Bacterium) ►No cause given. There may be a cause, as long as it is a ►Therefore: An infection not caused by anything is a kind of infection not caused by anything except bacteria. ►Check definition for “Pure bacterial infection”

29 © University of Manchester 29 Exercise ►Now let’s do it with drugs ►We’ll use a simple version that ignores strengths ►Clinical_drug has_ingredient SOME Ingredient_substance ►We’ll provide a more complete version later in the tutorial ►Load the “simplistic-drugs.owl” - you should see... 29

30 © University of Manchester 30 The exercise: Extend the ontology to... ►Define a Clinical_drug containing Aspirin ►Define a Clinical _drug containing Codeine ►Define a Clinical_drug containing Aspirin and Codeine ►Check where it classifies wrt Drugs containing Aspirin and Drugs containing Codeine ►Define a Clinical drug containing only Aspirin ►Check where it classifies ►Define a Clinical drug containing more than one ingredient ►Which drugs classify under it ►Remove the disjoint axioms and see what happens

31 © University of Manchester 31 Part 2: Broader Motivation 31

32 © University of Manchester Where I come from Best Practice Clinical Terminology Data Entry Clinical Record Clinical research & Decision Support Best Practice Data Entry Electronic Health Records GALEN Ontologies & Description logics Healthcare Mr Ivor Bigun Dun Roamin Anytown Any country 4431 3654 90273 Clinical research Decision Support & Knowledge Presentation

33 By way of User Centred Design Knowledge Representation / ontologies was a solution, not a goal

34 ... and EHRs Three Resources Developed independently reusable reference information resources Metadata interface Concept System Model (‘Ontology’) Information Model (EHR Model, Archetypes) Inference Model (Guideline Model) ‘Contingent’ Domain Knowledge General Domain Knowledge Individual Patient Records "Abstraction" Each with –Model –Knowledge/ content –Metadata –Interfaces to the others

35 “Ontologies” in Information Systems What information systems can say and how - “Models of Meaning” –Mathematical theories - although usually weak ones evolved at the same time as Entity Relation and UML style modelling Managing Scalabilty / complexity - “Knowledge driven systems” –Housekeeping tools for expert systems Organising complex collections of rules, forms, guidelines,... Interoperability –The common grounding information needed to achieve communication –Standards and terminology Communication with users –Document design decisions Testing and quality assurance –sufficient constraints to know when it breaks –Empower users to make changes safely –... but “They don’t make the coffee” –just one component of the system / theory 8

36 36 My definition of an ontology Short version: “a representation of the shared background knowledge for a community” Long version: “an implementable model of the entities that need to be understood in common in order for some group of software systems and their users to function and communicate at the level required for a set of tasks”... and “it doesn’t make the coffee” Just one of at least three components of a complete system 36

37 Solution space –Ontologies –Information Models –Logics –Rules –Frames –Planners –Logic programming –Bayes nets –Decision theory –Fuzzy sets –Open / closed world –… Problem space –Answer questions –Advising on actions –Hazard monitoring –Creating forms –Discovering resources –Constraint actions –Assess risk –…

38 Solution space –Ontologies –Information Models –Logics –Rules –Frames –Planners –Logic programming –Bayes nets –Decision theory –Fuzzy sets –Open / closed world –… Problem space –Answer questions –Advising on actions –Hazard monitoring –Creating forms –Discovering resources –Constraint actions –Assess risk –…

39 Problem space & solution space `` Guidelines, Patterns, Tools Problem space Solution space

40 Why use a Classifier? To compose concepts –Allow conceptual lego To avoid combinatorial explosions –Keep bicycles from exploding To manage polyhierarchies –Adding abstractions (“axes”) as needed –Normalisation Untangling –labelling of “kinds of is-a” To manage context –Cross species, Cross disciplines, Cross studies To check consistency and help users find errors

41 © University of ManchesterAssertion: ►Let the ontology authors ►create discrete modules ►describe the links between modules ►Let the logic reasoner ►Organise the result ►Let users see the consequences of their actions ►Very few people can do logic well ►And almost none quickly The arrival of computable logic-based ontologies/OWL gives new opportunities to make ontologies more manable and modular

42 Fundamental problems: Enumeration doesn’t scale

43 The scaling problem: The combinatorial explosion Predicted Actual It keeps happening! –“Simple” brute force solutions do not scale up! Conditions x sites x modifiers x activity x context→ –Huge number of terms to author –Software CHAOS

44 Combination of things to be done & time to do each thing Effort per term Things to build Terms and forms needed –Increases exponentially Effort per term or form –Must decrease to compensate To give the effectiveness we want –Or might accep t What we would like What we might accept

45 The exploding bicycle 1972 ICD-9 (E826) 8 READ-2 (T30..) 81 READ-3 87 1999 ICD-10 ……

46 1999 ICD10: 587 codes V31.22 Occupant of three-wheeled motor vehicle injured in collision with pedal cycle, person on outside of vehicle, nontraffic accident, while working for income W65.40 Drowning and submersion while in bath-tub, street and highway, while engaged in sports activity X35.44 Victim of volcanic eruption, street and highway, while resting, sleeping, eating or engaging in other vital activities V31.22 Occupant of three-wheeled motor vehicle injured in collision with pedal cycle, person on outside of vehicle, nontraffic accident, while working for income W65.40 Drowning and submersion while in bath-tub, street and highway, while engaged in sports activity X35.44 Victim of volcanic eruption, street and highway, while resting, sleeping, eating or engaging in other vital activities

47 Defusing the exploding bicycle: 500 codes in pieces 10 things to hit… –Pedestrian / cycle / motorbike / car / HGV / train / unpowered vehicle / a tree / other 5 roles for the injured… –Driving / passenger / cyclist / getting in / other 5 activities when injured… –resting / at work / sporting / at leisure / other 2 contexts… –In traffic / not in traffic V12.24 Pedal cyclist injured in collision with two- or three-wheeled motor vehicle, unspecified pedal cyclist, nontraffic accident, while resting, sleeping, eating or engaging in other vital activities

48 Conceptual Lego… it could be... Goodbye to picking lists… Structured Data Entry File Edit Help What you hit Your Role Activity Location Cycling Accident

49 Intelligent Forms

50 Integrating rather than Cross Mapping

51 And generate it in language

52 global cycle & work at centre local cycles: work by users Supports Loosely coupled distributed ontology development From authoring to meta-authoring From 80% central/global effort to 10% central/global effort User effort cut by 75% compared with manual methods Mostly in reduced committee meetings & arguments

53 The means: Logic as the clips for “Conceptual Lego” hand extremity body acute chronic abnormal normal ischaemic deletion bacterium polymorphism cell protein gene infection inflammation Lung expression virus mucus polysacharide

54 Logic as the clips for “Conceptual Lego” “ SNPolymorphism of CFTRGene causing Defect in MembraneTransport of Chloride Ion causing Increase in Viscosity of Mucus in CysticFibrosis …” “Hand which is anatomically normal”

55 © University of Manchester 55 Part 3: Normalisation & Classification 55

56 The means: Logic as the clips for “Conceptual Lego” hand extremity body acute chronic abnormal normal ischaemic deletion bacterium polymorphism cell protein gene infection inflammation Lung expression virus mucus polysacharide

57 Logic as the clips for “Conceptual Lego” “ SNPolymorphism of CFTRGene causing Defect in MembraneTransport of Chloride Ion causing Increase in Viscosity of Mucus in CysticFibrosis …” “Hand which is anatomically normal”

58 Build complex representations from modularised primitives Genes Species Protein Function Disease Protein coded by gene in humans Function of Protein coded by gene in humans Disease caused by abnormality in Function of Protein coded by gene in humans Gene in humans

59 © University of Manchester Normalising (untangling) Ontologies Structure Function Part-whole Structure Function Part-whole

60 © University of Manchester Rationale for Normalisation ►Maintenance ►Each change in exactly one place ►No “Side effects” ►Modularisation ►Each primitive must belong to exactly one module ►If a primitive belongs to two modules, they are not modular. ►If a primitive belongs to two modules, it probably conflates two notions ►Therefore concentrate on the “primitive skeleton” of the domain ontology ►Parsimony ►Requires fewer axioms

61 © University of Manchester Normalisation and Untangling Let the reasoner do multiple classification ►Tree ►Everything has just one parent ►A ‘strict hierarchy’ ►Directed Acyclic Graph (DAG) ►Things can have multiple parents ►A ‘Polyhierarchy’ ►Normalisation ►Separate primitives into disjoint trees ►Link the trees with restrictions ►Fill in the values

62 Untangling and Enrichment Using a classifier to make life easier Substance - Protein - - ProteinHormone - - - Insulin - Steroid - - SteroidHormone - - - Cortisol - Hormone - -ProteinHormone - - - Insulin - - SteroidHormone - - - Cortisol - Catalyst - - Enzyme - - - ATPase - PhsioloicRole - - HormoneRole - - CatalystRole - Substance - - Protein - - - Insulin - - - ATPase - Steroid - - Cortisol Hormone ≡ Substance & playsRole - someValuesFrom HormoneRole ProteinHormone ≡ Protein & playsRole someValuesFrom HormoneRole SteroidHomone ≡ Steroid & playsRole someValuesFrom HormoneRole Catalyst ≡ Substance & playsRole someValuesFrom CatalystRole Enzyme ≡ Protein & playsRole someValuesFrom CatalystRole Insulin → playsRole someValuesFrom HormoneRole Cortisol → playsRole someValuesFrom HormoneRole ATPase → playsRole someValuesFrom CatalystRole Substance - Protein - - ProteinHormone - - - Insulin - - Enzyme - - - ATPase - Steroid - - SteroidHomone^ - - - Cortisol -Hormone - - ProteinHormone^ - - - Insulin^ - - SteroidHormone^ - - - Cortisol^ - Catalyst - - Enzyme^ - - - ATPase^

63 © University of Manchester Modularised into structure and function ontologies (all primitive)

64 © University of Manchester Unified ontology after classification

65 © University of Manchester 65 Module Structure Module Structure NB: Examples are in the tutorial ontology folder

66 © University of Manchester Normalisation: Criterion 1 The skeleton should consist of disjoint trees ►Every primitive concept should have exactly one primitive parent ►All multiple hierarchies the result of inference by reasoner

67 © University of Manchester Normalisation Criterion 2: No hidden changes of meaning ►Each branch should be homogeneous and logical (“Aristotelian”) ►Hierarchical principle should be subsumption ►Otherwise we are “lying to the logic” ►The criteria for differentiation should follow consistent principles in each branch eg. structure XOR function XOR cause

68 © University of Manchester Normalisation Criterion 3 Distinguish “Self-standing” and “Refining” Concepts “Qualities” vs Everything else ►Self-standing concepts ►Roughly Welty & Guarino’s “sortals” ►person, idea, plant, committee, belief,… ►Refining concepts – depend on self-standing concepts ►mild|moderate|severe, hot|cold, left|right,… ►Roughly Welty & Guarino’s non-sortals ►Closely related to Smith’s “fiat partitions” ►Usefully thought of as Value Types by engineers ►For us an engineering distinction…

69 © University of Manchester Normalisation Criterion 3a Self-standing primitives should be globally disjoint & open ►Primitives are atomic ►If primitives overlap, the overlap conceals implicit information ►A list of self-standing primitives can never be guaranteed complete ►How many kinds of person? of plant? of committee? of belief? ►Can’t infer: Parent & ¬sub 1 &…& ¬sub n-1  sub n ►Heuristic: ►Diagnosis by exclusion about self-standing concepts should NOT be part of ‘standard’ ontological reasoning

70 © University of Manchester Normalisation Criterion 3b Refining primitives should be locally disjoint & closed ►I ndividual values must be disjoint ►but can be hierarchical ►e.g. “very hot”, “moderately severe” ►Each list can be guaranteed to be complete ►Can infer Parent & ¬sub 1 &…& ¬sub n-1  sub n ►Value types themselves need not be disjoint ►“being hot” is not disjoint from “being severe” ►Allowing Valuetypes to overlap is a useful trick, e.g. ►restriction has_state someValuesFrom (severe and hot)

71 © University of Manchester Normalisation Criterion 4 Axioms ►No axiom should denormalise the ontology No axiom should imply that a primitive is part of more than one branch of primitive skeleton ►If all primitives are disjoint, any such axioms will make that primitive unsatisfiable ►A partial test for normalisation: ►Create random conjunctions of primitives which do not subsume each other. ►If any are satisfiable, the ontology is not normalised

72 © University of Manchester A real example: Build a simple treee easy to maintain

73 © University of Manchester Let the classifier organise it

74 © University of Manchester If you want more abstractions, just add new definitions (re-use existing data) “Diseases linked to abnormal proteins”

75 © University of Manchester And let the classifier work again

76 © University of Manchester And again – even for a quite different category “Diseases linked genes described in the mouse”

77 © University of Manchester Summary: Why Normalise? Why use a Classifier? ►To compose concepts ►Allow conceptual lego ►To manage polyhierarchies ►Adding abstractions (“axes”) as needed ►Normalisation ►Untangling ►labelling of “kinds of is-a” ►To avoid combinatorial explosions ►Keep bicycles from exploding ►To manage context ►Cross species, Cross disciplines, Cross studies ►To check consistency and help users find errors

78 © University of Manchester 78 Exercises ►Load the ontology from: normalisation_unfactored/ Normalisation-demo-hormones-start.owl ►Create a notion of a “biological role” and use it to untangle the hierarchy as per the demonstration ►Result when classified should be roughly as on next slide ►Add the role of a neurotransmitter Add a new type of structure Catecholamine ►Add Norepinephrine so that it is both a neurotransmitter and a hormone and has the chemical structure of a catecholamine. 78

79 © University of Manchester 79 Original & example finished hierarchies Note defined classes in finished version!

80 © University of Manchester Part 4 Modularisation: towards assembling ontologies from reusable fragments

81 © University of Manchester Why use modules ►Re-use ►e.g. annotations, quantities, upper ontologies ►Coherent extensions ►Localisation & Views ►Local normal ranges, value sets, etc. under generic headings ►Experimentation and add ins ►e.g. add in tutorial examples without corrupting basic structure ►Logical separation ►e.g. avoid confusing medicine and medical records ►...but managing modularised ontologies is more work ►More things to remember. ►More things to get wrong ►Easy to put something in the “wrong” module

82 © University of Manchester Modules and imports ►Key notions: ►“Base URI” - the identifier for the ontology ►In the form of a URI but really just an ID ►Used by the import mechanism to identify the module ►Physical location ►Where the module is actually stored. ►usualy your local directory for this version of the ontology ►Our conventions: ►Ontologies stored as sets of modules in a single directory ►“Start-Here.owl” tells you what to load and load everything else. ►The “Active ontology” is the one you are editing ►Active ontology items are shown in bold 91

83 © University of Manchester Protege-OWL import mechanism ►Importer looks for a file with the correct identifying “Base URI” ►Written into the header of the XML ►NOT the physical location ►Order of search ►(Specified file) ►The local directory ►Local libraries ►Global libraries ►The internet at the site indicated by the Base URI 93

84 © University of Manchester 84 Example: Factoring the normalisation demo ►Load the ontology: Normalisation-factored/START-HERE.OWL ►If it isn’t up by default, select Views ➔ Ontology views ➔ OWLViz Imports Graph and place it below Imported Ontologies ►You should see...

85 © University of Manchester 85 To select the active ontology, either ►Select it from the menu at the top ►..or right click/CMND-click on the graph (active ontology is in blue) 85

86 © University of Manchester 86 Entities for which there is information in the active ontology are in bold Structure.owl ActiveStructure-plus-function.owl Active

87 © University of Manchester 87 URIs of Entities and location of axioms shows when you hover over them Insulin “originates” in Structure.owl plays_bio_role some Catalyst role was asserted in Structure-plus-function.owl

88 © University of Manchester Items from active ontology are in bold

89 © University of Manchester Load the tutorial ontology ►Open.../biomedical-tutorial-2007/Start-Here.owl 94

90 © University of Manchester Typical pattern Views ➔ Ontology views ➔ OwlViz Imports

91 © University of Manchester Module list Module list ►Annotation - the annotation properties needed ►data-entities.owl ►quantities, units & numbers ►very-top - the upper ontology ►in this case adapted for biomedicine ►Anatomy, Physiology, Biochemistry, Organism ►The main topics ►Qualities ►The basic qualities of those topics ►Disorders ►General patterns for disorders ►Specific disorders ►Examples for this tutorial ►Situations ►Disorders in context of patients and observations 96

92 © University of Manchester Part 5 Anatomy and Disorders Using infections to define pneumonias ►Disorders have a locus in an anatomical structure of physiological process ►Disorder has_locus SOME Anatomical_structure ►Disorders are anything which is described as pathological ►has_normality_quality SOME Pathological ►To be explained in detail late ►Parts and wholes ►A whole field “mereology” ►Multiple views - functional / cinician’s view different from structural / anatomist’s view. 106

93 © University of Manchester 93 Plan of this section ►Develop the definition of “Pneumonia” within a common upper ontology framework ►starting with a naive definition ►Illustrate issues ►Improving it iteratively ►Exercises ►To do similarly with Hepatitis and its variations ►To do similarly with Fracture and variants ►Then to move on to “codes” and “Situations” ►Side benefit ►Introduce by example key notions from common upper ontologies - DOLCE and BFO 93

94 © University of Manchester Examples ►Backup your ontology directory ►Make “disorders.owl” the active ontology ►Create a new ontology in the same frame named “my- disorders” ►File new ►When pop-up asks about a new frame say NO ►When asked for a name edit the end of the URI to “my-ontology.owl” ►When asked where to store it, browse to your current directory ►Press finish ►NB This does NOT save your ontology!

95 © University of Manchester Import disorders.owl ►Go to the Active Ontology Tab ►Click the plus icon for imported ontologies ►Select import an ontology that has already been loaded ►Select disorders.owl and press finish

96 © University of Manchester Task: Make Pneumonitis and Pneumonias in various variations ►Question 1: What is “Pneumonia” and what is “Pneumonitis” ►Look it up ►e.g. Google define: pneumonitis ►Write your own paraphrases: ►“Pneumonitis” is an “Inflammation of the lungs” ►“Pneumonia” is an “inflammation of the lungs caused by an infection” ►Many definitions on the web, but this summarises them for our purposes.

97 © University of Manchester First defintion of pneumonitis ►“Inflammation of the lung” ►Find Inflammation ►CTRL or CMND F in class hierarchy 110

98 © University of Manchester Create “Pneumonitis” ►Create a new subclass of Inflammation ►In the comment box type something like “Pneumonitis” = “Inflammation of lung” ►ALWAYS add a free text paraphrase of what you are modelling ►Add the restriction ►has_locus SOME Lung ►Make it a defined class ►CTRL/CMD-D.

99 © University of Manchester Create pneumonia ►“Pneumonia” is a pneumonitis is the outcome of an infection ►In this ontology we use “is_outcome_of” for “cause” 112

100 © University of Manchester Bacterial pneumonia ►First attempt ►“Pneumonia caused by a bacteria” ►But need to rephrase to fit the ontology ►“Pneumonia that is the outcome of an infection by bacteria” ►In this ontology “by” translates to the property “has_actor” ►Processes have actors and objects

101 © University of Manchester By analogy make viral pnemonia and mixed pneumonia ►Mixed pneumonia is a pneumonia that is caused by both virus and pneumonia ►How to say this

102 © University of Manchester Classify and check ►Be sure that all classes are defined ► defined ► primitive ►To convert from primitive to defined, cmnd-d or ctrtl-d (Mac or PC) 115

103 © University of Manchester Should get 116

104 © University of Manchester 104 WARNING: How not to do it! ►Must get the “and” in the correct place ►wrong: has_actor SOME (Virus AND Bacterium) ►Create a class Probe_Mixed_pneumonia_wrong using ►... Infection THAT has_actor SOME (Virus AND Bacterium) ►it is inconsistent ►Why? ►Think of it in terms of SQL queries for “Find me all bacterial and viral disorders” ►The same issue ►“Probes” are useful for testing 104

105 © University of Manchester What about “left lower lobe pneumonia”? ►First define lobar pneumonia as ►Pneumonia that has locus in a lobe of a lung ►“Lobe THAT is_subdivision_of SOME Lung” ►But what then is a disorder of the lung ►Disorder THAT has_locus SOME Lung ►But what if I define an inflammation of a lobe of the lung ►Inflammation THAT has_locus SOME (Lobe THAT is_subdivision_of SOME Lung) ►The classifier ought to organise it for us ►... but it doesn’t.

106 © University of Manchester OWL means what it says ►Lobes are not lungs! ►Our definition of lung disorder is too narrow ►Almost always Disorders of parts are disorders of the whole ►A broader definition of “Disorder_of_lung” ►Disorder THAT has_locus SOME (Lung OR is_clinical_part_of SOME Lung) ►Almost OK, but still Inflammation of lobe of lung is not a pneumonitis

107 © University of Manchester Make the pattern consistent ►Redefine Pneumonitis “An inflammation of the lung or any clinical part of the lung of the lung” ► 121

108 © University of Manchester Almost correct, but... ►What about “Bronchitis” ? ►An inflammation of the bronchi (or any of their parts) ►Try it and see. ►Definition of “Pneumonitis” is now too broad ►Not just any part of the lung, but the “subdivisions” of the lung ► lobes, quadrants, bases, apices, etc.

109 © University of Manchester The property hierarchy allows multiple views ►The bronchus is a “component” of the lung ►The lobe is a subdivision of the lung ►Redefine pneumonitis as an inflammation of the lung or a subdivision of the lung ► 123

110 © University of Manchester Now reclassify ►Bronchitis is now a disorder of the lung ( “lung disease”) but not a pneumonitis ►As required.

111 © University of Manchester 111 Alternative / Addtional Solution ►Pneumonitis is an inflammation of the Parenchyma of the lung ►Inflamation THAT has_locus SOME (Parenchyma THAT constitutes SOME (Lung OR is_clinical_part_of SOME Lung)) ►Shifts “work” from properties to classes ►Advantage - fewer properties ►... but still need clinical parts and constituents ►Disadvantage - more embedding 111

112 © University of Manchester Clinical partonomy and pleuritis ►To an anatomist, ►the pleura are different organs from the lungs ►To a clinician, ►“Pleuritis”should be classified as a “Lung disease” or “Disorder of the lung” ►“Pleuritis” - Inflammation of the pleura ►The Pleura ► function as part of the lung ►even though they are not physically part of the lung ►The property hierarchy copes with both views. ►Anything that is structurally a part of something is a clinical part of it ►Anything that is functionally a part of something is a clinical part of it ►etc. ►BUT NOT VICE VERSA.

113 © University of Manchester Also affects modularity ►We have chosen to model functional parts with physiology rather than with anatomy ►To stick with the FMA view as far as possible in the Anatomy module. ►So we add the fact that the pleura are functional parts of the Lung in the physiologic_processes module rather than the anatomy module ►Might even have a separate functional module ►We can add information to a class in a new module Additions in physiological_ processes.owl

114 © University of Manchester Create pleuritis and classify ►Classify and check results ►A disorder of the lung but not a bronchitis or pneumonitis. ►as required ►Anatomists & Clinicians can each have their own view

115 © University of Manchester Part 5b: Normality and Negation ►What does it mean to be normal or abnormal? ►To have a disease ►We implement two notions - ►NonNormal - anything noteworthy ►Pathological - requiring medical intervention ►(including “watchful waiting” or an active decision not to intervene) ►GALEN used “Intrinsically pathological” but not needed in OWL ►Basic rules ►Pathological ➔ nonNormal ►Normal = NOT nonNormal ►nonPathological = NOT pathological 128

116 © University of Manchester Normality and negation ►Basic rules ►Pathological ➔ nonNormal ►Normal = NOT nonNormal ►nonPathological = NOT pathological ►Remember ►subclassOf means “necessarily implies” ►so Pathological is a subclass of nonNormal ►See the definitions of Normal-nonNormal_quality in disorders.owl ►Let the classifier do the work... ►Make disorders.owl the active ontology ►Look at definitions under Normal_nonNormal_quality ►then classify ►examine using OwlViz

117 © University of Manchester Defining “disease” or “disorder” ►Hard, probably futile ►The words are used in many different ways ►Things referred to cross ontological boundaries ►Lesions - e.g tumours ►Processes - e.g. infection or inflammation ►Qualities - e.g. obstruction, malformation, elevation,... ►Best just to say what is pathological ►let the classifier gather them up ►Also classify along multiple dimensions ►include as many abstractions as are useful, no more and no less 130

118 © University of Manchester Example from tutorial ontology: active ►Top Level ►Expanded

119 © University of Manchester Exercises ►Commented version of my_ontologies.owl is in ►Specific_disorders.owl ►Make that the active ontology and compare ►From either Specific_disorders.owl or your my- ontologies.owl ►Create a further myOntologies2 ►Define Hepatitis, Infectious Hepatitis, Alcoholic Hepatitis ►Be sure that the definition excludes Cholangitis (inflamation of the billiary tract which is, in part, part of the liver ►Define fractures of the Lower limb, Fracture of femur, and Fracture of the anatomical neck of the femur so that they subsume correctly. ►Be sure that all Fractures are “pathological” ►Distinguish between fractures caused by Trauma and Fractures not caused by trauma ►Note the labelling problem ►In this what is usually called a “pathological fracture” is paraphrased as a “non-traumatic fracture”

120 © University of Manchester 120 Part 6 Quantities and Units ►Fundamental to any clinical or scientific application ►Quantities are first class entities ►The value of a lab test is a quantity not a number ►Safety demands that quantities be complete with a magnitude and unit ►Otherwise errors occur ►whether for Mars landers or chemotherapy doses ►Doubly important because almost all units are standard ►but just enough are not to allow errors 120

121 © University of ManchesterQuantities ►As real as numbers, matrices, or any other mathematical structure ►“Naked” numbers rarely suitable for healthcare IT ►Too much chance of error ►Mars landers have failed and patients have died ►Distinguish “naked numbers” from “pure numbers” - e.g. percentages, universal constants etc ►Quantities have ►magnitude ►units ►dimension ►dimension and units must be compatible ►dimension is usually indicated by units ►Time and Duration ►Note that Dates and times and temporal intervals (temporal deictics) are not quantities ►The difference between two lengths is a length ►The difference between two times is a duration ►Date-times and temporal intervals require separate mechanisms ►Duration is a quantity.

122 © University of Manchester Quantities and Units ►Make data-entities.owl the active ontology ►Right-click or cmd-click in the OWLViz View ►Select from the menu at the top of the screen

123 © University of Manchester Simple structure in tutorial ontology ►Dimension implicit in classification of quantities ►unit an object ►Classifier forces subsumption of units to track subsumption of quantities ►magnitude a number ►“int” is artifact because currently some classifiers still only handle ints should be number

124 © University of Manchester Quantities and Units which is “primitive”? ►Requirements ►Avoid maintaining the hierarchies of quantities and nits separately ►Be able to determine the legal units for a quantity ►Be able to “co-erce” the quantity according to the units ►Our solution ►All classes of units are asserted in a flat list ►Defined following the pattern ►Unit_class = is_units_of SOME Quantity_class ➔ is_units_of ONLY Quantity_class ►“Any unit for this class of quantity must be of this kind and only of this kind” ►Query for unit for a quantity ►Unit THAT is_units_of SOME quantity_class ►Quantity that uses units ►Quantity that has_units SOME Unit_class

125 © University of Manchester DL Queries

126 © University of Manchester Typical quantities ►Specific value ►Concentration_quantity THAT has_units SOME mg_per_L has_magnitude VALUE 140 ►Value range (OWL 1.1 / new version only) ►Concentration_quantity THAT has_units SOME mg_per_L has_magnitude SOME int[ >=130, <=150] ►NB units mg_per_L will cause quantity to be classified as a Mass_concentration_quantity ►Try it in DL query tab. 100

127 © University of Manchester DL Query for previous

128 © University of Manchester Example of Quantity in use ►“Hemoglobin 13 mg%” ►Serum_hemoglobin THAT has_value SOME (Concentration THAT has_magnitude VALUE 13 has_units SOME mg_per_cent) 104

129 © University of Manchester 129 Ranges ►OWL supports simple ranges ►All classifiers support integer ranges; all should support floats and other XML data type ranges ►In Manchester Syntax ►int[>=11, <=15] ►or if we use the convention of multiplying by 100 as a kluge ►int[>=1100, <=1500]. ►Can use ranges to set normals ►Often best done in a separate module to allow for localisation 129

130 © University of Manchester 130 Two levels of statement ►Equivalence: Very strong ►I can infer 11.0-13.5 from “normal” and vice versa. ►Subclass: Weaker ►I can infer normal from 11.0-13.5 but not vice versa “Any serum Hb between 11 and 13.5 is normal” ►Reminder ►Matching the equivalent class definition implies the superclasses

131 © University of Manchester Extensions to quantities ►Compatible java packages for units and conversion. ►Reported to be on the way in java 6 ►Really ought to disappear into datatypes ►but it seems unlikely, so... ►A full range of units is available in an almost compatible form from HL7 thanks to Gunther Schadow 105

132 © University of Manchester 132 Dates & times are not quantities ►Note the position of date_time_holder in the hierarchy ►The difference between two quantities is always a quantity of the same type ►The difference between two date-time points is a duration ►Date-times known as ►“Coordinates” ►Integral spaces ►The corresponding “differential space” to date-time is “duration” ►Issues of units are entirely different for date-time points as for quantities ►Although we often express date-time points by the duration since some fixed index point, e.g. msec since 1900-01-01:00- 00 132

133 © University of Manchester 133 The issue of “moles” ►Moles are the one unit that isn’t absolute ►mmPerL is meaningless without knowing what it is of ►A good example of using context in OWL ►... and of current limitations of arithmetic in OWL ►Conversion THAT is_from SOME mg to SOME gm has_magnitude VALUE 1000 ►Conversion THAT is_from SOME (mm_per_L THAT is_units_of SOME Sodium_concentration) to SOME mg_per_L has_magnitude VALUE 23 ►Moles are really a class of units ►One motivation for representing all units as classes for uniformity

134 © University of Manchester 134 EXERCISES: ►Extend the simple list of quantities and units provided in data-entity.owl ►Add mEq_per_L as a mass concentration unit ►We will ignore conversions for now. ►express serum concentration of sodium of 140mEq/L ►Express the notion of a serum sodium concentration in the range 135-145mEq/L. ►In the strong form with equivalence ►In the weaker form with subclass statements ►Add a per cent by volume quantity and appropriate units ►Express the notion of ethanol concentration of 95% by volume ►Add a speed (velocity) quantity and appropriate units ►express a velocity of 2mm/minute 134

135 © University of Manchester 135 Part VII Situations, Codes, and Negation ►What is it that should be entered in the medical record? ►What should codes represent? ►How do we capture negation in medical records and codes? ►Conjunctions and disjunctions? (”and”s and “or”s) ►How do ontologies relate to coding systems? ►Are codes just classes in the ontology? ►See also Rector et al. Medinfo 2007 http://www.cs.man.ac.uk/~rector/papers/Whats-in-a-code/ http://www.cs.man.ac.uk/~rector/papers/Whats-in-a-code/

136 © University of Manchester The Premise: Two levels of modelling Two different sorts of things ►Models of the our understanding of the world ►Models of data structures to carry the information about our understanding of the world 7 The Ontology Level: Logical statements about the world - “ All skull fractures are head injuries” “John has a skull fracture with an intracranial bleed” The EHR / Data Structure Level: Valid Specifications - “ Valid data structures for head injury must include: a topic field containing the code for head injury, a type code with the code for the specific type of head injury a intracranial bleeding field with the code for a type of intracranial bleeding or the code for some flavour of null” encode to decode as

137 © University of Manchester The coding system is derived from the ontology Diabetes Type 1 Diabetes Type 2 Diabetes Metabolic disorder Diabetes Type 1 Diabetes Type 2 Diabetes Metabolic disorder Classes in the ontology code_for_ metabolic_disorder code_for_diabetes code_for_ diabetes_type_1 code_for_ diabetes_type_2 is_subcode_of Individuals in the information model (“meta” to the ontology) Map to

138 © University of Manchester But then the question: What does a code represent? (What are the dots?) ►“Having something” ►e.g. “having diabetes”, “having a tumour”, “having a blood pressure of 130/70mmHG” ►But what is it that “has” something ►A patient (“subject”) ►as observed by an observer ►at a given time ►and place and... ►Encapsulate it as a “Situation” ►Codes represent classes of situations ►All those situations for which the code can be validly asserted Diseases, findings and disorders? –No –Separate codes for having and not having diabetes –Don’t want to say “has some disease that isn’t diabetes”

139 © University of Manchester “Clinical situations” ►The unit of medical observation ►A patient as observed by a clinician at a time in a setting ►A Patient Situation may include / not include any number of statements ►An individual Patient Situation may belong to many different classes of Patient Situation ►May be encoded by many different codes in the patient record ►Adding a code to a “event” in a record is to assert that the “event” is a instance of that class of “Situation” ►The event has at least all of the assertions comprising the individual codes ►Allows easy expression of negation and classification of results with recognition of equivalence

140 © University of Manchester 140 “Situations” ►The package unit of information in electronic health records is a “Situation” ►Adding a code to a “event” in a record is to assert that the “event” is a instance of that class of “Situation” ►The event has at least all of the assertions comprising the individual codes ►Allows easy expression of negation and classification of results with recognition of equivalence 140

141 © University of Manchester Simple example in OWL ►Ontology Level ►CLASS Situation CLASS Diabetes ➔ Metabolic_disease CLASS Diabetes_type_1 = Diabetes AND has_type VALUE type_1 ►Code Level ►INDIVIDUAL metabolic_disease TYPE Code INDIVIDUAL diabetes_code TYPE Code ➔ subcode_of VALUE metabolic_disease INDIVIDUAL diabetes_code TYPE Code ➔ subcode_of VALUE metabolic_disease_code ►Details ►See paper ►For more complete meta-modelling in OWL see Motik and Grauer, OwlEd--07 11 Generated From

142 © University of Manchester Negation: “Not some” / “Not any” ►What does it mean to say “without intracranial bleed” ►Situation does not include any intracranial bleed ►In OWL ►Situation THAT NOT (includes SOME Intracranial_bleed) ►NB: “SOME” can also be read as “any” 18

143 © University of Manchester Example - Head Injury with/without Intracranial Bleed ►Ontological Level ►Situation THAT includes SOME Head_injury ►Situation THAT includes SOME Intracranial_bleed ►Situation THAT includes SOME Head_injury AND NOT (includes SOME Intracranial_bleed) ►The reasoner works out the classification hierarchy ►Codes and subcodes can be generated from the hierarchy reliably 19

144 © University of Manchester 144 Example: Head Trauma with/without intracranial bleeding with/without Skull Fracture

145 © University of Manchester Head Injury with and without intracranial bleed, before classification and after classifica- tion

146 © University of Manchester

147 147 Classification - Note automatic inversion of negatives

148 © University of Manchester Claim 1 ►Codes correspond to classes of situations ►To give a patient a code is to say that ►“This situation for this patient is and instance of the class designated by the code” ►i.e. it is a instance of the subclass of all situations that conform to the code’s definition ►Corollaries ►Two codes are equivalent if and only if the classes of situations that they represent are logically equivalent ►One code is a subcode of another if and only if the class of situations represented by one is necessarily a subclass of the class of situations represented by the other 12

149 © University of Manchester “Findings” and “Observables” An Information Perspective ►A common distinction in medical coding systems that is inappropriate in medical ontologies. ►Principle ►Every statement should convey information ►Corollary: Every statement should differentiate a subclass of clinical situations ►There is no point in making a statement that we already know ►All statements should be true of only some situations (otherwise they are tautologies)

150 © University of Manchester Findings vs observables ►Findings ► Things that only some people have sometimes included in only some Clinical Situations ►e.g. having/not having Diabetes, Fracture, Tumour, Mumur, Headache,... ►Having a finding on its own conveys information ►Distinguishes this patient from others - Findings may have further properties that provide further information ►Observables ►Things that everybody has always included in all Clinical Situations ►e.g. having Body temperature, Height, Blood pressure,... ►Noting an observable on its own conveys no information ►Must note its value or other properties ►Observable + value = Finding ►Having an observable with a value conveys information 14

151 © University of Manchester Findings and Observables (more detail and in OWL) ►Situation THAT includes SOME Diabetes ➔ Situation ►Subclass of Situation but not equivalent to Situation ►Therefore “includes SOME Diabetes” is a finding ►conveys enough information to differentiate a subclass of situations ►Axiom Situation ➔ includes SOME Serum_K ►All situations include a Serum_K ►Whether or not we measure it ►Therefore Situation THAT includes SOME Serum_K ≡ Situation ►Equivalent to Situation ►Therefore “includes Serum_K” is an observable ►Situation THAT includes SOME (Serum_K THAT has_level SOME Elevated) ➔ Situation ►Therefore “includes Serum_K has_level SOME Elevated) can be treated as a finding - or as a observable value pair.

152 © University of Manchester Example: Elevated Potassium ►Ontology Level ►Situation THAT includes SOME (Serum_potassium_conc THAT has_level SOME Elevated) ►Situation THAT includes SOME (Serium_potassium_conc THAT has_value VALUE quant[5.1 mmPerL] ►Code Level: Two possible mappings ►Elevated serum potassium: ► ►Both map to same class of situations in ontology ►Therefore logically equivalent ►(NB: For quantitative values, not possible to enumerate as findings) ►A different finding for each possible value 16

153 © University of Manchester Note ►You cannot use the distinction as to whether two classes are equal directly in the ontology ►It is a second order (meta) fact about the classes rather than about all of their members ►therefore...

154 © University of Manchester 154 Claims 2 ►The distinction between “finding” and “observable” is part of the meta-ontology, rather than the ontology proper ►It says something about the ontology itself rather than the entities represented by the ontology ►Many classes in the ontology consist mostly of “observables”, e.g. “Quality” or “Findings”, e.g. “Conditions” but the distinction is problematic in the ontology itself ►Because an “obserable” + “value” will always be a kind of “finding” ►And the ontology (should) say that all “observables” have a value (even if we don’t yet know it) 154

155 © University of Manchester 155 Claim 3 ►The “Code-Value” debate is an artefact of mapping ontological level to EHR leve ►Choice fundamentally arbitrary at EHR level ►No choice to make at ontological level –Both forms map to the same class of situations 155

156 © University of Manchester 156 Exercise ►Build for yourself the situations for Ulcer, Ulcer with and without perforation, with and without haemorrhage ►Check that the the classification is correct ►If time... Build the situations for the following and their negations. ►Foot ulcer ►Diabetic foot ulcer ►Foot ulcer caused by Type 1 Diabetes ►Check that the classifications are correct 156

157 © University of Manchester Summary: Building Ontologies in OWL-DL ►Start with a taxonomy of primitive classes ►Should form pure trees ►Remember, to make disjointness explicit ►Use definitions and the classifier to create multiple hierarchies ►Use existential (someValuesFrom) restrictions by default ►Things will only be classified under defined classes ►Be careful with ►Open world reasoning ►Use closure axioms when needed ►“some” and “only” – someValuesFrom/allValuesFrom ►domain and range constraints ►making disjoint explicit

158 © University of Manchester Summary ►Knowledge is fractal ►Enumeration is never ending ►The power of logic / OWL is composition and classification ►Normalise ontologies for re-use and maintenance ►Build DAGs (nets) out of Trees using classification ►Use Modules to separate views and then bind them ►Diseases of the parts are diseases of the whole ►... but must be careful ►The property hierarchy can be used to support multiple views ►Some notions defy definition - e.g. “Disease” ►When in doubt describe, classify and collect result bottom up ►Much more in the comments in the tutorial ontology ►and remember: “Ontologies are just one part of a system”


Download ppt "Developing Biomedical Ontologies in OWL Alan Rector School of Computer Science / Northwest Institute of Bio-Health Informatics with."

Similar presentations


Ads by Google