Discovery Seminar 295265/SS1 – Spring 2009 Translational Pharmacogenomics: Discovering New Genetic Methods to Link Diagnosis and Drug Treatment Ontology:

1 Discovery Seminar 295265/SS1 – Spring 2009
Translational Pharmacogenomics: Discovering New Genetic Methods to Link Diagnosis and Drug Treatment
Ontology: Developing a Systematic Approach to Translational Pharmacogenomic Research Data Collection
April 15, 2009
Werner CEUSTERS
Center of Excellence in Bioinformatics and Life Sciences, Ontology Research Group
University at Buffalo, NY, USA

2 What does the word ‘ontology’ mean?
‘Ontology’ is popular
‘Ontology’ in Buffalo is famous

3 Google ‘define: ontology’:
the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects; an explicit representation of the meaning of terms in a vocabulary, and their relationships; a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them; specification of a conceptualisation of a knowledge domain; a structured information model of a domain capable of supporting reasoning by human users and software agents; a data model that represents a set of concepts within a domain and the relationships between those concepts;

4 One term, many definitions
This raises some questions: Is it possible for a term to have so many meanings? Can the authors of these definitions all be right at the same time? Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply ?

5 Is it possible for a term to have so many meanings?
Merriam-Webster on ‘bank’: one entry term, 27 occurrence types, 3 different word types (noun, verb, part of compound term).

6 Is it possible for a term to have so many meanings?
Merriam-Webster on ‘bank’

7 Is it possible for a term to have so many meanings?
Merriam-Webster on ‘bank’

8 Is it possible for a term to have so many meanings?
Clearly: yes ! This phenomenon is called: Homonymy

9 Term Meaning Meaning-1 Meaning-2 Meaning-3 Term-1 Meaning-4 Term-2 Term-3 Meaning-5 Meaning-6 Meaning-7 This is called: synonymy

10 Homonymous use of the term ‘ontology’
the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects; an explicit representation of the meaning of terms in a vocabulary, and their relationships; a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them; specification of a conceptualisation of a knowledge domain; a structured information model of a domain capable of supporting reasoning by human users and software agents; a data model that represents a set of concepts within a domain and the relationships between those concepts;

11 Can the authors of these definitions all be right at the same time?
Yes, if we are dealing with a case of homonymy.

12 Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply ?
[Diagram: ‘is a’ links among ‘information model’, ‘data model’, ‘representation’, ‘vocabulary’, ‘specification’, ‘study’ and a ‘that’ thing; the arrow layout was lost in extraction.]

13 Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply ? Not for all ! Only for some

14 Homonymous use of the term ‘ontology’: at least one clear cut distinction
the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects; an explicit representation of the meaning of terms in a vocabulary, and their relationships; a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them; specification of a conceptualisation of a knowledge domain; a structured information model of a domain capable of supporting reasoning by human users and software agents; a data model that represents a set of concepts within a domain and the relationships between those concepts;

15 ‘Ontology’ as the study of what exists
Key questions:
What exists ?
How do things that exist relate to each other ?
Some hypotheses:
An external reality, time, space
Ideas, concepts
Particulars, universals, objects, processes
God
Ontologists from distinct ‘schools’ differ in opinion about the existence of some of the above: realism, nominalism, conceptualism, monism, …

16 An ontology as a representation
Key question: of what ?
Terms: WordNet, MedDRA, RxNorm
Concepts: the majority of ‘ontologies’
But … overwhelming lack of clarity about what ‘concepts’ are:
meaning shared in common by synonymous terms ?
idea shared in common in the minds of those who use these terms ?
unit of knowledge describing meanings ?
feature or property shared in common by entities in the world ?
Universals: realism-based ontology

17 Ontology & Data

18 Current mainstream thinking
[Diagram: data, information, knowledge, wisdom, set against ‘Reality: what is there on the side of the patient’.]

19 Example of data generation at the bench
Proteolysis: treat with trypsin.
Separation technique I: chromatography, e.g. lectin affinity chromatography.
From PNGase F: we get fractions that contain peptides and glycans; we focus only on peptides.
Separation technique II: chromatography, e.g. reverse phase chromatography.
Amit Sheth. Semantic Web Technology in Support of Bioinformatics for Glycan Expression. W3C workshop on Semantic Web for Life Sciences, October 28, 2004, Cambridge MA

20 Semantic annotation of Scientific Data
Salient point: one MS/MS data element is annotated with (mapped to) multiple concepts in the ontology.
Using ontological concepts enables the following:
Comparison of data generated from heterogeneous sources.
Interoperability of heterogeneously generated data.
Discovery of a relationship through a chain of mappings/connections.
Hypothesis formulation.
Amit Sheth. Semantic Web Technology in Support of Bioinformatics for Glycan Expression. W3C workshop on Semantic Web for Life Sciences, October 28, 2004, Cambridge MA

21 Ontologies for annotating data
Snomed-CT view on Serum hepatitis

22 Zoom on Hepatitis B with hepatitis D superinfection
Relationships to other concepts:
Causative agent: Hepatitis D virus (organism)
Finding site: Liver structure (body structure)
Causative agent: Hepatitis B virus (organism)
Associated morphology: Inflammation (morphologic abnormality)
Information about this concept:
PREFERRED_TERM: Hepatitis B with hepatitis D superinfection
TERM: Hepatitis B with delta agent coinfection
TERM: Hepatitis B with delta agent superinfection
TERM: Hepatitis B with hepatitis D superinfection
TERM: Hepatitis D infection
TERM: Viral hepatitis B with delta agent superinfection
TERM: Viral hepatitis B with hepatitis D superinfection
Comments ?
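The relationships listed on this slide can be read as attribute/value pairs attached to a single concept. The following is a minimal Python sketch under that assumption; the tuple layout and the names `RELATIONSHIPS`, `values_for` and `group_by_attribute` are illustrative only and are not part of SNOMED CT or of any SNOMED tooling.

```python
from collections import defaultdict

CONCEPT = "Hepatitis B with hepatitis D superinfection"

# (attribute, value) pairs copied from the slide; the tuple layout is an assumption.
RELATIONSHIPS = [
    ("Causative agent", "Hepatitis D virus (organism)"),
    ("Finding site", "Liver structure (body structure)"),
    ("Causative agent", "Hepatitis B virus (organism)"),
    ("Associated morphology", "Inflammation (morphologic abnormality)"),
]


def values_for(attribute, pairs=RELATIONSHIPS):
    """Return every value recorded for the given attribute, in slide order."""
    return [value for attr, value in pairs if attr == attribute]


def group_by_attribute(pairs=RELATIONSHIPS):
    """Group the values under their attribute name."""
    grouped = defaultdict(list)
    for attr, value in pairs:
        grouped[attr].append(value)
    return dict(grouped)


if __name__ == "__main__":
    for attr, values in group_by_attribute().items():
        print(f"{CONCEPT} | {attr} | {values}")
```

Grouping by attribute shows two causative agents recorded side by side, with nothing in the pairs themselves to say how the two infections relate to each other.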

23 SNOMED-CT generated taxonomy (partial)
[Diagram: ‘Is a’ hierarchy over the following concepts; the arrows between them were lost in extraction.]
General finding of abdomen (finding)
Viscus structure finding (finding)
Abdominal organ finding (finding)
Disorder of abdomen (disorder)
Liver finding (finding)
Disorder of liver (disorder)
Infectious disease of abdomen (disorder)
Inflammatory disease of liver (disorder)
Viral hepatitis (disorder)
Type B viral hepatitis (disorder)
Hepatitis B with hepatitis D superinfection (disorder)
Comments ?

24 Problem with mainstream thinking:
[Diagram: data, information, knowledge, wisdom, set against ‘Reality: what is there on the side of the patient’.]
Questions not often enough asked:
What part of our data corresponds with something out there in reality ?
What part of reality is not captured by our data, but should be because it is relevant ?

25 Terminological versus Ontological approach
The terminologist defines: ‘a clinical drug is a pharmaceutical product given to (or taken by) a patient with a therapeutic or diagnostic intent’. (RxNorm)
The ontologist thinks:
Does ‘given’ include ‘prescribed’ ?
Is ‘manufactured with the intent to …’ not sufficient ?
Are newly marketed products, available in the pharmacy but not yet prescribed, not clinical drugs ?
Are products stolen from a pharmacy not clinical drugs ?
What about such products taken by persons who are not patients, e.g. children mistaking tablets for candies ?

26 The solution: the RIGHT sort of ontology Realism-based ontology

27 A multi-disciplinary approach to ontology
In philosophy: Ontology (no plural) is the study of what entities exist and how they relate to each other;
In computer science and (biomedical) informatics applications: an ontology (plural: ontologies) is a shared and agreed upon conceptualization of a domain;
Our ‘realist’ view within the Ontology Research Group combines the two: we use realism, a specific theory of ontology, as the basis for building high quality ontologies, using reality as benchmark.

28 Realist ontology: assumes three levels of reality
The world exists ‘as it is’ prior to a cognitive agent’s perception thereof; Cognitive agents build up ‘in their minds’ cognitive representations of the world; To make these representations publicly accessible in some enduring fashion, they create representational artifacts that are fixed in some medium. Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, November 8, 2006, Baltimore MD, USA

29 Three levels of reality
The world exists ‘as it is’ prior to a cognitive agent’s perception thereof; Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, November 8, 2006, Baltimore MD, USA

30 Reality exists before any observation

31 Reality exists before any observation
Humans had a brain well before they knew they had one.
Trees were green before humans started to use the word “green”.
And most structures in reality are also there in advance.
[Diagram label: R]

32 Three levels of reality
The world exists ‘as it is’ prior to a cognitive agent’s perception thereof; Cognitive agents build up ‘in their minds’ cognitive representations of the world; Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, November 8, 2006, Baltimore MD, USA

33 The cognitive agent acknowledges the existence of some Portion of Reality (POR)
[Diagram labels: B, R]

34 Some portions of reality
Some portions of reality escape his attention.
[Diagram labels: B, R]

35 Three levels of reality
The world exists ‘as it is’ prior to a cognitive agent’s perception thereof; Cognitive agents build up ‘in their minds’ cognitive representations of the world; To make these representations publicly accessible in some enduring fashion, they create representational artifacts that are fixed in some medium. Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, November 8, 2006, Baltimore MD, USA

36 He represents only what he considers relevant
Both RU1B1 and RU1O1 are representational units referring to #1;
RU1O1 is NOT a representation of RU1B1;
RU1O1 is created through concretization of RU1B1 in some medium.
[Diagram labels: B, RU1B1, O, RU1O1, R, #1]

37 The three levels applied to medication management
[Table with columns Generic and Specific, one row per level; cell boundaries partly reconstructed.]
3. Representation. Generic: ‘person’, ‘drug’, ‘penicillin’. Specific: ‘W. Ceusters’, ‘my pneumonia’.
2. Beliefs (knowledge). Generic: CONTRA-INDICATION, INDICATION. Specific: my doctor’s work plan, diagnosis.
1. First-order reality. Generic: MOLECULE, PERSON, DISEASE, PATHOLOGICAL STRUCTURE, PORTION OF PENICILLIN, DRUG. Specific: me, my toxic reaction to penicillin, my bronchitis, my doctor, my pharmacist’s computer.

38 A realism-based ontology
is a representation of some pre-existing domain of reality which
(1) reflects the properties of the objects within its domain in such a way that there obtains a systematic correlation between reality and the representation itself,
(2) is intelligible to a domain expert, and
(3) is formalized in a way that allows it to support automatic information processing.

39 Our foundations: Basic Formal Ontology
An ontology which is
Realist: there is only one reality and its constituents exist independently of our (linguistic, conceptual, theoretical, cultural) representations thereof;
Fallibilist: theories and classifications can be subject to revision;
Perspectivalist: there exists a plurality of alternative, equally legitimate perspectives on that one reality;
Adequatist: these alternative views are not reducible to any single basic view.

40 Basic Formal Ontology The world consists of entities that are
Either particulars or universals;
Either occurrents or continuants;
Either dependent or independent;
and relationships between these entities, of the form
<particular , universal> e.g. is-instance-of,
<particular , particular> e.g. is-member-of,
<universal , universal> e.g. isa (is-subtype-of).
Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, November 8, 2006, Baltimore MD, USA
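The three relationship forms on this slide can be read as type signatures: each relation fixes whether its arguments are particulars or universals. The sketch below illustrates that reading in Python. It is not an implementation of Basic Formal Ontology; the names `Kind`, `Entity`, `SIGNATURES` and `make_relation` are hypothetical, and only the three example relations from the slide are covered.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    PARTICULAR = "particular"
    UNIVERSAL = "universal"


@dataclass(frozen=True)
class Entity:
    name: str
    kind: Kind


# Relation name -> (kind of first argument, kind of second argument), as listed on the slide.
SIGNATURES = {
    "is-instance-of": (Kind.PARTICULAR, Kind.UNIVERSAL),
    "is-member-of": (Kind.PARTICULAR, Kind.PARTICULAR),
    "isa": (Kind.UNIVERSAL, Kind.UNIVERSAL),
}


def make_relation(subject, relation, obj):
    """Return a (subject, relation, object) triple, or raise if the kinds do not fit."""
    expected = SIGNATURES[relation]
    actual = (subject.kind, obj.kind)
    if actual != expected:
        raise TypeError(f"{relation} expects {expected}, got {actual}")
    return (subject.name, relation, obj.name)


if __name__ == "__main__":
    # Example entities are illustrative only.
    me = Entity("me", Kind.PARTICULAR)
    person = Entity("PERSON", Kind.UNIVERSAL)
    print(make_relation(me, "is-instance-of", person))
```

Checking the signature at construction time means that a statement such as "me isa PERSON", which mixes a particular with a subtype relation, is rejected rather than silently stored.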

41 Accept that everything may change:
changes in the underlying reality: particulars and universals come and go;
changes in our (scientific) understanding: the planet Vulcan does not exist;
reassessments of what is considered to be relevant for inclusion (notion of purpose);
encoding mistakes introduced during data entry or ontology development.

42 Example: continuants preserve identity while changing
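[Diagram: ‘me’ is an instance of ‘child’ in 1960 and an instance of ‘adult’ since 1980, shown together with ‘human being’ and ‘living creature’; a second example shows ‘caterpillar’ and ‘butterfly’ together with ‘animal’.]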
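The point of this slide, that one and the same particular instantiates different universals at different times, can be sketched with time-indexed instance-of records. The Python below is an illustration under stated assumptions: the slide only gives "in 1960" and "since 1980", so treating 1980 as the year childhood ends, and treating ‘human being’ as holding throughout, are assumptions, and the names `Instantiation`, `FACTS` and `universals_at` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instantiation:
    particular: str
    universal: str
    start: int                 # first year in which the relation holds
    end: Optional[int] = None  # first year in which it no longer holds; None = still holds


# 'child' in 1960 and 'adult' since 1980 come from the slide.
# The 1980 end of 'child' and the open-ended 'human being' record are assumptions.
FACTS = [
    Instantiation("me", "human being", 1960),
    Instantiation("me", "child", 1960, 1980),
    Instantiation("me", "adult", 1980),
]


def universals_at(particular, year, facts=FACTS):
    """Return, sorted by name, the universals the particular instantiates in that year."""
    return sorted(
        f.universal
        for f in facts
        if f.particular == particular
        and f.start <= year
        and (f.end is None or year < f.end)
    )


if __name__ == "__main__":
    for year in (1960, 1990):
        print(year, universals_at("me", year))
```

The identifier "me" never changes across the records, which is the sense in which a continuant preserves its identity while the universals it instantiates change.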

43 Are we done ?
Is an accurate coding system, classification system, terminology, ontology, …, a necessary and sufficient condition for obtaining “better” information ?
Necessary: yes !
Sufficient: no !

44 Using codes does not prevent ambiguities as to what is described: how many disorders are listed?
[Table of coded patient observations with columns PtID, Date, ObsCode, Narrative; the row layout was lost in extraction. Surviving cell values, in extraction order: 5572; 04/07/1990; closed fracture of shaft of femur; Fracture, closed, spiral; 12/07/1990; Accident in public building (supermarket); 79001; Essential hypertension; 0939; 24/12/1991; benign polyp of biliary tract; 2309; 21/03/1992; 47804; 03/04/1993; Other lesion on other specified region; 17/05/1993; 298; 22/08/1993; Closed fracture of radial head; 01/04/1997; 20/12/1998; malignant polyp of biliary tract.]
The same type of location code used in relation to three different events might or might not refer to the same location.
Three references to hypertension for the same patient denote three times the same disease.
If the same fracture code is used for the same patient on different dates, then these codes might or might not refer to the same fracture.
If two different fracture codes are used in relation to observations made on the same day for the same patient, they might refer to the same fracture.
If two different tumor codes are used in relation to observations made on different dates for the same patient, they may still refer to the same tumor.
The same fracture code used in relation to two different patients cannot refer to the same fracture.

45 Consequences Very difficult to:
Count the number of (numerically) different diseases:
Bad statistics on incidence, prevalence, ...
Bad basis for health cost containment
Relate (numerically the same or different) causal factors to disorders:
Dangerous public places (specific work floors, swimming pools), dogs with rabies, HIV contaminated blood from donors, food from unhygienic source, ...
Hampers prevention
...

46 Referent Tracking: Being clear what the data are about !

47 Now! That should clear up a few things around here !
Purpose: explicit reference to the concrete individual entities relevant to the accurate description of each patient’s condition, therapies, outcomes, ... Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform Jun;39(3):

48 Referent Tracking: Numbers instead of words!
Method: introduce an Instance Unique Identifier (IUI) for each relevant particular (individual) entity.
[Diagram: individual entities labelled with the example identifiers 235, 78, 5678, 321, 322, 666, 427.]
Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform Jun;39(3):
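The method stated here, one identifier per particular, can be sketched as a small registry. This is an in-memory illustration only, not the referent tracking system described in the cited paper; the class name `IUIRegistry`, its methods, and the `IUI-001` formatting are assumptions made for the example.

```python
import itertools


class IUIRegistry:
    """Hands out one identifier per particular and never reuses an identifier."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._notes = {}  # IUI -> free-text note, kept only for human readers

    def assign(self, note=""):
        """Create a fresh IUI for a particular that does not have one yet."""
        iui = f"IUI-{next(self._counter):03d}"
        self._notes[iui] = note
        return iui

    def note_for(self, iui):
        """Return the free-text note stored with an IUI."""
        return self._notes[iui]


if __name__ == "__main__":
    registry = IUIRegistry()
    doe_tumor = registry.assign("John Doe's liver tumor")
    smith_tumor = registry.assign("John Smith's liver tumor")
    print(doe_tumor, smith_tumor)
```

The identifier itself carries no meaning: two tumors described with identical words still receive different IUIs, and everything known about a particular is attached to its IUI rather than encoded in it.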

49 The principle of Referent Tracking
[Diagram: two statements, ‘John Doe’s liver tumor was treated with RPCI’s irradiation device’ and ‘John Smith’s liver tumor was treated with RPCI’s irradiation device’. The particulars in the first statement are labelled #1, #2, #3, #4, #5, #6; those in the second are labelled #10, #20, #30, #40, #5, #6, so #5 and #6 appear in both. The particulars are linked by instance-of (at t1, at t2) to the universals person, liver, tumor, treating, clinic, device.]

50 EHR – Ontology “collaboration”

51 Advantage: better reality representation
[Table of coded patient observations with columns PtID, Date, ObsCode, Narrative, now annotated with IUIs; the row layout was lost in extraction. Surviving cell values, in extraction order: 5572; 04/07/1990; closed fracture of shaft of femur; Fracture, closed, spiral; 12/07/1990; Accident in public building (supermarket); 79001; Essential hypertension; 0939; 24/12/1991; benign polyp of biliary tract; 2309; 21/03/1992; 47804; 03/04/1993; Other lesion on other specified region; 17/05/1993; 298; 22/08/1993; Closed fracture of radial head; 01/04/1997; 20/12/1998; malignant polyp of biliary tract. IUI labels shown: IUI-001, IUI-003, IUI-004, IUI-005, IUI-007, IUI-002, IUI-012, IUI-006.]
7 distinct disorders
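The gain in counting can be sketched by comparing three ways of counting over IUI-annotated records. In the Python below every row is invented for illustration: the patient ids, dates, codes (`C-...`) and IUI assignments are hypothetical, and only the narratives echo the slide. The function names are likewise assumptions.

```python
# (patient_id, date, code, narrative, iui)
# All rows are invented for illustration; only the narratives echo the slide.
RECORDS = [
    ("P1", "1990-07-04", "C-FEMUR", "closed fracture of shaft of femur", "IUI-001"),
    ("P1", "1990-07-04", "C-SPIRAL", "Fracture, closed, spiral", "IUI-001"),
    ("P1", "1990-07-12", "C-FEMUR", "closed fracture of shaft of femur", "IUI-001"),
    ("P1", "1990-07-04", "C-HYPER", "Essential hypertension", "IUI-002"),
    ("P1", "1993-05-17", "C-HYPER", "Essential hypertension", "IUI-002"),
    ("P2", "1992-03-21", "C-FEMUR", "closed fracture of shaft of femur", "IUI-003"),
    ("P2", "1992-03-21", "C-HYPER", "Essential hypertension", "IUI-004"),
]


def count_distinct_codes(records):
    """Count the different codes, ignoring who they were recorded for."""
    return len({code for _, _, code, _, _ in records})


def count_patient_code_pairs(records):
    """Count the different (patient, code) combinations."""
    return len({(patient, code) for patient, _, code, _, _ in records})


def count_distinct_referents(records):
    """Count the different particulars, i.e. the different IUIs."""
    return len({iui for _, _, _, _, iui in records})


if __name__ == "__main__":
    print("rows:", len(RECORDS))
    print("distinct codes:", count_distinct_codes(RECORDS))
    print("distinct (patient, code) pairs:", count_patient_code_pairs(RECORDS))
    print("distinct disorders by IUI:", count_distinct_referents(RECORDS))
```

In this invented sample neither the code count nor the patient-code count matches the number of disorders: two different fracture codes describe one fracture of P1, while one hypertension code covers two different patients' diseases. Only the IUI count reflects how many disorders there are.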

52 Did you pay attention ? A final test
In John Smith’s EHR: at t1: “male”; at t2: “female”.
What are the possibilities ?
Change in reality: transgender surgery; change in legal self-identification.
Change in understanding: it was female from the very beginning but interpreted wrongly.
Correction of data entry mistake: it was understood as male, but wrongly transcribed.
(Change in meaning of the words)

53 Conclusion (1) Building high quality ontologies is hard.
Not everybody has the right skills:
Experts in driving cars are not necessarily experts in car mechanics (and the other way round).
Ontologies should represent the state of the art in a domain, i.e. the science:
Science is not a matter of consensus or democracy.
Natural language relates more to how humans talk about reality or perceive it than to how reality is structured.
No high quality ontology without the involvement of ontologists.

54 Conclusion (2)
Realist ontology is a powerful QA tool for building high quality ontologies AND high quality databases;
Referent tracking, based on realist ontology, is a means to remove the ambiguity in data that cannot be solved by realist ontology alone;
It is a form of “adult” annotation.
Application of RT requires a globally accessible repository.
Adds another level to interoperability.
The use of “meaningless” IUIs allows very strict safety and security measures to be implemented.

55 Goal: new form of Evidence Based Medicine
Now:
Decisions based on (motivated/justified by) the (reproducible) results of well-designed studies
Guidelines and protocols
Evidence is hard to get and takes time to accumulate.
Future:
Each discovered fact or expressed belief should instantly become available as contributing to ‘evidence’, wherever its description is generated.
Data ‘eternally’ reusable, independent of the purpose for which they have been generated.
