New York State Center of Excellence in Bioinformatics & Life Sciences R T U New York State Center of Excellence in Bioinformatics & Life Sciences R T U.


1 AMIA 2010: Applying Evolutionary Terminology Auditing to SNOMED CT. Washington, DC, USA, November 17, 2010. Werner CEUSTERS, MD, Center of Excellence in Bioinformatics & Life Sciences, Ontology Research Group, University at Buffalo, NY, USA

2 SNOMED CT as a ‘knowledge base’ (diagram; labels in transcript order): Pharmaceutical/biologic product; Autonomic drug; Sympathomimetic agent (product); Epinephrine preparation; Parenteral form epinephrine; Adrenaline 1:10,000 100micrograms/1mL prefilled syringe; Has dose form: Injection (product); Allergen class; Drug allergen; Sympathomimetic agent (substance); Respiratory sympathomimetic agent; Has active ingredient: Epinephrine; Substance; Drug or medicament; Hormone; Adrenal medullary hormone

3 SNOMED CT for documentation (diagram). Entities in the record: John Doe (#6189); John Doe’s medication #72566; Drug administration act #125611. Three Described_by_means_of links connect these to the SNOMED CT concepts Patient (person), Adrenaline 1:10,000 100micrograms/1mL prefilled syringe, and Injection (procedure).

4 SNOMED CT for documentation (diagram). John Doe (#6189); John Doe’s medication #72566; Drug administration act #125611. Described_by_means_of: Adrenaline 1:10,000 100micrograms/1mL prefilled syringe

5 Requirements for accurate annotation systems
Diagram: John Doe (#6189); John Doe’s medication #72566; Drug administration act #125611. Described_by_means_of: Adrenaline 1:10,000 100micrograms/1mL prefilled syringe
The terminology and underlying ontology must match what is the case in reality:
– sufficient coverage of the domain,
– up to date,
– modifications to the system should occur in response to changes in the domain.

6 Sorts of changes in ontologies
Ontology versions exhibit differences of the following sorts:
– Add a subtree
– Delete a subtree
– Move a subtree to a different location
– Move a set of sibling classes to a different location
– Create a new abstraction and move a set of siblings down in a class hierarchy, creating a new superclass
– Delete a class, moving its subclasses to become subclasses of its superclass
– Split a class
– Merge classes
Noy, N.F., Kunnatur, S., Klein, M., and Musen, M.A. Tracking changes during ontology evolution. In Proceedings of the 3rd International Semantic Web Conference (ISWC2004), Hiroshima, Japan, November 2004.
These distinctions don’t tell us the motivations behind the changes.

7 Change-analysis through Ontological Realism
Basic axioms:
1. There is an external reality which is ‘objectively’ the way it is;
2. That reality is accessible to us;
3. We build in our brains cognitive representations of reality;
4. We communicate with others about what is there, and what we believe is there.
Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, Biomedical Ontology in Action, November 8, 2006, Baltimore MD, USA.

9 Three levels of reality in Ontological Realism
First-order reality:
L1: entities with objective existence
Representations:
L2: clinicians’ beliefs about (1)
L3: linguistic representations about (1), (2) or (3)

10 SNOMED CT’s updating principles
Changes in SNOMED CT are ‘… driven by changes in understanding of health and disease processes; introduction of new drugs, investigations, therapies and procedures; new threats to health; as well as proposals and work provided by SNOMED partners and licensees’.
SNOMED CT® Technical Reference Guide – January 2007 Release, p43.

11 SNOMED CT implicitly references the levels
Changes in SNOMED CT are ‘… driven by changes in understanding of health and disease processes; introduction of new drugs, investigations, therapies and procedures; new threats to health; as well as proposals and work provided by SNOMED partners and licensees’.
SNOMED CT® Technical Reference Guide – January 2007 Release, p43.
Level labels overlaid on the quotation, in order of appearance: L2, L1, L3.

12 SNOMED CT Data Structure Summary (figure). SNOMED CT® Technical Reference Guide – July 2010 Release, p17.

13 SNOMED CT’s updating principles (2)
– If the Concept’s meaning changes, the Concept is made inactive. One or more new Concepts are usually added to better represent the meaning of the old Concept.
– Concepts may become inactive but are never deleted.
– Concept identifiers are persistent over time and are never reused.
– The link between a Description and a Concept is persistent. If a Description is no longer pertinent for a Concept, the Description is inactivated.
SNOMED CT® Technical Reference Guide – January 2007 Release, p43.

14 SNOMED CT concept status values
ST   Concept status
0    active in current use
6    active with limited clinical value (classification concept or an administrative definition)
1    inactive: ‘retired’ without a specified reason
10   inactive because moved elsewhere
2    inactive: withdrawn because duplication
3    inactive because no longer recognized as a valid clinical concept (outdated)
4    inactive because inherently ambiguous
5    inactive because found to contain a mistake
‘Historical relationships’: "SAME AS", "REPLACED BY", "WAS A", "MAYBE A", "MOVED TO", "MOVED FROM"
SNOMED CT® Technical Reference Guide – January 2007 Release, p43.

15 SNOMED CT concepts’ status (July 2011)
ST   N         %        Concept status
0    292,073   74.67%   active in current use
6    20,930    5.35%    active with limited clinical value (classification concept or an administrative definition)
1    7,525     1.92%    inactive: ‘retired’ without a specified reason
10   14,451    3.69%    inactive because moved elsewhere
2    37,752    9.65%    inactive: withdrawn because duplication
3    1,439     0.37%    inactive because no longer recognized as a valid clinical concept (outdated)
4    15,858    4.05%    inactive because inherently ambiguous
5    1,142     0.29%    inactive because found to contain a mistake
     391,170   100%     TOTAL
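The percentages in this table can be recomputed from the counts. A minimal Python sketch, with the counts copied from the slide; the names `STATUS_COUNTS` and `status_shares` are illustrative and not part of SNOMED CT:

```python
# Concept counts per status code (ST), copied from the slide's table.
STATUS_COUNTS = {
    0: 292_073,   # active in current use
    6: 20_930,    # active with limited clinical value
    1: 7_525,     # inactive: 'retired' without a specified reason
    10: 14_451,   # inactive because moved elsewhere
    2: 37_752,    # inactive: withdrawn because duplication
    3: 1_439,     # inactive: outdated
    4: 15_858,    # inactive because inherently ambiguous
    5: 1_142,     # inactive because found to contain a mistake
}

ACTIVE_CODES = (0, 6)


def status_shares(counts):
    """Return the total count and each status's share of it, in percent."""
    total = sum(counts.values())
    shares = {code: 100 * n / total for code, n in counts.items()}
    return total, shares


total, shares = status_shares(STATUS_COUNTS)

# Combined share of all inactive statuses.
inactive_share = sum(
    pct for code, pct in shares.items() if code not in ACTIVE_CODES
)

for code, pct in shares.items():
    print(f"ST={code:>2}: {pct:.2f}%")
print(f"total={total}, inactive={inactive_share:.2f}%")
```

On these figures the six inactive statuses together account for just under 20% of all concepts.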

16 399422005: Adenoma of small intestine

19 SNOMED CT concept status values (status table and ‘Historical relationships’ as on slide 14; SNOMED CT® Technical Reference Guide – January 2007 Release, p43.)
With the exception of ST=3, all are purely internal motivations. ST=3 merely hints at an external motivation, but does not specify which one.

20 Ontological Realism and optimal ontologies
Each representational unit (RU) in such an ontology would designate
– (1) a single portion of reality (POR), which is
– (2) relevant to the purposes of the ontology and such that
– (3) the authors of the ontology intended to use this RU to designate this POR, and
– (4) there would be no PORs objectively relevant to these purposes that are not referred to in the ontology.

21 But things may go wrong …
– assertion errors: ontology developers may be in error as to what is the case in their target domain;
– relevance errors: they may be in error as to what is objectively relevant to a given purpose;
– encoding errors: they may not successfully encode their underlying cognitive representations, so that particular representational units fail to point to the intended PORs.

22 Relations between levels of reality
        Reality     Understanding   Encoding
        OE    ORV   BE    BRV       Int.  Ref.   E
P+1     Y     Y     Y     Y         Y     R+     0
A+1     N     -     N     -         -     -      0
A+2     Y     N     Y     N         -     -      0
P-1     N     -     Y     Y         Y     ¬R     3
P-2     N     -     Y     Y         N     ¬R     4
P-3     N     -     Y     Y         N     R-     5
P-4     Y     Y     Y     Y         N     ¬R     1
P-5     Y     Y     Y     Y         N     R-     2
P-6     Y     N     Y     Y         Y     R+     1
P-7     Y     N     Y     Y         N     ¬R     2
P-8     Y     N     Y     Y         N     R-     3
P-9     Y     Y     Y     Y         Y     R++    1
P-10    Y     N     Y     Y         Y     R++    2
A-1     Y     Y     Y     N         -     -      1
A-2     Y     Y     N     -         -     -      1
A-3     N     -     Y     N         -     -      1
A-4     Y     N     N     -         -     -      1
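For computation, the last column of this table can be held as a lookup from configuration label to error count. A minimal sketch, assuming the reading that P marks a unit present in the ontology, A an absent one, and + / - a justified / unjustified case; the names `ERROR_COUNT`, `is_present` and `is_justified` are illustrative:

```python
# Error count E per configuration, copied from the last column of the table.
ERROR_COUNT = {
    "P+1": 0, "A+1": 0, "A+2": 0,
    "P-1": 3, "P-2": 4, "P-3": 5, "P-4": 1, "P-5": 2,
    "P-6": 1, "P-7": 2, "P-8": 3, "P-9": 1, "P-10": 2,
    "A-1": 1, "A-2": 1, "A-3": 1, "A-4": 1,
}


def is_present(config: str) -> bool:
    """Assumed reading: labels starting with 'P' are present in the ontology."""
    return config.startswith("P")


def is_justified(config: str) -> bool:
    """Assumed reading: a '+' in second position marks a justified case."""
    return config[1] == "+"


# Largest error count that any configuration can carry.
worst = max(ERROR_COUNT.values())

# Distinct error counts found among unjustified absences (the A- rows).
unjustified_absence_errors = {
    ERROR_COUNT[c]
    for c in ERROR_COUNT
    if not is_present(c) and not is_justified(c)
}
```

Two properties are visible from the data: every justified configuration has error 0, and every unjustified absence has error 1, while the largest error is 5 (P-3).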

23 Updating is an active process
Authors assume in good faith that
– all included representational units are of the P+1 type, and
– all those they are aware of but have not included are of the A+1 or A+2 type.
If they become aware of a mistake, they make a change under the assumption that their changes are also towards the P+1, A+1, or A+2 cases.
Thus, at that time, they know of what type the previous entry must have been, given what they believe the current one to be, and the reason for the change.

24 Effects of various sorts of changes …
When something faithfully represented at t ceases to be faithful at t+1, leaving the ontology unchanged causes a P+1 to become a P-1.
When something faithfully represented at t is no longer believed to be faithful at t+1 while in fact it still is, removing the representational element causes a P+1 to become an A-2.

25 Quality of a representation w.r.t. reality
n: number of representational elements in the ontology
m: number of unjustified absences
e_i: magnitude of the error, if any, for the i-th representational element
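The formula itself is not in the transcript (it was presumably an image on the slide). A reconstruction that is consistent with the symbol definitions above and with the worked calculations on the ‘Comparing representational artifacts’ and ‘Detail of calculations’ slides is given below; it is offered as an assumption, not as the author’s exact notation:

```latex
Q \;=\; \frac{\sum_{i=1}^{n} \left(5 - e_i\right)}{5\,n + 4\,m}
```

Here 5 is the largest error value in the ‘Relations between levels of reality’ table (P-3), and the weight 4 equals 5 − 1, since every unjustified absence in that table carries an error of 1. For example, n = 8 with one error of magnitude 3 and m = 1 gives (7·5 + 2)/(8·5 + 4) = 37/44 ≈ 0.84.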

26 Comparing representational artifacts
                     Reality   Terminology 1     Terminology 2     Terminology 3
RU                   Config.   Config.  Error    Config.  Error    Config.  Error
animal               P+1                0                 0                 0
fish                 P+1                0                 0                 0
whale                P+1                0                 0                 0
mammal               P+1                0                 0                 0
fish are animals     P+1                0                 0                 0
mammals are animals  P+1                0                 0                 0
whales are fish      A+1       P-1      3        A+1      0                 0
whales are animals   P+1                0                 0                 0
whales are mammals   P+1       A-2      1                 1        P+1      0
SCORE
Reality:       8*5 / ((8*5)+(0*4)) = 1.00
Terminology 1: ((7*5)+(1*2)) / ((8*5)+(1*4)) = 0.84
Terminology 2: 7*5 / ((7*5)+(1*4)) = 0.90
Terminology 3: 8*5 / ((8*5)+(0*4)) = 1.00

27 Comparing consecutive versions (C. = configuration, E. = error)
                     Time t1   Time t2             Time t3
                     T1        T1        T2        T1        T2        T3
RU                   C.   E.   C.   E.   C.   E.   C.   E.   C.   E.   C.   E.
animal               P+1  0         0         0         0         0         0
fish                 P+1  0         0         0         0         0         0
whale                P+1  0         0         0         0         0         0
mammal               P+1  0         0         0         0         0         0
fish are animals     P+1  0         0         0         0         0         0
mammals are animals  P+1  0         0         0         0         0         0
whales are fish      P+1  0    P-1  3    A+1  0    P-1  3    A+1  0         0
whales are animals   P+1  0         0         0         0         0         0
whales are mammals   -    -    -    -    -    -    A-2  1         1    P+1  0
SCORE                1.00      0.93      1.00      0.84      0.90      1.00

28 Detail of calculations
Terminology  Time of assessment  Formula for quality score      Quality score
T1           t1                  (8*5)/(8*5)                    1.00
T1           t2                  ((7*5)+(1*2))/(8*5)            0.93
T1           t3                  ((7*5)+(1*2))/((8*5)+(1*4))    0.84
T2           t2                  (7*5)/(7*5)                    1.00
T2           t3                  (7*5)/((7*5)+(1*4))            0.90
T3           t3                  (8*5)/((8*5)+(0*4))            1.00
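The six calculations above all follow one pattern: each representational unit contributes 5 minus its error to the numerator, and the denominator is 5 per unit plus 4 per unjustified absence. A minimal Python sketch of that pattern, inferred from the worked figures rather than taken from a stated formula; `quality_score`, `MAX_ERROR` and `ABSENCE_WEIGHT` are illustrative names:

```python
from fractions import Fraction

MAX_ERROR = 5       # weight per representational unit in the calculations
ABSENCE_WEIGHT = 4  # weight per unjustified absence in the calculations


def quality_score(errors, unjustified_absences=0):
    """Score a terminology version.

    errors: one error value e_i per representational unit that is present.
    unjustified_absences: number m of units that are unjustifiably missing.
    Returns an exact fraction so that rounding is left to the caller.
    """
    n = len(errors)
    numerator = sum(MAX_ERROR - e for e in errors)
    denominator = MAX_ERROR * n + ABSENCE_WEIGHT * unjustified_absences
    return Fraction(numerator, denominator)


# The six rows of the calculation table, in order.
t1_at_t1 = quality_score([0] * 8)
t1_at_t2 = quality_score([0] * 7 + [3])
t1_at_t3 = quality_score([0] * 7 + [3], unjustified_absences=1)
t2_at_t2 = quality_score([0] * 7)
t2_at_t3 = quality_score([0] * 7, unjustified_absences=1)
t3_at_t3 = quality_score([0] * 8)

for label, score in [
    ("T1@t1", t1_at_t1), ("T1@t2", t1_at_t2), ("T1@t3", t1_at_t3),
    ("T2@t2", t2_at_t2), ("T2@t3", t2_at_t3), ("T3@t3", t3_at_t3),
]:
    print(label, score, float(score))
```

Returning a `Fraction` keeps the values exact (37/40, 37/44, 35/39), which avoids floating-point edge cases when rounding to the two decimals shown on the slide.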

29 Is it possible to translate status into error-values? (status table and ‘Historical relationships’ as on slide 14; SNOMED CT® Technical Reference Guide – January 2007 Release, p43.)

30 Some principles used
– All new introductions are unjustifiably missing in earlier versions. This is adequate for most types of concepts, except for pharmaceutical products and certain information artifacts such as newly constructed rating scales or named guidelines and protocols;
– ‘duplicate’ translates into P-9;
– a sample of 1000 changes was examined to find common principles.

31 Translating SNOMED-status into error types
Status  Error type  Existing concept made …
0       A-1         active: in current use
1       P-1         inactive: ‘retired’ without a specified reason
2       P-9         inactive: withdrawn because duplication
3       P-1         inactive because no longer recognized as a valid clinical concept (outdated)
4       P-4         inactive because inherently ambiguous
5       P-1         inactive because found to contain a mistake
6       A-1         active with limited clinical value (classification concept or an administrative definition)
10      P-6         inactive because moved elsewhere
11      P-6         pending move
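A minimal sketch of how this translation table could be applied, assuming (as the ‘Some principles used’ slide suggests) that the error type describes how the concept was represented in releases before the status change. `STATUS_TO_ERROR_TYPE`, `ERROR_VALUE` and `retroactive_error` are illustrative names, and the error values are taken from the E column of the ‘Relations between levels of reality’ table:

```python
# New status code -> error type assigned to the earlier representation,
# copied from the translation table.
STATUS_TO_ERROR_TYPE = {
    0: "A-1",
    1: "P-1",
    2: "P-9",
    3: "P-1",
    4: "P-4",
    5: "P-1",
    6: "A-1",
    10: "P-6",
    11: "P-6",
}

# Error value E for the error types that occur in the translation table,
# copied from the typology table.
ERROR_VALUE = {"A-1": 1, "P-1": 3, "P-4": 1, "P-6": 1, "P-9": 1}


def retroactive_error(new_status):
    """Return (error type, error value) implied for the earlier releases.

    Assumption: a status change in one release says something about how
    well the concept was represented in the releases that preceded it.
    """
    error_type = STATUS_TO_ERROR_TYPE[new_status]
    return error_type, ERROR_VALUE[error_type]


# Example: a concept withdrawn as a duplicate, and one found to be mistaken.
duplicate_case = retroactive_error(2)
mistake_case = retroactive_error(5)
```

Under this reading a withdrawal for duplication is a mild retroactive error (P-9, value 1), whereas a concept retired as mistaken counts as P-1 with value 3.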

32 Views on versions of SNOMED-CT (figure)

33 Evolutionary view on the Jan 2002 release of SNOMED CT (chart; series: concepts, descriptions, overall, relationships)

34 Quality change of SNOMED CT (200201-200907) (chart; series: concepts, descriptions, overall, relationships)

35 Limitations (1)
– We did not find any principle underlying the assignment of ‘inactive, reason not specified’ and ‘erroneous’.
– All concepts with the status ‘outdated’ in our sample involved organisms.
– The majority of concepts stated to be inactivated for reasons of ‘ambiguity’ do not, in our opinion, look ambiguous at all, as further witnessed by the fact that some of them have been replaced by a concept with an identical name.

36 Limitations (2)
Lack of resources might prevent changes from being introduced although the authors know they have to be made at some point.
– Thus a released version is perhaps not assumed to reflect the state of the art.
The disappearance of a relationship in a newer version might not be a real disappearance, since the relationship might still be inferred from the graph structure underlying SNOMED CT.

37 Conclusions
Our quality assessments are
– upper bounds for concepts and descriptions,
– lower bounds for relationships.
Accurate calculations are only possible if SNOMED provides reasons for change along the lines described.
The move towards a new information/distribution model might be a good opportunity to start doing so.

