Semantic Infrastructure and Ability to add in data models/vocabulary (Dynamic Extensions, etc) Session #1 October 4, 2010.

1 Semantic Infrastructure and Ability to add in data models/vocabulary (Dynamic Extensions, etc) Session #1 October 4, 2010

2 Agenda
- Discuss domain needs for Dynamic Extensions or a similar method.
- Discuss how that fits with the new Semantic Infrastructure: how do we ensure it is taken into account?
- Querying over the Grid for these new data elements (discovery, ability to query).
- Vocabulary needs from the community: preferred lists for diseases, race, specimen description, etc. How will they be determined, discovered, and used in the semantic infrastructure?
Note: Nov 3-4, 2010 TBPT F2F. We would like to communicate the impact of Semantic Infrastructure 2.0 to the community; Government input is needed. Radiology SMEs are also waiting for AIM/PAIS resolution and for clarity on how the semantic infrastructure will support their direction.

3 Appendix
Background material at end:
- Dynamic Extensions use in caTissue Suite (example)
- Emory (Sept 2009) analysis of AIM and how it may not support pathology; their initial proposal for MicroAIM
- Semantic Infrastructure: additional relevant portions

4 Domain needs for DE (Dynamic Extensions)
Discuss domain needs for Dynamic Extensions or a similar method:
- caTissue Suite: ability for a user to create “Smoking History”, Prostate SPORE-specific data elements, etc. during localization of the tool at an institute or consortium level.
- Support research endeavors, such as creating new imaging (radiology/pathology) descriptions and terms/data elements that reflect new observation types that multi-site studies need to report on.
- Adding classes/attributes/vocabularies to get your research done for your project.
- Contributing classes/attributes/vocabularies, etc. in a controlled manner so these new items can be queried by others to find specimens, etc.
Ideal: need a seamless, fast-turnaround process to make this work! … Rely on Dynamic Extensions.
[Diagram labels: COLLABORATIVE, ISOLATED]

5 Dynamic Extensions: High level goals
- Ability to create new classes (entities) dynamically and associate them with the static model
- Class, attribute, and association metadata: add/edit/delete
- Data entry form generation: form view and spreadsheet view
- Validation rules (UI as well as backend)
- Permissible values support (custom, EVS, CDE)
- Security for identified attributes
- Entity reuse: copy and share attributes
- XMI: import and export
- caCORE API to add, edit and read
- caGrid compatibility requires Dynamic Extensions compatibility review process?

6 Metadata structure
[Diagram labels: MODEL; CLASS; ATTRIBUTES (1..*); PERMISSIBLE VALUES (1..*); caDSR; EVS; CDE reuse; Class reuse; Concept Code; Download value domain]
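The containment hierarchy named on this slide can be sketched in a few lines of Python. This is only an illustration: it assumes the 1..* labels mean that a class holds one or more attributes and an attribute holds one or more permissible values. The class names, the field names, and the "smokingStatus" attribute are hypothetical and are not taken from the Dynamic Extensions code base.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PermissibleValue:
    value: str
    concept_code: Optional[str] = None  # hypothetical slot for a vocabulary concept code


@dataclass
class Attribute:
    name: str
    permissible_values: List[PermissibleValue] = field(default_factory=list)

    def allows(self, value: str) -> bool:
        # Assumption: an attribute without permissible values accepts free text.
        if not self.permissible_values:
            return True
        return any(pv.value == value for pv in self.permissible_values)


@dataclass
class EntityClass:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Model:
    name: str
    classes: List[EntityClass] = field(default_factory=list)


# Hypothetical extension: a "SmokingHistory" class with one coded attribute.
smoking_status = Attribute(
    "smokingStatus",
    [PermissibleValue("Never"), PermissibleValue("Former"), PermissibleValue("Current")],
)
model = Model("LocalExtensions", [EntityClass("SmokingHistory", [smoking_status])])
```

Checking a value against the permissible list is the kind of backend validation rule mentioned in the high-level goals; a real implementation would resolve the values from caDSR or EVS rather than hard-coding them.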

7 caTissue Dynamic Extensions
- Add new data elements to caTissue through Dynamic Extensions (DEs)
- DEs can be created through caTissue’s UI or programmatically through the API
- Ad hoc UI creation of DEs is useful when you are creating one or two extensions at a single institution
- Programmatic DE creation is convenient when creating multiple extensions or creating extensions for multiple sites

8 Semantic Infrastructure 2.0 Relevant portions from September 2010 version: https://wiki.nci.nih.gov/download/attachments/29563169/CCBIIT_Semantic_Infrastructure_2.0_Roadmap_Sept_6_2010.pdf

9 Semantic Infrastructure 2.0

13 Semantic Infrastructure

14 Semantic Infrastructure Roadmap Concept Map

15 Radiology/AIM/Pathology Imaging Efforts
Need to clarify the approach to AIM (Annotation and Image Markup): what does it mean, and what is the unified approach for CBIIT?
Note that the AIM group (Stanford/Northwestern) is also considering that it may be able to support annotation, or develop a model that will support annotation, for caTissue, Pathology, and Imaging.
From Daniel Rubin (Radiologist, Stanford): “As I previously mentioned, we strongly believe that we should unify the disparate efforts related to image description/annotation among Radiology (AIM), Tissue (caTissue), and Digital Pathology. This will not only reduce fragmentation and improve interoperability, it will also enable substantive integration of radiology/pathology/molecular data needed for our Enterprise Use Cases and Big Health. Can we arrange a tcon with the key parties to discuss steps to move this forward?”
Next follow-up items to clarify direction? AIM (Rubin, Mongolkwat) are looking for a teleconference with Fore(TBPT), IMG

16 Annotation and Image Markup (Northwestern, Stanford)
When AIM is used to describe annotations, each information component (anatomic entity, observation, measurement, etc.) is explicitly captured in a semantically precise and computationally accessible manner. Thus, in the example above, an AIM-enabled PACS workstation would generate a pick list of RadLex® anatomic terms, from which the radiologist would select middle lobe of right lung. The specific RadLex® identifier for that location (RID1310) would automatically be embedded in the annotation. Similarly, the PACS workstation would generate pick lists for the AIM observation (mass, RID3874) and the AIM observation characteristic (enhancing, RID6065). The AIM can also contain the x and y coordinates of an outline drawn around a lesion or the coordinates of an arrow pointing to a lesion, as these are generated by the user of the PACS workstation. If calculations (for example, longest diameter or area) were performed by the workstation, then AIM could store these results. The latter are part of a list of standardized measurements. The details of the AIM information model are described elsewhere (3).
Once an annotation is defined in the AIM model, making sophisticated queries becomes relatively simple. Our query “Find all studies that contain enhancing right middle lobe lung masses that measure between 5 and 6 cm²” becomes “Find all image references in AIM annotations where AIM: Anatomic Entity = RID1310, AIM: Imaging Observation = RID3874, AIM: Observation Characteristic = RID6065, AIM: Calculation = Area and AIM: Calculation Result >5 and <6 cm².” The exact syntax and the mechanisms used to execute such a query are more complicated than those presented here but are well defined.
Source: http://radiology.rsna.org/content/253/3/590.full
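To make the rewritten query concrete, here is a minimal Python sketch that applies the same five conditions to a list of in-memory records. The flat `AimAnnotation` record, its field names, and the sample image identifiers are assumptions made for illustration; the actual AIM model and its query mechanism are more elaborate, as the slide notes. Only the RadLex identifiers and the 5 to 6 cm² range come from the slide.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AimAnnotation:
    # Hypothetical flattened view of one AIM annotation.
    image_reference: str
    anatomic_entity: str             # RadLex ID
    imaging_observation: str         # RadLex ID
    observation_characteristic: str  # RadLex ID
    calculation: str                 # e.g. "Area"
    calculation_result: float        # in cm^2 when calculation == "Area"


def find_image_references(annotations: Iterable[AimAnnotation]) -> List[str]:
    """Enhancing right middle lobe lung masses with area between 5 and 6 cm^2."""
    return [
        a.image_reference
        for a in annotations
        if a.anatomic_entity == "RID1310"                # middle lobe of right lung
        and a.imaging_observation == "RID3874"           # mass
        and a.observation_characteristic == "RID6065"    # enhancing
        and a.calculation == "Area"
        and 5 < a.calculation_result < 6
    ]


# Made-up sample data: only the first record satisfies every condition.
annotations = [
    AimAnnotation("image-001", "RID1310", "RID3874", "RID6065", "Area", 5.4),
    AimAnnotation("image-002", "RID1310", "RID3874", "RID6065", "Area", 7.2),
    AimAnnotation("image-003", "RID1310", "RID3874", "RID6065", "LongestDiameter", 5.5),
]
matches = find_image_references(annotations)
```

Because every condition compares a coded value rather than free text, the filter needs no text parsing, which is the point the slide makes about semantically precise annotations.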

17 Center for Comprehensive Informatics AIM Data Model (V2.0 rev5) Overview AIM Sept 2009 Pres

18 How will Semantic Infrastructure Address:
- Dynamic Extensions: how will this be supported (a service for them to be “registered”)? How can these terms be discoverable by other sites that want to incorporate them? How will they be queried over the grid?
- Vocabularies: how will users be able to select the “preferred name” but also view the synonyms?
- Pathology observations and radiology observations on an image: how do they fit? Observations might be empirical, but could also be generated from the output of algorithms acting on a small area of an image (e.g. cell counts, automatic staining detection, outline, or cardiac output from measurements on images).
- How about the RESULTING diagnosis/reporting, that is, the conclusions drawn from the observed data? This might be more static (e.g. breast ductal carcinoma in situ, hepatocellular cirrhosis). This is what is often passed on to other research systems (biobanks, clinical trials, etc.).

19 TBPT F2F Discussion (Nov 3) on Semantic Infrastructure
Working session(s): Semantic Infrastructure. Discussion on how it will move forward and how biorepository management and pathology data will be supported.
Semantic Infrastructure topics may include:
- How it will apply to pathology data and biorepository data
- How querying will be developed and managed across the caBIG program
- How it will help the end user
Discuss with the ARCH/VCDE team at the Oct 4 meeting (mention). Need an additional meeting that week. Need answers at the F2F meeting.

20 Appendix
Background material at end:
- Dynamic Extensions use in caTissue Suite (example)
- Emory (Sept 2009) analysis of AIM and how it may not support pathology; their initial proposal for MicroAIM
- Semantic Infrastructure: additional relevant portions

21 DYNAMIC EXTENSIONS IN CATISSUE SUITE (EXAMPLES) Background Slides

22 caTissue use cases
CREATE: Administrator creates new custom annotations. System stores metadata and creates RDBMS tables to store actual data.
DATA: Technician wants to add a custom annotation. System auto-generates a web page to add the custom annotation. System adds data to the database.
QUERY: Researcher queries for Specimens based on a custom annotation. System auto-generates query criteria pages and provides the ability to query across the static and dynamic model.
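The three use cases can be illustrated with a small, self-contained Python sketch using the standard `sqlite3` module. It is not how caTissue implements Dynamic Extensions: the table names (`specimen`, `de_metadata`), the function names, and the "smoking_history" extension are all hypothetical, and form generation is left out. The sketch only shows the idea of storing metadata, creating a table from it, and joining the dynamic table back to a static one.

```python
import sqlite3

ALLOWED_TYPES = {"TEXT", "INTEGER", "REAL"}

conn = sqlite3.connect(":memory:")
# Stand-in for the static model: a single specimen table.
conn.execute("CREATE TABLE specimen (id INTEGER PRIMARY KEY, label TEXT)")
# Metadata describing each dynamically created entity and its attributes.
conn.execute("CREATE TABLE de_metadata (entity TEXT, attribute TEXT, sql_type TEXT)")


def create_extension(conn, entity, attributes):
    """CREATE: store metadata, then create a table to hold the actual data."""
    names = [entity, *attributes]
    if not all(name.isidentifier() for name in names):
        raise ValueError("entity and attribute names must be plain identifiers")
    if not set(attributes.values()) <= ALLOWED_TYPES:
        raise ValueError("unsupported column type")
    for attribute, sql_type in attributes.items():
        conn.execute("INSERT INTO de_metadata VALUES (?, ?, ?)", (entity, attribute, sql_type))
    columns = ", ".join(f"{a} {t}" for a, t in attributes.items())
    conn.execute(
        f"CREATE TABLE {entity} (specimen_id INTEGER REFERENCES specimen(id), {columns})"
    )


def add_annotation(conn, entity, specimen_id, values):
    """DATA: insert one row, using the stored metadata to pick the columns."""
    rows = conn.execute("SELECT attribute FROM de_metadata WHERE entity = ?", (entity,))
    columns = [row[0] for row in rows]
    if not columns:
        raise KeyError(f"unknown extension: {entity}")
    placeholders = ", ".join("?" for _ in range(len(columns) + 1))
    conn.execute(
        f"INSERT INTO {entity} (specimen_id, {', '.join(columns)}) VALUES ({placeholders})",
        [specimen_id, *(values.get(c) for c in columns)],
    )


def query_specimens(conn, entity, attribute, value):
    """QUERY: join the static specimen table with the dynamic table."""
    known = conn.execute(
        "SELECT 1 FROM de_metadata WHERE entity = ? AND attribute = ?", (entity, attribute)
    ).fetchone()
    if known is None:
        raise KeyError(f"unknown attribute: {entity}.{attribute}")
    sql = (
        f"SELECT s.label FROM specimen s JOIN {entity} d ON d.specimen_id = s.id "
        f"WHERE d.{attribute} = ? ORDER BY s.label"
    )
    return [row[0] for row in conn.execute(sql, (value,))]


conn.executemany("INSERT INTO specimen VALUES (?, ?)", [(1, "SP-1"), (2, "SP-2")])
create_extension(conn, "smoking_history", {"status": "TEXT", "pack_years": "REAL"})
add_annotation(conn, "smoking_history", 1, {"status": "Former", "pack_years": 12.5})
add_annotation(conn, "smoking_history", 2, {"status": "Never"})
former_smokers = query_specimens(conn, "smoking_history", "status", "Former")
```

Names are checked against the stored metadata before being placed in SQL text, since table and column names cannot be passed as bound parameters; values always go through `?` placeholders.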

23 Creating DEs using the caTissue Suite User Interface
The four main features of DEs are:
- Form Creation
- Containment
- Linking
- Inheritance
Administrators can create DEs in caTissue:
1. By using EA to create the model and XMI file, IMPORT into caTissue, and send through the Review Process
2. Create the DE in caTissue, then export XMI and send through the Review Process
STEPS for caTissue Dynamic Extension creation:
- Requirement Document
- Design Model in Enterprise Architect
- Create XMI File; upload into caTissue
- Create Permissible Value file; upload into caTissue
- Create Form Definition File; upload into caTissue

24 Using DEs
Forms can be viewed and data can be entered/edited by navigating to the collection protocol based view under Biospecimen Data.
1. Click the Biospecimen Data tab
2. Select the collection protocol for which the form was created
3. Select the hook entity under which the form was created
4. Click the View Annotation tab
5. Select the desired form from the Annotation Forms drop-down list
6. View or enter/edit data in the form

25 DE at Indiana University (S. Ragg, G. Schadow, A. McMaho, Persisten) - 16 Dynamic Extensions Created
Pediatrics Oncology
Oncology
- Acute Lymphoblastic Leukemia
- Wilms Tumor
- Osteosarcoma
- Neuroblastoma
- Ewing's Sarcoma
Hematology
- Sickle Cell Disease
Neonatology
Inflammatory Bowel Disease
- Crohn’s Disease
- Ulcerative Colitis
Pediatrics Endocrinology
Endocrinology
- Diabetes Mellitus Type 1
- Diabetes Mellitus Type 2
Cardiology
Cardiology
Obstetrics
Ophthalmology
- Glaucoma
- Cataract
- Retinal Disease
Medicine
Cardiology

26 DE at Indiana - Workflow Dynamic Extension: Wilms Tumor

27 DE in Prostate SPORES project (Prostate SPORES consortium)
Inter-Prostate SPORE Biomarker Study (IPBS)
- Converted case report forms (CRFs) to spreadsheet
- Spreadsheets were reviewed and mapped to existing data elements in UML models of caTissue Core
- Distributed spreadsheet to several sites
- Identified obvious matches in Core/Suite model
- Assisted by caTissue development team in verifying and finding less-obvious matches
- Identified missing classes/attributes
- Some of these elements were incorporated into the next version of the caTissue model
From Andrew Helsley, 2008 caBIG Annual Meeting

28 IPBS Dynamic Extensions
- IPBS data elements not mapped to caTissue Core:
  - Determined to be out of scope:
    - Quality of life items
    - Patient Questionnaire
    - Follow-up data
  - Chosen for DE on:
    - Participant
    - Specimen
    - Specimen Collection Group
- Elements from spreadsheets not in caTissue Core were modeled with Enterprise Architect
- XMI created from UML models
- Dynamic Extension SOP for IPBS
From Andrew Helsley, 2008 caBIG Annual Meeting

29 IPBS Dynamic Extensions for the Specimen Hook Entity

30 AIM Background (from September 2009 slides by Emory)
Presentation given to TBPT as discussions started on understanding AIM: where it was and how it could support pathology. Emory presented the case for their MicroAIM, which has since evolved into their PAIS (current information was not available).

31 Center for Comprehensive Informatics Gap Analysis and Limitations
- One annotation per object
  - Annotation in one AIM document must be of the same observation type (no normal and cancer nuclei together)
  - One annotation per object leads to serious data redundancy and query performance deterioration
- Limited information in Annotation Of Annotation
  - AoA provides group-level information derived or calculated from multiple annotations, but the limited metadata of ReferencedAnnotation means all linked annotations must be dereferenced for queries or analysis
- Unknown levels of nesting for Annotation of Annotation
  - Results in queries that have an unknown number of joins for a single query, or an unknown number of queries
Gap Sept 2009 Pres
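The nesting concern can be illustrated with a short Python sketch. It models annotation documents as records that may reference other documents and counts how many lookups are needed to reach the underlying leaf annotations. The `AnnotationDoc` class, the in-memory store, and the identifiers are hypothetical, and the sketch assumes references never form a cycle; it is not the AIM schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class AnnotationDoc:
    uid: str
    # UIDs of referenced annotations; empty for an ordinary (leaf) annotation.
    referenced: List[str] = field(default_factory=list)


def resolve_leaves(store: Dict[str, AnnotationDoc], uid: str) -> Tuple[List[str], int]:
    """Return the leaf annotation UIDs under `uid` and the number of lookups used."""
    doc = store[uid]  # one lookup (roughly one join or query) per dereferenced document
    lookups = 1
    if not doc.referenced:
        return [uid], lookups
    leaves: List[str] = []
    for child_uid in doc.referenced:
        child_leaves, child_lookups = resolve_leaves(store, child_uid)
        leaves.extend(child_leaves)
        lookups += child_lookups
    return leaves, lookups


# Made-up example: g2 references g1 and a3; g1 references a1 and a2.
store = {
    "a1": AnnotationDoc("a1"),
    "a2": AnnotationDoc("a2"),
    "a3": AnnotationDoc("a3"),
    "g1": AnnotationDoc("g1", ["a1", "a2"]),
    "g2": AnnotationDoc("g2", ["g1", "a3"]),
}
leaves, lookups = resolve_leaves(store, "g2")
```

The lookup count depends on how deeply the particular document is nested, so it cannot be fixed when the query is written, which is the limitation the slide describes.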

32 Center for Comprehensive Informatics Gap Analysis and Limitations (cont’d)
- Insufficient grouping relationship of related annotations
  - Project information not modeled
  - Groupings of closely related annotations are not explicitly modeled, such as annotations for time series, validation set, serial section, etc.
- Limited markup types
  - Multipoint to represent polygon, semantically confusing
  - 3D geometric shapes missing
  - No image-based markups such as masks or tensors
  - Markups can also be animations, such as moving cells
- Lack of provenance information for computations
  - Algorithms, parameters, and inputs
Gap Sept 2009 Pres

33 Center for Comprehensive Informatics Gap Analysis and Limitations (cont’d)
- Limited image reference information
  - No support for references to microscopy images (WSI, TMA images, etc.)
  - Problems referencing images from multiple modalities generated by different equipment
- Patient centric: no support for animal image annotations, and no specimen information
- Ontology based on RadLex
  - Need support for subcellular anatomic entities, pathology and biology concepts and observations
- No versioning support
  - Versioning for schema, document instance and ontology needed
Gap Sept 2009 Pres

34 Center for Comprehensive Informatics Our Proposal: MicroAIM – Microscopy and Pathology Annotation and Image Markup
- Redefine AIM to support multiple objects and associate different observations to each object, within the same document
- Observation on a single or multiple objects, and can be derived
- Calculation on a single or multiple objects, and can be derived
- Another general class of type to represent mask or field value
- Geometric shapes should be extended to encompass 3D shapes
- Represent provenance information for computed markup and annotation
- Support pathology image reference and specimen information
MicroAIM Sept 2009 Pres

35 Center for Comprehensive Informatics Sketch of MicroAIM Concepts
- ImageReference: metadata that describes an image or a group of images that are used as the base for making markup and annotation, and can be used to identify and retrieve them from an image archive or data service
- Annotation: explanatory or descriptive information made by humans or machines directly related to the content of a referenced image or images
- Markup: graphical symbols associated with an image
- Provenance: information that helps determine the derivation history of a markup or annotation, such as algorithm information, parameters, and other inputs
- Project: aggregation of related images, markup, or annotations, from which conclusions may be drawn
- Group: aggregation of closely related subset of annotation documents
- User: the person who creates the MicroAIM document
MicroAIM Sept 2009 Pres
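One way to read these concept definitions is as a set of nested record types. The Python sketch below is only an interpretation: the concept names come from the slide, but every field name, the containment choices (for example, a document holding its annotations, and a project holding groups), and the sample values are assumptions, not the MicroAIM schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ImageReference:
    image_uid: str  # hypothetical key for retrieving the image from an archive


@dataclass
class Provenance:
    algorithm: str
    parameters: Dict[str, float] = field(default_factory=dict)
    inputs: List[str] = field(default_factory=list)


@dataclass
class Markup:
    shape: str  # e.g. "polygon"
    points: List[Tuple[float, float]] = field(default_factory=list)


@dataclass
class Annotation:
    description: str
    markups: List[Markup] = field(default_factory=list)
    # Assumption: None means the annotation was made by a human, not computed.
    provenance: Optional[Provenance] = None


@dataclass
class MicroAIMDocument:
    user: str
    images: List[ImageReference] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)


@dataclass
class Group:
    name: str
    documents: List[MicroAIMDocument] = field(default_factory=list)


@dataclass
class Project:
    name: str
    groups: List[Group] = field(default_factory=list)


def computed_annotations(project: Project) -> List[Annotation]:
    """Collect annotations that carry provenance, i.e. were produced by an algorithm."""
    return [
        a
        for g in project.groups
        for d in g.documents
        for a in d.annotations
        if a.provenance is not None
    ]


# Made-up example: one document mixing a manual and a computed annotation.
doc = MicroAIMDocument(
    user="pathologist-1",
    images=[ImageReference("slide-42")],
    annotations=[
        Annotation("tumor region", [Markup("polygon", [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)])]),
        Annotation("nuclei count", provenance=Provenance("segmenter", {"threshold": 0.5})),
    ],
)
project = Project("demo", [Group("serial sections", [doc])])
computed = computed_annotations(project)
```

Keeping several annotations of different kinds in one document reflects the proposal to support multiple objects within the same document, and the optional `provenance` field reflects the request to record algorithms, parameters, and inputs for computed results.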

36 Center for Comprehensive Informatics Overview of MicroAIM (Work-In-Progress) MicroAIM Sept 2009 Pres

37 Semantic Infrastructure 2.0 Relevant portions from September 2010 version: https://wiki.nci.nih.gov/download/attachments/29563169/CCBIIT_Semantic_Infrastructure_2.0_Roadmap_Sept_6_2010.pdf

