6th Annual CTSOG Workshop, Ann Arbor MI

Slides:



Advertisements
Similar presentations
A didactic plan for a communicative translation class Dr. Constanza Gerding Salas Leipzig Universität - Universidad de Concepción May 2012.
Advertisements

Service User Survey 2011 Service User Survey Results 2011 Toni Martin – Senior Consultant Quality Health.
A Metric for Software Readability by Raymond P.L. Buse and Westley R. Weimer Presenters: John and Suman.
Copyright © 2015, 2011, 2008 Pearson Education, Inc. Chapter 5, Unit B, Slide 1 Statistical Reasoning 5.
TAP-ET: TRANSLATION ADEQUACY AND PREFERENCE EVALUATION TOOL Mark Przybocki, Kay Peterson, Sébastien Bronsart May LREC 2008 Marrakech, Morocco.
MT Evaluation: Human Measures and Assessment Methods : Machine Translation Alon Lavie February 23, 2011.
Jan/Feb 2008CAPA Train-the-Trainer Workshop1 CAPA Training CAPA Examiners.
Division of Biomedical Informatics Beyond Interoperability: What Ontology Can Do for the EHR William R. Hogan, MD, MS July 30 th, 2011 International Conference.
Annotation of 311 Admission Summaries of the ICU Corpus Yefeng Wang.
Keyword extraction for metadata annotation of Learning Objects Lothar Lemnitzer, Paola Monachesi RANLP, Borovets 2007.
Early Results from the Council of Graduate Schools Study of the Bepress/UMI Online Submission Application ECURE March 2, 2005 Bill Savage UMI Dissertations.
Teaching and Testing Pertemuan 13
Using Information Extraction for Question Answering Done by Rani Qumsiyeh.
HL7 RIM Exegesis and Critique Regenstrief Institute, November 8, 2005 Barry Smith Director National Center for Ontological Research.
Ontology-based Information Extraction for Business Intelligence
This project was supported by the Eunice Kennedy Shriver National Institute Of Child Health & Human Development of the National Institutes of Health under.
IMPROVING THE DOCUMENTATION OF DIAGNOSES Carol A. Lewis.
Cis-Regulatory/ Text Mining Interface Discussion.
Logic Programming for Natural Language Processing Menyoung Lee TJHSST Computer Systems Lab Mentor: Matt Parker Analytic Services, Inc.
Guideline development,re- licensing, continuing education and the nursing profession Dr.Cordula Wagner NIVEL
Scott Duvall, Brett South, Stéphane Meystre A Hands-on Introduction to Natural Language Processing in Healthcare Annotation as a Central Task for Development.
 CiteGraph: A Citation Network System for MEDLINE Articles and Analysis Qing Zhang 1,2, Hong Yu 1,3 1 University of Massachusetts Medical School, Worcester,
Training dependency parsers by jointly optimizing multiple objectives Keith HallRyan McDonaldJason Katz- BrownMichael Ringgaard.
Patterns of Square Numbers Module 1. A Question For You… You are helping your niece with her homework and she says, “I notice that every time I square.
Responsible Engineers Framing the Problem. How do we address a problem? When addressing an ethical dilemma, we usually experience moral disagreement and.
Chapter 10: Hypothesis Testing Using a Single Sample.
Guidelines to Test Preparation Guidelines to Test Preparation.
Assessment. Workshop Outline Testing and assessment Why assess? Types of tests Types of assessment Some assessment task types Backwash Qualities of a.
® Changes in Opioid Use Over One Year in Patients with Chronic Low Back Pain Alejandra Garza, Gerald Kizerian, PhD, Sandra Burge, PhD The University of.
Big Mechanism for Processing EEG Clinical Information on Big Data Aim 1: Automatically Recognize and Time-Align Events in EEG Signals Aim 2: Automatically.
Validation and recognition learning outcomes achieved abroad using ECVET Karin Luomi-Messerer – 3s Daniela Ulicna – GHK ECVET Pilot Projects 7th Seminar.
Consent & Vulnerable Adults Aim: To provide an opportunity for Primary Care Staff to explore issues related to consent & vulnerable adults.
Based on the performance appraisal system, the nursing home reported an improvement in the reduction of medication errors. However, adverse clinical.
Copyright © 2016 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill Education.
NIH Considerations for CBI Trainees Leslie Kinsland November 12, 2015.
EVALUATION SUFFECIENCY Types of Tests Items ( part I)
Outline Feedback and Structure. The texts you have produced.
Survey on Expert System Seung Jun Lee Dept. of Nuclear and Quantum Engineering KAIST Mar 3, 2003.
Evaluator Identification & Preview Sign your name at the end of the essay. Review objective of the PROGRESS CHECK. Take 2 minutes to preview your peers.
MEAP scores and beyond. 3 rd grade Math – 44% proficient. This is a 24% increase from last year’s 3 rd grade but about the same as the two years before.
December 4, 2009 State Board of Education adopted:  Oregon Diploma  Modified Diploma  Extended Diploma  Alternative.
5 th -6 th December th Meeting Paris WP2: NERC.
In 2014/15 a new national curriculum framework was introduced by the government for Years 1, 3, 4 and 5. However, Years 2 and 6 (due to statutory testing)
Identifying Actionable Post-Acute Stroke Concerns:
Assessing SNOMED CT for Large Scale eHealth Deployments in the EU Workpackage 2- Building new Evidence Daniel Karlsson, Linköping University Stefan Schulz,
Resume and Interview Tips
Automatically Labeled Data Generation for Large Scale Event Extraction
Developing items and assembling a test based on curriculum standards
Concept Grounding to Multiple Knowledge Bases via Indirect Supervision
Results and Discussion
Quality of Medical Care Received by Individuals with Mental Illnesses
Assessments Attitudes of UK Teachers & Parents
CRF &SVM in Medication Extraction
The ISSAIs for Financial Audit ISSAIs
Evaluating and improving a clinical practice guideline in the Western Cape, South Africa AIM STATEMENT: To design and use an appropriate evaluation tool.
A Presentation to Parents
Conducting an RHC Evaluation and
Human Language Technology Research Institute
9th – 12th May 2016 Information Evening for Parents
Key Stage 2 SATs.
A2 unit 4 Clinical Psychology
Partial Credit Scoring for Technology Enhanced Items
Stage 1 – Topic Sentences
Brain, Other CNS and Intracranial Tumours, Malignant (C70-C72, C75
Planning & Scheduling openSE 2.0 Task Force
Human Language Technology Research Institute
Asking the Right Questions
Cancer Patient Experience Survey, by Sex, England Female Male Persons
Client’s Rights & Choices
Celebrating Success and Making a Plan for Sustainability
Presentation transcript:

6th Annual CTSOG Workshop, Ann Arbor MI Measuring Interannotator Agreement in the Florida Annotated Corpus for Translational Science - The difficult ontological task Amanda Hicks University of Florida aehicks@ufl.edu 6th Annual CTSOG Workshop, Ann Arbor MI

Overview Overview of FACTS Interannotator Agreement scores The difficult task Can ontologists and philosophers do it better?

The Big Goal Comparatively evaluate the adequacy of ontologies for extracting patient-level data from unstructured text. Create a gold standard, ontologically annotated corpus for clinical and translational science. Annotate with multiple – and, as far as possible, competing –ontologies

The Florida Annotated Corpus for Translational Science (FACTS) Currently consists of 20 annotated case reports on hypertension Full text Freely available through PubMed English Within last 6 years Stratified by race, ethnicity, gender and age (<18 or 18+) Annotated with VSO, in the process of annotating with DOID Can be extended to other domains or document types

The Annotation Tasks Identify the assertions about a person mentioned in the corpus Annotate the entities referred to in those assertions with ontology classes Annotate the relations between individuals thereby representing the full assertion current tasks

The Annotation Process Follows the annotation procedure for the CRAFT corpus Two primary annotators annotated case reports with classes from the Vital Sign Ontology using BRAT One medical student, one public health specialist with training in nursing Primary annotations were sent to the lead annotator, who reviewed discrepancies Diffs were discussed at weekly meetings and consensus achieved, producing the gold standard. Annotation guidelines were used and revised during the annotation process.

Interannotator Agreement Was Low   f-measure exact matches only exact and partial matches Hypertension 1 1st set of10 case reports 0.50 0.54 Hypertension 2 2nd set of 10 case reports 0.60 0.69 Full Corpus 0.57 The CRAFT corpus achieves ~.90 f-score consistently.

What happened? One annotator had more training and experience than the other. However, IAA on Hypertension 2 is still quite low .06-.69. Our annotators performed two tasks, unlike CRAFT annotators. Identify the instance level assertions about an individual person mentioned in the corpus Annotate the entities referred to in those assertions with ontology classes

What is the major source of disagreement?   f-measure exact matches only exact and partial matches agreement of classes on matched spans only Hypertension 1 0.50 0.54 0.93 Hypertension 2 0.60 0.69 0.87 Full Corpus 0.57 0.90 When the primary annotators agree on the span, they tend to agree on the class. This suggests that the difficult task is determining whether a token expresses an instance level assertion.

Easy Cases "A 56-year-old man suddenly developed dyspnea after resection of choroidal melanoma” "Previous studies noted a significant association between melanoma and endothelin (ET)-1.” Sato K, Saji T, Kaneko T, Takahashi K, Sugi K. Unexpected pulmonary hypertensive crisis after surgery for ocular malignant melanoma. Life Sci. 2014;118(2):420-3. Epub 2014/03/19. doi: 10.1016/j.lfs.2014.03.004. PubMed PMID: 24632478.

Difficult cases Which terms denote individuals and which do not? The subordinate clauses make this an interesting case. "First, a massive amount of ET-1, which is a proliferation factor in malignant melanoma, was released due to mechanical stimulation from endoresection, a procedure in which the tumor is cut into very small fragments and aspirated (Fig. 6)." "As this is the first report of pulmonary hypertension after endoresection, it might be useful to determine the differences between our patient and other patients treated with endoresection.” We decided that that 'pulmonary hypertension' denotes a particular, but it is not clear to me that this is correct. Sato K, Saji T, Kaneko T, Takahashi K, Sugi K. Unexpected pulmonary hypertensive crisis after surgery for ocular malignant melanoma. Life Sci. 2014;118(2):420-3. Epub 2014/03/19. doi: 10.1016/j.lfs.2014.03.004. PubMed PMID: 24632478.

Next steps Can ontologists and philosophers agree more than the specialist annotators on the hard task? We will have working ontologists annotate a sample set of case reports for instance level statements. We will have philosophy graduate students annotate a sample set of case reports for instance level statements.

Acknowlegments Selja Seppälä, University of Cork Bill Hogan, University of Florida Carl Pepine, University of Florida Nathan Boire, University of Florida Chloe Herring, University of Florida This work was supported in part by the NIH/NCATS Clinical and Translational Science Award to the University of Florida UL1 TR000064. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the NCTE.