Providing Annotation Options, Without Revealing Too Much

Jeffrey P Ferraro MS, 1-2; Scott L DuVall PhD, 3-4; Peter Haug MD
1 Intermountain Healthcare, Salt Lake City, UT
2 Department of Biomedical Informatics, University of Utah, Salt Lake City, UT
3 Internal Medicine, University of Utah, Salt Lake City, UT
4 VA Salt Lake City Health Care System, Salt Lake City, UT

Objective

An automated annotation tool was developed to assist human annotators in the efficient production of a high-quality reference standard of part-of-speech tags. It was important to reduce demand effect bias, which can arise when human annotators accept system-generated part-of-speech cues even when those cues are inaccurate.

Background

For an ongoing research study exploring domain adaptation of a Natural Language Processing (NLP) part-of-speech tagging system, a reference standard for the target domain of healthcare was created using clinical text reports. One of the challenges faced in clinical NLP is the limited availability of the high-quality annotated clinical text required for training NLP machine learning models, in this case a Transformation Based Learner.

Methods

Six (6) part-of-speech tagging systems were used with their out-of-the-box tagging models to initially tag the clinical text corpus. These include the Stanford Log-linear Part-Of-Speech Tagger [1], OpenNLP pos-tagger [2], MorphAdorner [3], LingPipe [4], the Illinois Part of Speech Tagger [5], and the Specialist Lexicon [6] modified to contain only non-ambiguous terms. Tag output from each system was mapped to a normalized tag-set. To control for demand effect, tags generated by the six taggers were filtered for duplicates, leaving only distinct tag choices, so that selection was not prompted by a tag being displayed multiple times.

A web interface was created to display each sentence to be annotated. For each word, the auto-generated cue tags were displayed along with a drop-down containing the complete tag-set, as illustrated in Figure 1. The drop-down was populated with a default only when all systems generated the same tag; otherwise, the human annotator needed to actively select a tag. Regardless of the default, any tag could be manually selected as the appropriate tag for each word.

Results

A clinical text corpus of 212 randomly selected sentences from the 10 most common clinical report types from Intermountain Medical Center was annotated by two annotators using the annotation tool. An inter-annotator agreement of 0.95 (p-value < ) was achieved over 3,672 tagged words using Fleiss' kappa.

Demand effect was analyzed by evaluating the kappa between the human annotators and the six taggers. We would expect a relatively high kappa score between the human annotators and the taggers, as these taggers are on average 80-86% accurate on clinical texts. We would not, however, expect scores between a human annotator and a tagger to reach the levels seen between human annotators; that would indicate passive system agreement and demand effect. We would expect, and did see, higher kappa scores between human annotators, reflecting reliance on their expert knowledge of the target domain.

Annotators                    Kappa
HA1 vs. HA2                   0.95
HA1 vs. Illinois Tagger       0.859
HA2 vs. Illinois Tagger       0.852
HA1 vs. Stanford Tagger       0.850
HA2 vs. Stanford Tagger       0.843
HA1 vs. OpenNLP               0.850
HA2 vs. OpenNLP               0.841
HA1 vs. MorphAdorner          0.822
HA2 vs. MorphAdorner          0.820
HA1 vs. LingPipe              0.810
HA2 vs. LingPipe              0.807
HA1 vs. Spec. Lexicon         0.803
HA2 vs. Spec. Lexicon         0.815

Conclusion

Using the approach described, we were able to leverage existing, moderately accurate NLP tools to assist human annotators in producing a high-quality reference standard without introducing bias from the semi-automated support for the manual annotation process.

References

1. The Stanford Natural Language Processing Group. Stanford Log-linear Part-Of-Speech Tagger.
2. OpenNLP. OpenNLP Tools. sourceforge.net//
3. Academic and Research Technologies, Northwestern University. MorphAdorner.
4. Alias-i. LingPipe.
5. Cognitive Computation Group, University of Illinois at Urbana-Champaign. Illinois Part of Speech Tagger. BJPOS
6. The Lexical Systems Group, National Library of Medicine. Specialist NLP Tools.
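Illustrative sketch

The cue filtering and agreement measurement described in the Methods and Results can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (distinct_cues, default_tag, fleiss_kappa) and the example tags are hypothetical, and the tags are assumed to have already been mapped to the normalized tag-set.

```python
from collections import Counter


def distinct_cues(tagger_outputs):
    """Return the distinct tags in first-seen order.

    Showing each tag once avoids prompting the annotator toward a tag
    simply because several taggers produced it.
    """
    return list(dict.fromkeys(tagger_outputs))


def default_tag(tagger_outputs):
    """Return a drop-down default only when every tagger agrees, else None."""
    cues = distinct_cues(tagger_outputs)
    return cues[0] if len(cues) == 1 else None


def fleiss_kappa(ratings):
    """Compute Fleiss' kappa.

    `ratings` is a list of items (words); each item is a list holding one
    tag per rater. Every item must have the same number of raters (>= 2).
    Raises ZeroDivisionError if all ratings fall in a single category,
    where kappa is undefined.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this item.
        pairs = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += pairs / (n_raters * (n_raters - 1))
    observed = agreement_sum / n_items
    total = n_items * n_raters
    expected = sum((t / total) ** 2 for t in category_totals.values())
    return (observed - expected) / (1 - expected)


# Hypothetical output of six taggers for one word.
cues = ["NN", "NN", "VB", "NN", "JJ", "NN"]
print(distinct_cues(cues))   # cue tags shown to the annotator
print(default_tag(cues))     # None: the annotator must actively choose
```

Returning None when the taggers disagree mirrors the requirement that the annotator make an active selection. The same fleiss_kappa function can be applied to any pair of raters (annotator versus annotator, or annotator versus tagger) by passing two-element items.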