Ioana Barbantan and Rodica Potolea. Lots of technology to capture health information.

What happens when the device’s native language is not the speaker’s native language?

Goal: Identify negation in EHRs (electronic health records) as a step toward retrieving relations among medical concepts, adapting an established English methodology to Romanian.

Basic methods: Interpret the structure of words and check whether each word exists in the language both with and without its prefix.

Morphologic negation is indicated by prefixes such as in-, im-, il-, dis-, un-, or by the suffixes -less and -out (e.g., without). Source: Laura Hidalgo-Downing, Negation, Text Worlds, and Discourse: The Pragmatics of Fiction.
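A minimal Python sketch of the idea on the last two slides: strip a listed affix and accept the word as morphologically negated only if both the full word and the remainder exist in the language. The affix lists come from the slide; the vocabulary and the function name `find_morphologic_negation` are made up for illustration and are not the authors' implementation.

```python
# Affixes as listed on the slide; the vocabulary is a made-up toy sample.
NEGATIVE_PREFIXES = ("in", "im", "il", "dis", "un")
NEGATIVE_SUFFIXES = ("less", "out")

TOY_VOCABULARY = {
    "pleasant", "unpleasant", "cooked", "uncooked", "legal", "illegal",
    "possible", "impossible", "pain", "painless", "with", "without",
    "under", "image", "discuss",
}


def find_morphologic_negation(word, vocabulary=TOY_VOCABULARY):
    """Return (affix, base) if the word looks morphologically negated, else None.

    A word counts as negated only when both the full word and the form
    without the affix are present in the vocabulary.
    """
    word = word.lower()
    if word not in vocabulary:
        return None
    for prefix in NEGATIVE_PREFIXES:
        base = word[len(prefix):]
        if word.startswith(prefix) and base in vocabulary:
            return (prefix + "-", base)
    for suffix in NEGATIVE_SUFFIXES:
        base = word[:-len(suffix)]
        if word.endswith(suffix) and base in vocabulary:
            return ("-" + suffix, base)
    return None


if __name__ == "__main__":
    for sample in ("uncooked", "illegal", "without", "under", "image"):
        print(sample, find_morphologic_negation(sample))
```

Words such as "under" or "image" start with a listed prefix but are rejected, because "der" and "age" are not in the toy vocabulary. This is why the check needs the form without the prefix as well as the full word.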

Medical records: The expectation is that negative prefixes are broadly used and negation is clearly formulated, since an EHR should be clear and contain as few ambiguous terms as possible.

Negation in medical records: There are many ways to say the same thing, e.g.:
- The patient has no symptoms. (syntactic negation)
- The patient is asymptomatic. (morphologic negation)
- The patient doesn't have symptoms. (syntactic negation)

Morphologic Negation

Goal: Source language (English) to target language (Romanian)
- Instantiate a cross-language methodology that identifies morphologic negation in both the source and target languages
Task: Negation identification in EHRs

Dataset: EHRs available in English
- Semi-structured documents
- Inpatient
- Contain symptoms, history, procedures, medications
1. Translate into Romanian using an online translation service.
2. Use a dictionary-based approach to identify morphologic negation.

Rules for negation identification

Proposed methodology

RoPreNext algorithm:
- Looks words up in an online Romanian dictionary
- The dictionary links words to their definitions (and has integrated synonyms)
- Includes an additional verification step (for regional/rural expressions)

Lemmatization process: For each word that is a possible negated concept:
- Remove the prefix and preprocess the remainder
If the preprocessed word matches a dictionary entry:
- Send it to the negation identification rules
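The lemmatization step above might be sketched as follows. This is an illustration under stated assumptions, not the RoPreNext code: the slides do not list the Romanian prefixes or describe the preprocessing, so the prefix tuple, the lowercase-and-strip-punctuation `preprocess` helper, and the toy dictionary are all hypothetical.

```python
import string

# Hypothetical prefix list and toy dictionary; the slides enumerate neither.
ROMANIAN_NEGATIVE_PREFIXES = ("ne", "in", "im", "a")
TOY_DICTIONARY = {"sigur", "posibil", "simptomatic", "normal"}


def preprocess(token):
    """Lowercase a token and strip surrounding punctuation (assumed preprocessing)."""
    return token.lower().strip(string.punctuation)


def candidate_bases(token, prefixes=ROMANIAN_NEGATIVE_PREFIXES):
    """Yield (prefix, base) for every negative prefix the token starts with."""
    word = preprocess(token)
    for prefix in prefixes:
        if word.startswith(prefix) and len(word) > len(prefix):
            yield prefix, word[len(prefix):]


def lemmatize_for_negation(token, dictionary=TOY_DICTIONARY):
    """Return the (prefix, base) pairs whose base matches a dictionary entry.

    These are the candidates that would be forwarded to the negation
    identification rules; an empty list means nothing is forwarded.
    """
    return [(prefix, base) for prefix, base in candidate_bases(token)
            if base in dictionary]


if __name__ == "__main__":
    for sample in ("Nesigur,", "imposibil", "asimptomatic", "inima"):
        print(sample, lemmatize_for_negation(sample))
```

Returning a list rather than a single pair keeps every prefix split that matches the dictionary, so the choice between them is left to the rules that follow.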

Morphologic negation rules:
- Literal words: preprocessing step applied to words found in the dictionary
- Definition content: identifies negation based on the word's definition
- Undefined prefix word: the word is not defined in the dictionary (and could be domain-specific)
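One possible reading of the three rules, sketched in Python. The slides do not give the order in which the rules fire or their exact tests, so the ordering, the cue words, the rule labels, and the toy definitions below are assumptions made for illustration only.

```python
# Toy dictionary (word -> definition); entries are illustrative, not real dictionary text.
TOY_DEFINITIONS = {
    "sigur": "ferit de pericol",
    "nesigur": "care nu este sigur",
    "inapt": "care nu este apt",
    "inimă": "organ muscular al aparatului circulator",
    "traumatic": "provocat de o traumă",
}

# Assumed negation cue words to look for inside a definition.
NEGATION_CUES = {"nu", "fără"}


def apply_negation_rules(word, prefix, definitions=TOY_DEFINITIONS, cues=NEGATION_CUES):
    """Return the label of the first rule that fires, or None if none does."""
    base = word[len(prefix):]
    # Undefined prefix word: the prefixed word has no dictionary entry
    # (it may be a domain-specific term).
    if word not in definitions:
        return "undefined_prefix_word"
    # Literal words: the word without its prefix is itself a dictionary word.
    if base in definitions:
        return "literal_words"
    # Definition content: the definition of the prefixed word contains a cue.
    if cues & set(definitions[word].split()):
        return "definition_content"
    return None


if __name__ == "__main__":
    for word, prefix in (("nesigur", "ne"), ("inapt", "in"),
                         ("atraumatic", "a"), ("inimă", "in")):
        print(word, apply_negation_rules(word, prefix))
```

In this sketch "inimă" starts with a negative prefix but triggers no rule, since "imă" is not a dictionary word and the definition contains no cue.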

Experiments

Rules coverage

Limitations:
- Translated documents: not a one-to-one translation
- Did not include any language-specific methodologies for text analysis
- Word-level issues: root structure changes are not caught
- Dictionary-level issues: may not contain specialized terms (e.g., atraumatic)

Conclusions:
- Reliable: false positives are not medical-related concepts
- Future work: spell-check documents first; look into abbreviations

Questions: How can we apply this? Could it be used for additional languages?