Published by Denise Ferriss. Modified over 10 years ago.
1
Automatic Mapping of Clinical Documentation to SNOMED CT
Holger Stenzhorn, Saarland University Hospital, Homburg, Germany
Edson Pacheco, Percy Nohama, Stefan Schulz
Freiburg University Medical Center, Germany
Federal Technological University of Paraná, Brazil
2
Introduction Methods Results Conclusion
3
Background
– Important role of narrative content in the Electronic Health Record (EHR)
– Manual coding: cost, quality and scope problems
– Increasing demand for high-quality structured data
– SNOMED CT, as a new terminological standard, claims to represent the whole clinical process
– Can language technology help semantically enrich narratives in the EHR?
4
Case study
Source:
– Discharge summaries from the cardiology department of the Hospital de Clínicas de Porto Alegre, Brazil
– Language: Portuguese
Target:
– SNOMED Clinical Terms, 01/2009
– Languages: English, Spanish
5
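Sample Discharge Summary
Original text (Portuguese, reproduced with its errors):
# HAS
# DM
# Miocardiopatia dilatada chagásica (FE 35%)
# Ca de prostata - orquiectomia (2004)
# Cardiopatia isquêmica - IAM em 2005, com colocação de stent em DA e lesão severa inoperável em CD
Pct vem a emergência em 20/03 com quadro de dor torácica típica, sem elevação enzimática, com diagnóstico de angina instável e fibrilação atrial não identificada em avaliações prévias. Adicionalmente, apresentava descompensação do diabetes com sindrome hiperosmlar não cetótica. Recebe tratamento clínico para otimização do quadro e é submetido a novo cateterismo em 28/03, que demonstra CD ocluída no terço proximal, DA com stent rpoximal com lesão de 40% no seu interior e Mg de Cx com lesão de 60-65%. Recebe alta em bom estado geral, sem dor torácica, anticoagulado, com plano de retorno ambulatorial para equipe de cardiopatia isquêmica e para o ambulatório de anticoagulação.
English translation (acronyms expanded):
# Systemic arterial hypertension
# Diabetes mellitus
# Chagasic dilated cardiomyopathy (ejection fraction 35%)
# Prostate cancer - orchiectomy (2004)
# Ischemic heart disease - acute myocardial infarction in 2005, with stent placement in the left anterior descending artery and a severe inoperable lesion in the right coronary artery
The patient presents to the emergency department on 20/03 with typical chest pain, without enzyme elevation, and is diagnosed with unstable angina and atrial fibrillation not identified in previous evaluations. Additionally, the patient presented decompensated diabetes with non-ketotic hyperosmolar syndrome. The patient receives clinical treatment to optimize the condition and undergoes a new catheterization on 28/03, which shows the right coronary artery occluded in the proximal third, the left anterior descending artery with a proximal stent with a 40% lesion inside it, and the marginal branch of the circumflex artery with a 60-65% lesion. The patient is discharged in good general condition, without chest pain, anticoagulated, with a plan for outpatient follow-up with the ischemic heart disease team and the anticoagulation clinic.
Phenomena highlighted on the slide:
– Acronyms
– Abbreviations
– Punctuation errors
– Typing errors
– Telegram style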
7
NLP pipeline
Clinical text branch:
– sentence detecting
– spell checking
– acronym expansion
– NE recognition
– POS tagging
– NP extraction
– context detection
– morpho-semantic abstraction
– output: term candidates in MID representation
SNOMED CT branch:
– SCT-EN, SCT-SP
– subset creation
– morpho-semantic abstraction
– output: SNOMED CT in MID representation
8
Language processing tools implemented
– Sentence splitter, POS tagger: OpenNLP, trained with manually annotated texts
– Acronym expander: RegExp matching against an acronym database, disambiguation by local context (token co-occurrence in a three-token window)
– Noun phrase detector: driven by typical POS patterns in Spanish SNOMED CT descriptions (with few adaptations to Portuguese, due to the similarity between the two languages)
– Not yet implemented: spell checker, NE recognizer, context (e.g. negation) detector
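The acronym expander described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the database entries, the context cue words, and the names `ACRONYM_DB` and `expand_acronyms` are all hypothetical, and the "three token window" is assumed here to mean three tokens on each side of the acronym.

```python
import re

# Hypothetical acronym database (not from the slides): each acronym maps to
# candidate expansions, each with context tokens that suggest that reading.
ACRONYM_DB = {
    "DA": [
        ("artéria descendente anterior", {"stent", "lesão", "cateterismo"}),
        ("doença de Alzheimer", {"demência", "memória"}),
    ],
    "DM": [("diabetes mellitus", set())],
}

# One regular expression built from the database keys. Matching is
# case-sensitive, so the Portuguese preposition "da" is left untouched.
ACRONYM_RE = re.compile(
    "|".join(re.escape(a) for a in sorted(ACRONYM_DB, key=len, reverse=True))
)
TOKEN_RE = re.compile(r"\w+")
WINDOW = 3  # assumption: three tokens on each side of the acronym


def expand_acronyms(text):
    """Replace known acronyms by the expansion best supported by local context.

    Punctuation is dropped because the text is re-joined from word tokens.
    """
    tokens = TOKEN_RE.findall(text)
    out = []
    for i, tok in enumerate(tokens):
        if not ACRONYM_RE.fullmatch(tok):
            out.append(tok)
            continue
        left = tokens[max(0, i - WINDOW):i]
        right = tokens[i + 1:i + 1 + WINDOW]
        context = {t.lower() for t in left + right}
        # Choose the expansion whose cue tokens co-occur most often in the
        # window; on a tie, the first-listed expansion wins.
        expansion, _cues = max(ACRONYM_DB[tok], key=lambda c: len(c[1] & context))
        out.append(expansion)
    return " ".join(out)
```

For instance, `expand_acronyms("colocação de stent em DA e lesão severa")` selects the coronary-artery reading because "stent" and "lesão" fall inside the window.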
9
Morphosemantic Abstraction
– Uses the MSI (morphosemantic indexing) toolkit (Averbis GmbH, Freiburg)
– Extraction of significant word fragments (subwords) and mapping to semantic identifiers (MIDs):
    #cardiac = { heart, cardiac, herz, kard, corac, cardiac, coeur, … }
    #inflamm = { inflamm, -itic, -itis, -phlog, entzuend, -itis, inflam, flog, inflam, flog, ... }
– Thesaurus: ~21,000 equivalence classes
– Lexicon entries:
    English: ~23,000
    German: ~24,000
    Portuguese: ~15,000
    Spanish: ~11,000
    French: ~8,000
    Swedish: ~10,000
    Italian: ~4,000
– Equivalence classes and their subwords (diagram):
    MUSCLE: muscle, myo, muskel, muscul
    INFLAMM: inflamm, -itis, inflam, entzünd
    HEART: herz, heart, card, corazon, card
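The idea of mapping subwords to language-independent MIDs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the MSI toolkit's actual segmentation algorithm is not described on the slide, so a greedy longest-match segmenter is swapped in here, and the tiny lexicon `SUBWORD_TO_MID` and the function `to_mids` are hypothetical.

```python
# Toy subword lexicon (hypothetical): a handful of entries taken from the
# slide's diagram, each mapped to an illustrative MID.
SUBWORD_TO_MID = {
    "muscle": "#muscle", "myo": "#muscle", "muskel": "#muscle", "muscul": "#muscle",
    "heart": "#cardiac", "herz": "#cardiac", "card": "#cardiac", "corazon": "#cardiac",
    "inflamm": "#inflamm", "inflam": "#inflamm", "itis": "#inflamm", "entzünd": "#inflamm",
}
MAX_LEN = max(len(s) for s in SUBWORD_TO_MID)


def to_mids(word):
    """Segment a word greedily (longest subword first) and return its MIDs.

    Assumption: greedy longest-match stands in for the real segmenter.
    Characters that start no known subword are skipped.
    """
    word = word.lower()
    mids = []
    i = 0
    while i < len(word):
        for j in range(min(len(word), i + MAX_LEN), i, -1):
            mid = SUBWORD_TO_MID.get(word[i:j])
            if mid is not None:
                mids.append(mid)
                i = j
                break
        else:
            i += 1  # no subword starts here; move on by one character
    return mids
```

With this toy lexicon, the English "myocarditis" and the German "Herzmuskelentzündung" yield the same set of MIDs, which is the property that makes matching across languages possible.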
10
Methods: NLP pipeline
– Clinical text branch: sentence detecting, spell checking, acronym expansion, NE recognition, POS tagging, NP extraction, context detection, morpho-semantic abstraction; output: term candidates in MID representation
– SNOMED CT branch: SCT-EN and SCT-SP, subset creation, morpho-semantic abstraction; output: SNOMED CT in MID representation
– Mapping heuristics
11
SNOMED CT Concepts as Subwords
Language  Description                          MIDs
ENG       Congestive heart failure             #abund #cardiac #deficien
ENG       Congestive heart disease             #abund #cardiac #disorder
ENG       Congestive cardiac failure           #abund #cardiac #deficien
SPA       Insuficiencia cardíaca               #insuff #cardiac
SPA       Insuficiencia cardíaca congestiva    #insuff #cardiac #abund
12
Mapping heuristics
For each term candidate:
– decide whether there is a matching SNOMED description
– if yes, find the best SNOMED description
– map to the pertaining SNOMED description
Preference criteria:
– matching with “term-typical” POS patterns
– MID coincidence (weighted by tf-idf), threshold: 60%
In case of failure:
– test whether the term candidate corresponds to two SNOMED concepts
– check the plausibility of concept coordinations using the SNOMED relationship table
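The MID-coincidence criterion can be sketched as follows. The slide gives only "weighted by tf-idf" and a 60% threshold, so the scoring formula here is an assumption: a weighted Jaccard overlap with smoothed idf weights, where tf is 1 because each MID occurs once per description. The names `DESCRIPTIONS`, `idf_weights`, `score` and `best_match` are hypothetical; the five descriptions come from the slide "SNOMED CT Concepts as Subwords". POS-pattern preference and the two-concept fallback are not covered.

```python
import math

# MID representations of SNOMED CT descriptions, as listed on the slide
# "SNOMED CT Concepts as Subwords".
DESCRIPTIONS = {
    "Congestive heart failure": {"#abund", "#cardiac", "#deficien"},
    "Congestive heart disease": {"#abund", "#cardiac", "#disorder"},
    "Congestive cardiac failure": {"#abund", "#cardiac", "#deficien"},
    "Insuficiencia cardíaca": {"#insuff", "#cardiac"},
    "Insuficiencia cardíaca congestiva": {"#insuff", "#cardiac", "#abund"},
}


def idf_weights(descriptions):
    """Smoothed inverse document frequency per MID (assumed weighting)."""
    n = len(descriptions)
    df = {}
    for mids in descriptions.values():
        for mid in mids:
            df[mid] = df.get(mid, 0) + 1
    return {mid: math.log((1 + n) / (1 + c)) + 1.0 for mid, c in df.items()}


IDF = idf_weights(DESCRIPTIONS)
UNSEEN = math.log(1 + len(DESCRIPTIONS)) + 1.0  # weight for MIDs in no description


def weight(mids):
    return sum(IDF.get(m, UNSEEN) for m in mids)


def score(candidate, description_mids):
    """Weighted Jaccard overlap between two MID sets, in the range 0..1."""
    union = candidate | description_mids
    if not union:
        return 0.0
    return weight(candidate & description_mids) / weight(union)


def best_match(candidate, threshold=0.6):
    """Return (description, score) for the best match, or None below threshold."""
    name, mids = max(DESCRIPTIONS.items(), key=lambda kv: score(candidate, kv[1]))
    s = score(candidate, mids)
    return (name, s) if s >= threshold else None
```

A Portuguese term candidate abstracted to `{"#insuff", "#cardiac", "#abund"}` would map to the Spanish description "Insuficiencia cardíaca congestiva", while a candidate sharing no MID with any description returns `None` and would fall through to the two-concept test.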
13
Gold standard (kappa = 0.89)
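The slide reports only the kappa value. As a reminder of what such a figure measures, here is a minimal sketch of Cohen's kappa for two annotators. Whether the authors used Cohen's kappa, and how many annotators took part, is not stated, so both are assumptions; the function name `cohen_kappa` and the labels in the usage note are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)
```

Calling `cohen_kappa(["C1", "C1", "C2", "C2"], ["C1", "C1", "C2", "C1"])` on placeholder concept labels gives an observed agreement of 0.75 against a chance agreement of 0.5.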
14
First results
15
Conclusion
Work in progress
– Encouraging preliminary results
– SNOMED mapping possible across language boundaries
Future work
– Implement and test pipeline elements not implemented so far
– Measure the impact of each pipeline element on mapping quality
– Scientific challenges:
    Automated context (e.g. plan, order, negation) identification
    Use of SNOMED CT’s ontological structure to improve mapping results
16
Acknowledgements
– German Research Foundation (DFG)
– International Bureau of the German Ministry of Research (BMBF-IB)
– Brazilian National Research Council (CNPq)
– Hospital de Clínicas de Porto Alegre (HCPA)
– Averbis GmbH, Germany