Open Health Natural Language Processing Consortium

1 Open Health Natural Language Processing Consortium
(Part of the caBIG Vocabulary Knowledge Center web presence)
Goal: foster an open-source collaborative community around clinical NLP that can deliver best-of-breed annotators, leverage the dynamic features of UIMA flow-control, and establish the infrastructure for clinical NLP.
Two open source releases as part of OHNLP:
Mayo’s pipeline for processing clinical notes (cTAKES)
IBM’s pipeline for processing medical notes (MedKAT) and pathology reports (MedKAT/P)



4 cTAKES Technical Details
Open source release: March 15, 2009
Downloads: Documentation and Downloads
Technical details: Publications
Framework: IBM’s Unstructured Information Management Architecture (UIMA) open source framework
Methods: Natural Language Processing (NLP) methods
Application: High-throughput phenotype extraction system (80M+ notes; 80B+ tokens)

5 cTAKES Components
Core components:
Sentence boundary detection (OpenNLP)
Tokenization (rule-based)
Morphologic normalization (NLM’s “norm”)
POS tagging (OpenNLP)
Shallow parsing (OpenNLP)
Named Entity Recognition: diseases/disorders, signs/symptoms, procedures, anatomical sites, medications
Dictionary mapping (lookup algorithm)
Machine learning (MAWUI)
Negation and status identification (NegEx)
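
To make the component list more concrete, here is a minimal Python sketch of three of the stages: rule-based tokenization, dictionary lookup for named entities, and a NegEx-style negation check. It is not cTAKES code and does not use UIMA or OpenNLP. The dictionary entries, the trigger list, the window size and every function name are assumptions chosen for illustration, and the sentence splitter is a naive regex stand-in for the OpenNLP model.

```python
import re

# Hypothetical toy dictionary: surface term -> semantic type.
DICTIONARY = {
    "chest pain": "sign/symptom",
    "pneumonia": "disease/disorder",
    "aspirin": "medication",
}

# A few NegEx-style trigger phrases that precede the negated concept (assumed list).
NEGATION_TRIGGERS = ("no", "denies", "without", "negative for")

# Number of tokens before a mention that are searched for a trigger (assumed value).
WINDOW = 5


def split_sentences(text):
    """Naive sentence boundary detection: split after ., ! or ? plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Rule-based tokenization: lowercase alphanumeric runs, punctuation dropped."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def find_entities(tokens, max_len=3):
    """Longest-match dictionary lookup over token n-grams."""
    entities = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            term = " ".join(tokens[i:i + n])
            if term in DICTIONARY:
                entities.append((i, i + n, term, DICTIONARY[term]))
                i += n
                break
        else:
            # No dictionary term starts at this token; move on.
            i += 1
    return entities


def is_negated(tokens, start):
    """True if a trigger phrase occurs within WINDOW tokens before the mention."""
    context = " " + " ".join(tokens[max(0, start - WINDOW):start]) + " "
    return any(" " + trigger + " " in context for trigger in NEGATION_TRIGGERS)


def annotate(text):
    """Run the toy pipeline and return one record per dictionary match."""
    results = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        for start, _end, term, sem_type in find_entities(tokens):
            results.append(
                {"term": term, "type": sem_type, "negated": is_negated(tokens, start)}
            )
    return results


if __name__ == "__main__":
    for record in annotate("Patient denies chest pain. Started aspirin for pneumonia."):
        print(record)
```

In this sketch "chest pain" is flagged as negated because "denies" appears in the preceding window, while "aspirin" and "pneumonia" are not. A real system would also handle trigger scope termination, post-mention triggers and status values such as "history of", which this example leaves out.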

6 cTAKES Type System

7 cTAKES example

8 Current Efforts - I
Anaphoric relations and coreference (as part of the Ontology Development and Information Extraction project, University of Pittsburgh) ( )
In collaboration with Chapman and Crowley
Semantic processing of the clinical text (in collaboration with Palmer, Martin and Ward, University of Colorado) ( )
Treebanking (deep parses)
Predicate-argument structure and semantic labeling (PropBanking)
UMLS relations (except temporal relations)

9 Current Efforts - II
Temporal relation discovery (2010-2014)
In collaboration with Palmer, Martin and Ward, University of Colorado
Lexical resources for the clinical domain ( )
In collaboration with Chapman, University of Colorado and Elhadad, Columbia University
A la Treebank and clinical named entities with attributes and modifiers
