cTAKES: The clinical Text Analysis and Knowledge Extraction System


1 cTAKES The clinical Text Analysis and Knowledge Extraction System

2 cTAKES Overview
- Open source software
- Natural Language Processing (NLP)
- Developed at Mayo Clinic
- Contributed to the Open Health Natural Language Processing (OHNLP) Consortium
- Built on the Apache UIMA framework
  - Unstructured Information Management Architecture
  - UIMA framework itself is also open source

3 Open Health Natural Language Processing (OHNLP) Consortium
Goal: Foster an open-source collaborative community around clinical NLP that can deliver best-of-breed annotators, leverage the dynamic features of UIMA flow control, and establish the infrastructure for clinical NLP.

4 www.ohnlp.org
Gateway to:
- News
- Documentation
- Downloads
- Forums for asking questions
- Bug tracker for reporting issues
- List of publications

5 cTAKES Goals
- Phenotype extraction
- Generic: to be used for a variety of retrievals and use cases
- Expandable: at the information model level and methods
- Modular
- Cutting edge technologies: best methods combining existing practices and novel research with rapid technology transfer
- Best software practices (80M+ notes)
- Commitment to both R and D in R&D

6 Original cTAKES Components
- Sentence boundary detection (OpenNLP technology)
- Tokenization (rule-based)
- Morphologic normalization (NLM’s LVG)
- POS tagging (OpenNLP technology)
- Shallow parsing (OpenNLP technology)
- Named Entity Recognition
  - Dictionary mapping (lookup algorithm)
  - Negation and context identification (both based on NegEx)
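The first stages of the component list above can be illustrated with a toy pipeline. This is only a minimal sketch, not cTAKES code: cTAKES uses OpenNLP models and UIMA annotators, whereas the splitter, tokenizer, `TOY_DICTIONARY`, and the `Mention` class here are hypothetical stand-ins chosen to show how sentence detection, tokenization, and dictionary lookup chain together.

```python
import re
from dataclasses import dataclass

# Hypothetical toy dictionary: lowercased token tuple -> semantic type.
# A real dictionary would come from a terminology such as UMLS.
TOY_DICTIONARY = {
    ("unstable", "angina"): "disease/disorder",
    ("chest", "pain"): "sign/symptom",
    ("aspirin",): "drug",
}


@dataclass
class Mention:
    text: str
    entity_type: str
    sentence_index: int


def split_sentences(text):
    """Naive rule-based splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Rule-based tokenizer: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def lookup(tokens, sentence_index, max_len=3):
    """Greedy longest-match dictionary lookup over lowercased token windows."""
    mentions = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            window = tuple(t.lower() for t in tokens[i:i + n])
            if window in TOY_DICTIONARY:
                surface = " ".join(tokens[i:i + n])
                mentions.append(Mention(surface, TOY_DICTIONARY[window], sentence_index))
                i += n
                break
        else:
            # No dictionary entry starts at this token; move on.
            i += 1
    return mentions


def annotate(text):
    """Run sentence splitting, tokenization, and lookup in sequence."""
    mentions = []
    for index, sentence in enumerate(split_sentences(text)):
        mentions.extend(lookup(tokenize(sentence), index))
    return mentions


if __name__ == "__main__":
    for mention in annotate("Patient denies chest pain. Started aspirin today."):
        print(mention)
```

Morphologic normalization, POS tagging, and shallow parsing are left out of the sketch; in a fuller pipeline they would sit between tokenization and lookup.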

7 Original cTAKES Named Entities
- Drug mentions
- Disease/disorder mentions
- Sign/symptom mentions
- Anatomical site mentions
With these attributes:
- RxNorm code or Concept Unique Identifier (CUI) and SNOMED-CT codes
- Negation (denies chest pain)
- Status (history of, family history of, possible/probable)

8 Additional cTAKES Components
- Smoking status classifier
- More detailed drug mention annotator
  - dosage
  - route
  - form
  - drug change status
  - and more
- Peripheral Artery Disease (PAD) annotator
- Dependency parser
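To make the drug mention attributes above concrete, here is a small regex-based sketch that pulls a dosage, route, and form out of a medication phrase. It is an illustration under stated assumptions, not the cTAKES drug annotator: the patterns, the unit and route vocabularies, and the `drug_attributes` function are all hypothetical, and drug change status is not handled.

```python
import re

# Hypothetical, deliberately small vocabularies for illustration only.
DOSAGE = re.compile(r"\b(\d+(?:\.\d+)?)\s*(mg|mcg|g|ml)\b", re.IGNORECASE)
ROUTE = re.compile(
    r"\b(oral(?:ly)?|by mouth|po|iv|intravenous(?:ly)?|topical(?:ly)?)\b",
    re.IGNORECASE,
)
FORM = re.compile(r"\b(tablet|capsule|solution|cream|patch)s?\b", re.IGNORECASE)


def drug_attributes(text):
    """Return the first dosage, route, and form found in text, or None for each."""

    def first(pattern):
        match = pattern.search(text)
        return match.group(0).lower() if match else None

    return {"dosage": first(DOSAGE), "route": first(ROUTE), "form": first(FORM)}


if __name__ == "__main__":
    print(drug_attributes("Aspirin 81 mg tablet by mouth daily"))
```

A production annotator would also need to link each attribute to the right drug mention when a sentence lists several medications, which this sketch does not attempt.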

9 Output Example: Disorder Object
“No evidence of unstable angina.”
- Text: unstable angina
- Associated codes: SNOMED UMLS CUI C
- Named entity type: disease/disorder
- Negation: true
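The negation attribute in the output example above can be approximated with a NegEx-style check: look for a negation trigger in the few words preceding the mention. The sketch below is a simplification and not the cTAKES implementation; the trigger list is a tiny illustrative subset, NegEx's termination terms and post-mention triggers are omitted, and `DisorderMention` and `is_negated` are hypothetical names. The associated codes are left out of the object.

```python
import re
from dataclasses import dataclass

# Illustrative subset of negation triggers; the real NegEx list is much longer.
NEGATION_TRIGGERS = ("no evidence of", "denies", "without", "no")


@dataclass
class DisorderMention:
    text: str
    entity_type: str
    negated: bool


def is_negated(sentence, mention, window=5):
    """True if a negation trigger occurs within `window` words before the mention."""
    lowered = sentence.lower()
    start = lowered.find(mention.lower())
    if start == -1:
        raise ValueError("mention not found in sentence")
    # Keep only the last `window` whitespace-separated words before the mention.
    context = " ".join(lowered[:start].split()[-window:])
    return any(
        re.search(r"\b" + re.escape(trigger) + r"\b", context)
        for trigger in NEGATION_TRIGGERS
    )


def build_disorder(sentence, mention):
    """Assemble a disorder object similar in shape to the slide's example."""
    return DisorderMention(mention, "disease/disorder", is_negated(sentence, mention))


if __name__ == "__main__":
    print(build_disorder("No evidence of unstable angina.", "unstable angina"))
```

Matching triggers on word boundaries keeps "no" from firing inside words such as "normal", which a plain substring test would get wrong.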

10 cTAKES Configuration Options
- XML configuration files
- Control many things, such as
  - Dictionary location
  - Dictionary format
  - Which dictionaries to use
  - Type of input (plain text or CDA)
- Forums contain details on creating your own dictionary
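As a rough picture of what such XML-driven configuration looks like in practice, the sketch below parses a small configuration and selects the enabled dictionaries. The element and attribute names (`lookupConfig`, `input`, `dictionary`, `enabled`) and the file paths are invented for illustration and do not reflect the actual cTAKES descriptor schema; consult the cTAKES documentation for the real file layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration; names and paths are assumptions, not the cTAKES schema.
EXAMPLE_CONFIG = """\
<lookupConfig>
  <input type="plaintext"/>
  <dictionary name="toy_disorders" format="csv"
              location="resources/toy_disorders.csv" enabled="true"/>
  <dictionary name="toy_drugs" format="csv"
              location="resources/toy_drugs.csv" enabled="false"/>
</lookupConfig>
"""


def load_config(xml_text):
    """Parse the input type and the list of enabled dictionaries from XML text."""
    root = ET.fromstring(xml_text)
    input_type = root.find("input").get("type")
    dictionaries = [
        {
            "name": d.get("name"),
            "format": d.get("format"),
            "location": d.get("location"),
        }
        for d in root.findall("dictionary")
        if d.get("enabled") == "true"
    ]
    return {"input_type": input_type, "dictionaries": dictionaries}


if __name__ == "__main__":
    print(load_config(EXAMPLE_CONFIG))
```

Keeping dictionary location, format, and selection in a file like this means a different dictionary can be swapped in without touching the annotator code.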

11 cTAKES Methods
Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications.
Guergana K Savova, James J Masanz, Philip V Ogren, Jiaping Zheng, Sunghwan Sohn, Karin C Kipper-Schuler, Christopher G Chute.
JAMIA 2010;17:


