1 D. Bekhouche/ Y. Pollet/ B. Grilheres/ X. Denis University of Salford, UK 06/24/2004 PSI Rouen Perception System Information 9 th International Conference.


1 Architecture of a Medical Information Extraction System
Dalila Bekhouche (dalila.bekhouche@loria.fr), Yann Pollet (pollet@cnam.fr), Bruno Grilheres (bruno.grilheres@sysde.eads.net), Xavier Denis (xavier.denis@tiscali.fr)
PSI Rouen (Perception System Information)
9th International Conference on the Application of Natural Language to Information Systems, University of Salford, UK, 06/24/2004

2 Index
- Introduction
- Information extraction
- The architecture of the IE System
- Extraction of lexical and medical terms
- Evaluation of ICD-10 and CCMA results
- Limits of this approach and future work

3 1- Introduction
Problem: the amount of information held in the database is difficult to access and exploit.
- Variety of content
- Specific terminology
- Practitioners use uncertain expressions and expressions that modify the meaning
- Most NLP tools have difficulty understanding these texts

4 2- Information extraction
Aim: identify and extract relevant information from medical documents (examination reports such as colonoscopy reports).
How to identify the relevant information?
Relevant information: events and entities described in the texts that concern the patient (signs, diagnoses, acts, results).
[Diagram labels: Documents (free text), Domain knowledge, Lexical resource, Extraction, Relevant information]

5 3- The architecture of the IE System
[Diagram labels: Documents; Thesauri ICD-10/Vidal/CCMA; dictionary; Database; validation; Extraction; Generation; resources and rules]
1- Lexical level: named entities (name, medical terms), date of examination, document type, signs, diagnosis, acts, results
2- Sub-sentence level: signs, symptoms
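The output of such an architecture could be represented as one record per document. The sketch below is only an illustration: the class name `ExtractionRecord`, its field names, the `validate` helper and the sample values are assumptions, chosen to mirror the items listed on the slide (document type, date of examination, named entities, signs, diagnosis, acts, results, validation before storage).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExtractionRecord:
    """Hypothetical container for what the extraction step produces for one document."""
    document_type: Optional[str] = None
    examination_date: Optional[str] = None
    named_entities: List[str] = field(default_factory=list)
    signs: List[str] = field(default_factory=list)
    diagnoses: List[str] = field(default_factory=list)
    acts: List[str] = field(default_factory=list)
    results: List[str] = field(default_factory=list)
    validated: bool = False  # set once a practitioner has checked the record


def validate(record: ExtractionRecord) -> ExtractionRecord:
    """Mark a record as validated (assumed to happen before it is stored in the database)."""
    record.validated = True
    return record


# Illustrative values only; they do not come from the presentation.
record = ExtractionRecord(document_type="colonoscopy report", signs=["polyp"])
record = validate(record)
```

Keeping each information type in its own list makes it easy to validate or correct one category without touching the others.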

6 4- Extraction of the lexical terms
Named entities (location, companies, organizations, dates)
Level 1: REGEX(words) or dictionary
Level 2: REGEX(words) and level 1
Example (annotated at Level 1 and at Level 2): Mr Smith was addressed for a checkup by McGann
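The two levels can be sketched as follows. This is a minimal illustration, not the authors' rules: the `TITLES` dictionary, the `CAPITALISED` pattern, the tag names and the two level-2 rules are all assumptions made for the example sentence from the slide.

```python
import re

# Hypothetical level-1 resources: a small title dictionary and a regex on the word itself.
TITLES = {"Mr", "Mrs", "Dr", "Pr"}
CAPITALISED = re.compile(r"^[A-Z][A-Za-z]+$")


def level1(tokens):
    """Level 1: tag each token with a dictionary lookup or a regex on the word."""
    tags = []
    for tok in tokens:
        if tok in TITLES:
            tags.append("TITLE")
        elif CAPITALISED.match(tok):
            tags.append("CAP")
        else:
            tags.append("O")
    return tags


def level2(tokens, tags):
    """Level 2: rules over words and level-1 tags (both rules are illustrative)."""
    entities = []
    i = 0
    while i < len(tokens):
        next_is_cap = i + 1 < len(tokens) and tags[i + 1] == "CAP"
        if tags[i] == "TITLE" and next_is_cap:
            # A title followed by a capitalised word is taken to be a person name.
            entities.append(("PERSON", tokens[i] + " " + tokens[i + 1]))
            i += 2
        elif tokens[i] == "by" and next_is_cap:
            # The word "by" followed by a capitalised word is also taken to be a person.
            entities.append(("PERSON", tokens[i + 1]))
            i += 2
        else:
            i += 1
    return entities


tokens = "Mr Smith was addressed for a checkup by McGann".split()
tags = level1(tokens)
entities = level2(tokens, tags)
print(entities)
```

Level 1 only looks at individual words, so it cannot tell that "Smith" is a person; level 2 can, because it sees the "TITLE" tag produced one level below.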

7 5- Extraction of the ICD-10 and CCMA
Goal: identify the various occurrences of these thesauri in the text.
ICD-10: International Classification of Diseases
CCMA: Common Classification of Medical Acts
1- Preprocessing step: reduce the text and the thesauri (standardisation of words, removal of irrelevant words)
2- Recognition of the discriminating terms
3- Evaluation of the similarity (cosine measure) between the neighbouring terms in the text and each candidate ICD-10 entry related to the indexing term
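Steps 1 and 3 can be illustrated with a bag-of-words cosine measure. The slide does not say how terms are weighted, so this sketch assumes raw word counts; the stop-word list, the function names, the sample sentence and the two-entry toy thesaurus (English labels standing in for the French resources) are likewise assumptions. Step 2, the selection of discriminating terms, is not modelled.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list used to drop irrelevant words during preprocessing.
STOP_WORDS = {"a", "an", "the", "of", "in", "was", "and", "with"}


def preprocess(text):
    """Step 1: lowercase, keep alphabetic words, drop irrelevant words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def cosine(words_a, words_b):
    """Cosine similarity between two bags of words (raw counts)."""
    va, vb = Counter(words_a), Counter(words_b)
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def rank_entries(context, entries):
    """Step 3: score each candidate entry against the terms around the match."""
    ctx = preprocess(context)
    scored = [(code, cosine(ctx, preprocess(label))) for code, label in entries.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy thesaurus for illustration; a real system would load the full classification.
ENTRIES = {
    "K63.5": "Polyp of colon",
    "C18.9": "Malignant neoplasm of colon, unspecified",
}

ranking = rank_entries("A sessile polyp was found in the sigmoid colon", ENTRIES)
print(ranking[0][0])
```

Both candidates share the word "colon" with the sentence, but the entry that also matches "polyp" scores higher, which is how the neighbouring terms help choose between candidates.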

8 6- Evaluation of ICD-10 and CCMA results

                          N. doc   Precision (Pr)   Recall (Re)
Before adding knowledge   313      0.500            0.710
After adding knowledge    370      0.877            0.798

Precision = (valid annotations found by the system) / (all annotations found by the system)
Recall = (valid annotations found by the system) / (valid annotations found by the practitioner)

Before adding knowledge, 50% of the annotations found by the system are correct. After adding knowledge, precision increases to 87.7%.
Recall stays approximately the same, which reflects problems caused by ambiguous words.
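The two measures can be computed from sets of annotations as in the sketch below. The annotation format, a (document, code) pair, and the toy data are assumptions; the presentation reports only the resulting ratios, not the underlying counts.

```python
def precision_recall(system, reference):
    """Return (precision, recall) for two sets of annotations.

    system:    annotations produced by the system
    reference: valid annotations produced by the practitioner
    """
    valid = system & reference  # valid annotations found by the system
    precision = len(valid) / len(system) if system else 0.0
    recall = len(valid) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical annotations as (document id, code) pairs, for illustration only.
system_annotations = {
    ("doc1", "K63.5"),
    ("doc1", "C18.9"),
    ("doc2", "K57.3"),
    ("doc2", "K63.5"),
}
practitioner_annotations = {
    ("doc1", "K63.5"),
    ("doc2", "K57.3"),
    ("doc2", "K62.1"),
}

precision, recall = precision_recall(system_annotations, practitioner_annotations)
print(precision, recall)
```

In this toy case two of the four system annotations are valid, and the system misses one of the three reference annotations.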

9 7- Limits of this approach and future work
- The approach covers French medical texts only, in specific domains (colonoscopy and oncology records).
- It handles simple sentences such as those found in medical records, but may have difficulty analysing complex sentences that need deep syntactic analysis.
- Future work will focus on the generation and acquisition steps.
- Future work will also take synonyms and user feedback into account.

10 Thank you!
dalila.bekhouche@loria.fr
PSI (Perception, system, information), Insa Rouen, Place E. Blondel, 76130 Mont St Aignan, France

