Published by Charleen Bell. Modified over 9 years ago.
1
Correlating Knowledge Using NLP: Relationships between the concepts of blood cancers, stem cell transplantation, and biomarkers
Katy Zou and Weizhong Zhu, PhD
Translational Research Informatics; Division of Research Informatics
2
With NLP, a computer is able to “understand” human language: it extracts data according to linguistic rules.
Diagram labels: clinical narratives / PubMed articles; disease; treatment; biomarkers; patient prognosis
3
Clinical Documentation
The patient has a fracture of the left femur with no underlying arterial injury. Pain was controlled with 5 mg of morphine IV.
NLP Analysis
Problem:
- Primary term: fracture
- Body location: femur
- Body side: left
- ICD-9: 821.0 (fracture of unspecified part of femur)
- ICD-10: S72.92XA (fracture of left femur)
- SNOMED: 71620000 (fracture of femur)
Source: http://healthfidelity.com/issue-brief-nlp
4
- There is a gap between the research being done and the treatment methods in use.
- We propose to extract the unstructured data stored in databases and find relationships between research and patient data, transforming them into structured data.
- Using the commercial text-mining tool I2E®, we began developing a process for meaningful contextual data extraction, specific to relating biomarker utilization to disease therapy and reported outcomes.
5
Goal: To create an example smart-query using I2E® that displays the relationships between three specific biomedical concepts: “blood cancers”, “stem cell transplantation (SCT)”, and “biomarkers”. Articles from PubMed containing correlations between these three concepts will be extracted.
Significance: Saved outputs of the smart-query allow clinicians to easily access vast amounts of data, to precisely locate relevant data in a matter of seconds, and to find correlations between concepts across fields of study.
6
A concept differs from a keyword in that it lets an extraction system find not only one specific word but also the other terms that word may stand for, such as synonyms and variant forms.
A smart-query is a mapped-out linguistic relationship between two or more concepts that can be saved for easy access in the future.
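The difference between a keyword and a concept can be sketched in a few lines of Python. This is only an illustration: the `CONCEPTS` dictionary and its synonym lists are hypothetical and hand-written here, whereas a real system would draw its terms from a curated ontology. It is not how I2E® works internally.

```python
import re

# Hypothetical synonym lists, written by hand for illustration only.
CONCEPTS = {
    "stem cell transplantation": [
        "stem cell transplantation",
        "stem cell transplant",
        "SCT",
        "HSCT",
    ],
    "lymphoma": ["lymphoma", "lymphomas"],
}


def keyword_match(text, keyword):
    """Plain keyword search: only the literal string is found."""
    return keyword.lower() in text.lower()


def concept_match(text, concept):
    """Concept search: any listed term for the concept counts as a hit."""
    for term in CONCEPTS[concept]:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            return True
    return False


sentence = "The patient received an autologous SCT for relapsed lymphoma."
keyword_hit = keyword_match(sentence, "stem cell transplantation")
concept_hit = concept_match(sentence, "stem cell transplantation")
print(keyword_hit, concept_hit)
```

The keyword search misses the sentence because the literal phrase never appears, while the concept search finds it through the abbreviation.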
7
Q: If SCT were given to a patient diagnosed with lymphoma, which biomarkers would indicate the patient's possible outcome?
Concepts: “lymphoma”, “stem cell transplantation”, and “biomarkers”
Ontology:
1. Use stem cell treat for lymphoma
2. Lymphoma treat by stem cell
3. Stem cell use treat lymphoma
4. Stem cell treat lymphoma
5. Stem cell therapy for treat for lymphoma
6. Stem cell therapy as treat for lymphoma
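The six phrasings listed on this slide all express one relationship: stem cells used to treat lymphoma. A rough way to see how one rule can cover all of them is the Python sketch below. It is an assumption-laden simplification: it only checks that a stem cell term, a form of "treat", and a lymphoma term occur in the same sentence, which is co-occurrence rather than true linguistic analysis, and it does not reproduce I2E®. The function name and the regular expressions are hypothetical.

```python
import re

# Hypothetical patterns: one per element of the relationship.
STEM_CELL = re.compile(r"\bstem[- ]cells?\b", re.IGNORECASE)
LYMPHOMA = re.compile(r"\blymphomas?\b", re.IGNORECASE)
TREAT = re.compile(r"\btreat(?:s|ed|ing|ment|ments)?\b", re.IGNORECASE)


def mentions_sct_treats_lymphoma(sentence):
    """True if all three elements occur in the sentence, in any order."""
    return all(p.search(sentence) for p in (STEM_CELL, LYMPHOMA, TREAT))


# The six variants from the slide, copied as written.
VARIANTS = [
    "Use stem cell treat for lymphoma",
    "Lymphoma treat by stem cell",
    "Stem cell use treat lymphoma",
    "Stem cell treat lymphoma",
    "Stem cell therapy for treat for lymphoma",
    "Stem cell therapy as treat for lymphoma",
]

variant_hits = [mentions_sct_treats_lymphoma(v) for v in VARIANTS]
print(variant_hits)
```

Because word order is ignored, the active form ("stem cell treat lymphoma") and the passive form ("lymphoma treat by stem cell") are caught by the same rule. The cost is that unrelated sentences containing all three terms would also match, which is why output still needs evaluation.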
8
- Defining linguistic rules and relationships
- System processing
- Manual and systematic evaluation of output
- Output saved into database for easy access in the future
- Multiple queries may be combined to increase retrieval and variation
Source: http://archives.limsi.fr/RS2005/chm/lir/lir9/
9
Patient cases
*Data were retrieved from the PubMed database as well as COH bone marrow pathology reports
10
PubMed articles
11
Biomarker to outcome
12
Biomarker to biomarker relation within Lymphoma
13
PubMed biomarker to biomarker cluster without disease relation
14
PubMed biomarker to biomarker cluster with lymphoma relation
15
Biomarker to lymphoma relation
16
BindingDB (public open-source database): protein to protein interaction in organic layout
17
PubMed Protein to Protein interaction in organic layout
18
PubMed Protein to Protein interaction in hierarchical layout
19
Precision: the percentage of answers given that are correct.
Recall: the percentage of possible answers that are correctly extracted.
(Source: Wikipedia)
We defined relevant documents as those showing the relationship: “blood cancer” treated by “stem cell transplantation”, with potential outcome indicated by biomarkers.
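As a worked illustration of the two definitions, the Python sketch below computes precision and recall from a set of retrieved documents and a set of relevant documents. The document IDs are made up for the example and are not results from this project.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: share of retrieved documents that are relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant documents that were retrieved.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical IDs: the query returned four documents, and a manual
# review judged three documents in the collection to be relevant.
retrieved_docs = {"doc1", "doc2", "doc3", "doc4"}
relevant_docs = {"doc2", "doc3", "doc5"}

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four retrieved documents are relevant (precision 0.50), and two of the three relevant documents were found (recall about 0.67).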
20
Correlating concepts between different fields of study is:
- complex
- time-consuming
With the use of NLP, the process is shortened by the efficiency of computerized processing. It can be used to answer questions concerning:
- the effect of certain therapeutic procedures on eventual outcome
The possible applications between concepts are virtually endless.
We recognize the many limitations of relying on NLP to categorize and identify the importance of biomarkers or treatments. Many biomarkers are not “stable”, or lack evaluation standards shared by all practitioners and clinicians. However, even with these limitations, NLP remains a promising method for bridging the gap between fields.
21
These are some future applications aimed at the specific research questions of practitioners:
1) Identifying biomarker to biomarker relationships and their influence on patient outcome
2) Determining whether rituximab should be used in the R-CHOP regimen even when the disease is CD20-negative. Rituximab is designed to kill cells carrying the CD20 antigen, so when no CD20 is present, what is the point of using rituximab? Is there evidence in databases showing the need for this drug, or is it given only as part of a standard protocol? Do patient cases support this hypothesis?
As suggested by Dr. Chan, MD
22
Thank you for Watching!