SemMed-Interface: A PERL Module

Slides:



Advertisements
Similar presentations
Bellman-Ford algorithm
Advertisements

Relationship between Variables Assessment Statement Explain that the existence of a correlation does not establish that there is a causal relationship.
Ciro Cattuto, Dominik Benz, Andreas Hotho, Gerd Stumme Presented by Smitashree Choudhury.
CS 206 Introduction to Computer Science II 11 / 07 / 2008 Instructor: Michael Eckmann.
Midwestern State University Department of Computer Science Dr. Ranette Halverson CMPS 2433 CHAPTER 4 - PART 2 GRAPHS 1.
C++ Programming: Program Design Including Data Structures, Third Edition Chapter 21: Graphs.
ITEC200 – Week 12 Graphs. 2 Chapter Objectives To become familiar with graph terminology and the different types of graphs To study.
DAST 2005 Tirgul 11 (and more) sample questions. DAST 2005 Q.Let G = (V,E) be an undirected, connected graph with an edge weight function w : E→R. Let.
Shortest Path Algorithm By Weston Vu CS 146. What is Shortest Paths? Shortest Paths is a part of the graph algorithm. It is used to calculate the shortest.
CS 206 Introduction to Computer Science II 03 / 30 / 2009 Instructor: Michael Eckmann.
CSC 2300 Data Structures & Algorithms April 3, 2007 Chapter 9. Graph Algorithms.
Using Dijkstra’s Algorithm to Find a Shortest Path from a to z 1.
Automated Social Hierarchy Detection through Network Analysis (SNAKDD07) Ryan Rowe, Germ´an Creamer, Shlomo Hershkop, Salvatore J Stolfo 1 Advisor:
Shelly Warwick, MLS, Ph.D – Permission is granted to reproduce and edit this work for non-commercial educational use as long as attribution is provided.
Data Structures and Algorithms Ver. 1.0 Session 17 Objectives In this session, you will learn to: Implement a graph Apply graphs to solve programming problems.
Appraisal and Its Application to Counseling COUN 550 Saint Joseph College For Class # 3 Copyright © 2005 by R. Halstead. All rights reserved.
Graphs Part II Lecture 7. Lecture Objectives  Topological Sort  Spanning Tree  Minimum Spanning Tree  Shortest Path.
1 Friends and Neighbors on the Web Presentation for Web Information Retrieval Bruno Lepri.
.  Relationship between two sets of data  The word Correlation is made of Co- (meaning "together"), and Relation  Correlation is Positive when the.
Homework 1 Problem 1: (5 points) Both Dijkstras algorihm and Bellmanford Algorithm generates shortest paths to all destinations. Modify the algorithm to.
Chapter 20: Graphs. Objectives In this chapter, you will: – Learn about graphs – Become familiar with the basic terminology of graph theory – Discover.
Dijkstra animation. Dijksta’s Algorithm (Shortest Path Between 2 Nodes) 2 Phases:initialization;iteration Initialization: 1. Included:(Boolean) 2. Distance:(Weight)
Graphs Definition: a graph is an abstract representation of a set of objects where some pairs of the objects are connected by links. The interconnected.
King Faisal University جامعة الملك فيصل Deanship of E-Learning and Distance Education عمادة التعلم الإلكتروني والتعليم عن بعد [ ] 1 جامعة الملك فيصل عمادة.
1 Single-source shortest path Shortest Path Given a weighted directed graph, one common problem is finding the shortest path between two given vertices.
© 2006 Pearson Addison-Wesley. All rights reserved14 B-1 Chapter 14 (continued) Graphs.
Graphs – Breadth First Search
Shortest Paths.
Neighborhood - based Tag Prediction
Graphs Representation, BFS, DFS
Programming Abstractions
Shortest Path from G to C Using Dijkstra’s Algorithm
CSC317 Shortest path algorithms
Graph Algorithms BFS, DFS, Dijkstra’s.
CSE (c) S. Tanimoto, 2002 Search Algorithms
Spearman’s Rank Correlation Test
Unweighted Shortest Path Neil Tang 3/11/2010
Programming Abstractions
Bridget McInnes Ted Pedersen Serguei Pakhomov
Algorithms for Exploration
CS120 Graphs.
Discrete Maths 9. Graphs Objective
Comp 245 Data Structures Graphs.
CSC 172 DATA STRUCTURES.
Graphs Chapter 13.
Graphs Chapter 11 Objectives Upon completion you will be able to:
Lecture 13 CSE 331 Oct 1, 2012.
Graph Representation (23.1/22.1)
Statistical Inference for Managers
Lecture 12 CSE 331 Sep 26, 2011.
Correlation and the Pearson r
Algorithms Lecture # 29 Dr. Sohail Aslam.
Lecture 11 CSE 331 Sep 21, 2017.
Lecture 12 CSE 331 Sep 22, 2014.
GRAPHS G=<V,E> Adjacent vertices Undirected graph
Chapter 16 1 – Graphs Graph Categories Strong Components
Correlations: Correlation Coefficient:
Lecture 11 CSE 331 Sep 22, 2016.
Spanning Trees Lecture 20 CS2110 – Spring 2015.
Chapter 14 Graphs © 2011 Pearson Addison-Wesley. All rights reserved.
Scatter Graphs Spearman’s Rank correlation coefficient
Spearman’s Rank Correlation Coefficient
Lecture 12 Shortest Path.
Weighted Graphs AQR.
Richard Anderson Winter 2019 Lecture 6
Correlation & Regression
Lecture 15 CSE 331 Oct 4, 2010.
CSE (c) S. Tanimoto, 2004 Search Algorithms
Concepts of Computation
Richard Anderson Autumn 2015 Lecture 6
Presentation transcript:

SemMed-Interface: A PERL Module Andriy Mulyar Dr. Bridget McInnes

The Problem In order to help computers understand human language, they must first be able to determine how words relate to one another Humans can quickly determine the extent to which two words are related. The SemMed Interface, utilizing linguistic data, will provide an algorithmic means for a computer to answer this same question.

The Data: Semantic Medline Database 13,000,000+ Medical Terms. 85,000,000+ Word Relationships. A whole lot of data.

The Data Tumor Growth Ibuprofen AFFECTS IS-A TREATS ADVIL Myocardial Infarction CAUSES TREATS IS A Heart Arrest Myalgia ASSOCIATED-WITH IS-A CAUSED BY Inflammatory disorder Tissue Damage

The Goal Given 2 medical terms in the database, give a score indicating how similar/associated those words are.

The Plan of Attack: Graph Traversal Consider WORDS as a graph vertices Consider WORD RELATIONSHIPS as graph edges Consider UMLS::Association Scores as edge weights CAUSES Pain 8.7060* Myalgia *T-score

Calculation of Edge Weights: UMLS::Association PERL Package The UMLS::Association package provides a PERL interface that, given 2 concepts(medical terms), returns an association score between them. This score is a statistic calculated by analyzing word co-occurrences in large texts.

Algorithms Breadth-First-Search (BFS) Dijkstra's Shortest Path

Breadth First Search Source: ADVIL Destination: Heart Arrest Tumor Growth Ibuprofen Path Length: 3 AFFECTS IS-A TREATS ADVIL Myocardial Infarction CAUSES TREATS PRECEDES Heart Arrest Myalgia ASSOCIATED-WITH IS-A CAUSED BY Inflammatory disorder Tissue Damage

ADVIL -> Heart Arrest Distance: 3 Terms Apart Relatedness Score: ⅓

This approach isn’t perfect Even though two concepts may be connected, the connection does not tell us how strongly associated the concepts are. TREATS ADVIL PAIN Strong Association ADVIL MILK Weaker Association CO-EXISTS WITH

Okay...A little less than ideal

Assigning Edge Weights TREATS ADVIL PAIN Strong Association 3.8778 CO-EXISTS WITH ADVIL MILK Weaker Association 0.2085

Assign Weights to the Edges -1 Tumor Growth Ibuprofen 1 / 7.833 AFFECTS IS-A TREATS TREATS -1 ADVIL Myocardial Infarction CAUSES 1 / .7940 TREATS IS-A Heart Arrest Myalgia ASSOCIATED-WITH IS-A CAUSED BY Tissue Damage 1 / 2.4644 Inflammatory disorder 1 / 20.6011

Assign Weights to the Edges -1 Tumor Growth Ibuprofen 0.1276 AFFECTS IS-A TREATS TREATS -1 ADVIL Myocardial Infarction CAUSES 1.2594 TREATS IS-A Heart Arrest Myalgia ASSOCIATED-WITH IS-A CAUSED BY Tissue Damage 0.4057 Inflammatory disorder 0.0485

Dijkstra's -1 Tumor Growth Ibuprofen 0.1276 AFFECTS IS-A TREATS TREATS -1 ADVIL Myocardial Infarction CAUSES 1.2594 TREATS IS-A Heart Arrest Myalgia ASSOCIATED-WITH IS-A CAUSED BY Tissue Damage 0.4057 Inflammatory disorder 0.0485 1.6651 0.1761

Results The scores obtained by running these algorithms between medical terms in the database were evaluated against a human produced gold standard. Evaluation was conducted by obtaining the Spearman Rank Correlation statistic comparing the ranking of scores in the gold standard to the algorithm produced data.

Gold Standard Algorithm Generated

Results Our initial set of algorithm produced data provided a low ranking coefficient when compared to the gold standard proving that just path length based aggregation would not provide enough information to determine word associativity.

Results We currently plan to utilize the associativity data provided in the UMLS::Association package to gather results for further evaluation.

Questions