Medical Semantic Similarity with a Neural Language Model
Dongfang Xu, School of Information
Using the Skip-Gram model for word embedding
Outline
Introduction & Background
– Word embedding
– Skip-Gram Model
Similarity Experiment
– Training Process
– Result
Word Embedding
Word embedding, also called word representation, is an NLP technique in which words are represented (embedded) as low-dimensional continuous vectors, so that semantically similar words are mapped to nearby points.
1. Methods for building such representations include neural networks, dimensionality reduction on the word co-occurrence matrix, and probabilistic models.
2. The technique rests on the distributional hypothesis: linguistic items with similar distributions have similar meanings.
Word Embedding
Count-based methods: compute statistics of how often each word co-occurs with its neighbouring words in a large text corpus, then map these count statistics down to a small, dense vector for each word (see the sketch after this slide).
Predictive methods: try to predict a word from its neighbours in terms of learned small, dense embedding vectors, which are treated as parameters of the model.
Tips: to check whether an embedding vector is meaningful, we can always run similarity comparisons; word embeddings boost performance on NLP tasks such as syntactic parsing and sentiment analysis.
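Purely as an illustration (not from the slides), a tiny count-based pipeline might look like the sketch below, assuming a toy corpus, a fixed co-occurrence window, and truncated SVD as the dimensionality-reduction step:

```python
# Sketch only: count-based embeddings from a toy corpus -- build a word/word
# co-occurrence matrix, then reduce it with truncated SVD to dense vectors.
import numpy as np

sentences = [
    ["patient", "presents", "with", "chest", "pain"],
    ["patient", "reports", "chest", "pain"],
]
vocab = sorted({w for sent in sentences for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

window = 2
counts = np.zeros((len(vocab), len(vocab)))
for sent in sentences:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                counts[idx[w], idx[sent[j]]] += 1

# Keep the top-k singular directions as dense word embeddings.
k = 3
U, S, _ = np.linalg.svd(counts)
embeddings = U[:, :k] * S[:k]
print(embeddings[idx["pain"]])
```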
Word Embedding
For predictive methods, word embeddings come in two flavours (see the sketch after this list):
(1) The Continuous Bag-of-Words model (CBOW)
– Input of training: wi−2, wi−1, wi+1, wi+2
– Output of training: wi
– Idea: predict the word given its context.
– Advantages: slightly better accuracy for frequent words; works better with a larger dataset.
(2) The Skip-Gram model
– Input of training: wi
– Output of training: wi−2, wi−1, wi+1, wi+2
– Idea: predict the context given a word.
– Advantages: works well with a small amount of training data; represents even rare words or phrases well.
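A minimal sketch of the two training modes, assuming the gensim library (gensim 4.x, not mentioned on the slides) and a toy corpus; the sg flag switches between CBOW and Skip-Gram:

```python
# Sketch only: training CBOW vs. Skip-Gram embeddings with gensim (assumed tool).
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens.
sentences = [
    ["patient", "presents", "with", "chest", "pain"],
    ["patient", "reports", "shortness", "of", "breath"],
]

# sg=0 -> CBOW: predict the centre word from its surrounding context.
cbow = Word2Vec(sentences, vector_size=100, window=2, sg=0, min_count=1)

# sg=1 -> Skip-Gram: predict the context words from the centre word.
skipgram = Word2Vec(sentences, vector_size=100, window=2, sg=1, min_count=1)

# Quick sanity check: nearest neighbours of a word in the embedding space.
print(skipgram.wv.most_similar("patient"))
```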
Skip-Gram Model
Notation: w_t is the input (centre) word; C is the window size; T is the size of the training corpus; W is the vocabulary size; v_w and v'_w are the input and output vector representations of word w.
Objective function: maximise the average log probability
\[
\frac{1}{T} \sum_{t=1}^{T} \sum_{\substack{-C \le j \le C \\ j \ne 0}} \log p(w_{t+j} \mid w_t)
\]
where the probability of an output word w_O given an input word w_I is the softmax
\[
p(w_O \mid w_I) = \frac{\exp\left({v'_{w_O}}^{\top} v_{w_I}\right)}{\sum_{w=1}^{W} \exp\left({v'_{w}}^{\top} v_{w_I}\right)}
\]
A numeric sketch of this softmax follows.
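Purely as an illustration of the formula above (not from the slides), the full-softmax term log p(w_O | w_I) can be computed from toy, randomly initialised input and output vectors:

```python
# Sketch only: log p(w_O | w_I) under the full softmax, with random toy vectors.
import numpy as np

W, d = 1000, 100                           # vocabulary size, embedding dimension
rng = np.random.default_rng(0)
v = rng.normal(size=(W, d)) * 0.01         # input ("centre") vectors v_w
v_prime = rng.normal(size=(W, d)) * 0.01   # output ("context") vectors v'_w

def log_p(w_o, w_i):
    """Log-probability of output word w_o given input word w_i."""
    scores = v_prime @ v[w_i]              # v'_w^T v_{w_I} for every word w
    return scores[w_o] - np.log(np.sum(np.exp(scores)))

# One term of the average log probability: context word 42 around centre word 7.
print(log_p(42, 7))
```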
Outline
Introduction & Background
– Word embedding
– Skip-Gram Model
Similarity Experiment
– Training Process
– Method
– Result & Discussion
Conclusion
Similarity Experiment
Training Process
Two sets of training data:
– MedTrack: a collection of 17,198 clinical patient records used in the TREC 2011 and 2012 Medical Records Track.
– OHSUMED: a collection of 348,566 MEDLINE medical journal abstracts used in the TREC 2000 Filtering Track.
Replace terms/compound terms with concept IDs:
– Using MetaMap (provided with the UMLS) to convert each free-text sequence into a concept sequence.
Training Process
Parameter settings for the Skip-Gram model (see the sketch below):
– Window size.
– Dimension of the embedding vector.
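A minimal sketch of this training step, assuming MetaMap has already written one concept-ID (CUI) sequence per line to a hypothetical file, and using illustrative parameter values (the slides do not state the actual window size or dimension):

```python
# Sketch only: Skip-Gram training on UMLS concept sequences. The file name and
# parameter values are illustrative, not taken from the slides.
from gensim.models import Word2Vec

def load_concept_sequences(path):
    # One document per line, concept IDs (CUIs) separated by whitespace,
    # as produced by a prior MetaMap pass over the free text.
    with open(path) as f:
        return [line.split() for line in f if line.strip()]

corpus = load_concept_sequences("medtrack_concepts.txt")  # hypothetical file
model = Word2Vec(corpus, vector_size=200, window=5, sg=1, min_count=5)
model.save("skipgram_concepts.model")
```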
Methods
Experiment steps (a sketch of steps 1 and 2 follows):
– 1. After obtaining the vector of each concept from the training dataset, compute the cosine similarity for each concept pair in the test dataset.
– 2. Compute the Pearson correlation coefficient between the similarity values from the experts and those from the neural language model (NLM).
– 3. Compare the performance of the NLM with other semantic similarity approaches.
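A minimal sketch of steps 1 and 2, assuming the model saved in the previous sketch and a hypothetical set of test pairs with expert judgements (the CUIs and scores below are placeholders, not real data):

```python
# Sketch only: cosine similarity per concept pair, then Pearson correlation
# against expert judgements. All test pairs below are placeholders.
import numpy as np
from scipy.stats import pearsonr
from gensim.models import Word2Vec

model = Word2Vec.load("skipgram_concepts.model")  # from the training sketch

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# (concept A, concept B, mean expert similarity judgement) -- placeholder values.
test_pairs = [
    ("C0011849", "C0011860", 3.9),
    ("C0020538", "C0027051", 2.1),
    ("C0004096", "C0032285", 1.5),
]

nlm_scores = [cosine(model.wv[a], model.wv[b]) for a, b, _ in test_pairs]
expert_scores = [score for _, _, score in test_pairs]

r, p = pearsonr(nlm_scores, expert_scores)
print(f"Pearson r = {r:.3f} (p = {p:.3g})")
```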
Methods
Test datasets:
– Ped: 29 UMLS medical concept pairs developed by Pedersen et al. [15]. Semantic similarity judgements were provided by 3 physicians and 9 clinical terminologists, with an inter-coder correlation of 0.85.
– Cav: 45 MeSH/UMLS concept pairs developed by Caviedes and Cimino [5]. Similarity between concept pairs was judged by 3 physicians, with no exact consensus value reported by Caviedes and Cimino.
Methods
Results
References
De Vine, L., Zuccon, G., Koopman, B., Sitbon, L., & Bruza, P. (2014, November). Medical semantic similarity with a neural language model. In Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management (pp. 1819-1822). ACM.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems (pp. 3111-3119).
Thank you!