Dependency-Based Word Embeddings
Omer Levy, Yoav Goldberg
Bar-Ilan University, Israel
Neural Embeddings
Our Main Contribution: Generalizing Skip-Gram with Negative Sampling
Skip-Gram with Negative Sampling v2.0
Original implementation assumes bag-of-words contexts
We generalize to arbitrary contexts
Dependency contexts create qualitatively different word embeddings
Provide a new tool for linguistically analyzing embeddings
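To make the generalization concrete, here is a minimal numpy sketch (not the authors' released code) of a single SGNS update over one observed (word, context) pair plus sampled negatives; the matrices W and C and the function name are illustrative. Nothing in the objective cares whether the context is a window word or a dependency context.

```python
import numpy as np

def sgns_update(W, C, word_id, ctx_id, neg_ctx_ids, lr=0.025):
    """One stochastic update of Skip-Gram with Negative Sampling.

    W: word vectors (|V_words| x d); C: context vectors (|V_contexts| x d).
    The context vocabulary is arbitrary: neighboring words, dependency
    contexts such as 'scientist/nsubj', or anything else.
    """
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    w = W[word_id]
    w_grad = np.zeros_like(w)

    # Observed pair: push sigma(w . c) toward 1.
    c = C[ctx_id]
    g = lr * (1.0 - sigmoid(w @ c))
    w_grad += g * c
    C[ctx_id] += g * w

    # Sampled negative contexts: push sigma(w . c_neg) toward 0.
    for nid in neg_ctx_ids:
        cn = C[nid]
        g = lr * (0.0 - sigmoid(w @ cn))
        w_grad += g * cn
        C[nid] += g * w

    W[word_id] += w_grad
```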
Context Types
Example: Australian scientist discovers star with telescope
Target Word: Australian scientist discovers star with telescope (target: 'discovers')
Bag of Words (BoW) Context: Australian scientist discovers star with telescope
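As a rough sketch of what a bag-of-words context is (assuming simple whitespace tokenization and a symmetric window of size k; the function name is illustrative):

```python
def bow_contexts(tokens, k):
    """Yield (word, context) pairs using a symmetric window of size k."""
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - k), min(len(tokens), i + k + 1)
        for j in range(lo, hi):
            if j != i:
                yield word, tokens[j]

sentence = "Australian scientist discovers star with telescope".split()
# With k=2, the contexts of 'discovers' are: Australian, scientist, star, with
print([c for w, c in bow_contexts(sentence, 2) if w == "discovers"])
```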
Syntactic Dependency Context: Australian scientist discovers star with telescope
Dependency relations to the target 'discovers': nsubj (scientist), dobj (star), prep_with (telescope)
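A sketch of how dependency contexts can be read off a parse: every arc (head, label, modifier) gives the head a context modifier/label and the modifier the inverse context head/label⁻¹. The arcs below are hand-written Stanford-style relations for the example sentence (with the preposition collapsed into prep_with), not parser output.

```python
def dependency_contexts(arcs):
    """Yield (word, context) pairs from (head, label, modifier) arcs.

    Each arc contributes 'modifier/label' as a context of the head and
    'head/label-1' (the inverse relation) as a context of the modifier.
    """
    for head, label, mod in arcs:
        yield head, f"{mod}/{label}"
        yield mod, f"{head}/{label}-1"

# Hand-written arcs for the example sentence:
arcs = [
    ("scientist", "amod", "Australian"),
    ("discovers", "nsubj", "scientist"),
    ("discovers", "dobj", "star"),
    ("discovers", "prep_with", "telescope"),
]
for word, ctx in dependency_contexts(arcs):
    print(word, ctx)
# contexts of 'discovers': scientist/nsubj, star/dobj, telescope/prep_with
```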
Generalizing Skip-Gram with Negative Sampling
How does Skip-Gram work?
Text → Bag-of-Words Contexts → Word-Context Pairs → Learning
Our Modification
Text → Arbitrary Contexts → Word-Context Pairs → Learning
Modified word2vec publicly available!
Our Modification: Example
Text (Wikipedia) → Syntactic Contexts (Stanford Dependencies) → Word-Context Pairs → Learning
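Putting the pipeline together: the modified word2vec trains on pre-extracted word/context pairs rather than raw text, so the hand-off between the parsing step and the learner can be as simple as the sketch below. The one-pair-per-line file format and the file name are assumptions for illustration; see the released tool for its exact input format.

```python
def write_pairs(pairs, path):
    """Write (word, context) pairs to a text file, one '<word> <context>' per line."""
    with open(path, "w", encoding="utf-8") as out:
        for word, ctx in pairs:
            out.write(f"{word} {ctx}\n")

# e.g. pairs like those produced by the dependency_contexts sketch above:
pairs = [
    ("discovers", "scientist/nsubj"),
    ("scientist", "discovers/nsubj-1"),
    ("discovers", "star/dobj"),
    ("discovers", "telescope/prep_with"),
]
write_pairs(pairs, "dep.pairs")
```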
What is the effect of different context types?
Thoroughly studied in explicit (distributional) representations: Lin (1998), Padó and Lapata (2007), and many others…
General conclusion:
Bag-of-words contexts induce topical similarities
Dependency contexts induce functional similarities: words that share the same semantic type (cohyponyms)
Does this hold for embeddings as well?
Embedding Similarity with Different Contexts
Target Word: Hogwarts (Harry Potter's school)
Bag of Words (k=5): Dumbledore, hallows, half-blood, Malfoy, Snape (related to Harry Potter)
Dependencies: Sunnydale, Collinwood, Calarts, Greendale, Millfield (schools)
Embedding Similarity with Different Contexts
Target Word: Turing (computer scientist)
Bag of Words (k=5): nondeterministic, non-deterministic, computability, deterministic, finite-state (related to computability)
Dependencies: Pauling, Hotelling, Heting, Lessing, Hamming (scientists)
Online Demo!
Embedding Similarity with Different Contexts
Target Word: dancing (dance gerund)
Bag of Words (k=5): singing, dance, dances, dancers, tap-dancing (related to dance)
Dependencies: singing, rapping, breakdancing, miming, busking (gerunds)
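Tables like these are plain nearest-neighbor queries over the learned word vectors. A minimal sketch, assuming vecs is a dict from words to numpy vectors (an illustrative name, not part of the released code):

```python
import numpy as np

def most_similar(vecs, target, topn=5):
    """Return the topn words whose vectors are most cosine-similar to the target's."""
    t = vecs[target]
    t = t / np.linalg.norm(t)
    scored = []
    for word, v in vecs.items():
        if word == target:
            continue
        scored.append((float(t @ (v / np.linalg.norm(v))), word))
    return [w for _, w in sorted(scored, reverse=True)[:topn]]

# e.g. most_similar(dep_vecs, "Hogwarts") would return the dependency-based
# neighbors shown above, given embeddings trained with dependency contexts.
```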
Embedding Similarity with Different Contexts
Dependency-based embeddings have more functional similarities
This phenomenon goes beyond these examples
Quantitative analysis (in the paper)
Quantitative Analysis
[Precision-recall curves comparing Dependencies, BoW (k=2), and BoW (k=5)]
Dependency-based embeddings have more functional similarities
Why do dependencies induce functional similarities?
Dependency Contexts & Functional Similarity
Thoroughly studied in explicit (distributional) representations: Lin (1998), Padó and Lapata (2007), and many others…
In explicit representations, we can look at the features and analyze
But embeddings are a black box!
Dimensions are latent and don't necessarily have any meaning
Analyzing Embeddings
Peeking into Skip-Gram’s Black Box
Associated Contexts
Target Word: Hogwarts
Dependencies: students/prep_at⁻¹, educated/prep_at⁻¹, student/prep_at⁻¹, stay/prep_at⁻¹, learned/prep_at⁻¹
Associated Contexts
Target Word: Turing
Dependencies: machine/nn⁻¹, test/nn⁻¹, theorem/poss⁻¹, machines/nn⁻¹, tests/nn⁻¹
Associated Contexts
Target Word: dancing
Dependencies: dancing/conj, dancing/conj⁻¹, singing/conj⁻¹, singing/conj, ballroom/nn
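Since SGNS learns a vector for every context as well as every word, one way to recover lists like these (a sketch consistent with the approach described in the paper; the variable names are illustrative) is to score all context vectors against a target word's vector and keep the highest-scoring contexts:

```python
import numpy as np

def associated_contexts(W, C, word_index, context_vocab, word, topn=5):
    """Return the contexts whose vectors score highest (by dot product)
    against the given word's vector.

    word_index: dict mapping a word to its row in W.
    context_vocab: list of context strings aligned with the rows of C.
    """
    w = W[word_index[word]]
    scores = C @ w                      # one score per context
    top = np.argsort(-scores)[:topn]
    return [context_vocab[i] for i in top]
```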
Analyzing Embeddings
We found a way to linguistically analyze embeddings
Together with the ability to engineer contexts…
…we now have the tools to create task-tailored embeddings!
Conclusion
Generalized Skip-Gram with Negative Sampling to arbitrary contexts
Different contexts induce different similarities
Suggested a way to peek inside the black box of embeddings
Code, demo, and word vectors available from our websites
Make linguistically-motivated, task-tailored embeddings today!
Thank you for listening :)
How does Skip-Gram work?
Generalize Skip-Gram to Arbitrary Contexts
Quantitative Analysis
WordSim353
Quantitative Analysis
Define an artificial task of ranking functional pairs above topical ones
Use embedding similarity (cosine) to rank
Evaluate using a precision-recall curve
A higher curve means higher affinity to functional similarity
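A sketch of that evaluation (the pair labels are illustrative; the paper derives them from WordSim353 by treating related-but-dissimilar pairs as topical): rank all pairs by cosine similarity and sweep the rank as a cutoff, counting functional pairs as positives.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def precision_recall_curve(pairs, vecs):
    """pairs: list of ((w1, w2), is_functional); vecs: dict word -> vector.

    Sort pairs by embedding similarity and sweep a cutoff down the ranking,
    recording (precision, recall) at every position.
    """
    ranked = sorted(pairs,
                    key=lambda p: cosine(vecs[p[0][0]], vecs[p[0][1]]),
                    reverse=True)
    total_pos = sum(1 for _, functional in ranked if functional)
    curve, tp = [], 0
    for i, (_, functional) in enumerate(ranked, start=1):
        tp += int(functional)
        curve.append((tp / i, tp / total_pos))
    return curve
```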
Quantitative Analysis
[Precision-recall curves (two panels) comparing Dependencies, BoW (k=2), and BoW (k=5)]
Dependency-based embeddings have more functional similarities