Dependency-Based Word Embeddings
Omer Levy, Yoav Goldberg
Bar-Ilan University, Israel
Neural Embeddings
Our Main Contribution: Generalizing Skip-Gram with Negative Sampling
Skip-Gram with Negative Sampling v2.0
Original implementation assumes bag-of-words contexts
We generalize to arbitrary contexts
Dependency contexts create qualitatively different word embeddings
Provide a new tool for linguistically analyzing embeddings
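To make the generalization concrete, here is a minimal numpy sketch (not the authors' released code) of a single SGNS update over one observed (word, context) pair plus sampled negatives; the matrices W and C and the function name are illustrative. Nothing in the objective cares whether the context is a window word or a dependency context.

```python
import numpy as np

def sgns_update(W, C, word_id, ctx_id, neg_ctx_ids, lr=0.025):
    """One stochastic update of Skip-Gram with Negative Sampling.

    W: word vectors (|V_words| x d); C: context vectors (|V_contexts| x d).
    The context vocabulary is arbitrary: neighboring words, dependency
    contexts such as 'scientist/nsubj', or anything else.
    """
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    w = W[word_id]
    w_grad = np.zeros_like(w)

    # Observed pair: push sigma(w . c) toward 1.
    c = C[ctx_id]
    g = lr * (1.0 - sigmoid(w @ c))
    w_grad += g * c
    C[ctx_id] += g * w

    # Sampled negative contexts: push sigma(w . c_neg) toward 0.
    for nid in neg_ctx_ids:
        cn = C[nid]
        g = lr * (0.0 - sigmoid(w @ cn))
        w_grad += g * cn
        C[nid] += g * w

    W[word_id] += w_grad
```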
Context Types
Example: Australian scientist discovers star with telescope
Target Word: Australian scientist discovers star with telescope (target: 'discovers')
Bag of Words (BoW) Context: Australian scientist discovers star with telescope
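As a rough sketch of what a bag-of-words context is (assuming simple whitespace tokenization and a symmetric window of size k; the function name is illustrative):

```python
def bow_contexts(tokens, k):
    """Yield (word, context) pairs using a symmetric window of size k."""
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - k), min(len(tokens), i + k + 1)
        for j in range(lo, hi):
            if j != i:
                yield word, tokens[j]

sentence = "Australian scientist discovers star with telescope".split()
# With k=2, the contexts of 'discovers' are: Australian, scientist, star, with
print([c for w, c in bow_contexts(sentence, 2) if w == "discovers"])
```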
Syntactic Dependency Context: Australian scientist discovers star with telescope
Dependency relations to the target 'discovers': nsubj (scientist), dobj (star), prep_with (telescope)
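A sketch of how dependency contexts can be read off a parse: every arc (head, label, modifier) gives the head a context modifier/label and the modifier the inverse context head/label⁻¹. The arcs below are hand-written Stanford-style relations for the example sentence (with the preposition collapsed into prep_with), not parser output.

```python
def dependency_contexts(arcs):
    """Yield (word, context) pairs from (head, label, modifier) arcs.

    Each arc contributes 'modifier/label' as a context of the head and
    'head/label-1' (the inverse relation) as a context of the modifier.
    """
    for head, label, mod in arcs:
        yield head, f"{mod}/{label}"
        yield mod, f"{head}/{label}-1"

# Hand-written arcs for the example sentence:
arcs = [
    ("scientist", "amod", "Australian"),
    ("discovers", "nsubj", "scientist"),
    ("discovers", "dobj", "star"),
    ("discovers", "prep_with", "telescope"),
]
for word, ctx in dependency_contexts(arcs):
    print(word, ctx)
# contexts of 'discovers': scientist/nsubj, star/dobj, telescope/prep_with
```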
Generalizing Skip-Gram with Negative Sampling
How does Skip-Gram work?
Text → Bag-of-Words Contexts → Word-Context Pairs → Learning
Our Modification
Text → Arbitrary Contexts → Word-Context Pairs → Learning
Modified word2vec publicly available!
Our Modification: Example
Text (Wikipedia) → Syntactic Contexts (Stanford Dependencies) → Word-Context Pairs → Learning
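Putting the pipeline together: the modified word2vec trains on pre-extracted word/context pairs rather than raw text, so the hand-off between the parsing step and the learner can be as simple as the sketch below. The one-pair-per-line file format and the file name are assumptions for illustration; see the released tool for its exact input format.

```python
def write_pairs(pairs, path):
    """Write (word, context) pairs to a text file, one '<word> <context>' per line."""
    with open(path, "w", encoding="utf-8") as out:
        for word, ctx in pairs:
            out.write(f"{word} {ctx}\n")

# e.g. pairs like those produced by the dependency_contexts sketch above:
pairs = [
    ("discovers", "scientist/nsubj"),
    ("scientist", "discovers/nsubj-1"),
    ("discovers", "star/dobj"),
    ("discovers", "telescope/prep_with"),
]
write_pairs(pairs, "dep.pairs")
```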
What is the effect of different context types?
Thoroughly studied in explicit (distributional) representations: Lin (1998), Padó and Lapata (2007), and many others…
General conclusion:
Bag-of-words contexts induce topical similarities
Dependency contexts induce functional similarities: words that share the same semantic type (cohyponyms)
Does this hold for embeddings as well?
Embedding Similarity with Different Contexts
Target Word: Hogwarts (Harry Potter's school)
Bag of Words (k=5): Dumbledore, hallows, half-blood, Malfoy, Snape (related to Harry Potter)
Dependencies: Sunnydale, Collinwood, Calarts, Greendale, Millfield (schools)
Embedding Similarity with Different Contexts
Target Word: Turing (computer scientist)
Bag of Words (k=5): nondeterministic, non-deterministic, computability, deterministic, finite-state (related to computability)
Dependencies: Pauling, Hotelling, Heting, Lessing, Hamming (scientists)
Online Demo!
Embedding Similarity with Different Contexts
Target Word: dancing (dance gerund)
Bag of Words (k=5): singing, dance, dances, dancers, tap-dancing (related to dance)
Dependencies: singing, rapping, breakdancing, miming, busking (gerunds)
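Tables like these are plain nearest-neighbor queries over the learned word vectors. A minimal sketch, assuming vecs is a dict from words to numpy vectors (an illustrative name, not part of the released code):

```python
import numpy as np

def most_similar(vecs, target, topn=5):
    """Return the topn words whose vectors are most cosine-similar to the target's."""
    t = vecs[target]
    t = t / np.linalg.norm(t)
    scored = []
    for word, v in vecs.items():
        if word == target:
            continue
        scored.append((float(t @ (v / np.linalg.norm(v))), word))
    return [w for _, w in sorted(scored, reverse=True)[:topn]]

# e.g. most_similar(dep_vecs, "Hogwarts") would return the dependency-based
# neighbors shown above, given embeddings trained with dependency contexts.
```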
Embedding Similarity with Different Contexts
Dependency-based embeddings have more functional similarities
This phenomenon goes beyond these examples
Quantitative analysis (in the paper)
Quantitative Analysis
[Precision-recall curves comparing Dependencies, BoW (k=2), and BoW (k=5)]
Dependency-based embeddings have more functional similarities
Why do dependencies induce functional similarities?
Dependency Contexts & Functional Similarity
Thoroughly studied in explicit (distributional) representations: Lin (1998), Padó and Lapata (2007), and many others…
In explicit representations, we can look at the features and analyze
But embeddings are a black box!
Dimensions are latent and don't necessarily have any meaning
Analyzing Embeddings
Peeking into Skip-Gram’s Black Box
Associated Contexts
Target Word: Hogwarts
Dependencies: students/prep_at⁻¹, educated/prep_at⁻¹, student/prep_at⁻¹, stay/prep_at⁻¹, learned/prep_at⁻¹
Associated Contexts
Target Word: Turing
Dependencies: machine/nn⁻¹, test/nn⁻¹, theorem/poss⁻¹, machines/nn⁻¹, tests/nn⁻¹
Associated Contexts
Target Word: dancing
Dependencies: dancing/conj, dancing/conj⁻¹, singing/conj⁻¹, singing/conj, ballroom/nn
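Since SGNS learns a vector for every context as well as every word, one way to recover lists like these (a sketch consistent with the approach described in the paper; the variable names are illustrative) is to score all context vectors against a target word's vector and keep the highest-scoring contexts:

```python
import numpy as np

def associated_contexts(W, C, word_index, context_vocab, word, topn=5):
    """Return the contexts whose vectors score highest (by dot product)
    against the given word's vector.

    word_index: dict mapping a word to its row in W.
    context_vocab: list of context strings aligned with the rows of C.
    """
    w = W[word_index[word]]
    scores = C @ w                      # one score per context
    top = np.argsort(-scores)[:topn]
    return [context_vocab[i] for i in top]
```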
Analyzing Embeddings
We found a way to linguistically analyze embeddings
Together with the ability to engineer contexts…
…we now have the tools to create task-tailored embeddings!
Conclusion
Generalized Skip-Gram with Negative Sampling to arbitrary contexts
Different contexts induce different similarities
Suggested a way to peek inside the black box of embeddings
Code, demo, and word vectors available from our websites
Make linguistically-motivated, task-tailored embeddings today!
Thank you for listening :)
How does Skip-Gram work?
Generalize Skip-Gram to Arbitrary Contexts
Quantitative Analysis
WordSim353
Quantitative Analysis
Define an artificial task of ranking functional pairs above topical ones
Use embedding similarity (cosine) to rank
Evaluate using a precision-recall curve
A higher curve means higher affinity to functional similarity
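A sketch of that evaluation (the pair labels are illustrative; the paper derives them from WordSim353 by treating related-but-dissimilar pairs as topical): rank all pairs by cosine similarity and sweep the rank as a cutoff, counting functional pairs as positives.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def precision_recall_curve(pairs, vecs):
    """pairs: list of ((w1, w2), is_functional); vecs: dict word -> vector.

    Sort pairs by embedding similarity and sweep a cutoff down the ranking,
    recording (precision, recall) at every position.
    """
    ranked = sorted(pairs,
                    key=lambda p: cosine(vecs[p[0][0]], vecs[p[0][1]]),
                    reverse=True)
    total_pos = sum(1 for _, functional in ranked if functional)
    curve, tp = [], 0
    for i, (_, functional) in enumerate(ranked, start=1):
        tp += int(functional)
        curve.append((tp / i, tp / total_pos))
    return curve
```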
Quantitative Analysis
[Precision-recall curves (two panels) comparing Dependencies, BoW (k=2), and BoW (k=5)]
Dependency-based embeddings have more functional similarities