Vector space word representations

1 Vector space word representations
Rani Nelken, PhD
Director of Research, Outbrain
@RaniNelken


3 Words = atoms?


5 That would be crazy for numbers

6 The distributional hypothesis
What is a word?
Wittgenstein (1953): The meaning of a word is its use in the language
Firth (1957): You shall know a word by the company it keeps

7 From atomic symbols to vectors
Map words to dense numerical vectors “representing” their contexts
Map words with similar contexts to vectors with a small angle between them
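The "small angle" criterion is usually measured with cosine similarity. The following is a minimal sketch of that measure; the three-dimensional vectors and the words chosen are hypothetical values invented for illustration, not real embeddings.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical embeddings, invented for illustration only.
vectors = {
    "cheese": [0.9, 0.1, 0.0],
    "butter": [0.8, 0.2, 0.1],
    "paper": [0.0, 0.2, 0.9],
}

sim_butter = cosine(vectors["cheese"], vectors["butter"])
sim_paper = cosine(vectors["cheese"], vectors["paper"])

# Words assumed to share contexts point in nearly the same direction.
print(sim_butter > sim_paper)
```

Cosine ignores vector length, so only the direction of each vector matters for the comparison.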

8 History
Hard clustering: Brown clustering
Soft clustering: LSA, random projections, LDA
Neural nets

9 Feedforward Neural Net Language Model

10 Training
Input is one-hot vectors of the context words (0…0,1,0…0)
We’re trying to learn a vector for each word (“projection”)
such that the output is close to the one-hot vector of w(t)
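To see why the projection gives "a vector for each word", note that multiplying a one-hot vector by a matrix simply selects one row of that matrix. The sketch below assumes a hypothetical four-word vocabulary and a made-up 4x2 projection matrix; in a real model these weights are learned during training.

```python
# Hypothetical vocabulary and projection weights, for illustration only.
vocab = ["the", "cat", "sat", "mat"]
projection = [
    [0.1, 0.3],    # row for "the"
    [0.7, -0.2],   # row for "cat"
    [-0.4, 0.5],   # row for "sat"
    [0.6, 0.6],    # row for "mat"
]


def one_hot(word):
    """Return the (0...0,1,0...0) encoding of a vocabulary word."""
    return [1.0 if w == word else 0.0 for w in vocab]


def project(one_hot_vec):
    """Multiply a one-hot row vector by the projection matrix."""
    dim = len(projection[0])
    return [
        sum(one_hot_vec[i] * projection[i][j] for i in range(len(vocab)))
        for j in range(dim)
    ]


embedded = project(one_hot("cat"))

# The product equals the matrix row belonging to "cat".
print(embedded == projection[vocab.index("cat")])
```

Because of this, implementations typically store the projection as a lookup table indexed by word rather than performing the multiplication.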

11 Simpler model: Word2Vec


13 What can we do with these representations?
Plug them into your existing classifier
Plug them into further neural nets (better!)
Improves accuracy on many NLP tasks:
Named entity recognition
POS tagging
Sentiment analysis
Semantic role labeling

14 Back to cheese…
cos(crumbled, cheese) = 0.042
cos(crumpled, cheese) = 0.203

15 And now for the magic

16 “Magical” property
[Paris] - [France] + [Italy] ≈ [Rome]
[king] - [man] + [woman] ≈ [queen]
We can use it to solve word analogy problems
Boston : Red_Sox = New_York : ?
Demo


18 Why does it work?
[king] - [man] + [woman] ≈ [queen]
For unit-length word vectors, cos(x, [king] - [man] + [woman]) ∝ cos(x, [king]) - cos(x, [man]) + cos(x, [woman])
[queen] is a good candidate: close to [king] and [woman], but not to [man]
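The relation between the two sides can be worked out in one line. The sketch below assumes the three query vectors are normalized to unit length (an assumption, though a common preprocessing step) and abbreviates [king], [man], [woman] as k, m, w.

```latex
% Assumption: \|k\| = \|m\| = \|w\| = 1
\cos(x,\, k - m + w)
  = \frac{x \cdot k - x \cdot m + x \cdot w}{\|x\| \, \|k - m + w\|}
  = \frac{\cos(x, k) - \cos(x, m) + \cos(x, w)}{\|k - m + w\|}
```

The denominator does not depend on x, so ranking candidate words by the left-hand side gives the same order as ranking them by the sum of cosines.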

19 It doesn’t always work
London : England = Baghdad : ?
We expect Iraq, but get Mosul
We’re looking for a word that is close to Baghdad and to England, but not to London

20 Why did it fail?
London : England = Baghdad : ?
One large term dominates the sum: cos(Mosul, Baghdad) >> cos(Iraq, London)
Instead of adding the cosines, multiply them
Improves accuracy
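A sketch of the additive versus multiplicative scoring follows. The slide only says to multiply the cosines; the exact form used here (dividing by the similarity to the "not" word plus a small constant `EPS`) is an assumption modeled on the multiplicative objective often called 3CosMul. The similarity values are hypothetical, assumed to lie in [0, 1], and chosen only to reproduce the kind of failure described above.

```python
EPS = 0.001  # assumed small constant to avoid division by zero

# Hypothetical similarities, for illustration only:
# word -> (sim to Baghdad, sim to England, sim to London)
sims = {
    "Mosul": (0.75, 0.13, 0.14),
    "Iraq": (0.63, 0.15, 0.13),
}


def score_add(to_b, to_a_star, to_a):
    """Additive objective: one large term can dominate the sum."""
    return to_b + to_a_star - to_a


def score_mul(to_b, to_a_star, to_a):
    """Multiplicative objective: every factor has to be reasonably good."""
    return to_b * to_a_star / (to_a + EPS)


best_add = max(sims, key=lambda w: score_add(*sims[w]))
best_mul = max(sims, key=lambda w: score_mul(*sims[w]))

print(best_add, best_mul)
```

With these numbers the additive score is driven almost entirely by closeness to Baghdad, while the product also rewards the slightly better fit to England and the slightly worse fit to London.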

21 Word2Vec
Open-source C implementation from Google
Comes with pre-trained embeddings
Gensim: fast Python implementation

22 Active field of research
Bilingual embeddings
Joint word and image embeddings
Embeddings for sentiment
Phrase and document embeddings

23 Bigger picture: how can we make NLP less fragile?
1990s: Linguistic engineering
2000s: Feature engineering
2010s: Unsupervised preprocessing

24 References

25 Thanks @RaniNelken We’re hiring for NLP positions
