
1 Learning Word Representations from Scarce Data. Authors: Ramon F. Astudillo, Silvio Amir, Wang Ling, Mário Silva, Isabel Trancoso. By: Aadil Hayat (13002)

2 Learning Word Representations from Scarce and Noisy Data with Embedding Sub-spaces. Outline: 1. Introduction 2. Theory 3. Results

3 Introduction: unsupervised word embeddings for scarce and noisy data

4 Abstract
- A technique to adapt unsupervised word embeddings to specific applications when only small and noisy labeled datasets are available.
- Current methods use pre-trained embeddings to initialize model parameters and then use the labeled data to tailor them for the intended task.
- This approach is prone to overfitting when training is performed with scarce and noisy data.
- To overcome this issue, the supervised data are used to find an embedding sub-space that fits the task complexity.
- All word representations are adapted through a projection into this task-specific sub-space.
- This approach was recently used in the SemEval 2015 Twitter sentiment analysis challenge, attaining state-of-the-art results.

5 Theory

6 Unsupervised Structured Skip-Gram
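
The slide itself carries only the title, so as background: the structured skip-gram of Ling et al. (2015) replaces the single output matrix of the standard skip-gram with one output matrix per relative context position. A hedged sketch of the objective (the notation below is assumed for illustration, not taken from the slides):

$$ J = \sum_{t=1}^{T} \sum_{\substack{p=-c \\ p \neq 0}}^{c} \log p(w_{t+p} \mid w_t), \qquad p(w_{t+p} = v \mid w_t) = \frac{\exp\left(c_{p,v}^{\top} e_{w_t}\right)}{\sum_{v' \in V} \exp\left(c_{p,v'}^{\top} e_{w_t}\right)} $$

Here $e_{w_t}$ is the input embedding of word $w_t$ and $c_{p,v}$ is the output vector of word $v$ for relative position $p$; the position-specific output parameters are what distinguish the structured variant from the standard skip-gram.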

7 Adapting Embeddings with Sub-space Projections
- Word embeddings are a useful unsupervised technique for obtaining initial model parameters or features prior to supervised learning. These models can then be retrained using the available labeled data.
- Embeddings provide a compact, real-valued representation of each word in a vocabulary.
- Even so, the total number of parameters in the model can be rather high, and very often only a small amount of supervised data is available, which can lead to severe overfitting.
- Even if regularization is used to reduce overfitting, only a reduced subset of words will actually be present in the labeled dataset; words not seen during training never get their embeddings updated.
- The following slides explain a simple solution to this problem.

8 Embedding Sub-space
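
The equations on this slide did not survive the transcript. As a hedged reconstruction of the idea: instead of fine-tuning the full pre-trained embedding matrix E with scarce labels, a small projection matrix S is learned from the supervised data, and every word is represented through the adapted embedding S·E. A minimal numpy sketch, with all sizes and names (vocab_size, subspace_dim, etc.) assumed for illustration:

import numpy as np

# Hypothetical sizes for illustration only.
vocab_size, embedding_dim, subspace_dim = 10_000, 200, 10

# E: pre-trained embeddings, kept fixed during supervised training.
E = np.random.randn(embedding_dim, vocab_size) * 0.01

# S: the only embedding-related parameters learned from labeled data.
# Since subspace_dim << embedding_dim, S has far fewer parameters than E,
# which is what keeps overfitting in check.
S = np.random.randn(subspace_dim, embedding_dim) * 0.01

def adapted_embedding(word_index):
    """Project a word's pre-trained embedding into the task sub-space.

    Because S is shared across the whole vocabulary, every word gets an
    adapted representation, including words never seen in the labeled set.
    """
    return S @ E[:, word_index]

This is the key contrast with retraining E directly: updating S moves all words at once, while gradient updates to E only ever touch the words that appear in the labeled data.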

9 Non-Linear Sub-space Embedding Model
- The concept of an embedding sub-space can be applied to log-linear classifiers or to deep learning architectures that use embeddings.
- The NLSE can be interpreted as a simple feed-forward neural network with a single hidden layer that uses the embedding sub-space approach.
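
A hedged sketch of how such a one-hidden-layer model could look, continuing the numpy example above; the sigmoid non-linearity, mean pooling, and three-class output are assumptions for illustration, not details taken from the slides:

import numpy as np

def nlse_forward(word_indices, S, E, Y):
    """Forward pass of a one-hidden-layer sub-space model (sketch).

    word_indices: indices of the words in one message.
    S, E: projection and pre-trained embedding matrices as above.
    Y: class weights of shape (num_classes, subspace_dim).
    """
    # Project each word into the sub-space and apply a sigmoid.
    hidden = 1.0 / (1.0 + np.exp(-(S @ E[:, word_indices])))  # (subspace_dim, n_words)
    # Pool over the words of the message, then score the classes.
    scores = Y @ hidden.mean(axis=1)                          # (num_classes,)
    # Softmax over classes, e.g. negative / neutral / positive.
    exp_scores = np.exp(scores - scores.max())
    return exp_scores / exp_scores.sum()

# Usage with hypothetical values: a 3-class sentiment head.
# probs = nlse_forward(np.array([1, 5, 42]), S, E,
#                      np.random.randn(3, subspace_dim) * 0.01)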

10 Results

11 Twitter Sentiment Analysis. [Figure: average F-measure on the SemEval test sets as a function of embedding sub-space size s; size 0 denotes the baseline (log-linear model).]

12 [Table: comparison of the two baselines with two variations of the proposed model, alongside the performance of state-of-the-art systems for Twitter sentiment prediction.]

13 Thank You

