1
Word-embedding-based mapping
Raymond ZHAO Wenlong (Updated on 15/08/2018)
2
Word embeddings
- Vector space models represent words using low-dimensional, fixed-size vectors
- They can group semantically similar words and encode rich linguistic patterns (e.g., word2vec (Mikolov et al., 2013) or GloVe (Pennington et al., 2014))
- To apply a vector model to a sentence/document, one must select an appropriate composition function
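As a concrete illustration of how such embeddings group semantically similar words, here is a minimal sketch (not from the slides) that loads pre-trained GloVe vectors from a local file and looks up nearest neighbours by cosine similarity; the file name glove.6B.100d.txt and the query word are assumptions.

```python
# Minimal sketch: load pre-trained GloVe vectors and find semantically
# similar words by cosine similarity. Assumes a local copy of
# "glove.6B.100d.txt" (file name and query word are illustrative only).
import numpy as np

def load_glove(path):
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def most_similar(word, vectors, topn=5):
    # Cosine similarity between `word` and every other word in the vocabulary.
    v = vectors[word]
    sims = {
        w: float(np.dot(v, u) / (np.linalg.norm(v) * np.linalg.norm(u)))
        for w, u in vectors.items() if w != word
    }
    return sorted(sims.items(), key=lambda kv: -kv[1])[:topn]

if __name__ == "__main__":
    glove = load_glove("glove.6B.100d.txt")
    print(most_similar("laptop", glove))  # expect neighbours like "laptops", "notebook", ...
```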
3
A typical NN model
- A composition function g + a classifier (on the final representation)
- A composition function is a mathematical process for combining multiple words into a single vector
- Unordered functions: treat input texts as bags of word embeddings
- Syntactic functions: take word order and sentence structure into account (e.g., CNN/RNN, where g may depend on a parse tree of the input sequence); they require more training time on huge datasets, e.g., a recursive NN has to compute a syntactic parse tree
- A deep unordered model: apply a composition function g to the sequence of word embeddings v_w; the output is a vector z that serves as input to a logistic regression function
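Below is a minimal numpy sketch (not from the slides) of the deep unordered composition just described: g averages the word embeddings v_w into a single vector z, which is then fed to a logistic-regression-style classifier. The weights are random placeholders, not trained parameters.

```python
# Sketch of an unordered composition g plus a classifier on top of z.
import numpy as np

rng = np.random.default_rng(0)
embedding_dim, num_classes = 100, 4

def compose_average(word_vectors):
    # Unordered composition g: bag of embeddings -> single vector z.
    return np.mean(word_vectors, axis=0)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Toy "sentence" of 6 words, each represented by a 100-dim embedding.
sentence = rng.normal(size=(6, embedding_dim))
z = compose_average(sentence)

# Logistic regression on z (random, untrained weights, for illustration only).
W = rng.normal(size=(num_classes, embedding_dim))
b = np.zeros(num_classes)
print(softmax(W @ z + b))
```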
4
SWEM model - a deep unordered model
- By Duke University, ACL 2018
- Source code is on GitHub
- Obtains near state-of-the-art accuracy on sentence- and document-level tasks
5
Paper's results - document-level classification
- Datasets: Yahoo! Answers and AG News
- The SWEM model exhibits stronger performance relative to both LSTM and CNN compositional architectures
- Marries the speed of unordered functions with the accuracy of syntactic functions
- Computationally efficient - fewer parameters
6
Paper's results - sentence-level tasks
- SWEM yields inferior accuracy
- Sentences are approximately 20 words long on average
7
Simple word-embedding model (SWEM)
- SWEM-aver: takes the information of the whole sequence into account via the addition (averaging) operation - the information of every word
- SWEM-max: max pooling extracts the most salient features - the information of key words
- SWEM-concat: concatenates the SWEM-aver and SWEM-max representations
- SWEM-hier: SWEM-aver over a local window, then a global max pooling over the windows (like n-grams)
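A small sketch of the four pooling variants over a matrix of word embeddings X with shape (seq_len, dim); the window size used for SWEM-hier is an assumption, chosen only for illustration.

```python
# Sketch of the SWEM pooling variants over a sequence of word embeddings.
import numpy as np

def swem_aver(X):
    # Average pooling: every word contributes equally.
    return X.mean(axis=0)

def swem_max(X):
    # Max pooling: keep the most salient value per dimension (key words).
    return X.max(axis=0)

def swem_concat(X):
    # Concatenate the average- and max-pooled representations.
    return np.concatenate([swem_aver(X), swem_max(X)])

def swem_hier(X, window=5):
    # Hierarchical pooling: average over each local window (n-gram-like),
    # then a global max pooling across the windows.
    windows = [X[i:i + window].mean(axis=0) for i in range(len(X) - window + 1)]
    return np.max(windows, axis=0)

X = np.random.default_rng(0).normal(size=(20, 100))  # toy 20-word sentence
print(swem_aver(X).shape, swem_max(X).shape, swem_concat(X).shape, swem_hier(X).shape)
```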
8
The experiments - SWEM-aver using Keras (current baseline model)
- Uses our Amazon review texts (830k texts, 21k tokens)
- Uses pre-trained GloVe word embeddings (trained on a dataset of 1B tokens)
- Current baseline model: multiclass logistic regression with activation='sigmoid' and loss='categorical_crossentropy'
- Current accuracy on the CPU classification task (multi-class classification on Keras/TensorFlow)
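A hedged Keras sketch of the SWEM-aver baseline described above: an Embedding layer initialised from pre-trained GloVe vectors, global average pooling, and a multiclass classifier. The vocabulary size, sequence length, number of classes and the embedding matrix are placeholders; the sigmoid/categorical_crossentropy pair mirrors the slide's stated baseline (softmax is the more common choice for multi-class output).

```python
# Sketch of a SWEM-aver baseline in Keras with frozen pre-trained embeddings.
import numpy as np
from tensorflow.keras import initializers, layers, models

vocab_size, embedding_dim, max_len, num_classes = 21000, 100, 200, 4
embedding_matrix = np.zeros((vocab_size, embedding_dim))  # fill from GloVe in practice

model = models.Sequential([
    layers.Input(shape=(max_len,)),
    layers.Embedding(vocab_size, embedding_dim,
                     embeddings_initializer=initializers.Constant(embedding_matrix),
                     trainable=False),                 # keep the pre-trained vectors fixed
    layers.GlobalAveragePooling1D(),                   # SWEM-aver composition
    layers.Dense(num_classes, activation="sigmoid"),   # as on the slide; softmax is more usual
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, epochs=5)  # x_train: padded word-id sequences, y_train: one-hot labels
```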
9
The experiments - the SWEM-aver algorithm using Keras
- Current experiments on the RAM, Screen Size, Hard Disk and Graphics Coprocessor configurators
10
The experiments - the SWEM-max algorithm
- Current accuracy on Hard Disk: (to improve)
- How to configure the labels is very important
- Objective: labels from the field experts - TODO
11
The experiments - TODO
- Data preprocessing: stemming, remove punctuation/stop words (see the sketch below)
- Try the SWEM-concat algorithm
- Try to learn task-specific embeddings
- Try a topic model for short texts
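A small sketch of the preprocessing steps listed above (stemming, removing punctuation and stop words) using NLTK; the example review text is made up for illustration.

```python
# Preprocessing sketch: lowercase, tokenize, drop punctuation/stop words, stem.
# Requires: nltk.download("punkt"); nltk.download("stopwords")
import string
from nltk import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
stop_words = set(stopwords.words("english"))

def preprocess(text):
    tokens = word_tokenize(text.lower())
    tokens = [t for t in tokens
              if t not in string.punctuation and t not in stop_words]
    return [stemmer.stem(t) for t in tokens]

print(preprocess("The laptop's 8GB RAM and 256GB hard disk are impressive."))
```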
12
Thanks
Thanks, Dr. Wong