1
Word-embedding-based mapping
Raymond ZHAO Wenlong (Updated on 15/08/2018)
2
Word embeddings
- Vector space models represent words using low-dimensional, fixed-size vectors
- They can group semantically similar words and encode rich linguistic patterns (e.g., word2vec (Mikolov et al., 2013) or GloVe (Pennington et al., 2014))
- To apply a vector model to a sentence/document, one must select an appropriate composition function
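As a concrete illustration of how such embeddings group semantically similar words, here is a minimal sketch (not from the slides) that loads pre-trained GloVe vectors from a local file and looks up nearest neighbours by cosine similarity; the file name glove.6B.100d.txt and the query word are assumptions.

```python
# Minimal sketch: load pre-trained GloVe vectors and find semantically
# similar words by cosine similarity. Assumes a local copy of
# "glove.6B.100d.txt" (file name and query word are illustrative only).
import numpy as np

def load_glove(path):
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def most_similar(word, vectors, topn=5):
    # Cosine similarity between `word` and every other word in the vocabulary.
    v = vectors[word]
    sims = {
        w: float(np.dot(v, u) / (np.linalg.norm(v) * np.linalg.norm(u)))
        for w, u in vectors.items() if w != word
    }
    return sorted(sims.items(), key=lambda kv: -kv[1])[:topn]

if __name__ == "__main__":
    glove = load_glove("glove.6B.100d.txt")
    print(most_similar("laptop", glove))  # expect neighbours like "laptops", "notebook", ...
```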
3
A typical NN model
- A composition function g + a classifier (on the final representation)
- A composition function is a mathematical process for combining multiple words into a single vector
- Unordered functions: treat input texts as bags of word embeddings
- Syntactic functions: take word order and sentence structure into account (e.g., CNN/RNN, where g may depend on a parse tree of the input sequence); they require more training time on huge datasets, e.g., a recursive NN has to compute a syntactic parse tree
- A deep unordered model: apply a composition function g to the sequence of word embeddings v_w; the output is a vector z that serves as input to a logistic regression function
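Below is a minimal numpy sketch (not from the slides) of the deep unordered composition just described: g averages the word embeddings v_w into a single vector z, which is then fed to a logistic-regression-style classifier. The weights are random placeholders, not trained parameters.

```python
# Sketch of an unordered composition g plus a classifier on top of z.
import numpy as np

rng = np.random.default_rng(0)
embedding_dim, num_classes = 100, 4

def compose_average(word_vectors):
    # Unordered composition g: bag of embeddings -> single vector z.
    return np.mean(word_vectors, axis=0)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Toy "sentence" of 6 words, each represented by a 100-dim embedding.
sentence = rng.normal(size=(6, embedding_dim))
z = compose_average(sentence)

# Logistic regression on z (random, untrained weights, for illustration only).
W = rng.normal(size=(num_classes, embedding_dim))
b = np.zeros(num_classes)
print(softmax(W @ z + b))
```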
4
SWEM model - a deep unordered model
- By Duke University, ACL 2018
- Source code is on GitHub
- Obtains near state-of-the-art accuracy on sentence- and document-level tasks
5
Paper's results - document-level classification
- Datasets: Yahoo! Answers and AG News
- The SWEM model exhibits stronger performance relative to both LSTM and CNN compositional architectures
- Marries the speed of unordered functions with the accuracy of syntactic functions
- Computationally efficient - fewer parameters
6
Paper's results - sentence-level tasks
- SWEM yields inferior accuracy
- Sentences are approximately 20 words long on average
7
Simple word-embedding model (SWEM)
- SWEM-aver: takes the information of the whole sequence into account via the addition (averaging) operation - the information of every word
- SWEM-max: max pooling extracts the most salient features - the information of key words
- SWEM-concat: concatenates the SWEM-aver and SWEM-max representations
- SWEM-hier: SWEM-aver over a local window, then a global max pooling over the windows (like n-grams)
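A small sketch of the four pooling variants over a matrix of word embeddings X with shape (seq_len, dim); the window size used for SWEM-hier is an assumption, chosen only for illustration.

```python
# Sketch of the SWEM pooling variants over a sequence of word embeddings.
import numpy as np

def swem_aver(X):
    # Average pooling: every word contributes equally.
    return X.mean(axis=0)

def swem_max(X):
    # Max pooling: keep the most salient value per dimension (key words).
    return X.max(axis=0)

def swem_concat(X):
    # Concatenate the average- and max-pooled representations.
    return np.concatenate([swem_aver(X), swem_max(X)])

def swem_hier(X, window=5):
    # Hierarchical pooling: average over each local window (n-gram-like),
    # then a global max pooling across the windows.
    windows = [X[i:i + window].mean(axis=0) for i in range(len(X) - window + 1)]
    return np.max(windows, axis=0)

X = np.random.default_rng(0).normal(size=(20, 100))  # toy 20-word sentence
print(swem_aver(X).shape, swem_max(X).shape, swem_concat(X).shape, swem_hier(X).shape)
```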
8
The experiments - SWEM-aver using Keras (current baseline model)
- Uses our Amazon review texts (830k texts, 21k tokens)
- Uses pre-trained GloVe word embeddings (trained on a dataset of 1B tokens)
- Current baseline model: multiclass logistic regression with activation='sigmoid' and loss='categorical_crossentropy'
- Current accuracy on the CPU classification task (multi-class classification on Keras/TensorFlow)
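A hedged Keras sketch of the SWEM-aver baseline described above: an Embedding layer initialised from pre-trained GloVe vectors, global average pooling, and a multiclass classifier. The vocabulary size, sequence length, number of classes and the embedding matrix are placeholders; the sigmoid/categorical_crossentropy pair mirrors the slide's stated baseline (softmax is the more common choice for multi-class output).

```python
# Sketch of a SWEM-aver baseline in Keras with frozen pre-trained embeddings.
import numpy as np
from tensorflow.keras import initializers, layers, models

vocab_size, embedding_dim, max_len, num_classes = 21000, 100, 200, 4
embedding_matrix = np.zeros((vocab_size, embedding_dim))  # fill from GloVe in practice

model = models.Sequential([
    layers.Input(shape=(max_len,)),
    layers.Embedding(vocab_size, embedding_dim,
                     embeddings_initializer=initializers.Constant(embedding_matrix),
                     trainable=False),                 # keep the pre-trained vectors fixed
    layers.GlobalAveragePooling1D(),                   # SWEM-aver composition
    layers.Dense(num_classes, activation="sigmoid"),   # as on the slide; softmax is more usual
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, epochs=5)  # x_train: padded word-id sequences, y_train: one-hot labels
```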
9
The experiments - the SWEM-aver algorithm using Keras
- Current experiments on the RAM, Screen Size, Hard Disk and Graphics Coprocessor configurators
10
The experiments - the SWEM-max algorithm
- Current accuracy on Hard Disk: (to improve)
- How to configure the labels is very important
- Objective: labels from the field experts - TODO
11
The experiments - TODO
- Data preprocessing: stemming, remove punctuation/stop words (see the sketch below)
- Try the SWEM-concat algorithm
- Try to learn task-specific embeddings
- Try a topic model for short texts
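A small sketch of the preprocessing steps listed above (stemming, removing punctuation and stop words) using NLTK; the example review text is made up for illustration.

```python
# Preprocessing sketch: lowercase, tokenize, drop punctuation/stop words, stem.
# Requires: nltk.download("punkt"); nltk.download("stopwords")
import string
from nltk import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
stop_words = set(stopwords.words("english"))

def preprocess(text):
    tokens = word_tokenize(text.lower())
    tokens = [t for t in tokens
              if t not in string.punctuation and t not in stop_words]
    return [stemmer.stem(t) for t in tokens]

print(preprocess("The laptop's 8GB RAM and 256GB hard disk are impressive."))
```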
12
Thanks
Thanks, Dr. Wong