Word embeddings based mapping

Word embeddings based mapping
Raymond ZHAO Wenlong (updated 15/08/2018)

Word embeddings
- Vector space models represent words as low-dimensional, fixed-length vectors.
- They aim to capture word relations via inner products.
- They can group semantically similar words and encode rich linguistic patterns, e.g. word2vec (Mikolov et al., 2013) or GloVe (Pennington et al., 2014).
- To apply a word-vector model to a sentence or document, one must select an appropriate composition function (a small similarity sketch follows below).
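As a small illustration of "word relations via inner products", the sketch below compares cosine similarities between pre-trained vectors. It is only an assumption-laden example: the file path and the load_glove helper are hypothetical placeholders for whatever GloVe file is actually used.

    # Minimal sketch: semantically similar words have higher cosine similarity
    # (normalized inner product) in a pre-trained embedding space.
    # The GloVe file path and this loader are hypothetical placeholders.
    import numpy as np

    def load_glove(path):
        """Parse a GloVe text file into {word: vector}."""
        vectors = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                word, *values = line.rstrip().split(" ")
                vectors[word] = np.asarray(values, dtype=np.float32)
        return vectors

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    vecs = load_glove("glove.6B.300d.txt")           # hypothetical local path
    print(cosine(vecs["laptop"], vecs["computer"]))  # expected to be high
    print(cosine(vecs["laptop"], vecs["banana"]))    # expected to be low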

A typical NN model
- A composition function g plus a classifier on the final representation.
- A composition function is a mathematical process for combining multiple word vectors into a single vector.
- Unordered functions treat input texts as bags of word embeddings.
- Syntactic functions take word order and sentence structure into account, e.g. neural networks (CNN/RNN) where g depends on a parse tree of the input sequence.
- Syntactic functions require more training time on huge datasets, e.g. an RNN that computes over a syntactic parse tree.
A deep unordered model
- Apply a composition function g to the sequence of word embeddings Vw.
- The output is a vector z that serves as input to a logistic regression function (see the sketch below).
- For syntactic functions, g depends on a parse tree of the input sequence.
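To make the unordered case concrete, here is a minimal numpy sketch (my own illustration, not the paper's code): g averages the word embeddings Vw into a vector z, which a softmax/logistic-regression layer then classifies. The embedding table and classifier weights are random placeholders, not trained parameters.

    # Sketch of an unordered composition: z = g(Vw) = average of word embeddings,
    # followed by multinomial logistic regression. All weights are placeholders.
    import numpy as np

    rng = np.random.default_rng(0)
    vocab_size, embed_dim, num_classes = 1000, 300, 5

    E = rng.normal(size=(vocab_size, embed_dim))   # word embedding table
    W = rng.normal(size=(embed_dim, num_classes))  # classifier weights
    b = np.zeros(num_classes)

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    token_ids = [12, 7, 981, 45]   # a toy input sentence as word indices
    Vw = E[token_ids]              # sequence of word embeddings
    z = Vw.mean(axis=0)            # composition function g (unordered)
    probs = softmax(z @ W + b)     # logistic regression over classes
    print(probs)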

SWEM model - a deep unordered model
- SWEM (simple word-embedding model), Duke University, ACL 2018.
- Source code is on GitHub.
- Obtains near state-of-the-art accuracies on sentence- and document-level tasks.

Paper's results: document-level classification
- Datasets: Yahoo! Answers and AG News.
- The SWEM model exhibits stronger performance than both LSTM and CNN compositional architectures.
- It marries the speed of unordered functions with the accuracy of syntactic functions.
- It is computationally efficient, with fewer parameters.

Paper's results: sentence-level tasks
- SWEM yields inferior accuracies.
- Sentences are approximately 20 words long on average.

Simple word-embedding model (SWEM) variants
- SWEM-aver: average pooling takes the information of every word in the sequence into account via addition.
- SWEM-max: max pooling extracts the most salient features (the information of key words).
- SWEM-concat: concatenates the average- and max-pooled representations.
- SWEM-hier: SWEM-aver over a local window, then a global max pooling across windows (similar to n-grams).
(A sketch of the four pooling variants follows below.)
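The following numpy sketch (my own illustration, not the paper's code) applies the four pooling variants to an embedding matrix of shape (sequence_length, embedding_dim); the window size used for SWEM-hier is an assumed hyperparameter.

    # Sketch of the four SWEM pooling variants over an embedding matrix Vw
    # of shape (seq_len, embed_dim). Values are random placeholders.
    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, embed_dim = 12, 300
    Vw = rng.normal(size=(seq_len, embed_dim))

    swem_aver = Vw.mean(axis=0)                          # average pooling
    swem_max = Vw.max(axis=0)                            # max pooling
    swem_concat = np.concatenate([swem_aver, swem_max])  # 2 * embed_dim features

    # SWEM-hier: average over each local window, then global max over windows.
    window = 5                                           # assumed n-gram window size
    windows = np.stack([Vw[i:i + window].mean(axis=0)
                        for i in range(seq_len - window + 1)])
    swem_hier = windows.max(axis=0)

    print(swem_aver.shape, swem_max.shape, swem_concat.shape, swem_hier.shape)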

The experiments: SWEM-aver using Keras (current baseline model)
- Use our Amazon review texts (830k texts, 19.8k unique tokens).
- Use pre-trained GloVe word embeddings (trained on a corpus of about 1B tokens).
- Current baseline model: multiclass logistic regression with activation='sigmoid' and loss='categorical_crossentropy', on Keras/TensorFlow.
- Current accuracy on CPU classification: 0.6106.
(A hedged Keras sketch of this baseline is shown below.)
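A minimal Keras sketch of such a SWEM-aver baseline (my reconstruction under stated assumptions, not the author's script): a frozen pre-trained embedding layer, global average pooling, and one dense layer using the sigmoid activation and categorical cross-entropy loss quoted above. The sizes, embedding_matrix, and the commented-out training arrays are placeholders.

    # Sketch of a SWEM-aver baseline in Keras: frozen pre-trained embeddings,
    # average pooling, one dense classification layer. Sizes and the
    # embedding_matrix are placeholders (fill the matrix from GloVe).
    import numpy as np
    from tensorflow.keras import layers, models

    vocab_size, embed_dim, max_len, num_classes = 19800, 300, 100, 5
    embedding_matrix = np.zeros((vocab_size, embed_dim), dtype="float32")

    embedding = layers.Embedding(vocab_size, embed_dim, trainable=False)
    model = models.Sequential([
        layers.Input(shape=(max_len,)),
        embedding,
        layers.GlobalAveragePooling1D(),                  # SWEM-aver composition
        layers.Dense(num_classes, activation="sigmoid"),  # as reported in the slide
    ])
    embedding.set_weights([embedding_matrix])             # load pre-trained GloVe vectors
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()
    # model.fit(x_train, y_train, validation_split=0.1, epochs=10, batch_size=128)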

The experiments: SWEM-aver algorithm using Keras
- The current experiments cover the RAM, Screen Size, Hard Disk and Graphics Coprocessor attributes of the configurator.

The experiments: SWEM-max algorithm
- The current experiments cover the RAM, Screen Size, Hard Disk and Graphics Coprocessor attributes of the configurator.

The experiments: SWEM-concat algorithm
- Concatenate the SWEM-aver and SWEM-max representations.
- Removing punctuation gives a slight improvement.
(A Keras sketch of the concatenation is shown below.)
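One possible Keras formulation of SWEM-concat in the same setup (again a sketch with placeholder shapes, not the author's code) uses the functional API to concatenate the average- and max-pooled embeddings before the classifier.

    # Sketch of SWEM-concat in Keras: concatenate average- and max-pooled
    # features from the same frozen, pre-trained embedding layer.
    import numpy as np
    from tensorflow.keras import Model, layers

    vocab_size, embed_dim, max_len, num_classes = 19800, 300, 100, 5
    embedding_matrix = np.zeros((vocab_size, embed_dim), dtype="float32")

    tokens = layers.Input(shape=(max_len,))
    embedding = layers.Embedding(vocab_size, embed_dim, trainable=False)
    embedded = embedding(tokens)
    avg_pool = layers.GlobalAveragePooling1D()(embedded)  # SWEM-aver branch
    max_pool = layers.GlobalMaxPooling1D()(embedded)      # SWEM-max branch
    concat = layers.Concatenate()([avg_pool, max_pool])   # SWEM-concat features
    outputs = layers.Dense(num_classes, activation="sigmoid")(concat)

    model = Model(tokens, outputs)
    embedding.set_weights([embedding_matrix])              # load pre-trained GloVe vectors
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()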

The experiments - to do
- Try the SWEM-hier algorithm.
- Try SVM/CRF classifiers (currently using multiclass logistic regression).
- Try a topic model for short texts.

Thanks
- Thanks to Dr. Wong.