Word-Embedding-Based Mapping
Raymond ZHAO Wenlong (Updated on 15/08/2018)
Word embeddings
- Vector space models represent each word as a low-dimensional, fixed-size vector
- They try to capture word relations via inner products (see the sketch below)
- They can group semantically similar words and encode rich linguistic patterns (e.g. word2vec (Mikolov et al., 2013) or GloVe (Pennington et al., 2014))
- To apply such a vector model to a sentence or document, one must select an appropriate composition function
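A minimal sketch of how inner products (here, cosine similarity) measure word relatedness. The 4-dimensional vectors are made up for illustration; real word2vec/GloVe embeddings are typically 100-300 dimensional.

```python
import numpy as np

# Toy embedding table; real vectors would be loaded from word2vec or GloVe
embeddings = {
    "laptop":   np.array([0.8, 0.1, 0.3, 0.0]),
    "notebook": np.array([0.7, 0.2, 0.4, 0.1]),
    "banana":   np.array([0.0, 0.9, 0.1, 0.6]),
}

def cosine(u, v):
    # Normalized inner product between two word vectors
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(embeddings["laptop"], embeddings["notebook"]))  # semantically close: high
print(cosine(embeddings["laptop"], embeddings["banana"]))    # unrelated: low
```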
A typical NN model
- A composition function g + a classifier on the final representation
- A composition function is a mathematical process for combining multiple word vectors into a single vector
- Unordered functions: treat input texts as bags of word embeddings
- Syntactic functions: take word order and sentence structure into account, e.g. CNN/RNN, or g depends on a parse tree of the input sequence
- Syntactic functions require more training time on huge datasets (e.g. an RNN over a parse tree must first compute the syntactic parse)

A deep unordered model
- Apply a composition function g to the sequence of word embeddings V_w
- The output is a vector z that serves as input to a logistic regression function (a minimal sketch follows below)
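A minimal NumPy sketch of the deep unordered pipeline described above, assuming g is plain averaging: g is applied to the word embeddings V_w, and the resulting vector z feeds a softmax (multiclass logistic regression) layer. The weights W and b are random placeholders standing in for learned parameters.

```python
import numpy as np

def compose_average(word_vectors):
    # g: average the embeddings of all words in the sequence
    return np.mean(word_vectors, axis=0)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

embedding_dim, num_classes = 300, 5
V_w = np.random.randn(12, embedding_dim)         # 12 words in the input text (placeholder)
W = np.random.randn(num_classes, embedding_dim)  # classifier weights (placeholder)
b = np.zeros(num_classes)

z = compose_average(V_w)       # sentence/document representation
probs = softmax(W @ z + b)     # class probabilities from logistic regression
print(probs)
```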
SWEM model
- A deep unordered model, by Duke University (ACL 2018); source code is on GitHub
- Obtains near state-of-the-art accuracies on sentence- and document-level tasks
Paper's results: document-level classification
- Datasets: Yahoo! Answers and AG News
- The SWEM model exhibits stronger performance than both LSTM and CNN compositional architectures
- Marries the speed of unordered functions with the accuracy of syntactic functions
- Computationally efficient, with fewer parameters
Paper's results: sentence-level tasks
- SWEM yields inferior accuracies
- Sentences contain approximately 20 words on average
Simple word-embedding model (SWEM) variants
- SWEM-aver: takes the information of every word in the sequence into account via averaging (addition)
- SWEM-max: max pooling extracts the most salient features (the information of key words)
- SWEM-concat: concatenates the SWEM-aver and SWEM-max representations
- SWEM-hier: SWEM-aver on each local window, then global max pooling over the windows, n-gram-like (all four pooling variants are sketched below)
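A sketch of the four SWEM pooling variants on a (num_words x dim) embedding matrix, following the descriptions above. The window size of 5 for SWEM-hier is an assumption; it is a hyperparameter.

```python
import numpy as np

def swem_aver(V):
    return V.mean(axis=0)                  # average pooling over words

def swem_max(V):
    return V.max(axis=0)                   # max pooling: most salient features

def swem_concat(V):
    return np.concatenate([swem_aver(V), swem_max(V)])

def swem_hier(V, window=5):
    # Average over each local window (n-gram-like), then global max-pooling
    n = V.shape[0]
    local_avgs = [V[i:i + window].mean(axis=0) for i in range(max(n - window + 1, 1))]
    return np.max(np.stack(local_avgs), axis=0)

V = np.random.randn(20, 300)               # 20 words, 300-dim GloVe-style vectors (placeholder)
print(swem_concat(V).shape)                # (600,)
print(swem_hier(V).shape)                  # (300,)
```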
The experiments: SWEM-aver using Keras
- Current baseline model, implemented in Keras/TensorFlow, for multi-class classification
- Uses our Amazon review texts (830k texts, 19.8k unique tokens)
- Uses pre-trained GloVe word embeddings (trained on a 1B-token dataset)
- Classifier: multiclass logistic regression, with activation='sigmoid' and loss='categorical_crossentropy' (a rough Keras sketch follows below)
- Current accuracy on CPU classification: 0.6106
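A rough Keras sketch of the baseline described above: pre-trained GloVe embeddings, SWEM-aver via global average pooling, and a single dense layer as the multiclass classifier. The vocabulary size, sequence length, class count, and the GloVe matrix loading are placeholders, not the exact experiment code.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Embedding, GlobalAveragePooling1D, Dense
from tensorflow.keras.initializers import Constant

vocab_size, embedding_dim, max_len, num_classes = 19800, 300, 100, 4
# Placeholder: in the real experiment this matrix is filled with GloVe vectors
glove_matrix = np.random.randn(vocab_size, embedding_dim)

model = Sequential([
    Input(shape=(max_len,)),
    Embedding(vocab_size, embedding_dim,
              embeddings_initializer=Constant(glove_matrix),
              trainable=False),                # frozen pre-trained embeddings
    GlobalAveragePooling1D(),                  # SWEM-aver composition
    Dense(num_classes, activation='sigmoid'),  # classifier settings as in the slides
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```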
The experiments: SWEM-aver algorithm using Keras
- Current experiments on the RAM, Screen Size, Hard Disk and Graphics Coprocessor configurators
The experiments: SWEM-max algorithm
- Current experiments on the RAM, Screen Size, Hard Disk and Graphics Coprocessor configurators
The experiments: SWEM-concat algorithm
- Concatenates SWEM-aver and SWEM-max together (see the Keras sketch below)
- Removing punctuation gives a slight improvement
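A hedged sketch of SWEM-concat in the Keras functional API: average pooling and max pooling over the same embedded sequence, concatenated before the classifier. Sizes are placeholders, as in the earlier baseline sketch.

```python
from tensorflow.keras.models import Model
from tensorflow.keras.layers import (Input, Embedding, GlobalAveragePooling1D,
                                     GlobalMaxPooling1D, Concatenate, Dense)

vocab_size, embedding_dim, max_len, num_classes = 19800, 300, 100, 4

tokens = Input(shape=(max_len,))
embedded = Embedding(vocab_size, embedding_dim)(tokens)
aver = GlobalAveragePooling1D()(embedded)    # SWEM-aver branch
mx = GlobalMaxPooling1D()(embedded)          # SWEM-max branch
features = Concatenate()([aver, mx])         # SWEM-concat representation
outputs = Dense(num_classes, activation='softmax')(features)

model = Model(tokens, outputs)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```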
The experiments: to do
- Try the SWEM-hier algorithm
- Try SVM/CRF classifiers (currently using multiclass logistic regression)
- Try a topic model for short texts
Thanks
Thanks to Dr. Wong