Tutorial: word2vec
Yang-de Chen yongde0108@gmail.com
Download & Compile
word2vec: https://code.google.com/p/word2vec/
1. Install subversion (svn): sudo apt-get install subversion
2. Download word2vec: svn checkout http://word2vec.googlecode.com/svn/trunk/
3. Compile: make
CBOW and Skip-gram
CBOW stands for "continuous bag-of-words": it predicts the current word from its surrounding context. Skip-gram does the reverse, predicting the surrounding context from the current word. Both are simple networks without a non-linear hidden layer.
Reference: Efficient Estimation of Word Representations in Vector Space, Tomas Mikolov et al.
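To make the two architectures concrete, here is a small sketch (not the word2vec C code) of how each one slices a sentence into (input, target) training examples, assuming window = 1:

```python
sentence = ["謝謝", "學長", "祝", "學長", "研究", "順利"]

def cbow_pairs(tokens, window=1):
    """CBOW: predict the center word from its surrounding context."""
    pairs = []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append((context, center))   # (input, target)
    return pairs

def skipgram_pairs(tokens, window=1):
    """Skip-gram: predict each context word from the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        for c in tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]:
            pairs.append((center, c))     # (input, target)
    return pairs

print(cbow_pairs(sentence)[1])      # (['謝謝', '祝'], '學長')
print(skipgram_pairs(sentence)[0])  # ('謝謝', '學長')
```

Note that one sentence position yields one CBOW example but up to 2×window skip-gram examples.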
Represent words as vectors
Example sentence: 謝謝 學長 祝 學長 研究 順利
Vocabulary: [謝謝, 學長, 祝, 研究, 順利]
One-hot vector of 學長: [0 1 0 0 0]
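The one-hot encoding above is simple enough to write out directly; a minimal sketch:

```python
# One-hot encoding for the example vocabulary above.
vocab = ["謝謝", "學長", "祝", "研究", "順利"]

def one_hot(word, vocab):
    """Return a |V|-dimensional vector with a 1 at the word's index."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

print(one_hot("學長", vocab))  # [0, 1, 0, 0, 0]
```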
Example of CBOW (window = 1)
謝謝 學長 祝 學長 研究 順利
Input: [1 0 1 0 0]   Target: [0 1 0 0 0]
Projection Matrix × Input vector = vector(謝謝) + vector(祝)

[1 4 7]   [1]     [1]     [4]     [7]
[2 5 8] × [0] = 1·[2] + 0·[5] + 1·[8]
[3 6 9]   [1]     [3]     [6]     [9]

(only the columns for the first three vocabulary words are shown)
Training
word2vec -train <training-data> -output <filename> -window <window-size> -cbow <0: skip-gram, 1: CBOW> -size <vector-size> -binary <0: text, 1: binary> -iter <iteration-num>
Example: ./word2vec -train data.txt -output vectors.bin -cbow 1 -window 5 -size 200 -binary 1 -iter 15
Play with word vectors
distance <output-vector> - find related words
word-analogy <output-vector> - analogy task, e.g. man→king, woman→?
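A toy illustration of what the word-analogy tool computes (this is not the C tool itself, and the 2-d vectors below are made up for the demo): the answer to man→king, woman→? is the nearest neighbor of king − man + woman by cosine similarity, excluding the query words themselves.

```python
import math

vecs = {"man": [1.0, 0.0], "woman": [-1.0, 0.0],
        "king": [1.0, 1.0], "queen": [-1.0, 1.0],
        "apple": [0.0, -1.0]}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# king - man + woman
query = [k - m + w for k, m, w in zip(vecs["king"], vecs["man"], vecs["woman"])]

# Exclude the three query words themselves, as the real tool does.
candidates = {w: v for w, v in vecs.items() if w not in ("king", "man", "woman")}
answer = max(candidates, key=lambda w: cosine(query, candidates[w]))
print(answer)  # queen
```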
Data: https://www.dropbox.com/s/tnp0wevr3u59ew8/data.tar.gz?dl=0
Results
Other Results
Analogy
Advanced Stuff – Phrase Vector
You may want to treat "New Zealand" as one word. If two words frequently occur together, we join them with an underscore and treat them as one word, e.g. New_Zealand.
How to evaluate? word2phrase scores each bigram by how much more often the pair co-occurs than chance; if the score > threshold, it adds the underscore.
word2phrase -train <word-doc> -output <phrase-doc> -threshold 100
Reference: Distributed Representations of Words and Phrases and their Compositionality, Tomas Mikolov et al.
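A minimal sketch of the scoring rule from the referenced paper, score(a, b) = (count(a b) − δ) / (count(a) · count(b)), where δ discounts very rare pairs; the corpus, threshold, and δ below are made up for the demo (the real word2phrase defaults differ):

```python
from collections import Counter

tokens = "i visited new zealand and new zealand was great".split()
uni = Counter(tokens)                    # unigram counts
bi = Counter(zip(tokens, tokens[1:]))    # bigram counts

def score(a, b, delta=1):
    """Co-occurrence score; delta discounts pairs seen only rarely."""
    return (bi[(a, b)] - delta) / (uni[a] * uni[b])

def join_phrases(tokens, threshold=0.1, delta=1):
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and score(tokens[i], tokens[i + 1], delta) > threshold:
            out.append(tokens[i] + "_" + tokens[i + 1])  # join the pair
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

print(join_phrases(tokens))
# ['i', 'visited', 'new_zealand', 'and', 'new_zealand', 'was', 'great']
```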
Advanced Stuff – Negative Sampling
Objective (to maximize):
Σ_{(w_t, c_t)} Σ_{c ∈ c_t} log σ(vec(w_t)ᵀ vec(c)) + Σ_{(w_t, c_t′)} Σ_{c′ ∈ c_t′} log σ(−vec(w_t)ᵀ vec(c′))
w_t: word, c_t: its context, c_t′: randomly sampled (negative) contexts
Negatives are drawn from the noise distribution P_n(w) = Unigram(w)^0.75 / Z
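A sketch of the noise distribution and the per-pair objective above; the word counts, vectors, and number of negatives k are made up for the demo:

```python
import math
import random

counts = {"the": 100, "cat": 10, "sat": 5, "mat": 5}
Z = sum(c ** 0.75 for c in counts.values())
P_n = {w: c ** 0.75 / Z for w, c in counts.items()}  # noise distribution

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neg_sampling_objective(v_w, v_c, negatives):
    """log σ(v_w·v_c) + Σ log σ(−v_w·v_c′) over the sampled negatives."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    obj = math.log(sigmoid(dot(v_w, v_c)))
    for v_neg in negatives:
        obj += math.log(sigmoid(-dot(v_w, v_neg)))
    return obj

# Raising counts to the 0.75 power flattens the distribution: frequent
# words are sampled less often than their raw frequency would suggest.
print(round(P_n["the"], 3), "vs raw frequency", round(100 / 120, 3))

# Draw k = 5 negative context words from P_n.
negs = random.choices(list(counts), weights=[P_n[w] for w in counts], k=5)
```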