Sentence Modeling
Representation of sentences is at the heart of Natural Language Processing. A sentence model is a representation and analysis of the semantic content of a sentence, for use in classification or generation. Sentence modeling is at the core of many tasks, such as sentiment analysis, paraphrase detection, entailment recognition, summarization, discourse analysis, machine translation, grounded language learning and image retrieval. The aim of sentence modeling is a feature function that guides the process by which the features of a sentence are extracted.
One-Dimensional Convolution
A filter is a vector of weights m of size m. The convolution takes the dot product of the filter with each m-gram of the input sequence s (the sentence), producing a sequence c.
Narrow Convolution
Size of c: s - m + 1. It requires that s ≥ m.
Wide Convolution
Size of c: s + m - 1. There is no requirement on s or m; out-of-range input values are taken to be 0. The result of the narrow convolution is a subsequence of the result of the wide convolution.
Advantages of Wide Convolution
It guarantees that a valid, non-empty c is always produced. All weights in the filter reach the entire sentence, including the words at the margins. It places no limit on the size of m or s.
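As a concrete illustration, here is a minimal NumPy sketch of the two variants (the array values and variable names are illustrative, not from the paper): np.convolve computes the narrow convolution with mode="valid" and the wide one with mode="full".

```python
import numpy as np

s = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # input sequence, length s = 5
m = np.array([0.5, 1.0, 0.5])             # filter weights, size m = 3

# Narrow convolution: only positions where the filter fully overlaps s.
# (np.convolve reverses the filter; for learned weights this is immaterial.)
narrow = np.convolve(s, m, mode="valid")   # length s - m + 1 = 3

# Wide convolution: out-of-range input values are taken to be 0.
wide = np.convolve(s, m, mode="full")      # length s + m - 1 = 7

assert narrow.size == s.size - m.size + 1
assert wide.size == s.size + m.size - 1
# The narrow result is a subsequence of the wide result:
assert np.allclose(wide[m.size - 1 : s.size], narrow)
```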
Time-Delay Neural Network
A key feature of TDNNs is the ability to express a relation between inputs over time. The sequence s is viewed as having a time dimension, and the convolution is applied over the time dimension.
Max TDNN
The convolution over time is followed by max pooling: only the maximum value of each row of the resulting matrix is kept, producing a fixed-size feature vector regardless of sentence length.
Properties of Max TDNN
Sensitive to the order of the words. Does not depend on external language-specific features. Gives largely uniform importance to the signal from each of the words. The range of the feature detectors is limited. Higher-order and long-range feature detectors cannot be incorporated. Multiple occurrences of a feature in the sequence are ignored. Pooling factor: s - m + 1.
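A minimal sketch of the Max-TDNN in NumPy, assuming d-dimensional word embeddings stacked as the columns of a sentence matrix (shapes and names are illustrative):

```python
import numpy as np

d, s_len, m_len = 4, 7, 3
S = np.random.randn(d, s_len)    # sentence matrix: one column per word
M = np.random.randn(d, m_len)    # filter: one row of weights per dimension

# Narrow convolution applied independently to each row over time.
C = np.stack([np.convolve(S[i], M[i], mode="valid") for i in range(d)])
# C has shape (d, s_len - m_len + 1); the pooling factor is s - m + 1.

# Max pooling over the time dimension: one value per feature row.
features = C.max(axis=1)         # fixed-size vector of d features
print(features.shape)            # (4,)
```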
k-Max Pooling
Given a value k and a sequence p of length p ≥ k, k-max pooling selects the subsequence p_max of the k highest values of p. The order of the values in p_max corresponds to their original order in p.
k-Max Pooling
Selects the k most active features, which may be an arbitrary number of positions apart. Preserves the order of the features but is insensitive to their specific positions. Can detect multiple occurrences of a feature.
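A sketch of the operation in NumPy (the function name is mine, not from the paper):

```python
import numpy as np

def k_max_pooling(p: np.ndarray, k: int) -> np.ndarray:
    """Keep the k highest values of p in their original order."""
    assert p.size >= k
    top_idx = np.argpartition(p, -k)[-k:]   # indices of the k largest values
    return p[np.sort(top_idx)]              # sort indices to restore order

p = np.array([0.1, 0.9, 0.3, 0.7, 0.5])
print(k_max_pooling(p, k=3))    # [0.9 0.7 0.5], order as in p
```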
What should k be? Why not let the network decide for itself?
Dynamic k-Max Pooling
k is made a function of the sentence length s and the network depth: for convolutional layer l of L, k_l = max(k_top, ceil((L - l) / L * s)). Suppose the length of the sentence is 18, L = 3 and k_top = 3: then k_1 = 12, k_2 = 6, and the top pooling layer uses k_top = 3.
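The layer-wise formula with the slide's numbers plugged in (a direct transcription, modulo my function name):

```python
import math

def dynamic_k(l: int, L: int, s: int, k_top: int) -> int:
    # k_l = max(k_top, ceil((L - l) / L * s)) for convolutional layer l of L
    return max(k_top, math.ceil((L - l) / L * s))

# Worked example from the slide: sentence length s = 18, L = 3, k_top = 3.
print([dynamic_k(l, L=3, s=18, k_top=3) for l in (1, 2, 3)])   # [12, 6, 3]
```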
Multiple Feature Maps
[Diagram: each feature map is computed by a convolution, a k-max pooling layer and a non-linear function; stacking such layers yields second-order feature maps.]
To increase the number of learnt feature detectors of a certain order, multiple feature maps may be computed in parallel at the same layer.
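A sketch of parallel feature maps at one layer, assuming a single input map and row-wise wide convolutions (shapes and names are illustrative):

```python
import numpy as np

d, s_len, m_len, n_maps = 4, 7, 3, 5
S = np.random.randn(d, s_len)                 # one input feature map
filters = np.random.randn(n_maps, d, m_len)   # one filter per output map

# Each output map is a row-wise wide convolution with its own filter;
# with several input maps, the per-map results would be summed.
maps = np.stack([
    np.stack([np.convolve(S[r], f[r], mode="full") for r in range(d)])
    for f in filters
])
print(maps.shape)   # (n_maps, d, s_len + m_len - 1) = (5, 4, 9)
```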
Folding
In a feature map, feature detectors in different rows are independent of each other until the top fully connected layer. Folding sums every pair of adjacent rows, so that features in one row of the layer above depend on two rows of the layer below, halving the number of rows.
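Folding itself is a cheap operation; a minimal sketch:

```python
import numpy as np

def fold(F: np.ndarray) -> np.ndarray:
    """Sum every pair of adjacent rows of a feature map, halving d."""
    assert F.shape[0] % 2 == 0, "folding assumes an even number of rows"
    return F[0::2] + F[1::2]

F = np.arange(12.0).reshape(4, 3)
print(fold(F))   # shape (2, 3): rows 0+1 and rows 2+3 summed
```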
Properties of the Sentence Model
The subsequence of n-grams extracted by the pooling operation induces invariance to absolute positions but maintains their order and relative positions. The pooling operations give the DCNN feature graph a global range over the whole sentence. The DCNN has an internal, input-dependent structure and does not rely on externally provided parse trees.
Experiments
Sentiment Prediction in Movie Reviews
Concerns prediction of the sentiment of movie reviews in the Stanford Sentiment Treebank. The output is binary in the first experiment and five-way (negative, somewhat negative, neutral, somewhat positive, positive) in the second. [Results tables for the binary and multi-class tasks.]
Question Type Classification
TREC question dataset, with six different question types.
Twitter Sentiment Prediction with Distant Supervision
A large dataset of tweets, each automatically labelled positive or negative based on the emoticon it contains. The tweets are preprocessed.
Conclusion
The Dynamic CNN is defined, which uses dynamic k-max pooling. Its feature graph captures word relations of varying size. It achieves high performance on sentiment prediction and question classification without requiring external features.