Learn to Comment Mentor: Mahdi M. Kalayeh

Presentation transcript:

Learn to Comment
Mentor: Mahdi M. Kalayeh
REU Students: Lance Lebanoff | David Hill | Jonathan Pham

Problem Definition
- Given an image, generate comments about the image that mimic human comments.
- Use a combination of existing computer vision techniques to extract features from the image:
  - Object detection
  - Scene understanding
  - Sentiment analysis
- Train a deep neural network.

Useful Ideas From Previous Works
- Deep learning structures: CNN, LSTM
- Long Short-Term Memory (LSTM) networks
  - Accept temporal sequences of arbitrary length
  - Output natural language word by word
- Natural language processing for sentence descriptions
- Sentiment analysis from SentiBank
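The point about LSTMs accepting sequences of arbitrary length can be illustrated with a minimal sketch of a single LSTM cell in plain Python. This is not the Caffe/LRCN implementation used in the project; the function names (`lstm_step`, `init_params`, `run_lstm`), the random weights, and the sizes are hypothetical and chosen only for illustration. The same cell is applied once per time step, so the loop works for any sequence length.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W, b):
    """One LSTM time step (standard gate equations).

    x: input vector, h: previous hidden state, c: previous cell state.
    W: dict mapping gate name to a (hidden x (input + hidden)) weight matrix.
    b: dict mapping gate name to a bias vector of length hidden.
    """
    z = x + h  # list concatenation: [x; h]

    def affine(gate):
        return [sum(w * v for w, v in zip(row, z)) + bias
                for row, bias in zip(W[gate], b[gate])]

    i = [sigmoid(a) for a in affine("i")]    # input gate
    f = [sigmoid(a) for a in affine("f")]    # forget gate
    o = [sigmoid(a) for a in affine("o")]    # output gate
    g = [math.tanh(a) for a in affine("g")]  # candidate cell values
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def init_params(input_size, hidden_size, seed=0):
    # Hypothetical random initialization; a real model would learn these.
    rng = random.Random(seed)
    W = {gate: [[rng.uniform(-0.1, 0.1)
                 for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]
         for gate in "ifog"}
    b = {gate: [0.0] * hidden_size for gate in "ifog"}
    return W, b

def run_lstm(sequence, W, b, hidden_size):
    """Apply the same cell at every step; the sequence can have any length."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in sequence:
        h, c = lstm_step(x, h, c, W, b)
    return h
```

In a captioning or commenting model, the hidden state returned at each step would typically feed a softmax over the vocabulary to pick the next word, and that word would become part of the next input.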

LRCN
Donahue, Jeff, et al. Long-term Recurrent Convolutional Networks for Visual Recognition and Description.

LSTM
Donahue, Jeff, et al. Long-term Recurrent Convolutional Networks for Visual Recognition and Description.

Is deep learning enough?
- Is an end-to-end neural network sufficient?
- Training from scratch
- Issue of overfitting
- Pre-trained

Descriptions vs. Comments
- Image-sentence fragment alignment
- Comments can refer to concepts beyond the context of the image: pop culture, current events, visual aesthetics, ...
- Unlike descriptions, comments convey sentiment about their subjects.

Extracting sentiment from an image (SentiBank).

Anticipated Challenges
- Generating comments on an image is a harder task than describing its visual content.
  - Many different ("divergent") comments are possible for the same image.
  - Standard NLP metrics like BLEU, which score n-gram overlap with reference sentences, are unlikely to work well.
- Data Collection
  - Comments on images are often replies to other comments.
  - Captions on pictures often influence the nature of the comments.
  - We will most likely need to clean our own data sets by removing irrelevant comments.
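The concern about BLEU can be made concrete with a small sketch of clipped n-gram precision, the overlap measure that BLEU is built on. This is not full BLEU (there is no brevity penalty and no averaging over n-gram orders), and the function names and example comments are hypothetical. It shows that a perfectly reasonable comment can share no words with a reference comment and therefore score zero.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also occur in the reference.

    Each candidate n-gram is credited at most as many times as it
    appears in the reference (the "clipping" used by BLEU).
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total

# Hypothetical comments on the same photo of a dog.
reference = "what a cute dog"
same = clipped_precision("what a cute dog", reference)
divergent = clipped_precision("i want one so bad", reference)
```

Here `same` is 1.0 and `divergent` is 0.0, even though both are plausible human comments on the image, which is why an overlap-based score is a poor fit for this task.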

Division of Tasks
- Lance: Sentiment analysis from textual content; studying possible data collection approaches
- Jonathan: Sentiment analysis from visual content
- David: Deep learning (Caffe, LRCN)