Learn to Comment Mentor: Mahdi M. Kalayeh


Learn to Comment
Mentor: Mahdi M. Kalayeh
REU Students: Lance Lebanoff | David Hill | Jonathan Pham

Problem Definition
  Given an image, generate comments about the image that mimic human comments.
  Use a combination of existing computer vision techniques to extract the features from the image:
    Object detection
    Scene understanding
    Sentiment analysis
  Train a deep neural network.

Useful Ideas From Previous Works
  Deep learning structures: CNN, LSTM
  Long Short-Term Memory (LSTM) networks
    Accept temporal sequences of arbitrary length
    Output natural language word by word
  Natural language processing for sentence descriptions
  Sentiment analysis from SentiBank

LRCN
Donahue, Jeff, et al. "Long-term Recurrent Convolutional Networks for Visual Recognition and Description."

LSTM
Donahue, Jeff, et al. "Long-term Recurrent Convolutional Networks for Visual Recognition and Description."
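The LSTM unit named on this slide can be sketched in a few lines of plain Python. This is a minimal sketch of the standard LSTM update (input, forget, and output gates plus a candidate cell value), which may differ in detail from the variant used in the cited paper. The names lstm_step and run_lstm, the gate ordering inside W, and the random weights are assumptions made for illustration, not part of the project.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step (hypothetical helper, standard formulation).

    x: input vector, h_prev / c_prev: previous hidden and cell state.
    W has 4 * hidden_dim rows, stacked as [input, forget, output, candidate];
    each row has len(x) + len(h_prev) columns. b has 4 * hidden_dim entries.
    """
    n = len(h_prev)
    z = [a + bias for a, bias in zip(matvec(W, x + h_prev), b)]
    i = [sigmoid(v) for v in z[0:n]]            # input gate
    f = [sigmoid(v) for v in z[n:2 * n]]        # forget gate
    o = [sigmoid(v) for v in z[2 * n:3 * n]]    # output gate
    g = [math.tanh(v) for v in z[3 * n:4 * n]]  # candidate cell value
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def run_lstm(sequence, hidden_dim, seed=0):
    """Feed a sequence of any length through one cell with random weights."""
    rng = random.Random(seed)
    input_dim = len(sequence[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
         for _ in range(4 * hidden_dim)]
    b = [0.0] * (4 * hidden_dim)
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in sequence:
        h, c = lstm_step(x, h, c, W, b)
    return h, c


if __name__ == "__main__":
    short = [[1.0, 0.0], [0.0, 1.0]]
    longer = short * 4
    # The same cell handles both lengths; the state size stays fixed.
    print(len(run_lstm(short, 3)[0]), len(run_lstm(longer, 3)[0]))
```

Because the same weights are reused at every step, the loop in run_lstm works for a sequence of any length. In a captioning or commenting model, a word would be predicted from h at each step.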

Is deep learning enough?
  Is an end-to-end neural network sufficient?
  Training from scratch
  Issue of overfitting
  Pre-trained

Descriptions vs. Comments
  Image-sentence fragment alignment
  Comments can refer to concepts beyond the context of the image.
    Pop culture, current events, visual aesthetic, ...
  Unlike descriptions, comments convey sentiment about their subjects.

Extracting sentiment from an image (SentiBank).

Anticipated Challenges
  Generating comments on an image is a harder task than describing its visual content.
    Many possible ‘divergent’ comments.
    Standard NLP metrics like BLEU, which score n-gram overlap with reference sentences, will not work.
  Data Collection
    Comments on images are often replies to other comments.
    Captions on pictures often influence the nature of the comments.
    We will most likely need to clean our own data sets by removing irrelevant comments.
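The concern about BLEU can be illustrated with the clipped n-gram precision that BLEU is built on. This is a minimal sketch, not a full BLEU implementation: it leaves out the brevity penalty and the geometric mean over n-gram orders. The function names and the example comments are invented for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision, the core ingredient of BLEU.

    Each candidate n-gram counts at most as often as it appears in
    any single reference.
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


if __name__ == "__main__":
    # Hypothetical human comment on a sunset photo, used as the reference.
    reference = "what a gorgeous sunset".split()
    # Two plausible generated comments for the same photo.
    similar = "what a gorgeous view".split()
    divergent = "wish i was there right now".split()

    print(modified_precision(similar, [reference], 1))    # 0.75
    print(modified_precision(divergent, [reference], 1))  # 0.0
```

Both generated comments are reasonable reactions to the photo, yet the divergent one scores zero because it shares no words with the reference. A metric built on this overlap would penalize exactly the variety that human comments show.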

Division of Tasks
  Lance: Sentiment analysis from textual content; studying possible data collection approaches
  Jonathan: Sentiment analysis from visual content
  David: Deep learning (Caffe, LRCN)