Learn to Comment Mentor: Mahdi M. Kalayeh REU Students: Lance Lebanoff | David Hill | Jonathan Pham
Problem Definition
- Given an image, generate comments about the image that mimic human comments.
- Use a combination of existing computer vision techniques to extract the features from the image:
  - Object detection
  - Scene understanding
  - Sentiment analysis
- Train a deep neural network.
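One way to picture combining the three feature sources is to concatenate their outputs into a single vector that is fed to the network. The sketch below is only an illustration under assumptions: the function names `l2_normalize` and `fuse_features` are hypothetical, the inputs stand in for whatever the object, scene, and sentiment models actually produce, and the per-source L2 normalization is an assumed design choice that the slides do not specify.

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def fuse_features(object_scores, scene_scores, sentiment_scores):
    """Concatenate per-source feature vectors into one input vector.

    Each argument is a hypothetical 1-D score vector from one extractor.
    Normalizing each part first keeps one source from dominating purely
    because of its scale (an assumption, not the authors' stated method).
    """
    parts = [
        l2_normalize(np.asarray(p, dtype=np.float32).ravel())
        for p in (object_scores, scene_scores, sentiment_scores)
    ]
    return np.concatenate(parts)

# Toy inputs with made-up sizes: 2 object scores, 3 scene scores, 1 sentiment score.
fused = fuse_features([3, 4], [0, 0, 0], [2])
print(fused.shape)
```

Simple concatenation is the most basic fusion scheme; a learned fusion layer would be an alternative.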
Useful Ideas From Previous Work
- Deep learning structures: CNN, LSTM
- Long Short-Term Memory (LSTM) networks
  - Accept temporal sequences of arbitrary length
  - Output natural language word by word
- Natural language processing for sentence descriptions
- Sentiment analysis from SentiBank
LRCN
Jeff Donahue et al., "Long-term Recurrent Convolutional Networks for Visual Recognition and Description"
LSTM
Jeff Donahue et al., "Long-term Recurrent Convolutional Networks for Visual Recognition and Description"
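To make the LSTM unit concrete, here is a minimal NumPy sketch of one time step of a standard LSTM cell, plus a loop that runs it over a sequence of any length. It is a generic textbook formulation, not necessarily the exact variant used in the cited paper or in Caffe; the names `lstm_step` and `run_sequence`, the gate ordering inside the stacked weight matrices, and the toy dimensions are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    x:      input vector, shape (input_dim,)
    h_prev: previous hidden state, shape (hidden_dim,)
    c_prev: previous cell state, shape (hidden_dim,)
    W:      input weights, shape (4 * hidden_dim, input_dim)
    U:      recurrent weights, shape (4 * hidden_dim, hidden_dim)
    b:      bias, shape (4 * hidden_dim,)
    Rows are stacked in the (assumed) order: input gate, forget gate,
    output gate, candidate values.
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:n])          # input gate
    f = sigmoid(z[n:2 * n])      # forget gate
    o = sigmoid(z[2 * n:3 * n])  # output gate
    g = np.tanh(z[3 * n:])       # candidate cell values
    c = f * c_prev + i * g       # new cell state
    h = o * np.tanh(c)           # new hidden state
    return h, c

def run_sequence(xs, W, U, b, hidden_dim):
    """Feed a sequence of input vectors through the cell; return the final hidden state."""
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    for x in xs:
        h, c = lstm_step(x, h, c, W, U, b)
    return h

# Toy usage with random weights: the same cell handles sequences of length 3 and 7.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 5, 4
W = rng.normal(size=(4 * hidden_dim, input_dim))
U = rng.normal(size=(4 * hidden_dim, hidden_dim))
b = np.zeros(4 * hidden_dim)
short_seq = rng.normal(size=(3, input_dim))
long_seq = rng.normal(size=(7, input_dim))
print(run_sequence(short_seq, W, U, b, hidden_dim).shape)
print(run_sequence(long_seq, W, U, b, hidden_dim).shape)
```

Because the same weights are reused at every step, the cell accepts sequences of arbitrary length. In a captioning model, the hidden state at each step would be passed to an output layer to pick the next word.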
Is deep learning enough?
- Is an end-to-end neural network sufficient?
- Training from scratch
- Issue of overfitting
- Pre-trained
Descriptions vs. Comments
- Image-sentence fragment alignment
- Comments can refer to concepts beyond the context of the image.
  - Pop culture, current events, visual aesthetics, ...
- Unlike descriptions, comments convey sentiment about their subjects.
Extracting sentiment from an image (SentiBank).
Anticipated Challenges
- Generating comments on an image is a harder task than describing its visual content.
  - Many possible ‘divergent’ comments.
  - Standard NLP metrics like BLEU are unlikely to work well, because a valid comment may share few words with the reference comments.
- Data Collection
  - Comments on images are often replies to other comments.
  - Captions on pictures often influence the nature of the comments.
  - We will most likely need to clean our own data sets by removing irrelevant comments.
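The concern about BLEU can be illustrated with the clipped n-gram precision that BLEU is built on. The sketch below implements only that precision term in plain Python; it is not full BLEU (no brevity penalty, no geometric mean over n-gram orders). The function names and the example comments are made up for illustration.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate sentence against reference sentences.

    Each candidate n-gram is credited at most as many times as it appears
    in any single reference. Tokenization is a simple lowercase whitespace split.
    """
    cand_counts = Counter(ngrams(candidate.lower().split(), n))
    if not cand_counts:
        return 0.0
    max_ref_counts = Counter()
    for ref in references:
        ref_counts = Counter(ngrams(ref.lower().split(), n))
        for gram, count in ref_counts.items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

# Hypothetical human comment on a photo of a puppy, used as the reference.
references = ["what a cute puppy"]

# A near-paraphrase scores well; an equally plausible but divergent comment scores zero.
print(modified_precision("what a cute dog", references, 1))
print(modified_precision("i want one so badly", references, 1))
```

Both candidates are reasonable comments on the same image, yet the second shares no words with the reference and gets no credit. That is the sense in which overlap-based metrics are a poor fit for divergent comments.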
Division of Tasks
- Lance: sentiment analysis from textual content; studying possible data collection approaches
- Jonathan: sentiment analysis from visual content
- David: deep learning (Caffe, LRCN)