Understanding and Predicting Image Memorability at a Large Scale

Understanding and Predicting Image Memorability at a Large Scale. A. Khosla, A. S. Raju, A. Torralba and A. Oliva. International Conference on Computer Vision.

Problem: How can human visual memory be predicted? Unlike in visual classification, images that are memorable or forgettable do not even look alike.

Dataset: As part of this work, the LaMem dataset was created. It contains 60,000 images from diverse sources: AVA dataset, MIR Flickr, MIT 1003, NUSEF, SUN, image popularity dataset, Abnormal objects dataset, and aPascal dataset.

Let’s Play: You will be shown a stream of images, each for 1 second. Just CLAP if you think you have seen the image before in this game.

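Collecting memorability data – Visual Memory Game: Each task lasted about 4.5 minutes and consisted of a total of 186 images divided into 66 targets, 30 fillers, and 12 vigilance repeats. The total is consistent with each target and each vigilance image being shown twice and each filler once (2 × 66 + 30 + 2 × 12 = 186). Vigilance repeats are used to ensure that subjects are paying attention.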
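The image counts for one game can be checked with a small sketch. It assumes that every target and every vigilance image is presented twice and every filler once, which is what makes the numbers add up to 186. The function name `build_game_sequence` and the plain random shuffle are illustrative only; the real game presumably controls the spacing between a first showing and its repeat, which this sketch ignores.

```python
import random
from collections import Counter


def build_game_sequence(n_targets=66, n_fillers=30, n_vigilance=12, seed=0):
    """Return a list of (image_id, kind) presentations for one game.

    Assumption: targets and vigilance images appear twice, fillers once.
    The order is a plain shuffle (hypothetical; no spacing constraints).
    """
    rng = random.Random(seed)
    sequence = []
    for i in range(n_targets):
        sequence += [(f"target_{i}", "target")] * 2
    for i in range(n_vigilance):
        sequence += [(f"vigilance_{i}", "vigilance")] * 2
    for i in range(n_fillers):
        sequence.append((f"filler_{i}", "filler"))
    rng.shuffle(sequence)
    return sequence


sequence = build_game_sequence()
presentations_per_kind = Counter(kind for _, kind in sequence)
print(len(sequence), dict(presentations_per_kind))
```

At 186 presentations in about 4.5 minutes, each presentation slot (image plus any gap) lasts roughly 1.45 seconds.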

Understanding Memorability: Memorability scores are normalized to lie between 0 and 1.

Figure panel labels: Flickr; Fixation Flickr; Affective dataset; AVA dataset.

MemNet: MemNet starts from a pre-trained Hybrid CNN, trained on ILSVRC 2012 and the Places dataset. Memorability is a single real-valued output, and the last layer is a Euclidean loss layer used to fine-tune the network. For both HOG2X2 and features from CNNs, a linear Support Vector Regression machine is trained to predict memorability. False Alarms (FA) are used to account for instances when people may remember some images simply because they are familiar but not memorable. Human performance: 0.68.
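A Euclidean loss on a single real-valued output can be sketched as follows. The 1/(2N) scaling follows the convention of Caffe's Euclidean loss layer; the slides do not name the framework or the exact scaling, so treat that as an assumption. The function names and the example scores are made up for illustration.

```python
def euclidean_loss(predicted, target):
    """Euclidean loss over a batch: sum of squared errors divided by 2N.

    Assumption: 1/(2N) scaling, as in Caffe's Euclidean loss layer.
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("predicted and target must be non-empty and equal length")
    n = len(predicted)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / (2.0 * n)


def euclidean_loss_grad(predicted, target):
    """Gradient of the loss with respect to each prediction: (p - t) / N."""
    n = len(predicted)
    return [(p - t) / n for p, t in zip(predicted, target)]


# Made-up memorability scores in [0, 1] for a batch of two images.
predicted_scores = [0.5, 0.9]
ground_truth_scores = [0.7, 0.8]
print(euclidean_loss(predicted_scores, ground_truth_scores))
print(euclidean_loss_grad(predicted_scores, ground_truth_scores))
```

During fine-tuning, a gradient of this form is what gets back-propagated from the single output unit into the pre-trained layers.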

Visualization: From top to bottom, we find the neurons could be specializing for the following: people, busy images (lots of gradients), specific objects, buildings, and finally open scenes. This matches our intuition of what objects might make an image memorable.

The figure shows the segmentations produced by neurons in conv5 that are strongly correlated, either positively or negatively, with memorability.

Memorability Maps: To generate memorability maps, images are scaled up and MemNet is applied to overlapping regions of the image. This is done for multiple scales of the image, and the resulting memorability maps are averaged. The fully-connected layers fc6 and fc7 are converted to convolutional layers of size 1 × 1, making the network fully convolutional. Demo: http://memorability.csail.mit.edu/demo.html Slide Credit: Aditya Khosla
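The overlapping-region procedure can be sketched in plain Python. This is not MemNet: `mean_brightness` is a hypothetical stand-in for the network's score on a patch, and varying the window size is used here as a simple substitute for rescaling the image, which is an assumption of this sketch rather than the method from the slides.

```python
def memorability_map(image, score_fn, window=2, stride=1):
    """Score overlapping windows and average the scores covering each pixel.

    image is a 2D list of numbers; score_fn maps a 2D patch to a float.
    """
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            patch = [row[left:left + window] for row in image[top:top + window]]
            score = score_fn(patch)
            for r in range(top, top + window):
                for c in range(left, left + window):
                    total[r][c] += score
                    count[r][c] += 1
    return [[total[r][c] / count[r][c] if count[r][c] else 0.0
             for c in range(width)] for r in range(height)]


def multi_scale_map(image, score_fn, windows=(2, 3)):
    """Average the maps obtained with several window sizes (stand-in for scales)."""
    maps = [memorability_map(image, score_fn, window=w) for w in windows]
    height, width = len(image), len(image[0])
    return [[sum(m[r][c] for m in maps) / len(maps) for c in range(width)]
            for r in range(height)]


def mean_brightness(patch):
    """Hypothetical scorer used only for illustration; not MemNet."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values)


toy_image = [[0, 0, 0], [0, 0, 0], [0, 0, 4]]
for row in multi_scale_map(toy_image, mean_brightness):
    print([round(v, 3) for v in row])
```

Running a scorer on every window separately repeats a lot of computation in the overlaps, which is the kind of cost that a fully convolutional network avoids by evaluating the whole image in one pass.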

Use non-realistic photo renderings or cartoonization to emphasize/de-emphasize different parts of an image based on the memorability maps, and evaluate its impact on the memorability of an image.

Applications and Conclusion: Automatically modifying the memorability of images. Application areas: advertising, gaming, education, and social networking.