Understanding and Predicting Image Memorability at a Large Scale

Understanding and Predicting Image Memorability at a Large Scale A. Khosla, A. S. Raju, A. Torralba and A. Oliva International Conference on Computer Vision.

Problem How can human visual memory be predicted? Unlike in visual classification, memorable images do not necessarily look alike, and neither do forgettable ones.

Dataset As part of this work, the LaMem dataset was created: 60,000 images from diverse sources, including the AVA dataset, MIR Flickr, MIT1003, NUSEF, SUN, the image popularity dataset, the Abnormal Objects dataset, and the aPascal dataset.

Let’s Play You will be shown a stream of images, each for 1 second. Just CLAP if you think you have seen the image before in this game.

Collecting memorability data – Visual Memory Game Each task lasted about 4.5 minutes consisting of a total of 186 images divided into 66 targets, 30 fillers, and 12 vigilance repeats. Vigilance repeats are used to ensure that subjects are paying attention.
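The counts on this slide can be checked with a short sketch. It assumes that each target and each vigilance image is shown twice (once new, once as a repeat) and each filler once; under that assumption the 66, 30, and 12 unique images add up to the 186 images shown. The function names and the hit-rate score are illustrative assumptions, not taken from the slides.

```python
from typing import Sequence


def total_images_shown(targets: int, fillers: int, vigilance: int,
                       shows_per_repeated_image: int = 2) -> int:
    """Count images shown in one task.

    Assumption: targets and vigilance images appear twice (first showing
    plus one repeat), while fillers appear only once.
    """
    return shows_per_repeated_image * (targets + vigilance) + fillers


def hit_rate(recognized: Sequence[bool]) -> float:
    """Fraction of participants who recognized a target on its repeat.

    `recognized` holds one boolean per participant who saw the repeat.
    This is one simple way to score an image (assumed here).
    """
    if not recognized:
        raise ValueError("need at least one response")
    return sum(recognized) / len(recognized)


if __name__ == "__main__":
    print(total_images_shown(targets=66, fillers=30, vigilance=12))
    print(hit_rate([True, True, False, True]))
```

A hit rate computed this way already lies between 0 and 1, which fits the normalized scores discussed on the next slide.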

Understanding Memorability Memorability scores are normalized to lie between 0 and 1.

Flickr Fixation Flickr Affective dataset AVA dataset

MemNet MemNet starts from the pre-trained Hybrid-CNN, which was trained on ILSVRC 2012 and the Places dataset. Memorability is a single real-valued output. The last layer is a Euclidean loss layer, used to fine-tune the network. For the baselines (both HOG2x2 and features from CNNs), a linear Support Vector Regression machine is trained to predict memorability. False alarms (FA) are used to account for cases where people report remembering an image simply because it looks familiar, not because it is memorable. Human performance: 0.68 (rank correlation).
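The two quantities on this slide can be sketched in a few lines: a Euclidean (squared-error) loss for a single real-valued output, and a rank correlation for comparing predictions with measured scores. This is a minimal sketch, not the authors' implementation. The 1/(2N) scaling is one common convention and is assumed here, and the Spearman formula below assumes there are no tied values.

```python
from typing import List, Sequence


def euclidean_loss(predictions: Sequence[float],
                   targets: Sequence[float]) -> float:
    """Squared-error loss, 1/(2N) * sum((p - t)^2).

    The 1/(2N) scaling is an assumed convention, not stated in the slides.
    """
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predictions)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / (2 * n)


def ranks(values: Sequence[float]) -> List[int]:
    """Rank positions (1 = smallest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)).

    Valid only when neither sequence contains ties.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(a)
    d_squared = sum((ra - rb) ** 2 for ra, rb in zip(ranks(a), ranks(b)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    predicted = [0.55, 0.80, 0.62]   # hypothetical model outputs
    measured = [0.60, 0.90, 0.70]    # hypothetical measured scores
    print(euclidean_loss(predicted, measured))
    print(spearman(predicted, measured))
```

A rank correlation only rewards getting the ordering of images right, so a model can score well on it even if its absolute outputs are offset from the measured scores.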

Visualization From top to bottom, we find the neurons could be specializing for the following: people, busy images (lots of gradients), specific objects, buildings, and finally open scenes. This matches our intuition of what objects might make an image memorable.

The segmentations produced by neurons in conv5 that are strongly correlated, either positively or negatively, with memorability.

Memorability Maps To generate memorability maps, the image is scaled up and MemNet is applied to overlapping regions of it. This is done at multiple scales of the image, and the resulting memorability maps are averaged. The fully-connected layers fc6 and fc7 are converted to convolutional layers of size 1 × 1, making the network fully convolutional. http://memorability.csail.mit.edu/demo.html Slide Credit: Aditya Khosla
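The overlapping-region procedure can be sketched as a sliding window. The sketch below makes two simplifying assumptions: it varies the window size over a fixed image instead of rescaling the image, and it uses a hypothetical `mean_intensity` function as a stand-in for MemNet. All names are illustrative.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]


def mean_intensity(region: Image) -> float:
    """Hypothetical stand-in for MemNet: scores a region by its mean value."""
    values = [v for row in region for v in row]
    return sum(values) / len(values)


def single_scale_map(image: Image, size: int, stride: int,
                     score_region: Callable[[Image], float]) -> Image:
    """Score every size x size window; each pixel gets the mean score
    of all windows that cover it (0.0 if no window covers it)."""
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            region = [row[left:left + size] for row in image[top:top + size]]
            score = score_region(region)
            for y in range(top, top + size):
                for x in range(left, left + size):
                    total[y][x] += score
                    count[y][x] += 1
    return [[total[y][x] / count[y][x] if count[y][x] else 0.0
             for x in range(width)] for y in range(height)]


def memorability_map(image: Image, window_sizes: Sequence[int] = (2, 4),
                     stride: int = 1,
                     score_region: Callable[[Image], float] = mean_intensity
                     ) -> Image:
    """Average the per-scale maps over all usable window sizes."""
    height, width = len(image), len(image[0])
    sizes = [s for s in window_sizes if s <= height and s <= width]
    if not sizes:
        raise ValueError("no window size fits inside the image")
    maps = [single_scale_map(image, s, stride, score_region) for s in sizes]
    return [[sum(m[y][x] for m in maps) / len(maps)
             for x in range(width)] for y in range(height)]


if __name__ == "__main__":
    toy = [[0.0] * 4 for _ in range(4)]
    toy[0][0] = 1.0  # one bright pixel in the top-left corner
    for row in memorability_map(toy):
        print([round(v, 3) for v in row])
```

Scoring each window with a separate call is slow for a real network, which is why turning the fully-connected layers into convolutions helps: the network can then produce scores for all regions in a single pass.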

Use non-realistic photo renderings or cartoonization to emphasize/de-emphasize different parts of an image based on the memorability maps, and evaluate its impact on the memorability of an image.

Applications and Conclusion Automatically modifying the memorability of images. Application areas: advertising, gaming, education, social networking.