1
Learning to Predict Where Humans Look (ICCV 2009) Tilke Judd, Krista Ehinger, Frédo Durand, Antonio Torralba
2
Introduction Database of eye tracking data Learning a model of saliency Applications Conclusion
3
Bottom-up control of selective attention − stimulus salience (defined by color, contrast and orientation) − Saliency map
4
Current saliency models do not accurately predict human fixations.
5
Top-down control of selective attention − Scene schema guides fixations (more likely to land on meaningful areas) − Task goals guide fixations to land on objects relevant to the task
6
The paper's two contributions: first, a large database of eye tracking experiments with labels and analysis; second, a supervised learning model of saliency that combines bottom-up, image-based saliency cues with top-down, semantic cues. Goal: predict where users look without eye tracking hardware.
7
Data gathering protocol ◦ 1003 random images (779 landscape, 228 portrait) were collected from Flickr and LabelMe, and eye tracking data was recorded from 15 users who free-viewed these images.
8
Data gathering protocol Gaze tracking paths and fixation locations are recorded for each viewer
9
Data gathering protocol Left: saliency map (fixations convolved with a Gaussian filter). Right: the most salient 20 percent of the image.
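As a rough illustration of the slide's pipeline (not the authors' actual code — the function names and the `sigma` value are invented here), discrete fixation points can be blurred with a Gaussian filter into a continuous saliency map, which is then thresholded to keep the top 20% of pixels:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_saliency_map(fixations, shape, sigma=8.0):
    """Convolve binary fixation locations with a Gaussian to get a
    continuous 'human' saliency map, scaled so the peak equals 1."""
    fmap = np.zeros(shape, dtype=float)
    for r, c in fixations:
        fmap[r, c] += 1.0
    smap = gaussian_filter(fmap, sigma=sigma)
    return smap / smap.max()

def top_percent_mask(saliency, percent=20):
    """Binary mask keeping the most salient `percent` of the pixels."""
    thresh = np.percentile(saliency, 100 - percent)
    return saliency >= thresh
```

A larger `sigma` smears each fixation over a wider area, trading localization for smoothness.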
10
Analysis of dataset ◦ a strong bias for human fixations to be near the center of the image [19][23]
11
Analysis of dataset ◦ How well do human saliency maps predict eye fixations? (ground-truth fixations evaluated against the saliency map used as a classifier)
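The "saliency map as classifier" idea can be sketched as follows: threshold the map at its top-`percent` most salient pixels and count what fraction of ground-truth fixations land inside. This is a hedged illustration; the helper name and API are assumptions, not code from the paper:

```python
import numpy as np

def fixations_in_top_percent(saliency, fixations, percent):
    """Fraction of ground-truth fixation points that fall inside the
    top-`percent` most salient region of the map."""
    thresh = np.percentile(saliency, 100 - percent)
    hits = sum(1 for r, c in fixations if saliency[r, c] >= thresh)
    return hits / len(fixations)

# Sweeping `percent` from 0 to 100 traces an ROC-style curve; the area
# under it summarizes how well the map predicts where people look.
```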
12
Analysis of dataset ◦ Object of interest and Size of regions of interest
13
Features used for machine learning ◦ Low-level features, e.g. color, orientation, intensity ◦ Mid-level features, e.g. horizon ◦ High-level features, e.g. face detector ◦ Center prior: distance to the center
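A minimal sketch of stacking such per-pixel features, using only an intensity channel, a crude contrast feature, and the center prior (distance to the image center). The feature choices and names here are simplified stand-ins, not the paper's actual feature set:

```python
import numpy as np

def center_prior(shape):
    """Per-pixel distance to the image center, normalized to [0, 1]
    (0 = center, where fixations are strongly biased to land)."""
    h, w = shape
    rr, cc = np.mgrid[0:h, 0:w]
    d = np.hypot(rr - (h - 1) / 2, cc - (w - 1) / 2)
    return d / d.max()

def feature_stack(intensity):
    """Stack simple per-pixel features: raw intensity, a crude contrast
    feature (absolute deviation from the global mean), and the center
    prior, giving one feature vector per pixel."""
    contrast = np.abs(intensity - intensity.mean())
    return np.stack([intensity, contrast, center_prior(intensity.shape)], axis=-1)
```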
14
Training sample selection ◦ 903 training images and 100 testing images ◦ 10 positively labeled pixels chosen randomly from the top 20% most salient locations; 10 negatively labeled pixels from the bottom 70% Training ◦ used the LIBLINEAR support vector machine to train a model
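The sample-selection rule above can be sketched in a few lines; this is an illustrative reconstruction (function name and the skipped middle band are assumptions), and the feature vectors at the selected pixels would then be fed to a linear SVM such as LIBLINEAR:

```python
import numpy as np

def select_training_samples(saliency, n_pos=10, n_neg=10, rng=None):
    """Pick pixel coordinates for training: positives sampled from the
    top 20% most salient locations, negatives from the bottom 70%
    (the middle 10% band is left out as ambiguous)."""
    rng = np.random.default_rng(rng)
    flat = saliency.ravel()
    top = np.flatnonzero(flat >= np.percentile(flat, 80))
    bottom = np.flatnonzero(flat <= np.percentile(flat, 70))
    pos = rng.choice(top, size=n_pos, replace=False)
    neg = rng.choice(bottom, size=n_neg, replace=False)
    idx = np.concatenate([pos, neg])
    labels = np.array([1] * n_pos + [0] * n_neg)
    coords = np.column_stack(np.unravel_index(idx, saliency.shape))
    return coords, labels
```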
15
Comparison of saliency maps
16
Performance on testing images 1. Outperforms the other models 2. Reaches 88% of the way to human performance 3. Does not merely benefit from the strong bias of fixations toward the center 4. The overall performance of the object detector model is low
17
Performance on testing samples (measured as the average of the true positive and true negative rates) 1. Some models perform only as well as chance on the other subsets of samples 2. The later model performs more robustly across all subsets of samples 3. Performance is better on the subsets containing people, cars, and faces
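The metric in parentheses — the average of the true positive and true negative rates, i.e. balanced accuracy, where chance is 0.5 — can be computed as follows; this helper is an illustration, not code from the paper:

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Average of the true positive rate and the true negative rate.
    Chance level is 0.5 regardless of class imbalance."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tpr = (y_pred & y_true).sum() / max(y_true.sum(), 1)
    tnr = (~y_pred & ~y_true).sum() / max((~y_true).sum(), 1)
    return (tpr + tnr) / 2
```

Unlike plain accuracy, this score cannot be inflated by always predicting the majority class, which matters when salient pixels are rare.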
18
Using eye tracking data to decide how to render a photograph with differing levels of detail. [4] D. DeCarlo and A. Santella. Stylization and abstraction of photographs. ACM Transactions on Graphics
19
Contributions ◦ Developed the largest eye tracking database of natural images, permitting large-scale quantitative analysis of fixation points and gaze paths ◦ Used machine learning to train a combined bottom-up, top-down model of saliency that outperforms several existing models Future work ◦ Understanding the impact of framing, cropping, and scaling images on fixations