1
Learning to Predict Where Humans Look (ICCV 2009) Tilke Judd, Krista Ehinger, Frédo Durand, Antonio Torralba
2
Introduction Database of eye tracking data Learning a model of saliency Applications Conclusion
3
Bottom-up control of selective attention − stimulus salience (defined by color, contrast and orientation) − Saliency map
4
Current saliency models do not accurately predict human fixations.
5
Top-down control of selective attention − Scene schema guides fixations (more likely to land on meaningful areas) − Task goals guide fixations to land on objects relevant to the task
6
The paper's two contributions: first, a large database of eye tracking experiments with labels and analysis; second, a supervised learning model of saliency that combines bottom-up, image-based saliency cues with top-down, semantic cues. Goal: predict where users look without eye tracking hardware.
7
Data gathering protocol ◦ 1003 random images (779 landscape, 228 portrait) were collected from Flickr and LabelMe, and eye tracking data was recorded from 15 users who free-viewed these images.
8
Data gathering protocol Gaze tracking paths and fixation locations are recorded for each viewer
9
Data gathering protocol Left: saliency map (fixations convolved with a Gaussian filter). Right: the most salient 20 percent of the image.
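As a rough illustration of the slide's pipeline (not the authors' actual code — the function names and the `sigma` value are invented here), discrete fixation points can be blurred with a Gaussian filter into a continuous saliency map, which is then thresholded to keep the top 20% of pixels:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_saliency_map(fixations, shape, sigma=8.0):
    """Convolve binary fixation locations with a Gaussian to get a
    continuous 'human' saliency map, scaled so the peak equals 1."""
    fmap = np.zeros(shape, dtype=float)
    for r, c in fixations:
        fmap[r, c] += 1.0
    smap = gaussian_filter(fmap, sigma=sigma)
    return smap / smap.max()

def top_percent_mask(saliency, percent=20):
    """Binary mask keeping the most salient `percent` of the pixels."""
    thresh = np.percentile(saliency, 100 - percent)
    return saliency >= thresh
```

A larger `sigma` smears each fixation over a wider area, trading localization for smoothness.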
10
Analysis of dataset ◦ a strong bias for human fixations to be near the center of the image [19][23]
11
Analysis of dataset ◦ How well do human saliency maps predict eye fixations? (ground-truth fixations evaluated against the saliency map used as a classifier)
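The "saliency map as classifier" idea can be sketched as follows: threshold the map at its top-`percent` most salient pixels and count what fraction of ground-truth fixations land inside. This is a hedged illustration; the helper name and API are assumptions, not code from the paper:

```python
import numpy as np

def fixations_in_top_percent(saliency, fixations, percent):
    """Fraction of ground-truth fixation points that fall inside the
    top-`percent` most salient region of the map."""
    thresh = np.percentile(saliency, 100 - percent)
    hits = sum(1 for r, c in fixations if saliency[r, c] >= thresh)
    return hits / len(fixations)

# Sweeping `percent` from 0 to 100 traces an ROC-style curve; the area
# under it summarizes how well the map predicts where people look.
```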
12
Analysis of dataset ◦ Object of interest and Size of regions of interest
13
Features used for machine learning ◦ Low-level features, e.g. color, orientation, intensity ◦ Mid-level features, e.g. horizon ◦ High-level features, e.g. face detector ◦ Center prior: distance to the center
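A minimal sketch of stacking such per-pixel features, using only an intensity channel, a crude contrast feature, and the center prior (distance to the image center). The feature choices and names here are simplified stand-ins, not the paper's actual feature set:

```python
import numpy as np

def center_prior(shape):
    """Per-pixel distance to the image center, normalized to [0, 1]
    (0 = center, where fixations are strongly biased to land)."""
    h, w = shape
    rr, cc = np.mgrid[0:h, 0:w]
    d = np.hypot(rr - (h - 1) / 2, cc - (w - 1) / 2)
    return d / d.max()

def feature_stack(intensity):
    """Stack simple per-pixel features: raw intensity, a crude contrast
    feature (absolute deviation from the global mean), and the center
    prior, giving one feature vector per pixel."""
    contrast = np.abs(intensity - intensity.mean())
    return np.stack([intensity, contrast, center_prior(intensity.shape)], axis=-1)
```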
14
Training sample selection ◦ 903 training images and 100 testing images ◦ 10 positively labeled pixels chosen randomly from the top 20% most salient locations; 10 negatively labeled pixels from the bottom 70% Training ◦ used the LIBLINEAR support vector machine to train a model
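The sample-selection rule above can be sketched in a few lines; this is an illustrative reconstruction (function name and the skipped middle band are assumptions), and the feature vectors at the selected pixels would then be fed to a linear SVM such as LIBLINEAR:

```python
import numpy as np

def select_training_samples(saliency, n_pos=10, n_neg=10, rng=None):
    """Pick pixel coordinates for training: positives sampled from the
    top 20% most salient locations, negatives from the bottom 70%
    (the middle 10% band is left out as ambiguous)."""
    rng = np.random.default_rng(rng)
    flat = saliency.ravel()
    top = np.flatnonzero(flat >= np.percentile(flat, 80))
    bottom = np.flatnonzero(flat <= np.percentile(flat, 70))
    pos = rng.choice(top, size=n_pos, replace=False)
    neg = rng.choice(bottom, size=n_neg, replace=False)
    idx = np.concatenate([pos, neg])
    labels = np.array([1] * n_pos + [0] * n_neg)
    coords = np.column_stack(np.unravel_index(idx, saliency.shape))
    return coords, labels
```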
15
Comparison of saliency maps
16
Performance on testing images 1. Outperforms the other models 2. Reaches 88% of the way to human performance 3. Does not merely benefit from the strong bias of fixations toward the center 4. The overall performance of the object detector model is low
17
Performance on testing samples (measured as the average of the true positive and true negative rates) 1. Some models perform only as well as chance on the other subsets of samples 2. The later model performs more robustly across all subsets of samples 3. Performance is better on the subsets containing people, cars, and faces
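The metric in parentheses — the average of the true positive and true negative rates, i.e. balanced accuracy, where chance is 0.5 — can be computed as follows; this helper is an illustration, not code from the paper:

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Average of the true positive rate and the true negative rate.
    Chance level is 0.5 regardless of class imbalance."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tpr = (y_pred & y_true).sum() / max(y_true.sum(), 1)
    tnr = (~y_pred & ~y_true).sum() / max((~y_true).sum(), 1)
    return (tpr + tnr) / 2
```

Unlike plain accuracy, this score cannot be inflated by always predicting the majority class, which matters when salient pixels are rare.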
18
Using eye tracking data to decide how to render a photograph with differing levels of detail. [4] D. DeCarlo and A. Santella. Stylization and abstraction of photographs. ACM Transactions on Graphics
19
Contributions ◦ Developed the largest eye tracking database of natural images, permitting large-scale quantitative analysis of fixation points and gaze paths ◦ Used machine learning to train a combined bottom-up, top-down model of saliency that outperforms several existing models Future work ◦ Understanding the impact of framing, cropping, and scaling images on fixations