Daozheng Chen 1, Mustafa Bilgic 2, Lise Getoor 1, David Jacobs 1, Lilyana Mihalkova 1, Tom Yeh 1 1 Department of Computer Science, University of Maryland,

Active Inference for Retrieval in Camera Networks
Daozheng Chen (1), Mustafa Bilgic (2), Lise Getoor (1), David Jacobs (1), Lilyana Mihalkova (1), Tom Yeh (1)
(1) Department of Computer Science, University of Maryland, College Park
(2) Department of Computer Science, Illinois Institute of Technology

Problem: Search camera-network videos to retrieve the frames that contain specified individuals.

Time query

Related Work: person re-identification [Wang et al. ’07]; graphical models for camera networks [Loy et al. ’09]; tracking over camera networks [Song et al. ’07]; active learning [Settles ’09].

Our Contributions: Map the video frames in a camera network onto a graphical model, and use a collective classification algorithm to predict frame states and perform frame retrieval. Apply active inference to direct human attention to the portions of the videos where labeling is most likely to yield the largest performance improvement.

Graphical Structures

Collective Classification

Active Inference

Outline: graphical model construction; iterative classification algorithm; active inference; experiment; conclusion.

Graphical Model Construction: Each frame is linked to three kinds of neighbors. Temporal neighbors (TN): frames from the previous and next k time steps within the same camera. Positively correlated spatial neighbors (PSN): frames from another camera when the correlation of the two cameras' labels is greater than some threshold. Negatively correlated spatial neighbors (NSN): frames from another camera when the correlation of the two cameras' labels is less than some threshold.
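
The three neighbor types above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of Pearson correlation on training labels, the threshold values, and the choice to link spatial neighbors at the same time step are all assumptions that the slides do not specify.

```python
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation of two equal-length label sequences (0.0 if constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def build_neighbors(train_labels, num_steps, k=1, pos_thresh=0.3, neg_thresh=-0.3):
    """Build TN/PSN/NSN neighbor lists for every (camera, time) node.

    train_labels: dict mapping camera id -> list of 0/1 training labels.
    pos_thresh / neg_thresh are hypothetical threshold values.
    """
    cams = sorted(train_labels)
    pos = {c: [] for c in cams}
    neg = {c: [] for c in cams}
    for a, b in combinations(cams, 2):
        r = pearson(train_labels[a], train_labels[b])
        if r > pos_thresh:
            pos[a].append(b)
            pos[b].append(a)
        elif r < neg_thresh:
            neg[a].append(b)
            neg[b].append(a)

    graph = {}
    for c in cams:
        for t in range(num_steps):
            # Temporal neighbors: previous and next k steps in the same camera.
            tn = [(c, t + d) for d in range(-k, k + 1)
                  if d != 0 and 0 <= t + d < num_steps]
            # Assumption: spatial neighbors are taken at the same time step.
            graph[(c, t)] = {
                "TN": tn,
                "PSN": [(o, t) for o in pos[c]],
                "NSN": [(o, t) for o in neg[c]],
            }
    return graph
```

With k = 1, an interior frame gets two temporal neighbors and a frame at either end of a video gets one.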

Temporal Neighbors (k = 1)

Positively correlated spatial neighbors

Negatively correlated spatial neighbors

Graphical Structures

The Iterative Classification Algorithm (ICA): Local model (LM): the label of a frame depends only on its features. Relational model (RM): the label of a frame depends on its features and its neighbors’ current labels. First apply the local model for initialization, then apply the relational model iteratively until the predicted labels converge [Sen et al. ’08].
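
The initialize-then-iterate loop described on this slide can be sketched as below. It is a generic sketch under stated assumptions: `local_predict` and `relational_predict` are hypothetical callables standing in for the trained local and relational classifiers, labels are updated in place within a pass, and `max_iters` is an added safety cap that the slides do not mention.

```python
def iterative_classification(nodes, local_predict, relational_predict,
                             neighbors, observed=None, max_iters=10):
    """Run ICA over `nodes` and return a dict node -> predicted label.

    local_predict(node) -> label from the node's own features.
    relational_predict(node, neighbor_labels) -> label that also uses
        the neighbors' current labels.
    neighbors: dict node -> list of neighboring nodes.
    observed: optional dict node -> known label; these are never overwritten.
    """
    observed = dict(observed or {})
    # Step 1: bootstrap every unobserved node with the local model.
    labels = {n: observed[n] if n in observed else local_predict(n)
              for n in nodes}
    # Step 2: re-predict with the relational model until nothing changes.
    for _ in range(max_iters):
        changed = False
        for n in nodes:
            if n in observed:
                continue
            new_label = relational_predict(n, [labels[m] for m in neighbors[n]])
            if new_label != labels[n]:
                labels[n] = new_label
                changed = True
        if not changed:
            break
    return labels
```

The `observed` argument is where labels obtained from a human at inference time would be fixed before the relational model is re-run.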

The Iterative Classification Algorithm (ICA), local model (LM): Logistic regression is the classifier. The feature is the cosine similarity between bag-of-features signatures: given the query signature $F_q = [f_{q1}, f_{q2}, \ldots, f_{qn}]$ and a frame signature $F = [f_1, f_2, \ldots, f_n]$, the feature is $\mathrm{COS}(F_q, F)$.
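
For reference, the cosine similarity between two signatures can be computed as in this short sketch. The function name and the convention of returning 0.0 for an all-zero signature are assumptions made for the example.

```python
import math

def cosine_similarity(f_q, f):
    """Cosine of the angle between two equal-length signature vectors."""
    dot = sum(a * b for a, b in zip(f_q, f))
    norm = math.sqrt(sum(a * a for a in f_q)) * math.sqrt(sum(b * b for b in f))
    # Assumed convention: an empty (all-zero) signature matches nothing.
    return dot / norm if norm else 0.0
```

Because bag-of-features signatures are non-negative counts, the value lies between 0 and 1, and it does not depend on how many key points each frame contributed.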

The Iterative Classification Algorithm (ICA), relational model (RM): Logistic regression is the classifier. An aggregation function constructs a feature vector encoding the neighbors’ information. For a frame with features $F = [f_{21}, f_{22}, \ldots, f_{2n}]$, the aggregated neighbor features are $F_{TN} = [f_{TN1}, f_{TN2}]$, $F_{PSN} = [f_{PSN1}, f_{PSN2}]$, and $F_{NSN} = [f_{NSN1}, f_{NSN2}]$; together these make up the relational feature vector $F_{RM}$.
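
One plausible aggregation is sketched below. The slide shows only that each neighbor type contributes a two-component vector; the use of label proportions as those two components, the concatenation order, and the function names are assumptions made for illustration.

```python
def aggregate(neighbor_labels):
    """Return [fraction of neighbors labeled 0, fraction labeled 1]."""
    if not neighbor_labels:
        return [0.0, 0.0]  # assumed convention for a node with no neighbors
    n = len(neighbor_labels)
    positives = sum(neighbor_labels)
    return [(n - positives) / n, positives / n]

def relational_features(frame_features, labels, nbrs):
    """Concatenate a frame's own features with F_TN, F_PSN and F_NSN.

    labels: dict node -> current 0/1 label.
    nbrs: dict with keys "TN", "PSN", "NSN", each a list of neighbor nodes.
    """
    feats = list(frame_features)
    for kind in ("TN", "PSN", "NSN"):
        feats += aggregate([labels[m] for m in nbrs[kind]])
    return feats
```

A fixed-length aggregate lets a single logistic regression handle frames with different numbers of neighbors.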

Active Inference: The retrieval algorithm can request the correct labels for some frames at inference time [Rattigan et al. ’07]. Subsequent inference using ICA is based on these corrected labels. Common methods for selecting frames to label: random (RND); uniform (UNI); most certain to be relevant (MR); most uncertain (UNC); Reflect and Correct [Bilgic and Getoor ’09].
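
The four simple selection baselines can be sketched as follows. The slide only names them, so the concrete readings here are assumptions: UNI picks frames evenly spaced in time, MR picks the highest predicted probability of relevance, and UNC picks the probabilities closest to 0.5. Reflect and Correct is not covered by this sketch.

```python
import random

def select_frames(probs, budget, method, seed=0):
    """Pick `budget` frame indices to send to a human labeler.

    probs: predicted P(relevant) for each frame, in temporal order.
    method: one of "RND", "UNI", "MR", "UNC".
    """
    idx = list(range(len(probs)))
    if budget <= 0:
        return []
    if method == "RND":
        # Random sample without replacement.
        return sorted(random.Random(seed).sample(idx, budget))
    if method == "UNI":
        # Evenly spaced frames along the time axis.
        step = len(idx) / budget
        return [int(i * step + step / 2) for i in range(budget)]
    if method == "MR":
        # Frames the model is most certain are relevant.
        return sorted(sorted(idx, key=lambda i: -probs[i])[:budget])
    if method == "UNC":
        # Frames whose predicted probability is closest to 0.5.
        return sorted(sorted(idx, key=lambda i: abs(probs[i] - 0.5))[:budget])
    raise ValueError("unknown method: " + method)
```

The labels returned for the selected frames would then be fixed, and inference re-run on the remaining frames.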

Reflect and Correct (RAC) [Bilgic and Getoor. ’TKDD09]

Adaptive RAC (MLI)

Dataset [Ding et al. ’10]

Queries

Regions of Interest: Use background subtraction to determine the region of interest in the frame. Densely sample key points in the region. Describe the region spanned by each key point with a color histogram in RGB space. Quantize each descriptor using a learned codebook of 500 code words. Produce a single signature for each video frame.
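
The quantization and signature steps can be illustrated with the sketch below. It assumes nearest-code-word assignment under squared Euclidean distance and a signature that simply counts assignments per code word; the slides do not state either choice, and the function names are hypothetical. Background subtraction, key-point sampling and codebook learning are outside the scope of this example.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the code word closest to `descriptor` (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda j: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[j])),
    )

def frame_signature(descriptors, codebook):
    """Bag-of-features signature: how many descriptors fall on each code word."""
    signature = [0] * len(codebook)
    for d in descriptors:
        signature[nearest_codeword(d, codebook)] += 1
    return signature
```

With the 500-word codebook mentioned on the slide, each frame would yield a 500-dimensional count vector regardless of how many key points were sampled.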

Spatial Topology

Methods for Comparison: Active inference based on LM using RND, UNI, MR, UNC, and MLI. Active inference based on RM using RND, UNI, MR, UNC, and MLI. Measurements: average accuracy and average 11-point average precision.
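
Assuming the precision measure on this slide is the standard 11-point interpolated average precision, it can be computed for one ranked result list as sketched below. The function name and the input format (a list of 0/1 relevance flags in ranked order) are choices made for this example.

```python
def eleven_point_average_precision(ranked_relevance):
    """11-point interpolated average precision for one ranked list.

    ranked_relevance: 0/1 flags, best-ranked frame first.
    """
    total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        return 0.0
    # (recall, precision) after each rank position.
    points = []
    hits = 0
    for rank, rel in enumerate(ranked_relevance, start=1):
        hits += rel
        points.append((hits / total_relevant, hits / rank))
    total = 0.0
    for i in range(11):
        level = i / 10
        # Interpolated precision: best precision at any recall >= level.
        total += max((p for r, p in points if r >= level - 1e-12), default=0.0)
    return total / 11
```

Averaging this value over all queries would give the "average" form of the measure named on the slide.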

Results: UNC-LM has the best performance when results are based on LM. RM always performs better than LM under the same sampling method. UNC-RM and MLI always perform better. MLI never performs worse than MLI does.

Conclusion: Using a graphical model provides significant performance improvements in frame retrieval. A simple method that captures frame uncertainty has an advantage over the other baseline methods. Our adaptation of RAC has better overall performance.

Questions?