Quantifying and Transferring Contextual Information in Object Detection
Professor: S. J. Wang
Student: Y. S. Wang

Presentation transcript:


Outline
• Background
• Goal
• Difficulties in Usage of Contextual Information
• Provided Solutions
• Another Method: TAS
• Experimental Results and Discussion
• Conclusion and Future Direction

Background (I)
In the past, only the properties of the target object were used in the detection task.
◦ Problem: an intolerable number of false positives.


Background (II)
What else can be used? Contextual information!

Goal
Establish a model that uses contextual information efficiently to improve detection accuracy.

Difficulties (I): Diversity of Contextual Information
◦ Many different types of context often co-exist, with different degrees of relevance to the detection of the target object(s) in different images.
◦ Terminology:
  ▪ Things (e.g. cars and people)
  ▪ Stuff (e.g. roads and sky)
  ▪ Scene (e.g. what happens in the image)
◦ Context types: Thing-Thing, Thing-Stuff, Stuff-Stuff and Scene-Thing

Difficulties (II): Ambiguity of Contextual Information
◦ Contextual information can be ambiguous and unreliable, so it does not always have a positive effect on object detection.
◦ Ex: a crowded scene with constant movement and occlusion among multiple objects.

Difficulties (III): Lack of Data for Context Learning
◦ Not enough training data leads to:
  ▪ Over-fitting
  ▪ A wrong estimate of the degree of relevance
◦ Ex: the contextual information of people on top of a sofa can be more useful than that of people on top of grass.

Training Data Preparation & Notation Representation
[Figure: training image, base detector (HOG), candidate windows. Positive samples are shown in red, negative samples in green.]

Provided Solutions
• A polar geometric descriptor for contextual representation.
• A maximum margin context model (MMC) for quantifying context.
• A context transfer learning model for context learning with limited data.

Polar Geometric Descriptor
Instead of a traditional annotation-based descriptor, a polar geometric descriptor is used to describe two kinds of contextual information (Thing-Thing and Thing-Stuff).
◦ r: number of orientations
◦ b+1: number of radial bins
◦ r*b+1: number of patches
◦ 0.5σ, σ and 2σ: bin lengths
◦ Feature: HOG
◦ Patch representation: bag-of-words, using K-means with K = 100
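The patch layout above can be sketched as a lookup from a pixel offset to a patch index. This is only an illustration under stated assumptions, not the authors' implementation: the function name `polar_patch_index` is hypothetical, the value r = 16 is an arbitrary choice (the slide gives no number), the three bin lengths are read as successive radial extents, and the innermost bin is assumed to be a single undivided patch so that the total comes to r*b+1.

```python
import math

def polar_patch_index(dx, dy, sigma, r=16, lengths=(0.5, 1.0, 2.0)):
    """Map an offset (dx, dy) from the window centre to a patch index.

    Assumed layout (not confirmed by the slides):
      - index 0 is the undivided central bin of radius lengths[0] * sigma;
      - each of the b = len(lengths) - 1 outer rings is split into r sectors,
        giving r * b + 1 patches in total.
    Returns None when the offset falls outside the outermost ring.
    """
    radius = math.hypot(dx, dy)
    edge = lengths[0] * sigma
    if radius < edge:
        return 0  # central patch
    # Orientation sector in [0, r - 1], measured from the positive x axis.
    angle = math.atan2(dy, dx) % (2 * math.pi)
    sector = min(int(angle / (2 * math.pi / r)), r - 1)
    for ring, length in enumerate(lengths[1:]):
        edge += length * sigma  # outer edge of this ring
        if radius < edge:
            return 1 + ring * r + sector
    return None

# With r = 16 and b = 2 the descriptor has 16 * 2 + 1 = 33 patches.
n_patches = 16 * 2 + 1
```

Each patch would then be described by a bag-of-words histogram over HOG features, as listed above; that step is not shown here.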


Quantifying Context (I)
Risk function:

Quantifying Context (II)
Goal: minimize the risk function.
Minimizing L is equivalent to satisfying the following constraint:
This is hard to solve directly, so it can be replaced by:

Quantifying Context (III): Maximum Margin Context Model
Extra variables and constraints are added to the problem.
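The equations for these slides are not preserved in the transcript, so the following is only a generic sketch of the max-margin idea, not the MMC formulation itself. It assumes a score of the form alpha * log(s) + w·h, where s is the prior detection score and h is the context descriptor. That form is an assumption suggested by the later slides, which mention a weight on the prior detection score and a context weight w. The hard requirement "every positive scores above every negative" is relaxed to a margin with a hinge penalty, which plays the role of the extra (slack) variables. All names and the toy data are hypothetical.

```python
import math

def context_score(alpha, w, log_s, h):
    """Assumed score: weighted log detection score plus linear context term."""
    return alpha * log_s + sum(wi * hi for wi, hi in zip(w, h))

def train_pairwise_hinge(samples, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on a pairwise hinge loss with L2 regularisation on w.

    samples: list of (log_s, h, label), label 1 for positive, 0 for negative.
    The hinge term max(0, 1 - margin) equals the slack a soft-margin
    formulation would assign to each (positive, negative) pair.
    """
    pos = [(s, h) for s, h, y in samples if y == 1]
    neg = [(s, h) for s, h, y in samples if y == 0]
    dim = len(samples[0][1])
    alpha, w = 1.0, [0.0] * dim
    n_pairs = len(pos) * len(neg)
    for _ in range(epochs):
        g_alpha, g_w = 0.0, [0.0] * dim
        for sp, hp in pos:
            for sn, hn in neg:
                margin = (context_score(alpha, w, sp, hp)
                          - context_score(alpha, w, sn, hn))
                if margin < 1.0:  # pair violates the margin
                    g_alpha -= sp - sn
                    for k in range(dim):
                        g_w[k] -= hp[k] - hn[k]
        alpha -= lr * g_alpha / n_pairs
        w = [wk - lr * (lam * wk + gk / n_pairs) for wk, gk in zip(w, g_w)]
    return alpha, w

# Toy data: detection scores alone cannot separate the classes,
# but the (hypothetical) two-bin context descriptor can.
samples = [
    (math.log(0.6), [1.0, 0.0], 1),
    (math.log(0.5), [1.0, 0.0], 1),
    (math.log(0.6), [0.0, 1.0], 0),
    (math.log(0.5), [0.0, 1.0], 0),
]
alpha, w = train_pairwise_hinge(samples)
scores = [context_score(alpha, w, s, h) for s, h, _ in samples]
```

After training on this toy set, both positives should score above both negatives even though their detection scores overlap, which is the effect the context term is meant to provide.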


Context Transfer Learning
Two cases:
◦ The source and target categories have similar contextual information.
  ▪ Ex: cars and motorbikes
◦ The categories have little in common in appearance or context, but contextual information provides a similar level of assistance.
  ▪ Ex: people and bikes

TMMC-I: Transferring Discriminant Contextual Information
Similar context from the source category assists the learning of w.

TMMC-I: Transferring Discriminant Contextual Information
New constraint:
Modified optimization function:

TMMC-II: Transferring the Weight of Prior Detection Score
When context provides a similar level of assistance, the same weight is used.

TMMC-II: Transferring the Weight of Prior Detection Score
New constraint:
Modified optimization function:

Another Method: TAS

Another Method: TAS (I)
Steps:
1. Segment the image into regions.
2. Use the base detector to get the candidate patches.
3. Establish the relationships between candidate patches and regions.
4. Use the relationships to decide whether a patch contains a target object.
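Step 3 can be illustrated with a small helper that names the spatial relationship between a candidate patch and a region. This is a hedged sketch only: the function `spatial_relation`, the fixed relation names, and the use of a region centroid are all assumptions made for illustration and are not taken from the slides or from the TAS paper.

```python
def spatial_relation(window, region_centroid):
    """Name the position of a region centroid relative to a candidate window.

    window: (x_min, y_min, x_max, y_max) in image coordinates, y growing downward.
    region_centroid: (x, y) of a segmented region.
    The relation vocabulary here is hypothetical.
    """
    x, y = region_centroid
    x0, y0, x1, y1 = window
    inside_x = x0 <= x <= x1
    inside_y = y0 <= y <= y1
    if inside_x and inside_y:
        return "in"
    if inside_x:
        return "above" if y < y0 else "below"
    if inside_y:
        return "left" if x < x0 else "right"
    return "other"

# Example: a region (say, road) whose centroid lies under a candidate car window.
relation = spatial_relation((10, 10, 20, 20), (15, 30))
```

In step 4, relationships like these would serve as evidence for or against each candidate patch; how that evidence is weighted is left out of this sketch.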

Another Method: TAS (II)
Region clusters:

Another Method: TAS (III)
Examples from the experiments:

Experimental Results and Discussion
Four data sets are used for testing:
◦ VOC 2005
◦ VOC 2007
◦ I-LIDS
◦ FORECOURT


Experimental Results and Discussion
Context transfer learning models:


Conclusion and Future Direction
In this paper, the authors propose a contextual information model that quantifies and selects useful context information to boost detection performance.
What can we do next?
◦ HOG features are not well suited to stuff (e.g. sky, road).
◦ Automatic selection between TMMC-I and TMMC-II.
◦ Automatic selection of the target object category and source category.

Reference
◦ Wei-Shi Zheng, Shaogang Gong, and Tao Xiang, "Quantifying and Transferring Contextual Information in Object Detection", PAMI, accepted.
◦ Geremy Heitz and Daphne Koller, "Learning Spatial Context: Using Stuff to Find Things", ECCV.
◦ YouTube search: "Hard-Margin SVM"