Marked Point Processes for Crowd Counting

Presentation transcript:

Marked Point Processes for Crowd Counting
Weina Ge and Robert T. Collins
Computer Science and Engineering Department, The Pennsylvania State University, USA

Poster sections: Introduction; Extrinsic Shape Mappings; Estimation; Experimental Results; Detection Results.

Introduction
A Bayesian marked point process (MPP) model is developed to detect and count people in crowded scenes. The model couples a spatial stochastic process, governing the number and placement of individuals, with a conditional mark process for selecting body shape. We automatically learn the mark (shape) process from training video by estimating a mixture of Bernoulli shape prototypes, along with an extrinsic shape distribution describing the orientation and scaling of these shapes at any given image location. The reversible jump Markov chain Monte Carlo (RJMCMC) framework is used to efficiently search for the maximum a posteriori configuration of shapes, leading to an estimate of the count, location, and pose of each person in the scene.

We use a marked point process to determine the number and configuration of multiple people in a scene. In addition to determining the location, scale, and orientation of each individual, the MPP also selects an appropriate body shape from a set of learned Bernoulli shape prototypes, as displayed at the bottom of the poster.

CSDD Features (Center-Surround Distribution Distance)
Intuition: consider a center-surround region of a given scale, centered at a given pixel.
1. Extract the feature distribution F of the center region.
2. Extract the feature distribution G of the surround region.
3. Compute the Earth Mover's Distance EMD(F, G) to measure the dissimilarity of the center region from the surround region; this EMD value is the CSDD score.
How can this be done efficiently for all pixels? The poster illustrates the computation over binary feature channels and a soft map.

Implementation Details
This method of EMD computation only works for 1D distributions. For n-D distributions, we concatenate the n 1D marginals to obtain a single 1D distribution. Fast LoG filtering at every scale is performed using a fourth-order IIR filter (Deriche filtering). We form a scale space of CSDD score images indexed by the scale of the LoG filter; CSDD features are then found as extrema in both scale and space.

Experimental Results
We have evaluated CSDD performance with respect to detection repeatability and matching utility. Details of the experiments and a complete set of results can be found on our website: http://vision.cse.psu.edu/projects/mpp/mpp.html

Matching Results (figure caption)
Row 1: four frames from a parking lot video sequence, showing affine alignment of the bottom frame overlaid on the top frame. Row 2, left to right: shout3 to shout4; shout2 to was2 (images courtesy of Tinne Tuytelaars); stop sign; snowy stop sign. Row 3: kampa1 to kampa4 (images courtesy of Jiri Matas); bike1 to bike6; trees1 to trees5; ubc1 to ubc6. Row 4, natural textures: asphalt; grass; gravel; stones.

Example (figure caption): a yin-yang symbol superimposed on an intensity gradient. Of the six interest region detectors compared, only the CSDD detector captures the natural location and scale of the symbol.
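The transcript describes the RJMCMC search only at a high level. As a rough illustration, here is a minimal sketch of a birth/death/update sampler over person hypotheses; it assumes a user-supplied log_posterior function standing in for the MPP posterior and a propose_person function drawing a candidate location, scale, orientation, and shape-prototype index. None of this is the authors' code, and the dimension-matching (Jacobian) terms of a true reversible-jump sampler are omitted.

```python
import math
import random

# Hypothetical sketch of an RJMCMC-style search for a MAP configuration of
# person hypotheses. `log_posterior(config)` and `propose_person()` are
# assumed placeholders, NOT the model described on the poster.
def rjmcmc_map(log_posterior, propose_person, n_iters=10000, temperature=1.0):
    config = []                                   # current list of person hypotheses
    cur_score = log_posterior(config)
    best, best_score = list(config), cur_score
    for _ in range(n_iters):
        move = random.choice(["birth", "death", "update"])
        proposal = list(config)
        if move == "birth":
            proposal.append(propose_person())                  # add a new person
        elif move == "death" and proposal:
            proposal.pop(random.randrange(len(proposal)))      # remove one person
        elif move == "update" and proposal:
            i = random.randrange(len(proposal))
            proposal[i] = propose_person()                     # re-draw pose/shape
        new_score = log_posterior(proposal)
        # Metropolis-Hastings acceptance, assuming symmetric proposals.
        if math.log(random.random() + 1e-300) < (new_score - cur_score) / temperature:
            config, cur_score = proposal, new_score
            if cur_score > best_score:
                best, best_score = list(config), cur_score
    return best, best_score
```

In this sketch each hypothesis would carry location, scale, orientation, and an index into the learned Bernoulli shape prototypes; the real method additionally uses the learned extrinsic shape distribution to constrain orientation and scale per image location.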
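The 1D EMD trick mentioned under Implementation Details can be sketched as follows, assuming histograms stored as NumPy arrays over common bins; the function names and normalization choices are mine, not part of the CSDD implementation.

```python
import numpy as np

def emd_1d(f, g):
    """EMD between two 1D distributions with equal total mass equals the
    L1 distance between their cumulative distribution functions."""
    f = f / f.sum()
    g = g / g.sum()
    return np.abs(np.cumsum(f) - np.cumsum(g)).sum()

def emd_via_marginals(F, G):
    """Approximate the EMD between n-D distributions F and G by concatenating
    their n 1D marginals into a single 1D distribution, as described above."""
    f_1d = np.concatenate([F.sum(axis=tuple(a for a in range(F.ndim) if a != k))
                           for k in range(F.ndim)])
    g_1d = np.concatenate([G.sum(axis=tuple(a for a in range(G.ndim) if a != k))
                           for k in range(G.ndim)])
    return emd_1d(f_1d, g_1d)
```

Here F and G would be the center and surround feature histograms at one pixel and scale; the actual CSDD implementation computes these scores densely for all pixels via filtering, which this per-pixel sketch does not attempt.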
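The scale-space step can likewise be sketched with off-the-shelf filters. The sketch below substitutes SciPy's gaussian_laplace for the fourth-order IIR (Deriche) approximation mentioned above and operates on a generic response stack rather than true CSDD score images; the scales and threshold are arbitrary assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

def scale_space_extrema(image, sigmas=(2, 4, 8, 16), threshold=0.02):
    """Return (x, y, scale) triples that are local maxima of a
    scale-normalized LoG response in both space and scale."""
    image = image.astype(np.float64)
    # Scale-normalized LoG responses, stacked into a (scale, H, W) volume.
    stack = np.stack([(s ** 2) * np.abs(gaussian_laplace(image, s)) for s in sigmas])
    # A voxel is an extremum if it equals the max over its 3x3x3 neighborhood
    # and exceeds the response threshold.
    local_max = maximum_filter(stack, size=3)
    peaks = (stack == local_max) & (stack > threshold)
    return [(x, y, sigmas[k]) for k, y, x in zip(*np.nonzero(peaks))]
```

A full CSDD detector would replace the LoG response stack with the per-scale EMD score images and keep extrema jointly over scale and space, as the poster describes.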