1
Segmentation and Tracking of Multiple Humans in Crowded Environments Tao Zhao, Ram Nevatia, Bo Wu IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 30, NO. 7, JULY 2008
2
Outline –Introduction –Overview –Probabilistic modeling –Computing MAP by efficient MCMC –Experimental results –Conclusion
3
Introduction Segmentation and tracking of multiple humans in crowded situations is made difficult by interobject occlusion.
4
Introduction The method is feasible for a crowded scene: –It handles persistent and temporarily heavy occlusion. –It does not require that humans be isolated when they first enter the scene. –More complex shape models are needed. –Joint reasoning about the collection of objects is needed.
5
Introduction Main features of this work: –A three-dimensional part-based human body model, which enables the segmentation and tracking of humans in 3D and the natural inference of interobject occlusion. –A Bayesian framework that integrates segmentation and tracking based on a joint likelihood for the appearance of multiple objects.
6
Introduction –The design of an efficient Markov chain dynamics, directed by proposal probabilities based on image cues. –The incorporation of a color-based background model in a mean-shift tracking step.
7
Overview The prior models: –Background model: the foreground blobs are extracted using a background model and serve as the basic observation. –3D human shape model: since the hypotheses are in 3D, occlusion reasoning is straightforward. –Camera model and ground plane: multiple 3D human hypotheses are projected onto the image plane and matched with the foreground blobs.
8
Overview Segmentation and tracking are integrated in a unified framework and interact over time: the foreground blobs are segmented into multiple humans, and the segmented humans are associated with the existing trajectories. The tracks are then used to propose human hypotheses in the next frame.
9
Overview We formulate the problem as one of Bayesian inference: find the best interpretation given the image observations, the prior model, and the estimates from the analysis of the previous frame. That is, we seek the maximum a posteriori (MAP) estimate.
10
Overview The state to be estimated at each frame: –The number of objects –Their correspondences to the objects in the previous frame (if any). –Their parameters (for example, position) –Uncertainty of the parameters – …
11
Probabilistic modeling Our goal is to estimate the state at time $t$, $\theta^{(t)}$, given the image observations $I^{(1)}, \ldots, I^{(t)}$. $\theta$: the state of the objects. $\Theta$: the solution space.
12
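Probabilistic modeling A state containing $n$ objects can be written as $\theta = \{(k_1, m_1), \ldots, (k_n, m_n)\} \in \Theta_n$, where $k_i$ is the unique identity of the $i$th object, whose parameters are $m_i$, and $\Theta_n$ is the solution space of exactly $n$ objects. The entire solution space is $\Theta = \bigcup_{n} \Theta_n$.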
13
3D human shape model The parameters of an individual human, $m$, are defined based on a 3D human shape model. The model does not attempt to capture the detailed shape and articulation parameters of the human body. It consists of three parts (head, torso, and legs) with a fixed spatial relationship.
14
3D human shape model The parameters ($m_i$) that describe a 3D human hypothesis: –size ($h_i$): the 3D height of the model; it also controls the overall scaling of the object in the three directions. –thickness ($f_i$): captures extra scaling in the horizontal directions. –position ($u_i$ or $(x_i, y_i)$): the image position of the head.
15
3D human shape model –orientation ($o_i$): the 3D orientation of the body. Orientations of the models are quantized into a few levels for computational efficiency. –inclination ($i_i$): the 2D inclination of the body. The body may be inclined slightly.
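The parameter list above maps naturally onto a small record type. The following is a minimal sketch, not the authors' implementation: the class name `HumanHypothesis`, the field names, the units, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HumanHypothesis:
    """One 3D human hypothesis (k_i, m_i); field names are hypothetical."""
    identity: int       # k_i: unique identity of the object
    height: float       # h_i: 3D height, also the overall scale (metres assumed)
    thickness: float    # f_i: extra scaling in the horizontal directions
    x: float            # x_i: image x-coordinate of the head
    y: float            # y_i: image y-coordinate of the head
    orientation: str    # o_i: quantized orientation, e.g. "frontal" or "profile"
    inclination: float  # i_i: 2D inclination of the body (radians assumed)


# A state containing n objects is a collection of n hypotheses.
State = List[HumanHypothesis]

state: State = [
    HumanHypothesis(1, 1.70, 1.0, 120.0, 80.0, "frontal", 0.0),
    HumanHypothesis(2, 1.82, 1.1, 200.0, 95.0, "profile", 0.05),
]
```

Because the number of objects is part of the state, a variable-length list is a convenient container: adding or removing a hypothesis changes the dimension of the state.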
16
Object appearance model We use a color histogram of the object, defined within the object shape. It helps establish correspondence in tracking because it is insensitive to the nonrigidity of human motion. Efficient algorithms exist, for example, the mean-shift technique, to optimize a histogram-based objective function.
17
Background appearance model The probability of pixel $j$ being from the background is denoted $P_b(I_j)$.
18
The prior distribution The first term: –It is independent of time and is defined in terms of the projected area of each object and the priors on its parameters. –$S_i$ is the projected image of the $i$th object and $|S_i|$ is its area.
19
The prior distribution –$P(o_{frontal}) = P(o_{profile}) = 1/2$ –$P(x_i, y_i)$ is a uniform distribution over the region where a human head is plausible –$P(h_i)$ is a Gaussian distribution $N(\mu_h, \sigma_h^2)$ truncated to the range $[h_{min}, h_{max}]$ –$P(f_i)$ is a Gaussian distribution $N(\mu_f, \sigma_f^2)$ truncated to the range $[f_{min}, f_{max}]$ –$P(i_i)$ is a Gaussian distribution $N(\mu_i, \sigma_i^2)$
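The truncated Gaussian priors listed above can be evaluated in log form as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the numeric values (mean 1.7, standard deviation 0.1, range 1.4 to 2.0) are made up for the example and do not come from the slides.

```python
import math


def truncated_gaussian_logpdf(x, mean, std, lo, hi):
    """Log-density of N(mean, std^2) truncated to the range [lo, hi]."""
    if not (lo <= x <= hi):
        return float("-inf")  # zero probability outside the allowed range

    def cdf(v):
        # Standard Gaussian CDF expressed through the error function.
        return 0.5 * (1.0 + math.erf((v - mean) / (std * math.sqrt(2.0))))

    # Truncation renormalizes by the probability mass inside [lo, hi].
    log_norm = math.log(cdf(hi) - cdf(lo))
    log_gauss = (-0.5 * ((x - mean) / std) ** 2
                 - math.log(std * math.sqrt(2.0 * math.pi)))
    return log_gauss - log_norm


def log_orientation_prior(orientation):
    """P(frontal) = P(profile) = 1/2; any other value gets zero probability."""
    if orientation in ("frontal", "profile"):
        return math.log(0.5)
    return float("-inf")


# Hypothetical height prior: N(1.7, 0.1^2) truncated to [1.4, 2.0].
log_p_h = truncated_gaussian_logpdf(1.75, 1.7, 0.1, 1.4, 2.0)
```

Working in the log domain lets the per-parameter priors be summed rather than multiplied, which avoids numerical underflow when many objects are present.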
20
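The prior distribution The second term: –It is approximated. –We rearrange $\theta^{(t)}$ and $\theta^{(t-1)}$ so that, for each object, exactly one case holds: it is associated with an object in the previous frame ($P_{assoc}$), it starts a new track ($P_{new}$), or its track is terminated ($P_{dead}$).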
21
The prior distribution –$P_{assoc}$: we assume that the position and the inclination of an object follow constant-velocity models with Gaussian noise.
22
The prior distribution The height and thickness follow a Gaussian distribution. We use Kalman filters for temporal estimation. –$P_{new}$ and $P_{dead}$: the likelihood of initializing a new track and the likelihood of terminating an existing track, respectively. They are set empirically according to the distance of the object from the entrances/exits.
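The constant-velocity assumption behind the association term can be illustrated with a simplified stand-in. Note that this sketch does not implement a Kalman filter, which is what the slides say is used: it estimates velocity from the two previous positions and scores the new observation with independent Gaussian noise per coordinate. All function names and the noise level are assumptions.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a 1D Gaussian N(mean, std^2)."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std * math.sqrt(2.0 * math.pi)))


def predict_constant_velocity(prev, prev_prev):
    """Predict the next value, assuming the velocity (prev - prev_prev) persists."""
    return prev + (prev - prev_prev)


def log_p_assoc_position(obs_xy, prev_xy, prev_prev_xy, std):
    """Log-likelihood of an observed position under the constant-velocity
    prediction, with independent Gaussian noise on each coordinate."""
    total = 0.0
    for obs, prev, prev_prev in zip(obs_xy, prev_xy, prev_prev_xy):
        predicted = predict_constant_velocity(prev, prev_prev)
        total += gaussian_logpdf(obs, predicted, std)
    return total


# An object moving 2 pixels per frame to the right: (10, 10) -> (12, 10).
on_track = log_p_assoc_position((14, 10), (12, 10), (10, 10), std=2.0)
off_track = log_p_assoc_position((20, 10), (12, 10), (10, 10), std=2.0)
```

An observation at the predicted position (14, 10) scores higher than one that jumps to (20, 10), which is the behavior an association term needs.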
23
Joint image likelihood for multiple objects and the background The visible part of an object: –It is determined by the depth order of all of the objects, which can be inferred from their 3D positions and the camera model. The non-object region is the part of the image not covered by any object.
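The depth-order reasoning described above can be sketched with pixel sets. This is an illustrative sketch, not the authors' code: it assumes each projected model is already available as a set of `(x, y)` pixels and that each object has a scalar distance from the camera; both inputs and all names are hypothetical.

```python
def visible_masks(masks_by_id, depth_by_id):
    """Compute the visible part of each object from the depth order.

    masks_by_id: {object id: set of (x, y) pixels covered by the projected model}
    depth_by_id: {object id: distance from the camera}; smaller means nearer.
    Returns {object id: set of pixels not hidden by any nearer object}.
    """
    visible = {}
    occupied = set()  # pixels already claimed by nearer objects
    for obj_id in sorted(masks_by_id, key=lambda k: depth_by_id[k]):
        visible[obj_id] = masks_by_id[obj_id] - occupied
        occupied |= masks_by_id[obj_id]
    return visible


def non_object_region(all_pixels, masks_by_id):
    """Pixels of the image that no object covers."""
    covered = set().union(*masks_by_id.values()) if masks_by_id else set()
    return all_pixels - covered


masks = {1: {(0, 0), (1, 0)}, 2: {(1, 0), (2, 0)}}
depths = {1: 5.0, 2: 3.0}  # object 2 is nearer to the camera
vis = visible_masks(masks, depths)
background = non_object_region({(0, 0), (1, 0), (2, 0), (3, 0)}, masks)
```

Here object 2 occludes object 1 at pixel (1, 0), so object 1 keeps only (0, 0), and pixel (3, 0) is the non-object region.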
24
Joint image likelihood for multiple objects and the background The joint likelihood $P(I \mid \theta)$ consists of two terms. The first term: –Background exclusion: the likelihood favors an object hypothesis whose appearance differs from the background. –Object attraction: the likelihood favors an object hypothesis whose appearance is similar to that of its corresponding object in the previous frame.
25
Joint image likelihood for multiple objects and the background –$d_i$ is the color histogram of the background image within the visibility mask of object $i$. –$p_i$ is the color histogram of the object. –The Bhattacharyya coefficient reflects the similarity of two histograms.
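For reference, the Bhattacharyya coefficient between two normalized histograms is the sum over bins of the square root of the product of the bin values. The sketch below shows that standard definition; the function names and the toy histograms are assumptions, and how the coefficient enters the likelihood is not shown.

```python
import math


def normalize(hist):
    """Scale non-negative bin counts so that they sum to 1."""
    total = float(sum(hist))
    if total <= 0.0:
        raise ValueError("histogram must have positive total mass")
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Similarity of two normalized histograms with the same binning.

    Returns a value in [0, 1]: 1 for identical histograms, 0 when no bin
    is populated in both.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


object_hist = normalize([2, 2, 0, 0])
background_hist = normalize([0, 0, 3, 1])
similarity = bhattacharyya_coefficient(object_hist, background_hist)
```

A low coefficient between the object histogram and the background histogram corresponds to the "background exclusion" idea, while a high coefficient between an object and its counterpart in the previous frame corresponds to "object attraction".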
26
Joint image likelihood for multiple objects and the background The second term: –$e_j = \log(P_b(I_j))$ is the log-probability of pixel $j$ belonging to the background model. The likelihood penalizes the difference from the background model.
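The per-pixel quantity $e_j$ can be computed once a form for $P_b$ is chosen. The slides do not give that form, so the sketch below assumes an independent Gaussian per pixel on grayscale intensity, a simplification of the color-based background model mentioned earlier. The function names and sample values are hypothetical.

```python
import math


def log_background_prob(intensity, bg_mean, bg_std):
    """e_j = log P_b(I_j) for one pixel under an assumed Gaussian model."""
    z = (intensity - bg_mean) / bg_std
    return -0.5 * z * z - math.log(bg_std * math.sqrt(2.0 * math.pi))


def log_background_map(image, bg_mean, bg_std):
    """Apply the per-pixel model to a grayscale image stored as nested lists.

    image, bg_mean, and bg_std must have the same shape.
    """
    return [
        [log_background_prob(v, m, s) for v, m, s in zip(row_i, row_m, row_s)]
        for row_i, row_m, row_s in zip(image, bg_mean, bg_std)
    ]


# One row, two pixels: the first matches the background, the second does not.
e = log_background_map([[100, 180]], [[100, 100]], [[10, 10]])
```

Pixels that match the background model get a high $e_j$, and pixels that differ strongly get a very negative one, which is what allows the likelihood to penalize unexplained foreground.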
27
Computing MAP by efficient MCMC Computing the MAP is an optimization problem. Optimization is challenging: –With an unknown number of objects, the solution space contains subspaces of varying dimension. –The state includes both discrete and continuous variables. We adapt a data-driven Markov chain Monte Carlo (MCMC) approach to explore this complex solution space.
28
Computing MAP by efficient MCMC We use an MCMC method with jump/diffusion dynamics to sample the posterior probability. –Jumps: cause the Markov chain to move between subspaces of different dimensions and to traverse the discrete variables. –Diffusions: make the Markov chain sample the continuous variables. During sampling, the best solution is recorded, and the uncertainty associated with the solution is also obtained.
29
Computing MAP by efficient MCMC
30
MCMC method: –We want to design a Markov chain with stationary distribution $P(\theta)$. –At the $g$th iteration, we sample a candidate state $\theta'$ from a proposal distribution $q(\theta_g \mid \theta_{g-1})$. –If the candidate state $\theta'$ is accepted, $\theta_g = \theta'$. –Otherwise, $\theta_g = \theta_{g-1}$.
31
Computing MAP by efficient MCMC A Markov chain constructed in this way has its stationary distribution equal to $P(\theta)$, independent of the choice of the proposal probability $q(\cdot)$ and the initial state $\theta_0$. The choice of the proposal probability $q(\cdot)$ can affect the efficiency of MCMC significantly. Using more informed proposal probabilities, for example, as in data-driven MCMC, makes the Markov chain traverse the solution space more efficiently. Therefore, the proposal distribution is written as $q(\theta_g \mid \theta_{g-1}, I)$.
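The accept/reject loop described above can be sketched with the standard Metropolis-Hastings rule, which accepts a candidate with probability min(1, P(candidate) q(current | candidate) / (P(current) q(candidate | current))). The slides do not show the acceptance formula, so using this rule is an assumption. The toy target (ten states on a ring), the function names, and the iteration count are also invented for the example; the loop records the best state seen, as the slides describe.

```python
import random


def metropolis_hastings(target, propose, proposal_prob, initial, iterations, rng):
    """Sample from an unnormalized target and return the best state visited.

    target(s): unnormalized probability of state s (must be > 0 here).
    propose(s, rng): draws a candidate state given the current state s.
    proposal_prob(new, old): the proposal probability q(new | old).
    """
    current = initial
    best = current
    for _ in range(iterations):
        candidate = propose(current, rng)
        ratio = (target(candidate) * proposal_prob(current, candidate)) / (
            target(current) * proposal_prob(candidate, current))
        if rng.random() < min(1.0, ratio):
            current = candidate  # accepted: move to the candidate
        # otherwise the chain stays at the current state
        if target(current) > target(best):
            best = current  # keep track of the MAP estimate so far
    return best


# Toy target: unnormalized weights over states 0..9 arranged on a ring.
WEIGHTS = [1, 2, 3, 4, 5, 6, 8, 10, 6, 3]


def toy_target(s):
    return WEIGHTS[s]


def toy_propose(s, rng):
    # Symmetric random walk: step to the left or right neighbor.
    return (s + rng.choice((-1, 1))) % len(WEIGHTS)


def toy_q(new, old):
    return 0.5  # each neighbor is proposed with probability 1/2


best_state = metropolis_hastings(
    toy_target, toy_propose, toy_q, 0, 5000, random.Random(0))
```

With a symmetric proposal the two `proposal_prob` factors cancel; they are kept explicit because data-driven proposals are generally not symmetric.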
32
Markov chain dynamic The dynamics correspond to a proposal distribution with a mixture density, $q(\theta' \mid \theta_{g-1}, I) = \sum_{a \in A} p_a \, q_a(\theta' \mid \theta_{g-1}, I)$, where $A$ = {add, remove, establish, break, exchange, diff} is the set of all dynamics. We assume that we have the sample $\theta_{g-1}$ from the $(g-1)$th iteration and now propose a candidate $\theta'$ for the $g$th iteration.
33
Markov chain dynamic Dynamics: –object hypothesis addition: sample the parameters of a new human hypothesis $(k_{n+1}, m_{n+1})$ and add it to $\theta_{g-1}$. –object hypothesis removal –establish correspondence
34
Markov chain dynamic –break correspondence –exchange identity –parameter update (diffusion)
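Drawing from a mixture of dynamics amounts to first picking a dynamic and then applying it. The sketch below illustrates that two-step structure only: the mixture weights, the function names, and the state layout (a list of `(identity, params)` pairs) are assumptions, only the add and remove jumps are shown, and the proposal probabilities needed for a correct acceptance test are omitted.

```python
import random

DYNAMICS = ["add", "remove", "establish", "break", "exchange", "diff"]


def sample_dynamic(weights, rng):
    """Draw one dynamic with probability proportional to its weight."""
    return rng.choices(DYNAMICS, weights=[weights[a] for a in DYNAMICS], k=1)[0]


def apply_jump(state, dynamic, rng, new_hypothesis=None):
    """Return a candidate state; the input state is left unmodified.

    state: list of (identity, params) pairs.
    Only 'add' and 'remove' are sketched; other dynamics return a copy.
    """
    if dynamic == "add" and new_hypothesis is not None:
        return state + [new_hypothesis]
    if dynamic == "remove" and state:
        drop = rng.randrange(len(state))
        return state[:drop] + state[drop + 1:]
    return list(state)


rng = random.Random(1)
uniform_weights = {a: 1.0 for a in DYNAMICS}  # hypothetical mixture weights
chosen = sample_dynamic(uniform_weights, rng)

current_state = [(1, {"h": 1.7})]
grown = apply_jump(current_state, "add", rng, (2, {"h": 1.8}))
shrunk = apply_jump(current_state, "remove", rng)
```

Returning a new list rather than editing in place keeps the previous sample intact, so the chain can fall back to it when the candidate is rejected.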
35
Experimental results Evaluation on an outdoor scene
37
Experimental results –There are 20 occlusion events overall, nine of which are heavy occlusions. –We use 500 iterations per frame. –Trajectory-based errors: the trajectories of three objects are each broken once (ID 28 -> ID 35, ID 31 -> ID 32, ID 30 -> ID 41). –Trajectory initialization: some trajectories start when the objects are only partially inside the scene. The initialization of only three objects (objects 31, 50, and 52) is noticeably delayed; partial occlusion and/or a lack of contrast with the background cause the delays. –The detection rate and the false-alarm rate are 98.13 percent and 0.27 percent, respectively.
38
Conclusion A principled approach to simultaneously detect and track humans in a crowded scene. We formulate the problem as a Bayesian MAP estimation problem. The inference is performed by an MCMC-based approach that explores the joint solution space. The success lies in the integration of the top-down Bayesian formulation, which follows the image formation process, with the bottom-up features that are extracted directly from images.