Published by Theodore Briggs. Modified over 9 years ago.
1
Presented by 陳群元
2
Introduction; Previous work; Predicting motion patterns; Spatio-temporal transition distribution; Discerning pedestrians; Experimental results; Conclusion
3
Tracking individuals in extremely crowded scenes is a challenging task. We predict the local spatio-temporal motion patterns that describe pedestrian movement at each space-time location in the video, and we robustly model each individual's unique motion and appearance to discern them from surrounding pedestrians.
4
Previous work tracks features and associates similar trajectories to detect individual moving entities within crowded scenes. We instead encode many possible motions in an HMM and derive a full distribution of the motion at each spatio-temporal location in the video.
5
Introduction; Previous work; Predicting motion patterns; Spatio-temporal transition distribution; Discerning pedestrians; Experimental results; Conclusion
7
An example: a 3-state Markov chain λ
o State 1 generates symbol A only, State 2 generates symbol B only, and State 3 generates symbol C only.
o Given a sequence of observed symbols O = {C, A, B, B, C, A, B, C}, the only corresponding state sequence is {S_3, S_1, S_2, S_2, S_3, S_1, S_2, S_3}, and the corresponding probability is
$$P(O \mid \lambda) = P(q_0 = S_3)\,P(S_1 \mid S_3)\,P(S_2 \mid S_1)\,P(S_2 \mid S_2)\,P(S_3 \mid S_2)\,P(S_1 \mid S_3)\,P(S_2 \mid S_1)\,P(S_3 \mid S_2)$$
$$= 0.1 \times 0.3 \times 0.3 \times 0.7 \times 0.2 \times 0.3 \times 0.3 \times 0.2 = 0.00002268$$
[Figure: state-transition diagram of the three states s1, s2, s3, emitting A, B, C, with transition probabilities on the arcs]
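The product above can be checked with a few lines of Python. This is a minimal sketch: it lists only the initial and transition probabilities that appear in the worked product, and the names `chain_probability` and `symbol_to_state` are illustrative, not part of the slides.

```python
# Only the probabilities used in the worked product are listed;
# the other entries of the transition matrix are not needed for this sequence.
initial = {"S3": 0.1}                      # P(q_0 = S3)
transition = {
    ("S3", "S1"): 0.3,                     # P(S1 | S3)
    ("S1", "S2"): 0.3,                     # P(S2 | S1)
    ("S2", "S2"): 0.7,                     # P(S2 | S2)
    ("S2", "S3"): 0.2,                     # P(S3 | S2)
}
# Each state emits exactly one symbol, so the state sequence is fully determined.
symbol_to_state = {"A": "S1", "B": "S2", "C": "S3"}


def chain_probability(observations):
    """Return the unique state sequence and its probability."""
    states = [symbol_to_state[o] for o in observations]
    prob = initial[states[0]]
    for prev, cur in zip(states, states[1:]):
        prob *= transition[(prev, cur)]
    return states, prob


states, prob = chain_probability("CABBCABC")
print(states, prob)
```

Because every symbol identifies its state, no summation over hidden paths is needed here; that changes once states can emit several symbols.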
8
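An example: a 3-state discrete HMM λ
o Given a sequence of observations O = {A, B, C}, there are 27 possible corresponding state sequences $Q_i$, and therefore the corresponding probability is
$$P(O \mid \lambda) = \sum_{i=1}^{27} P(O, Q_i \mid \lambda) = \sum_{i=1}^{27} P(O \mid Q_i, \lambda)\,P(Q_i \mid \lambda)$$
[Figure: state-transition diagram of the three states s1, s2, s3 with transition probabilities on the arcs and emission distributions {A: 0.3, B: 0.2, C: 0.5}, {A: 0.7, B: 0.1, C: 0.2}, {A: 0.3, B: 0.6, C: 0.1}]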
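To make the count of 27 state sequences concrete, the sketch below sums over all of them and compares the result with the forward recursion. The three emission tables are the ones on the slide, but their assignment to states in listing order, the initial distribution `pi`, and the transition matrix `A` are assumptions chosen for illustration.

```python
from itertools import product

# Assumed parameters (illustrative): initial distribution and transition matrix.
pi = [0.4, 0.5, 0.1]
A = [[0.6, 0.3, 0.1],
     [0.1, 0.7, 0.2],
     [0.3, 0.2, 0.5]]
# Emission tables from the slide, assigned to states in listing order (assumption).
B = [{"A": 0.3, "B": 0.2, "C": 0.5},
     {"A": 0.7, "B": 0.1, "C": 0.2},
     {"A": 0.3, "B": 0.6, "C": 0.1}]


def brute_force(obs):
    """Sum P(O, Q | lambda) over every state sequence Q."""
    n = len(pi)
    total, count = 0.0, 0
    for seq in product(range(n), repeat=len(obs)):
        p = pi[seq[0]] * B[seq[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[seq[t - 1]][seq[t]] * B[seq[t]][obs[t]]
        total += p
        count += 1
    return total, count


def forward(obs):
    """Forward algorithm: same probability without enumerating sequences."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


total, count = brute_force("ABC")
fwd = forward("ABC")
print(count, total, fwd)
```

The enumeration grows as 3^T with the sequence length T, which is why the forward recursion is the usual choice in practice.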
10
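$$f'(\mathrm{Pos}) = \frac{\bigl(f(\mathrm{Pos}+1) - f(\mathrm{Pos})\bigr) + \bigl(f(\mathrm{Pos}) - f(\mathrm{Pos}-1)\bigr)}{2} = \frac{f(\mathrm{Pos}+1) - f(\mathrm{Pos}-1)}{2}$$
For each pixel i in the cuboid, the spatio-temporal gradient $\nabla I_i = [\partial I/\partial x,\ \partial I/\partial y,\ \partial I/\partial t]^T$ is computed this way, where I is the intensity.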
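A small sketch of this central difference, applied along x, y, and t of a toy cuboid. The function names and the `cuboid[t][y][x]` indexing are assumptions made for the example; only interior pixels are handled.

```python
def central_difference(f, pos):
    """Average of the forward and backward differences at an interior index."""
    forward = f[pos + 1] - f[pos]
    backward = f[pos] - f[pos - 1]
    return (forward + backward) / 2        # equals (f[pos+1] - f[pos-1]) / 2


def gradient_at(cuboid, x, y, t):
    """Spatio-temporal gradient [I_x, I_y, I_t] at an interior pixel.

    The cuboid is indexed as cuboid[t][y][x] (an assumption of this sketch).
    """
    ix = (cuboid[t][y][x + 1] - cuboid[t][y][x - 1]) / 2
    iy = (cuboid[t][y + 1][x] - cuboid[t][y - 1][x]) / 2
    it = (cuboid[t + 1][y][x] - cuboid[t - 1][y][x]) / 2
    return [ix, iy, it]


# Toy cuboid whose intensity is I = 2x + 3y + 5t, so the gradient is constant.
cuboid = [[[2 * x + 3 * y + 5 * t for x in range(4)]
           for y in range(4)]
          for t in range(4)]
samples = [0, 1, 4, 9, 16]                 # f(pos) = pos^2

print(central_difference(samples, 2))
print(gradient_at(cuboid, 1, 1, 1))
```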
11
The local spatio-temporal motion pattern is represented by a 3D Gaussian of the spatio-temporal gradients.
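Fitting such a Gaussian amounts to computing the mean and covariance of the gradient vectors in a cuboid. The sketch below assumes the maximum-likelihood (1/N) normalization and a hypothetical helper name `fit_gaussian`; the slides do not state which normalization is used.

```python
def fit_gaussian(gradients):
    """Mean vector and 3x3 covariance of a list of [I_x, I_y, I_t] gradients.

    Uses 1/N normalization (an assumption of this sketch).
    """
    n = len(gradients)
    mean = [sum(g[k] for g in gradients) / n for k in range(3)]
    cov = [[sum((g[a] - mean[a]) * (g[b] - mean[b]) for g in gradients) / n
            for b in range(3)]
           for a in range(3)]
    return mean, cov


# Toy gradients standing in for the pixels of one cuboid.
gradients = [[1.0, 0.0, -1.0],
             [3.0, 0.0, -3.0],
             [2.0, 1.0, -2.0],
             [2.0, -1.0, -2.0]]
mean, cov = fit_gaussian(gradients)
print(mean)
print(cov)
```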
12
The hidden states of the HMM are represented by a set of motion patterns. The emission model gives the probability of an observed motion pattern given a hidden state s.
13
Kullback–Leibler divergence is a non-symmetric measure of the difference between two probability distributions P and Q.
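The asymmetry is easy to see numerically. The sketch below uses two small discrete distributions for brevity; the motion patterns in these slides are Gaussians, so this is a simplification, and `kl_divergence` is an illustrative helper name.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q) for discrete distributions given as equal-length lists.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.5]
q = [0.9, 0.1]
d_pq = kl_divergence(p, q)   # divergence of Q from P
d_qp = kl_divergence(q, p)   # divergence of P from Q: a different number
print(d_pq, d_qp)
```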
14
After training a collection of HMMs on a video of typical crowd motion, we predict the motion pattern at each space-time location that contains the tracked subject. The prediction sums over S, the set of hidden states, with a weight w(s) for each state s.
15
Reference: A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition.
16
The predicted motion pattern is a weighted sum of the 3D Gaussian distributions associated with the HMM's hidden states.
17
The centroid we are interested in is a multivariate normal density c that minimizes the total distortion.
Reference: On Divergence Based Clustering of Normal Distributions and Its Application to HMM Adaptation
18
The centroid is computed from $\mu_s$ and $\Sigma_s$, the mean and covariance of each hidden state s.
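One way to picture such a centroid: if the distortion is taken to be the weighted Kullback–Leibler divergence from each state's Gaussian to the centroid (an assumption here; the cited reference may define it differently), the minimizing Gaussian matches the first two moments of the weighted mixture. The sketch below shows this in one dimension for brevity, whereas the slides work with 3D Gaussians; `gaussian_centroid` is a hypothetical name.

```python
def gaussian_centroid(weights, means, variances):
    """Moment-matched 1D Gaussian for a weighted set of 1D Gaussians.

    Assumes the distortion is sum_s w(s) * KL(N_s || c); under that
    assumption the minimizer has the mixture's mean and variance.
    """
    total = sum(weights)
    w = [x / total for x in weights]           # normalize the weights
    mu_c = sum(wi * mi for wi, mi in zip(w, means))
    var_c = sum(wi * (vi + (mi - mu_c) ** 2)   # within + between variance
                for wi, mi, vi in zip(w, means, variances))
    return mu_c, var_c


mu_c, var_c = gaussian_centroid([0.5, 0.5], [0.0, 2.0], [1.0, 1.0])
print(mu_c, var_c)
```

The centroid's variance exceeds each state's variance because it also absorbs the spread between the state means.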
19
Introduction; Previous work; Predicting motion patterns; Spatio-temporal transition distribution; Discerning pedestrians; Experimental results; Conclusion
21
We use the gradient information to estimate the optical flow within each specific sub-volume and track the target in a Bayesian framework. Bayesian tracking can be formulated as maximizing the posterior distribution of the state $x_t$ of the target at time t given the available measurements $z_{1:t} = \{z_i;\ i = 1, \dots, t\}$:
$$p(x_t \mid z_{1:t}) \propto p(z_t \mid x_t) \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid z_{1:t-1})\, dx_{t-1}$$
where $z_t$ is the image at time t, $p(x_t \mid x_{t-1})$ is the transition distribution, and $p(z_t \mid x_t)$ is the likelihood. The state vector $x_t$ consists of the width, height, and 2D location of the target within the image.
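The predict-then-update structure of Bayesian tracking can be sketched on a tiny discrete state space, where the integral over the previous state becomes a sum. This is a toy illustration with made-up numbers, not the tracker from the slides; `bayes_update` and the three-cell state space are assumptions.

```python
def bayes_update(prior, transition, likelihood):
    """One step of a discrete Bayes filter.

    prior[i]         : p(x_{t-1} = i | z_{1:t-1})
    transition[i][j] : p(x_t = j | x_{t-1} = i)
    likelihood[j]    : p(z_t | x_t = j)
    """
    n = len(prior)
    # Predict: push the previous posterior through the transition model.
    predicted = [sum(transition[i][j] * prior[i] for i in range(n))
                 for j in range(n)]
    # Update: weight by the likelihood of the new measurement, then normalize.
    unnorm = [likelihood[j] * predicted[j] for j in range(n)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


prior = [1.0, 0.0, 0.0]                    # target known to be in cell 0
transition = [[0.5, 0.5, 0.0],             # stay, or move one cell right
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]]
likelihood = [0.1, 0.8, 0.1]               # measurement favors cell 1
posterior = bayes_update(prior, transition, likelihood)
map_state = max(range(len(posterior)), key=posterior.__getitem__)
print(posterior, map_state)
```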
22
We focus on the target's movement between frames and use a second-degree autoregressive model for the transition distribution of the target's width and height. Ideally, the state transition distribution $p(x_t \mid x_{t-1})$ directly reflects the two-dimensional motion of the target between frames t-1 and t; it is parameterized by the 2D optical flow vector and a covariance matrix.
23
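Assuming the movement to be small, the image constraint at I(x, y, t) can be expanded with a Taylor series to get
$$I(x + \Delta x,\ y + \Delta y,\ t + \Delta t) = I(x, y, t) + \frac{\partial I}{\partial x}\Delta x + \frac{\partial I}{\partial y}\Delta y + \frac{\partial I}{\partial t}\Delta t + \text{H.O.T.}$$
where H.O.T. denotes the higher-order terms.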
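The step from this expansion to a constraint on the flow can be written out as follows. The derivation assumes brightness constancy (a moving point keeps its intensity) and drops the higher-order terms; the symbols u and v for the velocity components are introduced here for illustration.

```latex
\begin{align}
I(x + \Delta x,\ y + \Delta y,\ t + \Delta t) &= I(x, y, t)
  && \text{(brightness constancy, assumed)} \\
\frac{\partial I}{\partial x}\Delta x + \frac{\partial I}{\partial y}\Delta y
  + \frac{\partial I}{\partial t}\Delta t &\approx 0
  && \text{(first-order terms only)} \\
I_x u + I_y v + I_t &= 0,
  \qquad u = \frac{\Delta x}{\Delta t},\quad v = \frac{\Delta y}{\Delta t}
  && \text{(divide by } \Delta t \text{)}
\end{align}
```

A single pixel gives one equation in two unknowns, which is why the gradients of a whole neighborhood are pooled before solving for the flow.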
24
The predicted motion pattern is defined by a mean gradient vector and a covariance matrix. The motion information encoded in the spatio-temporal gradients can be expressed in the form of the structure tensor matrix. The optical flow can then be estimated from the structure tensor by solving for $w = [u, v, z]^T$, the 3D optical flow.
25
The 2D optical flow is $v' = [u/z,\ v/z]^T$.
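A minimal sketch of recovering the 2D flow from a structure tensor. It assumes the tensor G is the average outer product of the gradient vectors (which equals the covariance plus the outer product of the mean under 1/N normalization) and that the flow satisfies G w = 0. Instead of a null-space or eigenvector solve, it fixes z = 1 and solves the first two rows directly, which yields [u/z, v/z] when z is nonzero. The function names are hypothetical.

```python
def structure_tensor(gradients):
    """Average outer product of [I_x, I_y, I_t] gradient vectors (3x3)."""
    n = len(gradients)
    return [[sum(g[a] * g[b] for g in gradients) / n for b in range(3)]
            for a in range(3)]


def flow_2d(G):
    """Solve the first two rows of G w = 0 with z fixed to 1.

    Returns [u/z, v/z]. This shortcut replaces a full null-space solve.
    """
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    if abs(det) < 1e-12:
        raise ValueError("gradients do not constrain the flow (aperture problem)")
    # Cramer's rule for  Gxx*u + Gxy*v = -Gxt  and  Gxy*u + Gyy*v = -Gyt.
    u = (-G[0][2] * G[1][1] + G[0][1] * G[1][2]) / det
    v = (-G[0][0] * G[1][2] + G[1][0] * G[0][2]) / det
    return [u, v]


# Synthetic gradients consistent with a true flow of (2, -1):
# each satisfies I_x * 2 + I_y * (-1) + I_t = 0.
gradients = [[1.0, 0.0, -2.0],
             [0.0, 1.0, 1.0],
             [1.0, 1.0, -1.0],
             [2.0, -1.0, -5.0]]
G = structure_tensor(gradients)
flow = flow_2d(G)
print(flow)
```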
28
Introduction; Previous work; Predicting motion patterns; Spatio-temporal transition distribution; Discerning pedestrians; Experimental results; Conclusion
29
Typical models of the likelihood distribution $p(z_t \mid x_t)$ depend on a variance, a distance measure, and a normalization term Z. The distance measure is the difference between a region R (defined by state $x_t$) of the observed image $z_t$ and the template. We assume pedestrians exhibit consistency in their appearance and their motion, and model them in a joint likelihood built from $p_A$ and $p_M$, the appearance and motion likelihoods.
30
After tracking in frame t, we update each pixel i in the motion template. The update combines the motion template at time t with the region of spatio-temporal gradients defined by the tracking result (i.e., the expected value of the posterior), controlled by a learning rate.
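A common form for such a per-pixel update is an exponential moving average, sketched below. The exact update rule is not preserved in these slides, so the blending formula is an assumption; the default learning rate of 0.05 is the value reported in the experimental setup, and `update_template` is a hypothetical name.

```python
def update_template(template, observed, learning_rate=0.05):
    """Blend each pixel's gradient vector toward the newly tracked region.

    template, observed: lists of per-pixel [I_x, I_y, I_t] vectors.
    Assumed rule: new = (1 - rate) * old + rate * observed.
    """
    return [[(1 - learning_rate) * t + learning_rate * r
             for t, r in zip(t_pixel, r_pixel)]
            for t_pixel, r_pixel in zip(template, observed)]


template = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
observed = [[1.0, 2.0, -2.0], [1.0, 1.0, 1.0]]
updated = update_template(template, observed)
print(updated)
```

With a small learning rate the template changes slowly, so a brief occlusion of the target does not overwrite its learned motion.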
31
The error at pixel i and time t is computed from $t_i$ and $r_i$, the normalized gradient vectors of the motion template and the tracking result at time t. To reduce the contributions of frequently changing pixels to the computation of the motion likelihood, we weight each pixel in the likelihood's distance measure, using a normalization term Z.
32
The distance measure of the motion likelihood distribution then incorporates these per-pixel weights.
33
Introduction; Previous work; Predicting motion patterns; Spatio-temporal transition distribution; Discerning pedestrians; Experimental results; Conclusion
34
The training video for the concourse scene contains 300 frames (about 10 seconds of video); the video for the ticket gate scene contains 350 frames. We set the cuboid size to 10 × 10 × 10 for both scenes. The learning rate, appearance variance, and motion variance are all 0.05.
39
Introduction; Previous work; Predicting motion patterns; Spatio-temporal transition distribution; Discerning pedestrians; Experimental results; Conclusion
40
In this paper, we derived a novel probabilistic method that exploits the inherent spatially and temporally varying structured pattern of a crowd’s motion to track individuals in extremely crowded scenes.
41
The end. Thank you.