Context-dependent Detection of Unusual Events in Videos by Geometric Analysis of Video Trajectories Longin Jan Latecki Computer and Information Sciences Temple University, Philadelphia Nilesh Ghubade and Xiangdong Wen
Agenda: Introduction; Mapping of video to a trajectory; Relation between motion trajectory and video trajectory; Discrete curve evolution; Polygon simplification; Key frames; Unusual events in surveillance videos; Results
Main Tools Mapping the video sequence to a polyline in a multi-dimensional space. The automatic extraction of relevant frames from videos is based on polygon simplification by discrete curve evolution.
Mapping of video to a trajectory Mapping of the image stream to a trajectory (polyline) in a feature space. Each frame (Frame 0, ..., Frame N) is represented by its bins (Bin 0, ..., Bin n), with three values per bin: X-coord of the bin's centroid, Y-coord of the bin's centroid, and the bin's frequency count.
Used in our experiments: Red-Green-Blue (RGB) bins. Each frame is a 24-bit color image (8 bits per color intensity). Bin 0 = color intensities from 0-31; Bin 1 = color intensities from 32-63; ...; Bin 8 = color intensities from … Three attributes per bin: row of the bin's centroid, column of the bin's centroid, and frequency count of the bin. (8 bins per color level * 3 attributes per bin) * 3 color levels = 72 features.
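The mapping from a frame to its 72 features can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names `frame_features` and `video_trajectory` are hypothetical, NumPy is assumed, bins are assumed to be 32 intensity levels wide, and an empty bin is assumed to get centroid (0, 0).

```python
import numpy as np

def frame_features(frame, bins_per_channel=8):
    """Map an H x W x 3 uint8 frame to a feature vector with
    (row centroid, column centroid, frequency count) per bin and channel."""
    frame = np.asarray(frame)
    height, width, channels = frame.shape
    bin_width = 256 // bins_per_channel           # 32 intensity levels per bin
    rows, cols = np.mgrid[0:height, 0:width]      # pixel coordinates
    features = []
    for ch in range(channels):
        bin_index = frame[:, :, ch].astype(np.int64) // bin_width
        for b in range(bins_per_channel):
            mask = bin_index == b
            count = int(mask.sum())
            if count > 0:
                row_c = rows[mask].mean()
                col_c = cols[mask].mean()
            else:
                row_c = col_c = 0.0               # assumption: empty bin -> (0, 0)
            features.extend([row_c, col_c, count])
    return np.array(features, dtype=float)

def video_trajectory(frames):
    """Stack per-frame feature vectors; row t is vertex t of the polyline."""
    return np.stack([frame_features(f) for f in frames])
```

For an 8-bin RGB setup each frame yields a vector of length 72, and a video of N+1 frames yields an (N+1) x 72 array whose rows are the polyline vertices.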
Theoretical Results: Motion trajectory and video trajectory. Consider a video in which an object (a set of pixels) is moving on a uniform background. The object is visible in all frames and it is moving with a constant speed on a linear trajectory. Then the video trajectory in the feature space is a straight line. If n objects are moving with constant speeds on linear trajectories, then the trajectory is a straight line in the feature space.
Consider a video in which an object (a set of pixels) is moving on a uniform background. Then the trajectory vectors are contained in a plane. If n objects are moving, then the dimension of the trajectory is at most 2n. If a new object suddenly appears in the movie, the dimension of the trajectory increases by at least 1 and at most 3.
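The single-object claims can be checked numerically on a synthetic video. The sketch below is illustrative only and makes simplifying assumptions: frames are binary (background 0, object 1), so there are just two bins, and all helper names are hypothetical. The dimension of the trajectory is estimated as the numerical rank of the mean-centered feature matrix.

```python
import numpy as np

def two_bin_features(frame):
    """(row centroid, column centroid, count) for the background bin
    (value 0) and the object bin (value 1) of a binary frame."""
    rows, cols = np.indices(frame.shape)
    feats = []
    for value in (0, 1):
        mask = frame == value
        feats += [rows[mask].mean(), cols[mask].mean(), mask.sum()]
    return np.array(feats, dtype=float)

def moving_dot_video(positions, size=32, dot=3):
    """One frame per (row, col) position of a dot x dot square object."""
    frames = []
    for r, c in positions:
        frame = np.zeros((size, size), dtype=np.uint8)
        frame[r:r + dot, c:c + dot] = 1
        frames.append(frame)
    return frames

def trajectory_dimension(frames, tol=1e-8):
    """Numerical rank of the mean-centered video trajectory."""
    traj = np.stack([two_bin_features(f) for f in frames])
    centered = traj - traj.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return int((s > tol * max(s.max(), 1.0)).sum())

# Constant speed on a line: the trajectory should be a straight line.
linear = moving_dot_video([(2 + t, 3 + 2 * t) for t in range(10)])
# Curved motion of one object: the trajectory should stay within a plane.
curved = moving_dot_video([(2 + t, 3 + t * t) for t in range(5)])
print(trajectory_dimension(linear), trajectory_dimension(curved))
```

Under these assumptions the linear motion gives dimension 1 and the curved motion gives dimension 2, because the background centroid is an affine function of the object centroid and the counts stay constant.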
MovingDotMovieWithAdditionalDot.avi
Robust Rank Computation. Using singular value decomposition, based on: C. Rao, A. Yilmaz, and M. Shah, View-Invariant Representation and Recognition of Actions, Int. J. of Computer Vision 50; M. Seitz and C. R. Dyer, View-invariant analysis of cyclic motion, Int. J. of Computer Vision 16. We compute err in a window of 11 consecutive frames in our experiments.
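A windowed rank error based on singular values could look like the sketch below. The slide does not define err, so the formula here is an assumption for illustration only: the share of the window's singular-value energy that is not explained by the first `rank` singular values. The name `window_rank_error` and the `rank` parameter are hypothetical; only the window length of 11 frames comes from the slide.

```python
import numpy as np

def window_rank_error(trajectory, window=11, rank=1):
    """For each window of consecutive frames, measure how far the
    mean-centered feature vectors are from a rank-`rank` subspace.
    Assumed definition: residual singular-value energy / total energy."""
    trajectory = np.asarray(trajectory, dtype=float)
    n_windows = max(trajectory.shape[0] - window + 1, 0)
    errors = np.zeros(n_windows)
    for start in range(n_windows):
        block = trajectory[start:start + window]
        centered = block - block.mean(axis=0)
        s = np.linalg.svd(centered, compute_uv=False)
        total = np.sqrt((s ** 2).sum())
        residual = np.sqrt((s[rank:] ** 2).sum())
        errors[start] = residual / total if total > 0 else 0.0
    return errors
```

With this definition a window that lies on a straight segment of the trajectory scores close to 0, and a window that straddles a change of direction scores higher.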
MovingDotMovieWithAdditionalDot.avi
Interpolation of video trajectory MovingDotMovie_Clockwise.avi
MovingDotMovieWithAdditionalDot.avi
Polygon simplification. Table columns: Relevance Ranking, Frame Number (frames with decreasing relevance).
Discrete Curve Evolution: P = P_0, ..., P_m, where P_{i+1} is obtained from P_i by deleting the vertices of P_i that have minimal relevance measure K(v, P_i) = K(u, v, w) = |d(u, v) + d(v, w) - d(u, w)|, where u and w are the neighbors of v in P_i.
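The evolution step can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' code: d is taken to be the Euclidean distance, the polyline is open with both endpoints always kept, and one vertex is deleted per step (the first one on ties) rather than all vertices of minimal relevance. The function names are hypothetical.

```python
import numpy as np

def relevance(u, v, w):
    """K(u, v, w) = |d(u, v) + d(v, w) - d(u, w)| with Euclidean d."""
    u, v, w = (np.asarray(p, dtype=float) for p in (u, v, w))
    return abs(np.linalg.norm(u - v) + np.linalg.norm(v - w)
               - np.linalg.norm(u - w))

def discrete_curve_evolution(points, n_keep):
    """Repeatedly delete the interior vertex of minimal relevance.
    Returns (indices kept, indices in order of removal)."""
    points = np.asarray(points, dtype=float)
    kept = list(range(len(points)))
    removed = []
    while len(kept) > max(n_keep, 2):
        scores = [relevance(points[kept[i - 1]], points[kept[i]],
                            points[kept[i + 1]])
                  for i in range(1, len(kept) - 1)]
        i_min = int(np.argmin(scores)) + 1    # +1: scores skip the first vertex
        removed.append(kept.pop(i_min))
    return kept, removed

polyline = [(0, 0), (1, 0.01), (2, 0), (3, 2), (4, 0), (5, 0.01), (6, 0)]
kept, removed = discrete_curve_evolution(polyline, n_keep=3)
print(kept)
```

Applied to a video trajectory, the kept indices are candidate key frames, and the reverse of the removal order ranks the remaining frames by decreasing relevance.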
Discrete Curve Evolution: Preservation of position, no blurring
Discrete Curve Evolution: robustness with respect to noise
Discrete Curve Evolution: extraction of linear segments
Key Frame Extraction
Key frames and rank: Security1 (Bins Matrix, Distance Matrix)
err for security1 video
M. S. Drew and J. Au:
Predictability of video parts: Local Curveness computation. We divide the video polygonal curve P into parts T_i. For videos with 25 fps, T_i contains 25 frames. We apply discrete curve evolution to each T_i until three points remain: a, b, c. Curveness measure of T_i: C(T_i, P) = |d(a, b) + d(b, c) - d(a, c)|. b is the most relevant frame in T_i and the first vertex of T_{i+1}.
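The local curveness computation can be sketched as follows. It is a sketch under assumptions, not the authors' implementation: d is Euclidean, the endpoints of each part are kept during the evolution so that a and c are the first and last frame of T_i, ties are broken by removing the earliest vertex, and a trailing stretch shorter than a full part is ignored. The names `reduce_to_three` and `local_curveness` are hypothetical.

```python
import numpy as np

def reduce_to_three(points):
    """Discrete curve evolution on an open polyline until only the two
    endpoints and one interior vertex remain; returns their indices."""
    kept = list(range(len(points)))
    dist = lambda i, j: float(np.linalg.norm(points[i] - points[j]))
    while len(kept) > 3:
        scores = [abs(dist(kept[k - 1], kept[k]) + dist(kept[k], kept[k + 1])
                      - dist(kept[k - 1], kept[k + 1]))
                  for k in range(1, len(kept) - 1)]
        kept.pop(int(np.argmin(scores)) + 1)
    return kept

def local_curveness(trajectory, part_len=25):
    """Return (start frame, frame of b, C(T_i, P)) for each part T_i."""
    trajectory = np.asarray(trajectory, dtype=float)
    results = []
    start = 0
    while start + part_len <= len(trajectory):
        part = trajectory[start:start + part_len]
        ia, ib, ic = reduce_to_three(part)
        a, b, c = part[ia], part[ib], part[ic]
        value = abs(np.linalg.norm(a - b) + np.linalg.norm(b - c)
                    - np.linalg.norm(a - c))
        results.append((start, start + ib, float(value)))
        start += ib        # the next part begins at b (ib >= 1, so we advance)
    return results
```

On a trajectory that is straight except for one sharp turn, the part containing the turn receives the largest curveness and its b lands on the turning frame, which is the behavior one would expect from a measure of unpredictable video parts.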
security7
err for security7
2D projection by PCA of video trajectory for security7
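A 2D PCA projection of a video trajectory can be computed with a singular value decomposition. The sketch below is illustrative; the function name `pca_project` is hypothetical, and `trajectory` is assumed to be an array with one row of features per frame.

```python
import numpy as np

def pca_project(trajectory, n_components=2):
    """Project the mean-centered trajectory onto its first principal axes."""
    trajectory = np.asarray(trajectory, dtype=float)
    centered = trajectory - trajectory.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

Plotting the two returned columns against each other, with consecutive frames connected, gives a 2D view of the polyline.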
Mov3
Mov3: Rustam waving his hand. Bins Matrix Key frames = Distance Matrix Key frames =
Hall_monitor
err for hall_monitor
Hall Monitor: 2 persons entering-exiting in a hall. Bins Matrix Key frames = Distance Matrix Key frames =
CameraAtLightSignal.avi
Multimodal Histogram: histogram of Lena
Segmented Image: image after segmentation; we get an outline of her face, hat, etc.
Gray Scale Image - Multimodal Original Image of Lena
Thank you