1
MULTI-TARGET TRACKING THROUGH OPPORTUNISTIC CAMERA CONTROL IN A RESOURCE CONSTRAINED MULTIMODAL SENSOR NETWORK
Jayanth Nayak 1, Luis Gonzalez-Argueta 2, Bi Song 2, Amit Roy-Chowdhury 2, Ertem Tuncel 2
Department of Electrical Engineering, University of California, Riverside
Bourns College of Engineering, Information Processing Laboratory, www.ipl.ee.ucr.edu
9/8/2008, ICDSC'08
2
Overview
Introduction
Problem Formulation
Audio and Video Processing
Camera Control Strategy
Computing Final Tracks of All Targets
Experimental Results
Conclusion
Acknowledgements
3
Motivation
Obtaining multi-resolution video from a highly active environment requires a large number of cameras.
Disadvantages: cost of buying, installing, and maintaining the cameras; bandwidth limitations; processing and storage; privacy.
Our goal: minimize the number of cameras through a control mechanism that directs the cameras' attention to the interesting parts of the scene.
4
Proposed Strategy
Audio sensors direct the pan/tilt/zoom of the camera to the location of the event.
Audio data intelligently turns on the camera, and video data turns off the camera.
Audio and video data are fused to obtain tracks of all targets in the scene.
5
Example Scenario
An example scenario where audio can be used to efficiently control two video cameras. There are four tracks that need to be inferred. Time instants of interest are indicated directly on the tracks, i.e., the initiation and end of each track, mergings, splittings, and crossovers. The mergings and crossovers are further emphasized by X. The two innermost tracks coincide over the entire time interval (t_2, t_3). The cameras C_1 and C_2 need to be panned, zoomed, and tilted as decided based on their own output and that of the audio sensors a_1, ..., a_M.
6
Relation To Previous Work
Previous work: fusion of simultaneous audio and video data. In contrast, our audio and video data are captured at disjoint time intervals.
Previous work: dense networks of vision sensors. In contrast, to cover a large field, we focus on controlling a reduced set of vision sensors.
Our video and audio data are analyzed from dynamic scenes.
7
Problem Formulation
Audio sensors A = {a_1, ..., a_M} are distributed across the ground plane R.
R is also observable from a set of controllable cameras C = {c_1, ..., c_L}. However, the entire region R may not be covered with one set of camera settings.
p-tracks: tracks belonging to targets.
a-tracks: tracks obtained by clustering audio.
Resolving p-track ambiguity: camera control, person matching.
8
Tracking System Overview
Overall camera control system. Audio sensors A = {a_1, ..., a_M} are distributed across regions R_i. The set of audio clusters is denoted by B_t, and K_t^- represents the set of confirmed a-tracks estimated from observations before time t. P/T/Z cameras are denoted by C = {c_1, ..., c_L}. Ground plane positions are denoted by O_t^k.
9
Processing Audio and Video
a-tracks are clusters of audio data that are above an amplitude threshold. They are tracked using a Kalman filter.
In video, people are detected using histograms of oriented gradients and tracked using an auxiliary particle filter.
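As a rough illustration of tracking an a-track with a Kalman filter, the sketch below runs a constant-velocity filter over one ground-plane coordinate of an audio cluster center. The slides do not give the motion model or noise settings, so the constant-velocity model, the values of `q` and `r`, and the name `kalman_step` are all assumptions made for this example.

```python
def kalman_step(x, P, z, dt=1.0, q=0.01, r=0.25):
    """One predict/update cycle of a 1-D constant-velocity Kalman filter.

    x: [position, velocity] estimate.
    P: 2x2 covariance as a list of lists.
    z: measured position of the audio cluster center.
    q, r: process and measurement noise variances (assumed values).
    """
    # Predict the state: x <- F x with F = [[1, dt], [0, 1]].
    pos = x[0] + dt * x[1]
    vel = x[1]
    # Predict the covariance: P <- F P F^T + Q, with Q = q * I (an assumption).
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q
    # Update with measurement model H = [1, 0] (position only).
    s = p00 + r                  # innovation variance
    k0, k1 = p00 / s, p10 / s    # Kalman gain
    y = z - pos                  # innovation
    x_new = [pos + k0 * y, vel + k1 * y]
    P_new = [[(1 - k0) * p00, (1 - k0) * p01],
             [p10 - k1 * p00, p11 - k1 * p01]]
    return x_new, P_new


def track(measurements):
    """Filter a sequence of position measurements, starting from rest."""
    x, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
    for z in measurements:
        x, P = kalman_step(x, P, z)
    return x
```

For a cluster moving at a steady 1 m per step, `track([1, 2, ..., 20])` should settle near position 20 and velocity 1. A real system would run one such filter per ground-plane axis (or a joint 4-D state) and would tune `q` and `r` to the sensors.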
10
Mapping From Image Plane to Ground Plane
Learned parameters are used to transform tracks from the image plane to the ground plane.
Estimate the projective transformation matrix H during a calibration phase.
Precompute H for each PTZ setting of each camera.
[Figure label: vanishing line]
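To make the image-to-ground mapping concrete, the sketch below applies a 3x3 projective transformation H to an image point in homogeneous coordinates and looks H up by camera and PTZ setting. The table layout, the key names, and the numeric entries are hypothetical; the slides only state that H is estimated during calibration and precomputed per PTZ setting.

```python
def image_to_ground(H, u, v):
    """Map image point (u, v) to ground-plane (x, y) using a 3x3 matrix H."""
    xh = H[0][0] * u + H[0][1] * v + H[0][2]
    yh = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        # The point maps to infinity (it lies on the vanishing line).
        raise ValueError("point has no finite ground-plane image")
    return xh / w, yh / w


# Hypothetical lookup table: one precomputed H per (camera, PTZ preset).
H_TABLE = {
    ("c1", "preset0"): [[0.01, 0.0, -3.0],
                        [0.0, 0.02, -2.0],
                        [0.0, 0.001, 1.0]],
}


def track_to_ground(camera, preset, image_track):
    """Transform a list of image-plane points into ground-plane points."""
    H = H_TABLE[(camera, preset)]
    return [image_to_ground(H, u, v) for (u, v) in image_track]
```

Precomputing the table means that a PTZ change only swaps which H is used. No re-estimation is needed at run time.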
11
Tracking System Overview
12
Camera Control
Goal: avoid ambiguity, or disambiguate, when tracks are created or deleted, intersect, or merge.
Set pan/tilt/zoom parameters.
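One simple way to picture the trigger for disambiguation is a proximity test on the current a-track positions: when two tracks come close enough that a merge or intersection is likely, video is requested. The sketch below is only an illustration. The distance threshold and the function name `pairs_needing_video` are hypothetical, and the slides do not say that the actual decision rule is a fixed threshold.

```python
from itertools import combinations


def pairs_needing_video(positions, threshold=1.0):
    """Return pairs of track ids whose ground-plane distance is below threshold.

    positions: dict mapping track id -> (x, y) in meters.
    threshold: assumed distance (meters) below which audio may confuse tracks.
    """
    close = []
    for (i, p), (j, q) in combinations(sorted(positions.items()), 2):
        distance = ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
        if distance < threshold:
            close.append((i, j))
    return close
```

For example, with tracks at (0, 0), (0.5, 0), and (5, 5), only the first two would be flagged. Track creation and deletion events would need their own tests, which are not sketched here.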
13
Setting Camera Parameters
Heuristic algorithm:
Cover the ground plane by regions R_i^l.
R_i^l is in the field of view of camera C_l (for given camera parameters).
The tracking algorithm specifies a point of interest x from the last known a-track.
If no camera is on, find the R_i^l containing x.
Reassign a camera and set its parameters if x approaches the boundary of the current R_i^l.
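The heuristic can be sketched as a region lookup with a boundary margin. The code below assumes, purely for illustration, that each region R_i^l is an axis-aligned rectangle on the ground plane keyed by a (camera, preset) pair, and that "approaching the boundary" means coming within a fixed `margin` of an edge. None of these details is specified on the slides.

```python
def contains(rect, p):
    """True if point p = (x, y) lies inside rect = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax


def boundary_distance(rect, p):
    """Distance from an interior point p to the nearest edge of rect."""
    xmin, ymin, xmax, ymax = rect
    return min(p[0] - xmin, xmax - p[0], p[1] - ymin, ymax - p[1])


def choose_region(regions, p, current=None, margin=0.5):
    """Pick the (camera, preset) key whose region should cover point p.

    regions: dict mapping (camera, preset) -> rectangle.
    current: key of the region in use, or None if no camera is on.
    Keeps the current region while p is inside it and farther than
    margin from its boundary; otherwise picks the containing region
    whose boundary is farthest from p. Returns None if nothing covers p.
    """
    if current is not None:
        rect = regions[current]
        if contains(rect, p) and boundary_distance(rect, p) > margin:
            return current
    candidates = [key for key, rect in regions.items() if contains(rect, p)]
    if not candidates:
        return None
    return max(candidates, key=lambda key: boundary_distance(regions[key], p))
```

Choosing the region whose boundary is farthest from x is one reasonable tie-break when regions overlap, since it delays the next reassignment.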
14
Camera Control Based on Track Trajectories
[Figure: six panels plotting Location (Meters) against Time (Seconds): Intersection, Separation, Merger, Sudden Appearance, Undetected Disappearance, Sudden Disappearance. Annotation: Switch to video.]
15
Creating Final Tracks Of All Targets
Bipartite graph matching over a set of color histograms.
We collect features as the target enters and exits the scene in video.
For every new a-track, features are collected from a small set of frames.
The weight of an edge is the distance between the observed video features.
Additionally, constraints from the audio data are enforced on the weights.
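The matching step can be illustrated with a small minimum-cost bipartite assignment between exit and entry nodes. In this sketch the edge weight is an L1 distance between normalized color histograms, and the audio constraint is modeled by giving infinite weight to pairs that the audio tracks rule out. Both choices, along with the names `hist_distance`, `match_tracks`, and `allowed`, are assumptions for the example. The slides do not state the distance measure or how audio modifies the weights.

```python
import math
from itertools import permutations


def hist_distance(h1, h2):
    """L1 distance between two normalized color histograms (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def match_tracks(exits, entries, allowed=None):
    """Minimum-cost one-to-one matching of exit nodes to entry nodes.

    exits, entries: dicts mapping node name -> histogram (same number of nodes).
    allowed: optional set of (exit, entry) pairs consistent with the audio
             tracks; any other pair gets infinite weight.
    Returns (matching dict, total cost), or (None, inf) if no feasible matching.
    """
    exit_names, entry_names = list(exits), list(entries)

    def weight(i, j):
        if allowed is not None and (i, j) not in allowed:
            return math.inf
        return hist_distance(exits[i], entries[j])

    best, best_cost = None, math.inf
    # Brute force over all assignments; fine for a handful of nodes.
    for perm in permutations(entry_names):
        cost = sum(weight(i, j) for i, j in zip(exit_names, perm))
        if cost < best_cost:
            best, best_cost = dict(zip(exit_names, perm)), cost
    return best, best_cost
```

Enumerating permutations keeps the example dependency-free but grows factorially. For more than a few nodes, a polynomial-time assignment solver such as the Hungarian algorithm would be the usual choice.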
16
Creating Final Tracks Using Bipartite Matching
[Figure: two plots of Location (Meters) against Time (Seconds), titled "Tracking in Audio and Video" and "Tracking in Audio Only", with audio and video segments marked and nodes labeled [a+, a-], [b+, b-], [c+], [c-], [e+, e-]; and two bipartite graphs over nodes a to g with + and - sides, titled "Bipartite Graph Matching" and "Bipartite Graph Matching Without Audio Constraint".]
Three tracks are recovered by matching every node (entry into and exit from the scene) where video was captured.
Two tracks are recovered. However, red and green show the wrong path. Audio alone cannot tell the targets apart once the clusters have merged.
17
Experimental Results
[Figures: Inter P-Track Distance at a Merge Event; Inter P-Track Distance at a Crossover Event]
18
Experimental Results (Cont.)
19
Conclusion
Goal: minimize camera usage in a surveillance system.
Save power, bandwidth, storage, and money.
Alleviate privacy concerns.
Proposed a probabilistic scheme for opportunistically deploying cameras in a multimodal network.
Showed detailed experimental results on real data collected in multimodal networks.
The final set of tracks is computed by bipartite matching.
20
Acknowledgements
This work was supported by Aware Building: ONR-N00014-07-C-0311 and NSF CNS 0551719.
Bi Song and Amit Roy-Chowdhury were additionally supported by NSF-ECCS 0622176 and ARO-W911NF-07-1-0485.
21
Thank You. Questions?
Jayanth Nayak 1: nayak@mayachitra.com
Luis Gonzalez-Argueta 2, Bi Song 2, Amit Roy-Chowdhury 2, Ertem Tuncel 2: {largueta,bsong,amitrc,ertem}@ee.ucr.edu