Tracking and event recognition – the Etiseo experience
Son Tran, Nagia Ghanem, David Harwood and Larry Davis
UMIACS, University of Maryland
What I liked about Etiseo
- Diverse, annotated video material on which to evaluate algorithms.
- Challenging image analysis conditions spurred improvement to our previously developed algorithms:
  - Background modeling
  - Tracking, and integration of detection and tracking
- The meetings are held in France!
Example: adaptive background subtraction to deal with fast illumination changes
- Our previous approach employed a per-pixel first-order autoregressive model
  - Codebook model: vector quantization (VQ) of background colors, with heuristics that allow background modeling while foreground elements are moving
  - Exhibited lag in responding to illumination changes
  - Did not accurately update backgrounds behind foreground pixels
- Characteristics of the new algorithm
  - Based on a more physically correct model of camera response to illumination change
  - Uses a global prediction index and a local linear prediction
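The lag attributed to the previous approach can be illustrated with a minimal sketch. The slides do not give the exact form of the autoregressive model, so the exponential blend below, the learning rate `alpha`, the `tolerance`, and the gray levels are all assumptions chosen for illustration.

```python
def ar1_update(background, observation, alpha=0.05):
    """One first-order autoregressive update of a single pixel value.

    Assumed form: B_t = (1 - alpha) * B_{t-1} + alpha * I_t.
    """
    return (1.0 - alpha) * background + alpha * observation


def frames_to_converge(old_level, new_level, alpha=0.05, tolerance=5.0):
    """Count frames until the model is within `tolerance` of the new level."""
    background = float(old_level)
    frames = 0
    while abs(new_level - background) > tolerance:
        background = ar1_update(background, new_level, alpha)
        frames += 1
    return frames


if __name__ == "__main__":
    # Hypothetical sudden brightening of a background pixel from 100 to 160.
    for alpha in (0.01, 0.05, 0.2):
        print(alpha, frames_to_converge(100, 160, alpha=alpha))
```

A small `alpha` keeps foreground objects from being absorbed into the background but takes many frames to follow a lighting change. A large `alpha` follows the change quickly but absorbs slow-moving foreground. That trade-off is one reason to look for a model that predicts the change instead of averaging toward it.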
Adaptive Background Subtraction
Motivation
- Surfaces respond differentially to illumination changes due to
  - Surface properties
  - Incident lighting direction and intensities
- Saturation leads to nonlinearity
  - Limited dynamic sensor range
[Figure: a color m plotted in the R-B plane, with axes running from 0 to 255 and the saturated response marked]
Adaptive Background Subtraction
Overview of the background model building algorithm
- Each pixel has a codebook model; each codeword is represented by
  - its principal color values, m
  - a tangent vector, ∆
1. Measure the scene illumination change as the gain in a global index value: a = median(I_t) - median(I_{t-1})
2. Predict the codewords' new colors based on a, m and ∆
3. Update m and ∆ based on the prediction and the observation
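The three steps above can be sketched as follows. Only the global index formula comes from the slides. The linear prediction m + a∆, the clipping to the 0 to 255 sensor range, the blending update with `rate`, and all function names are assumptions made for illustration, not the authors' implementation. Frames are represented as flat lists of intensities to keep the sketch self-contained.

```python
from statistics import median


def global_index_gain(prev_frame, curr_frame):
    """Step 1: a = median(I_t) - median(I_{t-1}) over all pixel intensities."""
    return median(curr_frame) - median(prev_frame)


def predict_color(m, delta, a):
    """Step 2 (assumed form): m_p = m + a * delta, clipped to [0, 255]."""
    return [min(255.0, max(0.0, mc + a * dc)) for mc, dc in zip(m, delta)]


def update_codeword(m, delta, a, x, rate=0.1):
    """Step 3 (assumed rule): pull the prediction toward the observation x
    and nudge delta toward the observed per-channel slope (x - m) / a."""
    m_p = predict_color(m, delta, a)
    new_m = [(1 - rate) * p + rate * xc for p, xc in zip(m_p, x)]
    if a == 0:
        # No global change observed, so the slope estimate is left as is.
        return new_m, list(delta)
    new_delta = [
        (1 - rate) * dc + rate * (xc - mc) / a
        for mc, dc, xc in zip(m, delta, x)
    ]
    return new_m, new_delta


if __name__ == "__main__":
    prev_frame = [90, 100, 110, 120, 130]      # hypothetical intensities
    curr_frame = [100, 110, 120, 130, 140]
    a = global_index_gain(prev_frame, curr_frame)
    m, delta = [100.0, 100.0, 250.0], [1.0, 0.5, 1.0]
    print(a, predict_color(m, delta, a))
    print(update_codeword(m, delta, a, x=[112.0, 104.0, 255.0]))
```

Keeping a per-codeword ∆ lets each surface respond to the same global change a by a different amount, which matches the motivation that surfaces respond differentially to illumination changes.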
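Adaptive Background Subtraction
- Distance in brightness and color
- Prediction
- Updating the accessed codeword
[Figures: a codeword m with tangent vector ∆ and an observation x, with brightness distance d_B and color distance d_C; the predicted color m_p obtained by moving m along ∆ by a; the error e between prediction and observation]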
Adaptive Background Subtraction
Experimental results
[Figure: results at frames #1, #200, #400, #600, #800 and #1000, comparing the linear update model (first-order AR) with the new model]
What I didn’t like (as much) about Etiseo
- Metrics for low-level vision tasks (might) have limited predictive value for the effectiveness of higher-level tasks
  - Example: adding or subtracting one row/column from bounding boxes can lead to large scoring differences for detectors and trackers, but would have negligible impact on event recognition scores
  - So, how would we know if someone develops a better tracker for surveillance video analysis?
- The videos collected confounded too many variables
  - Understandable given the time and cost associated with video collection and annotation
  - But it is difficult to use the evaluation to predict the conditions under which any method might work well
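The bounding-box example can be made concrete with a small calculation. Intersection over union (IoU) is used here only as a stand-in overlap score. The slides do not say which metrics were used, and the box sizes are hypothetical.

```python
def box_area(box):
    """Area of an axis-aligned box (x0, y0, x1, y1); empty boxes give 0."""
    x0, y0, x1, y1 = box
    return max(0, x1 - x0) * max(0, y1 - y0)


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    return inter / union if union else 0.0


if __name__ == "__main__":
    # Ground truth versus a detection that is one column narrower.
    print(iou((0, 0, 10, 20), (0, 0, 9, 20)))
    # A small, distant target missing one row and one column.
    print(iou((0, 0, 5, 10), (0, 0, 4, 9)))
```

The smaller the target, the more a one-pixel boundary difference moves an overlap score, even though the tracked object and any event inferred from its trajectory are essentially unchanged.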
What (else) I didn’t like (as much) about Etiseo
- Events to be recognized seemed ad hoc rather than generic
- Events should be designed to stress some higher-level capability that surveillance video analysis would often require
  - Example 1: identity maintenance of people and vehicles across gaps in observation, which is central to detecting thefts and various security and safety violations
  - Example 2: identification of portals into (static or dynamic) closed worlds, and recognition/description of entering/leaving and depositing/collecting events involving those closed worlds
An idea for future evaluations
- Focus on measuring improvements in the effectiveness of humans in performing surveillance
  - Forensic video analysis: decrease the time to conduct retrospective video analyses of (large) collections of surveillance videos
- Motivating applications
  - Retail sales: construct video trails of shoplifting events to support prosecutions
  - Building and installation security: track people back in time to identify correlated people and vehicles