1 Sensor, Motion & Temporal Planning PhD Defense for Ser-Nam Lim Department of Computer Science University of Maryland, College Park

2 Outline 1. Two-camera background subtraction: Invariant to shadows, lighting changes. 2. Multi-camera background subtraction and tracking: Occlusions. 3. Active camera: Predictive tracking. Motion, temporal planning. Camera scheduling. 4. Abandoned package detection: Severe occlusions. Temporal analysis in a statistical framework to minimize reliance on thresholding.

3 1. Two-camera Background Subtraction Details given during proposal. "Fast Illumination-invariant Background Subtraction using Two Views: Error Analysis, Sensor Placement and Applications", IEEE CVPR 2005.

4 Problem Description Single-camera background subtraction: Shadows, Illumination changes, and Specularities. Disparity-based background subtraction: Can overcome many of these problems, BUT Slow and Inaccurate online matches.

5 Two-Camera Algorithm Real-time, two-camera background subtraction: Develop a fast two-camera background subtraction algorithm that doesn’t require solving the correspondence problem online. Analyze the advantages of various camera configurations with respect to the robustness of background subtraction.

6 Fast Illumination-invariant Two-camera Approach Clever idea due to Ivanov et al.: Yuri A. Ivanov, Aaron F. Bobick and John Liu, “Fast Lighting Independent Background Subtraction”, IEEE Workshop on Visual Surveillance, ICCV'98, Bombay, India, January 1998. Intuition: Establish background conjugate pixels offline. Color differences between conjugate pixels flag foreground. What are the problems? False and missed detections caused by homogeneous objects.
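
The conjugate-pixel color-difference test can be sketched as follows; the function, its parameter names, and the threshold value are illustrative, not taken from the paper:

```python
import numpy as np

def detect_foreground(ref_img, aux_img, conjugates, tau=30.0):
    """Ivanov-style two-camera change detection (illustrative sketch).

    ref_img, aux_img : HxWx3 float arrays from the two cameras.
    conjugates       : list of ((ry, rx), (ay, ax)) background conjugate
                       pixel pairs established offline via stereo matching.
    tau              : color-difference threshold (hypothetical value).
    """
    mask = np.zeros(ref_img.shape[:2], dtype=bool)
    for (ry, rx), (ay, ax) in conjugates:
        diff = np.linalg.norm(ref_img[ry, rx] - aux_img[ay, ax])
        # Shadows and lighting changes affect both views similarly, so the
        # color difference stays small; a large difference flags foreground.
        if diff > tau:
            mask[ry, rx] = True
    return mask
```

Because the conjugate pairs are computed offline, the online step is only a per-pixel color comparison, which is what makes the method fast.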

7 Intuition The color difference of an image point between the two cameras is small when building the background. The color difference is still small under shadow.

8 False Detections Reference camera: big color difference even though it's background!! Happens when the object is close to the background.

9 Missed Detections Reference camera: background occluded!! Both cameras see color on the truck, so the color difference is small if the object is homogeneous.

10 Eliminate False Detections Place the two cameras vertically above one another with respect to the ground plane on which objects move.

11 Reference camera: now, whenever the reference camera sees background, the other camera does too – eliminating the false detections.

12 Reducing Missed Detections The initial detection is free of false detections, and the missed detections form a component adjacent to the ground plane. Utilize stereo matching of the initial detection to infer height and fill in the missed portion.

13 Reference camera: infer height through selective stereo.

14 Advantages FAST!! No online stereo matching. Invariant to shadows and lighting changes. Invariant to specularities: through a height-inferring process. Detects near-background objects: a difficult problem for disparity-based background subtraction. Accurate: offline stereo matching can be computationally intensive, and human intervention can be used.

15 Experiments – Lighting Changes

16 Experiments - Specularities

17

18 Experiments – Near Background

19 Experiments - Indoor

20 2. Multi-camera Detection and Tracking Under Occlusions Preparing for submission.

21 Problem Severe occlusions make detection and tracking difficult. We often need to observe highly occluded places!! Partial and full occlusions.

22 Algorithm Outline 1. Silhouette detection on a per-camera basis. 2. Count people in a top view. 3. Constrained stereo. 4. Sensor fusion – particle filter.

23 Silhouette Detection – background subtraction

24 People Counting Project the foreground silhouettes onto a common ground plane – do this for every available camera. Intersect the projections of the different cameras. This yields a set of polygons that possibly contain valid objects. The number of polygons is a rough estimate of the number of people in the scene.
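
The counting step above can be sketched with occupancy grids standing in for the polygon intersection; the grid representation and all names are simplifications, not the dissertation's implementation:

```python
import numpy as np

def count_people(projections):
    """Rough people count from per-camera ground-plane projections (sketch).

    projections : list of HxW boolean grids, one per camera, marking cells
                  covered by that camera's projected foreground silhouettes.
    Returns the number of connected regions in the intersection, a rough
    estimate of the number of people (phantom regions can inflate it).
    """
    inter = np.logical_and.reduce(projections)   # intersect all cameras
    visited = np.zeros_like(inter)
    count = 0
    for y, x in zip(*np.nonzero(inter)):
        if visited[y, x]:
            continue
        count += 1
        stack = [(y, x)]                         # flood-fill one region
        while stack:
            cy, cx = stack.pop()
            if visited[cy, cx]:
                continue
            visited[cy, cx] = True
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = cy + dy, cx + dx
                if (0 <= ny < inter.shape[0] and 0 <= nx < inter.shape[1]
                        and inter[ny, nx] and not visited[ny, nx]):
                    stack.append((ny, nx))
    return count
```

Each connected region of the intersection corresponds to one polygon in the slide's description, including possible phantom polygons.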

25 Camera 1 Camera 2 Phantom polygon

26 Selective Stereo [Figure: epipolar line through a ground plane pixel; correct vs. wrong vertical lines; good color matching; phantom polygon.]

27 Constrained Stereo [Figure: camera 1 and camera 2 views; a foreground pixel with candidate ground plane pixels mapped along the epipolar line; correct vertical line (good color matching) vs. wrong vertical line (bad color matching); phantom polygon.]

28 Note that only the visible foreground pixels are successfully segmented based on selective stereo with one camera pair. Partial and full occlusions need to be dealt with via multi-camera fusion. How??

29 Additional Consideration – Sensor Fusion Choosing the best stereo pairs for performing stereo matching – guided by particle filter.

30 Count people Use: Danny Yang, Héctor H. González-Baños, Leonidas J. Guibas, “Counting People in Crowds with a Real-Time Network of Simple Image Sensors”, ICCV, 2003. Notice the errors!!

31 Final Results

32 3. Active Camera Submitted to ACM Multimedia System Journal. Submitted to ACM Multimedia 2006. “Constructing Task Visibility Intervals for Surveillance Systems”, VSSN Workshop, ACM Multimedia 2005. “A Scalable Image-based Multi-camera Visual Surveillance System”, AVSS 2003.

33 Problem Description Given: Collection of calibrated PTZ cameras and Large surveillance site. How to control cameras to acquire surveillance videos? Why collect surveillance videos? Collect k secs of unobstructed video from as close to a side angle as possible for gait recognition. Collect unobstructed video of person near any ROI.

34 Project Goals - Visual Requirements Targets have to be unobstructed in the collected videos: involves predicting object trajectories in the field of regard based on tracking. Targets have to be in the field of view: constrains the PT parameters of cameras as a function of time during periods of visibility. Targets have to satisfy task-specific minimum resolutions: constrains the Z parameter.

35 Project Goals - Performance Requirements Scheduling cameras to maximize task coverage. Determine future time intervals within which visual requirements of tasks are satisfied: We first do this for each camera, task pair. We then combine these solutions across tasks and then cameras to schedule tasks.

36 System Timeline For every (camera, task, object) tuple: Detection and tracking – using existing methods. Predict future locations of objects. Visibility analysis, to predict the periods during which objects are visible – visibility intervals. Determine allowable camera settings over time within these visibility intervals to form Task Visibility Intervals (TVI’s). Combine TVI’s to form Multiple Task Visibility Intervals (MTVI’s) – scalability. Scheduling – scalability.

37 Predicting Future Location Represent each object as a sphere. For computational efficiency, each sphere is represented as a triplet of circular shadows on the projection planes for visibility analysis: Extrapolate the motion of each shadow to predict its future locations. Each shadow moves in a straight line along the predicted path, and its radius grows linearly to capture the positional uncertainty.
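
A minimal sketch of the shadow extrapolation described above (function and parameter names are assumptions, not from the dissertation):

```python
def predict_shadow(center, velocity, radius0, growth, t):
    """Predicted circular shadow at future time t (illustrative sketch).

    The shadow center moves linearly along its extrapolated path, and the
    radius grows linearly with t to capture positional uncertainty.
    """
    cx, cy = center
    vx, vy = velocity
    # Linear motion model plus linearly growing uncertainty radius.
    return (cx + vx * t, cy + vy * t), radius0 + growth * t
```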

38 Predictive Tracking Experiments

39 Visibility Analysis With the predicted locations, we can represent the extremal angle trajectories of each shadow over time in closed form: extremal angles are the angles subtended by the pair of tangent points. [Figure: straight-line trajectory, camera center, extremal angle of one tangent point; the shadow’s radius increases over time.]

40 The extremal angle trajectories of two different objects are equated to find the time intervals (intersections) during which occlusion occurs – the occlusion intervals. The complements of the occlusion intervals are the visibility intervals. This can be done for every object pair, but it is more efficient to use an optimal segment intersection algorithm (details given in the dissertation).
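
The dissertation derives these intersections in closed form; the following discretized sketch conveys the idea by sampling time steps instead (all names and the 2-D setup are illustrative):

```python
import math

def angular_interval(camera, center, radius):
    """Extremal angles (tangent-point angles) a circular shadow subtends
    at the camera center (sketch)."""
    dx, dy = center[0] - camera[0], center[1] - camera[1]
    d = math.hypot(dx, dy)
    mid = math.atan2(dy, dx)
    half = math.asin(min(1.0, radius / d))   # half-angle subtended
    return mid - half, mid + half

def occlusion_times(camera, shadow_a, shadow_b, times):
    """Time steps at which two shadows' angular intervals overlap, i.e.,
    one object may occlude the other. shadow_a/shadow_b map a time t to
    (center, radius)."""
    occluded = []
    for t in times:
        a0, a1 = angular_interval(camera, *shadow_a(t))
        b0, b1 = angular_interval(camera, *shadow_b(t))
        if max(a0, b0) <= min(a1, b1):       # intervals intersect
            occluded.append(t)
    return occluded
```

The complement of the returned time steps plays the role of the visibility intervals; the closed-form version avoids the time discretization entirely.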

41 Efficient Segment Intersection vs Brute Force

42

43 Task Visibility Intervals (TVI’s) Combine allowable camera settings over time with visibility intervals to form TVI’s. Allowable camera settings are determined at each future time step in the visibility interval: Iterate through the ranges of pan, tilt and zoom settings, and determine the time intervals during which PTZ ranges exist that satisfy the task-specific resolution. For efficiency, use a piecewise approximation of the PTZ range. These TVI’s must also satisfy the required length of collected video.

44 Multiple Task Visibility Intervals (MTVI’s) TVI’s can be combined if: Common time intervals exist that are at least as long as the maximum required processing times among all the tasks involved. Common camera settings exist in these common time intervals. For efficiency, TVI’s can be combined with a plane-sweep algorithm.
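
The two combination conditions can be sketched as an interval intersection; the 1-D stand-in for the PTZ setting range and all names are simplifications of the actual plane-sweep algorithm:

```python
def combine_tvis(tvis, max_processing_time):
    """Check whether a set of TVIs can merge into one MTVI (sketch).

    Each TVI is ((t_start, t_end), (setting_lo, setting_hi)) with a 1-D
    stand-in for the PTZ setting range. Returns the merged MTVI, or None.
    """
    t0 = max(tvi[0][0] for tvi in tvis)   # common time window
    t1 = min(tvi[0][1] for tvi in tvis)
    s0 = max(tvi[1][0] for tvi in tvis)   # common setting range
    s1 = min(tvi[1][1] for tvi in tvis)
    # The common window must be at least as long as the longest task, and
    # a common camera setting must exist throughout that window.
    if t1 - t0 >= max_processing_time and s0 <= s1:
        return (t0, t1), (s0, s1)
    return None
```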

45 Zoom

46 Camera Scheduling Scheduling based on the constructed (M)TVI’s. Two methods are compared: Branch and bound. Greedy.

47 Define the slack δ as: δ = [t⁻, t⁺] = [r, d − p], where d is the deadline, r is the earliest release time and p is the processing time (duration of the task). Let |δ| = t⁺ − t⁻. It can be shown that if |δ_max| < p_min, then in any feasible schedule the (M)TVI’s must be ordered by r.
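
A small sketch of the consequence of this property: when slack is small, scheduling (M)TVI's back-to-back in order of r and checking deadlines is enough to test feasibility (illustrative, not the dissertation's algorithm):

```python
def feasible_by_release_order(tasks):
    """Schedule (M)TVIs in release-time order and check feasibility (sketch).

    tasks : list of (r, d, p) with release time r, deadline d and
            processing time p; the slack is [r, d - p]. Returns True if
            running the tasks back-to-back in order of r meets every
            deadline.
    """
    t = 0.0
    for r, d, p in sorted(tasks, key=lambda x: x[0]):
        t = max(t, r) + p   # start no earlier than release, run for p
        if t > d:
            return False
    return True
```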

48 Each camera can then be modeled with an acyclic graph with a source and a sink, where the nodes are the (M)TVI’s and the edge weights are the number of tasks covered when moving from one node to another. The sink of one camera’s graph is linked to the source of the next camera’s graph – cascading.

49 Example [Figure: three per-camera graphs cascaded together, with sources s1–s3, sinks t1–t3, (M)TVI nodes 1–9, weight-2 edges within each camera, weight-0 edges into the sinks and across cameras, and a global sink t.]

50 Dynamic Programming (DP) is run on the multi-camera graph: Equivalent to the greedy algorithm, BUT branch & bound looks at which tasks the other cameras in the graph can potentially cover while running the DP backtracking.
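
The DP over the cascaded graph can be sketched as a longest-path computation on a DAG (the graph representation and names are assumptions, and this omits the branch-and-bound lookahead):

```python
def max_coverage(graph, source, sink):
    """Maximum number of tasks covered on any source-to-sink path of the
    cascaded (M)TVI DAG (sketch). graph maps node -> list of
    (next_node, tasks_covered_on_edge); the graph is assumed acyclic."""
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def best(node):
        if node == sink:
            return 0
        edges = graph.get(node, [])
        if not edges:
            return float('-inf')   # dead end that never reaches the sink
        # DP recurrence: best coverage from here is the best edge choice.
        return max(w + best(nxt) for nxt, w in edges)

    return best(source)
```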

51 Approximation Factors – Branch & Bound vs Greedy Given k cameras, the approximation factor for multi-camera scheduling using the greedy algorithm is 2 + k times a term involving variables that represent the distribution of tasks among the cameras (exact expression and proof in the dissertation). Important – it depends on the number of cameras, i.e., it does not scale well to large camera networks!!

52 For k cameras, the approximation factor of the branch and bound algorithm involves only task distribution factors (exact expression and proof in the dissertation). Important – it is insensitive to the number of cameras!!

53 Performance – Simulations

54 Experiments – Face Capture

55 Experiments – Full Body Video

56 Experiments – Lower Resolution

57 Experiments – Higher Resolution

58 4. Abandoned Package Detection under Severe Occlusions A short overview. Refer to dissertation for details. Preparing for submission.

59 Constraints No background frame available. Constant foreground motion. Constant occlusion. Single camera.

60 Algorithm 1. PDF for motion detection, P_d: Observe successive frame differences. Assume the pdf is zero-mean – extract the zero-centered mode. 2. PDF for the background model, P_b: Histogram frequencies computed from the joint probability with P_d. Intuition – true background pixels should observe no motion.
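
Step 1 can be sketched as follows; the bin count and range are illustrative parameters, and the full method keeps only the zero-centered mode of this histogram:

```python
import numpy as np

def motion_pdf(frames, nbins=51, span=25):
    """Estimate a motion-detection pdf from successive frame differences
    (sketch). Histograms the differences over a symmetric range centered
    at zero; pixels whose differences fall far from the zero-centered
    mode are likely moving."""
    diffs = np.diff(np.asarray(frames, dtype=float), axis=0).ravel()
    hist, edges = np.histogram(diffs, bins=nbins, range=(-span, span))
    hist = hist / max(hist.sum(), 1)   # normalize counts into a pdf
    return hist, edges
```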

61 3. PDF of static pixels that are foreground, conditioned on ¬P_b and P_d: Intuition – pixels belonging to abandoned packages are static foreground pixels. 4. MRF to label these pixels – avoids thresholding. 5. Evaluate the clusters based on temporal persistence of shape (Hausdorff distance) and intensities.
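
The shape-persistence check in step 5 relies on the Hausdorff distance between point sets, which can be sketched as (names are illustrative):

```python
import numpy as np

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets (Nx2 arrays).

    A candidate abandoned-package cluster should keep a small Hausdorff
    distance to its earlier observations over time."""
    # Pairwise distances via broadcasting: d[i, j] = ||a[i] - b[j]||.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Directed distances in both directions; take the larger one.
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```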

62 Experiments

63

64

65 Conclusions The role of sensor placement in detection: highlighted in two-camera background subtraction. The role of sensor placement/selection in tracking under occlusions: improve stereo matching by choosing different stereo pairs based on a particle filter. Active camera system: a challenge to deploy in real-world applications; it depends heavily on predictive tracking – how can we improve it? Left-baggage detection: what if the baggage is invisible (e.g., a bomb left in a trash can!!)?

66 Thanks!! Prof. Larry Davis for his support and teachings. Committee members for their time.

