Evaluation of UMD Object Tracking in Video
University of Maryland
VACE Phase I Evaluations
Multiple teams presented algorithms for various analysis tasks:
- Text detection and tracking
- Face detection and tracking
- People tracking
Evaluation was handled by UMD/LAMP and PSU:
- Penn State devised metrics and ran evaluations.
- UMD generated ground truth and implemented metrics.
- ViPER was adapted for the new evaluations.
Penn State Developed Metrics
Evaluations should provide a comprehensive, multifaceted view of the challenges of detection and tracking.
Tracking methodologies developed:
- Pixel-level frame analysis
- Object-level aggregation
PSU Frame Evaluations
Look at the results for each frame, one at a time.
For each frame, apply a set of evaluation metrics independent of the "identity" of each object (i.e., find the best match). These include:
- Object count precision and recall.
- Pixel precision and recall over all objects in the frame.
- Individual object pixel precision and recall measures.
(See the sketch below.)
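A minimal sketch of the per-frame measures, assuming ground truth and results for a frame are available as binary pixel masks (NumPy arrays) and object counts; the function names are illustrative, not part of ViPER.

```python
import numpy as np

def frame_pixel_precision_recall(gt_mask: np.ndarray, res_mask: np.ndarray):
    """Pixel precision/recall for one frame, ignoring object identity."""
    tp = np.logical_and(gt_mask, res_mask).sum()   # correctly detected pixels
    res_total = res_mask.sum()                     # all pixels the system reported
    gt_total = gt_mask.sum()                       # all ground-truth pixels
    precision = tp / res_total if res_total else 1.0
    recall = tp / gt_total if gt_total else 1.0
    return precision, recall

def frame_count_precision_recall(n_gt_objects: int, n_res_objects: int):
    """Object-count precision/recall for one frame (identity-agnostic)."""
    matched = min(n_gt_objects, n_res_objects)     # best case: every object matched
    precision = matched / n_res_objects if n_res_objects else 1.0
    recall = matched / n_gt_objects if n_gt_objects else 1.0
    return precision, recall
```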
PSU Frame Evaluation
PSU Object Aggregation for Tracking
Assume object matching has already been done (first-frame correspondence).
For the life of the object, aggregate some set of metrics:
- A set of distances for each frame.
- Average over the life of the object, etc.
(A sketch of this aggregation follows below.)
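A minimal sketch of the aggregation step, assuming tracks are dictionaries keyed by frame number and that the truth/result correspondence is already known; per_frame_distance is a hypothetical per-frame distance function, not one defined in the slides.

```python
def aggregate_over_life(gt_track, res_track, per_frame_distance):
    """Average a per-frame distance over the frames where both tracks exist."""
    common_frames = sorted(set(gt_track) & set(res_track))
    if not common_frames:
        return None  # the tracks never coexist, so no aggregate distance
    distances = [per_frame_distance(gt_track[f], res_track[f])
                 for f in common_frames]
    return sum(distances) / len(distances)
```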
But…
- Frame metrics throw away tracking information, since there is no frame-to-frame correspondence. They do not require unique labeling of objects to track, and confusion can occur when multiple objects are in the frame. Most participants did multi-frame detection, not tracking.
- Aggregated tracking metrics require a known matching. Even with the known matching, they do not handle tracking adequately, including things like confusion and occlusion.
- In both cases, the metrics simply sum over all frames; no unified metric across time and space exists.
UMD Maximal Optimal Matching
Compute a score for each possible object match, then find the optimal correspondence.
- One-to-one match: for each ground-truth object, find the result object such that the total cost over all possible correspondences is minimized.
- Multiple match: for each disjoint subset of ground-truth objects, find the disjoint subset of output objects that minimizes the total cost.
Compute the overall precision and recall. For S = size of the matching:
- Precision = S / size(candidates)
- Recall = S / size(targets)
The maximal one-to-one matching is currently found using the Hungarian algorithm, which has a running time of O(n^3). (See the sketch below.)
For the multiple matching, the algorithm uses heuristics, including assumptions about monotonicity and the triangle inequality, that make it work reasonably well for most data sets. (Basically, it takes the best one-to-one match and looks for lost objects, adding them one at a time; it tends to end up with a few clumps.)
Multiple matching has a similar formulation for precision and recall, but uses S_candidates and S_targets.
The user can specify the parameters for the metrics to be used.
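A minimal sketch of the one-to-one case, not the ViPER implementation itself: it assumes each target/candidate pair already has a distance (higher means a worse match) and uses SciPy's linear_sum_assignment, which implements the Hungarian algorithm. The max_distance cutoff is an illustrative parameter, not one from the slides.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def optimal_one_to_one(distance_matrix: np.ndarray, max_distance: float = 1.0):
    """Return matched (target, candidate) pairs plus overall precision and recall.

    Rows of distance_matrix are ground-truth targets, columns are result candidates.
    """
    rows, cols = linear_sum_assignment(distance_matrix)   # minimizes total cost
    # keep only pairs that are real matches, not "forced" pairings
    pairs = [(r, c) for r, c in zip(rows, cols)
             if distance_matrix[r, c] < max_distance]
    s = len(pairs)                                         # S = size of the matching
    n_targets, n_candidates = distance_matrix.shape
    precision = s / n_candidates if n_candidates else 1.0  # S / size(candidates)
    recall = s / n_targets if n_targets else 1.0           # S / size(targets)
    return pairs, precision, recall
```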
Maximal Optimal Matching Advantages
- Takes into account both space and time.
- Can be generalized to make no assumptions about space and time.
- Optimal one-to-one matching has many nice properties.
- Can handle many-to-many matching.
- By pruning the data to compute only on sequences that overlap in time, matching can be made tractable (sketched below).
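A minimal sketch of the pruning step, under the assumption that each track's temporal extent is known as a (first_frame, last_frame) pair: only temporally overlapping truth/result pairs become candidates for matching, which keeps the cost matrix small.

```python
def temporally_overlapping_pairs(gt_spans, res_spans):
    """Yield (truth_index, result_index) pairs whose frame spans intersect."""
    for i, (g0, g1) in enumerate(gt_spans):
        for j, (r0, r1) in enumerate(res_spans):
            if g0 <= r1 and r0 <= g1:      # the spans share at least one frame
                yield i, j
```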
Object Matching
Object Matching
[Figure: bipartite matching between truth data and result data, with candidate match scores of .45, .9, .57, and .6.]
Experimental Results
We reran the tracking experiments using the…
- Add description of data.
- Add description of algorithms used for static and moving camera.
- Show graphs for our results vs. PSU's.
Example: Tracking Text: Frame
Example: Tracking Text: Tracking
There are three metrics for tracking: size, position, and angularity. Since text is given in bounding boxes, there is no angularity measure for box tracking, and the size metric failed for this example. There are better examples of these on slide 16.
Example: Tracking Text: Object
Example: Person Tracking: Frame
No tracking metrics could be generated, as Marti did not use the first-frame data.
Example: Person Tracking: Object
The bottom graph would be more informative with several more lines. It shows the distance of all matches, sorted by distance. The curve is a good visual way of showing how three or more algorithms perform on the same set of data. However, it throws out the information about which object was matched, so a scatter plot is a better representation for one or two evaluations.
Claims
- The metrics provide for true tracking evaluation (not just aggregated detection).
- Tolerances can still be set on various components of the distance measure.
- The approach provides a single point of comparison.
Fin
Dr. David Doermann, Dr. Rangachar Kasturi, David Mihalcik, Ilya Makedon, JinHyeong Park, Felix Suhenko, and many others
Tracking Graphs
Object Level Matching
Most obvious solution: many-to-many matching. Allows matching on any data type, at a price.
Pixel-Frame-Box Metrics
Look at each frame and ask a specific question about its contents:
- Number of pixels correctly matched.
- Number of boxes that have some overlap, or overlap greater than some threshold.
- How many boxes overlap a given box? (Fragmentation)
Look at all frames and ask a question:
- Number of frames correctly detected.
- Proper number of objects counted.
(See the box-overlap sketch below.)
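A minimal sketch of the box-level frame questions, assuming axis-aligned boxes given as (x0, y0, x1, y1) tuples; the helper names are illustrative.

```python
def box_overlap_area(a, b):
    """Area of intersection of two axis-aligned boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

def boxes_detected(gt_boxes, res_boxes, min_overlap=0.0):
    """Count ground-truth boxes overlapped by some result box above a threshold."""
    return sum(1 for g in gt_boxes
               if any(box_overlap_area(g, r) > min_overlap for r in res_boxes))

def fragmentation(gt_box, res_boxes):
    """How many result boxes overlap a given ground-truth box."""
    return sum(1 for r in res_boxes if box_overlap_area(gt_box, r) > 0)
```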
Individual Box Tracking Metrics
Mostly useful for the retrieval problem, this solution looks at pairs of a ground-truth box and a result box. The metrics are:
- Position
- Size
- Orientation
(Sketched below.)
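A minimal sketch of the three per-box distances, assuming a box is represented as (center_x, center_y, width, height, angle_in_degrees); these are not the exact formulas used in the evaluation, just one plausible form.

```python
import math

def position_distance(gt, res):
    """Euclidean distance between the box centers."""
    return math.hypot(gt[0] - res[0], gt[1] - res[1])

def size_distance(gt, res):
    """Relative difference in box area."""
    gt_area, res_area = gt[2] * gt[3], res[2] * res[3]
    return abs(gt_area - res_area) / gt_area if gt_area else 0.0

def orientation_distance(gt, res):
    """Smallest angular difference in degrees (undefined for plain bounding boxes)."""
    d = abs(gt[4] - res[4]) % 180.0
    return min(d, 180.0 - d)
```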
Questions: Ignoring Ground Truth
Assume the evaluation routine is given a set of objects to ignore (or rules for determining what type of object to ignore). How does this affect the output?
- For pixel measures, simply don't count pixels in ignored regions. This works for tracking and frame evaluations.
- For object matches, do the complete match; when finished, ignore result data that matches ignored truth.
For example, we may only want to evaluate text that has a chance of being OCRed correctly, while not punishing detection of illegible text.
(A sketch of both cases follows below.)
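A minimal sketch of both cases described above, assuming masks are NumPy boolean arrays and that the object match is a list of (truth_id, result_id) pairs; the names are illustrative.

```python
import numpy as np

def pixel_counts_with_ignore(gt_mask, res_mask, ignore_mask):
    """True positives and totals, not counting pixels inside ignored regions."""
    keep = ~ignore_mask                                   # pixels that still count
    tp = np.logical_and(gt_mask, res_mask)[keep].sum()
    return tp, res_mask[keep].sum(), gt_mask[keep].sum()

def drop_ignored_matches(pairs, ignored_truth_ids):
    """After the complete match, discard result objects matched to ignored truth."""
    return [(t, r) for t, r in pairs if t not in ignored_truth_ids]
```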
Questions: Presenting the Results
Some basic graphs are built in:
- Line graphs for individual metrics
- Bar charts showing several metrics
For custom graphs, you have to do it yourself:
- ROC curves
- Scatter plots
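A minimal sketch of two of the custom graphs discussed in the examples (the sorted-distance curve and a scatter plot of match distances), using matplotlib; the function names are illustrative.

```python
import matplotlib.pyplot as plt

def sorted_distance_curve(runs: dict):
    """Plot each run's match distances sorted from best to worst.

    Useful for comparing three or more algorithms on the same data.
    """
    for name, distances in runs.items():
        plt.plot(sorted(distances), label=name)
    plt.xlabel("match rank")
    plt.ylabel("distance")
    plt.legend()
    plt.show()

def distance_scatter(distances):
    """Scatter of match distances, keeping the per-object association visible."""
    plt.scatter(range(len(distances)), distances)
    plt.xlabel("object index")
    plt.ylabel("distance")
    plt.show()
```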