Evaluation of UMD Object Tracking in Video


1 Evaluation of UMD Object Tracking in Video
University of Maryland

2 VACE Phase I Evaluations
Multiple teams presented algorithms for various analysis tasks: text detection and tracking, face detection and tracking, and people tracking.
Evaluation was handled by UMD/LAMP and PSU: Penn State devised metrics and ran evaluations; UMD generated ground truth and implemented metrics.
ViPER was adapted for the new evaluations.

3 Penn State Developed Metrics
Evaluations should provide a comprehensive, multifaceted view of the challenges of detection and tracking.
Tracking methodologies developed:
Pixel-level frame analysis
Object-level aggregation

4 PSU Frame Evaluations
Look at the results for each frame, one at a time. For each frame, apply a set of evaluation metrics, independent of the “identity” of each object (i.e. find the best match). These include:
Object count precision and recall.
Pixel precision and recall over all objects in the frame.
Individual object pixel precision and recall measures.
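As an illustration of these per-frame measures, here is a minimal Python sketch, assuming binary pixel masks per frame; the function names are illustrative and not part of any actual evaluation code.

```python
import numpy as np

def frame_pixel_precision_recall(truth_mask, result_mask):
    # Pixel precision/recall for one frame, computed over all objects in
    # the frame (assumes 2-D boolean masks; illustrative helper only).
    truth = truth_mask.astype(bool)
    result = result_mask.astype(bool)
    matched = np.logical_and(truth, result).sum()
    precision = matched / result.sum() if result.sum() else 1.0
    recall = matched / truth.sum() if truth.sum() else 1.0
    return precision, recall

def object_count_precision_recall(n_truth, n_result):
    # Object-count precision/recall: objects reported vs. objects present.
    matched = min(n_truth, n_result)
    precision = matched / n_result if n_result else 1.0
    recall = matched / n_truth if n_truth else 1.0
    return precision, recall
```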

5 PSU Frame Evaluation

6 PSU Object Aggregation for Tracking
Assume object matching has already been done (first-frame correspondence).
For the life of the object, aggregate some set of metrics:
A set of distances for each frame.
Average over the life of the object, etc.
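A minimal sketch of this aggregation step, assuming we already have a per-frame distance for one matched object (the helper name is illustrative):

```python
def aggregate_over_lifetime(per_frame_distances):
    # Aggregate a matched object's per-frame distances over its lifetime,
    # here simply by averaging (other aggregations could be swapped in).
    if not per_frame_distances:
        return None
    return sum(per_frame_distances) / len(per_frame_distances)

# e.g. one object's position distances over four frames of its life
score = aggregate_over_lifetime([0.12, 0.08, 0.30, 0.25])
```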

7 But…
Frame metrics throw away tracking information, since there is no frame-to-frame correspondence. They do not require unique labeling of objects to track, and confusion can occur when multiple objects are in the scene. Most participants did multi-frame detection, not tracking.
Aggregated tracking metrics require a known matching. Even with the known matching, they do not handle tracking adequately, including things like confusion and occlusion.
In both cases, the metrics simply sum over all frames; no unified metric across time and space exists.

8 UMD Maximal Optimal Matching
Compute a score for each possible object match, then find the optimal correspondence.
One-to-one match: for each ground truth object, find the result-object assignment that minimizes the total cost over all possible correspondences.
Multiple match: for each disjoint subset of ground truth objects, find the disjoint subset of output objects that minimizes the total cost.
Compute the overall precision and recall. For S = size of the matching: Precision = S / size(candidates), Recall = S / size(targets).
The maximal one-to-one matching is currently found with the Hungarian algorithm, which has a running time of O(n³).
For the multiple matching, the algorithm uses heuristics, including assumptions about monotonicity and the triangle inequality, that make it work reasonably well for most data sets. (Basically, it takes the best one-to-one match and looks for lost objects, adding them one at a time; it tends to end up with a few clumps.)
Multiple matching has a similar formulation for precision and recall, but uses S_candidates and S_targets.
The user can specify the parameters for the metrics to be used.
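The one-to-one case can be illustrated with SciPy's implementation of the Hungarian algorithm. This is a sketch, not UMD's actual code; the cost matrix, the max_cost cutoff, and the function name are assumptions for the example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian algorithm, O(n^3)

def one_to_one_match(cost, max_cost=1.0):
    # cost[i, j] is the distance between ground-truth object i and result
    # object j; assignments with cost >= max_cost are treated as unmatched.
    rows, cols = linear_sum_assignment(cost)
    matches = [(i, j) for i, j in zip(rows, cols) if cost[i, j] < max_cost]
    s = len(matches)                      # S = size of the matching
    n_targets, n_candidates = cost.shape
    precision = s / n_candidates if n_candidates else 1.0
    recall = s / n_targets if n_targets else 1.0
    return matches, precision, recall

# Toy example: two truth objects, two result objects.
cost = np.array([[0.45, 0.90],
                 [0.57, 0.60]])
matches, precision, recall = one_to_one_match(cost)
print(matches, precision, recall)   # [(0, 0), (1, 1)] 1.0 1.0
```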

9 Maximal Optimal Matching Advantages
Takes into account both space and time.
Can be generalized to make no assumptions about space and time.
Optimal one-to-one matching has many nice properties.
Can handle many-to-many matching.
By pruning the data to compute only on sequences that overlap in time, matching can be made tractable (see the sketch below).
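The pruning step in the last point might look like the following sketch; the frame-span representation and the helper name are assumptions for illustration.

```python
def temporally_overlapping_pairs(truth_spans, result_spans):
    # Keep only truth/result pairs whose (first, last) frame spans intersect,
    # so costs need not be computed for pairs that can never match.
    pairs = []
    for ti, (t_start, t_end) in enumerate(truth_spans):
        for ri, (r_start, r_end) in enumerate(result_spans):
            if t_start <= r_end and r_start <= t_end:
                pairs.append((ti, ri))
    return pairs
```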

10 Object Matching

11 Object Matching
[Matching diagram: truth data objects linked to result data objects, with example match scores of .45, .9, .57, and .6]

12 Experimental Results
We reran the tracking experiments using the
Add description of the data.
Add description of the algorithms used for static and moving cameras.
Show graphs comparing our results with PSU's.

13 Example: Tracking Text: Frame

14 Example: Tracking Text: Tracking
There are three tracking metrics: size, position, and orientation. Since text is given in bounding boxes, there is no orientation measure for box tracking, and the size metric failed for this example. There are better examples of these on slide 16.

15 Example: Tracking Text: Object

16 Example: Person Tracking: Frame
No tracking metrics could be generated, as Marti did not use the first-frame data.

17 Example: Person Tracking: Object
The bottom graph would be more informative with several more lines. It shows the distances of all matches, sorted by distance. The curve is a good visual way of showing how three or more algorithms perform on the same set of data. However, it throws away the information about which object was matched, so a scatter plot is a better representation for one or two evaluations.

18 Claims
The metrics provide for true tracking evaluation (not just aggregated detection).
Tolerances can still be set on various components of the distance measure.
They provide a single point of comparison.

19 Fin
Dr. David Doermann
Dr. Rangachar Kasturi
David Mihalcik
Ilya Makedon
JinHyeong Park
Felix Suhenko
…and many others

20 Tracking Graphs

21 Object Level Matching
Most obvious solution: many-to-many matching. Allows matching on any data type, at a price.

22 Pixel-Frame-Box Metrics
Look at each frame and ask a specific question about its contents:
Number of pixels correctly matched.
Number of boxes that have some overlap, or overlap greater than some threshold.
How many boxes overlap a given box? (Fragmentation; see the sketch below.)
Look at all frames and ask a question:
Number of frames correctly detected.
Proper number of objects counted.
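A sketch of the box-overlap and fragmentation questions above, assuming axis-aligned boxes given as (x1, y1, x2, y2); the helpers are illustrative:

```python
def box_overlap(a, b):
    # Overlap area of two axis-aligned boxes (x1, y1, x2, y2).
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def fragmentation(truth_box, result_boxes, min_overlap=1):
    # How many result boxes overlap a given truth box by at least min_overlap.
    return sum(1 for rb in result_boxes
               if box_overlap(truth_box, rb) >= min_overlap)
```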

23 Individual Box Tracking Metrics
Mostly useful for the retrieval problem, this solution looks at pairs of a ground truth box and a result box. The metrics are:
Position
Size
Orientation
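The slide does not define these metrics exactly; the following sketch shows one plausible pair of distances for a ground truth box and a result box (orientation is omitted because axis-aligned boxes carry no angle).

```python
import math

def box_center(b):
    return ((b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0)

def position_distance(truth_box, result_box):
    # Euclidean distance between box centers (one plausible position metric).
    (tx, ty), (rx, ry) = box_center(truth_box), box_center(result_box)
    return math.hypot(tx - rx, ty - ry)

def size_distance(truth_box, result_box):
    # Relative difference in area (one plausible size metric).
    t_area = (truth_box[2] - truth_box[0]) * (truth_box[3] - truth_box[1])
    r_area = (result_box[2] - result_box[0]) * (result_box[3] - result_box[1])
    return abs(t_area - r_area) / max(t_area, r_area)
```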

24 Questions: Ignoring Ground Truth
Assume the evaluation routine is given a set of objects to ignore (or rules for determining what type of object to ignore). How does this affect the output?
For pixel measures, just don’t count pixels in ignored regions; this works for tracking and frame evaluations.
For object matches, do the complete match and, when finished, ignore result data that matches ignored truth.
For example, we may only want to evaluate text that has a chance of being OCRed correctly, while not punishing detection of illegible text.
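For the pixel case, "just don't count pixels in ignored regions" can be sketched as below, assuming boolean masks; this is illustrative, not the evaluation tool's actual code.

```python
import numpy as np

def pixel_precision_recall_ignoring(truth_mask, result_mask, ignore_mask):
    # Pixels under ignore_mask are excluded from both truth and result
    # before precision/recall are computed.
    keep = ~ignore_mask.astype(bool)
    truth = truth_mask.astype(bool) & keep
    result = result_mask.astype(bool) & keep
    matched = (truth & result).sum()
    precision = matched / result.sum() if result.sum() else 1.0
    recall = matched / truth.sum() if truth.sum() else 1.0
    return precision, recall
```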

25 Questions: Presenting the Results
Some basic graphs are built in:
Line graphs for individual metrics.
Bar charts showing several metrics.
For custom graphs, you have to do it yourself:
ROC curves.
Scatter plots.
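A do-it-yourself scatter plot of per-match distances for two runs might look like this matplotlib sketch; the function name and data are illustrative.

```python
import matplotlib.pyplot as plt

def scatter_match_distances(distances_a, distances_b, labels=("run A", "run B")):
    # Scatter the sorted per-match distances of two evaluation runs side by side.
    plt.scatter(range(len(distances_a)), sorted(distances_a), label=labels[0])
    plt.scatter(range(len(distances_b)), sorted(distances_b), label=labels[1])
    plt.xlabel("match rank")
    plt.ylabel("distance")
    plt.legend()
    plt.show()

scatter_match_distances([0.2, 0.5, 0.1], [0.3, 0.4, 0.6])
```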

