ViPER Video Performance Evaluation Toolkit viper-toolkit.sf.net

Performance Evaluation
Ideal:
– Fully automated
– Repeatable
– Can be used to compare results without access to the product
– Predictive validity
– Useful for the task
– General enough to cover any task

Reality
Cannot fully automate for most domains.
– Subjective
– Objective
Results from subjective studies cannot be easily extended, if at all.
Ground truth is hard to gather, lossy, and evaluation metrics are hard to formulate.
It is often difficult to determine what is really being measured.

The ViPER Toolkit
A unified video performance evaluation resource, including:
– ViPER-GT: a Java toolkit for marking up videos with truth data.
– ViPER-PE: a command-line tool for comparing truth data to result data.
– A set of scripts for running several sets of results with different options and generating graphs.

The Video Performance Evaluation Resource
[Diagram: the Ground Truth Editor produces truth data; video analysis algorithms produce result data; the Performance Evaluation Tool combines truth data, result data, a schema mapping, metrics, and filters to produce evaluation results.]

ViPER (Evaluator View)
[Diagram: performance evaluation scripts run the Performance Evaluation Tool over several result data sets, combining truth data, schema mappings, a metrics template, a filter template, and per-run metric parameters to produce evaluation results.]

ViPER (Developer View)
[Diagram: the developer's side of the pipeline, centered on the result data their algorithm produces.]

ViPER File Format
Represents data as a set of descriptors, which the user defines in a schema.
– Each descriptor has a set of attributes, which may take on different values over the file.
– Like a temporally qualified relational database for each media file, where each row is an instance of a descriptor.
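To make that data model concrete, here is a minimal sketch of one descriptor instance as a temporally qualified record. It is illustrative only: the class and method names are hypothetical, not the actual ViPER metadata API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of one descriptor instance ("row") in a ViPER file:
// the instance covers a frame span, and each dynamic attribute maps
// sub-spans of that range to values.
class AttributeSpan {
    final int from, to;   // inclusive frame span
    final Object value;   // e.g. a bounding box, string, or number

    AttributeSpan(int from, int to, Object value) {
        this.from = from;
        this.to = to;
        this.value = value;
    }
}

class DescriptorInstance {
    final String type;    // FILE, CONTENT, or OBJECT
    final String name;    // e.g. "Person", as declared in the user's schema
    final int id;         // instance id within the media file
    final int firstFrame, lastFrame;
    // attribute name -> list of (frame span, value) pairs
    final Map<String, List<AttributeSpan>> attributes = new LinkedHashMap<>();

    DescriptorInstance(String type, String name, int id, int firstFrame, int lastFrame) {
        this.type = type;
        this.name = name;
        this.id = id;
        this.firstFrame = firstFrame;
        this.lastFrame = lastFrame;
    }

    // Record a value for an attribute over part of the instance's life.
    void setAttribute(String attrName, int from, int to, Object value) {
        attributes.computeIfAbsent(attrName, k -> new ArrayList<>())
                  .add(new AttributeSpan(from, to, value));
    }
}
```

An OBJECT instance of a hypothetical "Person" descriptor spanning frames 10 to 250 could then carry a location box that changes every few frames, which is exactly the "row in a temporally qualified table" picture above.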

ViPER File Format: Descriptors
Descriptor types:
– FILE (video-level information)
– CONTENT (descriptors of the scene): static attribute values; a single instance of one type for any frame.
– OBJECT* (descriptors of instances, including events): attributes are dynamic by default; multiple instances can exist at a single frame.

Attributes
Attribute types:
– Strings, numbers, booleans, and enumerations
– Shape types, including bounding boxes and polygons
– Relations (foreign keys)
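On the schema side, a descriptor definition is essentially a named list of typed attributes. The sketch below is hypothetical; the type names loosely follow the kinds listed above and are not an exact copy of ViPER's type list.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a schema entry: the user declares a descriptor
// and the type of each of its attributes.
enum AttributeType {
    SVALUE,      // string
    NUMBER,      // integer or floating-point value
    BVALUE,      // boolean
    LVALUE,      // enumeration (list of allowed values)
    BBOX,        // axis-aligned bounding box
    OBOX,        // oriented box
    CIRCLE,
    POLYGON,
    POINT,
    RELATION     // foreign key to another descriptor instance
}

class DescriptorDefinition {
    final String descriptorType;   // FILE, CONTENT, or OBJECT
    final String name;             // e.g. "Text"
    final Map<String, AttributeType> attributes = new LinkedHashMap<>();

    DescriptorDefinition(String descriptorType, String name) {
        this.descriptorType = descriptorType;
        this.name = name;
    }
}
```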

ViPER Ground Truth Editing viper-toolkit.sf.net

Ground Truth Editing

The Difficulty of Authoring Ground Truth
Ground truth is tedious and time-consuming to edit.
Ground truth is lossy.

A Generic Video Annotation Tool
Lets the user specify the task and the interpretation.
Provides a …

Competition
– VideoAnnEx: IBM AlphaWorks MPEG-7 Editor
– OntoLog (OWL): Jon Heggland's RDF Video Ontology Editor
– Informedia: CMU Digital Video Library
– PhotoStuff: still image annotation for the Semantic Web

Time Line View
Provides a summary of ground truth.
Direct manipulation across frames.
Feedback for indirect manipulation.

Time Line View
Provides a summary of ground truth.
Direct manipulation.
– Quick editing of activities, events, and other CONTENT descriptors.
– Some ability to modify descriptors with dynamic attributes directly, if not the attribute values.
Feedback for indirect manipulation.
– Easier to notice massive changes.

Enhanced Keyboard Editing
Support for real-time mark-up of events and activities.
– Keys for creating and deleting activities.
– Keys for controlling the rate of display (jog dials).
Enhanced mark-up of spatial data.
– Keys for creating and editing a single descriptor's attribute.
Overall attempt to minimize effort in a GOMS model.
– Mouse events unnecessary except for polygon editing.

Enhanced Keyboard Editing: Real-time Example
User assigns keys for three content types.
Each key toggles between off/on states for its content type.
Forward/back keys decelerate/accelerate video playback; may skip frames, rewind, etc.
– In paused mode, space goes to the next frame.
A USB jog dial might be useful.
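A very rough sketch of this interaction model follows. It is not ViPER-GT code; the class, key bindings, and pause handling are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of real-time keyboard mark-up: each assigned key
// toggles one content type on or off at the current frame while the video
// plays; other keys change the playback rate or step frames when paused.
class RealTimeMarkup {
    private final Map<Character, String> keyToContentType = new HashMap<>();
    private final Map<String, Boolean> state = new HashMap<>();
    private double playbackRate = 1.0;
    private boolean paused = false;
    private int currentFrame = 0;

    void bind(char key, String contentType) {
        keyToContentType.put(key, contentType);
        state.put(contentType, Boolean.FALSE);
    }

    void onKey(char key) {
        String type = keyToContentType.get(key);
        if (type != null) {
            boolean nowOn = !state.get(type);
            state.put(type, nowOn);
            // In the real tool this would open or close a CONTENT descriptor
            // instance starting or ending at currentFrame.
            System.out.println(type + (nowOn ? " starts" : " ends") + " at frame " + currentFrame);
        } else if (key == '>') {
            playbackRate *= 2.0;      // accelerate playback
        } else if (key == '<') {
            playbackRate /= 2.0;      // decelerate playback
        } else if (key == 'p') {
            paused = !paused;         // toggle pause
        } else if (key == ' ' && paused) {
            currentFrame++;           // in paused mode, space goes to the next frame
        }
    }
}
```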

Enhanced Keyboard Editing: Spatial Example
Mode selection:
– Control-d cycles through descriptor types.
– Control-a cycles through attribute types.
– Control-s cycles through available descriptors.
Editing:
– Control-n creates a new descriptor of the given descriptor type.
– Control-f creates a new attribute of the given type if none exists.
– Arrow keys move; arrow + modifier resizes.

Frame View

Schema Editor

ViPER-GT Internals
[Architecture diagram: ViPER-GT, a video ground truth annotation tool, comprising the Schema Editor, the ViPER Metadata API, a pure-Java MPEG decoder, the AppLoader, a plug-in manager, Jena, the core GT API, plug-ins, and native decoders (VirtualDub, QuickTime, JMF).]

Latest Version in Series

Schema editor. Timeline view. Supports undo/redo. New video annotation widget.

GTF Inputter (Original V-GT)

ViPER Performance Evaluation viper-toolkit.sf.net

PE Methodology
Ground truth and results are represented by a set of descriptive records.
– Target: an object or content record delineated temporally in the ground truth, along with a set of attributes (possibly spatial).
– Candidate: an object or content record delineated temporally in the results, along with a set of attributes (possibly spatial).
Requirements:
– Matching records which are close enough to satisfy a given set of constraints on:
  – Temporal range
  – Spatial location of the object
  – Values of attributes in a data-type-specific parameter space

Detection and Localization
Detection: whether a target object or content record is properly identified.
Localization: how well the target is detected.
Simplest level:
– A target is detected if its temporal range overlaps the temporal range of a single candidate.
Qualifiers and localization constraints:
– Temporal overlap must meet a certain tolerance (% or #).
– Spatial attributes must overlap within a tolerance on a frame-by-frame basis.
– Non-spatial attributes must be within a given tolerance.
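As a sketch of the simplest rule above, the following hypothetical helper declares a target detected when the fraction of its frames covered by a candidate meets a temporal tolerance (frame ranges taken as inclusive):

```java
// Hypothetical helper for the simplest detection test: a target counts as
// detected if the fraction of its frames covered by a candidate meets the
// temporal tolerance. Frame ranges are inclusive.
final class Detection {
    static boolean detected(int targetStart, int targetEnd,
                            int candStart, int candEnd,
                            double tolerance) {
        int overlap = Math.min(targetEnd, candEnd) - Math.max(targetStart, candStart) + 1;
        if (overlap <= 0) {
            return false;
        }
        int targetLength = targetEnd - targetStart + 1;
        return (double) overlap / targetLength >= tolerance;
    }
}
```

For example, `Detection.detected(10, 50, 30, 80, 0.5)` is true: frames 30 through 50 of the 41-frame target are covered, about 51%.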

Temporal Localization Metrics
– Overlap coefficient: # or % of target frames detected.
– Dice coefficient: # or % in common; a similarity measure.
– Extent coefficient: deviation in the endpoint location of the ranges.
[Figure: a target frame range compared against several candidate ranges.]
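The hypothetical helpers below compute these three quantities for inclusive frame ranges. The overlap and Dice formulas are the standard ones; the extent measure is one plausible reading of "deviation in the endpoint location" (sum of the start and end offsets), not necessarily ViPER-PE's exact definition.

```java
// Hypothetical helpers for the temporal localization metrics above, over
// inclusive frame ranges [t0, t1] (target) and [c0, c1] (candidate).
final class TemporalMetrics {
    private static int intersection(int t0, int t1, int c0, int c1) {
        return Math.max(0, Math.min(t1, c1) - Math.max(t0, c0) + 1);
    }

    // Overlap coefficient: fraction of target frames the candidate covers.
    static double overlap(int t0, int t1, int c0, int c1) {
        return (double) intersection(t0, t1, c0, c1) / (t1 - t0 + 1);
    }

    // Dice coefficient: 2 * |T n C| / (|T| + |C|), a symmetric similarity measure.
    static double dice(int t0, int t1, int c0, int c1) {
        return 2.0 * intersection(t0, t1, c0, c1) / ((t1 - t0 + 1) + (c1 - c0 + 1));
    }

    // Extent: total deviation of the candidate's endpoints from the target's.
    static int extent(int t0, int t1, int c0, int c1) {
        return Math.abs(c0 - t0) + Math.abs(c1 - t1);
    }
}
```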

Attribute Localization
Each datatype has its own metric:
– svalue: edit distance
– point: Euclidean distance
– bboxes, oboxes, circles: overlap and dice coefficients
– bvalue, lvalue: exact match [0,1]
– Remainder: absolute difference
Object correspondence:
– Optimal subset
Temporal constraints:
– Frame-by-frame tolerance
– Virtual candidate
[Figure: target box compared against candidate boxes.]
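For the box-valued attributes, overlap and dice reduce to area ratios. A hypothetical sketch for axis-aligned bounding boxes, using `java.awt.Rectangle` (oriented boxes and circles would need their own intersection code):

```java
import java.awt.Rectangle;

// Hypothetical sketch of the box metrics: overlap (fraction of the target
// box covered by the candidate) and dice (2 * intersection / sum of areas),
// for axis-aligned bounding boxes.
final class BoxMetrics {
    static double overlap(Rectangle target, Rectangle candidate) {
        double interArea = intersectionArea(target, candidate);
        return interArea / ((double) target.width * target.height);
    }

    static double dice(Rectangle target, Rectangle candidate) {
        double interArea = intersectionArea(target, candidate);
        double areaSum = (double) target.width * target.height
                       + (double) candidate.width * candidate.height;
        return 2.0 * interArea / areaSum;
    }

    private static double intersectionArea(Rectangle a, Rectangle b) {
        Rectangle inter = a.intersection(b);
        return inter.isEmpty() ? 0.0 : (double) inter.width * inter.height;
    }
}
```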

Reporting Metrics
Detections:
– List of correct, missed, and false detections
– Summary of absolute detection scores as a percentage
– Summary of overall precision and recall
Localization:
– Optimal subset of matching frames
– Frame-by-frame tolerance
– Mean, median, SD, and maximum values reported
Issues:
– Many-to-one, many-to-many
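The overall precision and recall summaries follow directly from the three detection counts; a minimal sketch:

```java
// Hypothetical summary step: overall precision and recall computed from the
// counts of correct, missed, and false detections.
final class DetectionSummary {
    // Fraction of reported detections that are correct.
    static double precision(int correct, int falseDetections) {
        int reported = correct + falseDetections;
        return reported == 0 ? 0.0 : (double) correct / reported;
    }

    // Fraction of ground-truth targets that were found.
    static double recall(int correct, int missed) {
        int targets = correct + missed;
        return targets == 0 ? 0.0 : (double) correct / targets;
    }
}
```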

Evaluation Using “Gtfc”
Used to provide basic evaluation mechanisms.
– Requires configuration:
  – Equivalence classes
  – Evaluation specification
– Reports:
  – Attribute- and descriptor-level recall and precision

Evaluation Configuration

#BEGIN_EQUIVALENCE
DISSOLVE : FADE-IN FADE-OUT TRANSLATE
#END_EQUIVALENCE

#BEGIN_EVALUATION_LIST
CONTENT Shot-Change TYPE: CUT FADE-IN FADE-OUT
OBJECT Text TYPE: FULL OVERLAY SCENE *POSITION *CONTENT *MOTION
#END_EVALUATION_LIST

This configuration sets up equivalencies, evaluates selected subsets of the ground truth, and allows selected performance measures.

Video Evaluation
Provides metrics to judge the correctness of:
– Values of attributes
– Ranges of frames (temporal)
– Detection and localization of objects (spatial)
– Moving objects (spatio-temporal)
Degree of correctness is related to:
– Similarity of or distance between descriptors
– Cost of transformation between result and ideal data
Performance metrics are reported as:
– % of correct/incorrect instances

Metric and Tolerance Specification
Specification in the evaluation parameter file:

descriptor-type descriptor-name [METRIC TOLERANCE]
    attribute1 [METRIC TOLERANCE]
    attribute2 [METRIC TOLERANCE]

Match Scenarios
[Figure: examples of false, missed, and correct match scenarios.]

Error Graphs

Localization Graphs

Enhanced Don't Care Example
In activity detection, certain segments are often more important than others:
– The moment someone enters or exits the scene.
– The moment a thief grabs a bag.
These segments might be marked up explicitly as part of an activity descriptor and treated as important during the evaluation.

Enhanced Don't Care Regions
For object evaluation, Don't Care currently applies only to entire descriptors.
– It needs to apply to dynamic attributes at a per-frame level, as it does for framewise evaluations.
Enhanced rules are needed for computing don't-care regions spatially and temporally. For example:
– A region of the body that is not part of the torso or head is unimportant.
– Frames before this event are unimportant.
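One way to realize per-frame Don't Care for dynamic attributes is to exclude flagged frames from the frame-wise localization average, so they neither help nor hurt the score. A hypothetical sketch, not the ViPER-PE implementation:

```java
import java.util.Set;

// Hypothetical sketch of per-frame "don't care": flagged frames are skipped
// when averaging a frame-wise localization score, so they neither reward
// nor penalize the result.
final class DontCareAverager {
    static double average(double[] frameScores, Set<Integer> dontCareFrames) {
        double sum = 0.0;
        int counted = 0;
        for (int f = 0; f < frameScores.length; f++) {
            if (dontCareFrames.contains(f)) {
                continue;   // ignore don't-care frames entirely
            }
            sum += frameScores[f];
            counted++;
        }
        return counted == 0 ? 0.0 : sum / counted;
    }
}
```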

Scripting ViPER
RunEvaluation
– Runs sets of comparisons with different input parameters.