A Web-based System for Collaborative Annotation of Large Image and Video Collections
An Evaluation and User Study
MULTIMEDIA '05: Proceedings of the 13th Annual ACM International Conference on Multimedia
Authors: Timo Volkmer, John R. Smith, and Apostol (Paul) Natsev
Presented by: Thay Setha
Outline
1. Introduction
2. Related work
3. The IBM EVA Annotation System
4. Application and Evaluation
5. Conclusion and Future Work
Introduction
Research and development of video and image search systems has become very popular. Annotated collections of images and videos are a necessary basis for the successful development of multimedia retrieval systems. Large training datasets need to be annotated both completely and accurately.
Introduction
In this paper, the authors describe and evaluate a web-based system for collaborative annotation of large collections of images or temporally pre-segmented videos: the IBM Efficient Video Annotation (EVA) system.
– Optimized for collaborative annotation
– Features workload sharing
– Supports inter-annotator analysis
EVA was developed for the 2005 TRECVID Annotation Forum, to annotate approximately 80 hours of video for the 2005 TRECVID Video Retrieval Evaluation benchmark.
Introduction
The major focus in the design of this system was on usability:
– Simplifying and speeding up the annotation process
– Maintaining configuration and customization options
– Supporting different annotation styles, which is a necessity for a large user base of annotators:
– Customizable number, size, and layout of thumbnails displayed per page (annotating only a few images per page without scrolling vs. scrolling and annotating many images at a time)
– Using the mouse or keyboard for navigation and annotation
– Selecting one or more concepts to annotate at a time
Introduction
The EVA tool was designed with a few simplifying assumptions to promote consistency, simplicity, and speed of annotation:
– All annotations use terms from a small controlled-term vocabulary; no free-text annotations are allowed.
– All annotations are for static visual concepts only and can be inferred from a single key frame, without requiring users to play back video clips.
– All annotations are assigned at the global frame level only and are assumed to apply to the entire shot. No object identification or regional annotation is required.
Related work
Informedia Image Classifier, by the Informedia team at Carnegie Mellon University:
– Semi-supervised image classification in a standalone Microsoft Windows application
– Does not provide statistics during collaborative annotation
VIPER Annotation Tool, by the Computer Vision Group at the University of Geneva:
– Features temporal segmentation, browsing, and event characterization
ESP Game:
– Web-based system that annotates images with custom concepts in a game-like environment
– Confidence of each annotation is computed from how many users agree on a particular concept for an image
Ricoh MovieTool:
– MPEG-7-based system for video annotation
– Automatic shot segmentation and hierarchical annotation
– Complicated user interface
The IBM EVA Annotation System
The EVA system is a new web-based image and video annotation application that users access through a web browser. During annotation, users navigate page by page through the entire set of images. Users can specify parameters such as:
– the number of thumbnails per page
– their organization in columns
– the thumbnail size, where a thumbnail can be either an image or a representative frame of a video segment
An annotator selects a video and one or more concepts to use in a session.
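The paper gives no code; as a minimal sketch under that caveat, the per-session display parameters could be modeled as a small configuration object (all names and defaults below are hypothetical):

```python
from dataclasses import dataclass

# Hypothetical sketch of EVA's per-session display settings; the paper
# describes these parameters but does not give an implementation.
@dataclass
class SessionConfig:
    thumbnails_per_page: int = 24  # how many thumbnails to show per page
    columns: int = 6               # thumbnails are organized in columns
    thumbnail_px: int = 120        # thumbnail edge length in pixels

# Example: a user who prefers fewer, larger thumbnails without scrolling
config = SessionConfig(thumbnails_per_page=12, columns=4, thumbnail_px=160)
```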
The IBM EVA Annotation System
The IBM EVA Annotation System
Each image can be assigned one of four labels with regard to the currently selected concept:
– Positive: The image can clearly be classified with the given concept.
– Negative: The image can clearly be classified as not possessing the given concept.
– Ignore: The semantics of the image are not clearly expressed, and it should not be used for classification with the given concept (e.g., blurred images and imperfect frames).
– Skip: The image remains unannotated for now and will be reviewed later. This is the default state.
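A minimal sketch of these four label states as a simple enum; the names mirror the slide, not the paper's actual code:

```python
from enum import Enum

# Hypothetical sketch of EVA's four per-image labels (not from the paper's code).
class Label(Enum):
    POSITIVE = "positive"  # image clearly shows the concept
    NEGATIVE = "negative"  # image clearly does not show the concept
    IGNORE = "ignore"      # semantics unclear (e.g., blurred frame); exclude from training
    SKIP = "skip"          # not yet annotated; the default state

# Every thumbnail starts in the default state:
default_label = Label.SKIP
```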
The IBM EVA Annotation System
A custom concept lexicon can be loaded, and each annotator can be assigned either the full lexicon or a part thereof, then choose one or more of the assigned concepts for an annotation session. Labels are assigned for one concept at a time; the user decides which and how many concepts are used in the current session. This leads to more accurate and complete annotations, as opposed to annotating all available concepts at once.
The IBM EVA Annotation System
Bulk annotation buttons:
– Bulk-positive: All thumbnails on the page currently marked as "skip" are set to "positive".
– Bulk-negative: All thumbnails on the page currently marked as "skip" are set to "negative".
– Bulk-ignore: All thumbnails on the page currently marked as "skip" are set to "ignore".
– Bulk-skip: All thumbnails on the current page are reset to "skip", regardless of their current state.
Bulk buttons act only on previously unlabelled thumbnails, except for bulk-skip, which clears all annotations for a given concept and page of thumbnails. This enables efficient annotation of very rare or very frequent concepts by assuming a default of "negative" or "positive" labels, which is easier than going through each image and assigning its label separately.
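A minimal sketch of the bulk-button semantics described above; function and variable names are assumptions, not the paper's implementation:

```python
# Hypothetical sketch of the bulk-button logic; labels are plain strings here.
def apply_bulk(page_labels, action):
    """Apply one bulk action to a page of thumbnail labels."""
    if action == "skip":
        # Bulk-skip resets every thumbnail on the page, regardless of state.
        return ["skip"] * len(page_labels)
    # Bulk-positive/-negative/-ignore act only on thumbnails still at "skip".
    return [action if label == "skip" else label for label in page_labels]

page = ["skip", "positive", "skip", "negative"]
print(apply_bulk(page, "negative"))  # ['negative', 'positive', 'negative', 'negative']
print(apply_bulk(page, "skip"))      # ['skip', 'skip', 'skip', 'skip']
```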
The IBM EVA Annotation System
Besides using the mouse, annotation can be performed more efficiently with the keyboard once a user has gone through a brief training phase. Labels can be assigned with a single keystroke, and when a label is assigned via keyboard, the cursor automatically advances to the next thumbnail. The system tracks annotation progress, with statistics for each video and each concept assigned to the current user.
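A minimal sketch of single-keystroke labelling with auto-advance; the key bindings are invented for illustration, since the paper does not list them:

```python
# Hypothetical key bindings; the paper does not specify EVA's actual keys.
KEYMAP = {"p": "positive", "n": "negative", "i": "ignore", "s": "skip"}

def handle_keystroke(labels, cursor, key):
    """Assign a label at the cursor and advance to the next thumbnail."""
    action = KEYMAP.get(key)
    if action is None:
        return cursor  # unknown key: no label change, cursor stays put
    labels[cursor] = action
    return min(cursor + 1, len(labels) - 1)  # auto-advance after labelling

labels = ["skip"] * 4
cursor = 0
for key in "ppn":
    cursor = handle_keystroke(labels, cursor, key)
print(labels)  # ['positive', 'positive', 'negative', 'skip']
```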
The IBM EVA Annotation System
They added the ability to collect aggregate-level user data during annotation:
– time spent on each page
– number and size of thumbnails
– statistics about keyboard and mouse usage
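A minimal sketch of the kind of per-page usage record this implies; the field names are invented, as the paper only lists the categories of data collected:

```python
from dataclasses import dataclass

# Hypothetical per-page usage record; not the paper's actual schema.
@dataclass
class PageUsageRecord:
    user_id: str
    seconds_on_page: float   # time spent on each page
    thumbnails_on_page: int  # number of thumbnails shown
    thumbnail_px: int        # size of the thumbnails shown
    keyboard_events: int     # keystrokes used for labelling/navigation
    mouse_events: int        # mouse clicks used for labelling/navigation
```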
Application and Evaluation
The collection of video clips provided for annotation consisted of 137 television news and entertainment broadcasts in Chinese, Arabic, and English. Each video was temporally segmented into shots, and one representative frame was selected for each shot, resulting in 61,904 frames.
Application and Evaluation
During the annotation effort, they monitored statistics such as inter-user agreement, average annotation time, concept frequency, and progress per concept to study the manual annotation process. Along 7 semantic dimensions, the concepts in the lexicon are grouped into 7 categories:
– Category A: Program Category (7 concepts)
– Category B: Setting/Scene/Site (15 concepts)
– Category C: People (8 concepts)
– Category D: Objects (8 concepts)
– Category E: Activities (2 concepts)
– Category F: Events (2 concepts)
– Category G: Graphics (2 concepts)
Application and Evaluation
Figure 3: Correlation between concept frequency and inter-user disagreement. Pearson's correlation coefficient is r = 0.73 for all concepts and r = 0.78 when omitting "Urban", "Vegetation", "Entertainment", and "Police/Security".
The inter-annotator disagreement is based on shots with redundant annotations, i.e., shots that two or more users annotated with the same concept, considering only "positive" and "negative" labels. They observe a high correlation between concept frequency and inter-user disagreement; as could be expected, frequent concepts tend to cause more disagreement.
Application and Evaluation
Figure 4: Inter-user disagreement for all concepts, normalized by concept frequency. Concepts such as "Urban", "Vegetation", "Entertainment", and "Police/Security" clearly stand out.
Figure 4 is obtained by normalizing the inter-annotator disagreement by concept frequency. This confirms that some concepts stand out with relatively high disagreement. They conclude that this might be caused by an unclear specification of these concepts.
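A rough sketch of how the statistics behind Figures 3 and 4 could be computed; the paper gives no formulas, so the disagreement measure below is one plausible reading, run on toy data invented for illustration:

```python
from statistics import correlation  # Pearson's r; available in Python 3.10+

# Hypothetical sketch of the disagreement analysis; not the paper's exact measure.
def concept_stats(shots):
    """shots: per-shot label lists for one concept, e.g. [['positive', 'negative'], ...]."""
    pos = conflicts = redundant = 0
    for labels in shots:
        votes = [l for l in labels if l in ("positive", "negative")]
        if "positive" in votes:
            pos += 1
        if len(votes) >= 2:            # shot annotated redundantly by 2+ users
            redundant += 1
            if "positive" in votes and "negative" in votes:
                conflicts += 1         # users disagreed on this shot
    frequency = pos / len(shots)
    disagreement = conflicts / redundant if redundant else 0.0
    return frequency, disagreement

# Toy data for three concepts (invented for illustration):
concepts = {
    "Person":  [["positive", "positive"], ["positive", "negative"], ["positive"]],
    "Urban":   [["positive", "negative"], ["negative", "positive"], ["negative"]],
    "Weather": [["negative", "negative"], ["negative", "negative"], ["positive"]],
}
freqs, disags = zip(*(concept_stats(s) for s in concepts.values()))
r = correlation(freqs, disags)                               # Figure 3: Pearson's r
norm = [d / f if f else 0.0 for d, f in zip(disags, freqs)]  # Figure 4: normalized
print(f"Pearson r = {r:.2f}")
```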
Application and Evaluation
Figure 5: Average concept inter-user disagreement, frequency, and annotation time as a function of concept category. Concept categories are ordered by annotation time per frame.
Figure 5 shows average concept frequency, average annotation time per frame, and normalized inter-user disagreement for all concept categories. The categories "Objects" and "Events" require a long time to annotate but show low disagreement, which leads to the conclusion that these concepts are generally more complex to annotate: they are well defined and can be identified, but require much attention.
Application and Evaluation
Figure 6: Average annotation time per frame as a function of concept frequency. Rare and frequent concepts on average required more time to label (computed variance σ² = 0.22).
They grouped all concepts according to their frequency and evaluated the average annotation time for each group. Figure 6 shows that rare and frequent concepts required more time to annotate than concepts of medium frequency. The bulk annotation buttons likely contributed to annotation times not differing greatly across concepts. This is supported by user feedback suggesting that this feature was generally appreciated.
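A rough sketch of the Figure 6 analysis, binning concepts by frequency and averaging annotation time per bin; the bin count and data are invented for illustration:

```python
# Hypothetical sketch: group concepts by frequency, average the time per group.
def bin_by_frequency(concepts, n_bins=5):
    """concepts: list of (frequency, seconds_per_frame) pairs; frequency in [0, 1]."""
    bins = [[] for _ in range(n_bins)]
    for freq, secs in concepts:
        idx = min(int(freq * n_bins), n_bins - 1)  # map frequency to a bin index
        bins[idx].append(secs)
    # Average annotation time per bin; None where a bin is empty.
    return [sum(b) / len(b) if b else None for b in bins]

toy = [(0.02, 1.9), (0.05, 1.7), (0.3, 1.1), (0.5, 1.2), (0.85, 1.8)]
print(bin_by_frequency(toy))  # rare and frequent bins show higher average times
```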
Application and Evaluation
Figure 9: Average annotation time per frame, grouped by the primary input device of users.
Finally, they evaluated how the input device users choose affects the efficiency of annotation. The largest group of users (44%) primarily used the keyboard, 17% primarily used the mouse, and the remaining 39% could not be clearly classified and used both.
Conclusion and Future Work
They presented a new web-based system for collaborative image and video annotation, the IBM Efficient Video Annotation (EVA) system:
– Provides an efficient user interface and powerful back-end features such as annotation statistics and user-level workload distribution
– Initial evaluation through analysis of annotation time and quality
– High inter-annotator agreement
– Data security, as it allows scheduled backups of all data on the server side
Future work:
– Conduct an in-depth quantitative evaluation
– Use the system as a research platform to study human-computer interaction
– Add more features, such as machine learning techniques and improved browsing and filtering methods
Thank You!