An Architecture for Mining Resources Complementary to Audio-Visual Streams J. Nemrava, P. Buitelaar, N. Simou, D. Sadlier, V. Svátek, T. Declerck, A. Cobet, T. Sikora, N. O'Connor, V. Tzouvaras, H. Zeiner, J. Petrák

Introduction Video retrieval can strongly benefit from textual sources related to the A/V stream. Vast textual resources available on the web can be used for fine-grained event recognition. A good example is sport-related video  Summaries of matches Tabular (lists of players, cards, substitutions) Textual (minute-by-minute reports)

Available Resources Audio-Video Streams  A/V analysis captures features from the video using suitable detectors Primary Complementary  Directly attached to the media Overlay text, spoken commentaries Secondary Complementary  Independent from the media Written commentaries, summaries, analyses

Audio-Video Analysis Crowd image detector Speech-Band Audio Activity On-Screen Graphics Tracking Motion activity measure Field Line orientation Close-up

Primary complementary resources Video track  Overlay text OCR text region detection Time synchronization  Merging 16 frames to distinguish moving from static objects in the video  Textual information such as overlay text and player numbers provides an additional primary resource Audio track  Speech commentaries

Secondary Complementary Resources Tabular  Summaries, lists of players, goals, cards  “meta” information Location, referee, attendance, date

Secondary Complementary Resources Unstructured  Several minute-by-minute sources  Text analysis and event extraction using SProUT Player actions Player names  German and English  SProUT: an ontology-based IE tool ‘A beautiful pass by Ruud Gullit set up the first Rijkaard header.’
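For illustration only: SProUT is an ontology-based extraction tool with rich grammars, so this sketch merely shows the underlying idea of spotting player-action terms in a minute-by-minute commentary line. The action lexicon and the `extract_actions` helper are hypothetical, not part of SProUT.

```python
import re

# Hypothetical action lexicon; SProUT draws these from the SmartWeb ontology.
ACTIONS = {"pass", "header", "shot", "cross", "foul", "save"}

def extract_actions(text):
    """Return the football actions mentioned in a minute-by-minute entry."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t in ACTIONS]

print(extract_actions("A beautiful pass by Ruud Gullit set up the first Rijkaard header."))
# ['pass', 'header']
```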

Ontology SProUT uses the SmartWeb football ontology for  Player action  Referee action  Trainer action

Architecture Overview

Reasoning over complementary resources of football games Textual Sources (per coarse-grained minute)  Extraction of semantic concepts from unstructured texts using DFKI ontology based information extraction tool Video Analysis (for every second) - DCU  Crowd image detector – values range ∈ [0,1]  Speech-Band Audio Activity - values range ∈ [0,1]  Motion activity measure - values range ∈ [0,1]  Close-up - values range ∈ [0,1]  Field Line orientation - values range ∈ [0,90]
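The text events arrive per coarse-grained minute while the detector values arrive per second, so combining them requires aggregating the video features into minute windows. A minimal sketch assuming max-pooling as the aggregation (one plausible choice for illustration; the deck itself describes a threshold- and occurrence-based fuzzification over 20-second windows):

```python
def per_minute_max(values_per_second, seconds_per_minute=60):
    """Collapse a per-second detector series into one value per minute (max)."""
    return [max(values_per_second[i:i + seconds_per_minute])
            for i in range(0, len(values_per_second), seconds_per_minute)]

# Two minutes of crowd-detector readings: a spike near the end of minute 0.
crowd = [0.1] * 59 + [0.9] + [0.2] * 60
print(per_minute_max(crowd))  # [0.9, 0.2]
```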

Video Analysis Fuzzification A period of 20 seconds is evaluated  A threshold value was set according to the detector's mean value during the game  Top value was mapped to [0,1] Similar process for the motion, close-up and crowd detectors

Video Analysis Fuzzification Line angle  Values between 0-7 are Middle Field  Values between are End of Field  Fuzzification according to their occurrences in the period of 20 seconds  Example Middle Field 13 occurrences Fuzzy Value = 0.65 End of Field 4 occurrences Fuzzy Value = 0.2 Other 3 occurrences Fuzzy Value = 0.15
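The occurrence counting above can be sketched in a few lines: each field-zone label's fuzzy value is its share of the 20 one-second samples in the window. The label names follow the slide; the `fuzzify_zones` helper itself is illustrative.

```python
def fuzzify_zones(zone_labels, window=20):
    """Fuzzy value per zone = occurrences within the 20-second window / window size."""
    assert len(zone_labels) == window
    return {zone: zone_labels.count(zone) / window for zone in set(zone_labels)}

# The slide's example: 13x Middle Field, 4x End of Field, 3x Other.
labels = ["MiddleField"] * 13 + ["EndOfField"] * 4 + ["Other"] * 3
fv = fuzzify_zones(labels)
print(fv["MiddleField"], fv["EndOfField"], fv["Other"])  # 0.65 0.2 0.15
```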

Declaring Alphabet … Concepts = {Scoringopportunity Outofplay Handball Kick Scoregoal Cross Foul Clear Cornerkick Dribble Freekick Header Trap Shot Throw Pass Ballpossession Offside Charge Lob Challenge Booked Goalkeeperdive Block Save Substitution Tackle EndOfField MiddleField Other Crowd Motion CloseUp Audio} Roles = {consistOf} Individuals = {min0 sec20 sec40 sec60 min1 sec80 sec100 sec120 min2 sec140 sec160 sec180 min3 sec200…}

Knowledge Representation- ABox 〈 min1 : Kick ≥ 1 〉 〈 min1 : Scoregoal ≥ 1 〉 〈 sec80 : Audio ≥ 0.06 〉 〈 sec80 : Crowd ≥ 〉 〈 sec80 : Motion ≥ 〉 〈 sec80 : EndOfField ≥ 0.05 〉 〈 (min1 : sec60 ) : consistOf ≥ 1 〉 〈 (min1 : sec80 ) : consistOf ≥ 1 〉 〈 (min1 : sec100 ) : consistOf ≥ 1 〉 〈 (min1 : sec120 ) : consistOf ≥ 1 〉
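A minimal sketch of how such fuzzy ABox assertions could be held as plain data: each triple states that an individual belongs to a concept with at least the given degree. The individuals, concepts and degrees follow the slide; the actual system uses a fuzzy DL reasoner (FiRe), and the `degree` helper here is hypothetical.

```python
# Fuzzy concept assertions: (individual, concept, lower bound on membership degree).
abox = [
    ("min1", "Kick", 1.0),
    ("min1", "Scoregoal", 1.0),
    ("sec80", "Audio", 0.06),
    ("sec80", "EndOfField", 0.05),
]
# Role assertions: min1 consistOf sec60, sec80, ...
roles = [("min1", "consistOf", "sec60"), ("min1", "consistOf", "sec80")]

def degree(individual, concept):
    """Greatest asserted lower bound for (individual, concept), or 0.0 if none."""
    return max((d for i, c, d in abox if i == individual and c == concept), default=0.0)

print(degree("min1", "Kick"))    # 1.0
print(degree("sec80", "Crowd"))  # 0.0 (no assertion)
```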

Knowledge Representation- TBox

Query Examples

Architecture Overview

Cross-Media Features Basic idea  Identify which video detectors are most prominent for which event class  For instance, for CORNERKICK the “end-zone” video detector should be significantly high Strategy  Analyze the distribution of video detectors over event classes  Identify significant detectors for each class  Feed the results back into the video event detection algorithm
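The strategy above can be illustrated with a small sketch: average each detector's fuzzy value over the segments labelled with an event class, then flag detectors whose class mean clearly exceeds their overall mean. The segment data, detector names and the 0.2 margin are made-up examples, not the paper's actual statistics.

```python
def prominent_detectors(segments, event, margin=0.2):
    """segments: list of (event_label, {detector: fuzzy value}) pairs.
    Returns detectors whose mean within `event` exceeds the overall mean by `margin`."""
    detectors = {d for _, feats in segments for d in feats}
    result = []
    for d in detectors:
        overall = sum(f.get(d, 0.0) for _, f in segments) / len(segments)
        in_class = [f.get(d, 0.0) for e, f in segments if e == event]
        if in_class and sum(in_class) / len(in_class) >= overall + margin:
            result.append(d)
    return sorted(result)

segments = [
    ("CORNERKICK", {"EndOfField": 0.9, "Motion": 0.4}),
    ("CORNERKICK", {"EndOfField": 0.8, "Motion": 0.5}),
    ("OTHER",      {"EndOfField": 0.1, "Motion": 0.5}),
    ("OTHER",      {"EndOfField": 0.2, "Motion": 0.4}),
]
print(prominent_detectors(segments, "CORNERKICK"))  # ['EndOfField']
```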

Cross-Media Features The purpose of the cross-media descriptors is to capture the features and relations in multimodal data so as to be able to retrieve complementary information when dealing with one of the data sources  Build a model to classify events in video independently of the video itself Use of cross-media features in event-type classification of video segments through fuzzy reasoning with the FiRe inference engine  FiRe is focused on event retrieval

Thank you for your attention