Utilizing Video Ontology for Fast and Accurate Query-by-Example Retrieval Kimiaki Shirahama Graduate School of Economics, Kobe University Kuniaki Uehara.

Slides:



Advertisements
Similar presentations
Image Retrieval With Relevant Feedback Hayati Cam & Ozge Cavus IMAGE RETRIEVAL WITH RELEVANCE FEEDBACK Hayati CAM Ozge CAVUS.
Advertisements

Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki
BY ANISH D. SARMA, XIN DONG, ALON HALEVY, PROCEEDINGS OF SIGMOD'08, VANCOUVER, BRITISH COLUMBIA, CANADA, JUNE 2008 Bootstrapping Pay-As-You-Go Data Integration.
Visual Event Detection & Recognition Filiz Bunyak Ersoy, Ph.D. student Smart Engineering Systems Lab.
An Approach to Evaluate Data Trustworthiness Based on Data Provenance Department of Computer Science Purdue University.
Research topics Semantic Web - Spring 2007 Computer Engineering Department Sharif University of Technology.
A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web by Livia Predoiu, Heiner Stuckenschmidt Institute of Computer Science,
Image Search Presented by: Samantha Mahindrakar Diti Gandhi.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 8 The Enhanced Entity- Relationship (EER) Model.
Multimedia Search and Retrieval Presented by: Reza Aghaee For Multimedia Course(CMPT820) Simon Fraser University March.2005 Shih-Fu Chang, Qian Huang,
Fuzzy Medical Image Segmentation
Semantics For the Semantic Web: The Implicit, the Formal and The Powerful Amit Sheth, Cartic Ramakrishnan, Christopher Thomas CS751 Spring 2005 Presenter:
Presented by Zeehasham Rasheed
Ranking by Odds Ratio A Probability Model Approach let be a Boolean random variable: document d is relevant to query q otherwise Consider document d as.
What is an Ontology? AmphibiaTree 2006 Workshop Saturday 8:45–9:15 A. Maglia.
Latent Semantic Analysis (LSA). Introduction to LSA Learning Model Uses Singular Value Decomposition (SVD) to simulate human learning of word and passage.
Multiple Object Class Detection with a Generative Model K. Mikolajczyk, B. Leibe and B. Schiele Carolina Galleguillos.
A structured learning framework for content- based image indexing and visual Query (Joo-Hwee, Jesse S. Jin) Presentation By: Salman Ahmad (270279)
DVMM Lab, Columbia UniversityVideo Event Recognition Video Event Recognition: Multilevel Pyramid Matching Dong Xu and Shih-Fu Chang Digital Video and Multimedia.
SIEVE—Search Images Effectively through Visual Elimination Ying Liu, Dengsheng Zhang and Guojun Lu Gippsland School of Info Tech,
Inferential Statistics
DOG I : an Annotation System for Images of Dog Breeds Antonis Dimas Pyrros Koletsis Euripides Petrakis Intelligent Systems Laboratory Technical University.
Retrieval Effectiveness of an Ontology-based Model for Information Selection Khan, L., McLeod, D. & Hovy, E. Presented by Danielle Lee.
Copyright R. Weber Machine Learning, Data Mining ISYS370 Dr. R. Weber.
Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.
Institute of Informatics and Telecommunications – NCSR “Demokritos” Bootstrapping ontology evolution with multimedia information extraction C.D. Spyropoulos,
Exploiting Ontologies for Automatic Image Annotation M. Srikanth, J. Varner, M. Bowden, D. Moldovan Language Computer Corporation
1 A Bayesian Method for Guessing the Extreme Values in a Data Set Mingxi Wu, Chris Jermaine University of Florida September 2007.
Text Classification, Active/Interactive learning.
UOS 1 Ontology Based Personalized Search Zhang Tao The University of Seoul.
Finding Better Answers in Video Using Pseudo Relevance Feedback Informedia Project Carnegie Mellon University Carnegie Mellon Question Answering from Errorful.
MPEG-7 Interoperability Use Case. Motivation MPEG-7: set of standardized tools for describing multimedia content at different abstraction levels Implemented.
Of 33 lecture 10: ontology – evolution. of 33 ece 720, winter ‘122 ontology evolution introduction - ontologies enable knowledge to be made explicit and.
Glasgow 02/02/04 NN k networks for content-based image retrieval Daniel Heesch.
COLOR HISTOGRAM AND DISCRETE COSINE TRANSFORM FOR COLOR IMAGE RETRIEVAL Presented by 2006/8.
Dimitrios Skoutas Alkis Simitsis
Automatic Image Annotation by Using Concept-Sensitive Salient Objects for Image Content Representation Jianping Fan, Yuli Gao, Hangzai Luo, Guangyou Xu.
Exploiting Context Analysis for Combining Multiple Entity Resolution Systems -Ramu Bandaru Zhaoqi Chen Dmitri V.kalashnikov Sharad Mehrotra.
PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL Seo Seok Jun.
Using Several Ontologies for Describing Audio-Visual Documents: A Case Study in the Medical Domain Sunday 29 th of May, 2005 Antoine Isaac 1 & Raphaël.
Machine Learning Chapter 5. Artificial IntelligenceChapter 52 Learning 1. Rote learning rote( โรท ) n. วิถีทาง, ทางเดิน, วิธีการตามปกติ, (by rote จากความทรงจำ.
Facilitating Document Annotation using Content and Querying Value.
THE SUPPORTING ROLE OF ONTOLOGY IN A SIMULATION SYSTEM FOR COUNTERMEASURE EVALUATION Nelia Lombard DPSS, CSIR.
Ontology Mapping in Pervasive Computing Environment C.Y. Kong, C.L. Wang, F.C.M. Lau The University of Hong Kong.
Generic Tasks by Ihab M. Amer Graduate Student Computer Science Dept. AUC, Cairo, Egypt.
Probabilistic Latent Query Analysis for Combining Multiple Retrieval Sources Rong Yan Alexander G. Hauptmann School of Computer Science Carnegie Mellon.
FUZZY LOGIC INFORMATION RETRIEVAL MODEL Ferddie Quiroz Canlas, ME-CoE.
October Andrew C. Gallagher, Jiebo Luo, Wei Hao Improved Blue Sky Detection Using Polynomial Model Fit Andrew C. Gallagher, Jiebo Luo, Wei Hao Presented.
Effective Automatic Image Annotation Via A Coherent Language Model and Active Learning Rong Jin, Joyce Y. Chai Michigan State University Luo Si Carnegie.
Text Categorization With Support Vector Machines: Learning With Many Relevant Features By Thornsten Joachims Presented By Meghneel Gore.
Digital Video Library Network Supervisor: Prof. Michael Lyu Student: Ma Chak Kei, Jacky.
International Conference on Fuzzy Systems and Knowledge Discovery, p.p ,July 2011.
Improved Video Categorization from Text Metadata and User Comments ACM SIGIR 2011:Research and development in Information Retrieval - Katja Filippova -
Data Mining and Decision Support
BOOTSTRAPPING INFORMATION EXTRACTION FROM SEMI-STRUCTURED WEB PAGES Andrew Carson and Charles Schafer.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Coached Active Learning for Interactive Video Search Xiao-Yong Wei, Zhen-Qun Yang Machine Intelligence Laboratory College of Computer Science Sichuan University,
Lesson 3 Measurement and Scaling. Case: “What is performance?” brandesign.co.za.
Meta-Path-Based Ranking with Pseudo Relevance Feedback on Heterogeneous Graph for Citation Recommendation By: Xiaozhong Liu, Yingying Yu, Chun Guo, Yizhou.
Ontology-based Automatic Video Annotation Technique in Smart TV Environment Jin-Woo Jeong, Hyun-Ki Hong, and Dong-Ho Lee IEEE Transactions on Consumer.
An Ontology framework for Knowledge-Assisted Semantic Video Analysis and Annotation Centre for Research and Technology Hellas/ Informatics and Telematics.
Multimedia Content-Based Retrieval
Multimedia Information Retrieval
ece 627 intelligent web: ontology and beyond
Block Matching for Ontologies
Authors: Wai Lam and Kon Fan Low Announcer: Kyu-Baek Hwang
Information Networks: State of the Art
Semantic Similarity Methods in WordNet and their Application to Information Retrieval on the Web Yizhe Ge.
Topic: Semantic Text Mining
Generalized Diagnostics with the Non-Axiomatic Reasoning System (NARS)
Presentation transcript:

Utilizing Video Ontology for Fast and Accurate Query-by-Example Retrieval Kimiaki Shirahama Graduate School of Economics, Kobe University Kuniaki Uehara Graduate School of System Informatics, Kobe University

Need for Video Ontology QBE (Query by Example): 1.Represent a query by providing example shots 2.Retrieve shots similar to example shots in terms of features The similarity on features does not agree with the similarity on semantic contents! Characterized by few edges corresponding to sky regions (Overfitting) (Example shots)(Retrieved shots) Query: Tall buildings are shown Relevant Irrelevant

Our Use of Video Ontology (Example shots) Query: Tall buildings are shown Retained Filtered Construct a video ontology as knowledge base in QBE (Formal and explicit specification of concepts, concept properties and their relations) → Filter shots that are clearly irrelevant to the query Building: 0.9 Urban: 0.7 Person: 0.1 Building: 0.8 Urban: 0.8 Person: 0 Building: 0.9 Urban: 0.6 Person: 0 Building: 0.8 Urban: 0.9 Person: 0.2 Building: 0.1 Urban: 0.0 Person: 0.8 Building: 0.0 Urban: 0.0 Person: 0.1 Building: 0.0 Urban: 0.1 Person: 0 Building: 0.9 Urban: 0.7 Person: 0.1 Concept detection scores are already assigned to all shots.

Overview of Our Approach 1.Video ontology construction: Organize concepts into a meaningful structure 2.Concept selection: Select concepts related to a given query 3.Shot filtering: Filter irrelevant shots based on selected concepts a.Filter as many irrelevant shots as possible b.Retain as many relevant shots as possible Building Outdoor TowerWindow Building Tower Outdoor Window House Concept vocabulary Video ontology Query: Buildings are shown Concept selection: Buildings, Outdoor, Tower, etc. Retained Filtered Shot filtering

Large-Scale Concept Ontology for Multimedia (LSCOM) The most popular ontology for video retrieval 1,000 concepts in broadcast news video domain Broadcast_News LocationPeopleObjects Office Meeting Crowd Face Bus Vehicle Program Weather Sports Activity and events People related Walk/Run Graphics Maps Charts Concept properties and relations are clearly insufficient! → Organize LSCOM concepts into a meaningful structure! Many research effort to develop effective detection methods for LSCOM concepts → Use detection scores of 374 concepts provided by City University of Hong Kong

appearWith Concept Organization Approach Limited kinds of reasoning (e.g. majority voting, linear combination) Various kinds of reasoning!  Inductive approach Use annotated video collection (training data) Only degrees of relatedness are represented  Deductive approach (our approach) Manually organize LSCOM concepts Various properties and relations are represented TowerHouse OBJECT Vehicle Window partOf Ground_Vehicle WITH_PERSONCar Person BicycleMotorcycle Outdoor locatedAt CONSTRUCTION Building (Wei, 2011)

Our Video Ontology Shot with 8 attributes (categories of semantic contents) Disjoint partition Visual co-occurrence Define new concepts to construct a meaningful structure

appearWith Video Ontology Construction Ground_Vehicle 1. Disjoint partition Car Vehicle Cannot be an instance of more than one subconcept! Ground_Vehicle Car Vehicle Bus 2. Visual co-occurrence Ground_Vehicle WITH_PERSON Bicycle Car Some concepts are frequently shown in the same shots All concepts should not be organized into a single hierarchy Motorcycle Person

Overview of Our Approach 1.Video ontology construction: Organize concepts into a meaningful structure 2.Concept selection: Select concepts related to a given query 3.Shot filtering: a.Filter as many irrelevant shots as possible b.Retain as many relevant shots as possible Building Outdoor TowerWindow Building Tower Outdoor Window House Concept vocabulary Video ontology Query: Buildings are shown Concept selection: Buildings, Outdoor, Tower, etc. Retained Filtered Shot filtering

Concept Selection 1.Select concepts matching words in the text description of a query 2.Select their subconcepts, and concepts specified as properties 3.Select concepts with properties matching words in the text description of the query Query: Buildings are shown 4. Validate selected concepts using example shots → Concepts detected with high detection scores are selected

Overview of Our Approach 1.Video ontology construction: Organize concepts into a meaningful structure 2.Concept selection: Select concepts related to a given query 3.Shot filtering: a.Filter as many irrelevant shots as possible b.Retain as many relevant shots as possible Building Outdoor TowerWindow Building Tower Outdoor Window House Concept vocabulary Video ontology Query: Buildings are shown Concept selection: Buildings, Outdoor, Tower, etc. Retained Filtered Shot filtering

Shot Filtering Using Concept Relations (Wrongly retained shot) Query: Person appears with computers Person, Female_Person Computer, Room, Laboratory Use concept relations to efficiently filter irrelevant shots Indoor object Indoor locations Shots where Outdoor is detected should be filtered! Simple filtering: Filter shots if none of selected concepts are detected.

Use of Concept Relations  Hierarchical relations  Sibling relations (Disjoint partition) Building Office_Building LOCATION INDOOROutdoorUnderwaterOuter_Space Should not be detected Query: Buildings are shown IF detected, THEN should be detected, Query: Person appears with computers → INDOOR needs to be detected Office_Building is wrongly detected

Overview of Our Approach 1.Video ontology construction: Organize concepts into a meaningful structure 2.Concept selection: Select concepts related to a given query 3.Shot filtering: a.Filter as many irrelevant shots as possible b.Retain as many relevant shots as possible Building Outdoor TowerWindow Building Tower Outdoor Window House Concept vocabulary Video ontology Query: Buildings are shown Concept selection: Buildings, Outdoor, Tower, etc. Retained Filtered Shot filtering

Uncertainty in Concept Detection Indoor fails to be detected Query: Person appears with computers → INDOOR needs to be detected Ontologies present a priori knowledge which is taken as true by human. → Lack the support for uncertainty Traditional ontology reasoning assumes … (Russell, 2003) 1.Locality: If A ⇒ B, then B is concluded by A without considering any other rules. 2.Detachment: Once B is proven, it is used regardless of how it was derived. → Do not consider the uncertainty of a hypothesis 3.Truth functionality: A complex rule can be examined from the truth of its components. → Does not consider the uncertainty of combining multiple hypotheses How to manage erroneous concept detection results?

Dempster-Shafer Theory (1/2) 1.Locality: If A ⇒ B, then B is concluded by A without considering any other rules. 2.Detachment: Once B is proven, it is used regardless of how it was derived. → Do not consider the uncertainty of a hypothesis ⇒ Represent the degree of belief of a hypothesis Demspter-Shafer Theory (DST): Generalization of Bayesian theory where the probability of a hypothesis is defined based on its degree of belief Degree of belief that a shot is certainly relevant m({relevance}) = 0.2 Degree of belief that its relevance is uncertain m({relevance, irrelevance}) = 0.6 Degree of belief that it is certainly irrelevant m({irrelevance}) = 0.2 Query: Person appears with computers → INDOOR needs to be detected

Dempster-Shafer Theory (2/2) 3. Truth functionality: A complex rule can be examined from the truth of its components. → Does not consider the uncertainty of combining multiple hypotheses ⇒ Combination rule for integrating uncertain hypotheses Filtering by video ontology m({relevance}) = 0.2 m({relevance, irrelevance}) = 0.6 m({irrelevance}) = 0.2 Filtering by visual –based approach m({relevance}) = 0.7 m({relevance, irrelevance}) = 0.2 m({irrelevance}) = 0.1 Conflict Agreement Query: Persons appear with computers → INDOOR needs to be detected jm({relevant}) = 0.71

Experimental Setting TRECVID 2009 video data  219 development videos (36,106 shots) → Manually select 10 example shots for each query  619 test videos (97,150 shots) → Retrieve shots matching the query Target Queries Query 1: A view of one or more tall buildings and the top story visible Query 2: One or more people, each at a table or desk with a computer visible Query 3: One or more people, each sitting in a char, talking Evaluation measures 1.Precision: The fraction of retained shots that are relevant to a query. 2.Recall: The fraction of relevant shots that are successfully retained. 3.Filter recall: The fraction of irrelevant shots that are successfully filtered. 4.Retrieval performance: The number of relevant shots within 1,000 retrieved shots Retrieval method: (Shirahama, 2011)

Importance of Concept Selection OntoWordNetVisual Shot filtering does not heavily rely on concept selection methods. (Selected concepts are similar to each other) Filter shots if none of selected concepts are detected Examine the importance of using concept relations for shot filtering Query 1Query 2Query 3Query 1Query 2Query 3Query 1Query 2Query 3 (a) Precision(b) Recall(c) Filter recall

Effectiveness of Using Concept Relations and DST Onto: Many irrelevant shots are wrongly retained (Filter recall) No-DST: Precision is significantly improved, but recall is degraded DST: Recall is recovered while improving precision and filter recall Onto (base)No-DSTDST (a) Precision(b) Recall(c) Filter recall Query 1Query 2Query 3Query 1Query 2Query 3Query 1Query 2Query 3 Best! Concept relations and DST are useful to achieve shot filtering, which not only filters many irrelevant shots, but also retains many relevant shots.

Retrieval Performance and Time Shot filtering by our video ontology is effective for both improving the retrieval performance and reducing the retrieval time! Without using shot filteringUsing shot filtering (a) Retrieval performance(b) Retrieval time Query 1Query 2Query 3Query 1Query 2Query 3

Conclusion and Future Works Conclusion Video ontology construction and utilization for fast and accurate QBE 1.Video ontology construction based on design patterns of general ontologies Disjoint partition and visual co-occurrence 2.Concept selection by tracing the video ontology 3.Shot filtering Improve precision → Concept relations Improve recall → Dempster-Shafer theory (DST) Future works Compute detection scores for self-defined concepts Estimate optimal parameter in Basic Belief Assignment (BBA) functions in DST Reduce retrieval time by parallelizing the retrieval process

Thank you!