
1 Image Video & Multimedia Systems Laboratory / Multimedia Knowledge Laboratory, Informatics and Telematics Institute
Exploitation of knowledge in video recordings
Dr. Alexia Briassouli, Dr. Yiannis Kompatsiaris
Multimedia Knowledge Laboratory, CERTH-ITI
October 24, 2008, Thessaloniki, Greece

2 Evolution of Content
- 1–2 exabytes (millions of terabytes) of new information produced world-wide annually
- 80 billion digital images are captured each year
- Over 1 billion images related to commercial transactions are available through the Internet; this number is estimated to increase tenfold in the next two years
- New films produced each year world-wide; available films; television stations and radio stations
- 100 billion hours of audiovisual content
- Content types: personal content, sport and news, movies, web, mobile

3 Multimedia Content
- Networks, storage and devices
- Segmentation, knowledge-assisted analysis, labeling
- Cross-media analysis, context, reasoning
- Metadata generation and representation
- Content adaptation and distribution over multiple terminals and networks
- Hybrid / content-based retrieval, recommendations and personalization
- Semantic technology in markets: Web 2.0 photo and video applications

4 Need for annotation + metadata
"The value of information depends on how easily it can be found, retrieved, accessed, filtered or managed in an active, personalized way"

5 Video Analysis
Video analysis that exploits knowledge provides significant advantages:
- Improved accuracy of the semantics extracted from video
- Higher-level concepts inferred by combining knowledge (e.g. about behavior and event detection) with video processing
- More efficient storage, access, retrieval and dissemination of multimodal data, thanks to the automatically generated annotations

6 Video Analysis in JUMAS

7 Text-based indexing
Manual annotation:
+ Straightforward; high/semantic level; efficient during content creation
+ Most commonly used; necessary in a number of applications
- Time consuming
- Operator- and application-dependent
- Text-related problems (synonyms etc.)
Annotation using captions and related text (web, video, documents etc.):
+ Straightforward; high/semantic level; multimodal approach
- Text-processing restrictions and limitations
- Captions must exist

8 Addressing the Semantic Gap
The semantic gap for multimedia: mapping automatically generated, numerical low-level features to higher-level, human-understandable semantic concepts.
Example: from the dominant color descriptor of a sky region to "this image contains a sky region and is a holiday image".

9 Problem definition
- Semantic image analysis: how to translate the automatically extracted visual descriptions into human-like conceptual ones
- Low-level features provide cues that strengthen or weaken evidence based on visual similarity
- Prior knowledge is needed to support semantic disambiguation

10 A common view
[Analysis pipeline diagram]
- Single-modality analysis: feature extraction; text and image analysis; segmentation; SVMs
- Evidence generation ("Vehicle", "Building"): classifier fusion; global vs. local
- Semantic analysis: modality fusion; context ("Ambulance")
- Reasoning: fusion of annotations; consistency checking; higher-level concepts/events ("Emergency scene")
- Supporting infrastructure: knowledge infrastructure (multimedia ontology); manual annotation and models; multimedia content annotation tools; training and (statistical) modeling of the domain; additional analysis information; algorithms, features, context

11 Knowledge from Video analysis
- Semantics from video, implicitly derived via machine learning methods, i.e. based on training: SVMs, HMMs, neural networks, Bayesian networks
- Training uses appropriate data, relevant to the semantics of interest
- Training finds models that connect low-level features (e.g. motion trajectories) with high-level annotations
- These models are then applied to test data
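The training idea above can be sketched as follows. The slides name SVMs, HMMs and Bayesian networks; as a minimal stand-in, this sketch uses a nearest-centroid classifier, with toy (mean speed, direction variance) features and "walking"/"running" annotations that are illustrative assumptions, not the project's actual data.

```python
# Minimal sketch of training-based semantics: learn a model that maps
# low-level feature vectors (e.g. motion statistics) to high-level labels.
# A nearest-centroid classifier stands in for the SVMs/HMMs named above.

def train_centroids(samples):
    """samples: list of (feature_vector, label) pairs -> {label: centroid}."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def classify(centroids, vec):
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lab: dist2(centroids[lab], vec))

# Toy training data: (mean speed, direction variance) -> annotation
train = [([0.2, 0.1], "walking"), ([0.3, 0.2], "walking"),
         ([2.5, 1.8], "running"), ([2.8, 2.0], "running")]
model = train_centroids(train)
print(classify(model, [2.6, 1.9]))  # -> running
```

The trained model is then applied to unseen test data exactly as the slide describes: new feature vectors are mapped to the closest learned annotation.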

12 Classification Results (aceMedia)
Segment's hypothesis set: Natural-Person, Sailing-Boat, Sand, Building, Pavement, Road, Body-Of-Water, Cliff, Cloud, Mountain, Sea, Sky, Stone, Waterfall, Wave, Dried-Plant, Dried-Plant-Snowed, Foliage, Grass, Tree, Trunk, Snow, Sunset, Car, Ground, Lamp-Post, Statue

13 Frame Region – Concept Association
- Region feature vector formed from local descriptors
- An individual SVM is introduced for every defined local concept, receiving the region feature vector as input
- Training is identical to the global-concept training case
- Every region is evaluated by all trained SVMs, creating the segment's local-concept hypothesis set
Segment's hypothesis set: Ground: 0.89, Grass: 0.44, Mountain: 0.21, Boat: 0.07, Smoke: 0.41, Dirty-Water: 0.18, Trunk: 0.12, Foam: 0.19, Debris: 0.34, Mud: 0.31, Water: 0.42, Sky: 0.22, Ashes: 0.11, Subtitles: 0.24, Flames: 0.13, Vehicle: 0.12, Building:, Foliage: 0.84, Person: 0.32, Road: 0.39

14 Initial Region-Concept Association
- Region feature vector formed from local descriptors
- An individual SVM is introduced for every defined concept, receiving the region feature vector as input
- Training is identical to the global training case
- Every region is evaluated by all trained SVMs, creating the segment's concept hypothesis set
Segment's hypothesis set: Building: 0.89, Roof: 0.29, Grass: 0.21, Tree: 0.07, Stone: 0.41, Ground: 0.15, Dried-plant: 0.12, Sky: 0.19, Person: 0.34, Trunk: 0.31, Vegetation: 0.42, Rock: 0.22, Boat: 0.11, Sand: 0.44, Sea: 0.13, Wave: 0.12
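The hypothesis-set mechanism above can be sketched in a few lines: one scorer per concept (the slides use one SVM per concept) evaluates the same region feature vector, and the collected scores form the hypothesis set. The concept names are taken from the slide; the linear weights and 3-dimensional feature vector are purely illustrative assumptions.

```python
# Sketch of region-concept association: every region is evaluated by all
# per-concept models, producing the segment's concept hypothesis set.

def linear_scorer(weights, bias):
    return lambda feats: sum(w * f for w, f in zip(weights, feats)) + bias

# Hypothetical per-concept models over a 3-dim region feature vector
concept_models = {
    "Building": linear_scorer([0.9, 0.1, 0.0], 0.0),
    "Grass":    linear_scorer([0.0, 0.8, 0.1], 0.0),
    "Sky":      linear_scorer([0.1, 0.0, 0.9], 0.0),
}

def hypothesis_set(region_features):
    """Score the region against every concept, best hypothesis first."""
    scores = {c: m(region_features) for c, m in concept_models.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(hypothesis_set([0.9, 0.2, 0.1]))  # "Building" ranks first
```

Keeping the full ranked set, rather than only the top concept, is what lets the later reasoning stage revise region labels using context.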

15 Knowledge for Video analysis
- Explicit semantics from video, based on previously known models
- Explicitly defined models, rules and facts; rules from preliminary scripts and standards from similar cases
- Explicit and implicit knowledge can be combined with results from low-level video processing to extract meaningful high-level knowledge

16 System Overview
Multimedia content → video analysis (face recognition, motion segmentation etc.) + knowledge infrastructure (explicit or implicit) → semantic multimedia description

17 Video analysis
- Motion analysis: motion detection; tracking; detection of when motion occurs
- Motion segmentation: object segmentation based on motion characteristics; generation of 'active regions'
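The motion-detection step above is commonly bootstrapped by frame differencing; a minimal sketch, with frames as toy 2-D intensity grids and a threshold chosen only for illustration:

```python
# Minimal motion-detection sketch: frame differencing flags pixels whose
# intensity changes beyond a threshold -- the usual first step before
# tracking and the generation of 'active regions'.

def motion_mask(prev, curr, threshold=10):
    return [[abs(c - p) > threshold for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

def motion_detected(mask, min_pixels=1):
    return sum(v for row in mask for v in row) >= min_pixels

prev = [[10, 10, 10], [10, 10, 10]]
curr = [[10, 10, 10], [10, 90, 10]]   # one pixel changed: an object moved in
mask = motion_mask(prev, curr)
print(motion_detected(mask))  # -> True
```

Accumulating such masks over time is one way to delimit the active regions the slide mentions.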

18 Activity Areas from motion analysis

19 Sub-activity Areas
- Obtained after statistical processing for temporal localization of motion and events
- Example: people walking towards each other; people meet; people leave together

20 Fight Sequence

21 Video Processing (1)
- Pre-processing: separate video from audio; split video into frames; noise removal via spatiotemporal filtering
- Scene/shot detection:
  - Shot = frames taken by a single camera; detected via transitions between frames, using only low-level information
  - Scene = story-telling unit; uses higher-level knowledge and semantics
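The shot-detection step really does need only low-level information, as the slide says. A common approach, sketched here with toy frames, compares grey-level histograms of consecutive frames and declares a cut when they differ strongly; the bin count and threshold are assumptions to be tuned per application.

```python
# Shot-boundary sketch: an abrupt transition (cut) is declared when the
# grey-level histograms of consecutive frames differ beyond a threshold.

def histogram(frame, bins=4, max_val=256):
    h = [0] * bins
    for row in frame:
        for px in row:
            h[px * bins // max_val] += 1
    return h

def is_cut(f1, f2, threshold=0.5):
    h1, h2 = histogram(f1), histogram(f2)
    total = sum(h1)
    # Normalized histogram difference in [0, 1]
    diff = sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * total)
    return diff > threshold

dark  = [[20, 30], [25, 35]]
light = [[200, 210], [220, 230]]
print(is_cut(dark, dark))   # -> False (same shot)
print(is_cut(dark, light))  # -> True  (abrupt transition)
```

Grouping the detected shots into story-telling scenes is where the higher-level knowledge mentioned on the slide comes in.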

22 Video Processing (2)
- Spatial segmentation: in images and video frames; extracts objects based on color and texture features
- Motion segmentation: groups pixels with similar motion
- Spatiotemporal segmentation: finds objects over several frames by combining motion and appearance features; merges spatial and motion segmentation results
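The motion-segmentation idea above ("group pixels with similar motion") can be sketched with a simple threshold grouping over motion vectors; a real system would cluster a dense optical-flow field, so the sample points and tolerance here are illustrative assumptions.

```python
# Motion-segmentation sketch: sample points with similar motion vectors
# are grouped into one object; dissimilar motion starts a new group.

def group_by_motion(points, tol=1.0):
    """points: list of (x, y, dx, dy); returns a list of groups (lists)."""
    groups = []
    for p in points:
        for g in groups:
            ref = g[0]
            if abs(p[2] - ref[2]) <= tol and abs(p[3] - ref[3]) <= tol:
                g.append(p)       # motion similar to this group's reference
                break
        else:
            groups.append([p])    # no similar group found: new object
    return groups

flow = [(1, 1, 5.0, 0.0), (2, 1, 5.2, 0.1),   # object moving right
        (8, 8, 0.0, 0.0), (9, 8, 0.1, 0.0)]   # static background
print(len(group_by_motion(flow)))  # -> 2
```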

23 Knowledge in Video Analysis (1)
- Low-level features can be combined with knowledge/rules for higher-level results
- Spatiotemporally segmented objects can be used for object recognition; face/gesture recognition after training with faces/gestures of significance
- Motion in specific parts of a video (e.g. near the court entrance, near the prisoner's seat) has additional significance; this needs prior knowledge of which parts of the video frames are important and why
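The last rule above, motion becoming significant only in known-important parts of the frame, can be sketched as a lookup of the motion centroid against named areas. The area names and coordinates are hypothetical, standing in for the courtroom knowledge the slide describes.

```python
# Sketch of a knowledge rule: motion is significant only when it falls
# inside a region that domain knowledge marks as important.

IMPORTANT_AREAS = {                 # (x0, y0, x1, y1) in frame coordinates
    "court_entrance": (0, 0, 100, 50),
    "prisoner_seat": (200, 150, 260, 220),
}

def significant_motion(cx, cy):
    """Return the named area containing the motion centroid, if any."""
    for name, (x0, y0, x1, y1) in IMPORTANT_AREAS.items():
        if x0 <= cx <= x1 and y0 <= cy <= y1:
            return name
    return None

print(significant_motion(220, 180))  # -> prisoner_seat
print(significant_motion(500, 500))  # -> None
```

The same low-level motion detector thus yields different high-level meaning depending on the prior knowledge encoded in the area table.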

24 Knowledge in Video Analysis (2)
- Knowledge structures can provide additional information about the relations between different low-level features
- Interactions, e.g. two motions in opposite directions or relations between extracted gestures, may mean something: people meeting, fighting, pointing, gesticulating
- Face recognition combined with prior knowledge can show who is present when an event occurs
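One such interaction rule, two motions in opposite directions interpreted as people meeting, can be sketched over tracked trajectories. The monotone-approach test and the contact radius are illustrative assumptions; a deployed system would learn or hand-tune these per domain.

```python
# Sketch of an interaction rule: two tracked objects whose distance
# shrinks over time and ends below a contact radius are labelled "meeting".

def distance(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def interaction_label(traj_a, traj_b, contact_radius=5.0):
    """traj_*: lists of (x, y) positions sampled at the same time instants."""
    dists = [distance(p, q) for p, q in zip(traj_a, traj_b)]
    approaching = all(d2 <= d1 for d1, d2 in zip(dists, dists[1:]))
    if approaching and dists[-1] <= contact_radius:
        return "meeting"
    return "no interaction"

left  = [(0, 0), (5, 0), (10, 0)]     # moving right
right = [(20, 0), (15, 0), (11, 0)]   # moving left, towards the other
print(interaction_label(left, right))  # -> meeting
```

Richer labels (fighting, pointing) would follow the same pattern with additional rules over the extracted features.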

25 Conclusions
- Combining video processing with knowledge leads to richer and more accurate high-level descriptions of multimedia data
- It can serve many more applications than currently, because the knowledge introduces flexibility and adaptability: the same algorithms and low-level features provide much more information when used in combination with explicit and implicit knowledge

26 Thank you!
CERTH-ITI / Multimedia Knowledge Laboratory

27 Video Analysis State of the Art
Spatiotemporal segmentation: find spatiotemporally homogeneous objects, i.e. with similar appearance and motion
- Apply spatial segmentation on each frame
- Match segmented objects in successive frames using low-level features (e.g. similar color, texture, continuous motion)
- Use motion information to project the position of each object into the current/next frames
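The matching step above can be sketched as follows: each object in frame t is matched to the frame t+1 region with the most similar low-level feature, but only among candidates close to the motion-predicted position. The object dictionaries, the scalar "color" feature and the shift bound are all toy assumptions.

```python
# Sketch of frame-to-frame object matching: combine a motion-prediction
# gate (projected position) with low-level feature similarity (color).

def match_objects(prev_objs, next_objs, max_shift=15.0):
    """Objects: dicts with 'pos' (x, y), 'vel' (dx, dy), 'color' scalar."""
    matches = {}
    for pid, po in prev_objs.items():
        pred = (po["pos"][0] + po["vel"][0], po["pos"][1] + po["vel"][1])
        best, best_cost = None, float("inf")
        for nid, no in next_objs.items():
            shift = ((no["pos"][0] - pred[0]) ** 2 +
                     (no["pos"][1] - pred[1]) ** 2) ** 0.5
            if shift > max_shift:
                continue                          # violates motion prediction
            cost = abs(no["color"] - po["color"])  # low-level similarity
            if cost < best_cost:
                best, best_cost = nid, cost
        matches[pid] = best
    return matches

prev_objs = {"car": {"pos": (0, 0), "vel": (10, 0), "color": 0.8}}
next_objs = {"a": {"pos": (11, 1), "color": 0.75},
             "b": {"pos": (60, 60), "color": 0.8}}
print(match_objects(prev_objs, next_objs))  # -> {'car': 'a'}
```

Note that region "b" has the identical color but fails the predicted-position gate, which is exactly why the slide combines motion with appearance features.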

28 Video Analysis State of the Art

29 Video Analysis State of the Art

30 Video Analysis State of the Art
Spatial segmentation in images and video frames:
- Region-based: most methods group similar features such as color, texture and location, based on homogeneity of intensity, texture and position
- Gradient/edge-based: detect changes in the spatial distribution of features, e.g. pixel illumination
- Some methods combine region and edge information
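The region-based family above can be sketched with classic region growing: starting from a seed pixel, absorb 4-connected neighbours whose intensity stays homogeneous with the seed. The toy image and tolerance are assumptions for illustration only.

```python
# Region-growing sketch (region-based spatial segmentation): grow from a
# seed by absorbing 4-connected neighbours with homogeneous intensity.

def region_grow(img, seed, tol=20):
    h, w = len(img), len(img[0])
    base = img[seed[0]][seed[1]]
    region, stack = set(), [seed]
    while stack:
        r, c = stack.pop()
        if (r, c) in region or not (0 <= r < h and 0 <= c < w):
            continue
        if abs(img[r][c] - base) > tol:
            continue                  # not homogeneous with the seed
        region.add((r, c))
        stack += [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    return region

img = [[10, 12, 200],
       [11, 13, 210],
       [205, 210, 220]]
print(len(region_grow(img, (0, 0))))  # -> 4 (the dark top-left region)
```

An edge-based method would instead threshold the intensity gradient; combined methods use the detected edges to stop the growth.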