MUSCLE- Network of Excellence Movie Summarization and Skimming Demonstrator ICCS-NTUA (P. Maragos, K. Rapantzikos, G. Evangelopoulos, I. Avrithis) AUTH.

Slides:



Advertisements
Similar presentations
Feature extraction: Corners
Advertisements

DONG XU, MEMBER, IEEE, AND SHIH-FU CHANG, FELLOW, IEEE Video Event Recognition Using Kernel Methods with Multilevel Temporal Alignment.
TP14 - Local features: detection and description Computer Vision, FCUP, 2014 Miguel Coimbra Slides by Prof. Kristen Grauman.
Computer Vision Lecture 16: Texture
DL:Lesson 11 Multimedia Search Luca Dini
MPEG-4 Objective Standardize algorithms for audiovisual coding in multimedia applications allowing for Interactivity High compression Scalability of audio.
Broadcast News Parsing Using Visual Cues: A Robust Face Detection Approach Yannis Avrithis, Nicolas Tsapatsoulis and Stefanos Kollias Image, Video & Multimedia.
ICIP 2000, Vancouver, Canada IVML, ECE, NTUA Face Detection: Is it only for Face Recognition?  A few years earlier  Face Detection Face Recognition 
Toward Semantic Indexing and Retrieval Using Hierarchical Audio Models Wei-Ta Chu, Wen-Huang Cheng, Jane Yung-Jen Hsu and Ja-LingWu Multimedia Systems,
IEEE TCSVT 2011 Wonjun Kim Chanho Jung Changick Kim
Personal Driving Diary: Constructing a Video Archive of Everyday Driving Events IEEE workshop on Motion and Video Computing ( WMVC) 2011 IEEE Workshop.
Feature extraction: Corners 9300 Harris Corners Pkwy, Charlotte, NC.
Real-Time Audio-Visual Automatic Speech Recognition Demonstrator TSI-TUC, Greece (A. Potamianos, E. Sanchez-Soto, M. Perakakis) NTUA, Greece (P. Maragos,
Generic Object Recognition -- by Yatharth Saraf A Project on.
Multimedia Search and Retrieval Presented by: Reza Aghaee For Multimedia Course(CMPT820) Simon Fraser University March.2005 Shih-Fu Chang, Qian Huang,
1 Interest Operator Lectures lecture topics –Interest points 1 (Linda) interest points, descriptors, Harris corners, correlation matching –Interest points.
Research activities at AUTH related to dialogue detection Ioannis Pitas Constantine Kotropoulos Nikos Nikolaidis Research activities at AUTH related to.
Laurent Itti: CS599 – Computational Architectures in Biological Vision, USC Lecture 6: Low-level features 1 Computational Architectures in Biological.
Automatic Image Alignment (feature-based) : Computational Photography Alexei Efros, CMU, Fall 2005 with a lot of slides stolen from Steve Seitz and.
MUSCLE movie data base is a multimodal movie corpus collected to develop content- based multimedia processing like: - speaker clustering - speaker turn.
Feature extraction: Corners and blobs
LYU 0102 : XML for Interoperable Digital Video Library Recent years, rapid increase in the usage of multimedia information, Recent years, rapid increase.
Segmentation (Section 10.2)
ICCS-NTUA Contributions to E-teams of MUSCLE WP6 and WP10 Prof. Petros Maragos National Technical University of Athens School of Electrical and Computer.
A Probabilistic Framework for Video Representation Arnaldo Mayer, Hayit Greenspan Dept. of Biomedical Engineering Faculty of Engineering Tel-Aviv University,
Jacinto C. Nascimento, Member, IEEE, and Jorge S. Marques
김덕주 (Duck Ju Kim). Problems What is the objective of content-based video analysis? Why supervised identification has limitation? Why should use integrated.
MUSCLE WP9 E-Team Integration of structural and semantic models for multimedia metadata management Aims: (Semi-)automatic MM metadata specification process.
DVMM Lab, Columbia UniversityVideo Event Recognition Video Event Recognition: Multilevel Pyramid Matching Dong Xu and Shih-Fu Chang Digital Video and Multimedia.
Introduction to Image Processing Grass Sky Tree ? ? Review.
Media Retrieval Information Retrieval Image Retrieval Video Retrieval Audio Retrieval Information Retrieval Image Retrieval Video Retrieval Audio Retrieval.
SVCL Automatic detection of object based Region-of-Interest for image compression Sunhyoung Han.
1 Interest Operators Harris Corner Detector: the first and most basic interest operator Kadir Entropy Detector and its use in object recognition SIFT interest.
Characterizing activity in video shots based on salient points Nicolas Moënne-Loccoz Viper group Computer vision & multimedia laboratory University of.
Multimodal Information Analysis for Emotion Recognition
黃文中 Introduction The Model Results Conclusion 2.
Simfo Marco Adelfio, Dan Bucatanschi, Bo Liu, Nick Violi.
Prof. Thomas Sikora Technische Universität Berlin Communication Systems Group Thursday, 2 April 2009 Integration Activities in “Tools for Tag Generation“
Feature extraction: Corners 9300 Harris Corners Pkwy, Charlotte, NC.
776 Computer Vision Jan-Michael Frahm, Enrique Dunn Spring 2013.
MedIX – Summer 07 Lucia Dettori (room 745)
Street Smarts: Visual Attention on the Go Alexander Patrikalakis May 13, XXX.
MMDB-9 J. Teuhola Standardization: MPEG-7 “Multimedia Content Description Interface” Standard for describing multimedia content (metadata).
Digital Video Library Network Supervisor: Prof. Michael Lyu Student: Ma Chak Kei, Jacky.
Miguel Tavares Coimbra
Detection of Illicit Content in Video Streams Niall Rea & Rozenn Dahyot
1 Detecting Group Interest-level in Meetings Daniel Gatica-Perez, Iain McCowan, Dong Zhang, and Samy Bengio IDIAP Research Institute, Martigny, Switzerland.
Pascal Kelm Technische Universität Berlin Communication Systems Group Thursday, 2 April 2009 Video Key Frame Extraction for image-based Applications.
CS654: Digital Image Analysis
CS Spring 2010 CS 414 – Multimedia Systems Design Lecture 4 – Audio and Digital Image Representation Klara Nahrstedt Spring 2010.
Lecture 04 Edge Detection Lecture 04 Edge Detection Mata kuliah: T Computer Vision Tahun: 2010.
Instructor: Mircea Nicolescu Lecture 5 CS 485 / 685 Computer Vision.
Image Features (I) Dr. Chang Shu COMP 4900C Winter 2008.
 Mentor : Prof. Amitabha Mukerjee Learning to Detect Salient Objects Team Members - Avinash Koyya Diwakar Chauhan.
Keypoint extraction: Corners 9300 Harris Corners Pkwy, Charlotte, NC.
Visual Information Retrieval
Detecting Semantic Concepts In Consumer Videos Using Audio Junwei Liang, Qin Jin, Xixi He, Gang Yang, Jieping Xu, Xirong Li Multimedia Computing Lab,
Traffic Sign Recognition Using Discriminative Local Features Andrzej Ruta, Yongmin Li, Xiaohui Liu School of Information Systems, Computing and Mathematics.
TP12 - Local features: detection and description
Color-Texture Analysis for Content-Based Image Retrieval
Davide Nardo, Valerio Santangelo, Emiliano Macaluso  Neuron 
Cheng-Ming Huang, Wen-Hung Liao Department of Computer Science
Detecting Artifacts and Textures in Wavelet Coded Images
Text Detection in Images and Video
Introduction Computer vision is the analysis of digital images
Computer Vision Lecture 16: Texture II
CSE 455 – Guest Lectures 3 lectures Contact Interest points 1
Detection of salient points
Introduction Computer vision is the analysis of digital images
Lecture 5: Feature invariance
Presentation transcript:

MUSCLE- Network of Excellence Movie Summarization and Skimming Demonstrator ICCS-NTUA (P. Maragos, K. Rapantzikos, G. Evangelopoulos, I. Avrithis) AUTH (C. Kotropoulos, P. Antonopoulos, V. Moschou, N. Nikolaidis, I. Pitas) INRIA-IRISA (P. Gros) TSI-TUC (A. Potamianos, M. Perakakis) MUSCLE Showcase:

MUSCLE- Network of Excellence Audio-VisualAttention Modeling – Event Detection Audio-Visual Attention Modeling – Event Detection Detecting events by attention modeling Two-module (aural, visual) attention for 3D event histories Attention curve extraction. Fusing streams vs. fusing features Visual Fusion Audio Saliency Map Feature Vector Visual Attention Audio Attention User Attention Curve Event Detection

MUSCLE- Network of Excellence MUSCLE Review II, April 2006 Audio Saliency Audio signal model: sum of AM-FM components Modulation bands through a linear bank of K Gabor filters. Tracking the maximum average Teager Energy (MTE) : k-th filter response, :Teager-Kaiser Energy operator MTE : dominant signal modulation energy. Demodulating, via DESA, the dominant channel and frame average

MUSCLE- Network of Excellence Spatiotemporal Visual Saliency Features –Intensity –Color –Spatiotemporal orientations Feature intra- and inter- competition

MUSCLE- Network of Excellence MUSCLE Review II, April 2006 AudioVisual Fusion – User attention curve Simple linear fusion scheme Detecting events by 4 curve characteristics: –Peak/valley detection (key-frame selection) Local maxima\minima –Sharp transition detection (1D edges) LoG operator on curve Scale parameter by std of Gaussian –Thresholding values (salient segments) –Region of peak support (lobes, segments between edges where maxima exist) Two fusion schemes: –i) Fuse curves (linear, non-linear fusion) –ii) Detect in audio and video and combine (e.g. AND,OR)

MUSCLE- Network of Excellence MUSCLE Review II, April 2006 User Attention Curve

MUSCLE- Network of Excellence MUSCLE Review II, April 2006 Key frame selection AudioVideo Fusion

MUSCLE- Network of Excellence MUSCLE Review II, April 2006 Examples of Audio/Video event Examples of Audio/Video event enhancement Video suppresses/groups audio events (audio event present)  Audio & Video events match (both are present)  Audio giving event (video event absent)

MUSCLE- Network of Excellence Movie Database Description 42 scenes were extracted from 6 movies of different genres, i.e., Analyze That, Lord of the Rings, Secret Window, Platoon, Jackie Brown, Cold Mountain. 25 out of the 42 scenes are dialogue instances and the remaining 17 are annotated as non- dialogue scenes. Dialogue scenes last from 20 sec to 120 sec. Total duration: 34 min and 43 sec.

MUSCLE- Network of Excellence Scene Annotation Dialogue types for both audio and video streams are: –CD (Clean Dialogue) –BD (Dialogue with background) Non-Dialogue types for both audio and video streams are: –CM (Clean Monologue) –BM (Monologue with background) –ND (Other)

MUSCLE- Network of Excellence Database Description gt folder: ground truth information (*.xml files). video folder: the video streams without the audio channel (*.avi files). audio folder: the audio streams without the visual channel (*.wav files). actors index: actor’s Id, name, and photograph (*.xls file). Actors info is also available in xml format for each video scene.