Understanding and Predicting Interestingness of Videos
Yu-Gang Jiang, Yanran Wang, Rui Feng, Xiangyang Xue, Yingbin Zheng, Hanfang Yang
Fudan University, Shanghai, China

Presentation transcript:

Understanding and Predicting Interestingness of Videos
Yu-Gang Jiang, Yanran Wang, Rui Feng, Xiangyang Xue, Yingbin Zheng, Hanfang Yang
Fudan University, Shanghai, China
AAAI 2013, Bellevue, USA, July 2013

Motivation
Large amount of videos on the Internet – consumer videos, advertisements, …
Some videos are interesting, while many are not
[Figure: two advertisements of digital products, one more interesting and one less interesting]

Applications
Web video search
Recommendation systems
…

Related Work
Predicting aesthetics and interestingness of images – Datta et al., ECCV 2006; Dhar et al., CVPR 2011; Murray et al., CVPR 2012; …
We are the first to explore the interestingness of videos
[Figure: example images ordered from more interesting to less interesting]

Two New Datasets
Flickr dataset (source: Flickr.com) – 1,200 consumer videos, 20 hours in total
YouTube dataset (source: YouTube.com) – 420 advertisement videos, 4.2 hours in total

Flickr Dataset
Collected with 15 interestingness-enabled queries – 80 videos per category/query
For each query, the top 10% of the 400 retrieved videos are labeled interesting; the bottom 10% uninteresting
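The top/bottom selection rule above can be sketched in a few lines (the function name is illustrative; the ranking itself comes from Flickr's interestingness-enabled search):

```python
def split_by_interest(ranked_videos, frac=0.10):
    """Given one query's videos ranked from most to least interesting,
    keep the top frac as interesting and the bottom frac as uninteresting."""
    n = int(len(ranked_videos) * frac)
    return ranked_videos[:n], ranked_videos[-n:]

# 400 retrieved videos per query -> 40 interesting + 40 uninteresting = 80
interesting, uninteresting = split_by_interest(list(range(400)))
```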

YouTube Dataset
Collected with 15 ads queries on YouTube
Annotated by 10 human assessors (5 females, 5 males), who compare video pairs through an annotation interface
General observation: videos with humorous stories, attractive background music, or more professional editing tend to be more interesting

Our Computational Framework
Aim: compare two videos and tell which is more interesting
[Diagram: visual features, audio features, and high-level attribute features feed into Ranking SVMs, whose results are combined by multi-modal fusion]
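One standard way to realise the Ranking SVM component (a sketch, not necessarily the authors' exact implementation) is Joachims' pairwise reduction: every labeled pair "a more interesting than b" becomes a difference vector for a binary SVM, and the learned weight vector then scores any video:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Synthetic stand-in data: one feature row per video; pairs[i] = (a, b)
# means video a was judged more interesting than video b.
rng = np.random.default_rng(0)
features = rng.normal(size=(20, 16))
pairs = [(i, i + 10) for i in range(10)]

# Pairwise reduction: learn w such that w.(x_a - x_b) > 0 for each pair.
X_diff = np.array([features[a] - features[b] for a, b in pairs])
X = np.vstack([X_diff, -X_diff])          # mirrored pairs give both classes
y = np.array([1] * len(pairs) + [-1] * len(pairs))

clf = LinearSVC(C=1.0).fit(X, y)
scores = features @ clf.coef_.ravel()     # interestingness score per video
```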

Features
Visual features: Color Histogram, SIFT, HOG, SSIM, GIST
Audio features: MFCC, Spectrogram SIFT, Audio-Six
High-level attribute features: Classemes and ObjectBank (e.g., flower, tree, cat, face, …); Style attributes (e.g., rule of thirds, vanishing point, soft focus, motion blur, shallow DOF, …)
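As a concrete example of one entry in this feature list, a per-frame Color Histogram can be computed as below (a minimal sketch; the bin count and per-channel layout are assumptions):

```python
import numpy as np

def color_histogram(frame, bins=8):
    """L1-normalised per-channel color histogram of an RGB frame
    (H x W x 3, uint8); frame-level histograms can then be pooled
    over a video, e.g., by averaging."""
    hist = np.concatenate([
        np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ]).astype(float)
    return hist / hist.sum()

hist = color_histogram(np.zeros((4, 4, 3), dtype=np.uint8))
```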

Prediction & Evaluation
Prediction – Ranking SVM trained on our dataset; chi-square kernel for histogram-like features, RBF kernel for the others; 2/3 of the data for training and 1/3 for testing
Evaluation – prediction accuracy: the percentage of correctly ranked test video pairs
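Both pieces of this slide are easy to sketch: the (exponential) chi-square kernel typically used for histogram features, and the pairwise prediction-accuracy metric. The gamma parameter and the exact kernel variant are assumptions; the slide only says "chi square kernel":

```python
import numpy as np

def chi2_kernel(X, Y, gamma=1.0):
    """k(x, y) = exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i)),
    a common kernel choice for histogram-like features."""
    K = np.zeros((len(X), len(Y)))
    for i, x in enumerate(X):
        num = (x - Y) ** 2
        den = np.maximum(x + Y, 1e-12)      # avoid division by zero
        K[i] = (num / den).sum(axis=1)
    return np.exp(-gamma * K)

def pairwise_accuracy(scores, pairs):
    """Percentage of test pairs (a, b), with a labeled more interesting
    than b, that the predicted scores rank in the correct order."""
    correct = sum(scores[a] > scores[b] for a, b in pairs)
    return 100.0 * correct / len(pairs)
```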

Visual Feature Results
[Figure: prediction accuracies (%) of the visual features on the Flickr and YouTube datasets]

Audio Feature Results
[Figure: prediction accuracies (%) of the audio features on the Flickr and YouTube datasets]

Attribute Feature Results
[Figure: prediction accuracies (%) of the attribute features on the Flickr and YouTube datasets]
Different from predicting image interestingness

Visual+Audio+Attribute Results
[Figure: prediction accuracies (%) of Visual, Audio, Attribute, Visual+Audio, and Visual+Audio+Attribute on the Flickr and YouTube datasets; combining modalities performs best, with gains including 5.4%]
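The multi-modal combination shown here can be done as score-level (late) fusion; a minimal sketch in which equal weights and z-normalisation are both assumptions (the actual system may fuse differently, e.g., at the kernel level):

```python
import numpy as np

def fuse_scores(score_lists, weights=None):
    """Late fusion: z-normalise each modality's ranking scores so their
    scales match, then take a weighted average across modalities."""
    S = np.asarray(score_lists, dtype=float)  # (modalities, videos)
    S = (S - S.mean(axis=1, keepdims=True)) / (S.std(axis=1, keepdims=True) + 1e-12)
    k = len(S)
    w = np.full(k, 1.0 / k) if weights is None else np.asarray(weights, dtype=float)
    return w @ S                              # one fused score per video

fused = fuse_scores([[1.0, 2.0, 3.0], [10.0, 30.0, 20.0]])
```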

Summary
Conducted a pilot study on video interestingness
Built two datasets to support this study – publicly available at:
Evaluated a large number of features
– Visual + audio features are very effective
– A few features useful for image interestingness do not work in the video domain (e.g., Style attributes)

Thank you! Datasets are available at: