Large-Scale Content-Based Audio Retrieval from Text Queries

Slides:



Advertisements
Similar presentations
Using Large-Scale Web Data to Facilitate Textual Query Based Retrieval of Consumer Photos.
Advertisements

Evaluating the Robustness of Learning from Implicit Feedback Filip Radlinski Thorsten Joachims Presentation by Dinesh Bhirud
Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki
Location Recognition Given: A query image A database of images with known locations Two types of approaches: Direct matching: directly match image features.
ECG Signal processing (2)
Introduction to Information Retrieval
Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
Pete Bohman Adam Kunk.  Introduction  Related Work  System Overview  Indexing Scheme  Ranking  Evaluation  Conclusion.
Query Chains: Learning to Rank from Implicit Feedback Paper Authors: Filip Radlinski Thorsten Joachims Presented By: Steven Carr.
Stephan Gammeter, Lukas Bossard, Till Quack, Luc Van Gool.
Discriminative and generative methods for bags of features
Explorations in Tag Suggestion and Query Expansion Jian Wang and Brian D. Davison Lehigh University, USA SSM 2008 (Workshop on Search in Social Media)
ICASSP, May Arjen P. de Vries Thijs Westerveld Tzvetanka I. Ianeva Combining Multiple Representations on the TRECVID Search Task.
Distributed Search over the Hidden Web Hierarchical Database Sampling and Selection Panagiotis G. Ipeirotis Luis Gravano Computer Science Department Columbia.
Results Audio Information Retrieval using Semantic Similarity Luke Barrington, Antoni Chan, Douglas Turnbull & Gert Lanckriet Electrical & Computer Engineering.
Relevance Feedback based on Parameter Estimation of Target Distribution K. C. Sia and Irwin King Department of Computer Science & Engineering The Chinese.
Dept. of Computer Science & Engineering, CUHK Pseudo Relevance Feedback with Biased Support Vector Machine in Multimedia Retrieval Steven C.H. Hoi 14-Oct,
Presented by Zeehasham Rasheed
Semantic Similarity for Music Retrieval Luke Barrington, Doug Turnbull, David Torres & Gert Lanckriet Electrical & Computer Engineering University of California,
Scalable Text Mining with Sparse Generative Models
Review Rong Jin. Comparison of Different Classification Models  The goal of all classifiers Predicating class label y for an input x Estimate p(y|x)
MediaEval Workshop 2011 Pisa, Italy 1-2 September 2011.
Chapter 7 Web Content Mining Xxxxxx. Introduction Web-content mining techniques are used to discover useful information from content on the web – textual.
©2008 Srikanth Kallurkar, Quantum Leap Innovations, Inc. All rights reserved. Apollo – Automated Content Management System Srikanth Kallurkar Quantum Leap.
A Simple Unsupervised Query Categorizer for Web Search Engines Prashant Ullegaddi and Vasudeva Varma Search and Information Extraction Lab Language Technologies.
Improving Web Spam Classification using Rank-time Features September 25, 2008 TaeSeob,Yun KAIST DATABASE & MULTIMEDIA LAB.
Thanks to Bill Arms, Marti Hearst Documents. Last time Size of information –Continues to grow IR an old field, goes back to the ‘40s IR iterative process.
Glasgow 02/02/04 NN k networks for content-based image retrieval Daniel Heesch.
Universit at Dortmund, LS VIII
Giorgos Giannopoulos (IMIS/”Athena” R.C and NTU Athens, Greece) Theodore Dalamagas (IMIS/”Athena” R.C., Greece) Timos Sellis (IMIS/”Athena” R.C and NTU.
Video Google: A Text Retrieval Approach to Object Matching in Videos Josef Sivic and Andrew Zisserman.
Machine Learning in Ad-hoc IR. Machine Learning for ad hoc IR We’ve looked at methods for ranking documents in IR using factors like –Cosine similarity,
Distributed Information Retrieval Server Ranking for Distributed Text Retrieval Systems on the Internet B. Yuwono and D. Lee Siemens TREC-4 Report: Further.
Music Information Retrieval Information Universe Seongmin Lim Dept. of Industrial Engineering Seoul National University.
Combining Audio Content and Social Context for Semantic Music Discovery José Carlos Delgado Ramos Universidad Católica San Pablo.
PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL Seo Seok Jun.
ISQS 6347, Data & Text Mining1 Ensemble Methods. ISQS 6347, Data & Text Mining 2 Ensemble Methods Construct a set of classifiers from the training data.
Vector Space Models.
1Ellen L. Walker Category Recognition Associating information extracted from images with categories (classes) of objects Requires prior knowledge about.
Effective Automatic Image Annotation Via A Coherent Language Model and Active Learning Rong Jin, Joyce Y. Chai Michigan State University Luo Si Carnegie.
Post-Ranking query suggestion by diversifying search Chao Wang.
ASSOCIATIVE BROWSING Evaluating 1 Jinyoung Kim / W. Bruce Croft / David Smith for Personal Information.
Pete Bohman Adam Kunk.  Introduction  Related Work  System Overview  Indexing Scheme  Ranking  Evaluation  Conclusion.
NN k Networks for browsing and clustering image collections Daniel Heesch Communications and Signal Processing Group Electrical and Electronic Engineering.
Using Blog Properties to Improve Retrieval Gilad Mishne (ICWSM 2007)
Image Retrieval and Ranking using L.S.I and Cross View Learning Sumit Kumar Vivek Gupta
1 Dongheng Sun 04/26/2011 Learning with Matrix Factorizations By Nathan Srebro.
Topic Modeling for Short Texts with Auxiliary Word Embeddings
MIRA, SVM, k-NN Lirong Xia. MIRA, SVM, k-NN Lirong Xia.
A review of audio fingerprinting (Cano et al. 2005)
Multimodal Learning with Deep Boltzmann Machines
Introduction to Music Information Retrieval (MIR)
Recognizing Humans: Action Recognition
Search Engines & Subject Directories
Classification Discriminant Analysis
Classification Discriminant Analysis
The Open World of Micro-Videos
iSRD Spam Review Detection with Imbalanced Data Distributions
Unsupervised Learning II: Soft Clustering with Gaussian Mixture Models
Search Engines & Subject Directories
Search Engines & Subject Directories
Michal Rosen-Zvi University of California, Irvine
Panagiotis G. Ipeirotis Luis Gravano
Semantic Similarity Methods in WordNet and their Application to Information Retrieval on the Web Yizhe Ge.
Word embeddings (continued)
Unsupervised learning of visual sense models for Polysemous words
SVMs for Document Ranking
MIRA, SVM, k-NN Lirong Xia. MIRA, SVM, k-NN Lirong Xia.
Precision and Recall.
Bug Localization with Combination of Deep Learning and Information Retrieval A. N. Lam et al. International Conference on Program Comprehension 2017.
Presentation transcript:

Large-Scale Content-Based Audio Retrieval from Text Queries Rick van der Zwet <hvdzwet@liacs.nl> Multimedia Image Retrieval 2010

Rick van der Zwet <hvdzwet@liacs.nl> Introduction Google powered paper uses free-form text queries searches audio-content, rather than textual meta-data scale to very large number of audio documents, very rich query vocabulary 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Rick van der Zwet <hvdzwet@liacs.nl> Introduction::scope Tested using (Passive-Aggressive Model for Image Retrival) compared against: Gaussian Mixture Models Support Vector Machines 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Introduction::Sound properties short sound effects noisier and larger collection of uses-contributes user-labeled recordings isolated (sound effects recordings) or combined (movie sound tracks) 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Introduction::Earlier work few high-level categories Wan and Lu: features and metric evaluation Turnbull et al: retrieves audio documents based on text queries used training mixture models. 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Rick van der Zwet <hvdzwet@liacs.nl> The learning problem Text query q -> set of audio documents A Let R(q,A) be the set of audio document relevant to A Optimal result should give all relevant entries before the irrelevant ones The score function F(q,a) will be used to order based on decreasing scores. 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Multiple terms classification Bag-of-words representation Using a vector for to build the set. Sample: growling lion purring cat. Among the terms present in q, the terms appearing rarely in the reference corpus are more discriminant and should be assigned higher weights. 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Gaussian Mixture Model Model probability density functions of audio documents Wrong idea: every frame is independent of all other frames Train for every term apart. 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Support Vector Machines Discriminant function that maximizes the margin between the positive and negative examples Minimizing the number of miss classifications using training Conflicting goals Train for every goal apart 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Passive-Aggressive Model for Image Retrieval Matrix of with same length of the vector Score will go for the dot product No ranking-SVM = quadratic, so does not scale passive-aggressive (PA) -> try to optimize with threshold for the optional value 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Rick van der Zwet <hvdzwet@liacs.nl> Data-set Various sources By hand Common words in file-names User count being the popular factor Acoustic features using MatLab 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Rick van der Zwet <hvdzwet@liacs.nl> Experiment Use 2/3 of data for training all the rest of testing Validate against Human query database of freesound.com Match if found labels will actually match the text query for a certain hit. 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Rick van der Zwet <hvdzwet@liacs.nl> Results 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Rick van der Zwet <hvdzwet@liacs.nl> Error distribution 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Rick van der Zwet <hvdzwet@liacs.nl> Speed & Scalability PAMIR scales best SVM are quadratic with regards to training examples 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>

Rick van der Zwet <hvdzwet@liacs.nl> Discussion PAMIR method is scalable Noisy real-world labels works fine to extend. But a better way of gathering tags will be preferred. Out-of-dictionary Dynamic matching, require further research. 5/21/2018 Rick van der Zwet <hvdzwet@liacs.nl>