Audio Meets Image Retrieval Techniques Dave Kauchak Department of Computer Science University of California, San Diego


Image vs. Audio ? ? ? ? ? ? Classical Country Rock

Image techniques to audio. Idea: apply image retrieval (and classification) techniques to audio. An image is 2-D; audio is 1-D.

Benefits: Don’t have to reinvent the wheel. Image techniques have had fairly good success. More literature in image processing. Audio retrieval is a relatively new field.

Key Concepts and Goals: Image techniques applied to audio processing. Apply a number of different image techniques (and show they work). Relate various parts of audio to their counterparts in images. Novel data set with known ground truth. Multiple inputs for audio, e.g. raw audio.

A first step… Audio retrieval. Input: a number of songs. Output: “similar” songs from an audio database. Histogramming methods (Puzicha et al.). Wavelets instead of Gabor filters.

Basic Technique: song -> DWT -> histogram -> compare against database -> most “similar” songs.

Normal vs. Proportional Histogramming. Remember the DWT: each level has a different number of samples. Normal: histogram each level with the same number of bins. Proportional: histogram each level keeping samples per bin equal.
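The two binning schemes can be sketched as follows. This is only an illustration: the slides do not name the wavelet, so a Haar transform is assumed here, and the function names, the fixed value range `[lo, hi]`, and the rule "bins scale with the number of coefficients in the level" are assumptions about what "samples per bin equal" means.

```python
import math

def haar_dwt(signal, levels):
    """Return detail coefficients per level, finest level first (Haar wavelet assumed)."""
    approx = list(signal)
    details = []
    s = math.sqrt(2.0)
    for _ in range(levels):
        pairs = len(approx) // 2
        if pairs == 0:
            break
        detail = [(approx[2 * i] - approx[2 * i + 1]) / s for i in range(pairs)]
        approx = [(approx[2 * i] + approx[2 * i + 1]) / s for i in range(pairs)]
        details.append(detail)  # each level has half as many samples as the previous one
    return details

def histogram(values, bins, lo=-1.0, hi=1.0):
    """Count values into equal-width bins on [lo, hi]; out-of-range values go to the edge bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        counts[min(max(idx, 0), bins - 1)] += 1
    return counts

def normal_histograms(details, bins):
    """Normal: every level gets the same number of bins."""
    return [histogram(d, bins) for d in details]

def proportional_histograms(details, coarsest_bins):
    """Proportional: bin count scales with level size, so samples per bin stay equal on average."""
    n_min = len(details[-1])
    return [histogram(d, max(1, coarsest_bins * len(d) // n_min)) for d in details]
```

With a 16-sample signal and three levels, the levels hold 8, 4 and 2 coefficients, so `proportional_histograms(details, 2)` uses 8, 4 and 2 bins while `normal_histograms(details, 4)` uses 4 bins everywhere.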

Compare Histograms. Compute chi-square on each level. Sum the chi-square values over levels and use the sum as the dissimilarity measure (lower is better). Sum the dissimilarity over all input songs.
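A minimal sketch of this comparison, assuming the symmetric chi-square form in which each histogram is compared with the mean of the two; the slides do not state which variant was used. The function names and the normalization of counts to fractions are likewise assumptions.

```python
def chi_square(h, k):
    """Chi-square statistic between two histograms, each compared against their mean."""
    total_h, total_k = sum(h), sum(k)
    d = 0.0
    for a, b in zip(h, k):
        p = a / total_h if total_h else 0.0  # normalize counts to fractions
        q = b / total_k if total_k else 0.0
        m = (p + q) / 2.0
        if m > 0:
            d += (p - m) ** 2 / m
    return d

def song_dissimilarity(levels_a, levels_b):
    """Sum the per-level chi-square values; lower means more similar."""
    return sum(chi_square(h, k) for h, k in zip(levels_a, levels_b))

def rank_database(query_songs, database):
    """Rank database songs by dissimilarity summed over all input songs, best first.

    query_songs: list of per-level histogram lists, one entry per input song.
    database: dict mapping song name to its per-level histogram list.
    """
    scores = {
        name: sum(song_dissimilarity(q, levels) for q in query_songs)
        for name, levels in database.items()
    }
    return sorted(scores, key=scores.get)
```

Both songs must be histogrammed with the same bin layout per level for the bin-by-bin comparison to be meaningful.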

Ground Truth Data Set: Songs by 4 different bands (10 songs each): Dave Matthews Band, U2, Blink 182, Green Day. Mono, sampled at 22 kHz, from a number of sources.

Experiment: Input = 5 songs by a single band. Goal = pull out 5 other songs by that band. 10 random experiments per band (40 total). Normal bins: 8, 16, 32, 64, 128, 192, 256, 320, 384, 448, 512. Proportional bins: 4, 8, 16, 32, 64.
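One way such trials could be generated is sketched below. The function name, the seed, and the tuple layout are hypothetical; the slides only say that the experiments were random, with 5 input songs and 10 experiments per band.

```python
import random

def make_trials(songs_by_band, n_inputs=5, trials_per_band=10, seed=0):
    """Build (band, input_songs, target_songs) trials by random sampling within each band."""
    rng = random.Random(seed)  # fixed seed so the trials are repeatable
    trials = []
    for band, songs in songs_by_band.items():
        for _ in range(trials_per_band):
            inputs = rng.sample(songs, n_inputs)
            # the band's remaining songs are the ones retrieval should pull out
            targets = [s for s in songs if s not in inputs]
            trials.append((band, inputs, targets))
    return trials
```

With 4 bands of 10 songs each this yields the 40 trials mentioned on the slide, each with 5 input songs and 5 target songs.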

Scoring. By points: 5 pts. for a correct answer in first place, 4 pts. for a correct answer in second place, etc. Perfect = 5 + 4 + 3 + 2 + 1 = 15. Percentage correct at each place. Percentage that have a correct answer at or before each place.
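The three scores can be sketched as below, assuming each experiment yields a ranked list of the bands of the retrieved songs. The function names are illustrative, and reading the third measure as "at least one correct answer at or before place k" is an interpretation of the slide text.

```python
def points(ranked_bands, target_band, top=5):
    """5 points for a correct song in 1st place, 4 in 2nd, ... 1 in 5th; a perfect score is 15."""
    return sum(top - i for i, band in enumerate(ranked_bands[:top]) if band == target_band)

def correct_at_place(experiments, top=5):
    """Fraction of experiments with a correct answer at each place.

    experiments: list of (ranked_bands, target_band) pairs.
    """
    n = len(experiments)
    return [
        sum(1 for ranked, target in experiments if len(ranked) > i and ranked[i] == target) / n
        for i in range(top)
    ]

def correct_by_place(experiments, top=5):
    """Fraction of experiments with at least one correct answer at or before each place."""
    n = len(experiments)
    return [
        sum(1 for ranked, target in experiments if target in ranked[: i + 1]) / n
        for i in range(top)
    ]
```

For example, a ranking with correct songs in first and third place scores 5 + 3 = 8 points.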

Results: Points

Results: Points Proportional

Best Score Results: 16 bins. [Table: columns 1st, 2nd, 3rd, 4th, 5th, Score; rows Dave Matthews Band, Blink 182, U2, Green Day, Average. Cell values missing from the transcript.]

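Different Bands. [Table: columns Normal, Proportional; rows Dave Matthews Band, Blink 182, U2, Green Day, Average. Only fragments of the values survive in the transcript: “2.12” after Green Day and “2.8” after Average.]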

Percentage correct. [Table: columns 1st, 2nd, 3rd, 4th, 5th; rows Normal, Proportional. Cell values missing from the transcript.]

One last result

Summary of Results: Overall, results are not amazing. Band choice has a large influence. Normal and Proportional perform somewhat similarly. Proportional is more even over all bands. Bin size doesn’t appear to be crucial. There is a 75% chance that a song by the same band will end up in the top 5.

Next Step… Adaptive binning. Vary parameters: levels, song length. Histogram comparison methods. Another image retrieval algorithm: boosting for feature selection using a large feature set? Other? Larger and more diverse database.

Conclusion: Even though results are not fabulous, image processing techniques CAN be used for audio processing. Using bands for testing allows for ground truth. Audio files are BIG!