A Wavelet-Based Approach to the Discovery of Themes and Motives in Melodies Gissel Velarde and David Meredith Aalborg University Department of Architecture,

Presentation transcript:

A Wavelet-Based Approach to the Discovery of Themes and Motives in Melodies Gissel Velarde and David Meredith Aalborg University Department of Architecture, Design & Media Technology EuroMAC, September 2014

We present
– A computational method submitted to the MIREX 2014 Discovery of Repeated Themes & Sections task
– The results on the monophonic version of the JKU Patterns Development Database

Ground Truth: Bach’s Fugue BWV 889

Ground Truth: Chopin’s Mazurka Op. 24, No. 4

The idea behind the method
In the context of pattern discovery in monophonic pieces:
– With a good melodic structure in terms of segments, it should be possible to gather similar segments into clusters and rank their salience within the piece.

Considerations
“A good melodic structure in terms of segments”:
– Is considered to be one that is closer to the ground truth analysis (see Collins, 2014)
– Specifies certain segments or patterns
– These patterns can be overlapping and hierarchical

Considerations
We also consider other aspects of the problem:
– representation,
– segmentation,
– measuring similarity,
– clustering of segments, and
– ranking segments according to salience

The method
– Follows and extends our approach to melodic segmentation and classification based on filtering with the Haar wavelet (Velarde, Weyde and Meredith, 2013)
– Uses the idea of computing a similarity matrix for “window connectivity information” from a generic motif discovery algorithm for sequential data (Jensen, Styczynski, Rigoutsos and Stephanopoulos, 2006)

Wavelet transform
– A family of functions is obtained by translations and dilations of the mother wavelet, here the Haar wavelet.
– The wavelet coefficients of the pitch vector v for scale s and shift u are defined as the inner product of v with the wavelet at that scale and shift.
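In standard notation, these definitions can be written as follows. The symbols ψ and w, and in particular the 1/√s normalization, are assumptions based on the usual form of the continuous wavelet transform rather than a reproduction of the slide:

```latex
% Family of wavelets obtained by shifting (u) and scaling (s) the mother wavelet
\psi_{s,u}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t-u}{s}\right)

% Haar mother wavelet
\psi(t) =
\begin{cases}
  1,  & 0 \le t < 1/2 \\
  -1, & 1/2 \le t < 1 \\
  0,  & \text{otherwise}
\end{cases}

% Wavelet coefficients of the pitch vector v at scale s and shift u
w_s[u] = \langle v, \psi_{s,u} \rangle = \sum_{t} v[t]\,\psi_{s,u}(t)
```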

Representation (Velarde et al. 2013) New representation
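As a rough illustration of the representations named in this talk (pitch signal, wavelet coefficients, and wavelet coefficients modulus), the sketch below samples a melody at 16 samples per quarter note and filters it with a Haar wavelet. The function names, the (pitch, duration) input format, and the choice to evaluate only those shifts where the wavelet fits entirely inside the signal are assumptions made for illustration, not the authors' implementation.

```python
import math


def sample_pitch(notes, samples_per_qn=16):
    """Turn (midi_pitch, duration_in_qn) pairs into a sampled pitch signal."""
    signal = []
    for pitch, duration in notes:
        signal.extend([float(pitch)] * round(duration * samples_per_qn))
    return signal


def haar_coefficients(v, scale):
    """Inner product of v with a Haar wavelet spanning `scale` samples.

    Assumption: only shifts where the wavelet lies fully inside the
    signal are evaluated, so the output is shorter than the input.
    """
    if scale < 2 or scale % 2:
        raise ValueError("scale must be an even number of samples >= 2")
    half = scale // 2
    norm = 1.0 / math.sqrt(scale)
    coefficients = []
    for u in range(len(v) - scale + 1):
        positive = sum(v[u:u + half])          # wavelet is +1 here
        negative = sum(v[u + half:u + scale])  # wavelet is -1 here
        coefficients.append(norm * (positive - negative))
    return coefficients


def modulus(coefficients):
    """Absolute value of each wavelet coefficient."""
    return [abs(c) for c in coefficients]


# Two sixteenth-note-long samples runs: C4 then D4, filtered at a scale of 4 samples.
signal = sample_pitch([(60, 0.25), (62, 0.25)])
w = haar_coefficients(signal, 4)
```

With this sign convention an upward step in pitch produces negative coefficients, and the modulus peaks at the position of the step.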

First stage Segmentation (Velarde et al. 2013) New segmentation
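The three first-stage segmentation options mentioned in this talk (constant segmentation, wavelet zero-crossings, and modulus maxima) can be sketched as boundary finders over a list of wavelet coefficients. The function names and the tie-breaking rules (zero counted as non-negative, plateaus resolved to their first sample) are assumptions for illustration only.

```python
def constant_segmentation(n, segment_len):
    """Boundaries every `segment_len` samples in a signal of length n."""
    return list(range(0, n, segment_len))


def zero_crossings(w):
    """Indices where the coefficients change sign (zero counts as non-negative)."""
    return [i for i in range(1, len(w)) if (w[i - 1] >= 0) != (w[i] >= 0)]


def modulus_maxima(w):
    """Indices of local maxima of the absolute value of the coefficients."""
    return [
        i for i in range(1, len(w) - 1)
        if abs(w[i - 1]) < abs(w[i]) >= abs(w[i + 1])
    ]


def split_at(v, boundaries):
    """Cut the sequence v into segments at the given boundary indices."""
    points = sorted(set([0] + list(boundaries) + [len(v)]))
    return [v[a:b] for a, b in zip(points, points[1:])]


# Hypothetical coefficients with one dip, as produced by a single pitch step.
w = [0.0, -1.0, -2.0, -1.0, 0.0]
segments = split_at(w, modulus_maxima(w))
```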

Segmentation
– First stage segmentation: constant segmentation, wavelet zero-crossings or modulus maxima
– Comparison: distance matrix given a measure; binarized distance matrix given a threshold
– Concatenation: contiguous similar diagonal segments are concatenated
– Comparison: distance matrix given a measure
– Clustering: by agglomerative clusters from an agglomerative hierarchical cluster tree
– Ranking: criterion is the sum of the length of occurrences
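The comparison and concatenation steps can be pictured with a small sketch: compute a city-block distance matrix between segments, binarize it with a threshold, and read off runs of similar cells along the off-diagonals, which correspond to consecutive segments that repeat later in the piece. The function names, the equal-length segments, and the use of `<=` against the threshold are assumptions for illustration, not the authors' implementation.

```python
def cityblock(a, b):
    """City-block (L1) distance between two equal-length segments."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_matrix(segments, dist=cityblock):
    """Pairwise distances between all segments."""
    return [[dist(a, b) for b in segments] for a in segments]


def binarize(matrix, threshold):
    """True where two segments are at most `threshold` apart."""
    return [[d <= threshold for d in row] for row in matrix]


def diagonal_runs(similar):
    """Maximal runs of similar cells along each off-diagonal.

    Returns (i, j, length): segments i..i+length-1 are similar to
    segments j..j+length-1, with j > i.
    """
    n = len(similar)
    runs = []
    for offset in range(1, n):
        i = 0
        while i + offset < n:
            if similar[i][i + offset]:
                start = i
                while i + offset < n and similar[i][i + offset]:
                    i += 1
                runs.append((start, start + offset, i - start))
            else:
                i += 1
    return runs


# Hypothetical segments: the first two segments repeat as the last two.
segments = [[0, 1], [5, 5], [0, 1], [5, 5]]
runs = diagonal_runs(binarize(distance_matrix(segments), threshold=0))
```

Here the single run says that the two-segment unit starting at segment 0 recurs starting at segment 2, so those segments would be concatenated into one longer candidate pattern.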

Parameter combinations
We tested the following parameter combinations:
– Input: MIDI pitch
– Sampling rate: 16 samples per quarter note (qn)
– Representation: normalized pitch signal, wavelet coefficients, wavelet coefficients modulus
– Scale for representation: 1 qn
– Segmentation: constant segmentation, zero crossings, modulus maxima
– Scale for segmentation: 1 and 4 qn
– Threshold for concatenation: 0, 0.1, 1
– Distances: city-block, Euclidean, dynamic time warping (DTW)
– Clustering: agglomerative clusters from an agglomerative hierarchical cluster tree
– Number of clusters: 7
– Ranking criterion: sum of the length of occurrences
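The ranking criterion, the sum of the length of occurrences, can be sketched as a sort over clusters. The representation of an occurrence as an (onset, offset) pair in quarter notes, the function names, and the assumption that a larger total length means higher salience are all illustrative choices rather than details given in the talk.

```python
def total_length(cluster):
    """Sum of the lengths of all occurrences in one cluster."""
    return sum(offset - onset for onset, offset in cluster)


def rank_clusters(clusters):
    """Order clusters from largest to smallest total occurrence length."""
    return sorted(clusters, key=total_length, reverse=True)


# Hypothetical clusters of (onset, offset) occurrences, in quarter notes.
clusters = [
    [(0, 1), (4, 5)],    # two short occurrences, total length 2
    [(0, 4), (8, 12)],   # two long occurrences, total length 8
    [(2, 3)],            # one short occurrence, total length 1
]
ranked = rank_clusters(clusters)
```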

Evaluation
As described at MIREX 2014: Discovery of Repeated Themes & Sections:
– establishment precision, establishment recall, and establishment F1 score;
– occurrence precision, occurrence recall, and occurrence F1 score;
– three-layer precision, three-layer recall, and three-layer F1 score;
– runtime, first five target proportion and first five precision;
– standard precision, recall, and F1 score
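Of these measures, the standard precision, recall and F1 score are the simplest to sketch: a discovered pattern only counts if it matches a ground truth pattern exactly. The sketch below assumes patterns are given as frozensets of (onset, pitch) points and compares them by plain equality; the function name is hypothetical, and the task definition's handling of translated copies and the more tolerant establishment, occurrence and three-layer measures are not covered.

```python
def standard_scores(ground_truth, discovered):
    """Standard precision, recall and F1 based on exact pattern matches."""
    truth = set(ground_truth)
    found = set(discovered)
    k = len(truth & found)  # ground truth patterns recovered exactly
    precision = k / len(found) if found else 0.0
    recall = k / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical patterns as sets of (onset, MIDI pitch) points.
a = frozenset({(0, 60), (1, 62)})
b = frozenset({(4, 64), (5, 65)})
c = frozenset({(8, 67)})
d = frozenset({(9, 69)})
p, r, f1 = standard_scores(ground_truth=[a, b], discovered=[a, c, d])
```

In this example one of two ground truth patterns is found among three discovered patterns, giving a precision of 1/3 and a recall of 1/2.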

Results
On the monophonic version of the JKU Patterns Development Database:
– J. S. Bach, Fugue BWV 889
– Beethoven, Sonata Op. 2, No. 1, Movement 3
– Chopin, Mazurka Op. 24, No. 4
– Gibbons, Silver Swan
– Mozart, Sonata K. 282, Movement 2
We selected the best combinations according to representation and segmentation.

Results Fig 1. Mean F1 score: mean(F1_est, F1_occ (c=.75), F1_3, F1_occ (c=.5)).

Results Fig 2. Standard F1 score

Results Fig 3. Mean Runtime per piece.

Our MIREX Submissions VM1 and VM2
Combinations selected based on:
– mean F1 score: mean(F1_est, F1_occ (c=.75), F1_3, F1_occ (c=.5))
– standard F1 score
VM1 differs from VM2 in the following parameters:
– Normalized pitch signal representation
– Constant segmentation at the scale of 1 qn
– Threshold for concatenation: 0.1
VM2 differs from VM1 in the following parameters:
– Wavelet coefficients representation filtered at the scale of 1 qn
– Modulus maxima segmentation at the scale of 4 qn
– Threshold for concatenation: 1

Our MIREX Submissions
– Three-layer F1 (χ²(1) = 1.8, p = 0.1797): no significant difference
– Standard F1 (χ²(1) = 4, p = 0.045): VM1 preferred
– Runtime (χ²(1) = 5, p = 0.0253): VM2 preferred
Table 1. Results of VM1 on the JKU Patterns Development Database.
Table 2. Results of VM2 on the JKU Patterns Development Database.
Columns of both tables: Piece, n_P, n_Q, P_est, R_est, F1_est, P_occ (c=.75), R_occ (c=.75), F1_occ (c=.75), P_3, R_3, F1_3, Runtime (s), FFTP_est, FFP, P_occ (c=.5), R_occ (c=.5), F1_occ (c=.5), P, R, F1.
Rows of both tables: Bach, Beethoven, Chopin, Gibbons, Mozart, mean, SD.

Example: Bach's Fugue BWV 889 prototypical pattern

Observations
The segmentation stage makes the most difference to the results, depending on its parameters:
– In the first stage segmentation, the size of the scale affects the results for standard measures and runtimes
– In the first comparison, zero-crossings segmentation works best with DTW
– DTW is much more expensive to compute
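The cost difference between DTW and city-block can be seen in a minimal sketch of the textbook DTW recurrence: it fills a table with one cell per pair of samples, so comparing two segments of lengths n and m takes on the order of n·m steps, whereas city-block needs a single pass over the samples. The function name and the absolute difference as local cost are assumptions; this is not the authors' implementation.

```python
def dtw(a, b):
    """Dynamic time warping distance between two sequences.

    Fills an (n+1) x (m+1) cost table, hence O(n * m) time and memory.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a sample
                cost[i][j - 1],      # b advances, a repeats a sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same contour with one note held longer: DTW absorbs the stretch.
stretched = dtw([1, 2, 3], [1, 2, 2, 3])
```

Because DTW tolerates local stretching of this kind, it is mainly useful when occurrences of a pattern differ in timing.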

Observations
– In the comparison (after segmentation), city-block is dominant
– DTW in the comparison after segmentation is not in the best combinations, maybe because there is no ritardando or accelerando in this dataset and/or representation
– For standard measures and a smaller segmentation scale, pitch signal works better than wavelet representation
– For non-standard measures and a larger segmentation scale, modulus maxima performs slightly better than zero-crossings and constant segmentation

Conclusions Our novel wavelet-based method outperforms the methods reported by Meredith (2013) and Nieto & Farbood (2013) on the monophonic version of the JKU PDD training dataset, scoring higher on precision, recall and F1 score, and reporting faster runtimes.

Conclusions
– The segmentation stage makes the most difference to the results, depending on its parameters
– A small scale for first stage segmentation should be preferable for higher values of the standard measures, and a large scale should be preferable for shorter runtimes
– City-block should be preferable after segmentation

References
[1] T. Collins. “MIREX 2014 competition: Discovery of repeated themes and sections”, ir.org/mirex/wiki/2014:Discovery_of_Repeated_Themes_%26_Sections. Accessed on 12 May.
[2] K. Jensen, M. Styczynski, I. Rigoutsos and G. Stephanopoulos. “A generic motif discovery algorithm for sequential data”, Bioinformatics, 22:1, 2006.
[3] D. Meredith. “COSIATEC and SIATECCompress: Pattern discovery by geometric compression”, Competition on Discovery of Repeated Themes and Sections, MIREX 2013, Curitiba, Brazil, 2013.
[4] O. Nieto and M. Farbood. “Discovering Musical Patterns Using Audio Structural Segmentation Techniques”, Competition on Discovery of Repeated Themes and Sections, MIREX 2013, Curitiba, Brazil, 2013.
[5] G. Velarde, T. Weyde and D. Meredith. “An approach to melodic segmentation and classification based on filtering with the Haar-wavelet”, Journal of New Music Research, 42:4, 2013.