Tree structured and combined methods for comparing metered polyphonic music. Kjell Lemström, David Rizo Valero, José Manuel Iñesta. CMMR’08, May 21, 2008.



2 Outline: Objectives; State of the art; Tree representation of monodies and polyphonic songs; Comparison of trees for obtaining similarities between songs; Geometric methods; Combination of methods; Experiments; Conclusions and work lines

3 Melodic comparison (symbolic): given the sequences of notes in the scores … are those tunes the same?

4 Target: polyphonic music comparison of whole songs

5 Approaches to polyphonic comparison
Convert into monophonic
– Use sequence comparison
Adapted text retrieval methods
– PROMS: Clausen et al. ’00
– Doraisamy and Rüger ’04: n-grams
Geometric methods
– Lubiw and Tanur ’04
– Ukkonen, Lemström and Mäkinen ’03, plus the papers in the CMMR’08 session “MUSR: Music Retrieval”

6 Tree representation for monodies: tree construction process (Rizo et al. ’03)
Based on the logarithmic nature of music notation
Each tree level is a subdivision of the level above:
– whole: 4 beats
– half: 2+2
– quarter: 4×1
– eighth: 8×½
Leaf labels can be any pitch magnitude
Rests are coded the same way as notes
Duration is implicitly coded in the tree structure
[Figure: tree of one 4/4 bar with leaf labels F, C, E, G, annotated with initial time and duration]
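The subdivision idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: subdivisions are strictly binary (no triplets), a bar lasts 4 beats, notes are given as hypothetical `(label, onset, duration)` tuples in beats, and every note fits exactly one subdivision span. The function name `build_bar_tree` is invented for the example and is not taken from the slides.

```python
from fractions import Fraction


def build_bar_tree(notes, start=Fraction(0), length=Fraction(4)):
    """Return a leaf label, or a [left, right] pair for a subdivided span.

    notes: list of (label, onset, duration) tuples in beats. Rests are
    passed like notes, with a label such as "rest".
    """
    end = start + length
    for label, onset, dur in notes:
        # A note that exactly fills the current span becomes a leaf;
        # its duration is implied by the depth at which it sits.
        if onset == start and onset + dur == end:
            return label
    if length <= Fraction(1, 16):
        # Assumption of this sketch: give up below a sixty-fourth of a bar.
        raise ValueError("notes do not align with binary subdivisions")
    half = length / 2
    return [build_bar_tree(notes, start, half),
            build_bar_tree(notes, start + half, half)]


bar = [("F", 0, 2), ("C", 2, 1),
       ("E", 3, Fraction(1, 2)), ("G", Fraction(7, 2), Fraction(1, 2))]
tree = build_bar_tree(bar)  # ['F', ['C', ['E', 'G']]]
```

A half note sits one level below the root, a quarter two levels below, and so on, so no explicit duration field is needed. Syncopated or tied notes would need extra handling that this sketch leaves out.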

7 Tree representation: representation of whole melodies
The complete melody (all bars) is a forest (all trees)
Bars can be grouped sequentially or hierarchically
[Figure: two bar trees with leaf labels F, C, E, G and A, B, C, G; sequential grouping: CEGFABCG]

8 Polyphonic tree representation
Process repeated for each voice: single labels are replaced with sets
Labels are propagated from the bottom up using set union
Actually, the interval from the tonic is what is represented in the tree, using tree tonality guessing (Rizo et al. ’06)
[Figure: voices with leaves C, F, C, G, G, E merged into one tree with node labels {C,G}, {C}, {F}, {G}, {C,G,E}, {C,F,G} and root {C,E,F,G}; as intervals from the tonic: {0,7}, {0}, {5}, {0,5,7}, {0,4,5,7}]
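A minimal sketch of the bottom-up set-union propagation, assuming a hypothetical `Node` class whose `label` holds the intervals from the tonic sounding at that node. The class and function names are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Intervals from the tonic (0-11) sounding at this node.
    label: set = field(default_factory=set)
    children: list = field(default_factory=list)


def propagate(node):
    """Merge every child's label into its parent, bottom-up, by set union."""
    for child in node.children:
        node.label |= propagate(child)
    return node.label


# Left half: a chord {C, E, G}. Right half: a long G over two shorter notes C, F.
right = Node({7}, [Node({0}), Node({5})])
root = Node(set(), [Node({0, 4, 7}), right])
propagate(root)
```

After the call, `right.label` is `{0, 5, 7}` and `root.label` is `{0, 4, 5, 7}`, so each node summarizes everything below it.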

9 Polyphonic tree representation
Better tree summarization: use duration importance (rhythmic weights), so that each set becomes a multiset
Rhythmic weight = $2^{h-l}$, where h = tree height and l = node level
Combining the Krumhansl-Schmuckler profiles with the rhythmic weights was also tested: worse results
[Figure: tree with levels l = 1, 2, 3 and node multisets {C=1}, {F=1}, {C=2,E=2,G=2}, {C=1,F=1,G=2} and root {C=3,E=2,F=1,G=4}]
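The weighting can be sketched as follows, assuming each node is a hypothetical `(own_pitches, children)` pair, the root is at level 1, and a pitch sounding at level l contributes $2^{h-l}$ to every ancestor. The helper names are illustrative.

```python
from collections import Counter


def height(tree):
    """Number of levels in the tree (a single node has height 1)."""
    _, children = tree
    return 1 + max((height(c) for c in children), default=0)


def weighted_multiset(tree, h, level=1):
    """Multiset of pitches below this node, weighted by 2 ** (h - level)."""
    own, children = tree
    bag = Counter({pitch: 2 ** (h - level) for pitch in own})
    for child in children:
        bag += weighted_multiset(child, h, level + 1)
    return bag


# Chord {C, E, G} in the first half; a long G over C then F in the second half.
tree = (set(), [({"C", "E", "G"}, []),
                ({"G"}, [({"C"}, []), ({"F"}, [])])])
root_bag = weighted_multiset(tree, height(tree))
```

With this input the root multiset is `{C=3, E=2, F=1, G=4}`, the same values as the root label listed for the slide's figure: longer notes sit higher in the tree and therefore weigh more.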

10 Comparing songs
Compare songs = compare trees
Approaches
– Classical tree edit distances: Shasha, Selkow
– Use only the information in the roots: sequence edit distance, longest common subsequence

11 Tree comparison
Use only the information in the roots
– Each root contains the summary of its children after propagation
RootED and LCRS:
– Let $l$ be a tree level of tree $T$; compose a sequence $S_l(T)$ with all nodes at that level in the forest
– RootED and LCRS use $l = 1$
– Distance between two songs $A$ and $B$ at levels $l_a$ and $l_b$: $d(A, B, l_a, l_b) = \mathrm{stringDistance}(S_{l_a}(A), S_{l_b}(B))$ or $d(A, B, l_a, l_b) = \mathrm{LCS}(S_{l_a}(A), S_{l_b}(B))$
– Complexity with $l = 1$: $O(|\mathrm{bars}_A| \cdot |\mathrm{bars}_B|)$
[Figure: sequence of the root labels of each tree, bar 1 to bar N, e.g. {C=0.3, E=0.1, G=1}, {C=0.6, F=0.2}, {F=1, G=1, A=1, B=0.2}, {C=0.3, E=0.2, G=0.5}, {C=0.6, F=0.2}]
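A root-level edit distance can be sketched with the usual dynamic programme over two sequences of root multisets. The substitution cost `multiset_cost` below is a placeholder chosen for this example (a normalized L1 difference), not the cost used by the authors, and the unit insertion/deletion cost is likewise an assumption.

```python
def multiset_cost(x, y):
    """Hypothetical substitution cost in [0, 1]: normalized L1 difference."""
    keys = set(x) | set(y)
    diff = sum(abs(x.get(k, 0) - y.get(k, 0)) for k in keys)
    total = sum(x.values()) + sum(y.values())
    return diff / total if total else 0.0


def root_edit_distance(a, b, subst=multiset_cost, indel=1.0):
    """Edit distance between two sequences of root labels (one per bar)."""
    prev = [j * indel for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + indel,              # delete bar x
                            curr[j - 1] + indel,          # insert bar y
                            prev[j - 1] + subst(x, y)))   # substitute x by y
        prev = curr
    return prev[-1]


song_a = [{"C": 0.6, "F": 0.2}, {"C": 0.3, "E": 0.2, "G": 0.5}]
song_b = song_a + [{"F": 1, "G": 1, "A": 1, "B": 0.2}]
```

The two nested loops visit every pair of bars once, which is where the $O(|\mathrm{bars}_A| \cdot |\mathrm{bars}_B|)$ bound comes from.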

12 Multiset substitution cost
Define a multiset as a vector: index = interval from the tonic, value = cardinality
– E.g. {C=1, G=4, B=2} is defined as [1,0,0,0,0,0,0,4,0,0,0,2]
The substitution cost between multisets X and Y is computed on the vectors v and w that represent them
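The vector encoding can be illustrated with a short sketch. It assumes pitch names without octave, sharps only, and a tonic of C for the slide's example (the value that reproduces the vector given there). The names `PITCH_CLASS` and `multiset_to_vector` are invented for this example.

```python
PITCH_CLASS = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
               "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}


def multiset_to_vector(multiset, tonic="C"):
    """12-slot vector: index = interval from the tonic, value = cardinality."""
    vector = [0] * 12
    for name, count in multiset.items():
        interval = (PITCH_CLASS[name] - PITCH_CLASS[tonic]) % 12
        vector[interval] += count
    return vector


v = multiset_to_vector({"C": 1, "G": 4, "B": 2})
# v == [1, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 2]
```

Because indices are intervals from the tonic, the same multiset transposed to another key maps to the same vector once the matching `tonic` is supplied.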

13 Graphical representations
P1, P2, P3 algorithms from Ukkonen, Lemström, Mäkinen ’03
P2v5, P2v6: indexed versions of P2
– Not published yet

14 Method combination
Dissimilarity measure for a method = distance between songs
Combined dissimilarity measure = combination of distances between songs
Combination = sum of normalized distances
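A sketch of the combination step. The slide does not say how distances are normalized, so min-max scaling over the candidate list of each method is an assumption made here. Both function names are illustrative.

```python
def normalize(distances):
    """Min-max scale one method's distances to [0, 1] (assumed normalization)."""
    lo, hi = min(distances), max(distances)
    if hi == lo:
        return [0.0 for _ in distances]
    return [(d - lo) / (hi - lo) for d in distances]


def combine(per_method):
    """Sum the normalized distances of several methods, candidate by candidate.

    per_method: one list of distances per method, all in the same candidate order.
    """
    normalized = [normalize(ds) for ds in per_method]
    return [sum(column) for column in zip(*normalized)]


combined = combine([[0, 5, 10],    # e.g. a tree-based method
                    [2, 2, 4]])    # e.g. a geometric method
# combined == [0.0, 0.5, 2.0]
```

Normalizing first keeps a method with a large numeric range from dominating the sum.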

15 Experiments
Corpora:
– ICPS (68 files): 7 different polyphonic incipits: Schubert’s Ave Maria, Ravel’s Bolero, Alouette, Happy Birthday, Frère Jacques, Jingle Bells, When The Saints Go Marching In. Covers made up of polyphonic piano files + “Band in a Box” variations
– VAR (78 files): Bach Goldberg Variations, Bach English Suites variations, some Tchaikovsky variations

16 Evaluation method
Leave one out
– All-against-all: each song S is compared with the rest of the songs; the result is an ordered list with the most similar songs first
Accuracy
– Top-recognition-rate ($TRR_n$): percentage of presence of a version of the song S among the top n slots. Success rate = $TRR_0$
– Precision-at-|class|, where |class| = number of versions of the same song
Times
– Exclude preprocessing times: preprocessing is only performed once, at system startup
Averages: all results are averages over all queries
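The leave-one-out protocol can be sketched as below. Several details are assumptions of this sketch rather than statements from the slides: |class| counts the other versions of the query's song (the query itself is excluded), the first returned value is the fraction of queries whose best match is a version of the same song, and queries with no other version are skipped. The function name `evaluate` is hypothetical.

```python
def evaluate(labels, dist):
    """Leave-one-out evaluation over a full dissimilarity matrix.

    labels[i]: identity of the song that file i is a version of.
    dist[i][j]: dissimilarity between files i and j.
    Returns (best-match success rate, mean precision-at-|class|).
    """
    n = len(labels)
    hits, precisions = 0, []
    for q in range(n):
        # Rank every other file by increasing dissimilarity to the query.
        ranked = sorted((j for j in range(n) if j != q),
                        key=lambda j: dist[q][j])
        k = sum(1 for j in ranked if labels[j] == labels[q])
        if k == 0:
            continue  # no other version exists for this query
        hits += labels[ranked[0]] == labels[q]
        relevant = sum(1 for j in ranked[:k] if labels[j] == labels[q])
        precisions.append(relevant / k)
    if not precisions:
        return 0.0, 0.0
    return hits / len(precisions), sum(precisions) / len(precisions)


labels = ["a", "a", "b", "b"]
dist = [[0, 1, 5, 6],
        [1, 0, 5, 6],
        [5, 5, 0, 1],
        [6, 6, 1, 0]]
scores = evaluate(labels, dist)  # (1.0, 1.0)
```

Both figures are averaged over all queries, in line with the last item of the slide.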

17 Results: ICPS
[Figures: time and success rate; combined method: success rate]

18 Results: VAR
[Figures: success rate; combined method: again success rate]

19 Top-recognition-rate: ICPS Combined method gets a good result

20 Top-recognition-rate: VAR Combined method is the best one: combined methods are more robust

21 Conclusions and work lines
Very hard task when the MIDI files are real ones
– Preprocess songs: use automatic tonal analysis + tree propagation to remove non-important notes in songs
Improve results by combining more, different classifiers
Tune the tree comparison measures: submitted
Add the fast LCS implementation from Hyyrö ’04
Add confidence values to LCS
Include meter extraction methods to build the trees
Query MIDI

22 END

23 Melody = sequence of notes
String representation + string distances
– (Mongeau and Sankoff ’90, Lemström 2000)
– Example string: GGAGCBGGGAGAGGCBB
Symbols are combinations of pitch × rhythm
– Pitch can be: absolute pitch, pitch class, interval from tonic, interval, contour, high-def contour, nothing
– Rhythm can be: absolute, inter-onset interval, inter-onset ratio, contour, nothing
– E.g.: (G4,8)(G4,8)(A4,4)(G4,4)(C5,4)(B4,2)(G4,8)(G4,8)
Best comparison results using intervals with no rhythm information
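Three of the pitch encodings listed on this slide can be derived from note names as sketched below. The sketch assumes natural notes only, single-digit octaves, and the common convention that C4 is MIDI note 60. The helper names are invented for illustration.

```python
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def midi_number(name):
    """'G4' -> 67, assuming C4 = 60 and natural notes only."""
    return 12 * (int(name[1:]) + 1) + NOTE_OFFSETS[name[0]]


def intervals(pitches):
    """Signed semitone steps between consecutive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def contour(pitches):
    """Direction of each step: +1 up, -1 down, 0 repeated note."""
    return [(d > 0) - (d < 0) for d in intervals(pitches)]


melody = ["G4", "G4", "A4", "G4", "C5", "B4", "G4", "G4"]
pitches = [midi_number(n) for n in melody]
pitch_classes = [p % 12 for p in pitches]
steps = intervals(pitches)   # [0, 2, -2, 5, -1, -4, 0]
shape = contour(pitches)     # [0, 1, -1, 1, -1, -1, 0]
```

Intervals and contour are unchanged by transposition, while absolute pitch is not, which is one reason interval strings are attractive for comparison.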

24 String distances
Drawbacks of the comparison without rhythm
– Wrong results with:
– Same melodic distance and different rhythm: low edit distance (examples: Godfather theme; Hungarian dance; Schubert)
– Too many ornament notes: high edit distance

25 Tree representation: tree construction process
Propagation and pruning
– Rules (Rizo et al., 2003)
– Max. pruning level defined
[Figure: tree as initially coded from the score, with labels s, F, F, A, G]

26 Tree representation: melodic similarity metrics
Tree edit distance (Zhang & Shasha, 1989)
The distance $d(t_1, t_2)$ is computed as the cost of the operations needed to transform one tree into the other
Weighted operations: insertion, deletion, replacement
Tree edit distance complexity: $O(|T_1| \times |T_2| \times h(T_1) \times h(T_2))$
The previous pruning process helps to overcome this complexity
(Zhang & Shasha, “Simple fast algorithms for the editing distance between trees...”. SIAM J. Comput., 18(6), 1989)
[Figure: two trees $t_1$ and $t_2$ with leaf labels C, G, A]
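To make the idea of weighted edit operations on trees concrete, here is a sketch of the simpler top-down (Selkow-style) tree edit distance, in which roots are always matched and only whole subtrees are inserted or deleted. It is not the Zhang & Shasha algorithm cited on the slide. The tree encoding `(label, children)`, the unit costs, and the function names are assumptions of this example.

```python
def size(tree):
    """Number of nodes in a (label, children) tree."""
    return 1 + sum(size(child) for child in tree[1])


def selkow(t1, t2, subst=lambda a, b: 0 if a == b else 1):
    """Top-down tree edit distance with subtree insertion and deletion."""
    c1, c2 = t1[1], t2[1]
    # d[i][j]: cost of turning the first i children of t1 into the first j of t2.
    d = [[0] * (len(c2) + 1) for _ in range(len(c1) + 1)]
    for i in range(1, len(c1) + 1):
        d[i][0] = d[i - 1][0] + size(c1[i - 1])   # delete a whole subtree
    for j in range(1, len(c2) + 1):
        d[0][j] = d[0][j - 1] + size(c2[j - 1])   # insert a whole subtree
    for i in range(1, len(c1) + 1):
        for j in range(1, len(c2) + 1):
            d[i][j] = min(d[i - 1][j] + size(c1[i - 1]),
                          d[i][j - 1] + size(c2[j - 1]),
                          d[i - 1][j - 1] + selkow(c1[i - 1], c2[j - 1], subst))
    return subst(t1[0], t2[0]) + d[-1][-1]


t1 = ("", [("C", []), ("G", [])])
t2 = ("", [("C", []), ("", [("A", []), ("G", [])])])
distance = selkow(t1, t2)  # 3: relabel the G leaf, then insert A and G below it
```

The recursion compares every pair of child subtrees, so pruning the trees beforehand directly reduces the work, as the slide notes for the general tree edit distance.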