National Taiwan University

Presentation transcript:

DTW for QBSH
J.-S. Roger Jang (張智星)
MIR Lab, CSIE Dept., National Taiwan University
http://mirlab.org/jang

Dynamic Time Warping (DTW)
Goal: Allow comparison with high tolerance to tempo variation
Characteristics:
- Robust to irregular tempo variations
- Trial and error needed to deal with key transposition
- Computationally expensive
- Does not satisfy the triangle inequality
- Some indexing algorithms do exist

Type-1 DTW
t: input pitch vector (8 sec)
r: reference pitch vector
Local paths: 27-45-63 degrees
3-step formula for type-1 DTW (with anchored beginning)
[Figure: DTW table with axes i (input t) and j (reference r); labels t(i-1), t(i), r(j-1), r(j)]
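The 27-45-63 degree local paths suggest that cell (i, j) is reached from (i-1, j-2), (i-1, j-1), or (i-2, j-1). The following is a minimal sketch under that reading, assuming an absolute pitch difference as the local distance and a free end position; the function name `dtw_type1` is hypothetical and not taken from the slides.

```python
import math

def dtw_type1(t, r):
    """Sketch of type-1 DTW: anchored beginning, free end position."""
    m, n = len(t), len(r)
    # Two layers of infinite padding so that i-2 and j-2 are always valid indices.
    D = [[math.inf] * (n + 2) for _ in range(m + 2)]
    for i in range(m):
        for j in range(n):
            cost = abs(t[i] - r[j])  # assumed local distance
            if i == 0 and j == 0:
                D[2][2] = cost       # anchored beginning
                continue
            best = min(D[i + 1][j],      # from (i-1, j-2): 63-degree step
                       D[i + 1][j + 1],  # from (i-1, j-1): 45-degree step
                       D[i][j + 1])      # from (i-2, j-1): 27-degree step
            D[i + 2][j + 2] = cost + best
    # Free end: best accumulated distance for the last input frame.
    return min(D[m + 1][2:])
```

Cells that no local path can reach keep an infinite distance, so they never win the final minimum.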

Type-2 DTW
t: input pitch vector (8 sec)
r: reference pitch vector
Local paths: 0-45-90 degrees
3-step formula for type-2 DTW (with anchored beginning)
[Figure: DTW table with axes i (input t) and j (reference r); labels t(i-1), t(i), r(j-1), r(j)]
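With 0-45-90 degree local paths, cell (i, j) can be reached from (i-1, j), (i-1, j-1), or (i, j-1). A minimal sketch under that reading, again assuming an absolute pitch difference as the local distance and a free end position; `dtw_type2` is a hypothetical name.

```python
import math

def dtw_type2(t, r):
    """Sketch of type-2 DTW: anchored beginning, free end position."""
    m, n = len(t), len(r)
    # One layer of infinite padding (row 0 and column 0).
    D = [[math.inf] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = abs(t[i - 1] - r[j - 1])  # assumed local distance
            if i == 1 and j == 1:
                D[i][j] = cost               # anchored beginning
            else:
                D[i][j] = cost + min(D[i - 1][j],      # 0-degree step
                                     D[i - 1][j - 1],  # 45-degree step
                                     D[i][j - 1])      # 90-degree step
    return min(D[m][1:])  # free end position
```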

Local Path Constraints
Type 1: 27-45-63 local paths
Type 2: 0-45-90 local paths

Path Penalty
Goal: Avoid paths that deviate from 45 degrees
- Small or no penalty for the 45-degree path
- Large penalty for paths that deviate from 45 degrees
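One way to realize such a penalty is to add a constant to the accumulated distance whenever a non-diagonal step is taken. The sketch below does this for 0-45-90 local paths; the `penalty` parameter, its default value, and the function name are assumptions for illustration only.

```python
import math

def dtw_type2_penalized(t, r, penalty=1.0):
    """Sketch: 0-45-90 DTW where non-diagonal steps pay an extra penalty."""
    m, n = len(t), len(r)
    D = [[math.inf] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = abs(t[i - 1] - r[j - 1])
            if i == 1 and j == 1:
                D[i][j] = cost
                continue
            D[i][j] = cost + min(D[i - 1][j] + penalty,  # 0-degree step, penalized
                                 D[i - 1][j - 1],        # 45-degree step, no penalty
                                 D[i][j - 1] + penalty)  # 90-degree step, penalized
    return min(D[m][1:])  # free end position
```

Setting `penalty=0` reduces this to plain 0-45-90 DTW, so the effect of the penalty can be compared directly on the same pair of sequences.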

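Weighted DTW Distance
Observations:
- At the start of a note, the user's pitch is unstable
- In the latter part of a note, the user's pitch is more stable and close to the note's pitch
Weighted DTW distance:
- At the start of a note, the weight function w(j) is smaller
- In the latter part of a note, the weight function w(j) is larger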
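The slide does not give the shape of w(j). As one illustrative possibility, the sketch below ramps the weight linearly from a small value at each note onset up to 1, and multiplies the local distance by it. The ramp length, the minimum weight, and both function names are assumptions.

```python
def note_weights(r, ramp=3, w_min=0.2):
    """Sketch: per-frame weights for a frame-level reference pitch vector r.

    A note onset is assumed wherever the reference pitch changes.
    """
    weights = []
    pos = 0  # frames elapsed since the current note started
    for j, pitch in enumerate(r):
        if j > 0 and pitch != r[j - 1]:
            pos = 0  # new note: restart the ramp
        frac = min(pos / ramp, 1.0)
        weights.append(w_min + (1.0 - w_min) * frac)
        pos += 1
    return weights

def weighted_local_distance(t_i, r, w, j):
    """Local distance between input frame value t_i and reference frame j."""
    return w[j] * abs(t_i - r[j])
```

In a DTW recurrence, `weighted_local_distance` would simply replace the unweighted local distance.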

DTW Paths of “Anchored Beginning”
Anchored beginning: the end position is free to move
Assumption: The speed of a user's acoustic input is between 1/2 and 2 times that of the intended song.
DTW table size for 8-sec query = 250x180
250 = 31.25*8
375 = 250*1.5
[Figure: DTW table with axes i and j]

DTW Paths of “Anchored Anywhere”
Anchored anywhere: both ends are free to move
DTW table size for 8-sec query against 3-min song = 250 x 5625
250 = 31.25*8
5625 = 31.25*180
[Figure: DTW table with axes i and j]
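Freeing both ends can be sketched by letting the first input frame start at any reference position with no accumulated cost, and taking the minimum over all reference positions for the last input frame. The sketch below assumes 27-45-63 local paths and an absolute pitch difference; `dtw_anchored_anywhere` is a hypothetical name.

```python
import math

def dtw_anchored_anywhere(t, r):
    """Sketch: DTW where both the start and end positions in r are free.

    Returns (distance, end_index), where end_index is the position in r
    matched to the last input frame.
    """
    m, n = len(t), len(r)
    D = [[math.inf] * (n + 2) for _ in range(m + 2)]  # two-layer padding
    for j in range(n):
        D[2][j + 2] = abs(t[0] - r[j])  # free start: any j may begin the path
    for i in range(1, m):
        for j in range(n):
            best = min(D[i + 1][j],      # from (i-1, j-2)
                       D[i + 1][j + 1],  # from (i-1, j-1)
                       D[i][j + 1])      # from (i-2, j-1)
            D[i + 2][j + 2] = abs(t[i] - r[j]) + best
    last = D[m + 1][2:]
    end = min(range(n), key=last.__getitem__)  # free end
    return last[end], end

# Usage: the query matches the middle of the reference.
distance, end = dtw_anchored_anywhere([5, 6, 7], [1, 2, 5, 6, 7, 9])
```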

[Figure: numeric example of filling a DTW table, shown over two slides]

DTW Code Walkthrough
Computation of D(i,j):
[Figure: DTW table with j = 1 to 11 on the vertical axis and i = 1 to 9 on the horizontal axis, with a two-element padding layer]

Implementation Issues
To save memory:
- Use 2-column table for type-1 DTW
- Use 1-column table for type-2 DTW
To avoid too many if-then statements:
- Pad type-1 DTW with two-layer padding
- Pad type-2 DTW with one-layer padding
To find a suitable path:
- Minimizing total distance
- Minimizing average distance
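The memory and padding ideas can be sketched together for 0-45-90 local paths: a single column is updated in place, and one padding cell at index 0 removes the boundary checks. This is an illustrative sketch, not the toolbox code; the function name is hypothetical and only the total distance is minimized.

```python
import math

def dtw_type2_one_column(t, r):
    """Sketch: 0-45-90 DTW using a single column plus one padding cell."""
    n = len(r)
    col = [math.inf] * (n + 1)
    col[0] = 0.0  # lets the first cell start the path with zero accumulated cost
    for ti in t:
        diag = col[0]      # D(i-1, j-1) for j = 1
        col[0] = math.inf  # padding cell for the current column
        for j in range(1, n + 1):
            prev_col = col[j]  # still holds D(i-1, j) before being overwritten
            col[j] = abs(ti - r[j - 1]) + min(prev_col, diag, col[j - 1])
            diag = prev_col
    return min(col[1:])  # free end position
```

Only the distance is available this way; recovering the mapping path would require keeping the full table or separate back-pointers.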

Other Variants
- Local constraints
- Flexible start/end positions

DTW Path of “Anchored Beginning”

DTW Path of “Anchored Anywhere”

Another Two Views of DTW Path of “Anchored Anywhere”

Demos of DTW
- Match beginning
- Match anywhere: toolbox/dcpr/dtw/goDemoMelodyPath01.m
- Match anywhere: toolbox/dcpr/dtw/goDemoMelodyPath02.m
- Alignment and note segmentation: Toolbox/dcpr/dtw/goDemoNoteCut.m

Key Transposition (1/2)
Goal: Allow users' input in different keys
Method 1: Mean shift and heuristic modification
- 5 DTW computations when compared to each song
[Figure: key-shift candidates relative to the mean: t-2, t, t+2 (chosen as t'), then t'-1 and t'+1; axis ticks from -4 to 4]
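One reading of the figure is a two-stage search: after the mean shift, try offsets of -2, 0, and +2, then refine by ±1 around the best of those, for 5 DTW computations in total. The sketch below follows that reading. Shifting to the mean of the whole reference, the compact `dtw` stand-in, and the name `transpose_search` are all assumptions.

```python
import math

def dtw(t, r):
    """Compact 0-45-90 DTW (anchored beginning, free end), used as a stand-in."""
    n = len(r)
    col = [0.0] + [math.inf] * n
    for ti in t:
        diag, col[0] = col[0], math.inf
        for j in range(1, n + 1):
            prev_col = col[j]
            col[j] = abs(ti - r[j - 1]) + min(prev_col, diag, col[j - 1])
            diag = prev_col
    return min(col[1:])

def transpose_search(t, r, dist=dtw):
    """Sketch: mean shift followed by a coarse-then-fine key search."""
    shift0 = sum(r) / len(r) - sum(t) / len(t)  # mean shift (whole-reference mean assumed)

    def score(offset):
        return dist([x + shift0 + offset for x in t], r)

    coarse = {o: score(o) for o in (-2, 0, 2)}  # 3 DTW computations
    best = min(coarse, key=coarse.get)
    fine = {best: coarse[best]}
    for o in (best - 1, best + 1):              # 2 more DTW computations
        fine[o] = score(o)
    final = min(fine, key=fine.get)
    return shift0 + final, fine[final]

# Usage with hypothetical semitone values:
shift, distance = transpose_search([0, 2, 4, 5], [60, 62, 64, 65, 67, 69])
```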

Key Transposition (2/2)
Method 2: Fixed point iteration
- Step 1: DTW alignment
- Step 2: Stop if the mapping path is unchanged
- Step 3: Shift to the same mean based on the alignment
- Step 4: Go back to step 1
Characteristics:
- DTW distance is monotonically non-increasing, which guarantees convergence
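The iteration can be sketched as follows. A squared pitch difference is assumed as the local distance, because with that choice shifting to the mean along a fixed path cannot increase the total distance. The 0-45-90 local paths, the free end, and the function names are also assumptions rather than details from the slides.

```python
import math

def dtw_path(t, r):
    """0-45-90 DTW with backtracking; returns (distance, mapping path)."""
    m, n = len(t), len(r)
    D = [[math.inf] * (n + 1) for _ in range(m + 1)]
    D[0][0] = 0.0  # padding cell that anchors the beginning
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = (t[i - 1] - r[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j], D[i - 1][j - 1], D[i][j - 1])
    j = min(range(1, n + 1), key=lambda k: D[m][k])  # free end position
    dist, i, path = D[m][j], m, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((D[i - 1][j - 1], i - 1, j - 1),
                      (D[i - 1][j], i - 1, j),
                      (D[i][j - 1], i, j - 1))
    return dist, path[::-1]

def fixed_point_transpose(t, r, max_iter=20):
    """Sketch of the fixed point iteration; returns (shift, distance history)."""
    shift, prev_path, history = 0.0, None, []
    for _ in range(max_iter):
        shifted = [x + shift for x in t]
        dist, path = dtw_path(shifted, r)  # Step 1: DTW alignment
        history.append(dist)
        if path == prev_path:              # Step 2: stop if the path is unchanged
            break
        # Step 3: shift so the aligned input and reference have the same mean.
        shift += sum(r[j] - shifted[i] for i, j in path) / len(path)
        prev_path = path
    return shift, history
```

The iteration stops at a fixed point of the alignment, which is not necessarily the globally best key shift.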

Example of Key Transposition

Score Function
Score function defined in terms of:
- m: length of matched string
- n: length of input string
- e: DTW distance
- A = 0.8
- B = 0.6

DTW Demos
- Match corners with key transposition: toolbox/dtw/demoDtwPitch.m

Type-3 DTW: Frame to Note Alignment
DP-based method for filling the table
Local constraint:
Recurrent formula:
[Figure: table aligning a frame-level pitch vector against notes; visible note values 65, 62, 65, 64, 67]
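The local constraint and recurrence are not preserved in the transcript. A common formulation for frame-to-note alignment lets each frame either stay on the current note or advance to the next one, that is, D(i, j) = |t(i) - r(j)| + min(D(i-1, j), D(i-1, j-1)). The sketch below assumes that formulation, an anchored beginning, and a free end; `dtw_type3` is a hypothetical name.

```python
import math

def dtw_type3(frames, notes):
    """Sketch: align a frame-level pitch vector against a note sequence."""
    n = len(notes)
    col = [math.inf] * (n + 1)
    col[0] = 0.0  # padding cell: the first frame must map to the first note
    for f in frames:
        new = [math.inf] * (n + 1)
        for j in range(1, n + 1):
            new[j] = abs(f - notes[j - 1]) + min(col[j],      # stay on the same note
                                                 col[j - 1])  # advance to the next note
        col = new
    return min(col[1:])  # free end: the query may stop at any note

# Usage with hypothetical pitch values (semitones):
distance = dtw_type3([60, 60, 62, 62, 62, 64], [60, 62, 64, 65])
```

Because the table has one column per note rather than per reference frame, it is much smaller than for frame-to-frame DTW, which is consistent with the efficiency point made about type-3 DTW.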

Type-3 DTW Characteristics
- Frame-based query input vs. note-based music database
- Note duration unused
- More efficient, less effective
- Heuristics for key transposition
[Figure: two mapping-path plots]

Type-3 DTW: Effects of Key Transposition
[Figure panels: rough key transposition; fine key transposition]
Please refer to the online tutorial page for playback.