Published by Albert Watts. Modified over 8 years ago.
1
Segmenting Popular Music Sentence by Sentence Wan-chi Lee
2
Basic Idea In a song, the energy of the audio signal is low in the gaps between sentences, so the idea is to detect these energy gaps. Problems: Accompaniment sound is present throughout. The dynamic range of the audio signal varies a lot, so it is hard to choose a threshold.
3
Examples of audio signal
4
Methods Band-pass filter the signal: here I use a 6th-order elliptic filter with a pass band of 800 Hz to 1.6 kHz. Calculate the average energy of the signal over a short sliding window: I use a 0.1-second window. Detect the valleys of the average energy by piecewise linear approximation.
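The windowed-energy step can be sketched as follows. This is only an illustration, not the presenter's code: it assumes the samples have already been band-pass filtered, defines "average energy" as the mean of the squared samples, and introduces a hypothetical `hop_seconds` parameter because the slide does not say how far the window slides each step.

```python
def short_time_energy(samples, sample_rate, window_seconds=0.1, hop_seconds=0.1):
    """Return the average energy (mean squared sample) of each window.

    Assumptions not stated on the slide: `samples` is already band-pass
    filtered, and `hop_seconds` (hypothetical) sets the step between windows.
    """
    window = max(1, int(round(sample_rate * window_seconds)))
    hop = max(1, int(round(sample_rate * hop_seconds)))
    energies = []
    # Only full windows are evaluated; a trailing partial window is dropped.
    for start in range(0, len(samples) - window + 1, hop):
        frame = samples[start:start + window]
        energies.append(sum(x * x for x in frame) / window)
    return energies


if __name__ == "__main__":
    # Toy signal at 100 Hz: loud, silent, louder (0.1 s each).
    toy = [1.0] * 10 + [0.0] * 10 + [2.0] * 10
    print(short_time_energy(toy, sample_rate=100))
```

A hop shorter than the window gives overlapping windows and a smoother energy curve, at the cost of more frames to approximate in the next step.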
5
Piece-wise Linear Approximation I used a top-down method to determine the approximation. Specify an error bound. Find the segmentation point that best improves the approximation. Calculate a linear regression for each segment as the approximation. If the error bound is not achieved, repeat the above steps.
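The top-down procedure above might look like the sketch below. Several details are assumptions, since the slide does not give them: the error measure is taken to be the total sum of squared residuals over all segments, segments are half-open index ranges, and `min_len` (hypothetical) keeps every segment at least two points long so a regression line is defined.

```python
def fit_line(ys, lo, hi):
    """Least-squares line over ys[lo:hi]; returns (slope, intercept, sse)."""
    n = hi - lo
    xs = range(lo, hi)
    mean_x = sum(xs) / n
    mean_y = sum(ys[lo:hi]) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ys[x] - mean_y) for x in xs)
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    sse = sum((ys[x] - (slope * x + intercept)) ** 2 for x in xs)
    return slope, intercept, sse


def top_down_pla(ys, max_error, min_len=2):
    """Split [0, len(ys)) top-down until the total squared error <= max_error.

    Returns a list of half-open (lo, hi) segments in ascending order.
    """
    segments = [(0, len(ys))]
    while True:
        errors = [fit_line(ys, lo, hi)[2] for lo, hi in segments]
        total = sum(errors)
        if total <= max_error:
            break
        best = None  # (total error after split, segment index, split point)
        for i, (lo, hi) in enumerate(segments):
            for split in range(lo + min_len, hi - min_len + 1):
                new_total = (total - errors[i]
                             + fit_line(ys, lo, split)[2]
                             + fit_line(ys, split, hi)[2])
                if best is None or new_total < best[0]:
                    best = (new_total, i, split)
        if best is None:
            break  # no segment is long enough to split further
        _, i, split = best
        lo, hi = segments[i]
        segments[i:i + 1] = [(lo, split), (split, hi)]
    return segments


if __name__ == "__main__":
    # A rising ramp followed by a falling ramp splits at the jump.
    print(top_down_pla([0, 1, 2, 3, 4, 10, 9, 8, 7, 6], max_error=1e-6))
```

This exhaustive search refits both halves for every candidate split, which is simple but slow on long signals; it is meant to show the control flow, not to be efficient.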
7
Segmentation Point After finding the linear approximation, choose the points that represent gaps in energy. Place some restrictions so that the segments have a reasonable length.
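One possible reading of this step is sketched below. The slide does not say how gap points are chosen or which restrictions are used, so everything here is an assumption: a gap is taken to be a knot of the approximation that is lower than both neighbouring knots, and the only restriction applied is a hypothetical minimum spacing `min_gap` (in frames), with deeper valleys given priority.

```python
def pick_gap_points(breakpoints, values, min_gap):
    """Select valley knots that are at least `min_gap` frames apart.

    breakpoints: frame indices of the piecewise-linear knots, ascending.
    values: approximated energy at each knot.
    Both the valley rule and the spacing rule are illustrative assumptions.
    """
    # A knot counts as a valley if it is lower than both neighbouring knots.
    valleys = [
        (values[i], breakpoints[i])
        for i in range(1, len(breakpoints) - 1)
        if values[i] < values[i - 1] and values[i] < values[i + 1]
    ]
    chosen = []
    # Greedy pass: deepest valleys first, skipping any that sit too close
    # to a valley that has already been kept.
    for _, frame in sorted(valleys):
        if all(abs(frame - kept) >= min_gap for kept in chosen):
            chosen.append(frame)
    return sorted(chosen)


if __name__ == "__main__":
    knots = [0, 10, 20, 25, 30, 50, 60]
    energy = [5, 1, 6, 2, 6, 0.5, 4]
    print(pick_gap_points(knots, energy, min_gap=20))
```

A maximum segment length, which "reasonable length" may also imply, is not enforced in this sketch.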
8
Demo and Discussion I only used one feature; other features can be incorporated. This is a heuristic method: no training is needed, but there are many parameters to tune. It should be integrated with onset detection so that the segmenting points coincide with onsets.