1
Classifying Motion Picture Audio
Eirik Gustavsen
07.06.07
2
Outline
– Motivation
– Thesis
– State of the Art
– Proposed system
– Experimental setup
– Results
– Future work
– Conclusion
3
Motivation
– Most projects classify clear classes or classes with noise
– Few clear boundaries in motion picture audio
– Subjective descriptions of movies
– Difficult to compare movie content
4
Thesis
It is possible to automatically create a table of contents of a motion picture, based on its audio track only.
5
Research questions
– Find the best LLDs to classify motion picture audio
– Detect boundaries between audio classes within complex audio segments
– Automatically create a TOC based on the audio track only
6
Pre-Processing
– 44100 Hz sample rate
– Mono
– 16 bits
– 30 ms windows (L_W)
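The windowing step above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name `to_frames` is hypothetical, NumPy is assumed, and the windows are taken as non-overlapping because the slide does not state a hop size.

```python
import numpy as np

SAMPLE_RATE = 44100   # Hz, from the slide
WINDOW_MS = 30        # window length L_W, from the slide


def to_frames(pcm16, sample_rate=SAMPLE_RATE, window_ms=WINDOW_MS):
    """Split mono 16-bit PCM into fixed-length windows.

    Assumption: windows do not overlap; trailing samples that do not
    fill a whole window are dropped.
    """
    samples = np.asarray(pcm16).astype(np.float64) / 32768.0  # int16 -> [-1, 1)
    window_len = int(sample_rate * window_ms / 1000)          # 1323 samples
    n_frames = len(samples) // window_len
    return samples[:n_frames * window_len].reshape(n_frames, window_len)


# One second of digital silence yields 33 full windows of 1323 samples each.
frames = to_frames(np.zeros(SAMPLE_RATE, dtype=np.int16))
```

At 44100 Hz a 30 ms window holds 1323 samples, so each descriptor below is computed over 1323 values at a time.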
7
Low Level Descriptors
– Time domain
– Frequency domain
8
Low Level Descriptors
Total of 23 low level descriptors
TIME DOMAIN
– Audio Power
– Audio Wave Form
– Root-Mean Square
– Short Time Energy
– Low Short Time Energy Ratio
– Zero-Crossing Rate
– High Zero-Crossing Rate Ratio
FREQUENCY DOMAIN
– Audio Spectrum Centroid
– Fundamental Frequency
– 10 Mel-Frequency Cepstral Coefficients
– Spectrum Flux
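To make the two descriptor families concrete, here is a hedged sketch of two time-domain descriptors (root-mean square, zero-crossing rate) and one frequency-domain descriptor computed on a single window. The function names are hypothetical and NumPy is assumed. Note that `spectral_centroid` is a plain linear-frequency centroid used for illustration; it is not claimed to match the exact Audio Spectrum Centroid definition used in the thesis.

```python
import numpy as np


def rms(frame):
    """Root-mean square of one window."""
    return float(np.sqrt(np.mean(frame ** 2)))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    signs = np.signbit(frame)
    return float(np.mean(signs[1:] != signs[:-1]))


def spectral_centroid(frame, sample_rate=44100):
    """Magnitude-weighted mean frequency in Hz (linear-frequency variant)."""
    magnitude = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = magnitude.sum()
    return float((freqs * magnitude).sum() / total) if total > 0 else 0.0


# Example window: a 1 kHz sine, 1323 samples long (30 ms at 44100 Hz).
t = np.arange(1323) / 44100.0
tone = np.sin(2 * np.pi * 1000.0 * t)
features = [rms(tone), zero_crossing_rate(tone), spectral_centroid(tone)]
```

For the 1 kHz test tone the RMS is close to 0.707, the zero-crossing rate is roughly 0.045 (about 60 crossings in 1322 sample pairs), and the centroid sits near 1000 Hz.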
9
Dimensionality reduction
Principal components analysis (PCA) is a technique used to reduce multidimensional data sets to lower dimensions for analysis.
f(1) f(2) f(3) f(4) f(5) ... f(23) → PCA → d(1) d(2) d(3)
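The reduction from 23 descriptor values f(1)..f(23) to three components d(1)..d(3) can be sketched with a singular value decomposition. This is an illustrative version assuming NumPy; `pca_reduce` is a hypothetical name, and the random matrix merely stands in for real descriptor data.

```python
import numpy as np


def pca_reduce(features, n_components=3):
    """Project rows of `features` onto their first principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Placeholder data: 200 windows, 23 descriptor values per window.
rng = np.random.default_rng(0)
f = rng.normal(size=(200, 23))
d = pca_reduce(f)   # shape (200, 3)
```

Because the axes are sorted by variance, d(1) carries at least as much variance as d(2), and d(2) at least as much as d(3).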
10
K Nearest Neighbors
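A K nearest neighbors classifier assigns a query point the majority label among its k closest training points. The sketch below assumes Euclidean distance and k = 3, neither of which is stated on the slide; `knn_predict` and the toy training points are hypothetical.

```python
from collections import Counter

import numpy as np


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k nearest training points."""
    distances = np.linalg.norm(train_x - query, axis=1)   # Euclidean distance
    nearest = np.argsort(distances)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


# Toy training set in the 3-dimensional space produced by PCA.
train_x = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
                    [5.0, 5.0, 5.0], [5.1, 5.0, 5.0], [5.0, 5.1, 5.0]])
train_y = ["speech", "speech", "speech", "music", "music", "music"]

label = knn_predict(train_x, train_y, np.array([0.05, 0.05, 0.0]))
```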
11
Proposed system
Pre-processing → LLD → Norm → PCA → KNN → Post-processing → TOC generation
12
Classifying Audio
– Speech
– Noise (white)
– Music
– “Silence”
– Mixed audio classes
13
Class Boundary Detection
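The slide does not describe how boundaries are detected, so the following is only one plausible sketch, not the method used in the thesis: smooth the per-window class labels with a majority filter, then report every window index where the smoothed label changes. The names `smooth_labels` and `boundaries` and the filter radius are assumptions.

```python
from collections import Counter


def smooth_labels(labels, radius=2):
    """Majority filter: replace each label by the most common one nearby."""
    out = []
    for i in range(len(labels)):
        window = labels[max(0, i - radius): i + radius + 1]
        out.append(Counter(window).most_common(1)[0][0])
    return out


def boundaries(labels):
    """Indices where the label differs from the previous window's label."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# 16 windows: speech with one isolated misclassification, then music.
raw = ["speech"] * 5 + ["music"] + ["speech"] * 4 + ["music"] * 6
cuts = boundaries(smooth_labels(raw))
```

With 30 ms windows, a boundary at window index 10 corresponds to 0.3 s into the segment; the isolated "music" label at index 5 is removed by the filter and produces no boundary.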
16
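Finding the most suitable LLDs
Most suitable:
– ASC (Audio Spectrum Centroid)
– AWF (Audio Wave Form)
– RMS (Root-Mean Square)
– HZCRR (High Zero-Crossing Rate Ratio)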
17
Sample Results
– Music with low volume
– Clear speech
– Speech with background environmental sounds
– Fading between music and speech
– Speech with background music
– Jingle
– “Some mistakes”
18
Future Work
To be done in this thesis
– Post-processing
– TOC
Open research questions for future work
– New motion picture audio classes
– Detecting sound objects
– Speech recognition
19
Conclusion
– Pre-processing makes it possible to classify motion picture audio correctly
– Using the right combination of LLDs improves the classification result
20
Questions?