Duraid Y. Mohammed Philip J. Duncan Francis F. Li. School of Computing Science and Engineering, University of Salford UK Audio Content Analysis in The.


2 Audio Content Analysis in the Presence of Overlapped Classes - A Non-Exclusive Segmentation Approach to Mitigate Information Losses
Duraid Y. Mohammed, Philip J. Duncan, Francis F. Li
School of Computing Science and Engineering, University of Salford, UK
Global Summit and Expo on Multimedia & Applications, August 10-11, 2015, Birmingham, UK

3 Introduction
• The growing volume of digital media archives is increasing the demand for automated metadata generation, indexing and searching, and audio information mining.
• Overall aim: develop a UOA (Universal Open Architecture) system to extract information from multimedia files, generate metadata automatically, and mitigate the loss of information embedded in soundtracks.

4 Classification - Challenge and Solution
• Classical classification problems are logically exclusive, i.e. an element is assumed to be a member of one class and of that class only. This hinders some practical uses in audio information mining, since a segment of the soundtrack can contain speech, music, event sounds or a combination of them (a fuzzy element).
• Non-exclusive classification can mitigate information losses.
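The contrast between exclusive and non-exclusive labelling can be sketched in a few lines of Python. The class names come from the slide; the per-class scores, the 0.5 threshold and the function names are illustrative assumptions, not part of the presented system.

```python
# Class names follow the slide; everything else here is an assumed illustration.
CLASSES = ("speech", "music", "event")


def exclusive_label(scores):
    """Classical (exclusive) decision: keep only the single best-scoring class."""
    return max(scores, key=scores.get)


def non_exclusive_labels(scores, threshold=0.5):
    """Non-exclusive decision: keep every class whose score reaches the threshold."""
    return {name for name in CLASSES if scores.get(name, 0.0) >= threshold}


# Hypothetical scores for a segment where someone talks over background music.
segment_scores = {"speech": 0.82, "music": 0.67, "event": 0.05}

print(exclusive_label(segment_scores))               # speech
print(sorted(non_exclusive_labels(segment_scores)))  # ['music', 'speech']
```

The exclusive decision discards the music label for this segment, which is the kind of information loss the slide refers to; the non-exclusive decision keeps both labels.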

5 The Concept
• A system integration approach to audio information mining can hypothetically be built upon successes in several diverse areas (the list of areas is not reproduced in this transcript).
• To re-deploy these tools, it is essential that a pre-processor effectively determines where speech, music and audio events of interest occur.
• These audio segments can then be further processed by dedicated algorithms to obtain further information.
[Figure labels: "Hello", "Door Knock", Classify, Index, Time-Stamp]
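A minimal sketch of the pre-processor idea, assuming frame-level label sets are already available: consecutive frames with the same labels are merged into time-stamped segments, and each segment is passed to a dedicated handler per class. The function names, the one-second frame length and the placeholder handlers are hypothetical; the slides do not describe the actual interfaces.

```python
def frames_to_segments(frame_labels, frame_seconds):
    """Merge consecutive frames with identical label sets into segments.

    Returns a list of (start_seconds, end_seconds, labels) tuples.
    """
    segments = []
    for i, labels in enumerate(frame_labels):
        start = i * frame_seconds
        end = start + frame_seconds
        if segments and segments[-1][2] == labels:
            # Same labels as the previous frame: extend the open segment.
            segments[-1] = (segments[-1][0], end, labels)
        else:
            segments.append((start, end, labels))
    return segments


def route(segments, handlers):
    """Send each segment to every dedicated handler whose class it contains."""
    results = []
    for start, end, labels in segments:
        for name in sorted(labels):
            if name in handlers:
                results.append(handlers[name](start, end))
    return results


# Hypothetical frame labels: speech, then speech with a door knock, then music.
frames = [
    frozenset({"speech"}),
    frozenset({"speech"}),
    frozenset({"speech", "event"}),
    frozenset({"music"}),
]

# Placeholder handlers standing in for dedicated algorithms.
handlers = {
    "speech": lambda s, e: ("asr", s, e),
    "event": lambda s, e: ("event-detector", s, e),
    "music": lambda s, e: ("mir", s, e),
}

segments = frames_to_segments(frames, frame_seconds=1)
print(route(segments, handlers))
# [('asr', 0, 2), ('event-detector', 2, 3), ('asr', 2, 3), ('mir', 3, 4)]
```

Because the labels are sets rather than a single class, the overlapped third frame reaches both the speech handler and the event handler.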

6 Universal Open Architecture

7 Spectral Subtraction Algorithm
• A noise reduction technique.
• Voice activity detection (VAD) is employed to detect musical-speech (speech over music) segments and music-only segments.
• Calculate the spectral magnitude of the music segments and of the musical-speech segments.
• Estimate the clean speech from these magnitudes [formula not reproduced in this transcript].
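The slide's own formula is not available in this transcript. As a hedged illustration, the textbook magnitude form of spectral subtraction is |Ŝ(k)| = max(|Y(k)| − |N(k)|, β|Y(k)|), where Y is the mixture spectrum, N the estimated interference spectrum and β a small floor. The sketch below applies that textbook form, treating the average spectrum of music-only frames as the interference estimate; that pairing, the function names and the sample values are assumptions, and the authors' formula may differ.

```python
def average_magnitude(frames):
    """Average magnitude spectrum over several frames (one list of bins per frame)."""
    count = len(frames)
    return [sum(bin_values) / count for bin_values in zip(*frames)]


def spectral_subtract(mixture_mag, interference_mag, floor=0.0):
    """Textbook magnitude spectral subtraction for a single frame.

    mixture_mag:      magnitudes |Y(k)| of a speech-over-music frame.
    interference_mag: estimated magnitudes |N(k)| of the music alone.
    floor:            fraction of |Y(k)| kept as a lower bound (beta),
                      which prevents negative magnitudes.
    """
    if len(mixture_mag) != len(interference_mag):
        raise ValueError("spectra must have the same number of bins")
    return [max(y - n, floor * y) for y, n in zip(mixture_mag, interference_mag)]


# Hypothetical three-bin spectra, for illustration only.
music_frames = [[1.0, 2.0, 0.5], [3.0, 2.0, 1.5]]
music_estimate = average_magnitude(music_frames)   # [2.0, 2.0, 1.0]
mixture = [5.0, 1.0, 4.0]

print(spectral_subtract(mixture, music_estimate))             # [3.0, 0.0, 3.0]
print(spectral_subtract(mixture, music_estimate, floor=0.1))  # [3.0, 0.1, 3.0]
```

The second bin shows why a floor is used: the interference estimate exceeds the mixture there, so plain subtraction would give a negative magnitude.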


9 Feature Spaces
• Purpose: data reduction and extraction of characteristic features.
• Mel-frequency cepstral coefficients (MFCCs).
• Short-time Fourier transform (STFT): temporal pattern analysis.
• Zero-crossing rate (ZCR), RMS "loudness", entropy, short-term energy.
• Optimized feature space for speech and music detection.
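Three of the time-domain features named above can be computed per frame as follows. This is a sketch using common textbook definitions; the exact definitions, frame sizes and normalisation used in the presented system are not given in the slides, and the sample frames are made up.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_term_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(x * x for x in frame)


def rms(frame):
    """Root-mean-square level, a simple proxy for loudness."""
    return math.sqrt(short_term_energy(frame) / len(frame))


# Two toy frames: a rapidly alternating signal and a constant one.
alternating = [1.0, -1.0, 1.0, -1.0]
constant = [0.5, 0.5, 0.5, 0.5]

print(zero_crossing_rate(alternating), short_term_energy(alternating), rms(alternating))
# 1.0 4.0 1.0
print(zero_crossing_rate(constant), short_term_energy(constant), rms(constant))
# 0.0 1.0 0.5
```

Each frame of raw samples is reduced to a handful of numbers, which is the data-reduction role the slide assigns to feature extraction.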

10 MARSYAS (Music Analysis, Retrieval and SYnthesis for Audio Signals)
• Open-source framework for audio processing by George Tzanetakis, University of Victoria, Canada.
• Supports development of real-time audio analysis and synthesis tools.
• Audio processing system with specific emphasis on music information retrieval (MIR).
• Implemented for exclusive classification (speech or music).
• Music genre organisation.

11 Training Database Building
• Speech and music classes are used as the starting point.
• Toward generalization, different styles of samples were included in the training set.
• Speech samples: children, male and female speakers, speakers of different languages, loud speech, speech with laughter.
• Music: all genres are included (jazz, pop, classical, rock, ...).
• All speech and music samples were normalized and then mixed together to produce speech-over-music samples.

          Pure Speech | Mix Samples                           | Pure Music
Speech    100%        | 90%  80%  70%  60%  50%  40%  30%     | 0%
Music     0%          | 10%  20%  30%  40%  50%  60%  70%     | 100%
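The mixing step can be sketched as a weighted sum of two normalized signals. The slides do not say which normalization was used or whether the percentages refer to amplitude or energy, so peak normalization and amplitude shares are assumptions here, as are the function names and sample values.

```python
def peak_normalize(samples):
    """Scale samples so the largest absolute value is 1.0 (assumed normalization)."""
    peak = max(abs(x) for x in samples)
    if peak == 0:
        return list(samples)
    return [x / peak for x in samples]


def mix(speech, music, speech_share):
    """Mix normalized speech and music; speech_share is the speech fraction (0 to 1)."""
    if not 0.0 <= speech_share <= 1.0:
        raise ValueError("speech_share must be between 0 and 1")
    s = peak_normalize(speech)
    m = peak_normalize(music)
    length = min(len(s), len(m))  # truncate to the shorter signal
    return [speech_share * s[i] + (1.0 - speech_share) * m[i] for i in range(length)]


# Speech shares from the training table: pure speech, seven mixes, pure music.
SPEECH_SHARES = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.0]

# Toy two-sample signals, for illustration only.
speech = [2.0, -4.0]
music = [1.0, 1.0]

print(mix(speech, music, 0.5))  # [0.75, 0.0]
training_mixes = [mix(speech, music, share) for share in SPEECH_SHARES]
```

Generating one mix per entry in `SPEECH_SHARES` reproduces the nine columns of the table for a given speech and music pair.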

12 Toolbox Demonstration

13 Results Comparison Before and After Speech Enhancement

AUDIO CLASS | MARSYAS                  | EDUOA                    | Length/Seconds
            | Fr      Fa      ERD      | Fr      Fa      ERD      |
SPEECH      | 45.56%  7.03%   26.30%   | 2.49%   8.45%   5.47%    | 1580
MUSIC       | 7.70%   45.56%  26.63%   | 11.76%  1.53%   6.65%    | 2115
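The slides do not define the Fr, Fa and ERD columns. Reading the results as two groups of three columns (MARSYAS, then EDUOA), each ERD figure appears to equal the mean of the corresponding Fr and Fa figures, rounded to two decimals. The check below is an observation on the published numbers under that assumed column grouping, not a definition taken from the source.

```python
from decimal import Decimal, ROUND_HALF_UP

# (Fr, Fa, ERD) percentages copied from the results slide; grouping is assumed.
RESULTS = {
    ("MARSYAS", "SPEECH"): ("45.56", "7.03", "26.30"),
    ("MARSYAS", "MUSIC"): ("7.70", "45.56", "26.63"),
    ("EDUOA", "SPEECH"): ("2.49", "8.45", "5.47"),
    ("EDUOA", "MUSIC"): ("11.76", "1.53", "6.65"),
}


def mean_of_fr_fa(fr, fa):
    """Mean of the two rates, rounded half-up to two decimal places."""
    mean = (Decimal(fr) + Decimal(fa)) / 2
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def erd_matches_mean(row):
    """True when the reported ERD equals the rounded mean of Fr and Fa."""
    fr, fa, erd = row
    return mean_of_fr_fa(fr, fa) == Decimal(erd)


for key, row in RESULTS.items():
    print(key, erd_matches_mean(row))
```

Under the same reading, ERD falls from 26.30% to 5.47% for speech and from 26.63% to 6.65% for music.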

14 Summary and Conclusions
• Open structure and common interfaces toward a general classifier.
• Redeployment of currently available techniques.
• Encourages third-party contributions.
• Rapid prototyping of a UOA audio information mining system.

15 Thank you for Listening

16 Audio Routing

17 Machine Learning

18 Sound Event Detection

19 ASR

20 Role of MIR in UOA

