1
Research activities at AUTH related to dialogue detection
Ioannis Pitas, Constantine Kotropoulos, Nikos Nikolaidis
WP6 e-team: Audiovisual Understanding
2
AIIA Lab, Department of Informatics, Aristotle University of Thessaloniki
Outline
Introduction
Dialogue detection concept: cross-correlation of indicator functions
Speaker turn detection based on speech and visual cues (mouth activity)
Frontal face detection; facial feature detection (e.g. mouth)
One-two speaker detection
Speaker clustering based on speech and visual cues
Fingerprinting
3
Indicator functions and their cross-correlation (1)
A dialogue between two persons from the movie “Secret Window” [Dialogue 1].
4
Indicator functions and their cross-correlation (2)
A scene without a dialogue between two persons.
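The two slides above contrast the cross-correlation of indicator functions for a dialogue scene and a non-dialogue scene. The slides do not give the exact definition, so the following is only a minimal sketch under stated assumptions: each indicator is a 0/1 sequence over frames marking when one person is active, and the cross-correlation at lag k is the sum over n of a[n] * b[n + k]. The function name, the frame data, and the turn length are hypothetical.

```python
def cross_correlation(a, b, max_lag):
    """Return {lag: sum_n a[n] * b[n + lag]} for lags in [-max_lag, max_lag].

    a and b are assumed to be 0/1 indicator sequences of equal length,
    one per person (1 = the person is active in that frame).
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("indicator sequences must have equal length")
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:  # ignore terms that fall outside the sequence
                total += a[i] * b[j]
        result[lag] = total
    return result


# Hypothetical dialogue: the two persons alternate in turns of 3 frames.
speaker_a = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
speaker_b = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]

xc = cross_correlation(speaker_a, speaker_b, max_lag=3)
```

In this toy case the correlation is zero at lag 0 (the persons never overlap) and largest at a lag equal to the turn length, which is the kind of pattern one might look for when flagging a dialogue.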
5
Speaker Turn Detection
Audio segmentation aims at finding acoustic events within an audio stream. Speaker turn detection is a special case of speaker segmentation. It is an important pre-processing step for audio indexing or speaker tracking. Usually, no prior knowledge about the speakers is assumed.
[Figure labels: Speaker 1, Speaker 2]
6
Model-based segmentation: DISTBIC
Contrast the hypothesis of no speaker turn against the hypothesis of a speaker turn using the BIC criterion.
[Figure annotation: Speaker turn!]
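The BIC test named on this slide can be illustrated with a small sketch. It is not the DISTBIC algorithm itself and not necessarily the authors' formulation: it assumes one-dimensional features, a single Gaussian per segment, and the sign convention in which a positive value favours the speaker-turn hypothesis. The function names, the `penalty_weight` parameter, and the sample data are hypothetical.

```python
import math


def variance(x, floor=1e-12):
    """Maximum-likelihood variance, floored to keep the logarithm finite."""
    mean = sum(x) / len(x)
    return max(sum((v - mean) ** 2 for v in x) / len(x), floor)


def delta_bic(x, split, penalty_weight=1.0):
    """Compare one Gaussian for all of x against two Gaussians split at `split`.

    Positive values favour a speaker turn at `split` (assumed convention).
    """
    left, right = x[:split], x[split:]
    n, n1, n2 = len(x), len(left), len(right)
    if n1 < 2 or n2 < 2:
        raise ValueError("each side of the split needs at least 2 samples")
    # Log-likelihood gain from modelling the two sides separately.
    gain = 0.5 * (n * math.log(variance(x))
                  - n1 * math.log(variance(left))
                  - n2 * math.log(variance(right)))
    # The two-model hypothesis has 2 extra parameters (mean and variance).
    penalty = penalty_weight * 0.5 * 2 * math.log(n)
    return gain - penalty


# Hypothetical 1-D features: a level shift at frame 8 versus no change.
quiet = [0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1]
with_turn = quiet + [v + 5.0 for v in quiet]
without_turn = quiet + quiet

turn_score = delta_bic(with_turn, split=8)
no_turn_score = delta_bic(without_turn, split=8)
```

With real speech features one would use multivariate Gaussians over cepstral vectors and scan candidate split points; the scalar version only shows how the likelihood gain is traded against the model-complexity penalty.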
7
Frontal face images at quartet and octet resolution
[Figure panels: Original Image, Quartet Image, Octet Image]
8
Face detection based on corners
The figures show the 3 possible feature point set configurations, each with 100 feature points. They differ in the minimum distance allowed between feature points. In general, small inter-feature-point distances yield a concentration of feature points and poor face detection. The minimum allowed distance is a parameter of the training procedure.
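The slide does not say how the feature points are chosen, so the following is only one plausible illustration of a minimum-distance constraint: a greedy pass that keeps the strongest candidate corners while rejecting any that lie too close to an already accepted point. The function name, the candidate tuples, and the scores are hypothetical.

```python
import math


def select_feature_points(candidates, min_distance, max_points=100):
    """Greedily pick up to max_points corners at least min_distance apart.

    candidates: list of (x, y, score) tuples; higher score = stronger corner.
    Returns the accepted (x, y) points in order of decreasing score.
    """
    selected = []
    for x, y, _score in sorted(candidates, key=lambda c: c[2], reverse=True):
        # Accept the point only if it is far enough from every accepted one.
        if all(math.hypot(x - sx, y - sy) >= min_distance for sx, sy in selected):
            selected.append((x, y))
            if len(selected) == max_points:
                break
    return selected


# Hypothetical corner candidates (x, y, corner strength).
corners = [(0, 0, 1.0), (1, 0, 0.9), (5, 0, 0.8), (5, 4, 0.7)]

spread_small = select_feature_points(corners, min_distance=3)
spread_large = select_feature_points(corners, min_distance=5)
```

A smaller `min_distance` lets points cluster around the strongest corners, which matches the slide's remark that small inter-point distances concentrate the feature points.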
9
Face detection: Receiver Operating Characteristic (ROC) curves
For the SVM-based face detection, the best results were obtained with the sigmoidal kernel. Best equal error rate: 4.5%.
The maximum likelihood detection commits a few false alarms. For FAR in [5.2%, 5.67%], the FRR drops quickly from 6.1% to 0.7%.
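For readers unfamiliar with the quantities quoted on this slide, the sketch below shows how a false acceptance rate (FAR), a false rejection rate (FRR), and an approximate equal error rate (EER) can be read off detector scores. It assumes a detector that accepts a window when its score is at or above a threshold; the scores are made up for illustration and are unrelated to the reported results.

```python
def far_frr(face_scores, nonface_scores, threshold):
    """FAR: fraction of non-faces accepted; FRR: fraction of faces rejected."""
    far = sum(s >= threshold for s in nonface_scores) / len(nonface_scores)
    frr = sum(s < threshold for s in face_scores) / len(face_scores)
    return far, frr


def equal_error_rate(face_scores, nonface_scores):
    """Scan observed scores as thresholds; return (EER estimate, threshold).

    The EER is approximated at the threshold where |FAR - FRR| is smallest.
    """
    best = None
    for t in sorted(set(face_scores) | set(nonface_scores)):
        far, frr = far_frr(face_scores, nonface_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical detector scores (higher = more face-like).
face_scores = [0.9, 0.8, 0.7, 0.4]
nonface_scores = [0.1, 0.2, 0.3, 0.6]

eer, eer_threshold = equal_error_rate(face_scores, nonface_scores)
```

Sweeping the threshold and plotting FRR against FAR gives the ROC curve; the EER is the point on that curve where the two error rates coincide.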
10
One/Two Speaker Detection
Two-speaker detection (NIST 2002): best EER 16.2%
Kajarekar, Adami, Hermansky, 2003
One-speaker detection (NIST 2002): best EER 7.1%
11
Frontal face authentication
12
Fingerprinting