Slide 1: Multi-modal Biometric Verification for Small and Very Small Devices
SecurePhone: secure contracts signed by mobile phone (IST-2002-506883)
Jacques Koreman, NTNU; Andrew Morris, Spinvox
International Workshop on Verbal and Nonverbal Communication Behaviours, Vietri sul Mare, 29-31 March 2007
Slide 2: Overview
– Background and application: SecurePhone
– Multimodal biometric recognition
  – face, voice, signature: natural modalities
– For small devices: PDA
  – good performance, short verification time
  – security problem
– For very small devices: SIM card
  – global features to run on a slow CPU
  – short verification time, acceptable performance
– Conclusion
  – further improvements by glottal feature fusion?
  – relevance for COST 2102
Slide 3: Background: the SecurePhone project
– Duration: 01.01.2004 – 30.11.2006
– Aim: "a mobile phone with biometric authentication and e-signature support for dealing secure transactions on the fly"
– SecurePhone consortium roles:
  – management
  – research
  – implementation
  – exploitation
– Financing: EU 6th Framework Programme (IST)
Slide 4: SecurePhone system overview
[Diagram labels: GPRS/UMTS, e-signature manager, SIM card, PIN number, video camera, touch screen, microphone, data capture, biometric preprocessor, biometric recogniser]
Slide 5: Multimodal biometric recogniser
[Diagram: face (Haar LL4 wavelets), voice (MFCCs) and signature (geometric features) are each scored by the GMM-based "biometric recogniser" against a user profile and a world model; the user is accepted (releasing the private key) or rejected. Later: HL4 wavelet subband (LL, HL, LH, HH).]
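To make the scoring step concrete, here is a minimal sketch of GMM-based verification in the spirit of the diagram above: each modality's feature frames are scored against a client GMM (user profile) and a world model (UBM), and the average log-likelihood ratio is compared with a threshold. This is illustrative only, not the project's exact implementation; function names, mixture sizes and the threshold are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_models(client_frames, world_frames, n_components=8, seed=0):
    """Fit a client GMM (user profile) and a world model (UBM) on feature frames.

    client_frames, world_frames: arrays of shape (n_frames, n_features),
    e.g. MFCC vectors for voice. Illustrative configuration only.
    """
    client_gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                                 random_state=seed).fit(client_frames)
    world_gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                                random_state=seed).fit(world_frames)
    return client_gmm, world_gmm

def verify(frames, client_gmm, world_gmm, threshold=0.0):
    """Accept the claimed identity if the mean log-likelihood ratio exceeds a threshold."""
    # GaussianMixture.score() returns the average log-likelihood per frame
    llr = client_gmm.score(frames) - world_gmm.score(frames)
    return llr > threshold, llr
```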
Slide 6: PDA: fusion results for the PDAtabase
DET curves / result table for 5-digit (left), 10-digit (middle) and phrase prompts (right); values are EER (%).

Modality        5-digit   10-digit   Phrase
Voice             7.21       3.24      5.54
Face             28.40      27.55     28.33
Signature         8.01   (single value given)
Fusion (mean)     2.39       1.54      2.30
Fusion (sd)       0.96       0.83      1.85

Marcos Faundez-Zanuy: "Face recognition: an unsolved problem."
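For readers who want to reproduce this kind of evaluation in spirit, the sketch below shows simple score-level fusion (a weighted average of per-modality scores) and an EER computation from client and impostor score lists. It is not the project's exact fusion rule; the normalisation assumption, weights and function names are illustrative.

```python
import numpy as np

def fuse_scores(scores_by_modality, weights=None):
    """Score-level fusion: weighted average of per-modality verification scores.

    scores_by_modality: dict like {"voice": s_v, "face": s_f, "signature": s_s},
    one score per modality for the same access attempt. Scores are assumed to be
    already normalised to comparable ranges (e.g. on development data).
    """
    names = sorted(scores_by_modality)
    s = np.array([scores_by_modality[m] for m in names], dtype=float)
    if weights is None:
        weights = np.ones_like(s) / len(s)   # plain mean fusion
    return float(np.dot(weights, s))

def compute_eer(client_scores, impostor_scores):
    """Equal error rate: point where false accept rate ~ false reject rate.

    client_scores, impostor_scores: 1-D numpy arrays of fused scores.
    """
    thresholds = np.sort(np.concatenate([client_scores, impostor_scores]))
    far = np.array([(impostor_scores >= t).mean() for t in thresholds])
    frr = np.array([(client_scores < t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0
```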
Slide 7: From small to very small devices: the problem
– Biometric data cannot be stored or processed on the PDA, because impostors could steal it.
– Storage and processing must therefore happen on the SIM card, which destroys itself when physically tampered with.
– Instead of a few seconds on the PDA, verification on the SIM card takes one hour!
– Bottleneck: the large number of comparisons in voice and signature verification (for both the client model and the UBM); see the estimate below:
  – a large number of frames per prompt
  – a large number of Gaussians in each GMM
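A rough back-of-the-envelope count of why frame-based scoring is expensive: every frame is evaluated against every Gaussian of both the client model and the UBM. The concrete numbers below (frame rate, prompt length, mixture sizes) are illustrative assumptions, not SecurePhone's actual configuration.

```python
def gaussian_evaluations(n_frames, n_gaussians, n_models=2):
    """Number of Gaussian density evaluations per verification attempt.

    n_models=2 covers the client model and the UBM.
    """
    return n_frames * n_gaussians * n_models

# Illustrative numbers only: a 3-second prompt at 100 frames/s with 128-component GMMs
frame_based = gaussian_evaluations(n_frames=300, n_gaussians=128)   # 76,800 evaluations
# Globalised features (next slide): e.g. 2 subpart vectors and 4-component GMMs
globalised = gaussian_evaluations(n_frames=2, n_gaussians=4)        # 16 evaluations
print(frame_based, globalised)
```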
Slide 8: From small to very small devices: the solution
– Reducing the frame rate or the number of GMM Gaussians cannot cut the processing time by a sufficient order of magnitude.
– Drastic solution: globalised features (an idea taken from static signature representations), as sketched below:
  – means (cf. the Long-Term Average Spectrum for voice) and standard deviations of each vector parameter across all frames; this also greatly reduces the number of Gaussians needed to model the vectors
  – to counteract the effect of averaging, compute the globalised features for subparts of the signal
– Marcos Faundez-Zanuy: "Open your mind: sometimes a simple solution can give a good result" (and sometimes you cannot get around it).
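A minimal sketch of globalised feature extraction as described above: the frame-level feature matrix is collapsed into per-parameter means (and optionally standard deviations), computed over the whole prompt or over equal subparts. Function names and the example dimensions are illustrative assumptions.

```python
import numpy as np

def globalised_features(frames, n_subparts=1, include_sd=True):
    """Collapse a (n_frames, n_features) matrix of frame-level features
    (e.g. MFCCs for voice, dynamic signature parameters) into a few
    'globalised' vectors: per-parameter means, optionally with standard
    deviations, computed over n_subparts equal slices of the signal."""
    parts = np.array_split(frames, n_subparts, axis=0)
    vectors = []
    for part in parts:
        stats = [part.mean(axis=0)]
        if include_sd:
            stats.append(part.std(axis=0))
        vectors.append(np.concatenate(stats))
    # shape: (n_subparts, n_features) or (n_subparts, 2 * n_features)
    return np.vstack(vectors)

# Example: 300 frames of 12 MFCCs reduced to 2 vectors of 24 values each
mfccs = np.random.randn(300, 12)
print(globalised_features(mfccs, n_subparts=2).shape)   # (2, 24)
```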
Slide 9: PDA results
EER (%) for globalised means (columns 2-5) and means plus standard deviations (columns 6-9)

Global feat.        Means only                    Means + sd
#Gauss.          1      2      4      8       1      2      4      8
Voice        28.20  30.08  30.36  32.08   22.78  22.55  24.41  25.71
Face         32.26  31.78  29.06  29.19   32.26  31.78  29.06  29.19
Signature    37.26  29.28  27.15  26.25   28.34  26.60  21.27  19.21
Fused        17.95  17.16  14.83  15.01   13.68  12.35  10.05  10.31
Slide 10: SIM card results
EER (%) for globalised means (columns 2-5) and means plus standard deviations (columns 6-9), with voice and signature divided into two equal subparts

Global feat.        Means only                    Means + sd
#Gauss.          1      2      4      8       1      2      4      8
Voice        22.13  21.09  20.87  21.86   20.88  19.72  17.68  18.49
Face         32.26  31.78  29.06  29.19   32.26  31.78  29.06  29.19
Signature    38.29  27.58  22.58  17.86   28.14  22.16  17.59  16.45
Fused        12.89  12.48  10.49   9.32   12.56  10.48   8.28   9.15
Slide 11: Improvement needed
– Performance drop:
  – PDA EER 2.39% (meanwhile improved to 0.9%)
  – SIM EER 10.05% (8.28% for two equal subparts)
– Performance can be improved if we do not constrain the GMM configuration to be the same across all modalities.
– Otherwise: use complementary features within a modality:
  – face: simple face-geometric variables
  – voice: parameter values of the LF model fitted to the glottal flow derivative, obtained by inverse filtering of the microphone signal (see the sketch below)
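The slide mentions a glottal flow derivative obtained by inverse filtering. Below is a minimal sketch of one common way to do this, LPC inverse filtering with librosa and scipy; it is an assumption that this matches the project's pipeline, and fitting the LF model to the resulting waveform is not shown.

```python
import librosa
import scipy.signal as sps

def glottal_flow_derivative_estimate(wav_path, lpc_order=18):
    """Rough estimate of the glottal flow derivative by LPC inverse filtering.

    The vocal tract is modelled as an all-pole (LPC) filter; passing the speech
    signal through the inverse filter A(z) removes the vocal-tract resonances and
    leaves (approximately) the glottal flow derivative. Illustrative only: in
    practice LPC analysis is done per frame, and the LF model is then fitted to
    this residual.
    """
    y, sr = librosa.load(wav_path, sr=8000)      # telephone-band rate, assumed
    a = librosa.lpc(y, order=lpc_order)          # LPC polynomial A(z), a[0] == 1.0
    residual = sps.lfilter(a, [1.0], y)          # inverse filtering: e[n] = A(z) * y[n]
    return residual, sr
```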
Slide 12: Interest for this COST action
– Interest in the glottal flow derivative for speaker recognition stems from:
  – its expected complementarity to the MFCC representation of the spectrum
  – its applicability in applications that use very little training data (as in SecurePhone, for user-friendliness)
– It can also be useful for other classification problems, like "the recognition of emotional states, gesture, speech and facial expressions, in anticipation of the implementation of useful application such as intelligent avatars and interactive dialog systems" (quote from the aims page of this workshop).
Slide 13: Last night's addendum: speech & gestures
– Source-signal parameters can be used, together with other spectral parameters and with F0, duration and loudness measures, to signal prominence.
– In speech, these cues can be used differently across languages (syllable-timed vs. stress-timed) and speakers (German Research Council "rhythm project" led by Bill Barry, Saarland University, to which NTNU contributes Norwegian database recordings and analyses).
– Prominence is also signalled by the extent/size and the acceleration of gestures.
– To what extent do gesture and speech signal parameters correlate? When are they used as complementary or alternative strategies for signalling prominence?
Slide 14: Thank you for your attention.