ELIS-DSSP Sint-Pietersnieuwstraat 41 B-9000 Gent Recognition of foreign names spoken by native speakers Frederik Stouten & Jean-Pierre Martens Ghent University.

Ghent University, Electronics and Information Systems (ELIS)

Interspeech 2007, August 30th 2007

Overview
Problem statement
Methodology
  – computing phonological scores
  – foreignizable phonemes
Experiments
  – baseline system
  – systems with methodology implementation
Conclusions

Problem statement
Automatic attendant or car navigation systems
  – lexicon may contain > 100K words
  – many of foreign origin
A native speaker of Dutch can pronounce Andrew as
  nativized     A n d r E w
  intermediate  E n d r u w
  foreignized   E n d r u

Problem situation
Standard solutions
  – foreign grapheme-to-phoneme converters (g2p’s) + mapping to native phonemes
  – include foreign phoneme acoustic models
Our proposal
  – combine scores of standard acoustic models and a phonologically inspired back-off model
      both models trained on native speech only
  – use foreign g2p’s without phoneme mapping
  – introduce foreignizable phonemes instead of traditional foreign-to-native phoneme mappings

Combining scores
Two-stream score per acoustic model state q
  – standard model: log p_A(x | q)
  – phonological back-off model: log p_B(x | q)
Control parameters
  – g_1q, g_2q = state-dependent stream weights (different risk of foreignized pronunciation)
  – α, β = state-independent scaling coefficients (to give both streams the same overall mean and variance)
  – equidistant samples on g_1q + g_2q = 1 (a common factor on both weights has no effect)
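The formula for the two-stream score did not survive in this transcript. The sketch below shows one plausible form that is consistent with the listed control parameters: the back-off score is rescaled with α and β, and the two streams are then combined linearly with weights that sum to 1. The exact form is an assumption, and the function names `two_stream_score` and `weight_grid` are hypothetical.

```python
def two_stream_score(log_p_a, log_p_b, g1, alpha=1.0, beta=0.0):
    """Combine the standard and back-off log-scores of one state (assumed form)."""
    if not 0.0 <= g1 <= 1.0:
        raise ValueError("g1 must lie in [0, 1]")
    g2 = 1.0 - g1  # stream weights are constrained to g1 + g2 = 1
    # Assumption: alpha and beta rescale the back-off stream so that its
    # overall mean and variance match those of the standard stream.
    return g1 * log_p_a + g2 * (alpha * log_p_b + beta)


def weight_grid(n_points):
    """Equidistant samples of (g1, g2) on the line g1 + g2 = 1."""
    step = 1.0 / (n_points - 1)
    return [(i * step, 1.0 - i * step) for i in range(n_points)]
```

With g1 = 1 the score reduces to the standard acoustic model, and with g1 = 0 it reduces to the rescaled back-off model.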

Combining scores
Computation of log p_B(x | q)
  – phonological feature space: binary features f_i (i = 1,…,25)
  – map each state to phonological space
      select features of a state on the basis of a forced alignment of speech with the standard acoustic models
      select f_i with a large enough mean of P(f_i | x) / P(f_i) on the state
      other strategy for foreignizable phonemes (see further)
  – compute posterior probabilities P(f_i | x)
      configuration of 4 neural networks
  – convert posterior probabilities to log-likelihood
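The feature selection step for one state can be sketched as follows. This is a minimal illustration, assuming the frames aligned to the state are already available as lists of posteriors P(f_i | x). The function name and the threshold value are hypothetical, since the slide only says the mean ratio must be "large enough".

```python
def select_positive_features(posteriors, priors, threshold=2.0):
    """Return indices of features whose mean P(f_i | x) / P(f_i) is large.

    posteriors: one list per frame aligned to the state, each holding
                P(f_i | x) for every feature i.
    priors:     P(f_i) for every feature i.
    threshold:  hypothetical cut-off for "large enough".
    """
    n_frames = len(posteriors)
    selected = []
    for i, prior in enumerate(priors):
        # Average the posterior-to-prior ratio over the frames of this state.
        mean_ratio = sum(frame[i] / prior for frame in posteriors) / n_frames
        if mean_ratio >= threshold:
            selected.append(i)
    return selected
```

A ratio well above 1 means the feature is detected on this state far more often than its prior predicts, which is what makes it a candidate positive feature.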

Combining scores
Come to final two-stream score
  – g_2q less dependent on q than
  – g_2q log p_B(x) = discardable
  – computation of log P_B(q | x) / P_B(q)
      P_q: positive features that are ‘on’ for state q
      N_q: negative features absent or ‘off’ for q
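The step from log p_B(x | q) to log P_B(q | x) / P_B(q) follows from Bayes' rule. The short derivation below spells it out. The reading that the last term is discardable because g_2q varies little across states is an interpretation of the slide, not a statement taken from it.

```latex
\begin{align}
p_B(x \mid q) &= \frac{P_B(q \mid x)\, p_B(x)}{P_B(q)} \\
g_{2q} \log p_B(x \mid q) &= g_{2q} \log \frac{P_B(q \mid x)}{P_B(q)} + g_{2q} \log p_B(x)
\end{align}
```

If g_2q is roughly the same for all states, the term g_2q log p_B(x) adds nearly the same amount to every competing hypothesis at a given frame, so dropping it should not change which hypothesis wins.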

Combining scores
Assuming independent phonological features (PHFs) we get two terms, (1) and (2)
Start with only positive features (term (1))
  – problem: unequal number for different q
  – solution: take average, or w_qp × (1), with w_qp = 1 / card(P_q)
  – experiment showed this is better
Add negative features (term (2))
  – supposed to represent same probability
  – experiment shows 75 % correlation between (1) and (2)
  – keeping (1) + (2) is slightly better than discarding (2)
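The expressions for terms (1) and (2) are missing from this transcript. The sketch below assumes the usual factorization under feature independence: term (1) sums log P(f_i | x) / P(f_i) over the positive features, and term (2) sums log (1 − P(f_i | x)) / (1 − P(f_i)) over the negative features. The averaging of term (1) follows the slide (w_qp = 1 / card(P_q)). Averaging term (2) in the same way is an assumption, and the function name is hypothetical.

```python
import math


def phonological_score(post, prior, positive, negative):
    """Assumed form of log P_B(q | x) / P_B(q) for one frame and one state.

    post:     P(f_i | x) for every feature i
    prior:    P(f_i) for every feature i
    positive: indices of features that are 'on' for the state (P_q)
    negative: indices of features that are 'off' for the state (N_q)
    """
    if not positive:
        raise ValueError("a state needs at least one positive feature")
    # Term (1): positive features, averaged with w_qp = 1 / card(P_q).
    term1 = sum(math.log(post[i] / prior[i]) for i in positive) / len(positive)
    # Term (2): negative features should be absent; averaging is an assumption.
    term2 = 0.0
    if negative:
        term2 = sum(
            math.log((1.0 - post[i]) / (1.0 - prior[i])) for i in negative
        ) / len(negative)
    return term1 + term2
```

Averaging keeps states with many positive features from receiving systematically larger scores than states with few, which is the problem the slide names.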

Introducing foreignizable phonemes
Baseline pronunciation of foreign name
  – take foreign language g2p output
  – map foreign phonemes to best native equivalent
Our pronunciation
  – if equivalent has different PHFs → keep info of original → foreignizable phoneme: /NativePhon/_/ForeignPhon/
  – e.g. /rr/ → /r/_/rr/ (Dutch /r/ originating from English /rr/)
  – 6 such phonemes for English → Dutch
  – use positive PHFs of /ForeignPhon/ (knowledge based)
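The labelling scheme can be illustrated with a small sketch. Only the /rr/ → /r/_/rr/ example comes from the slide. The function name, the mapping table and the input transcription are hypothetical and serve only to show how the NativePhon_ForeignPhon label keeps the original phoneme identity.

```python
def to_native_transcription(foreign_phonemes, native_map, foreignizable):
    """Map a foreign g2p output to native symbols, keeping foreignizable ones.

    native_map:    foreign phoneme -> best native equivalent
    foreignizable: foreign phonemes whose native equivalent has different PHFs
    """
    result = []
    for phoneme in foreign_phonemes:
        # Phonemes without an entry are assumed to exist in the native set.
        native = native_map.get(phoneme, phoneme)
        if phoneme in foreignizable:
            # Keep the origin in the label, e.g. r_rr for Dutch /r/ from English /rr/.
            result.append(f"{native}_{phoneme}")
        else:
            result.append(native)
    return result


# Toy input: an English-style transcription of "Andrew" with /rr/.
print(to_native_transcription(["E", "n", "d", "rr", "u"], {"rr": "r"}, {"rr"}))
```

The baseline system would emit plain `r` here. The extra suffix lets the back-off model look up the positive PHFs of the foreign phoneme instead.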

Introducing foreignizable phonemes
Pronunciation variants
  – mix of standard and new approach

Experiments
Recognition of English names
  – database from Nuance (Cremelie, N. and ten Bosch, L.)
  – 2050 English name utterances
  – 21 different names
  – 26 native speakers of Dutch
Recognizer
  – Standard acoustic models: cross-word triphones, trained on Dutch read speech
  – PHF feature detector: neural network configuration, trained on Dutch read speech
  – Vocabulary: 21 English names + 1779 Dutch names
  – Lexicon: different transcriptions for each name (see next slide)

Baseline system
No back-off model used
Effects of different types of transcriptions measured
Most important findings
  1. English transcriptions alone are much better than Dutch transcriptions alone → model foreign pronunciations
  2. Dutch transcriptions remain indispensable → model native pronunciations

Systems with back-off model
System FOREIGN
  – consider one foreignizable phoneme at a time
  – same g_1 on all its states: find the optimal value under the condition that g_1 = 1 for all other phonemes
  – repeat process until all foreignizable phonemes are treated
System NATIVE
  – same g_1 on all states
  – search for best g_1
System ALL
  – foreignizable phonemes: g_1 taken from FOREIGN
  – other phonemes: same g_1 for all, taken from NATIVE
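The tuning procedure of system FOREIGN can be sketched as a per-phoneme grid search. This is a minimal illustration under a literal reading of the slide: each foreignizable phoneme is tuned on its own while every other phoneme keeps g_1 = 1. The `error_rate` callable is hypothetical. It is assumed to run the recognizer on development data with the given per-phoneme weights (unlisted phonemes use g_1 = 1) and to return an error rate.

```python
def tune_foreign_weights(foreignizable, candidates, error_rate):
    """Find the best g1 for each foreignizable phoneme, one at a time.

    foreignizable: labels of the foreignizable phonemes
    candidates:    g1 values to try, e.g. equidistant samples in [0, 1]
    error_rate:    hypothetical callable, dict {phoneme: g1} -> error rate,
                   with g1 = 1 implied for every phoneme not in the dict
    """
    weights = {}
    for phoneme in foreignizable:
        # All other phonemes stay at g1 = 1 while this one is tuned.
        weights[phoneme] = min(
            candidates, key=lambda g1: error_rate({phoneme: g1})
        )
    return weights
```

System ALL would then merge this dictionary with the single g_1 found for system NATIVE, which applies to all remaining phonemes.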

Systems with back-off model
Main result: relative improvement of 11%
Other results
  – g_1 < 0.5 for system FOREIGN
  – g_1 > 0.5 for system NATIVE

Latest work
Seek confirmation of results on other data
Autonomata database (STEVIN project)
  – 60000 names, 5000 different names, 240 speakers
  – French + English + Dutch names
  – French + English + Dutch speakers
  – French + English + Dutch g2p outputs per name
  – large relative improvement (RI) by using foreign g2p’s on French and English
  – much larger RI with our methodology than here
  – paper submitted to ASRU 2007

Conclusions as of today
Large improvements on foreign name recognition by adding foreign g2p outputs (RI of around 40%)
Substantial extra improvements by adding new methodology (RI of up to 30%)
