1
From Word Spotting to OOV Modeling
OOV OOV spotting OOV OOV OOV
[f r ah m] OOV spotting [t uw] OOV OOV
[f r ah m] [w er d] spotting [t uw] OOV [m aa d el ih ng]

Paul Fitzpatrick (6345g11)

Goal
  To automatically extract filler vocabulary for word-spotting

Why?
  So language model has something to work with
  May improve recognition accuracy on keywords
  Gives earlier payoff in domain-specific training

Scenario
  Start with small lexicon (e.g words)
  Start with weak language model
  Bootstrap by clustering filler vocabulary from large collection of untranscribed data
2
Methodology
  Run recognizer
  Extract OOV fragments
  Identify competition
  Identify rarely-used additions
  Remove from lexicon
  Add to lexicon
  Update lexicon, baseforms
  Hypothesized transcript
  N-Best hypotheses
  Update Language Model
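
The lexicon-update part of the flow above could be sketched roughly as follows. This is a minimal illustration, not the presenter's implementation: the function name `update_lexicon`, the thresholds `add_min_count` and `keep_min_uses`, and the input format (counts of OOV fragments, plus usage counts for existing entries) are all assumptions. Running the recognizer, identifying competition, and updating the language model are outside the sketch.

```python
from collections import Counter


def update_lexicon(lexicon, additions, oov_counts, usage_counts,
                   add_min_count=3, keep_min_uses=1):
    """One hypothetical lexicon-update pass (names and thresholds are assumed).

    lexicon      -- set of all entries currently known to the recognizer
    additions    -- subset of lexicon that was added automatically earlier
    oov_counts   -- mapping: phone-sequence fragment -> times seen in OOV regions
    usage_counts -- mapping: lexicon entry -> times used in the hypotheses
    """
    # Remove earlier additions that the recognizer rarely uses.
    rarely_used = {e for e in additions
                   if usage_counts.get(e, 0) < keep_min_uses}
    # Add fragments that recur often enough to look like filler vocabulary.
    new_entries = {f for f, c in oov_counts.items()
                   if c >= add_min_count and f not in lexicon}
    updated_lexicon = (lexicon - rarely_used) | new_entries
    updated_additions = (additions - rarely_used) | new_entries
    return updated_lexicon, updated_additions


# Made-up counts for illustration only.
lexicon = {"phone", "room", "office", "address", "t eh l m iy"}
additions = {"t eh l m iy"}
oov_counts = Counter({"n ah m b er": 5, "w ah t ih z": 4, "ih t er z uw": 1})
usage_counts = Counter({"room": 7, "phone": 2})

lexicon, additions = update_lexicon(lexicon, additions, oov_counts, usage_counts)
print(sorted(lexicon))
```

In this toy run the unused earlier addition is dropped and the two frequent fragments are added, while the initial keywords are never candidates for removal because only automatic additions are reviewed.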
3
Results

Initial lexicon
  , phone, room, office, address

Top 10 OOV clusters found (ranked by frequency)
  1. n ah m b er        6. p l iy z
  2. w eh r ih z ae ng k y uw
  3. w ah t ih z        8. n ow
  4. t eh l m iy        9. hh aw ax b aw
  5. k ix n y uw g r uw p

Example sentence hypothesis
  (w ah t ih z) (ih t er z uw) room (n ah m b er)
  What is Victor Zue’s room number?
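
A frequency ranking like the one above could be produced along these lines. The sketch assumes, based only on the example hypothesis on the slide, that OOV fragments appear in recognizer output as parenthesized phone sequences. The function names `extract_fragments` and `rank_fragments` are hypothetical, and exact string matching stands in for whatever clustering the original system used.

```python
import re
from collections import Counter

# A fragment is assumed to be any run of text inside one pair of parentheses.
FRAGMENT = re.compile(r"\(([^()]*)\)")


def extract_fragments(hypothesis):
    """Return parenthesized phone sequences, with whitespace normalized."""
    return [" ".join(m.split())
            for m in FRAGMENT.findall(hypothesis)
            if m.split()]


def rank_fragments(hypotheses, top_n=10):
    """Count identical fragments across hypotheses, most frequent first."""
    counts = Counter()
    for hyp in hypotheses:
        counts.update(extract_fragments(hyp))
    return counts.most_common(top_n)


hypotheses = [
    "(w ah t ih z) (ih t er z uw) room (n ah m b er)",  # from the slide
    "(w ah t ih z) office (n ah m b er)",               # made-up example
    "(n ah m b er)",                                    # made-up example
]
print(rank_fragments(hypotheses, top_n=2))
# prints [('n ah m b er', 3), ('w ah t ih z', 2)]
```

Exact matching only groups identical phone strings; near-duplicates with slightly different phones would need a distance-based grouping step, which is not shown here.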