1
ITCS 6010 Spoken Language Systems: Architecture
2
Elements of a Spoken Language System
  Endpointing
  Feature extraction
  Recognition
  Natural language understanding
  Dialog management
3
Elements of a Spoken Language System (cont’d)
  Endpointing
    Detects the beginning and end of speech
    Represents caller’s spoken utterance as a waveform
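A minimal sketch of endpointing, assuming a simple energy threshold over fixed-size frames. The slides do not say how endpoints are detected; the function name `find_endpoints` and the `frame_len` and `threshold` values are hypothetical and chosen only for illustration.

```python
def find_endpoints(samples, frame_len=4, threshold=0.1):
    """Return (start, end) sample indices of the speech region, or None.

    A frame counts as speech when its mean absolute amplitude exceeds
    the threshold. Both parameters are illustrative assumptions.
    """
    speech_frames = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(abs(s) for s in frame) / len(frame)
        if energy > threshold:
            speech_frames.append(i)
    if not speech_frames:
        return None
    start = speech_frames[0]
    end = min(speech_frames[-1] + frame_len, len(samples))
    return start, end


# Toy signal: silence, a burst of "speech", then silence again.
signal = [0.0] * 8 + [0.5, -0.6, 0.7, -0.4, 0.5, -0.5, 0.6, -0.3] + [0.0] * 8
endpoints = find_endpoints(signal)
endpointed_utterance = signal[endpoints[0]:endpoints[1]]
```

The slice between the two indices is the endpointed waveform that would be passed on to feature extraction.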
4
Elements of a Spoken Language System (cont’d)
  Feature extraction
    Transforms endpointed utterance into a sequence of feature vectors
    Feature vector – list of numbers that represent measurable characteristics of speech
    Characteristics relate to the amount of energy at various frequencies
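One way to picture a feature vector is as the energy in a handful of frequency bands, computed frame by frame. The sketch below assumes a naive discrete Fourier transform and equal-width bands; production front ends typically use more elaborate features, and the names `band_energies` and `extract_features` and the frame and band sizes are hypothetical.

```python
import cmath
import math


def band_energies(frame, n_bands=4):
    """Return one feature vector: spectral energy summed per frequency band."""
    n = len(frame)
    half = n // 2
    power = []
    # Naive DFT: squared magnitude of bins 0 .. half-1.
    for k in range(half):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2)
    band_size = half // n_bands
    return [sum(power[b * band_size:(b + 1) * band_size]) for b in range(n_bands)]


def extract_features(samples, frame_len=16):
    """Split the utterance into non-overlapping frames, one vector per frame."""
    last_start = len(samples) - frame_len
    return [band_energies(samples[i:i + frame_len])
            for i in range(0, last_start + 1, frame_len)]


# Toy input: a pure tone that falls in DFT bin 5 of a 16-sample frame.
# With 8 usable bins and 4 bands, bin 5 belongs to band index 2.
tone = [math.sin(2 * math.pi * 5 * t / 16) for t in range(32)]
features = extract_features(tone)
```

Each element of `features` is one feature vector; for the toy tone, nearly all the energy lands in the third band.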
5
Elements of a Spoken Language System (cont’d)
  Recognizer
    Determines spoken words using feature vectors
    Recognition model contains all word strings caller can say
    Consists of:
      1. Acoustic model
      2. Dictionary
      3. Grammar
6
Elements of a Spoken Language System (cont’d)
  Acoustic model
    Internal representation of pronunciation of each basic sound/phoneme
    Created by training process
    Modeled features are same as those in feature vectors
7
Elements of a Spoken Language System (cont’d)
  Dictionary
    List of words and pronunciations
    Indicates which acoustic models create a word
    Can contain multiple entries/pronunciations for a word

    Dallas      d a l * s
    Boston      b o s t * n
    economics   E k * n A m I k s
    economics   i k * n A m I k s
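The dictionary entries above can be sketched as a mapping from each word to a list of pronunciations, so that a word such as "economics" can carry more than one. The phoneme symbols are copied from the slide; the names `PRONUNCIATIONS` and `pronunciations` are hypothetical.

```python
# Each word maps to one or more pronunciations; each pronunciation is the
# sequence of phoneme symbols (and hence acoustic models) that make up the word.
PRONUNCIATIONS = {
    "Dallas": [["d", "a", "l", "*", "s"]],
    "Boston": [["b", "o", "s", "t", "*", "n"]],
    "economics": [
        ["E", "k", "*", "n", "A", "m", "I", "k", "s"],
        ["i", "k", "*", "n", "A", "m", "I", "k", "s"],
    ],
}


def pronunciations(word):
    """Return every known pronunciation of a word, or an empty list."""
    return PRONUNCIATIONS.get(word, [])
```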
8
Elements of a Spoken Language System (cont’d)
  Grammar
    Defines everything caller can say to system
    Includes all possible strings of words and rules that associate meaning with strings
    Two types of grammars:
      Rule-based grammar – set of explicit rules that completely defines the grammar
      Statistical language model (SLM) – statistical grammar created from the probability of word occurrence in a given context
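The two grammar types can be contrasted with a toy example. The sketch below assumes a rule written as a sequence of alternatives for the rule-based case, and bigram probabilities estimated from a tiny corpus for the SLM case. The rule, the corpus, and all names are invented for illustration.

```python
from collections import Counter
from itertools import product

# Rule-based grammar: an explicit rule enumerates every allowed word string.
RULE = [["fly", "travel"], ["to"], ["Dallas", "Boston"]]


def expand(rule):
    """Return the set of all word strings the rule allows."""
    return {" ".join(words) for words in product(*rule)}


allowed = expand(RULE)


# Statistical language model: estimate P(next word | previous word)
# from how often word pairs occur in example sentences.
def train_bigrams(sentences):
    pair_counts = Counter()
    first_counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            pair_counts[(prev, nxt)] += 1
            first_counts[prev] += 1
    return {pair: count / first_counts[pair[0]]
            for pair, count in pair_counts.items()}


corpus = ["fly to Dallas", "fly to Boston", "travel to Boston"]
bigrams = train_bigrams(corpus)
```

The rule-based grammar accepts exactly four strings, while the bigram model assigns "Boston" a higher probability than "Dallas" after "to" because it occurs more often in the corpus.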
9
Elements of a Spoken Language System (cont’d)
  Recognition search
    Each word allowed by the grammar:
      Is defined in the dictionary
      Has an appropriate sequence of acoustic models (its word model)
    Feature vectors are compared to each word model
  Recognition
    Comparing possible models against the sequence of feature vectors to find the best match
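The idea of comparing a sequence of feature vectors against each word model and keeping the best match can be sketched with template matching. The slides do not name a search algorithm, so dynamic time warping over stored template sequences is swapped in here purely as a simple stand-in; the word models and all names below are hypothetical.

```python
def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two aligned vectors.
            d = sum((x - y) ** 2 for x, y in zip(seq_a[i - 1], seq_b[j - 1])) ** 0.5
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(features, word_models):
    """Return the word whose model is closest to the utterance's features."""
    return min(word_models, key=lambda w: dtw_distance(features, word_models[w]))


# Toy word models: each word is a short sequence of 2-dimensional vectors.
word_models = {
    "yes": [[1.0, 0.0], [1.0, 1.0]],
    "no": [[0.0, 1.0], [0.0, 0.0]],
}
utterance = [[0.9, 0.1], [1.0, 0.1], [1.0, 0.9]]
best_word = recognize(utterance, word_models)
```

The utterance is longer than either template, which is why a warping alignment is used instead of a frame-by-frame comparison.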
10
Elements of a Spoken Language System (cont’d)
  3 important features of recognition
    1. Confidence measures
    2. N-best processing
    3. Barge-in
11
Elements of a Spoken Language System (cont’d)
  Confidence measures
    Quantitative measure of the recognizer’s confidence that it found the right match
    Measure of closeness between feature vectors of caller’s utterance and best-matching path
    Used by designers in design process, e.g. to determine whether explicit confirmation is required
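As a sketch of how a designer might use a confidence measure, the function below maps a score to one of three actions. The two thresholds and the action names are assumptions made for illustration; in practice they would be tuned per application.

```python
def next_action(confidence, accept_at=0.8, reject_below=0.4):
    """Decide how to handle a recognition result from its confidence score.

    The threshold values are hypothetical defaults, not recommended settings.
    """
    if confidence >= accept_at:
        return "accept"    # confident enough to proceed without confirming
    if confidence >= reject_below:
        return "confirm"   # ask the caller an explicit yes/no confirmation
    return "reject"        # too uncertain; reprompt the caller
```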
12
Elements of a Spoken Language System (cont’d)
  N-best processing
    A number of results (best possible matches) returned with their confidence measures
  Barge-in
    Allows callers to interrupt prompt
    Recognizer starts listening at beginning of prompt
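One common use of an N-best list is to fall back to the next candidate when the caller rejects the top one. The sketch below assumes each result is a (text, confidence) pair; the function name `pick_hypothesis` and the sample results are invented for illustration.

```python
def pick_hypothesis(n_best, rejected=()):
    """Return the most confident (text, confidence) pair not yet rejected."""
    ranked = sorted(n_best, key=lambda hyp: hyp[1], reverse=True)
    for text, confidence in ranked:
        if text not in rejected:
            return text, confidence
    return None


# Hypothetical N-best results for one utterance.
n_best = [("Boston", 0.58), ("Austin", 0.62), ("Houston", 0.31)]

first_guess = pick_hypothesis(n_best)
# If the caller says "no" to the first guess, skip it and try the next one.
second_guess = pick_hypothesis(n_best, rejected={first_guess[0]})
```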
13
Elements of a Spoken Language System (cont’d)
  Natural language understanding
    Assigns meaning to spoken words
    Slots defined for each item of information required
  Dialog manager
    Determines application’s next step
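A minimal sketch of slot filling and a dialog manager's next-step decision, assuming a hypothetical travel application with `origin` and `destination` slots and a simple keyword rule ("from" or "to" followed by a known city). The slot names, city list, and step names are all invented for illustration.

```python
# Hypothetical vocabulary of cities the application knows about.
CITIES = {"dallas": "Dallas", "boston": "Boston"}


def fill_slots(words):
    """Assign meaning to recognized words by filling the application's slots."""
    slots = {"origin": None, "destination": None}
    tokens = words.lower().split()
    for i, token in enumerate(tokens[:-1]):
        city = CITIES.get(tokens[i + 1])
        if token == "from" and city:
            slots["origin"] = city
        elif token == "to" and city:
            slots["destination"] = city
    return slots


def next_step(slots):
    """Dialog manager: ask for the first missing slot, else move on."""
    for name, value in slots.items():
        if value is None:
            return f"ask_{name}"
    return "confirm_booking"


full = fill_slots("I want to fly from Dallas to Boston")
partial = fill_slots("to Boston")
```

With every slot filled the dialog manager moves on to confirmation; with a slot missing it prompts for that item.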