1
DIALOG SYSTEMS FOR AUTOMOTIVE ENVIRONMENTS
Presenter: Joseph Picone
Inst. for Signal and Info. Processing, Dept. Electrical and Computer Eng., Mississippi State University
Email: picone@isip.msstate.edu
Co-Authors: Julie Baca, Feng Zheng, Hualin Gao
Center for Advanced Vehicular Systems, Mississippi State University, Mississippi State, Mississippi 39762
Email: {baca,zheng,gao}@isip.msstate.edu
URL: http://www.isip.msstate.edu/projects/speech
EUROSPEECH 2003
2
INTRODUCTION: IN-VEHICLE DIALOG SYSTEMS
In-vehicle dialog systems improve information access.
Advanced user interfaces enhance workforce training and increase manufacturing efficiency.
Noise robustness in both environments to improve recognition performance.
Advanced statistical models and machine learning technology.
Multidisciplinary team (IE, ECE, CS).
3
SYSTEM ARCHITECTURE: DARPA COMMUNICATOR FRAMEWORK
[Figure: dialog system architecture]
4
SYSTEM ARCHITECTURE: PUBLIC DOMAIN ASR
…. Uses publicly available ISIP speech recognition toolkit.
Implements standard HMM-based speaker-independent continuous speech recognition system.
Complete toolkits available for many popular tasks including conversational speech.
On-line educational materials.
Extensive documentation.
5
SYSTEM ARCHITECTURE: ASR SYSTEM COMPONENTS
Transduction: Andrea NC-65 head-mounted
Feature extraction: standard 39-element MFCCs
Acoustic modeling: 8-mixture Gaussian HMMs
Lexicon: 7,100 words (5K WSJ, 2K names)
Language modeling: Interpolated Bigram (ppl: ~70)
Search: Hierarchical Viterbi Beam
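The language-model entry can be illustrated with a minimal sketch of an interpolated bigram and its perplexity. The slides give only the model type and the approximate perplexity; the interpolation weight (0.7), the add-one smoothing of the unigram term, the toy training sentences, and all function names below are assumptions for illustration, not the ISIP toolkit's implementation.

```python
import math
from collections import Counter

def train_counts(sentences):
    """Collect unigram and bigram counts, padding each sentence with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def interpolated_prob(prev, word, unigrams, bigrams, lam=0.7):
    """P(word | prev) = lam * P_bigram + (1 - lam) * P_unigram.

    lam = 0.7 is an assumed weight. The unigram term uses add-one smoothing
    with one extra class reserved for unseen words, so it is never zero.
    """
    total = sum(unigrams.values())
    vocab = len(unigrams) + 1
    p_uni = (unigrams[word] + 1) / (total + vocab)
    p_bi = bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
    return lam * p_bi + (1 - lam) * p_uni

def perplexity(sentences, unigrams, bigrams, lam=0.7):
    """Perplexity = exp of the negative average log-probability per predicted token."""
    log_sum, count = 0.0, 0
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_sum += math.log(interpolated_prob(prev, word, unigrams, bigrams, lam))
            count += 1
    return math.exp(-log_sum / count)

# Toy corpus (hypothetical, not the pilot data).
TRAIN = [["i", "want", "to", "drive"], ["i", "need", "to", "go"]]
UNIGRAMS, BIGRAMS = train_counts(TRAIN)
```

A word sequence seen in training scores a lower perplexity than the same words in scrambled order, because the scrambled bigrams fall back on the unigram term alone.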
6
SYSTEM ARCHITECTURE: NATURAL LANGUAGE UNDERSTANDING
Uses Phoenix semantic case frame parser from Colorado Univ. (CU).
Employs semantic grammar consisting of case frames with named slots.
FRAME: Drive
    [route]
    [distance]
[route]
    (*IWANT [go_verb] [arrive_loc])
IWANT
    (I want *to)
    (I would *like *to)
    (I will)
    (I need *to)
[go_verb]
    (go) (drive) (get) (reach)
[arrive_loc]
    [*to [placename][cityname]]
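The slot-filling idea behind the grammar can be sketched in a few lines. This is a toy regular-expression matcher, not the Phoenix parser: it reuses the IWANT and go_verb patterns from the slide, treats "*" as "optional word", and simply takes the rest of the utterance as the arrive_loc slot. The function names and the regex approach are assumptions for illustration.

```python
import re

def pattern_to_regex(pattern):
    """Turn a pattern such as 'I want *to' into a regex; '*' marks an optional word."""
    parts = []
    for token in pattern.split():
        if token.startswith("*"):
            parts.append(r"(?:%s\s+)?" % re.escape(token[1:]))
        else:
            parts.append(r"%s\s+" % re.escape(token))
    return "".join(parts)

def net_to_regex(patterns):
    """Join the alternative patterns of one net into a single non-capturing group."""
    return "(?:" + "|".join(pattern_to_regex(p) for p in patterns) + ")"

IWANT = net_to_regex(["I want *to", "I would *like *to", "I will", "I need *to"])
GO_VERB = net_to_regex(["go", "drive", "get", "reach"])

# [route]: optional IWANT, a go_verb, an optional "to", then the location text.
ROUTE = re.compile(
    r"\b(?P<IWANT>%s)?\b(?P<go_verb>%s)(?:to\s+)?(?P<arrive_loc>.+)" % (IWANT, GO_VERB),
    re.IGNORECASE,
)

def parse_route(utterance):
    """Return the filled slots as a dict, or None if no [route] pattern matches."""
    # Drop punctuation and add a trailing space so every word ends in whitespace.
    text = re.sub(r"[^\w\s]", "", utterance).strip() + " "
    match = ROUTE.search(text)
    if match is None:
        return None
    return {name: value.strip() for name, value in match.groupdict().items() if value}
```

Because the matcher searches for the pattern anywhere in the utterance, a false start such as "I want... I need to drive to the campus post office" still yields a parse: the leading "I want" is skipped and IWANT is filled with "I need to".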
7
SYSTEM ARCHITECTURE: NATURAL LANGUAGE UNDERSTANDING
“I want to drive from Columbus Mississippi to New York.”
8
SYSTEM ARCHITECTURE: NLU MODULE
Accepts ungrammatical input, e.g., “I want… I need to drive to the campus post office.”
Current version of the semantic grammar contains over 500 rules and 2000 words.
Developed from pilot test corpus of sentence patterns.
Parse of the example:
Route
    IWANT: “I need to”
    go_verb: “drive”
    arrive_loc
        placename: “post office”
        cityname: “campus”
9
SYSTEM ARCHITECTURE: DIALOG MANAGER
Controls interaction between user and system.
Accepts parsed input from NLU module.
Determines data requested, obtains data and controls presentation to user.
User: “How can I get to campus?”
System: “Are you going to a specific location on campus?”
User: “Where is engineering?”
System: “What department?”
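The control loop described here can be sketched as a frame with required slots: the manager asks for the first missing slot and issues a query once every slot is filled. The slot names, the required-slot table, and the function name are hypothetical; only the frame name and the prompt wording come from the slides, and the CU toolkit's actual mechanism may differ.

```python
# Required slots per frame (slot names are assumptions for illustration).
REQUIRED_SLOTS = {
    "Drive_Direction": ["destination", "specific_location"],
}

# Clarification prompts, keyed by the slot they ask for.
PROMPTS = {
    "destination": "Where would you like to go?",
    "specific_location": "Are you going to a specific location on campus?",
}

def next_action(frame, slots):
    """Return ('ask', prompt) for the first missing slot, else ('query', slots)."""
    for slot in REQUIRED_SLOTS[frame]:
        if slot not in slots:
            return ("ask", PROMPTS[slot])
    return ("query", dict(slots))
```

For the exchange on the slide, a parse that fills only the destination ("campus") leads the manager to ask the follow-up question; once the specific location is known, the filled frame is handed on as a query.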
10
SYSTEM ARCHITECTURE: DIALOG MANAGER
Derived from CU toolkit.
Bulk of development lies in construction of domain-specific frames, rules, and slots.
Example frames and associated queries:
Drive_Direction: “How can I get from Lee Boulevard to Kroger?”
Drive_Address: “Where is the campus bakery?”
Drive_Distance: “How far is China Garden?”
Drive_Quality: “Find me the most scenic route to Scott Field.”
Drive_Turn: “I am on Nash Street. What’s my next turn?”
11
APPLICATION DEVELOPMENT: GIS BACKEND
Geographic Information System (GIS) contains map routing data for MSU and surrounding area.
Dialog manager (DM) first determines the nature of query, then:
    obtains route data from the GIS database
    handles presentation of the data to the user
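The slides do not say how the GIS backend computes a route. As a stand-in, the sketch below runs Dijkstra's shortest-path algorithm over a toy road graph; the graph layout, the distances, and the function name are invented for illustration, and only the place names are borrowed from the example queries.

```python
import heapq

# Hypothetical road graph: node -> {neighbor: distance}. Distances are made up.
ROADS = {
    "Lee Boulevard": {"Nash Street": 1.0, "Scott Field": 0.5},
    "Nash Street": {"Kroger": 2.0},
    "Scott Field": {"Kroger": 3.0},
    "Kroger": {},
}

def shortest_route(graph, start, goal):
    """Dijkstra's algorithm; returns (distance, path) or None if goal is unreachable."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, length in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (dist + length, neighbor, path + [neighbor]))
    return None
```

A Drive_Direction query from Lee Boulevard to Kroger would then reduce to one call, with the returned path handed back to the dialog manager for presentation.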
12
APPLICATION DEVELOPMENT: PILOT SYSTEM
Obtained domain-specific data by:
    1. Initial data gathering and system testing
    2. Retesting after enhancing LM and semantic grammar
Initial efforts focused on reducing OOV utterances and parsing errors for NLU module.
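One way to track the OOV figure across test rounds is sketched below. It assumes the rate is counted per utterance (an utterance is OOV if any of its words is missing from the lexicon); the slides do not state the exact definition, and the function name and sample data are hypothetical.

```python
def oov_utterance_rate(utterances, lexicon):
    """Fraction of utterances containing at least one word not in the lexicon.

    Utterances are whitespace-tokenized and lowercased; punctuation is assumed
    to have been removed beforehand.
    """
    if not utterances:
        return 0.0
    vocab = {word.lower() for word in lexicon}
    oov_count = sum(
        1 for utt in utterances
        if any(token not in vocab for token in utt.lower().split())
    )
    return oov_count / len(utterances)

# Hypothetical pilot utterances and lexicon.
UTTERANCES = ["where is the bakery", "where is kroger"]
LEXICON = {"where", "is", "the", "bakery"}
```

Adding the missing word to the lexicon and re-running the same utterances mirrors the test-then-retest cycle described above.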
13
APPLICATION DEVELOPMENT: RESULTS
Refinements to NLU System:
Vers.     1.0          2.0          3.0
Test      Pre   Post   Pre   Post   Pre   Post
OOV       25%   0%     36%   0%     4%    0%
Parser    80%   3%     60%   5%     46%   11%
Overall System Enhancements:
Test No.   NLU Parser Error Rate   DM Error Rate
1          43%                     49%
2          6%                      3%
14
SUMMARY AND CONCLUSIONS: WIZARD OF OZ DATA
Users participate in multiple scenarios in which they query for information (e.g., hotel and meeting locations).
Tasks vary in scenarios according to role user plays:
    First-time visitors
    New residents
    Long-time residents
15
SUMMARY AND CONCLUSIONS: FURTHER DEVELOPMENT
Established a preliminary dialog system for future data collection and research.
Demonstrated significant domain-specific improvements for in-vehicle dialog systems.
Created a testbed for future studies of workforce training applications.
Extended the ISIP public domain toolkit and released relevant resources into the public domain.
16
SUMMARY: RELEVANT RESOURCES
CAVS Dialog System: review our experimental results and download the in-vehicle prototype architecture and associated components.
Natural Language and Dialog Management Toolkits (CU): explore tools to build NLU and DM components for a specific domain.
Speech Recognition Toolkit (ISIP): examine a state-of-the-art public domain ASR toolkit for integration in a dialog system.