DIALOG SYSTEMS FOR AUTOMOTIVE ENVIRONMENTS Presenter: Joseph Picone Inst. for Signal and Info. Processing Dept. Electrical and Computer Eng. Mississippi.

Presentation transcript:

DIALOG SYSTEMS FOR AUTOMOTIVE ENVIRONMENTS
Presenter: Joseph Picone, Inst. for Signal and Info. Processing, Dept. Electrical and Computer Eng., Mississippi State University
Co-Authors: Julie Baca, Feng Zheng, Hualin Gao, Center for Advanced Vehicular Systems, Mississippi State University, Mississippi State, Mississippi
URL:
EUROSPEECH

INTRODUCTION: IN-VEHICLE DIALOG SYSTEMS
In-vehicle dialog systems improve information access.
Advanced user interfaces enhance workforce training and increase manufacturing efficiency.
Noise robustness is needed in both environments to improve recognition performance.
Advanced statistical models and machine learning technology.
Multidisciplinary team (IE, ECE, CS).

SYSTEM ARCHITECTURE: DARPA COMMUNICATOR FRAMEWORK
[Figure: dialog system architecture]

SYSTEM ARCHITECTURE: PUBLIC DOMAIN ASR
Uses the publicly available ISIP speech recognition toolkit.
Implements a standard HMM-based, speaker-independent continuous speech recognition system.
Complete toolkits available for many popular tasks, including conversational speech.
On-line educational materials.
Extensive documentation.

SYSTEM ARCHITECTURE: ASR SYSTEM COMPONENTS
Transduction: Andrea NC-65 head-mounted
Feature extraction: standard 39-element MFCCs
Acoustic modeling: 8-mixture Gaussian HMMs
Lexicon: 7,100 words (5K WSJ, 2K names)
Language modeling: Interpolated Bigram (ppl: ~70)
Search: Hierarchical Viterbi Beam
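To make the "interpolated bigram" and "ppl" entries concrete, the sketch below mixes a maximum-likelihood bigram estimate with a unigram estimate and measures perplexity on a word sequence. It is an illustration only, not the ISIP toolkit's implementation: the function names, the toy sentences, and the interpolation weight lam=0.7 are assumptions made here, and it assumes every scored word was seen in training.

```python
import math
from collections import Counter

def train_counts(sentences):
    """Collect unigram, history, and bigram counts with sentence boundary tokens."""
    unigrams, history, bigrams = Counter(), Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        unigrams.update(tokens[1:])    # tokens that are predicted
        history.update(tokens[:-1])    # tokens that serve as bigram context
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, history, bigrams

def interpolated_prob(prev, word, unigrams, history, bigrams, lam=0.7):
    """P(word | prev) = lam * P_bigram + (1 - lam) * P_unigram (lam is an assumed weight)."""
    p_uni = unigrams[word] / sum(unigrams.values())
    p_bi = bigrams[(prev, word)] / history[prev] if history[prev] else 0.0
    return lam * p_bi + (1 - lam) * p_uni

def perplexity(sentences, unigrams, history, bigrams, lam=0.7):
    """exp of the average negative log-probability per predicted token."""
    log_sum, count = 0.0, 0
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_sum += math.log(interpolated_prob(prev, word, unigrams, history, bigrams, lam))
            count += 1
    return math.exp(-log_sum / count)

# Toy corpus (hypothetical, not the pilot data).
corpus = [["drive", "to", "campus"], ["go", "to", "kroger"]]
uni, hist, bi = train_counts(corpus)
ppl = perplexity(corpus, uni, hist, bi)
```

A perplexity near 70, as quoted on the slide, roughly means the recognizer faces about 70 equally likely word choices at each position.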

SYSTEM ARCHITECTURE: NATURAL LANGUAGE UNDERSTANDING
Uses Phoenix semantic case frame parser from Colorado Univ. (CU).
Employs semantic grammar consisting of case frames with named slots.
FRAME: Drive
[route]
[distance]
[route]
    (*IWANT [go_verb][arrive_loc])
IWANT
    (I want *to)(I would *like *to) (I will) (I need *to)
[go_verb]
    (go)(drive)(get)(reach)
[arrive_loc]
    [*to [placename][cityname]]
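In the grammar notation above, a token prefixed with "*" is optional. The sketch below shows one way such patterns could be matched against a word sequence. It is a simplified illustration, not the Phoenix parser: the function names are hypothetical, only the IWANT and [go_verb] patterns from the slide are included, and everything after the verb is returned unparsed.

```python
def match_pattern(pattern, words, start=0):
    """Match pattern tokens against words[start:]; '*'-prefixed tokens are optional.
    Returns the index just past the match, or None if the pattern does not match."""
    def helper(pi, wi):
        if pi == len(pattern):
            return wi
        token = pattern[pi]
        literal = token.lstrip("*")
        # Try to consume the word first, then fall back to skipping an optional token.
        if wi < len(words) and words[wi] == literal:
            end = helper(pi + 1, wi + 1)
            if end is not None:
                return end
        if token.startswith("*"):
            return helper(pi + 1, wi)
        return None
    return helper(0, start)

# Patterns copied from the slide, lower-cased.
IWANT = [p.split() for p in ("i want *to", "i would *like *to", "i will", "i need *to")]
GO_VERB = [["go"], ["drive"], ["get"], ["reach"]]

def match_any(patterns, words, start):
    """Longest match among alternative patterns, or None."""
    ends = [match_pattern(p, words, start) for p in patterns]
    ends = [e for e in ends if e is not None]
    return max(ends) if ends else None

def parse_route(words):
    """Sketch of (*IWANT [go_verb] ...): IWANT is optional, go_verb is required."""
    pos = match_any(IWANT, words, 0)
    if pos is None:
        pos = 0
    verb_end = match_any(GO_VERB, words, pos)
    if verb_end is None:
        return None
    return {"go_verb": words[pos], "rest": words[verb_end:]}
```

For example, parse_route("i want to drive to new york".split()) fills go_verb with "drive" and leaves ["to", "new", "york"] for an [arrive_loc] matcher, which is omitted here.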

SYSTEM ARCHITECTURE: NATURAL LANGUAGE UNDERSTANDING
“I want to drive from Columbus Mississippi to New York.”

SYSTEM ARCHITECTURE: NLU MODULE
Accepts ungrammatical input, e.g., “I want… I need to drive to the campus post office.”
Current version of the semantic grammar contains over 500 rules and 2000 words.
Developed from pilot test corpus of sentence patterns.
[Parse tree: Route → IWANT (“I need to”), go_verb (“drive”), arrive_loc → placename, cityname (“post office”, “campus”)]

SYSTEM ARCHITECTURE: DIALOG MANAGER
Controls interaction between user and system.
Accepts parsed input from NLU module.
Determines data requested, obtains data and controls presentation to user.
User: “How can I get to campus?”
System: “Are you going to a specific location on campus?”
User: “Where is engineering?”
System: “What department?”

SYSTEM ARCHITECTURE: DIALOG MANAGER
Derived from CU toolkit.
Bulk of development lies in construction of domain-specific frames, rules, and slots.
Example frames and associated queries:
Drive_Direction: “How can I get from Lee Boulevard to Kroger?”
Drive_Address: “Where is the campus bakery?”
Drive_Distance: “How far is China Garden?”
Drive_Quality: “Find me the most scenic route to Scott Field.”
Drive_Turn: “I am on Nash Street. What’s my next turn?”
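The frame-and-slot idea can be sketched as follows: each frame lists the slots it needs, and the dialog manager either asks for the first missing slot or issues a backend query. This is a minimal illustration, not the CU toolkit's API. The frame names come from the slide, but the slot names, prompts, and the next_action function are assumptions made here.

```python
# Frame names are from the slide; the required slots are hypothetical.
REQUIRED_SLOTS = {
    "Drive_Direction": ["origin", "destination"],
    "Drive_Address": ["place"],
    "Drive_Distance": ["destination"],
}

# Hypothetical clarification prompts, one per slot.
PROMPTS = {
    "origin": "Where are you starting from?",
    "destination": "Where do you want to go?",
    "place": "Which place are you looking for?",
}

def next_action(frame, slots):
    """Decide the dialog manager's next move for a parsed frame.

    Returns ("ask", prompt) while a required slot is empty,
    otherwise ("query", request) ready to send to the backend.
    """
    missing = [name for name in REQUIRED_SLOTS[frame] if not slots.get(name)]
    if missing:
        return ("ask", PROMPTS[missing[0]])
    request = {"frame": frame}
    request.update({name: slots[name] for name in REQUIRED_SLOTS[frame]})
    return ("query", request)

# "How can I get to Kroger?" leaves the origin unspecified, so the system asks for it.
action = next_action("Drive_Direction", {"destination": "Kroger"})
```

Once the user supplies the missing value, calling next_action again with both slots filled yields a query instead of a prompt.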

APPLICATION DEVELOPMENT: GIS BACKEND
Geographic Information System (GIS) contains map routing data for MSU and surrounding area.
Dialog manager (DM) first determines the nature of query, then:
- obtains route data from the GIS database
- handles presentation of the data to the user
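The two steps above (obtain route data, then present it) can be sketched with a toy road graph and Dijkstra's shortest-path algorithm. This is an assumption-laden illustration: the slides do not say how the GIS computes routes, the place names are borrowed from the example queries, and the connections and distances in meters are invented.

```python
import heapq

# Hypothetical road graph: distances in meters are made up for illustration.
ROADS = {
    "Lee Boulevard": {"Nash Street": 500, "Scott Field": 1200},
    "Nash Street": {"Lee Boulevard": 500, "Kroger": 2000},
    "Scott Field": {"Lee Boulevard": 1200, "Kroger": 900},
    "Kroger": {"Nash Street": 2000, "Scott Field": 900},
}

def shortest_route(graph, origin, destination):
    """Dijkstra's algorithm; returns (distance, path) or None if unreachable."""
    queue = [(0, origin, [origin])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == destination:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, length in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (dist + length, neighbor, path + [neighbor]))
    return None

def present(route):
    """Turn a route into a sentence the dialog manager could speak or display."""
    if route is None:
        return "Sorry, I could not find a route."
    dist, path = route
    return f"Route from {path[0]} to {path[-1]} ({dist} m): " + " -> ".join(path)

message = present(shortest_route(ROADS, "Lee Boulevard", "Kroger"))
```

With the invented distances, the route through Scott Field (2100 m) is preferred over the one through Nash Street (2500 m).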

APPLICATION DEVELOPMENT: PILOT SYSTEM
Obtained domain-specific data by:
1. Initial data gathering and system testing
2. Retesting after enhancing LM and semantic grammar
Initial efforts focused on reducing OOV utterances and parsing errors for NLU module.

APPLICATION DEVELOPMENT: RESULTS
Refinements to NLU System:

Vers/Test   Pre    Post   Pre    Post   Pre    Post
OOV         25%    0%     36%    0%     4%     0%
Parser      80%    3%     60%    5%     46%    11%

Overall System Enhancements:

Test No.   NLU Parser Error Rate   DM Error Rate
1          43%                     49%
2          6%                      3%

SUMMARY AND CONCLUSIONS: WIZARD OF OZ DATA
Users participate in multiple scenarios in which they query for information (e.g., hotel and meeting locations).
Tasks vary in scenarios according to role user plays:
- First-time visitors
- New residents
- Long-time residents

SUMMARY AND CONCLUSIONS: FURTHER DEVELOPMENT
Established a preliminary dialog system for future data collection and research.
Demonstrated significant domain-specific improvements for in-vehicle dialog systems.
Created a testbed for future studies of workforce training applications.
Extended the ISIP public domain toolkit and released relevant resources into the public domain.

SUMMARY: RELEVANT RESOURCES
CAVS Dialog System: review our experimental results and download the in-vehicle prototype architecture and associated components.
Natural Language and Dialog Management Toolkits (CU): explore tools to build NLU and DM components for a specific domain.
Speech Recognition Toolkit (ISIP): examine a state-of-the-art public domain ASR toolkit for integration in a dialog system.