CONTROLLING A HIFI WITH A CONTINUOUS SPEECH UNDERSTANDING SYSTEM (ICSLP '98)


J. Ferreiros, J. Colás, J. Macías-Guarasa, A. Ruiz, J. M. Pardo
Grupo de Tecnología del Habla - Departamento de Ingeniería Electrónica
E.T.S.I. Telecomunicación - Universidad Politécnica de Madrid
Ciudad Universitaria s/n, Madrid, Spain

General Architecture

[Block diagram]
• Modules: Speech Recogniser, Tagger, Tags Refiner, Understanding, Actuator, Speech Generation Module, Text to Speech, IR-LED
• Knowledge sources: SCHMM + Word Pair, Tagged Dictionary, Context Dependent Rules (two blocks), HIFI Status, Alternative Expressions

Speech recogniser

• Characteristics:
  – Continuous speech commands
  – One-pass search with word-pair grammar
  – 163 words
  – SCHMM phone models
• Implementation:
  – Front-end: DSP LSI board
  – Rest of processing: PC
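To illustrate what a word-pair grammar constrains, here is a minimal sketch in Python. A word-pair grammar only records which words may follow which; the vocabulary and successor sets below are hypothetical and are not taken from the 163-word system, and the real recogniser applies the constraint inside its one-pass search rather than as a check on finished sentences.

```python
# Hypothetical word-pair table: for each word, the set of words allowed next.
# "<s>" and "</s>" mark sentence start and end (an assumption of this sketch).
WORD_PAIRS = {
    "<s>": {"set", "switch"},
    "set": {"the"},
    "switch": {"the"},
    "the": {"volume", "radio"},
    "volume": {"higher"},
    "radio": {"on"},
    "higher": {"</s>"},
    "on": {"</s>"},
}

def allowed(words):
    """Return True if every adjacent word pair is permitted by the table."""
    path = ["<s>", *words, "</s>"]
    return all(b in WORD_PAIRS.get(a, set()) for a, b in zip(path, path[1:]))

print(allowed(["set", "the", "volume", "higher"]))
print(allowed(["volume", "the", "set"]))
```

Because only adjacent pairs are checked, such a grammar also accepts some odd sentences (here, "set the radio on"), which is one reason the later understanding stages need to be robust.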

Speech understanding (I)

• TAGGER:
  – 78 semantic tags
  – Several tags can be applied to each word
  – “garbage” tag used for words that carry no meaning
    • Gives robustness against speech recogniser errors
    • Will allow out-of-vocabulary (OOV) words in the recognised string
      Example: “Please, set the volume higher”
  – Tagging is specified directly in the lexicon
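The tagging step can be pictured as a dictionary lookup. The sketch below is only an illustration: the lexicon entries and tag names are hypothetical (the real system uses 78 tags), and treating every word missing from the lexicon as "garbage" is an assumption made here to keep the example short.

```python
# Hypothetical tagged lexicon: each word maps to one or more semantic tags.
LEXICON = {
    "set": ["action"],
    "volume": ["parameter"],
    "higher": ["increment"],
    "right": ["position", "increment"],  # a word with several tags
}
GARBAGE = "garbage"

def tokenize(sentence):
    """Lowercase the sentence and strip simple punctuation from each word."""
    words = (w.strip(",.").lower() for w in sentence.split())
    return [w for w in words if w]

def tag_words(words):
    """Pair each word with its tags; unknown words get the garbage tag (assumption)."""
    return [(w, LEXICON.get(w, [GARBAGE])) for w in words]

tagged = tag_words(tokenize("Please, set the volume higher"))
for word, tags in tagged:
    print(word, tags)
```

In this sketch "please" and "the" end up tagged as garbage, so later stages can ignore them without failing on the sentence.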

Speech understanding (II)

• TAGS REFINER:
  – Aims:
    • Number processing
    • Disambiguation of words with several tags
    • “garbage” removal
  – May change the literal of the words: “two five” → “25”
  – May introduce new refined semantic tags
  – Context dependent rules. Example:
      word: “right”
      tags: “position”, “increment”
      rule: “if there exists any other word tagged as a tape parameter, then the word right is the position of this tape, else it is an increment indicator”
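A rough sketch of the three refiner aims, assuming the tagger output is a list of (word, tags) pairs. The tag names ("tape_parameter", "number") and the digit-joining behaviour are assumptions chosen to reproduce the two examples on the slide; the actual rule set is not given in the source.

```python
DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}

def refine(tagged):
    """Turn (word, [tags]) pairs into (literal, single_tag) pairs."""
    # Context for the "right" rule: is any word tagged as a tape parameter?
    has_tape_param = any("tape_parameter" in tags for _, tags in tagged)
    refined = []
    for word, tags in tagged:
        if tags == ["garbage"]:
            continue  # garbage removal
        if word in DIGITS:
            # Number processing: join consecutive digit words ("two five" -> "25").
            if refined and refined[-1][1] == "number":
                refined[-1] = (refined[-1][0] + DIGITS[word], "number")
            else:
                refined.append((DIGITS[word], "number"))
        elif word == "right" and set(tags) == {"position", "increment"}:
            # Context dependent disambiguation of a word with several tags.
            refined.append((word, "position" if has_tape_param else "increment"))
        else:
            refined.append((word, tags[0]))
    return refined

print(refine([("track", ["parameter"]), ("two", ["number"]), ("five", ["number"])]))
print(refine([("tape", ["tape_parameter"]), ("right", ["position", "increment"])]))
```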

Speech understanding (III)

• UNDERSTANDING STAGE:
  – Context dependent rules
    • Give independence from the order of the concepts
  – The rules try to fill in frames:
      SUBSYSTEM=(radio,cd-player,cassette,...)
      PARAMETER=(volume,tone,broadcast station,song,...)
      VALUE=(higher,number,...)
  – One or several frames for each command
  – More specific rules are executed first
  – We also fill in message strings
    • With the “reasoning”
    • With the problems in the understanding stage
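Frame filling can be sketched as follows. Only the slot names SUBSYSTEM, PARAMETER and VALUE come from the slide; the tag-to-slot mapping, the message wording, and the single-frame simplification are assumptions of this example, and the real system uses prioritised context dependent rules instead of a plain lookup.

```python
# Hypothetical mapping from refined tags to frame slots.
SLOT_OF_TAG = {
    "subsystem": "SUBSYSTEM",
    "parameter": "PARAMETER",
    "increment": "VALUE",
    "number": "VALUE",
}

def fill_frame(refined):
    """Fill one frame from (literal, tag) pairs and collect message strings."""
    frame = {"SUBSYSTEM": None, "PARAMETER": None, "VALUE": None}
    messages = []
    for literal, tag in refined:
        slot = SLOT_OF_TAG.get(tag)
        if slot is None:
            continue  # tag does not contribute to the frame
        if frame[slot] is None:
            frame[slot] = literal
            messages.append(f"{literal} fills {slot}")  # the "reasoning"
        else:
            # a problem found in the understanding stage
            messages.append(f"problem: {slot} is already {frame[slot]}, ignoring {literal}")
    return frame, messages

frame_a, _ = fill_frame([("volume", "parameter"), ("higher", "increment")])
frame_b, _ = fill_frame([("higher", "increment"), ("volume", "parameter")])
print(frame_a == frame_b)
```

Both word orders give the same frame because slots are selected by tag and not by position, which is the order independence the slide refers to.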

Speech understanding (IV)

• ACTUATOR:
  – Sends IR commands to the HIFI set
  – Keeps track of the set status
  – Informs the user of the actions performed or the problems found
      USER: “switch the radio on”
      ACTUATOR: “The radio was already on”
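The actuator behaviour in the example dialogue can be sketched as a small class that keeps the set status and only emits a command when the state actually changes. The class name, the status table, and the `sent` list standing in for the IR transmitter are all hypothetical.

```python
class Actuator:
    """Toy actuator: tracks on/off status and records the commands it would send."""

    def __init__(self):
        self.status = {"radio": "off", "cd-player": "off", "cassette": "off"}
        self.sent = []  # stand-in for the IR commands sent to the HIFI set

    def switch(self, subsystem, state):
        """Switch a subsystem on or off and return a message for the user."""
        if subsystem not in self.status:
            return f"I do not know the subsystem {subsystem}"
        if self.status[subsystem] == state:
            return f"The {subsystem} was already {state}"
        self.sent.append((subsystem, state))
        self.status[subsystem] = state
        return f"The {subsystem} is now {state}"

actuator = Actuator()
print(actuator.switch("radio", "on"))
print(actuator.switch("radio", "on"))
```

Keeping the status on the controller side matters because an IR link is one-way: the HIFI set cannot report its state back.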

Speech generation

  – Input: pattern string of both literals and concepts coming from the rest of the architecture
  – Randomly substitutes each concept with text to achieve a certain degree of naturalness / variety
      Input: “C_SEEING the word higher with an increment meaning, C_THINK that put means an increasing action”
      C_SEEING → "As I can see", "As I have discovered", "As it appears", ...
      C_THINK → "I think", "I imagine", "I suppose", ...
  – Output through a text-to-speech subsystem
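The random concept substitution can be sketched in a few lines. The alternatives are the ones listed on the slide; the whitespace tokenisation and the function name `generate` are assumptions of this example.

```python
import random

# Alternative expressions for each concept, as listed on the slide.
ALTERNATIVES = {
    "C_SEEING": ["As I can see", "As I have discovered", "As it appears"],
    "C_THINK": ["I think", "I imagine", "I suppose"],
}

def generate(pattern, rng=random):
    """Replace each concept token with a randomly chosen alternative expression."""
    out = []
    for token in pattern.split():
        if token in ALTERNATIVES:
            out.append(rng.choice(ALTERNATIVES[token]))
        else:
            out.append(token)  # literals pass through unchanged
    return " ".join(out)

PATTERN = ("C_SEEING the word higher with an increment meaning, "
           "C_THINK that put means an increasing action")
print(generate(PATTERN))
```

Each call may produce a different sentence, which is the source of the variety mentioned above; passing a seeded `random.Random` instance as `rng` makes the choice repeatable for testing.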

CONCLUSIONS & FUTURE WORK

  – Supporting ideas of the system:
    • Semantic-like tagging
    • Context dependent rules
    • “garbage” tag
    • Pattern-based generation
    • Random concept substitution for generation
  – Desirable new aspects:
    • Use more of the information in the recognised sentences
    • Handle more complex commands by introducing semantic-syntactic parsing of the sentence structure
    • Introduce dialogue to recover information that was not understood or not given, and as a confirmation strategy