SSML extensions for multi-language usage Davide Bonardo W3C Workshop on Internationalizing SSML Crete, 30-31 May 2006.

Presentation transcript:

SSML extensions for multi-language usage Davide Bonardo W3C Workshop on Internationalizing SSML Crete, May 2006

2 About Loquendo
R&D of speech technology
Over 30 years' experience (from CSELT laboratories)
Technologies:
– TTS (text to speech)
– ASR (automatic speech recognition) & SV (speaker verification)
Solutions:
– Easy integration of speech technologies
– Speech servers (MRCPv1 & v2 protocols)
– Speech platforms (VoiceXML & CCXML interpreters)
– Embedded solutions (for many OS and devices)

3 Ideas for SSML extensions
<say-as> element
– Extension of the values for the “interpret-as” attribute
New element

4 Proposal 1: <say-as> extension (1/3)
Problem:
– How to interpret a part of an input text
– Different dialog contexts require different interpretations
– The interpretation could be language dependent
Many contexts could be defined: sms, e-mails, news, applications for rescue operations, …
TTS engines may use context information to activate the best configuration for:
– reading acronyms
– expanding abbreviations
– using customized prosodic phrasing
– activating a special reading style

5 Proposal 1: <say-as> extension (2/3)
Proposal: extend the “interpret-as” attribute with new values, for instance:
– sms
–
– news
– banking
– navigation
– …

6 Proposal 1: <say-as> extension (3/3)
Examples:
I call you asap.
I call you asap
Mtfbwu
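The markup of the slide's examples did not survive in this transcript, so the following is only a minimal sketch of what the proposal could look like in practice. It assumes that "sms" is accepted as an "interpret-as" value (SSML 1.0 does not define it), and the EXPANSIONS table and the expand function are hypothetical stand-ins for an engine's context-specific configuration.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"

# Hypothetical per-context abbreviation tables; a real engine would use
# its own, much richer, context configuration.
EXPANSIONS = {
    "sms": {"asap": "as soon as possible"},
}

# "sms" is one of the values proposed on the slides, not an SSML 1.0 value.
ssml = (
    f'<speak version="1.0" xml:lang="en-US" xmlns="{SSML_NS}">'
    'I call you <say-as interpret-as="sms">asap</say-as>.'
    "</speak>"
)


def expand(ssml_text):
    """Return the plain text with say-as content expanded per context."""
    root = ET.fromstring(ssml_text)
    parts = [root.text or ""]
    for child in root:
        if child.tag == f"{{{SSML_NS}}}say-as":
            # Pick the table for the declared context, if one is known.
            table = EXPANSIONS.get(child.get("interpret-as"), {})
            word = child.text or ""
            parts.append(table.get(word.lower(), word))
        else:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts)


print(expand(ssml))
```

An unknown context simply leaves the text untouched, which mirrors the idea that the attribute is a hint the engine may or may not be able to use.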

7 Proposal 2: New element (1/3)
Problem 1: activating the correct language knowledge at a specific point of the text
The “xml:lang” attribute is currently available in the <speak>, <voice>, <p> and <s> elements
The behavior of the engine could differ:
– In the root element <speak>, “xml:lang” defines the language of the whole document, but for the engine it involves the selection of a voice
– In the <voice> element, it is an important recommendation in order to load the correct voice
– In the <p> and <s> elements, it is mainly language information, and the engine, if able to do so, can use the same voice but different language knowledge (e.g. phonetic mapping)
Problem 2: it could be necessary to specify a language change for a text unit smaller than a sentence.
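To make the current situation concrete, here is a small sketch of how "xml:lang" set on existing SSML 1.0 elements is inherited down the document tree. The sample document and the effective_langs helper are illustrative assumptions; how an engine reacts to each language switch (new voice or same voice with different language knowledge) is engine-specific and not modelled here.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Illustrative document: the language changes at sentence level only.
ssml = f"""<speak version="1.0" xml:lang="en-US" xmlns="{SSML_NS}">
  <p>
    <s>This sentence is in English.</s>
    <s xml:lang="it-IT">La vita è bella.</s>
  </p>
</speak>"""


def effective_langs(elem, inherited=None):
    """Yield (local tag name, effective language) for elem and its descendants."""
    # An element without its own xml:lang inherits the nearest ancestor's value.
    lang = elem.get(XML_LANG, inherited)
    yield elem.tag.split("}")[-1], lang
    for child in elem:
        yield from effective_langs(child, lang)


result = list(effective_langs(ET.fromstring(ssml)))
print(result)
```

The smallest unit that can carry its own language in this sketch is the sentence, which is exactly the limitation described as Problem 2.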

8 Proposal 2: New element (2/3)
Proposal:
– To introduce a new element
– To extend the use of the “xml:lang” attribute to the new element
Advantages:
– It is a generic element
– It is extensible
  – Without attributes, it could be used to give information on the segmentation, where needed
  – With other attributes, it could specify new information for the token (e.g. part of speech)

9 Proposal 2: New element (3/3)
Examples:
The movie is the product of Italian comic sensation Roberto Benigni, who wore three hats for "La vita è bella": director, co-writer, and star.
The movie is the product of Italian comic sensation Roberto Benigni, who wore three hats for "La vita è bella" : director, co-writer, and star.
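The element name used in the slide's examples is not preserved in this transcript. The sketch below therefore uses "token" purely as an assumed name (SSML 1.0 has no such element) to show how a language switch could be marked on a unit smaller than a sentence. The language_runs helper is hypothetical and only splits the text into per-language runs.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# "token" is an assumed element name; it is not part of SSML 1.0.
ssml = (
    f'<speak version="1.0" xml:lang="en-US" xmlns="{SSML_NS}">'
    "<s>who wore three hats for "
    '<token xml:lang="it-IT">La vita è bella</token>'
    ": director, co-writer, and star.</s>"
    "</speak>"
)


def language_runs(elem, inherited=None):
    """Yield (language, text) runs in document order."""
    lang = elem.get(XML_LANG, inherited)
    if elem.text:
        yield lang, elem.text
    for child in elem:
        yield from language_runs(child, lang)
        if child.tail:
            # Tail text belongs to the parent, so it uses the parent's language.
            yield lang, child.tail


runs = list(language_runs(ET.fromstring(ssml)))
for lang, text in runs:
    print(lang, repr(text))
```

Only the film title is handed over as Italian; the surrounding sentence stays English, so an engine that supports it could keep the same voice and switch just its language knowledge for that run.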

10 Conclusions
Proposal 1:
– To increase the number of “interpret-as” values by identifying new speech contexts
Proposal 2:
– To introduce a new element that carries specific information (e.g. the language) for a single word, a phrase, and so on.