SSML 1.1 - The Internationalization of the W3C Speech Synthesis Markup Language. SpeechTek 2007 – C102 – Daniel C. Burnett.

Presentation transcript:

1 SSML 1.1: The Internationalization of the W3C Speech Synthesis Markup Language
SpeechTek 2007 – C102 – Daniel C. Burnett

2 Overview
SSML 1.0
Why SSML 1.1?
SSML 1.1 scope
Selected features
Examples
–voice/xml:lang
–pronunciation alphabets
–<w> element
For more info...

3 SSML 1.0
W3C Recommendation in 2004
Widely implemented – the primary authoring API for TTS engines
Many extensions

4 Why SSML 1.1?
Extensions to SSML 1.0 primarily address language-related phenomena
Workshops in China, Greece, and India to understand motivations for these extensions
–How to correct tones for East Asian languages?
–How to handle transliteration for Indian languages?
–How to indicate word boundaries for written languages that do not display them?
–How to precisely control voice and language changes?

5 SSML 1.1 scope
Provide broadened language support
–For Mandarin, Cantonese, Hindi*, Arabic*, Russian*, Korean*, and Japanese, we will identify and address language phenomena that must be addressed to enable support for the language. Where possible we will address these phenomena in a way that is most broadly useful across many languages. We have chosen these languages because of their economic impact and expected group expertise and contribution.
–We will also consider phenomena of other languages for which there is both sufficient economic impact and group expertise and contribution.
Fix incompatibilities with other Voice Browser Working Group languages, including PLS, SRGS, and VoiceXML 2.0/2.1.
Out of scope:
–VCR-like controls: fast-forward, rewind, pause, resume
–New values. Collecting requirements for future work is okay
* provided there is sufficient group expertise and contribution for these languages

6 SSML 1.1 scope – some workshop topics
In scope
–Token/word boundaries
–Phonetic alphabets
–Tones
–Part of Speech support
–Text w/multiple languages (separate control of xml:lang and voice)
–Subword annotation (partial)
–Syllable-level markup (partial)
Out of scope
–Providing number, case, gender info
–Simplified/alternate/SMS text
–Transliteration
–Expressive (emotion) elements
–Enhanced prosody rate control

7 Selected new features
SSML 1.1 is a Working Draft – everything from this point on is subject to change
Improved lexicon activation control
Better linkage with PLS lexicons
Clearer separation between xml:lang (document text content) and voice selection
Improved author control of behavior upon xml:lang/voice selection mismatch
Introduction of a Pronunciation Alphabet Registry to allow use of standardized pinyin, jyutping, and other language-specific pronunciation alphabets in addition to the IPA default
New <w> element for marking word boundaries

8 Examples – voice/xml:lang
The next few examples demonstrate some of the new SSML 1.1 features that provide
–Clearer separation between xml:lang (document text content) and voice selection
–Improved author control of behavior upon xml:lang/voice selection mismatch

9 Simple example
I want a big pepperoni pizza.
Will find voices that can read US English, each time.
Voice changes are scoped, so the same voice is used for “I want” and “pizza.”
The “name” and “gender” values are requests only, and not required in order for voice selection to be successful.
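A minimal sketch of the kind of markup this slide describes, assuming the `voice` attributes of the final SSML 1.1 Recommendation (the 2007 Working Draft may have differed). The voice name `Mike`, the gender value, and the way the sentence is split across the inner voices are hypothetical; `languages="en-US"` is written on every voice so that each selection asks for US English.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <!-- Outer voice: in scope for "I want" and again for "pizza." -->
  <voice languages="en-US">
    I want
    <!-- name and gender are only requests here (hypothetical values) -->
    <voice languages="en-US" name="Mike">a big</voice>
    <voice languages="en-US" gender="female">pepperoni</voice>
    pizza.
  </voice>
</speak>
```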

10 “required” attribute
I want a big pepperoni pizza.
Now the name and gender attributes, respectively, are required rather than merely requested.
The “required” attribute lists *all* required voice selection features, so the two inner voices might not be able to speak English.
If one of the inner voices cannot read/speak English, the processor can decide what to do (skip the text, try to read it anyway, or change voice).
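A hedged sketch of how the “required” attribute might be written, again assuming the attribute names of the final SSML 1.1 Recommendation; the voice name `Mike` and the gender value are hypothetical. Because `required` names only `name` or `gender` on the inner voices, language is no longer among their required features.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice languages="en-US">
    I want
    <!-- Only the name is required; the selected voice may not speak English -->
    <voice name="Mike" required="name">a big</voice>
    <!-- Only the gender is required -->
    <voice gender="female" required="gender">pepperoni</voice>
    pizza.
  </voice>
</speak>
```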

11 “onlangfailure” attribute
I want a big pepperoni pizza.
Now, when any text is encountered that cannot be spoken by the currently selected voice, it will be skipped by the processor. The voice *will not* change.
Other options are “processorchoice”, “ignorelang”, and “changevoice”.
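A sketch of the skipping behavior, under two assumptions taken from the final SSML 1.1 Recommendation rather than from this slide: the value that skips unspeakable text is `ignoretext`, and `onlangfailure` may be placed on `speak` and is inherited by nested elements. The Chinese sentence fragment is borrowed from a later slide for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis"
       xml:lang="en-US" onlangfailure="ignoretext">
  <voice languages="en-US">
    <!-- If the selected voice cannot speak this text, it is skipped
         and the voice does not change -->
    <s xml:lang="zh-cmn">我想要</s>
    a big pepperoni pizza.
  </voice>
</speak>
```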

12 “onvoicefailure” attribute
<voice languages="en-US" onvoicefailure="keepexisting"> I want a big pepperoni pizza.
What if the processor can’t find a voice that meets the required criteria? In the above example, the processor will keep the voice it had.
This attribute is scoped as well.
Other options are “priorityselect” and “processorchoice”.

13 Language and accent
<voice languages="zh-cmn:en-US en:en-US" onvoicefailure="keepexisting"> 我想要 a big pepperoni pizza.
The first request is for a voice that can speak both English and Mandarin Chinese with a US-English accent.
If voice selection is successful, the voice will be able to speak both the Chinese text and the final “pizza.”
Note that the female voice need not speak either language (as written).
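A possible complete form of this example, offered as a sketch. The outer `voice` start tag is taken from the slide; the inner voices, including the hypothetical name `Mike`, are assumptions modeled on the earlier “required” slide, and the closing tags are added so the document is well-formed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <!-- Each entry is language:accent; keep the current voice if none matches -->
  <voice languages="zh-cmn:en-US en:en-US" onvoicefailure="keepexisting">
    我想要
    <voice name="Mike" required="name">a big</voice>
    <!-- Only gender is required, so this voice need not speak either language -->
    <voice gender="female" required="gender">pepperoni</voice>
    pizza.
  </voice>
</speak>
```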

14 Examples – pronunciation alphabets
Developing a new Pronunciation Alphabet Registry
Experts can register pronunciation alphabets for their languages
Can also register historically used alphabets such as ARPAbet and Worldbet
First entries will likely be pinyin, jyutping
此<phoneme alphabet="pinyin" ph="chu4">处</phoneme>不准照相。

15 Examples – <w> element
The <w> element helps resolve ambiguities for languages that may not visually separate words.
Markup is allowed within <w> but does not cause word separation (unlike in the rest of SSML) => allows for sub-word markup.
<!-- Ambiguous sentence is 南京市长江大桥 -->
南京市 长江大桥
南京市长 江大桥
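The two segmentations above can be sketched in markup as follows, assuming the word-boundary element is `<w>` as in the SSML 1.1 drafts. The English glosses in the comments are added for illustration and are not from the slide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="zh-cmn">
  <!-- Ambiguous sentence is 南京市长江大桥 -->
  <!-- Reading 1: "Nanjing City" + "Yangtze River Bridge" -->
  <s><w>南京市</w><w>长江大桥</w></s>
  <!-- Reading 2: "Mayor of Nanjing" + the personal name "Jiang Daqiao" -->
  <s><w>南京市长</w><w>江大桥</w></s>
</speak>
```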

16 For more info... Information about the Voice Browser Working Group can be found at Current SSML drafts: – –