Audio-visual Speaker Association. Zhijie Shao, Master of Computer Science. Supervisor: Trent Lewis.

Presentation transcript:

Audio-visual Speaker Association
Zhijie Shao, Master of Computer Science
Supervisor: Trent Lewis

Project Schedule
Open MARY → Voice Import Tool → AusTalk → A New Voice → Evaluation

Open MARY
What is MARY? MARY stands for Modular Architecture for Research on speech sYnthesis, an open-source text-to-speech platform.
Why MARY? It is modular, open source, and designed so that new voices can be added.
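To show how MARY is typically driven, here is a minimal sketch that requests synthesized audio from a locally running MARY TTS server; it assumes the server is on its default port 59125 and exposes the standard /process endpoint, and the voice name is only an example that must match an installed voice.

```python
# Minimal sketch: ask a locally running MARY TTS server to synthesize a sentence.
# Assumes the default port (59125) and the /process endpoint; the voice name is
# an example and must be replaced with a voice that is actually installed.
import urllib.parse
import urllib.request

params = urllib.parse.urlencode({
    "INPUT_TEXT": "Hello from MARY.",
    "INPUT_TYPE": "TEXT",
    "OUTPUT_TYPE": "AUDIO",
    "AUDIO": "WAVE_FILE",
    "LOCALE": "en_US",
    "VOICE": "cmu-slt-hsmm",  # example voice name
})

with urllib.request.urlopen("http://localhost:59125/process?" + params) as response:
    wav_bytes = response.read()

with open("mary_output.wav", "wb") as f:
    f.write(wav_bytes)
print(f"Wrote {len(wav_bytes)} bytes to mary_output.wav")
```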

Voice Import Tool
Imports new voices into the MARY environment. It works from two kinds of files:
1. Wave files (the recordings)
2. Text files in MARY format (the corresponding transcriptions)
Goal: import the Blizzard Challenge data into MARY, including the English audiobook data and training data.
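Before running the import, it helps to confirm that every recording has a transcription. The sketch below assumes a database layout of wav/*.wav paired with text/*.txt by basename; the layout the Voice Import Tool actually expects may differ, so treat the paths as placeholders.

```python
# Sketch: sanity-check a voice-building database before import.
# Assumes wav/*.wav paired with text/*.txt by basename; adjust the
# directory names to match the layout your import tool expects.
from pathlib import Path

def check_voice_db(root):
    root = Path(root)
    wavs = {p.stem for p in (root / "wav").glob("*.wav")}
    txts = {p.stem for p in (root / "text").glob("*.txt")}

    print(f"{len(wavs & txts)} usable utterances")
    if wavs - txts:
        print("recordings without a transcription:", sorted(wavs - txts))
    if txts - wavs:
        print("transcriptions without a recording:", sorted(txts - wavs))

check_voice_db("blizzard_db")  # hypothetical database directory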

AusTalk
Audio-visual speech processing (AVSP) requires large datasets. AusTalk is the largest-ever auditory-visual database of Australian speech.
HCS vLab (Human Communication Science Virtual Laboratory): a platform for eResearch in human communication science that provides access to corpora such as AusTalk.

A New Voice
Text Analysis:
- Text Normalization
- Homonym Disambiguation
- Grapheme-to-Phoneme (Letter-to-Sound)
- Intonation
Waveform Generation:
- Unit Selection
- Diphones
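To make the unit-selection step concrete, here is a minimal sketch of the usual search that balances target cost (how well a recorded unit matches the target specification) against join cost (how smoothly consecutive units concatenate). The features, cost functions, and tiny candidate inventory are invented for illustration and are not MARY's actual implementation.

```python
# Minimal unit-selection sketch (not MARY's implementation): choose one candidate
# recorded unit per target phone so that the sum of target cost and join cost is
# minimised, using a Viterbi-style dynamic programming search.

def target_cost(spec, unit):
    # Penalise duration and pitch mismatch against the target specification.
    return abs(spec["dur"] - unit["dur"]) + abs(spec["f0"] - unit["f0"])

def join_cost(prev_unit, unit):
    # Penalise pitch discontinuities at the concatenation point.
    return abs(prev_unit["f0"] - unit["f0"])

def select_units(targets, candidates):
    """candidates[i] is the list of recorded units available for targets[i]."""
    # best[i][j] = (cumulative cost, backpointer) for candidate j at position i
    best = [[(target_cost(targets[0], u), None) for u in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in candidates[i]:
            tc = target_cost(targets[i], u)
            cost, back = min(
                (best[i - 1][k][0] + join_cost(p, u) + tc, k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((cost, back))
        best.append(row)
    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return list(reversed(path))

# Tiny invented example: two target phones, two candidate units each.
targets = [{"dur": 80, "f0": 120}, {"dur": 60, "f0": 110}]
candidates = [
    [{"dur": 75, "f0": 118}, {"dur": 90, "f0": 140}],
    [{"dur": 55, "f0": 112}, {"dur": 65, "f0": 95}],
]
print(select_units(targets, candidates))
```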

Evaluation
Criteria for assessing the new voice:
- Emotion
- Quality
- Intonation
- Consistency
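Listening tests such as mean opinion score (MOS) ratings are one common way to put numbers on criteria like these; the snippet below is only a sketch with invented 1-to-5 ratings, not results from the project.

```python
# Sketch: aggregate 1-5 listener ratings per evaluation criterion into a
# mean opinion score (MOS) with a simple standard-error estimate.
# The ratings below are invented placeholder data.
from statistics import mean, stdev

ratings = {
    "quality":     [4, 5, 3, 4, 4],
    "intonation":  [3, 4, 4, 3, 5],
    "consistency": [4, 4, 5, 4, 3],
}

for criterion, scores in ratings.items():
    mos = mean(scores)
    se = stdev(scores) / len(scores) ** 0.5
    print(f"{criterion}: MOS = {mos:.2f} +/- {se:.2f} (n = {len(scores)})")
```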

Conclusion
Final objective: create a new voice.
Remaining work follows the time schedule outlined in the Project Schedule slide.