SPEECH TECHNOLOGY An Overview Gopala Krishna. A

Presentation transcript:

SPEECH TECHNOLOGY An Overview Gopala Krishna. A (gopalakrishna@students)

SPEECH TECHNOLOGY: WHAT IS SPEECH TECHNOLOGY ABOUT?
Speech technology is about processing human speech
- as a signal
- as a form of language

SPEECH TECHNOLOGY
Speech Processing By Machine
Algorithms: Speech Recognition, Synthesis, Coding, etc.

SPEECH TECHNOLOGY: WHAT IS INVOLVED IN PROCESSING SPEECH?
MULTI-DISCIPLINARY FIELD (disciplines contributing to Speech Technology)
- Linguistics
- Physiology
- Psychology
- Signal Processing
- Acoustics (Physics)
- Statistics
- Pattern Recognition
- Communication Theory
- Computer Science: A.I., Heuristics / Machine Learning

Applications:
Man-Machine Communication
- Smart Talking Machines, devices
- Speech-enabled web interface
Communication
- Speech Coding
- Speech Enhancement
Bio Metrics
- Speaker Identification (security applications)
Entertainment Technology
- Singing Voices
- Voice Conversion
- Artificial Characters / Avatar

Research Areas:
- Speech Recognition (Speech to Text)
- Speech Synthesis (Text to Speech)
- Speech Coding (Compression of speech)
- Speech Enhancement (Voice quality)
- Speaker Recognition (Identity of the speaker)
- Spoken Language Identification (Which language?)
- Language Models (Modeling of natural text)
- Multimedia (Integration of Audio & Visual modes)
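As a rough illustration of the "Language Models (Modeling of natural text)" item, here is a minimal sketch of a bigram model with add-one smoothing. This is not taken from the slides: the function names, the toy corpus, and the choice of add-one smoothing are all assumptions made for the example.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences.

    Each sentence is wrapped in <s> ... </s> boundary markers.
    """
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_probability(prev, word, unigrams, bigrams):
    """Estimate P(word | prev) with add-one (Laplace) smoothing."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


# Toy corpus (hypothetical, for illustration only).
corpus = ["speech is a signal", "speech is a form of language"]
unigrams, bigrams = train_bigram_model(corpus)

p_seen = bigram_probability("speech", "is", unigrams, bigrams)
p_unseen = bigram_probability("speech", "language", unigrams, bigrams)
print(p_seen > p_unseen)  # a word pair seen in the corpus scores higher
```

In a recognizer, scores like these would be combined with acoustic scores to prefer word sequences that are plausible as text; a real system would train on a much larger corpus and use a stronger smoothing scheme.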

SPEECH TECHNOLOGY: WHAT ARE WE DOING CURRENTLY?
1. INDIAN LANGUAGE SPEECH SYNTHESIS
 - Hindi and Telugu TTS building
 - Prosody
2. SPEECH RECOGNITION
 - Large Vocabulary ASR
 - Landmark-based ASR
3. SPEECH-ENABLED INTERFACES

Text-To-Speech Synthesis (TTS)
Indian Language TTS Effort
- Hindi, Telugu, Tamil, Kannada, Bengali
Text Normalization
- Machine learning techniques
Speech Segmentation
- Ergodic HMMs, SVMs, etc.
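To make the "Text Normalization" step concrete, the sketch below shows what it has to accomplish: turning written tokens such as digits and abbreviations into speakable words before synthesis. It is a deliberately simple rule-based version in English, not the machine learning approach the slide mentions and not the group's system. The abbreviation table, the function names, and the 0..999 number range are assumptions for illustration.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Hypothetical abbreviation table; a real system needs a far larger one.
ABBREVIATIONS = {"dr.": "doctor", "prof.": "professor"}


def number_to_words(n):
    """Spell out an integer in the range 0..999."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words if rest == 0 else words + " " + number_to_words(rest)


def normalize(text):
    """Lowercase the text, then expand known abbreviations and short numbers."""
    tokens = []
    for token in text.lower().split():
        if token in ABBREVIATIONS:
            tokens.append(ABBREVIATIONS[token])
        elif re.fullmatch(r"\d{1,3}", token):
            tokens.append(number_to_words(int(token)))
        else:
            tokens.append(token)
    return " ".join(tokens)


print(normalize("Dr. Smith gave 42 lectures"))
# doctor smith gave forty two lectures
```

Hand-written rules like these stop scaling once dates, currency, and ambiguous tokens appear, which is one reason learned approaches are attractive for this step.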

Automatic Speech Recognition
Large Vocabulary ASR
- “Mimic”ing the Sphinx
Alternative ASR Techniques
- HMM-ANN Hybrid
- Dynamic Bayesian Networks (DBNs)
- Landmark-based Segmentation

Speech Enabled Interfaces
Screen Readers
- RAVI (Reading Aid for the Visually Impaired)
- Porting to low-memory devices
Talking Tourist Aid
- Agent (PDA)
Speech-to-Speech Devices
- Limited-domain bilingual translation

Projects
- Hindi TTS (Sponsored by NOKIA)
- Telugu TTS (Sponsored by Bhrigus Inc.)
- Speech Recognition Systems for Indian Languages (Sponsored by HP Labs India)
- Reading Software for the Blind (Sponsored by Ministry of Social Justice)

Stream Courses
- Speech Technology: A Practical Introduction
- Topics in Speech Processing
- Building TTS and ASR Systems
- Signal Processing
- Language Modeling
 - Intro. to NLP
 - Language and Statistics
- Machine Learning, Pattern Recognition

SPEECH TECHNOLOGY: WHO WILL YOU BE WORKING WITH?
- S. P. Kishore (Ph.D. @ CMU, Scientist @ IIIT)
- Prof. Rajeev Sangal
- Dr. Vasudeva Varma
- Faculty members of the Speech Group at CMU: Dr. Alan Black (TTS), Dr. Jim Baker (ASR)…..
- Fellow researchers

What you get at the end
- Career opportunities: of late, many companies and R&D organizations are investing in Indian languages, and specifically in speech systems, for example Microsoft Research (Bangalore), HP Labs India, Yahoo India
- Research skills: publications
- Interaction and collaboration with faculty members of the Speech Group at Carnegie Mellon
- System-building skills: you would have developed speech systems using state-of-the-art techniques
- Opportunities for higher education in India and abroad

SPEECH TECHNOLOGY QUESTIONS [ skishore@cs.cmu.edu ]