Spoken Language Systems: The Unfinished Agenda Raj Reddy School of Computer Science Carnegie Mellon University Pittsburgh September 21, 2006 The entire.

Presentation transcript:

Spoken Language Systems: The Unfinished Agenda Raj Reddy School of Computer Science Carnegie Mellon University Pittsburgh September 21, 2006 The entire 67MB talk with video clips can be downloaded from

Speech Language Systems
Objective: Recognize, interpret, execute and respond to spoken language input to computers
Background:
– AT&T, CMU, IBM, and MIT have been working on the problem for over 40 years
– Other key contributors: BBN, Dragon Systems, Kurzweil, SRI, Japan Inc., Europe Inc.
– Research and development level of effort: about $200 million/year worldwide
Long-term goal: Make speech the preferred mode of communication with computers

Why Speech Processing Has Been Difficult?
Too Many Sources of Variability
Noise
Microphones
Speakers
Different Speech Sounds
Different Pronunciations
Non-Grammaticality
Imprecision of Language

Why Speech Recognition Has Been Difficult? (Cont)
And Many Sources of Knowledge
– Acoustics
– Phonetics and Phonology
– Lexical Information
– Syntax
– Semantics
– Context
– Task-Dependent Knowledge

Landmarks
Dragon Dictate and NaturallySpeaking
IBM ViaVoice dictation
Nuance-based Tellme 800 services, which allow voice queries for directory information, stocks, sports, news, weather, and horoscopes
Microsoft Speech Server, e.g. voice dialing

Need for Interdisciplinary Teams
Signal Processing
– Fourier Transforms, DFT, FFT
Acoustics
– Physics of sounds & speech
– Vocal tract model
Phonetics and Linguistics
– Sounds (Acoustic-Phonetics)
– Words (Lexicon)
– Grammar (Syntax)
– Meaning (Semantics)
Statistics
– Probability Theory
– Hidden Markov Models
– Clustering
– Dynamic Programming
AI and Pattern Recognition
– Knowledge Representation and Search
– Approximate Matching
– Natural Language Processing
Human Computer Interaction
– Cognitive Science
– Design
– Social Networks
Computer Science
– Hardware, Parallel Systems
– Algorithms Optimization

The Unfinished Agenda
Technical
Application-specific
Societal

Technical Challenges
Unrehearsed Spontaneous Speech
Non-Native Speakers of English
Dynamic Learning from Sparse Data
– New Words
– New Speakers
– New Grammatical Forms
– New Languages
No Silver Bullet on the Horizon! 50 more years?
– Million times greater computational power, memory and bandwidth?

One Application Specific Challenge: The Million Book Digital Library Project

The Grand Challenge of Digital Libraries
Create access to all published works online
Instantly available
In any language
Anywhere in the world
Searchable, browsable, navigable
By humans and machines

One Step at a Time…
Million Book DL
– Only about 1% of all the world’s books
Harvard University: 12M
Library of Congress: 30M
OCLC catalog: 42M
All Multilingual Books: ~100M
At the rate of digitization of the last decade it would take 100 years!

Million Book Project: Issues
Time
– At one page per second (20,000 pages per day shift), it will take 100 years (200 working days per year) to scan a million books of 400 pages each
Cost
– 100M books at US$100 per book would cost $10B
– Even in India and China the cost will be $1B
– The annual cost is currently expected to be close to $10M per year, with support from the US, India and China
Selection
– Selection of appropriate books for scanning is time-consuming and expensive
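The time and cost figures on this slide can be checked with a few lines of arithmetic. The following is a minimal sketch using only the numbers quoted above; the function and constant names are illustrative, and the per-book cost for India and China is not stated on the slide but is implied by the $1B total.

```python
# Figures quoted on the slide; all names here are illustrative.
PAGES_PER_BOOK = 400
PAGES_PER_DAY_SHIFT = 20_000      # roughly one page per second
WORKING_DAYS_PER_YEAR = 200


def years_to_scan(books: int) -> float:
    """Years needed to scan `books` books at one scanner's daily rate."""
    total_pages = books * PAGES_PER_BOOK
    pages_per_year = PAGES_PER_DAY_SHIFT * WORKING_DAYS_PER_YEAR
    return total_pages / pages_per_year


def scanning_cost(books: int, cost_per_book: float) -> float:
    """Total cost in US dollars for scanning `books` books."""
    return books * cost_per_book


million_book_years = years_to_scan(1_000_000)          # time for a million books
all_books_cost = scanning_cost(100_000_000, 100)       # 100M books at $100 each
# Per-book cost implied by the $1B estimate for India and China (an inference).
implied_low_cost_per_book = 1_000_000_000 / 100_000_000

print(million_book_years, all_books_cost, implied_low_cost_per_book)
```

Under these assumptions a single scanning station working one shift needs 100 years for a million books, which is why the project depends on many stations running in parallel.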

Million Book Project: Issues (cont)
Logistics
– Each container holds 10,000 to 20,000 books. Shipping and handling costs about $10,000
Metadata
– Accessing and/or creating metadata requires professionals trained in library science
Optical Character Recognition Technology
– Essential for searching, translation and summarization
– Many languages don’t have OCR

Million Book Project: Status
18 centers in India
22 centers in China
1 center in Egypt
Planned: Australia and Europe
Over 200,000 books scanned
– Over 50,000 accessible on the web
– Uses 4TB of storage
– 10 TB server at CMU Library
– 500,000 books by the end of 2006
– Capacity to scan a million pages a day
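To put the stated capacity of a million pages a day in perspective, it can be converted into books per year. This is a rough sketch that borrows two assumptions from the project's own time estimate (400 pages per book and 200 working days per year); the names are illustrative and the result is an upper bound on throughput, not a reported figure.

```python
# Capacity quoted on the status slide.
PAGES_PER_DAY = 1_000_000
# Assumptions carried over from the project's time estimate.
PAGES_PER_BOOK = 400
WORKING_DAYS_PER_YEAR = 200

books_per_day = PAGES_PER_DAY // PAGES_PER_BOOK
books_per_year = books_per_day * WORKING_DAYS_PER_YEAR
years_for_million_books = 1_000_000 / books_per_year

print(books_per_day, books_per_year, years_for_million_books)
```

At full capacity, and under these assumptions, the combined centers could scan a million books in about two years rather than the century a single station would need.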

Million Book Project: Research Challenges
Providing access to billions every day
– Distributed cached servers in every country and region
Easy-to-use interfaces for billions
Multilingual Information Retrieval
Translation
Summarization
Reading Assistant using multilingual speech synthesis and translation (e.g. for a newspaper DL)

Bringing the World Closer: Robust Communication among the People of the World

Vision
Preservation of minority languages, cultures and heritage
Study of human language, including
– Translation
– Summarization
– Speech
– Search
Facilitate the use of ICT in languages other than English
– In communication among uneducated people of the world
– In commerce
– In search and access to knowledge across all languages
Globalization requires cross-border and cross-language communication
Eliminate cultural and social barriers
Language barriers can significantly slow down economic growth
Access to rare (and potentially beneficial) knowledge requires eliminating the language divide

Research Agenda: What we must do
Create technologies and solutions for overcoming the language barrier
Create toolkits for rapid acquisition of new language capabilities
– Character codes, optical character recognition, speech recognition, speech synthesis, translation, search engines, text mining, summarization, language tutoring, etc.
Capture data, information and knowledge from the masses
Make fundamental advances in language processing algorithms, e.g.,
– Deal with 1000 times more data
– Conceptual advances in semantic information retrieval

The Research Plan: How we will do it
Analogy to the Human Genome Project
Meticulous core-science based fundamentals
Researcher toolkits for known methodologies
Architecture supporting diversity of methodologies
Long planning horizon to support development of novel and radical approaches
Quantitative evaluation against a standard of steadily accumulating improvements in performance