Introduction to Computational Linguistics Jay Munson (special thanks to Misty Azara) May 30, 2003.

Presentation transcript:

Introduction to Computational Linguistics Jay Munson (special thanks to Misty Azara) May 30, 2003

Today’s Goals
I. Introduction to Computational Linguistics (CL) through the discussion of 7 CL core areas.
II. Identify common CL applications.
III. Identify the importance of theoretical linguistics in CL.

What is Computational Linguistics?
Essentially, CL is any task, model, algorithm, etc. that attempts to place any type of language processing (syntax, phonology, morphology, etc.) in a computational setting.

What is Computational Linguistics (CL)?
CL is interdisciplinary:
Linguistics
Computer Science
Mathematics
Electrical Engineering
Speech and Hearing Science

Seven Core Areas of CL
1. Machine Translation
2. Speech Recognition
3. Text-to-Speech
4. Natural Language Generation
5. Human-Computer Dialogs
6. Information Retrieval
7. Computational Modeling

1.0 Machine Translation (MT)
Using computers to automate some or all of the process of translating from one language to another.

1.1 MT (cont.)
Three general models or tasks:
Tasks for which a rough translation is adequate.
Tasks where a human post-editor can be used to improve the output.
Tasks limited to a small sublanguage.

1.2 MT (cont.)
Linguistic knowledge is extremely useful in this area of CL.
MT benefits from knowledge of language typology and language-specific linguistic information.
Programs are typically “trained” using pre-translated documents/texts.

1.3 MT Example: KANT, Knowledge-based Machine Translation
The KANT project (Knowledge-based, Accurate Translation for technical documentation) was founded in 1989 for the research and development of large-scale, practical translation systems for technical documentation.
KANT uses a controlled vocabulary and grammar for each source language, and explicit yet focused semantic models for each technical domain, to achieve very high accuracy in translation.
Designed for multilingual document production, KANT has been applied to the domains of electric power utility management and heavy equipment technical documentation.

2.0 Speech Recognition (SR)
Taking spoken language as input and outputting the corresponding text.

2.1 SR - Architecture
SR takes the source speech and produces “guesses” as to which words could correspond to the source via some type of acoustic model.
The word with the highest probability is selected as the optimal candidate.
Contexts are “contained” to improve accuracy.
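
The selection step in 2.1 can be sketched roughly as follows. This is only an illustration, assuming an acoustic model has already produced candidate words with probabilities; the function name `pick_best_candidate`, the candidate words, and the scores are all hypothetical.

```python
def pick_best_candidate(candidates):
    """Return the candidate word with the highest probability.

    `candidates` maps each guessed word to a probability that some
    acoustic model is assumed to have assigned (values are made up).
    """
    if not candidates:
        raise ValueError("no candidate words supplied")
    # max() over the keys, ranked by their associated probability.
    return max(candidates, key=candidates.get)


# Hypothetical guesses for one stretch of audio.
guesses = {"checking": 0.62, "chicken": 0.27, "checkin": 0.11}
best_word = pick_best_candidate(guesses)
print(best_word)
```

A real recognizer would also weigh the surrounding context, which is not modeled in this sketch.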

2.2 Why use SR?
Allows for hands-free human-computer interaction.
Assists in automated telephony.

3.0 Text-to-Speech (TTS)
Taking text as input and outputting the corresponding spoken language.

3.1 Three types of TTS
1. Articulatory: models the physiological characteristics of the vocal tract.
2. Concatenative: uses pre-recorded segments to construct the utterance(s).
ScanSoft: Jennifer and Susana
Speechify: British Female

3.2 Three types of TTS (cont.)
3. Parametric/Formant: models the formant transitions of speech.
ETI-Eloquence: Reed
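
The concatenative approach (type 2 above) can be pictured with a small sketch. It assumes each pre-recorded segment is stored as a list of audio samples; the inventory `RECORDED_UNITS`, its unit names, its sample values, and the function `concatenate_units` are hypothetical and stand in for real recordings.

```python
# Hypothetical inventory: each unit name maps to pre-recorded samples.
RECORDED_UNITS = {
    "hel": [0.1, 0.3, 0.2],
    "lo": [0.4, 0.1],
    "_": [0.0, 0.0],  # short silence
}


def concatenate_units(unit_names, inventory=RECORDED_UNITS):
    """Join pre-recorded segments, in order, into one waveform."""
    waveform = []
    for name in unit_names:
        if name not in inventory:
            raise KeyError(f"no recording for unit {name!r}")
        waveform.extend(inventory[name])
    return waveform


utterance = concatenate_units(["hel", "lo", "_"])
print(len(utterance))
```

The sketch only joins segments end to end; it does nothing to smooth the joins or adjust prosody.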

3.3 Why is TTS so difficult?
Spelling: through, rough, though, thought.
Homographs: PERmit (n) vs. perMIT (v).
Prosody (dependent on context): pitch, duration of segments, phrasing of segments, intonational tune, emotion.
“I am so angry at you. I have never been more enraged in my life!!”
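
The PERmit/perMIT case shows why spelling alone is not enough: the system needs the part of speech before it can place the stress. A minimal sketch, assuming the part of speech is supplied by some earlier analysis step; the table `STRESS_BY_POS` and the function `stressed_form` are hypothetical.

```python
# Hypothetical lookup table: (spelling, part of speech) -> stressed form.
STRESS_BY_POS = {
    ("permit", "noun"): "PERmit",
    ("permit", "verb"): "perMIT",
}


def stressed_form(word, pos):
    """Return the stress-marked form, or the word unchanged if unknown."""
    return STRESS_BY_POS.get((word.lower(), pos), word)


print(stressed_form("permit", "noun"))
print(stressed_form("permit", "verb"))
```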

3.4 Why use TTS?
Allows for text to be read automatically.
Extremely useful for the visually impaired and the hearing impaired.
For a review of the history of TTS until 1987 with sound files, go to: latt/ latt/

4.0 Natural Language Generation (NLG)
Constructing linguistic outputs from non-linguistic inputs; the NLG goal is to produce natural language from internal data/structure.

4.1 Natural language generation (cont.)
Maps meaning to text.
The nature of the input varies greatly from one application to another (e.g., documenting the structure of a computer program).
The job of the NLG system is to extract the necessary information to drive the generation process.

4.2 NLG systems have to make choices:
Content selection: the system must choose the appropriate content for input, basing its decision on a pre-specified communicative goal.
Lexical selection: the system must choose the lexical item most appropriate for expressing a concept.

4.3 NLG (cont.)
Sentence structure:
Aggregation: the system must apportion the content into phrase-, clause-, and sentence-sized chunks.
Referential expression: the system must determine how to refer to the objects under discussion (not a trivial task).

4.4 NLG - Structures
Discourse structure: many NLG systems have to deal with multi-sentence discourses, which must have a coherent structure.

4.5 Sample NLG output
To save a file
1. Choose save from the file menu
2. Choose the appropriate folder
3. Type the file name
4. Click the save button
The system will save the document.
…
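
Output like the sample in 4.5 could come from a very simple template-based generator. The sketch below assumes the content has already been selected and ordered; the function `generate_instructions` and the shape of the `plan` dictionary are hypothetical, and the choices listed in 4.2 to 4.4 are not modeled.

```python
def generate_instructions(goal, steps, result):
    """Render a structured task description as numbered instructions."""
    lines = [f"To {goal}"]
    for number, step in enumerate(steps, start=1):
        lines.append(f"{number}. {step}")
    lines.append(result)
    return "\n".join(lines)


# Non-linguistic input: a structured description of the task.
plan = {
    "goal": "save a file",
    "steps": [
        "Choose save from the file menu",
        "Choose the appropriate folder",
        "Type the file name",
        "Click the save button",
    ],
    "result": "The system will save the document.",
}

text = generate_instructions(**plan)
print(text)
```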

5.0 Human-Computer Dialogs
Uses a mix of SR, TTS, and pre-recorded prompts to achieve some goal.

5.1 Human-Computer Dialogs
Uses speech recognition, or a combination of SR and touch tone, as input to the system.
The system processes the spoken information and outputs appropriate TTS or pre-recorded prompts.

5.2 Human-Computer Dialogs
Dialog systems have specific tasks, which limit the domain of conversation.
This makes the SR problem much easier, as the potential responses become very constrained.

5.3 Sample dialog system for banking
…
Sys: Would you like information for checking or savings?
User: Checking, please.
Sys: Your current balance is $2, Would you like another transaction?
User: Yes, has check #2431 cleared?
…
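
The constraint described in 5.2 can be illustrated with the first exchange of the banking dialog: after the prompt, only two answers are expected. This is a rough sketch, assuming the recognizer returns plain text; `EXPECTED_ANSWERS` and `interpret_account_choice` are hypothetical names.

```python
# After the "checking or savings?" prompt only these answers are expected.
EXPECTED_ANSWERS = {"checking", "savings"}


def interpret_account_choice(recognized_text):
    """Map recognized text to an expected answer, or None if unclear."""
    words = {w.strip(".,!?").lower() for w in recognized_text.split()}
    matches = words & EXPECTED_ANSWERS
    if len(matches) == 1:
        return matches.pop()
    return None  # zero or several matches: the system should re-prompt


print(interpret_account_choice("Checking, please."))
```

Returning `None` for ambiguous input leaves the decision to re-prompt with the caller.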

5.4 Linguistic knowledge in dialog systems
Discourse structure: ensuring natural-flowing discourse interaction.
Building appropriate vocabularies/lexicons for the tasks.
Ensuring prosodic consistency (e.g., questions sound like questions and spliced prompts sound continuous).

5.5 Why use human-computer systems?
Automate simple tasks: no need for a teller to be on the other end of the line!
Allow access to system information from anywhere, via the telephone.

6.0 Information Retrieval
Storage, analysis, and retrieval of text documents.

6.1 Information Retrieval (IR)
Most current IR systems are based on some interpretation of “compositional semantics” (i.e., the meaning of the whole is based on the meaning of its parts and their combination).
IR is the core of web-based searching, e.g., Google, AltaVista, etc.

6.2 IR - Architecture
The user inputs a word or string of words.
The system processes the words and retrieves documents corresponding to the request.
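
The two steps in 6.2 can be sketched with a toy inverted index. This is one simple way to realize them, not a description of any particular search engine; the document collection and the functions `build_index` and `retrieve` are hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def retrieve(index, query):
    """Return ids of documents containing every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical three-document collection.
docs = {
    "d1": "machine translation of technical documents",
    "d2": "speech recognition for telephony",
    "d3": "machine learning for speech",
}
index = build_index(docs)
print(sorted(retrieve(index, "machine")))
```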

6.3 “Bag of Words”
The dominant approach to IR systems is to ignore syntactic information and process the meaning of individual words only.
Thus, “I see what I eat” and “I eat what I see” would mean exactly the same thing to the system!
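
The point about word order is easy to check with a word-count representation. A minimal sketch, assuming whitespace tokenization and lowercasing; the function name `bag_of_words` is hypothetical.

```python
from collections import Counter


def bag_of_words(text):
    """Count word occurrences, discarding word order entirely."""
    return Counter(text.lower().split())


first = bag_of_words("I see what I eat")
second = bag_of_words("I eat what I see")
print(first == second)  # True: the two sentences are indistinguishable
```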

6.4 Linguistic Knowledge in IR
Semantics: compositional and lexical.
Syntax (depending on the model used).

7.0 Computational Modeling
Computational approaches to problem solving, modeling, and development of theories.

7.1 How can we use computational modeling?
Develop working models of language evolution.
Model speech perception, production, and processing.
Almost any theoretical model can have a computational counterpart.

7.2 Why Use Computational Modeling?
Forces explicitness: no black boxes or behind-the-scenes “magic”.
Allows us to test our formal theories given a large amount of data.
Allows for enhancements in technology and benefits to society through the implementations of models.

Conclusions
CL applications utilize linguistic knowledge from all of the major subfields of theoretical linguistics (i.e., theory is necessary!).
Computational modeling can aid/test linguists’ theories of language processing and structure.

Conclusions - Review of 7 core areas in CL
1. Machine Translation
2. Speech Recognition
3. Text-to-Speech
4. Natural Language Generation
5. Human-Computer Dialogs
6. Information Retrieval
7. Computational Modeling

Conclusions - Review of Today’s Goals
I. Introduction to Computational Linguistics (CL) through the discussion of 7 CL core areas.
II. Identify common CL applications.
III. Identify the importance of theoretical linguistics in CL.

The end.