1 Computational Linguistics Ling 200 Spring 2006.

Presentation transcript:

1 Computational Linguistics Ling 200 Spring 2006

2 Speech and language processing. Computational Linguistics: use of computers to facilitate linguistic research. Natural Language Processing: computer-natural language interface applications.

3 Combines disciplines: Linguistics (e.g. grammar engineering), Electrical Engineering (e.g. speech recognition), Computer Science (e.g. machine translation), Psychology (e.g. cognitive modeling)

4 2 Minute question (part 1) List the specific language-related skills HAL exhibits. In other words, list the different abilities the computer (HAL) must have to display human-like language.

5

6 Today’s goals: Convey some areas of research, some of the difficulties involved, and some development strategies. Provide examples of particular technologies as illustration.

7 Computerized natural language: speech recognition, language understanding, language generation, speech synthesis

8 Other areas of interest: searching (understanding the search request, finding relevant documents, ordering by degree of relevance); information extraction (retrieving information from documents); data mining (discovering patterns and relationships in data)

9 ...and still more topics: machine translation, summarization, grammar checking, spell checking

10 Commonly used tools: formal rule systems, computational search algorithms, formal logic, probability theory, machine learning techniques

11 Speech Recognition Demo Software Used: iListen from MacSpeech

12 What is Speech Recognition? Definition: Speech recognition turns acoustic input into strings of phonemes and then finds the best matching word in a database. It can be built for open-domain use, theoretically recognizing all possible strings of words (e.g. dictation systems). It can also be built for a particular domain, recognizing small, finite sets of utterances (e.g. automated call centers).

13 Speech Recognition: Acoustic Model. First, the continuous speech signal is broken up into short segments. Segments are analyzed into features, which you can think of as quantitative versions of the phonetic features you learned in class. By comparing segments against an internally stored phonological model, well-matched phonemes are proposed for each segment. The result is a list of the most likely phoneme sequences.
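The first two steps on this slide (cutting the signal into short segments, then turning each segment into numbers) can be sketched as follows. This is only an illustration, not how iListen works: the synthetic tone, the 25 ms frame length, the 10 ms hop, and the two toy features (mean energy and zero-crossing count) are all assumptions chosen for the example.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping, fixed-length frames."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

def frame_features(frame):
    """Return two toy features for one frame: mean energy and zero crossings."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return energy, crossings

# Assumption: a synthetic 100 Hz tone sampled at 8 kHz stands in for real speech.
rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / rate) for n in range(800)]

frames = frame_signal(signal, frame_len=200, hop=80)  # 25 ms frames, 10 ms hop
features = [frame_features(f) for f in frames]
```

A real acoustic model would compare richer feature vectors against stored phoneme models; here each frame is reduced to just two numbers to keep the idea visible.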

14 Speech Recognition: Language Model. Sequences of phonemes are verified by comparing them with a database of words and their likelihoods (in real time), and only actual words and phrases are accepted. [rɛkənajspič] ‘recognize speech’; [rɛkənajspiš] ??‘recognize speesh’. *Fast speech: [z] -> [s] / _[s]

15 Problems: Acoustic Model. Recognizing different voice qualities as the same basic sounds. You can think of this as choosing the correct phoneme. Phonemes sound different (allophones), depending on their environments. Word position: /p/ --> [pʰ] / #_; assimilation: /z/ --> [s] / _C[-voice]; deletion: [s] --> ø / _[s], as in “Three cats sit.” The speech signal is also continuous and full of non-speech noise.
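The three rules on this slide can be read as small rewriting functions over a list of segments. The sketch below is a toy illustration under stated assumptions: the set of voiceless consonants is deliberately incomplete, each word is a plain list of one-symbol segments, and the underlying form used for "recognize speech" simply follows the slide's transcription with a final /z/.

```python
# Assumption: a small, incomplete set of voiceless consonants for the example.
VOICELESS = {"p", "t", "k", "s", "f", "č", "š"}

def aspirate(word):
    """/p/ --> [pʰ] / #_ : aspirate /p/ at the start of a word."""
    if word and word[0] == "p":
        return ["pʰ"] + word[1:]
    return list(word)

def devoice_z(segs):
    """/z/ --> [s] / _C[-voice] : /z/ becomes [s] before a voiceless consonant."""
    return [
        "s" if s == "z" and i + 1 < len(segs) and segs[i + 1] in VOICELESS else s
        for i, s in enumerate(segs)
    ]

def delete_s(segs):
    """[s] --> ø / _[s] : drop an [s] that is immediately followed by another [s]."""
    return [
        s for i, s in enumerate(segs)
        if not (s == "s" and i + 1 < len(segs) and segs[i + 1] == "s")
    ]

def fast_speech(segs):
    """Apply assimilation first, then deletion, to a run of connected speech."""
    return delete_s(devoice_z(segs))

recognize_speech = ["r", "ɛ", "k", "ə", "n", "a", "j", "z", "s", "p", "i", "č"]
cats_sit = ["k", "æ", "t", "s", "s", "ɪ", "t"]

print("".join(fast_speech(recognize_speech)))
print("".join(fast_speech(cats_sit)))
print("".join(aspirate(["p", "æ", "t"])))
```

Rule order matters here: the /z/ has to become [s] before the deletion rule can see two adjacent [s] segments.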

16 Problems: Ambiguity. The same or a very similar sequence of phonemes can correspond to multiple words or phrases (homophones). Words: [dir] ‘deer’, ‘dear’. Phrases (remember there is no pause to mark word boundaries): [rɛkənajspič] ‘recognize speech’; [rɛkənajspič] ‘wreck a nice beach’.

17 Potential Fix: Language Model. Weight word/phrase interpretations (statistical language modeling). Lexical: consider how often a word actually occurs. [dir]: ‘deer’ (50), ‘dear’ (215). Choose the most frequent, in this case ‘dear’. Condition on context: consider how often a word occurs within a particular context. “I just shot a [dir].”: (shot, a, dear) 1; (shot, a, deer) 10. In this case, ‘deer’ occurs more frequently in this environment, so we choose ‘deer’ as our interpretation.
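The two strategies on this slide can be sketched with the slide's own counts. The table names, the `pick_word` helper, and the fallback from context counts to plain word counts are assumptions made for illustration, not a description of any particular recognizer.

```python
# Counts taken from the slide; the lookup tables themselves are hypothetical.
UNIGRAM = {"deer": 50, "dear": 215}
TRIGRAM = {("shot", "a", "dear"): 1, ("shot", "a", "deer"): 10}
HOMOPHONES = {"dir": ["deer", "dear"]}

def pick_word(phonemes, context=None):
    """Choose a spelling for a phoneme string, using context counts when available."""
    candidates = HOMOPHONES[phonemes]
    if context is not None:
        scored = {w: TRIGRAM.get(tuple(context) + (w,), 0) for w in candidates}
        if any(scored.values()):
            return max(scored, key=scored.get)
    # No usable context: fall back to how often each word occurs overall.
    return max(candidates, key=lambda w: UNIGRAM.get(w, 0))

print(pick_word("dir"))                  # lexical frequency only
print(pick_word("dir", ("shot", "a")))   # conditioned on the two previous words
```

With no context the more frequent ‘dear’ wins; after “shot a” the context counts favor ‘deer’, matching the slide.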

18 Demo: Training Data Matters. Word and context frequencies are not just pulled from thin air. Frequencies are calculated (training) from some collection of text (a corpus). Speech recognizers often train on a user’s e-mails and documents, to better match the user’s lexical choice and phrase patterns. This training data helps decipher homophonous strings (strings that are acoustically ambiguous).
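As a rough sketch of what "training" means here, word and three-word context frequencies can be counted directly from a corpus. The three sentences below are invented for the example, and the simple regular-expression tokenizer is an assumption; a real system would train on far more text.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(corpus):
    """Count single words and three-word sequences across all documents."""
    words = Counter()
    trigrams = Counter()
    for document in corpus:
        tokens = tokenize(document)
        words.update(tokens)
        trigrams.update(zip(tokens, tokens[1:], tokens[2:]))
    return words, trigrams

# Assumption: a tiny made-up corpus standing in for a user's documents.
corpus = [
    "I just shot a deer.",
    "Oh dear, he shot a deer in the woods.",
    "Dear Sir, thank you for the letter.",
]
words, trigrams = train(corpus)

print(words["deer"], words["dear"])
print(trigrams[("shot", "a", "deer")], trigrams[("shot", "a", "dear")])
```

Counts like these are what a lookup such as "(shot, a, deer) 10" on the previous slide would be built from.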

19 Demo 2: Training Data Matters. I will attempt to utter the following phrase and iListen should transcribe my speech. It’s hard to… [rɛkənajspič] ‘recognize speech’; [rɛkənajspič] ‘wreck a nice beach’

20 Demo 3: Linguist. What if the software is trained for a Computational Linguist? Trained on 3 Wikipedia articles about various topics in Computational Linguistics. Which interpretation should we expect, based on words and phrases likely to be present in computational linguistics documents? Results: Is hard to recognize speech New set the state  but is so bad and found a 544 is no sound better, even so it is etc is not really that bad so at and his exist listening to 89, Nancy of

21 Demo 4: Beach Bum. What if the software is trained for a Beach Bum? Trained on 3 Wikipedia articles on beach topics. Which interpretation should we expect, based on frequent words and phrases likely to be found in beach-related documents? Results: It’s hard to wreck nice beach and
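The effect shown in Demos 3 and 4 can be imitated in a few lines: the same two candidate transcriptions are scored against word counts from two different training texts. Everything here is an assumption made for illustration: the two one-sentence "corpora" are invented, the model is a plain word-frequency model with add-one smoothing, and scores are averaged per word so that candidates of different lengths can be compared. It is not a description of how iListen scores hypotheses.

```python
import math
from collections import Counter

# Assumption: tiny invented training texts standing in for the Wikipedia articles.
linguist_text = "speech recognition systems recognize speech using a language model of speech"
beach_text = "the beach was nice until a storm tried to wreck the beach and a nice pier"

candidates = ["recognize speech", "wreck a nice beach"]
vocab = set((linguist_text + " " + beach_text).split())

def average_log_prob(phrase, counts, total):
    """Mean per-word log probability under an add-one smoothed unigram model."""
    tokens = phrase.split()
    logs = [math.log((counts[w] + 1) / (total + len(vocab))) for w in tokens]
    return sum(logs) / len(logs)

def best_candidate(training_text):
    """Return the candidate transcription that best fits the training text."""
    tokens = training_text.split()
    counts = Counter(tokens)
    return max(candidates, key=lambda c: average_log_prob(c, counts, len(tokens)))

print(best_candidate(linguist_text))
print(best_candidate(beach_text))
```

The acoustic input is identical in both cases; only the training text changes which interpretation scores higher.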

22 Language understanding: morphology, syntax, semantics, pragmatics, discourse

23 “I made her duck.” I cooked waterfowl for her; I cooked waterfowl belonging to her; I created the (plaster?) duck she owns; I caused her to quickly lower her head or body; I waved my magic wand and turned her into undifferentiated waterfowl

24 Language generation “I'm sorry, Dave, I'm afraid I can't do that”  pragmatics: politeness indirect speech  morphology: contractions  discourse: reference (“that”)

25 Who/what is ELIZA?

26 Dialogue systems - issues. HAL has complete understanding. How close are we to this? ELIZA had no semantic understanding and only minimal syntactic knowledge. Dialogue systems are effective in limited domains like travel.
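To make "no semantic understanding and only minimal syntactic knowledge" concrete, here is a minimal ELIZA-style sketch. The three rules, the pronoun table, and the default reply are hypothetical and are not taken from the original ELIZA script; the point is only that surface pattern matching plus pronoun swapping can produce plausible-sounding replies.

```python
import re

# Hypothetical rules: a pattern to look for and a reply template to fill in.
RULES = [
    (re.compile(r"\bi am (.+)", re.I), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.+)", re.I), "How long have you felt {0}?"),
    (re.compile(r"\bmy (\w+)", re.I), "Tell me more about your {0}."),
]
DEFAULT = "Please go on."

# Swap first-person words for second-person ones inside the echoed fragment.
REFLECT = {"i": "you", "me": "you", "my": "your", "am": "are"}

def reflect(fragment):
    """Replace first-person words so the fragment reads from the system's side."""
    return " ".join(REFLECT.get(w.lower(), w) for w in fragment.split())

def respond(utterance):
    """Return the reply for the first matching rule, or a stock phrase."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT

print(respond("I am worried about my exam."))
print(respond("The weather is bad."))
```

Nothing in the program knows what an exam or the weather is; it only rearranges the user's own words, which is why such systems break down quickly outside the patterns they anticipate.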

27 Dialogue systems: demo [David] Chatbot website: 

28 2 minute question (part 2) Do you think that HAL-quality computer communication is a reasonable expectation? Why or why not?