Natural Language Processing (NLP) Overview and history of the field Knowledge of language The role of ambiguity Models and Algorithms Eliza, Turing, and.

Presentation transcript:

Natural Language Processing (NLP)
Overview and history of the field
Knowledge of language
The role of ambiguity
Models and Algorithms
Eliza, Turing, and conversational agents
History of speech and language processing
Regular expressions

Computer Speech and Language Processing
What is it?
Getting computers to perform useful tasks involving human languages, whether for:
– Enabling human-machine communication
– Improving human-human communication
– Doing stuff with language objects
Examples:
– Question Answering
– Machine Translation
– Spoken Conversational Agents

Knowledge needed to build a speaking computer
Speech recognition and synthesis
Dictionaries (how words are pronounced)
Phonetics (how to recognize/produce each sound of English)
Natural language understanding
Knowledge of the English words involved
– What they mean
– How they combine (what is a “pod bay door”?)
Knowledge of syntactic structure
– I’m I do, Sorry that afraid Dave I’m can’t

Eliza: Weizenbaum (1966)
User: You are like my father in some ways.
ELIZA: What resemblance do you see
User: You are not very aggressive but I think you don’t want me to notice that.
ELIZA: What makes you think I am not aggressive
User: You don’t argue with me
ELIZA: Why do you think I don’t argue with you
User: You are afraid of me
ELIZA: Does it please you to believe I am afraid of you

Ambiguity Computational linguists are obsessed with ambiguity Ambiguity is a fundamental problem of computational linguistics Resolving ambiguity is a crucial goal

Ambiguity Find at least 5 meanings of this sentence: I made her duck

Ambiguity
Find at least 5 meanings of this sentence: I made her duck
– I cooked waterfowl for her benefit (to eat)
– I cooked waterfowl belonging to her
– I created the (plaster?) duck she owns
– I caused her to quickly lower her head or body
– I waved my magic wand and turned her into undifferentiated waterfowl

Ambiguity is Pervasive
I caused her to quickly lower her head or body
– Lexical category: “duck” can be a N or V
I cooked waterfowl belonging to her
– Lexical category: “her” can be a possessive (“of her”) or dative (“for her”) pronoun
I made the (plaster) duck statue she owns
– Lexical semantics: “make” can mean “create” or “cook”

Ambiguity is Pervasive
Grammar: “make” can be:
Transitive (verb has a noun direct object)
– I cooked [waterfowl belonging to her]
Ditransitive (verb has 2 noun objects)
– I made [her] (into) [undifferentiated waterfowl]
Action-transitive (verb has a direct object and another verb)
– I caused [her] [to move her body]

Ambiguity is Pervasive
Phonetics!
– I mate or duck
– I’m eight or duck
– Eye maid; her duck
– Aye mate, her duck
– I maid her duck
– I’m aid her duck
– I mate her duck
– I’m ate her duck
– I’m ate or duck
– I mate or duck

Models and Algorithms
Models: formalisms used to capture the various kinds of linguistic structure.
– State machines (FSAs, transducers, Markov models)
– Formal rule systems (context-free grammars, feature systems)
– Logic (predicate calculus, inference)
– Probabilistic versions of all of these, plus others (Gaussian mixture models, probabilistic relational models, etc.)
Algorithms: used to manipulate representations to create structure.
– Search (A*, dynamic programming)
– Supervised learning, etc.

Language, Thought, Understanding
A Gedanken experiment: the Turing Test
The question “can a machine think?” is not operational.
Operational version:
– 2 people and a computer
– Interrogator talks to contestant and computer via teletype
– Task of machine is to convince interrogator it is human
– Task of contestant is to convince interrogator that she, and not the machine, is human

History: foundational insights, 1940s-1950s
Automaton
– Turing (1936)
– McCulloch-Pitts neuron (1943) – cpits/html/
– Kleene (1951/1956)
– Shannon (1948): link between automata and Markov models
– Chomsky (1956) / Backus (1959) / Naur (1960): CFG
Probabilistic/information-theoretic models
– Shannon (1948)
– Bell Labs speech recognition (1952)

History: the two camps
Symbolic
– Zellig Harris 1958 TDAP, first parser: cascade of finite-state transducers
– Chomsky
– AI workshop at Dartmouth (McCarthy, Minsky, Shannon, Rochester)
– Newell and Simon: Logic Theorist, General Problem Solver
Statistical
– Bledsoe and Browning (1959): Bayesian OCR
– Mosteller and Wallace (1964): Bayesian authorship attribution
– Denes (1959): ASR combining grammar and acoustic probability

Four paradigms
Stochastic
– Hidden Markov Model, 1972: independent application by Baker (CMU) and the Jelinek/Bahl/Mercer lab (IBM), following work of Baum and colleagues at IDA
Logic-based
– Colmerauer (1970, 1975): Q-systems
– Definite Clause Grammars (Pereira and Warren 1980)
– Kay (1979): functional grammar; Bresnan and Kaplan (1982): unification
Natural language understanding
– Winograd (1972): SHRDLU
– Schank and Abelson (1977): scripts, story understanding
– Influence of case-role work of Fillmore (1968) via Simmons (1973), Schank
Discourse modeling
– Grosz and colleagues: discourse structure and focus
– Perrault and Allen (1980): BDI model

Finite State Approach
Finite state models
– Kaplan and Kay (1981): phonology/morphology
– Church (1980): syntax
Return of probabilistic models
– Corpora created for language tasks
– Early statistical versions of NLP applications (parsing, tagging, machine translation)
Increased focus on methodological rigor
– Can’t test your hypothesis on the data you used to build it!
– Training sets and test sets

The field comes together: NLP has borrowed statistical modeling from speech recognition, is now standard: ACL conference: –1990: 39 articles 1 statistical – articles 48 statistical Machine learning techniques key NLP has borrowed focus on web and search and “bag of words models” from information retrieval Unified field: NLP, MT, ASR, TTS, Dialog, IR

Regular expressions
A formal language for specifying text strings
How can we search for any of these?
– woodchuck
– woodchucks
– Woodchuck
– Woodchucks

Regular Expressions
Basic regular expression patterns
Perl-based syntax (slightly different from other notations for regular expressions)
Disjunctions: /[wW]oodchuck/
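
The character-class disjunction above can be tried out with a short sketch. It uses Python's `re` module rather than Perl, so the surrounding slashes are dropped. The helper name `find_all` and the sample sentence are illustrative assumptions, and the second pattern with an optional `s` is an extension that is not on the slide, added only to cover the plural forms.

```python
import re

# Character-class disjunction from the slide: [wW] accepts either case of the first letter.
WOODCHUCK = re.compile(r"[wW]oodchuck")
# Assumed extension (not on the slide): an optional trailing "s" also covers the plurals.
WOODCHUCKS = re.compile(r"[wW]oodchucks?")


def find_all(pattern, text):
    """Return every non-overlapping match of a compiled pattern in text."""
    return pattern.findall(text)


sample = "Woodchucks chuck wood; a woodchuck would too."
print(find_all(WOODCHUCK, sample))
print(find_all(WOODCHUCKS, sample))
```

The first pattern finds the stem inside "Woodchucks" but leaves the final "s" out of the match, which is why the plural-aware variant may be preferable when whole words are wanted.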

Regular Expressions
Ranges: [A-Z]
Negations: [^Ss]
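
A minimal sketch of ranges and negated classes, again in Python's `re` module as an assumed stand-in for the Perl syntax. The helper `first_match` and the test strings are made up for illustration.

```python
import re


def first_match(pattern, text):
    """Return the first substring of text matched by pattern, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


# [A-Z] is a range: any single uppercase ASCII letter.
print(first_match(r"[A-Z]", "we should call it 'Drenched Blossoms'"))

# [^Ss] is a negated class: any single character that is neither "S" nor "s".
print(first_match(r"[^Ss]", "Sssh, quiet"))

# A string made only of excluded characters yields no match at all.
print(first_match(r"[^Ss]", "sSs"))
```

Note that the caret negates only when it is the first character inside the brackets; elsewhere in a class it is an ordinary character.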

Regular Expressions
Optional characters: ?, * and +
– ? (0 or 1): /colou?r/ → color or colour
– * (0 or more): /oo*h!/ → oh! or ooh! or ooooh!
– + (1 or more): /o+h!/ → oh! or ooh! or ooooh!
Wild cards: .
– /beg.n/ → begin or began or begun
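
The counters and the wildcard can be checked with a small sketch in Python's `re` module (an assumed substitute for Perl). The helper `whole` is a hypothetical name; it uses `re.fullmatch`, so a string counts only if the pattern covers all of it.

```python
import re


def whole(pattern, text):
    """True if the pattern matches the entire string, not just a part of it."""
    return re.fullmatch(pattern, text) is not None


# ? makes the preceding character optional (0 or 1 occurrences).
print(whole(r"colou?r", "color"), whole(r"colou?r", "colour"), whole(r"colou?r", "colouur"))

# * allows 0 or more of the preceding character; + requires at least 1.
print(whole(r"oo*h!", "oh!"), whole(r"oo*h!", "ooooh!"), whole(r"oo*h!", "h!"))
print(whole(r"o+h!", "oh!"), whole(r"o+h!", "ooooh!"), whole(r"o+h!", "h!"))

# . matches any single character, so it accepts more than the three real words.
print(whole(r"beg.n", "begin"), whole(r"beg.n", "began"), whole(r"beg.n", "beg3n"))

# With a capitalized "Ooh!", a search still succeeds, but only on the lowercase tail.
print(re.search(r"oo*h!", "Ooh!").group(0))
```

The last line shows why searching and whole-string matching can disagree: the pattern has no uppercase "O", so the search simply skips it.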

Regular Expressions
Anchors ^ and $
– /^[A-Z]/ → “Ramallah, Palestine”
– /^[^A-Z]/ → “¿verdad?” “really?”
– /\.$/ → “It is over.”
– /.$/ → ?
Boundaries \b and \B
– /\bon\b/ → matches “on my way” but not “Monday”
– /\Bon\b/ → “automaton”
Disjunction |
– /yours|mine/ → “it is either yours or mine”
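
The anchor and boundary examples can be run as a sketch in Python's `re` module, assumed here to behave like the Perl syntax for these operators. The helper `hit` is a hypothetical name, and the string "It is over?" is an added test case for the open question about /.$/.

```python
import re


def hit(pattern, text):
    """Return the matched substring, or None when the pattern finds nothing."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


# ^ anchors to the start of the string, $ to the end.
print(hit(r"^[A-Z]", "Ramallah, Palestine"))
print(hit(r"^[^A-Z]", "¿verdad?"), hit(r"^[^A-Z]", "really?"))
print(hit(r"\.$", "It is over."))
# An unescaped dot before $ accepts whatever character ends the string.
print(hit(r".$", "It is over?"))

# \b is a word boundary; \B is a position that is not a word boundary.
print(hit(r"\bon\b", "on my way"), hit(r"\bon\b", "Monday"))
print(hit(r"\Bon\b", "automaton"), hit(r"\Bon\b", "on my way"))

# | is disjunction between whole alternatives.
print(re.findall(r"yours|mine", "it is either yours or mine"))
```

The contrast between the last two boundary lines is the point: "on" inside "Monday" is rejected by `\b`, while `\B` accepts "on" only when it is glued to a preceding word character.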

Disjunction, Grouping, Precedence
Column 1 Column 2 Column 3 …
How do we express this?
– /Column [0-9]+ */
– /(Column [0-9]+ +)*/
Precedence (highest to lowest)
– Parenthesis: ()
– Counters: * + ? {}
– Sequences and anchors: the ^my end$
– Disjunction: |
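
A sketch of why grouping matters, using Python's `re` module as an assumed stand-in for Perl. The helper names and the header string (with a trailing space, which the grouped pattern requires after each number) are illustrative assumptions, as are the `the*` and `the|any` precedence probes.

```python
import re


def matched_prefix(pattern, text):
    """Return the part of text matched from its start, or None."""
    m = re.match(pattern, text)
    return m.group(0) if m else None


def fully_matches(pattern, text):
    """True if the pattern covers the whole string."""
    return re.fullmatch(pattern, text) is not None


header = "Column 1 Column 2 Column 3 "

# Without parentheses the star applies only to the space, so one column is matched.
print(repr(matched_prefix(r"Column [0-9]+ *", header)))
# With parentheses the star repeats the whole "Column N " unit.
print(repr(matched_prefix(r"(Column [0-9]+ +)*", header)))

# Counters bind tighter than sequences: the* repeats only the "e".
print(fully_matches(r"the*", "theee"), fully_matches(r"the*", "thethe"))
print(fully_matches(r"(the)*", "thethe"))

# Sequences bind tighter than disjunction: the|any means "the" or "any".
print(fully_matches(r"the|any", "thany"), fully_matches(r"th(e|a)ny", "thany"))
```

In each pair, adding parentheses is what changes the scope of the operator, which is the practical consequence of the precedence ordering.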

Example
Find me all instances of the word “the” in a text.
– /the/ : misses capitalized examples
– /[tT]he/ : returns other or theology
– /\b[tT]he\b/
– /[^a-zA-Z][tT]he[^a-zA-Z]/
– /(^|[^a-zA-Z])[tT]he[^a-zA-Z]/
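
The successive refinements can be compared by counting matches on one sentence. This is a sketch in Python's `re` module (assumed equivalent to the Perl syntax for these patterns); the sentence, the `PATTERNS` list, and the `count` helper are made up for illustration. The sentence contains the word "the" twice, once capitalized.

```python
import re

# The five patterns from the slide, in the order they were introduced.
PATTERNS = [
    r"the",
    r"[tT]he",
    r"\b[tT]he\b",
    r"[^a-zA-Z][tT]he[^a-zA-Z]",
    r"(^|[^a-zA-Z])[tT]he[^a-zA-Z]",
]


def count(pattern, text):
    """Number of non-overlapping matches of pattern in text."""
    return sum(1 for _ in re.finditer(pattern, text))


text = "The other day the theology class met."
for p in PATTERNS:
    print(p, count(p, text))

# The last pattern still needs a non-letter after the word, so it misses
# a "the" at the very end of a string, while the \b version does not.
print(count(PATTERNS[4], "Over the"), count(PATTERNS[2], "Over the"))
```

Only the third and fifth patterns report the correct count of two here; the fourth loses the sentence-initial "The" because it demands a character before the word.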

Errors
The process we just went through was based on fixing two kinds of errors:
– Matching strings that we should not have matched (there, then, other): false positives
– Not matching things that we should have matched (The): false negatives
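
The two error types can be counted directly against a small hand-labeled list. This is a sketch under assumptions: the `labeled` examples reuse the words named on the slide, and the function `error_counts` is a hypothetical helper, not part of any library.

```python
import re


def error_counts(pattern, labeled):
    """Count (false_positives, false_negatives) for pattern.

    labeled is a list of (string, should_match) pairs.
    """
    false_pos = false_neg = 0
    for text, should_match in labeled:
        matched = re.search(pattern, text) is not None
        if matched and not should_match:
            false_pos += 1  # matched something we should not have
        elif not matched and should_match:
            false_neg += 1  # missed something we should have matched
    return false_pos, false_neg


labeled = [
    ("the", True),
    ("The", True),
    ("there", False),
    ("then", False),
    ("other", False),
]

print(error_counts(r"the", labeled))
print(error_counts(r"[tT]he", labeled))
print(error_counts(r"\b[tT]he\b", labeled))
```

Adding the character class removes the false negative, and adding the word boundaries removes the false positives, which mirrors the two fixes made in the previous example.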

More complex RE example
Regular expressions for prices
– /$[0-9]+/ : doesn’t deal with fractions of dollars
– /$[0-9]+\.[0-9][0-9]/ : doesn’t allow $199, not word-aligned
– /\b$[0-9]+(\.[0-9][0-9])?\b/
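
A sketch of the price patterns in Python's `re` module, with two stated departures from the slide. First, Python treats an unescaped `$` as an end-of-string anchor, so the dollar sign is written `\$`. Second, the leading `\b` is dropped, because a word boundary cannot fall between a space and `$` (neither is a word character), so keeping it would block matches after whitespace. The helper `prices` and the sample sentence are illustrative assumptions.

```python
import re

# Dollar sign followed by digits: ignores cents.
DOLLARS_ONLY = r"\$[0-9]+"
# Requires exactly two decimal digits: rejects whole-dollar amounts.
WITH_CENTS = r"\$[0-9]+\.[0-9][0-9]"
# Cents are optional; the trailing \b stops the match at the end of the number.
OPTIONAL_CENTS = r"\$[0-9]+(\.[0-9][0-9])?\b"


def prices(pattern, text):
    """Return the full text of every match (not just the captured group)."""
    return [m.group(0) for m in re.finditer(pattern, text)]


text = "It costs $199 today, or $24.99 on sale."
print(prices(DOLLARS_ONLY, text))
print(prices(WITH_CENTS, text))
print(prices(OPTIONAL_CENTS, text))
```

`re.finditer` is used instead of `re.findall` because the last pattern contains a group, and `findall` would return only the group contents rather than the whole price.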

Advanced operators