
Natural Language Processing Regina Barzilay

What is NLP? Goal: intelligent processing of human language – not just effective string matching. Applications of NLP technology: – Less ambitious (but practical) goals: spelling correction, named entity extraction – Ambitious goals: machine translation, language-based UIs, summarization, question answering

NLP is AI-complete: all the difficult problems in artificial intelligence manifest themselves in NLP problems. The Turing Test links machine intelligence with the ability to process language: the interrogator C needs to determine which player – A or B – is a computer and which is a human.

Passing Turing Test Turing (1950): “I believe that in about fifty years’ time it will be possible to programme computers, with a storage capacity of about 10⁹, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.” ELIZA (Weizenbaum, 1966): the first computer dialogue system, based on keyword matching:
young woman: Men are all alike.
eliza: In what way?
young woman: They're always bugging us about something or other.
eliza: Can you think of a specific example?
young woman: Well, my boyfriend made me come here.
eliza: Your boyfriend made you come here?
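The keyword-matching idea behind ELIZA can be sketched in a few lines. This is a toy illustration, not Weizenbaum's actual program: the pattern list and response templates below are invented for this example.

```python
import re

# Toy ELIZA-style rules: (keyword pattern, response template).
# These three rules are invented; the real ELIZA had a large script of them.
RULES = [
    (re.compile(r"my (.+) made me come here", re.I), "Your {0} made you come here?"),
    (re.compile(r"always (.+)", re.I), "Can you think of a specific example?"),
    (re.compile(r"(.*) are all alike", re.I), "In what way?"),
]

def respond(utterance: str) -> str:
    """Return the response template of the first matching keyword rule."""
    for pattern, template in RULES:
        m = pattern.search(utterance)
        if m:
            # Echo captured material back into the response, ELIZA-style.
            return template.format(*m.groups())
    return "Please go on."  # default when no keyword matches

print(respond("Men are all alike."))             # In what way?
print(respond("My boyfriend made me come here."))  # Your boyfriend made you come here?
```

The point of the sketch is how shallow the mechanism is: there is no understanding, only surface pattern matching and echoing, which is why ELIZA could appear to converse without solving any of the hard problems above.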

Speech Processing Automatic Speech Recognition (ASR): – Performance (word error rate): 0.3% for digit strings, 5% for dictation, 50%+ for TV broadcasts Text to Speech (TTS): – Performance: totally intelligible (if sometimes unnatural)

Information Extraction Goal: Build database entries from text Simple Task: Named Entity Extraction

Information Extraction Goal: Build database entries from text. More advanced: multi-sentence template IE. Example posting: 10TH DEGREE is a full-service advertising agency specializing in direct and interactive marketing. Located in Irvine, CA, 10TH DEGREE is looking for an Assistant Account Manager to help manage and coordinate interactive marketing initiatives for a marquee automotive account. Experience in online marketing, automotive and/or the advertising field is a plus. Assistant Account Manager Responsibilities: Ensures smooth implementation of programs and initiatives. Helps manage the delivery of projects and key client deliverables … Compensation: $50,000–$80,000
Extracted template:
INDUSTRY: Advertising
POSITION: Assistant Account Manager
LOCATION: Irvine, CA
COMPANY: 10TH DEGREE

Question Answering Find answers to general comprehension questions in a document collection

Machine Translation

Google Translation

Deciphering Ugaritic Family: Northwest Semitic. Tablets from: 14th–12th century BCE. Discovered: 1928. Deciphered: 1932 (by WW1 code breakers). A large portion of the vocabulary is covered by cognates with Semitic languages:
Arabic: malik مَلِك
Syriac: malkā ܡܰܠܟ݁ܳܐ
Hebrew: melek מֶלֶךְ
Ugaritic: malku
Task: Translate by identifying cognates. Corpus: 34,105 tokens, 7,386 unique types

Why are these funny?
Iraqi Head Seeks Arms
Ban on Nude Dancing on Governor’s Desk
Juvenile Court to Try Shooting Defendant
Teacher Strikes Idle Kids
Stolen Painting Found by Tree
Kids Make Nutritious Snacks
Local HS Dropout Cut in Half
Hospitals Are Sued by 7 Foot Doctors

Why is NLP Hard? (example from L. Lee) “At last, a computer that understands you like your mother”

Ambiguity at Syntactic Level Different structures lead to different interpretations

Ambiguity at Semantic Level “Alice says they've built a computer that understands you like your mother” Two senses of mother: (1) a female parent; (2) a stringy, slimy substance consisting of yeast cells and bacteria, added to cider or wine to produce vinegar. This is an instance of word sense disambiguation.

Ambiguity at Discourse Level “Alice says they've built a computer that understands you like your mother, but she …” – … doesn’t know any details – … doesn’t understand me at all This is an instance of anaphora: “she” co-refers to some other discourse entity.

Ambiguity Varies Across Languages Tokenization – English: in the country; Hebrew: בארצי. Easy task in English: the space separator delineates words. Challenging for Semitic languages. Named Entity Detection – English: She saw Jacob …; Hebrew: היא ראתה את יעקב. Easy task in English: capitalization is a strong hint. Challenging for Semitic languages.
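The English tokenization case above really is just a space split; a minimal sketch makes the asymmetry concrete. The Hebrew example shows why the same trick fails there: בארצי packs a preposition and a noun into a single space-delimited token.

```python
def tokenize(text: str) -> list[str]:
    """Naive whitespace tokenization: adequate for English, not for Semitic languages."""
    return text.split()

print(tokenize("in the country"))  # ['in', 'the', 'country']
print(tokenize("בארצי"))           # one token, though it contains several words
```

For languages like Hebrew or Arabic, a real tokenizer must perform morphological segmentation rather than rely on delimiters.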

Knowledge Bottleneck in NLP We need: knowledge about language, and knowledge about the world. Possible solutions: – Symbolic approach: manually encode all the required information into the computer – Statistical approach: infer language properties from language samples

Symbolic Era: Crowning Achievement

The Internals of SHRDLU Requires elaborate manually encoded knowledge representation

NLP History: Symbolic Era “(1) Colorless green ideas sleep furiously. (2) Furiously sleep ideas green colorless. It is fair to assume that neither sentence (1) nor (2) (nor indeed any part of these sentences) had ever occurred in an English discourse. Hence, in any statistical model for grammaticalness, these sentences will be ruled out on identical grounds as equally ‘remote’ from English. Yet (1), though nonsensical, is grammatical, while (2) is not.” (Chomsky 1957) 1970s and 1980s: statistical NLP is in disfavor – emphasis on deeper models, syntax – toy domains / manually developed grammars (SHRDLU, LUNAR) – weak empirical evaluation

NLP History: Statistical Era “Whenever I fire a linguist our system performance improves.” (Jelinek 1988) 1990s: The Empirical Revolution – Corpus-based methods yield the first generation of NL tools (syntax, MT, ASR) – Deep analysis is often traded for robust approximations – Empirical evaluation is crucial 2000s: Richer linguistic representations embedded in the statistical framework

Case Study: Determiner Placement Task: Automatically place determiners a, the, null in a text Scientists in United States have found way of turning lazy monkeys into workaholics using gene therapy. Usually monkeys work hard only when they know reward is coming, but animals given this treatment did their best all time. Researchers at National Institute of Mental Health near Washington DC, led by Dr Barry Richmond, have now developed genetic treatment which changes their work ethic markedly. "Monkeys under influence of treatment don't procrastinate," Dr Richmond says. Treatment consists of anti-sense DNA - mirror image of piece of one of our genes - and basically prevents that gene from working. But for rest of us, day when such treatments fall into hands of our bosses may be one we would prefer to put off.

Relevant Grammar Rules Determiner placement is largely determined by: – Type of noun (countable, uncountable) – Uniqueness of reference – Information value (given, new) – Number (singular, plural) However, many exceptions and special cases play a role: – The definite article is used with newspaper titles (The Times), but zero article in names of magazines and journals (Time) Hard to manually encode this information!

Statistical Approach: Determiner Placement Simple approach: Collect a large collection of texts relevant to your domain (e.g. newspaper text). For each noun seen during training, compute the probability of each determiner given that noun. Given a new noun, select the determiner with the highest likelihood as estimated on the training corpus.
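The counting scheme above can be sketched in a few lines. The training pairs here are invented for illustration; in practice they would be extracted from a large parsed or tagged corpus.

```python
from collections import Counter, defaultdict

# (noun, determiner) pairs as they would be extracted from a training corpus.
# These examples are made up for illustration.
training_pairs = [
    ("FBI", "the"), ("FBI", "the"), ("FBI", "the"),
    ("defendant", "the"), ("defendant", "a"),
    ("cars", "null"), ("cars", "the"), ("cars", "null"),
]

# Count how often each determiner occurs with each noun.
counts: dict[str, Counter] = defaultdict(Counter)
for noun, det in training_pairs:
    counts[noun][det] += 1

def predict_determiner(noun: str, default: str = "null") -> str:
    """Pick the determiner most frequently seen with this noun in training."""
    if noun not in counts:
        return default  # unseen noun: back off to a default choice
    return counts[noun].most_common(1)[0][0]

print(predict_determiner("FBI"))   # the
print(predict_determiner("cars"))  # null
```

Note the backoff for unseen nouns: a raw frequency table says nothing about words absent from training, which is exactly the sparsity problem discussed later.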

Determiner Placement as Classification Prediction: {“the”, “a”, “null”} Representation of the problem: – plural? (yes, no) – first appearance in text? (yes, no) – head token (vocabulary)
Plural? | First appearance? | Token     | Determiner
no      | yes               | defendant | the
yes     | no                | cars      | null
no      |                   | FBI       | the
Goal: Learn a classification function that can predict unseen examples
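The table above maps directly onto feature vectors. A minimal sketch below uses a lookup-table "classifier" over (plural?, first appearance?, head token) with a majority-label backoff; a real system would use a learned model that generalizes across feature vectors rather than memorizing them. The training rows mirror the table, and the missing "first appearance" cell for FBI is assumed here to be "no".

```python
from collections import Counter, defaultdict

# Training rows mirroring the table:
# (plural?, first appearance?, head token) -> determiner.
# The first-appearance value for FBI is an assumption (cell missing in the slide).
examples = [
    ((False, True,  "defendant"), "the"),
    ((True,  False, "cars"),      "null"),
    ((False, False, "FBI"),       "the"),
]

table: dict[tuple, Counter] = defaultdict(Counter)
all_labels: Counter = Counter()
for features, label in examples:
    table[features][label] += 1
    all_labels[label] += 1

def classify(features: tuple) -> str:
    """Exact-match lookup; back off to the globally most common label."""
    if features in table:
        return table[features].most_common(1)[0][0]
    return all_labels.most_common(1)[0][0]

print(classify((False, True, "defendant")))  # the
print(classify((True, True, "monkeys")))     # backs off to the majority label
```

The backoff is what makes this a (crude) classification function rather than a pure memory: it always returns some prediction, even for feature vectors never seen in training.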

Does it work? Implementation details: – Training: the first 21 sections of the Wall Street Journal corpus; testing: the 23rd section – Prediction accuracy: 71.5% The results are not great, but surprisingly high for such a simple method – a large fraction of nouns in this corpus always appear with the same determiner (“the FBI”, “the defendant”)

Corpora Corpus: a collection of annotated or raw text. Antique corpus: the Rosetta Stone. Examples of corpora used in NLP today: – Penn Treebank: 1M words of parsed text – Brown Corpus: 1M words of tagged text – North American News: 300M words – The Web

Corpus for MT

Corpus for Parsing Canadian Utilities had 1988 revenue of $ 1.16 billion, mainly from its natural gas and electric utility businesses in Alberta, where the company serves about 800,000 customers.

Ambiguities

Problem: Scale

Problem: Sparsity

The NLP Cycle Get a corpus. Build a baseline model. Repeat: – Analyze the most common errors – Find out what information could be helpful – Modify the model to exploit this information: use new features, change the structure of the model, or employ a new machine learning method