CS 188: Artificial Intelligence
Spring 2009
Natural Language Processing
Dan Klein – UC Berkeley

What is NLP?
• Fundamental goal: analyze and process human language, broadly, robustly, accurately…
• End systems that we want to build:
  • Ambitious: speech recognition, machine translation, information extraction, dialog interfaces, question answering…
  • Modest: spelling correction, text categorization…

Speech Systems
• Automatic Speech Recognition (ASR)
  • Audio in, text out
  • SOTA: 0.3% error for digit strings, 5% for dictation, 50%+ for TV
• Text to Speech (TTS)
  • Text in, audio out
  • SOTA: totally intelligible (if sometimes unnatural)
[demo: “Speech Lab”]

Information Retrieval
• General problem:
  • Given information needs, produce information
  • Includes, e.g., web search, question answering, and classic IR
• Common case: web search, q = “Apple Computers”
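The slide only names the problem; as a concrete editorial illustration of one classic way to rank documents for a query, here is a minimal TF-IDF / cosine-similarity sketch. The documents, query, and whitespace tokenization are invented, not from the course.

```python
# Minimal TF-IDF + cosine-similarity retrieval sketch. The documents and
# query are invented for illustration; this is a classic baseline, not
# necessarily the ranking method discussed in lecture.
import math
from collections import Counter

docs = {
    "d1": "apple computers released a new laptop",
    "d2": "apple pie recipe with fresh apples",
    "d3": "personal computers and laptop reviews",
}

def tf_idf(texts):
    tokenized = {name: text.split() for name, text in texts.items()}
    df = Counter(w for toks in tokenized.values() for w in set(toks))
    n = len(texts)
    vecs = {name: {w: tf * math.log(n / df[w]) for w, tf in Counter(toks).items()}
            for name, toks in tokenized.items()}
    return vecs, df, n

def cosine(u, v):
    dot = sum(weight * v.get(w, 0.0) for w, weight in u.items())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

vecs, df, n = tf_idf(docs)
query = "apple computers"
q_vec = {w: tf * math.log(n / df[w]) for w, tf in Counter(query.split()).items() if w in df}
print(sorted(docs, key=lambda d: cosine(q_vec, vecs[d]), reverse=True))  # d1 ranks first
```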

Feature-Based Ranking
q = “Apple Computers”

Learning to Rank
• Setup
• Optimize, e.g.: … lots of variants are possible!
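The setup and objective on this slide are figures that did not survive the transcript. As one illustrative variant only (not necessarily the one shown in lecture), here is a minimal pairwise, perceptron-style ranker: for each query, a relevant document should score above an irrelevant one under a learned linear scorer. The feature names and training pairs are invented.

```python
# Pairwise learning-to-rank sketch: learn a weight vector w so that, for
# each query, a relevant document scores above an irrelevant one under a
# linear scorer. Features (tf-idf, title match, pagerank) and the
# training pairs are invented; this is one variant among many.
import random

# Each pair: (features of a relevant doc, features of an irrelevant doc)
# for the same query.
pairs = [
    ([2.0, 1.0, 0.3], [0.5, 0.0, 0.6]),
    ([1.5, 1.0, 0.1], [1.0, 0.0, 0.9]),
    ([2.2, 0.0, 0.4], [0.3, 0.0, 0.2]),
]

def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train(pairs, epochs=50, lr=0.1):
    w = [0.0] * 3
    for _ in range(epochs):
        random.shuffle(pairs)
        for good, bad in pairs:
            if score(w, good) <= score(w, bad):        # mis-ranked pair
                # nudge w toward the relevant doc's feature direction
                w = [wi + lr * (g - b) for wi, g, b in zip(w, good, bad)]
    return w

w = train(pairs)
print(w, all(score(w, g) > score(w, b) for g, b in pairs))
```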

Information Extraction
• Unstructured text to database entries
• SOTA: perhaps 70% accuracy for multi-sentence templates, 90%+ for single easy fields

New York Times Co. named Russell T. Lewis, 45, president and general manager of its flagship New York Times newspaper, responsible for all business-side activities. He was executive vice president and deputy general manager. He succeeds Lance R. Primis, who in September was named president and chief operating officer of the parent.

State | Post                           | Company                  | Person
start | president and CEO              | New York Times Co.       | Lance R. Primis
end   | executive vice president       | New York Times newspaper | Russell T. Lewis
start | president and general manager  | New York Times newspaper | Russell T. Lewis
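As a toy illustration of template filling (not a real extractor), the sketch below uses a single hand-written pattern to pull a (Person, Company, Post, State) record out of the first sentence above; the regex and field handling are invented. Note that it attributes the post to the parent company rather than to the newspaper named later in the sentence, which is exactly the kind of error behind the accuracy figures quoted above.

```python
# Toy template-filling sketch: one hand-written pattern that extracts a
# (Person, Company, Post, State) record from an appointment sentence.
# The regex is invented; real extractors learn many such patterns (or
# sequence models) and still make mistakes like the one visible here.
import re

text = ("New York Times Co. named Russell T. Lewis, 45, president and "
        "general manager of its flagship New York Times newspaper, "
        "responsible for all business-side activities.")

pattern = re.compile(
    r"(?P<company>[A-Z][\w.]*(?: [A-Z][\w.]*)*) named "      # "X Co. named ..."
    r"(?P<person>[A-Z]\w*(?: [A-Z]\.?)*(?: [A-Z]\w*)+)"      # "Russell T. Lewis"
    r"(?:, \d+,)? "                                          # optional ", 45,"
    r"(?P<post>[a-z][\w ]+?) of"                             # "... manager of"
)

m = pattern.search(text)
if m:
    record = {
        "Person": m.group("person"),
        "Company": m.group("company"),  # grabs the parent company, not the newspaper
        "Post": m.group("post"),
        "State": "start",               # "named X <post>" marks the start of a post
    }
    print(record)
```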

Document Understanding?
• Question Answering:
  • More than search
  • Ask general comprehension questions of a document collection
  • Can be really easy: “What’s the capital of Wyoming?”
  • Can be harder: “How many US states’ capitals are also their largest cities?”
  • Can be open ended: “What are the main issues in the global warming debate?”
• SOTA: Can do factoids, even when the text isn’t a perfect match

Problem: Ambiguities
• Headlines:
  • Teacher Strikes Idle Kids
  • Hospitals Are Sued by 7 Foot Doctors
  • Ban on Nude Dancing on Governor’s Desk
  • Iraqi Head Seeks Arms
  • Local HS Dropouts Cut in Half
  • Juvenile Court to Try Shooting Defendant
  • Stolen Painting Found by Tree
  • Kids Make Nutritious Snacks
• Why are these funny?

Syntactic Analysis
Hurricane Emily howled toward Mexico’s Caribbean coast on Sunday packing 135 mph winds and torrential rain and causing panic in Cancun, where frightened tourists squeezed into musty shelters.
[demo]

PCFGs
• Natural language grammars are very ambiguous!
• PCFGs are a formal probabilistic model of trees
  • Each “rule” has a conditional probability (like an HMM)
  • A tree’s probability is the product of all the rules used
• Parsing: Given a sentence, find the best tree – a search problem!

ROOT → S        375/420
S → NP VP .     320/392
NP → PRP        127/539
VP → VBD ADJP    32/401
…
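To make "probability of a tree = product of rule probabilities" and "parsing = search" concrete, here is a minimal Viterbi CKY sketch over a made-up grammar in Chomsky normal form. The rules, probabilities, and lexicon are toy values, not the treebank counts shown above, and real parsers handle unary rules, smoothing, and far larger grammars.

```python
# Minimal Viterbi CKY sketch for a toy PCFG in Chomsky normal form.
# Grammar, probabilities, and lexicon are invented for illustration.
from collections import defaultdict

binary = {                      # (left child, right child) -> [(parent, prob)]
    ("NP", "VP"): [("S", 0.9)],
    ("DT", "NN"): [("NP", 0.5)],
    ("VBD", "NP"): [("VP", 0.6)],
}
lexicon = {                     # word -> [(preterminal, prob)]
    "the": [("DT", 1.0)],
    "dog": [("NN", 0.5), ("NP", 0.1)],
    "saw": [("VBD", 1.0)],
    "cat": [("NN", 0.5)],
}

def cky(words):
    n = len(words)
    # chart[i][j][symbol] = (best probability, backpointer)
    chart = [[defaultdict(lambda: (0.0, None)) for _ in range(n + 1)]
             for _ in range(n + 1)]
    for i, w in enumerate(words):
        for tag, p in lexicon.get(w, []):
            chart[i][i + 1][tag] = (p, w)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):                       # split point
                for lsym, (lp, _) in list(chart[i][k].items()):
                    for rsym, (rp, _) in list(chart[k][j].items()):
                        for parent, rule_p in binary.get((lsym, rsym), []):
                            p = rule_p * lp * rp            # product of rule probs
                            if p > chart[i][j][parent][0]:
                                chart[i][j][parent] = (p, (k, lsym, rsym))
    return chart

words = "the dog saw the cat".split()
chart = cky(words)
print(chart[0][len(words)]["S"])   # best probability of an S over the whole sentence
```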

Summarization
• Condensing documents
  • Single or multiple documents
  • Extractive or synthetic
  • Aggregative or representative
  • Even just shortening sentences
• Very context-dependent!
• An example of analysis with generation
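As a tiny illustration of the extractive end of that spectrum (not anything shown in the course), here is a frequency-based sentence scorer that keeps the top-scoring sentences in document order; the example document and stopword list are invented.

```python
# Toy extractive summarizer: score each sentence by the document
# frequency of its non-stopword words and keep the top k sentences in
# their original order. Purely illustrative; real summarizers are far
# more context-sensitive.
import re
from collections import Counter

def summarize(text, k=2):
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    stop = {"the", "a", "an", "of", "and", "in", "to", "is", "it", "on", "its"}
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stop]
    freq = Counter(words)
    def score(s):
        toks = [w for w in re.findall(r"[a-z']+", s.lower()) if w not in stop]
        return sum(freq[w] for w in toks) / (len(toks) or 1)
    top = set(sorted(sentences, key=score, reverse=True)[:k])
    return [s for s in sentences if s in top]          # keep document order

doc = ("Hurricane Emily howled toward Mexico's Caribbean coast on Sunday. "
       "The storm packed 135 mph winds and torrential rain. "
       "Frightened tourists in Cancun squeezed into musty shelters. "
       "Officials said the storm caused panic along the coast.")
print(summarize(doc))
```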

Machine Translation
• SOTA: much better than nothing, but more an understanding aid than a replacement for human translators
• New, better methods
[demo: original text → translated text]

Corpus-Based MT
Modeling correspondences between languages

Sentence-aligned parallel corpus:
  Yo lo haré mañana → I will do it tomorrow
  Hasta pronto → See you soon
  Hasta pronto → See you around

Machine translation system (a model of translation):
  Novel sentence: Yo lo haré pronto
  → I will do it soon / I will do it around / See you tomorrow
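One classic way to model such correspondences, used here purely for illustration (the slide does not name a specific model), is IBM Model 1: EM re-estimation of word-translation probabilities from sentence-aligned pairs. The sketch below runs on the three toy pairs from the slide, so the learned table mostly reflects co-occurrence; with more data the estimates sharpen.

```python
# Sketch of IBM Model 1: EM estimation of word-translation probabilities
# t(e | f) from a sentence-aligned corpus. The corpus is the slide's toy
# example; Model 1 itself is an assumption made for illustration.
from collections import defaultdict

corpus = [
    ("yo lo haré mañana".split(), "i will do it tomorrow".split()),
    ("hasta pronto".split(),      "see you soon".split()),
    ("hasta pronto".split(),      "see you around".split()),
]

e_vocab = {e for _, es in corpus for e in es}
t = defaultdict(lambda: 1.0 / len(e_vocab))      # uniform initialization of t(e | f)

for _ in range(20):                              # EM iterations
    count = defaultdict(float)                   # expected counts c(e, f)
    total = defaultdict(float)                   # expected counts c(f)
    for fs, es in corpus:
        for e in es:
            z = sum(t[(e, f)] for f in fs)       # normalizer over source words
            for f in fs:
                p = t[(e, f)] / z                # P(f aligned to e | sentence pair)
                count[(e, f)] += p
                total[f] += p
    t = defaultdict(float, {ef: count[ef] / total[ef[1]] for ef in count})

# With only three sentence pairs the table mostly reflects co-occurrence
# ("see"/"you" tie with "soon"/"around" for "pronto"); more data sharpens it.
print(sorted(((round(t[(e, "pronto")], 3), e) for e in e_vocab), reverse=True)[:4])
```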

Levels of Transfer

MT Overview

A Phrase-Based Decoder
• Probabilities at each step include the language model (LM) and the translation model (TM)
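As an illustrative sketch of how LM and TM scores combine during decoding (a simplified, monotone beam search, not the actual decoder behind the slide), the toy phrase table, bigram LM, and probabilities below are invented.

```python
# A minimal sketch of monotone phrase-based decoding: hypotheses are
# extended phrase by phrase, and each step adds a translation-model (TM)
# score for the phrase pair plus a language-model (LM) score for the new
# target words. Phrase table, bigram LM, and beam search are toy inventions.
import math
from collections import defaultdict

phrase_table = {          # source phrase -> [(target phrase, log P(tgt | src))]
    ("yo",): [(("i",), math.log(0.9))],
    ("lo", "haré"): [(("will", "do", "it"), math.log(0.7))],
    ("pronto",): [(("soon",), math.log(0.6)), (("around",), math.log(0.3))],
}
bigram_lm = defaultdict(lambda: math.log(1e-3), {   # smoothed log P(w2 | w1)
    ("<s>", "i"): math.log(0.4), ("i", "will"): math.log(0.5),
    ("will", "do"): math.log(0.6), ("do", "it"): math.log(0.7),
    ("it", "soon"): math.log(0.2), ("it", "around"): math.log(0.05),
})

def decode(source, beam=5):
    # A hypothesis is (score, number of source words covered, target words so far).
    hyps = [(0.0, 0, ("<s>",))]
    for _ in range(len(source)):
        new = []
        for score, i, out in hyps:
            for length in (1, 2, 3):              # try source phrases starting at i
                if i + length > len(source):
                    break
                src = tuple(source[i:i + length])
                for tgt, tm in phrase_table.get(src, []):
                    s, words = score + tm, out
                    for w in tgt:                 # add LM cost word by word
                        s += bigram_lm[(words[-1], w)]
                        words = words + (w,)
                    new.append((s, i + length, words))
        finished = [h for h in hyps if h[1] == len(source)]
        hyps = sorted(new + finished, reverse=True)[:beam]   # beam pruning
    done = [h for h in hyps if h[1] == len(source)]
    return max(done) if done else None

print(decode("yo lo haré pronto".split()))
# -> the highest-scoring complete hypothesis, ending in "... do it soon"
```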

Search for MT

Etc.: Historical Change
• Change in form over time; reconstruct ancient forms, phylogenies
• … just an example of the many other kinds of models we can build

Want to Know More?
• Check out the Berkeley NLP Group: nlp.cs.berkeley.edu