Overview of Statistical NLP IR Group Meeting March 7, 2006.

03/07/2006 IR Group Meeting -- NLP

Outline
- Some basic/important NLP problems
- Topics that have recently attracted much interest
- NLP research groups
- Discussion on the relation between NLP and IR

Levels of Analysis in NLP (from Dan Roth’s CS598)
- Morphology: how words are constructed
- Syntax: structural relations between words
- Semantics: the meaning of words and of combinations of words
- Pragmatics: how is a sentence used? What is its purpose?
- Discourse (sometimes distinguished as a subfield of pragmatics): relationships between sentences; global context

Some NLP Problems
- N-gram models
- Word sense disambiguation
- Lexical acquisition
- (POS) tagging
- (Syntactic) parsing
- Semantic role labeling (semantic parsing)
- Named entity recognition
- Textual entailment
- …

N-gram Models
The task: to estimate P(w_n | w_1, …, w_{n-1})
Approaches:
- Maximum likelihood estimation
- Various smoothing methods
Applications:
- Automatic speech recognition
- Spelling correction
- Handwriting recognition
- Statistical machine translation
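The two approaches on the N-gram slide can be sketched for the bigram case, where the history is truncated to the previous word. This is only an illustration: the toy corpus, the function names, and the choice of add-one (Laplace) smoothing as the smoothing method are assumptions, not something the slides specify.

```python
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams over tokenized sentences (toy helper)."""
    history_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return history_counts, bigram_counts, vocab

def p_mle(prev, cur, history_counts, bigram_counts):
    """Maximum likelihood estimate: count(prev, cur) / count(prev)."""
    if history_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, cur)] / history_counts[prev]

def p_add_one(prev, cur, history_counts, bigram_counts, vocab):
    """Add-one smoothing: every bigram gets one extra pseudo-count."""
    return (bigram_counts[(prev, cur)] + 1) / (history_counts[prev] + len(vocab))

# Hypothetical three-sentence corpus.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
hc, bc, vocab = train_bigram(corpus)

print(p_mle("the", "cat", hc, bc))             # seen bigram
print(p_mle("the", "ran", hc, bc))             # unseen bigram: MLE gives zero
print(p_add_one("the", "ran", hc, bc, vocab))  # smoothing gives it some mass
```

The unseen bigram shows why smoothing matters: the maximum likelihood estimate assigns it probability zero, which would zero out any sentence containing it.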

Word Sense Disambiguation (WSD)
The task: to determine which of the senses of an ambiguous word is involved in a particular use of the word
Approaches:
- Supervised:
  - Log-linear models
  - Information-theoretic
  - Memory-based learning (kNN)
- Dictionary-based:
  - Sense definitions
  - Thesauri
  - Translations in a second language
- Unsupervised:
  - Clustering using EM algorithm
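As one illustration of the dictionary-based approach using sense definitions, the sketch below picks the sense whose definition shares the most content words with the context (a simplified Lesk-style overlap). The sense inventory, glosses, and stopword list are made up for the example and do not come from a real dictionary.

```python
# Hypothetical stopword list and sense inventory, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "or",
             "that", "is", "for", "by", "at", "i", "my", "we", "as"}

BANK_SENSES = {
    "bank/finance": "an institution that accepts deposits of money and makes loans",
    "bank/river": "sloping land beside a body of water such as a river",
}

def content_words(text):
    """Lowercase, strip simple punctuation, and drop stopwords."""
    return {w.strip(".,;").lower() for w in text.split()} - STOPWORDS

def simplified_lesk(context, sense_definitions):
    """Return the sense whose definition overlaps most with the context."""
    ctx = content_words(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_definitions.items():
        overlap = len(ctx & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(simplified_lesk("I deposited money at the bank to repay my loans", BANK_SENSES))
print(simplified_lesk("We sat on the bank of the river and watched the water", BANK_SENSES))
```

Ties are broken by dictionary order here; a real system would need a better fallback, such as the most frequent sense.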

Word Sense Disambiguation (WSD)
Accuracy:
- Word-specific
- Easy words: > 90%
- Hard words: 50~70%
Applications:
- Statistical machine translation
- Information retrieval

Lexical Acquisition
The task: to develop algorithms and statistical techniques for filling the holes in existing machine-readable dictionaries by looking at the occurrence patterns of words in large text corpora
Examples:
- Verb subcategorization
- Prepositional phrase attachment disambiguation
- Selectional preferences
- Semantic similarity

Semantic Similarity
The task: to acquire a relative measure of similarity between two words
Approaches:
- Vector space measures (document space, word space, modifier space, etc.)
- Probabilistic measures (KL-divergence, etc.)
Applications:
- Information retrieval (query expansion)
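The two families of measures can be sketched as follows: cosine similarity for the vector space view and KL divergence for the probabilistic view. The co-occurrence counts and distributions are invented for illustration, and the sparse-dictionary representation is just one convenient choice.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def kl_divergence(p, q):
    """D(p || q) in bits; assumes q[k] > 0 wherever p[k] > 0."""
    return sum(pk * math.log2(pk / q[k]) for k, pk in p.items() if pk > 0)

# Hypothetical word-space vectors: counts of neighbouring words.
car = {"drive": 4, "road": 3, "engine": 2}
truck = {"drive": 3, "road": 4, "load": 1}
banana = {"eat": 5, "peel": 2}

print(cosine(car, truck))   # shared contexts, so similarity is high
print(cosine(car, banana))  # no shared contexts

# Hypothetical context distributions for two words.
p = {"a": 0.5, "b": 0.5}
q = {"a": 0.25, "b": 0.75}
print(kl_divergence(p, q), kl_divergence(q, p))
```

Unlike cosine, KL divergence is asymmetric and is a dissimilarity (zero means identical distributions), so it has to be inverted or symmetrized before it can be used as a similarity score.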

POS Tagging
The task: labeling each word in a sentence with its appropriate part of speech
Major approaches:
- HMM
- Transformation-based
  Advantages: speed and storage
Other approaches:
- Neural networks, decision trees, memory-based learning, maximum entropy models
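For the HMM approach, tagging amounts to finding the most probable tag sequence, which is usually done with the Viterbi algorithm. The sketch below assumes a first-order HMM; the tag set and all probabilities are toy values chosen for the example rather than estimates from a corpus.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p):
    """Most likely tag sequence under a first-order HMM, computed in log space."""
    def lp(x):
        return math.log(x) if x > 0 else float("-inf")

    # best[i][t]: best log-probability of any tagging of words[:i+1] ending in t
    best = [{t: lp(start_p[t]) + lp(emit_p[t].get(words[0], 0.0)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((s, best[i - 1][s] + lp(trans_p[s][t])) for s in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0.0))
            back[i][t] = prev

    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path))

# Toy model parameters (assumptions for illustration).
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.5, "walk": 0.4, "walks": 0.1},
    "VERB": {"walks": 0.6, "walk": 0.3, "dog": 0.1},
}

print(viterbi(["the", "dog", "walks"], TAGS, START, TRANS, EMIT))
print(viterbi(["the", "walk"], TAGS, START, TRANS, EMIT))
```

The second call shows the context effect: "walk" is more likely to be emitted as a noun after a determiner, even though it can also be a verb.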

POS Tagging
Accuracy:
- 95~97%
- Achieved only when the application text and the training text are from a similar source
Applications:
- For higher-level NLP tasks: partial parsing, parsing, NER, etc.
“…the best lexicalized probabilistic parsers are now good enough that they perform better starting with untagged text and doing the tagging themselves, rather than using a tagger as preprocessor.” (Charniak 1997)

(Syntactic) Parsing
The task: to find the most likely syntactic parse tree of a sentence
Approaches:
- Probabilistic context free grammar (PCFG)
  - Supervised
  - Unsupervised
- Lexicalized models
- Dependency-based models
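Finding the most likely parse under a PCFG can be sketched with probabilistic CKY. The version below assumes a grammar already in Chomsky normal form and a fixed start symbol `S`; the grammar, its probabilities, and the tuple-based tree representation are all toy assumptions for illustration.

```python
from collections import defaultdict

def pcky(words, lexical, binary):
    """Probabilistic CKY for a PCFG in Chomsky normal form.

    lexical: {(A, word): prob} for rules A -> word
    binary:  {(A, B, C): prob} for rules A -> B C
    Returns (prob, tree) for the best parse rooted in 'S', or (0.0, None).
    """
    n = len(words)
    chart = defaultdict(dict)  # chart[(i, j)][A] = (prob, tree)
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                chart[(i, i + 1)][a] = (p, (a, w))
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for (a, b, c), p in binary.items():
                    if b in chart[(i, k)] and c in chart[(k, j)]:
                        pb, tb = chart[(i, k)][b]
                        pc, tc = chart[(k, j)][c]
                        prob = p * pb * pc
                        if prob > chart[(i, j)].get(a, (0.0, None))[0]:
                            chart[(i, j)][a] = (prob, (a, tb, tc))
    return chart[(0, n)].get("S", (0.0, None))

# Toy grammar (assumption): NP expands to "DET N" (0.7) or "she" (0.3).
LEXICAL = {("NP", "she"): 0.3, ("V", "saw"): 1.0, ("DET", "the"): 1.0,
           ("N", "dog"): 0.5, ("N", "cat"): 0.5}
BINARY = {("S", "NP", "VP"): 1.0, ("NP", "DET", "N"): 0.7, ("VP", "V", "NP"): 1.0}

prob, tree = pcky(["she", "saw", "the", "dog"], LEXICAL, BINARY)
print(prob)
print(tree)
```

In the supervised setting the rule probabilities would be estimated from a treebank; here they are simply written down by hand.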

(Syntactic) Parsing
Accuracy:
- Charniak 1997: Rec Prec
- Collins 1997: Rec Prec
Applications:
- For other NLP tasks such as semantic role labeling and relation extraction
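Assuming "Rec" and "Prec" on this slide refer to labeled bracket recall and precision, as is common in parser evaluation, the two measures can be sketched as set overlap between predicted and gold constituents. The constituent spans below are hypothetical and are not taken from either parser.

```python
def bracket_scores(gold, predicted):
    """Labeled bracket precision and recall over (label, start, end) triples."""
    matched = len(gold & predicted)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical constituents for one four-word sentence.
gold = {("S", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("NP", 2, 4)}
predicted = {("S", 0, 4), ("NP", 0, 1), ("VP", 1, 3), ("NP", 2, 4), ("PP", 3, 4)}

precision, recall = bracket_scores(gold, predicted)
print(precision, recall)
```

Precision penalizes spurious brackets in the parser output, while recall penalizes gold brackets the parser missed.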

Semantic Role Labeling
The task: to identify the predicate-argument structures in sentences
Approaches:
- Supervised learning
Accuracy:
- Best ~70% (CoNLL 04 shared task)
Applications:
- Information extraction
- Question answering

Textual Entailment
The task: given two text fragments, to recognize whether the meaning of one text is entailed by (can be inferred from) the other text
Approaches:
- Word overlap
- Statistical lexical relations
- Syntactic matching
- Logic inference
Accuracy:
- ~0.56, best ~0.60 (PASCAL Challenge 05)
Applications:
- Question answering
- Multi-document summarization
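The simplest of these approaches, word overlap, can be sketched as the fraction of hypothesis words that also appear in the text. The tokenizer, the 0.75 threshold, and the example sentences are assumptions made for this illustration, not details of any challenge system.

```python
import re

def tokenize(sentence):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

def overlap_score(text, hypothesis):
    """Fraction of distinct hypothesis tokens that also occur in the text."""
    t, h = set(tokenize(text)), set(tokenize(hypothesis))
    if not h:
        return 0.0
    return len(h & t) / len(h)

def entails(text, hypothesis, threshold=0.75):
    """Predict entailment when the overlap reaches a (hypothetical) threshold."""
    return overlap_score(text, hypothesis) >= threshold

text = "A man was killed in a car accident on Monday in Paris."
print(entails(text, "A man was killed in Paris."))
print(entails(text, "A woman bought a car in Paris."))
print(entails(text, "A man was not killed in Paris."))  # negation fools the baseline
```

The last call illustrates a weakness of pure overlap: a negated hypothesis still shares nearly all of its words with the text, so it is wrongly accepted.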

Tools
- Brill Tagger
- Charniak Parser
- Collins Parser
- MiniPar
- Semantic Parser
  - ASSERT Parser
  - CCG’s demo

Corpora
- WordNet
- Penn Treebank (Sample)
- PropBank
- FrameNet

Other Tasks
- Automatic Speech Recognition
- Natural Language Generation
- Automatic Summarization
- …

Outline
- Some basic/important NLP problems
- Topics that have recently attracted much interest
- NLP research groups
- Discussion on the relation between NLP and IR

Recent Topics
- Unsupervised and semi-supervised approaches
  - Knowledge acquisition bottleneck
- Semantic role labeling
  - Improve the performance of SRL
  - Use the results for other tasks
- Relation extraction
- WSD
- Parsing
- Statistical machine translation
  - Word alignment

Outline
- Some basic/important NLP problems
- Topics that have recently attracted much interest
- NLP research groups
- Discussion on the relation between NLP and IR

NLP Research Groups
- USC/ISI
- Stanford
- UPenn
- Johns Hopkins
- UIUC
- …

Outline
- Some basic/important NLP problems
- Topics that have recently attracted much interest
- NLP research groups
- Discussion on the relation between NLP and IR