Chapter 8: Lexical Acquisition (February 19, 2007). Additional notes to Manning's slides.


Slide 2 notes
- Language is constantly evolving.
- Many properties of interest to NLP, such as the frequency or probability of occurrence of a word, are not available in dictionary form.
- We therefore need to continually learn and acquire new terms and usages.
- Focus areas for this chapter:
  - Attachment ambiguity:
    A. The children ate the cake with their hands.
    B. The children ate the cake with blue icing.
  - Semantic characterization of a verb's arguments.

Slide 3 notes
Evaluation measures discussion:
- tp: true positives
- fp: false positives (Type I errors)
- fn: false negatives (Type II errors)
- tn: true negatives
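A minimal sketch of how these counts turn into the precision, recall, and F-measure figures used in the evaluation discussion; the function names and confusion-matrix counts below are invented for illustration.

```python
def precision(tp, fp):
    """Fraction of items labelled positive that really are positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Fraction of truly positive items that were labelled positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def f_measure(tp, fp, fn, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta = 1)."""
    p, r = precision(tp, fp), recall(tp, fn)
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)

# Invented confusion-matrix counts, just to show the calls.
print(precision(tp=40, fp=10))          # 0.8
print(recall(tp=40, fn=20))             # 0.666...
print(round(f_measure(40, 10, 20), 3))  # 0.727
```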

Slide 5 notes
A trade-off exists between precision and recall. One can simply return all possible documents and get 100% recall (no false negatives), but precision will be low because there will be many false positives.
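To make the trade-off concrete, a small invented example: in a hypothetical collection of 1000 documents of which 50 are relevant, a system that returns everything reaches recall 1.0 while precision collapses to the base rate of relevant documents.

```python
# Invented collection: 1000 documents, 50 of them relevant.
n_docs, n_relevant = 1000, 50

# A degenerate system that simply returns every document:
tp = n_relevant            # every relevant document is retrieved
fp = n_docs - n_relevant   # ... along with every irrelevant one
fn = 0                     # nothing relevant is missed

recall = tp / (tp + fn)     # 1.0  (no false negatives)
precision = tp / (tp + fp)  # 0.05 (just the base rate of relevant documents)
print(recall, precision)
```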

Slide 9 notes

Slide 11 notes
- tell: has the subcategorization frame NP NP S (subject, object, clause).
- find: lacks such a frame, but has NP NP (subject, object).
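Purely as an illustration of the data structure (the lexicon below contains only the two verbs and frames mentioned above), a subcategorization lexicon can be sketched as a map from verbs to the frames they allow:

```python
# Toy subcategorization lexicon: verbs map to the frames they allow.
# Frames are tuples of constituent labels; only the two frames from the
# notes above are included.
SUBCAT = {
    "tell": {("NP", "NP", "S")},   # subject, object, clause
    "find": {("NP", "NP")},        # subject, object
}

def allows(verb, frame):
    """Does the lexicon record this frame for this verb?"""
    return frame in SUBCAT.get(verb, set())

print(allows("tell", ("NP", "NP", "S")))  # True
print(allows("find", ("NP", "NP", "S")))  # False
```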

Slide 12 notes
Cue for the frame NP NP: (OBJ | SUBJ_OBJ | CAP)(PUNC | CC), where
- OBJ: personal pronouns like "me" and "him"
- SUBJ_OBJ: pronouns such as "you" and "it"
- CC: subordinating conjunctions like "if", "before", or "as"
Error-rate determination uses the binomial distribution: each occurrence of the verb is an independent coin flip in which, with probability e_j (the error rate), the cue occurs but does not correctly identify the frame, and with probability 1 - e_j it works correctly.
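A sketch of the binomial reasoning above: under the null hypothesis that the verb does not take the frame, every cue occurrence is an error with rate e_j, so seeing the cue m or more times among n occurrences of the verb has the binomial tail probability computed below. The counts, error rate, and the 0.02 threshold are invented for illustration; this is not Brent's actual implementation.

```python
from math import comb

def p_frame_by_chance(n, m, err):
    """Probability of seeing the cue m or more times in n occurrences of the
    verb if every cue occurrence were an error with rate err (binomial tail)."""
    return sum(comb(n, r) * err**r * (1 - err) ** (n - r) for r in range(m, n + 1))

def assign_frame(n, m, err, alpha=0.02):
    """Assign the frame when the 'all errors' explanation is implausible."""
    return p_frame_by_chance(n, m, err) < alpha

# Invented numbers: verb seen 100 times, cue fired 6 times, cue error rate 1%.
print(p_frame_by_chance(100, 6, 0.01))  # ~5e-4: very unlikely to be all errors
print(assign_frame(100, 6, 0.01))       # True
```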

Slide 13 notes
- Brent (1993): the Lerner algorithm has high precision but low recall.
- Manning (1993): by combining cues with tagged text and looking for patterns such as (OBJ | SUBJ_OBJ | CAP)(PUNC | CC), reliability can be increased.

Slide 32 notes
For instance, the verb "eat" strongly prefers something edible as its object. Exceptions include metaphorical uses of the word, such as "eating one's words" or "fear eats the soul".

Slide 33 notes
- Relative entropy, also known as KL (Kullback-Leibler) divergence.
- Example for the selectional association A(v, n) with a noun like "chair": "Susan interrupted the chair."
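A sketch of the computation, following the Resnik-style definitions discussed in the chapter: selectional preference strength S(v) is the KL divergence between P(C|v) and the prior P(C) over noun classes, and the selectional association of a class divides that class's contribution by S(v). The noun-class distributions below are invented toy numbers.

```python
from math import log2

def kl_divergence(p, q):
    """Relative entropy D(p || q) = sum_c p(c) * log2(p(c) / q(c))."""
    return sum(p[c] * log2(p[c] / q[c]) for c in p if p[c] > 0)

# Invented noun-class distributions, only to show the shape of the computation.
prior     = {"food": 0.20, "people": 0.30, "furniture": 0.20, "ideas": 0.30}  # P(C)
given_eat = {"food": 0.85, "people": 0.05, "furniture": 0.05, "ideas": 0.05}  # P(C | eat)

# Selectional preference strength: how strongly the verb constrains its object.
S_eat = kl_divergence(given_eat, prior)

def association(p_c_given_v, p_c, strength):
    """Selectional association of a verb with one noun class."""
    return p_c_given_v * log2(p_c_given_v / p_c) / strength

print(round(S_eat, 3))                                                 # 1.416
print(round(association(given_eat["food"], prior["food"], S_eat), 3))  # 1.253
```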

Slide 38 notes
Example: X and Y are sets with |X| = |Y| = 10 and |X ∩ Y| = 1.
- Matching coefficient = |X ∩ Y| = 1
- Dice coefficient = 2|X ∩ Y| / (|X| + |Y|) = 2/(10 + 10) = 0.1
- Jaccard coefficient = |X ∩ Y| / |X ∪ Y| = 1/(10 + 10 - 1) ≈ 0.05
- Overlap coefficient = |X ∩ Y| / min(|X|, |Y|) = 1/10 = 0.1
- Cosine coefficient = |X ∩ Y| / sqrt(|X| · |Y|) = 1/sqrt(100) = 0.1
Cosine is especially useful when the two vectors have widely differing numbers of non-zero entries: if one vector has a single non-zero entry and the other has 1000, Dice gives ≈ 0.002 while cosine gives ≈ 0.03.
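A sketch that reproduces the arithmetic above using the set-based forms of the coefficients; the two 10-element sets are arbitrary, chosen only so that they share exactly one member.

```python
from math import sqrt

def matching(x, y): return len(x & y)
def dice(x, y):     return 2 * len(x & y) / (len(x) + len(y))
def jaccard(x, y):  return len(x & y) / len(x | y)
def overlap(x, y):  return len(x & y) / min(len(x), len(y))
def cosine(x, y):   return len(x & y) / sqrt(len(x) * len(y))

# Arbitrary 10-element sets sharing exactly one member (the element 9),
# matching the worked example above.
X = set(range(0, 10))   # {0, ..., 9}
Y = set(range(9, 19))   # {9, ..., 18}

print(matching(X, Y))           # 1
print(dice(X, Y))               # 0.1
print(round(jaccard(X, Y), 3))  # 0.053
print(overlap(X, Y))            # 0.1
print(cosine(X, Y))             # 0.1
```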