Introduction to Machine Learning and Text Mining


Introduction to Machine Learning and Text Mining Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute

Naïve Approach: When all you have is a hammer… (diagram: Data → Target Representation)

Slightly less naïve approach: Aimless wandering… (diagram: Data → Target Representation)

Expert Approach: Hypothesis driven (diagram: Data → Target Representation)

Suggested Readings: Witten, I. H., Frank, E., & Hall, M. (2011). Data Mining: Practical Machine Learning Tools and Techniques, third edition. San Francisco: Elsevier.

What is machine learning? Automatically or semi-automatically: inducing concepts (i.e., rules) from data, finding patterns in data, explaining data, making predictions. (diagram: Data → Learning Algorithm → Model; New Data + Model → Classification Engine → Prediction)

Train Test

If Outlook = sunny, no; else if Outlook = overcast, yes; else if Outlook = rainy and Windy = TRUE, no; else yes. Perfect on training data.

If Outlook = sunny, no; else if Outlook = overcast, yes; else if Outlook = rainy and Windy = TRUE, no; else yes. Not perfect on testing data. Performance on training data?

If Outlook = sunny, no; else if Outlook = overcast, yes; else if Outlook = rainy and Windy = TRUE, no; else yes. IMPORTANT! If you evaluate the performance of your rule on the same data you trained on, you won’t get an accurate estimate of how well it will do on new data.
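The slide's hand-written rule can be sketched as a small function. The attribute names (Outlook, Windy) follow the weather data on the slides; representing Windy as a boolean is an assumption here:

```python
# Sketch of the slide's hand-written rule for the weather data.
# Outlook is one of "sunny", "overcast", "rainy"; Windy is a boolean.
def play_rule(outlook, windy):
    """Return the predicted class ('yes' or 'no') for one instance."""
    if outlook == "sunny":
        return "no"
    elif outlook == "overcast":
        return "yes"
    elif outlook == "rainy" and windy:
        return "no"
    else:
        return "yes"

print(play_rule("sunny", False))  # 'no'
print(play_rule("rainy", True))   # 'no'
```

Applying the function to every training instance reproduces the "perfect on training data" observation; the cross-validation slides that follow show how to estimate accuracy on unseen data instead.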

Simple Cross Validation, Fold 1: Let's say your data has attributes A, B, and C, and you want to train a rule to predict D. Train on folds 2, 3, 4, 5, 6, 7 and apply the trained model to fold 1; the result is Accuracy1. (diagram: fold 1 = TEST, folds 2–7 = TRAIN)

Simple Cross Validation, Fold 2: Train on folds 1, 3, 4, 5, 6, 7 and apply the trained model to fold 2; the result is Accuracy2. (diagram: fold 2 = TEST, the rest = TRAIN)

Simple Cross Validation, Fold 3: Train on folds 1, 2, 4, 5, 6, 7 and apply the trained model to fold 3; the result is Accuracy3. (diagram: fold 3 = TEST, the rest = TRAIN)

Simple Cross Validation, Fold 4: Train on folds 1, 2, 3, 5, 6, 7 and apply the trained model to fold 4; the result is Accuracy4. (diagram: fold 4 = TEST, the rest = TRAIN)

Simple Cross Validation, Fold 5: Train on folds 1, 2, 3, 4, 6, 7 and apply the trained model to fold 5; the result is Accuracy5. (diagram: fold 5 = TEST, the rest = TRAIN)

Simple Cross Validation, Fold 6: Train on folds 1, 2, 3, 4, 5, 7 and apply the trained model to fold 6; the result is Accuracy6. (diagram: fold 6 = TEST, the rest = TRAIN)

Simple Cross Validation, Fold 7: Train on folds 1, 2, 3, 4, 5, 6 and apply the trained model to fold 7; the result is Accuracy7. Finally: average Accuracy1 through Accuracy7. (diagram: fold 7 = TEST, the rest = TRAIN)
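The seven-fold procedure on these slides can be sketched as a short loop. The `train_fn`/`eval_fn` arguments are hypothetical placeholders for any learning algorithm and its accuracy measure; the majority-class stand-ins below are only for illustration:

```python
from collections import Counter

# Minimal sketch of simple cross-validation: hold out one fold, train on
# the rest, record the accuracy, repeat for every fold, then average.
def cross_validate(folds, train_fn, eval_fn):
    accuracies = []
    for i in range(len(folds)):
        held_out = folds[i]
        train_data = [row for j, fold in enumerate(folds)
                      if j != i for row in fold]
        model = train_fn(train_data)
        accuracies.append(eval_fn(model, held_out))
    return sum(accuracies) / len(accuracies)

# Toy stand-ins: predict the most common label seen during training.
def train_majority(rows):  # rows: list of (features, label) pairs
    return Counter(label for _, label in rows).most_common(1)[0][0]

def eval_majority(model, rows):
    return sum(label == model for _, label in rows) / len(rows)
```

With seven folds this computes Accuracy1 through Accuracy7 and returns their average, exactly as the slides describe.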

Working with Text

Basic Idea: Represent text as a vector where each position corresponds to a term. This is called the “bag of words” approach. With vocabulary (Cheese, Cows, Eat, Hamsters, Make, Seeds): “Cows make cheese.” → 1 1 0 0 1 0; “Hamsters eat seeds.” → 0 0 1 1 0 1.

Basic Idea: Represent text as a vector where each position corresponds to a term. This is called the “bag of words” approach. With vocabulary (Cheese, Cows, Eat, Hamsters, Make, Seeds): “Cows make cheese.” → 1 1 0 0 1 0; “Hamsters eat seeds.” → 0 0 1 1 0 1. But “Cheese makes cows.” gets the same representation!
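The representation can be sketched directly. The vocabulary comes from the slide; the tokenization (lowercase, exact match, no stemming) is an assumption, so the order-loss point is shown here with the reordered sentence "Cheese make cows.":

```python
# Bag-of-words sketch: one vector position per vocabulary term,
# 1 if the term occurs in the text, 0 otherwise.
VOCAB = ["cheese", "cows", "eat", "hamsters", "make", "seeds"]

def bag_of_words(text):
    tokens = text.lower().replace(".", "").split()
    return [1 if term in tokens else 0 for term in VOCAB]

print(bag_of_words("Cows make cheese."))    # [1, 1, 0, 0, 1, 0]
print(bag_of_words("Hamsters eat seeds."))  # [0, 0, 1, 1, 0, 1]
# Word order is lost: the reordered sentence gets the same vector.
print(bag_of_words("Cheese make cows."))    # [1, 1, 0, 0, 1, 0]
```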

Part of Speech Tagging (Penn Treebank tag set: http://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html)
1. CC Coordinating conjunction
2. CD Cardinal number
3. DT Determiner
4. EX Existential there
5. FW Foreign word
6. IN Preposition/subordinating conjunction
7. JJ Adjective
8. JJR Adjective, comparative
9. JJS Adjective, superlative
10. LS List item marker
11. MD Modal
12. NN Noun, singular or mass
13. NNS Noun, plural
14. NNP Proper noun, singular
15. NNPS Proper noun, plural
16. PDT Predeterminer
17. POS Possessive ending
18. PRP Personal pronoun
19. PRP$ Possessive pronoun
20. RB Adverb
21. RBR Adverb, comparative
22. RBS Adverb, superlative

Part of Speech Tagging (continued)
23. RP Particle
24. SYM Symbol
25. TO to
26. UH Interjection
27. VB Verb, base form
28. VBD Verb, past tense
29. VBG Verb, gerund/present participle
30. VBN Verb, past participle
31. VBP Verb, non-3rd person singular present
32. VBZ Verb, 3rd person singular present
33. WDT Wh-determiner
34. WP Wh-pronoun
35. WP$ Possessive wh-pronoun
36. WRB Wh-adverb

Basic Types of Features: Unigram — single words: prefer, sandwich, take. Bigram — pairs of words next to each other: machine_learning, eat_wheat. POS-Bigram — pairs of POS tags next to each other: DT_NN, NNP_NNP.
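The three feature types can be sketched with one helper. The example tokens and the hand-assigned POS tags below are illustrative assumptions; in practice the tags would come from a POS tagger such as the ones this course covers:

```python
# Unigrams are just the tokens; bigrams join adjacent pairs with "_",
# matching the slide's machine_learning / DT_NN notation. Applying the
# same bigram function to a tag sequence yields POS-bigrams.
def unigrams(tokens):
    return list(tokens)

def bigrams(tokens):
    return [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]

words = ["the", "hamsters", "eat", "wheat"]
tags = ["DT", "NNS", "VBP", "NN"]  # hand-assigned for illustration

print(unigrams(words))  # ['the', 'hamsters', 'eat', 'wheat']
print(bigrams(words))   # ['the_hamsters', 'hamsters_eat', 'eat_wheat']
print(bigrams(tags))    # ['DT_NNS', 'NNS_VBP', 'VBP_NN']
```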

Keep this picture in mind… Machine learning isn’t magic, but it can be useful for identifying meaningful patterns in your data when used properly. Proper use requires insight into your data.