Machine Learning in Practice, Lecture 25
Carolyn Penstein Rosé
Language Technologies Institute / Human-Computer Interaction Institute

Plan for the Day
Announcements
 - Questions?
 - Final quiz today!
 - Next lecture: Homework 9 due, Midterm 2 review
Strategies for Efficient Experimentation
Call center application example
 - Active learning and semi-supervised learning

Weka Helpful Hint of the Day

Strategies for Efficient Experimentation

Adversarial Learning
People are constantly "gaming the system", so behavior needs to adjust over time:
 - Automatic essay grading
 - Spam detection
 - Computer network security
 - Computer Assisted Passenger Prescreening
Solution: incremental learning
 - Issue: eventually you accumulate a huge amount of data

Personalization
Similar issues
 - Massive amounts of data collected over time
Example types of data
 - Click stream data
 - Web pages visited
Why keep collecting data?
 - Interests change over time
 - Personal situation changes over time
Connection between learning models and summarization

Incremental Learning
Problem with cross validation
 - If spam changes over time, more recent data is more relevant than old data
Instance-based learning is a good solution
If you can't do incremental learning, you'll need a quick learning scheme that allows you to periodically retrain (see the sketch below)
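As an aside (not from the lecture), here is a minimal sketch of both options in scikit-learn: a learner that supports true incremental updates, and a fallback that periodically retrains a fast learner on a sliding window of recent data. The data stream, window size, and model choices are illustrative assumptions.

```python
# Minimal sketch: incremental updates vs. periodic retraining on recent data.
# Assumes a hypothetical stream of (X_batch, y_batch) feature/label arrays.
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.naive_bayes import MultinomialNB

classes = np.array([0, 1])  # e.g., 0 = ham, 1 = spam

# Option 1: true incremental learning -- update the model one batch at a time.
incremental_model = SGDClassifier()

def on_new_batch(X_batch, y_batch):
    incremental_model.partial_fit(X_batch, y_batch, classes=classes)

# Option 2: no incremental learner available -- keep a sliding window of
# recent examples and periodically retrain a quick learner from scratch,
# so newer (more relevant) data dominates the model.
window_X, window_y, WINDOW = [], [], 50_000

def retrain_on_recent(X_batch, y_batch):
    window_X.extend(X_batch)
    window_y.extend(y_batch)
    del window_X[:-WINDOW], window_y[:-WINDOW]   # drop the oldest examples
    return MultinomialNB().fit(np.asarray(window_X), np.asarray(window_y))
```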

Learning from Massive Data Sets
Critical issues: space and time
One strategy is to carefully select an algorithm that can handle a large amount of data
If space is an issue:
 - Naïve Bayes is a good choice: only one instance needs to be kept in main memory at a time
 - Instance-based learning: sophisticated caching and indexing mechanisms can make this workable
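To illustrate the memory point (my example, not part of the slides), Naïve Bayes only needs to update per-class counts, so training can stream over the data in chunks; the file name, "label" column, and chunk size below are hypothetical.

```python
# Sketch: training Naive Bayes without holding the whole dataset in memory.
import numpy as np
import pandas as pd
from sklearn.naive_bayes import MultinomialNB

model = MultinomialNB()
classes = np.array([0, 1])

# Each chunk only updates the per-class counts, so memory use stays
# roughly constant no matter how large the training file is.
for chunk in pd.read_csv("train.csv", chunksize=10_000):
    X = chunk.drop(columns=["label"]).to_numpy()
    y = chunk["label"].to_numpy()
    model.partial_fit(X, y, classes=classes)
```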

Learning from Massive Data Sets
If time is an issue:
 - Again, Naïve Bayes training time is linear in both the number of instances and the number of attributes
 - Decision trees are linear with respect to the number of attributes
 - Instance-based learning can be parallelized
 - Bagging and stacking (but not boosting) can easily be parallelized
Understanding check: Why do you think that is so?
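On the understanding check, one hedged illustration (mine, not the lecture's answer key): each bagged model is trained on an independent bootstrap sample, so the fits can run on separate cores, whereas each boosting round depends on the errors of the previous one and must run sequentially.

```python
# Sketch: bagging parallelizes because each ensemble member is trained
# independently on its own bootstrap sample of the data.
from sklearn.ensemble import BaggingClassifier

# The default base estimator is a decision tree; n_jobs=-1 trains the
# 50 members on all available cores at once.
bagger = BaggingClassifier(n_estimators=50, n_jobs=-1)
# bagger.fit(X_train, y_train)   # X_train, y_train assumed to exist

# Boosting cannot be split up the same way: model k is fit to the mistakes
# of models 1..k-1, so the fits form a sequential chain.
```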

Other options…
Sub-sampling: just work with a carefully selected subset of the data
Note how performance varies with the amount of training data
 - Tur, G., Hakkani-Tur, D., Schapire, E. (2005). Combining active and semi-supervised learning for spoken language understanding, Speech Communication, 45, pp
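One way to see how performance varies with the amount of training data (a sketch of mine, not taken from the Tur et al. paper) is to plot a learning curve over progressively larger sub-samples:

```python
# Sketch: learning curve -- held-out accuracy as a function of training size.
# Helps judge whether a carefully chosen sub-sample is "enough" data.
import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.naive_bayes import MultinomialNB

# X, y assumed to be an existing feature matrix and label vector.
sizes, train_scores, val_scores = learning_curve(
    MultinomialNB(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
)
for n, score in zip(sizes, val_scores.mean(axis=1)):
    print(f"{n:6d} training examples -> cross-validated accuracy {score:.3f}")
```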

Strategies
Find computers you can leave things running on
Plan your strategy ahead of time
 - Keep good notes so you don't have to rerun experiments
 - Make the most out of time-consuming tasks so you can avoid doing them frequently
   - Error analysis, feature extraction from large texts

Strategies
Do quick tests to get "in the ballpark"
 - Use part of your data
 - Use only 5-fold cross-validation
 - Don't tune
 - Test subsets of features; throw out ones that don't have any predictive value
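A hedged sketch of what such a quick ballpark run might look like in scikit-learn (the 20% subsample, the feature selector, and k=500 are arbitrary illustrative choices, not course recommendations):

```python
# Sketch: a quick "ballpark" experiment -- subsample the data, use 5-fold CV,
# skip tuning, and drop features with no apparent predictive value.
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# X, y assumed to exist; keep a 20% subsample for the quick test.
X_small, _, y_small, _ = train_test_split(
    X, y, train_size=0.2, stratify=y, random_state=0
)

quick = make_pipeline(
    SelectKBest(chi2, k=500),   # chi2 assumes non-negative (e.g., count) features
    MultinomialNB(),            # fast, untuned learner
)
scores = cross_val_score(quick, X_small, y_small, cv=5)
print(f"ballpark accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")
```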

Strategies
Sample broadly and shallowly at first
 - General feature space design
 - Linear versus non-linear
 - What class of algorithm?
Push to see why some algorithms work better on your data than others
 - Eliminate parameters that it doesn't make sense to tune
You can use CVParameterSelection to determine whether tuning a parameter actually yields better performance than the default
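CVParameterSelection is Weka's way of doing this check; a rough analogue in scikit-learn (my sketch, not the course's tooling) compares a cross-validated grid search over one parameter against the untuned default:

```python
# Sketch: check whether tuning a single parameter actually beats the default.
# (Weka's CVParameterSelection meta-classifier serves a similar purpose.)
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import LinearSVC

# X, y assumed to exist.
default_score = cross_val_score(LinearSVC(), X, y, cv=5).mean()

search = GridSearchCV(LinearSVC(), param_grid={"C": [0.01, 0.1, 1, 10, 100]}, cv=5)
search.fit(X, y)

print(f"default C=1: {default_score:.3f}")
print(f"best C={search.best_params_['C']}: {search.best_score_:.3f}")
# If the two numbers are essentially the same, C is not worth tuning here.
```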

Call Center Routing

Tur et al. paper
Call routing
 - Replacement for typical routing by touch-tone interfaces
   - System prompt: "How may I help you?"
   - User: "I would like to buy a laptop"
 - Speech recognition, language understanding
 - Requires a lot of hand labeling
Tur, G., Hakkani-Tur, D., Schapire, E. (2005). Combining active and semi-supervised learning for spoken language understanding, Speech Communication, 45, pp

Tur et al. paper
Goal is to reduce labeling effort by being strategic about which examples are labeled
Evaluated in terms of the reduction in the number of labeled examples needed to achieve the same performance as random selection

Tur et al. paper
Goal: limit the amount of data that needs to be labeled by hand
Active learning: reduce human labeling by being strategic about which examples to label
 - On each iteration, select the examples the algorithm is least confident about
 - Note: since you get confidence scores back from TagHelper tools, you can do something like active learning with it
Semi-supervised learning: you can increase confidence in predictions by adding very confidently classified examples to your set of labeled data
 - Note that TagHelper tools also has a form of semi-supervised learning (self-training) built in; look at the Advanced tab
A sketch of both ideas follows below.
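To make the two ideas concrete, here is a minimal sketch of one round of uncertainty-based active learning and one round of confidence-based self-training, written with scikit-learn rather than TagHelper; the batch size of 20 and the 0.95 confidence threshold are arbitrary illustrative values.

```python
# Sketch: one round of active learning (uncertainty sampling) followed by one
# round of semi-supervised self-training, using predicted class probabilities.
import numpy as np
from sklearn.linear_model import LogisticRegression

# X_labeled, y_labeled, X_unlabeled assumed to exist.
model = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)
probs = model.predict_proba(X_unlabeled)
confidence = probs.max(axis=1)              # confidence of each top prediction

# Active learning: send the LEAST confident examples to a human annotator.
to_label = np.argsort(confidence)[:20]      # 20 = illustrative batch size
# ... collect human labels for X_unlabeled[to_label] and add them to the pool.

# Self-training: add the MOST confident predictions as if they were labeled.
keep = confidence > 0.95                    # 0.95 = illustrative threshold
X_auto = X_unlabeled[keep]
y_auto = model.classes_[probs[keep].argmax(axis=1)]
X_labeled = np.vstack([X_labeled, X_auto])
y_labeled = np.concatenate([y_labeled, y_auto])
```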

Active Learning: Committee-Based Methods
A related idea to bagging
Similar to bagging: you have multiple versions of your labeled data
You train a model on each version of the data
You apply the models to the unlabeled data
The examples with the highest "vote entropy" (disagreement among the models) are the ones with the least certainty
The ones with the least certainty are the ones you should have someone label (see the sketch below)
Problem: hard to distinguish outliers from informative examples
Another problem: the class priors in the growing labeled set drift away from those of the initial set
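A minimal sketch of the vote-entropy idea (my illustration; the committee size of 7 and batch size of 20 are arbitrary, and integer class labels 0..k-1 are assumed):

```python
# Sketch: query-by-committee selection via vote entropy.
# Assumes integer class labels 0..k-1 and existing X_labeled, y_labeled, X_unlabeled.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

committee = []
for seed in range(7):                       # 7 bootstrap replicates of the labeled data
    Xb, yb = resample(X_labeled, y_labeled, random_state=seed)
    committee.append(DecisionTreeClassifier(random_state=seed).fit(Xb, yb))

votes = np.array([m.predict(X_unlabeled) for m in committee])   # shape (7, n_unlabeled)
n_classes = len(np.unique(y_labeled))

def vote_entropy(column):
    """Entropy of the committee's vote distribution for one example."""
    p = np.bincount(column, minlength=n_classes) / len(column)
    p = p[p > 0]
    return -(p * np.log(p)).sum()

entropy = np.apply_along_axis(vote_entropy, 0, votes)
query = np.argsort(entropy)[::-1][:20]      # most-disagreed-upon examples to hand-label
```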

NOTE about Usefulness
If only a very small amount of labeled data is available, then adding in even very confident predictions will add noise
If a huge amount of labeled data is available, adding in predictions on unlabeled data won't be that useful
Take-home message: semi-supervised learning is only useful if you have a "medium" amount of labeled data

Modified Semi-Supervised Learning

Combining Active and Semi-Supervised Learning

Project Advice

What Makes a Good Paper
A good example is the Arguello & Rosé 2006 paper in the Course Documents folder on Blackboard!
Argue why the problem is important, what your big result is, and what you learned about your application area
 - Ideally there should be something clever about your approach
Explain prior work, where previous approaches "went wrong", and where your approach fits
 - It is best to have a baseline from previous work; you might need a broad definition of prior work (there is *always* relevant prior work)
 - You need to show respect for and awareness of what came before you and where you fit into the larger community
   - Here prior work will be in multiple areas: your own application area, core machine learning work, and possibly computational linguistics work

More about What Makes a Good Paper
Summarize your approach
 - Here you may give a lengthy analysis of your data: how you decided to code it, how you determined that your coding was reliable, why your coding was meaningful
 - Describe your experimentation process; I need to be able to see that you are aware of and used proper methodology
 - What was your final best approach?

More about What Makes a Good Paper
Justify your evaluation metrics, corpus, gold standard, and baselines
 - You need to say enough to give the reader confidence that you know what you're doing and that your evaluation is valid
 - If possible, evaluate both the baseline approach and your own on the data set used in the baseline's previous publication, and possibly on an additional one

More about What Makes a Good Paper
Present your results
 - Make sure you evaluate your work both in terms of typical metrics against a baseline and in a task-specific manner
   - Which errors affect users more? How would your classifier be used?
Discuss your error analysis
 - What did you learn about your domain from doing this?
 - What are your current directions (based on your error analysis)?

What Numbers Should You Report?
Evaluation methodology is a matter of taste
You should make your evaluation comparable to other recent published results
 - Evaluation standards change over time!
 - How you do your evaluation is a big part of what determines the quality of your work
 - High-quality work explains the reasons for its evaluation methodology

Debates Over Evaluation Metrics
F-measure hides trade-offs between precision and recall
F-measure, precision, and recall obscure agreement by chance
 - Less of an issue if everyone is comparing on the same data set
F-measure ignores the false alarm rate
 - The false alarm rate asks: of the total number of errors you could make, how many do you actually make?
 - In some cases there are not many errors of commission you can make, so F-measure makes the classifier look better than it is
F-measure might be overly harsh for some tasks
 - In topic segmentation, it punishes errors that are close just as much as errors that are way off
Graphs show trade-offs between classifiers more clearly; they are especially useful when classifiers can be tuned
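For reference, a small sketch (mine, not from the lecture) of how precision, recall, F-measure, and the false alarm rate fall out of a binary confusion matrix; note that F-measure never looks at the true-negative cell, which is exactly the false-alarm information it ignores.

```python
# Sketch: precision, recall, F-measure, and false alarm rate for a binary task.
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support

# y_true, y_pred assumed to exist, with 1 = positive class and 0 = negative.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary", pos_label=1
)
false_alarm_rate = fp / (fp + tn)   # false positives out of all actual negatives

print(f"precision {precision:.3f}  recall {recall:.3f}  F1 {f1:.3f}")
print(f"false alarm rate {false_alarm_rate:.3f}  (the TN cell that F1 never sees)")
```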

Reporting Results
Remember to use the measures that are standard for the problem you are working on
Don't just give a dump from the Weka interface; summarize the comparison in a table
Discuss the significance of the difference between approaches
*Always* have a baseline for comparison
Report the significance of the difference (this shows that you tested on enough data)
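One common way to report the significance of the difference between a new approach and the baseline (a hedged sketch; the lecture does not prescribe a particular test) is a paired t-test over scores from matched cross-validation folds:

```python
# Sketch: paired t-test over matched cross-validation folds, comparing a new
# approach against the baseline and reporting a p-value alongside the table.
from scipy.stats import ttest_rel
from sklearn.model_selection import KFold, cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# X, y assumed to exist; the same fold object makes the two score lists paired.
folds = KFold(n_splits=10, shuffle=True, random_state=0)
baseline_scores = cross_val_score(MultinomialNB(), X, y, cv=folds)
new_scores = cross_val_score(LinearSVC(), X, y, cv=folds)

t_stat, p_value = ttest_rel(new_scores, baseline_scores)
print(f"baseline {baseline_scores.mean():.3f}   new {new_scores.mean():.3f}   p = {p_value:.3f}")
```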

Last-Minute Project Tips
Don't forget that process is more important than product!
Talk about what you learned from doing this