Core Methods in Educational Data Mining

Presentation transcript:

Core Methods in Educational Data Mining EDUC691 Spring 2019

The Homework Let's go over Basic Homework 1

The Homework Let's go over Basic Homework 1 Who did the assignment in Python? Who did the assignment in RapidMiner?

RapidMiner folks How well did you succeed in making the tool work? What were some of the biggest challenges?

Python folks How well did you succeed in making the tool work? What were some of the biggest challenges?

Did it make a difference? When you ran Decision Tree/W-J48 with and without student as a variable in the data set? What was the difference?

Did it make a difference? When you ran Decision Tree/W-J48 with and without student as a variable in the data set? What was the difference? Why might RapidMiner and Python produce different results for this?

Removing student from the model How did you remove student from the model? There were multiple ways to accomplish this
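One straightforward way, sketched here in Python with pandas (the toy data and column names are hypothetical, not from the assignment), is to drop the student identifier before building the feature matrix:

```python
import pandas as pd

# Hypothetical toy data set; 'student' is an identifier we do not
# want the classifier to use as a predictor
df = pd.DataFrame({
    "student": ["s1", "s1", "s2", "s2"],
    "feature": [0.2, 0.4, 0.6, 0.8],
    "label":   [0, 0, 1, 1],
})

# Drop the identifier (and the label) from the feature matrix
X = df.drop(columns=["student", "label"])
y = df["label"]
print(list(X.columns))  # the model never sees 'student'
```

In RapidMiner the equivalent move is deselecting the attribute or marking it as an id role rather than a regular attribute.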

How would you know… If you were over-fitting to student? Or any variable, for that matter?
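One way to check, sketched below with scikit-learn on synthetic data (all names and values illustrative), is to compare row-level cross-validation against cross-validation that holds out whole students. If accuracy drops sharply when whole students are held out, the model is probably over-fitting to student:

```python
import numpy as np
from sklearn.model_selection import GroupKFold, KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = rng.integers(0, 2, size=40)
students = np.repeat(np.arange(8), 5)  # 8 students, 5 rows each

clf = DecisionTreeClassifier(random_state=0)

# Row-level CV: rows from the same student can land in both
# the training and the test fold
row_scores = cross_val_score(clf, X, y, cv=KFold(n_splits=4))

# Student-level CV: whole students are held out together
student_scores = cross_val_score(
    clf, X, y, cv=GroupKFold(n_splits=4), groups=students)

print(row_scores.mean(), student_scores.mean())
```

The same trick works for any grouping variable you suspect the model is latching onto: pass that variable as the groups.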

What are some variables… That could cause your model not to apply to new data sets you might be interested in? Student is one example… what else?

Did it make a difference? What happens when you turn on cross-validation?

Questions? Comments? Concerns?

How are you liking RapidMiner and Python?

Other RapidMiner or Python questions?

Note… Python and RapidMiner have different sets of algorithms available. Python's set tends to be more recent, but it's not totally clear they are *better*. We'll come back to this when we discuss Hand

What is the difference between a classifier and a regressor?
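In short: a classifier predicts a categorical label, a regressor predicts a number. A minimal scikit-learn sketch on toy data (values purely illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y_cat = np.array([0, 0, 1, 1])           # categorical label -> classifier
y_num = np.array([2.5, 3.1, 4.8, 5.2])   # numeric label -> regressor

clf = DecisionTreeClassifier(random_state=0).fit(X, y_cat)
reg = DecisionTreeRegressor(random_state=0).fit(X, y_num)

print(clf.predict([[3.5]]))  # one of the discrete classes
print(reg.predict([[3.5]]))  # a continuous value
```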

What are some things you might use a classifier for, in education? Bonus points for examples other than those in the BDE videos

Any questions about any classification algorithms?

Do folks feel like they understood logistic regression? Any questions?

Logistic Regression m = 0.5A - B + C

[Slide series works through a table with columns A, B, C, M, P(M), filling in sample values step by step; most of the table did not survive transcription, apart from M values such as 1, 4, 100, and -100]
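The slide's model maps m onto a probability through the logistic function, P(M) = 1 / (1 + e^(-m)). A minimal sketch of how m and P(M) relate, using the slide's equation:

```python
import math

def p_of_m(a, b, c):
    """Logistic regression from the slide: m = 0.5A - B + C,
    then P(M) = 1 / (1 + e^-m)."""
    m = 0.5 * a - b + c
    return 1.0 / (1.0 + math.exp(-m))

print(round(p_of_m(0, 0, 0), 3))    # m = 0    -> P(M) = 0.5
print(round(p_of_m(0, 0, 100), 3))  # m = 100  -> P(M) ~ 1.0
print(round(p_of_m(0, 100, 0), 3))  # m = -100 -> P(M) ~ 0.0
```

The key intuition: m can be any real number, but P(M) is squashed into (0, 1), saturating quickly for large positive or negative m.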

Why would someone Use a decision tree rather than, say, logistic regression?
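One common answer: a decision tree can capture interactions and non-linear splits that a plain logistic regression (without hand-built interaction terms) cannot. A synthetic XOR-style sketch (data invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# XOR-style interaction: neither feature predicts the label on its own
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]] * 10)
y = (X[:, 0] != X[:, 1]).astype(int)

tree_acc = DecisionTreeClassifier(random_state=0).fit(X, y).score(X, y)
lr_acc = LogisticRegression().fit(X, y).score(X, y)

print(tree_acc)  # the tree can represent the interaction exactly
print(lr_acc)    # no linear decision boundary separates XOR
```

Trees are also easy to read off as rules, which matters when stakeholders need to understand the model.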

Has anyone used any classification algorithms outside the set discussed/recommended in the videos? Say more?

Other questions, comments, concerns about lectures?

Did anyone read the Hand article? Thoughts?

What is Hand’s main thesis?

What is Hand’s main thesis? Who thinks it makes sense? Who thinks he’s probably wrong?

What is Hand’s main thesis? Who thinks it makes sense? Who thinks he’s probably wrong? Please present arguments in favor of each perspective

If he is wrong Why do simple algorithms work well for many problems?

If he is right Why have some algorithms like recurrent neural networks become so popular?

If he is right Why have some algorithms like recurrent neural networks become so popular? Note that many of the key successes have been on very large-scale data sets, like voice recognition

One of Hand's key arguments The data points a model is trained on are not usually drawn from the same distribution as the data points where the classifier will be applied

One of Hand's key arguments The data points a model is trained on are not usually drawn from the same distribution as the data points where the classifier will be applied Is this a plausible argument for educational data mining?

One of Hand's key arguments The data points a model is trained on are not usually drawn from the same distribution as the data points where the classifier will be applied Is this a plausible argument for large-scale voice recognition technology?

Another of Hand's key arguments The data points a model is trained on are often treated as certainly true and objective, but they are often arbitrary and uncertain

Another of Hand's key arguments The data points a model is trained on are often treated as certainly true and objective, but they are often arbitrary and uncertain Is this a plausible argument for educational data mining?

Another of Hand's key arguments The data points a model is trained on are often treated as certainly true and objective, but they are often arbitrary and uncertain Is this a plausible argument for large-scale speech recognition?

Note Hand refers to these issues as over-fitting, but they are a specific type of over-fitting, relevant to some problems and not to others, and different from the common idea that over-fitting comes from limited data

Another of Hand's key arguments Researchers and practitioners usually do best when working with an algorithm they know very well. Therefore, more recent algorithms win competitions because those are the algorithms the researchers know best and want to prove are better

Momentary digression Who here is familiar with data competitions like the KDD Cup, Kaggle competitions, and ASSISTments Longitudinal Challenge?

Some counter-evidence to Hand Recent algorithms win a lot of data mining competitions these days (where lots of people are trying their best)

Some counter-evidence to Hand Recent algorithms win a lot of data mining competitions these days (where lots of people are trying their best) Those of you who like Hand, how would you respond to this?

Some counter-evidence to Hand One possible rejoinder: These are usually well-defined problems where the training set and eventual test set resemble each other a lot

Another practical question

Should you Pick one algorithm that seems really appropriate? Run every algorithm that will actually run for your data? Something in between?

My typical lab practice Pick a small number of algorithms that have worked on past similar problems and that fit different kinds of patterns from each other

Is it really the algorithm? Or is it the data you put into it? We’ll come back to this in the Feature Engineering lecture in a month

Questions? Comments?

Creative HW 1

Questions about Creative HW 1?

Questions? Concerns?

Other questions or comments?

Next Class February 13 Behavior Detection

Baker, R.S. (2015) Big Data and Education. Ch. 1, V5; Ch. 3, V1, V2.

Sao Pedro, M.A., Baker, R.S.J.d., Gobert, J., Montalvo, O., Nakama, A. (2013) Leveraging Machine-Learned Detectors of Systematic Inquiry Behavior to Estimate and Predict Transfer of Inquiry Skill. User Modeling and User-Adapted Interaction, 23 (1), 1-39.

Kai, S., Paquette, L., Baker, R.S., Bosch, N., D'Mello, S., Ocumpaugh, J., Shute, V., Ventura, M. (2015) A Comparison of Face-based and Interaction-based Affect Detectors in Physics Playground. Proceedings of the 8th International Conference on Educational Data Mining, 77-84.

Creative HW 1 due

The End