Special Topics in Educational Data Mining, HUDK5199, Spring term 2013. February 27, 2013
Today’s Class: Feature Engineering and Distillation - How
Special Rule for Today: Everyone who turned in a homework participates.
Let’s go back to the list of features from the last class. As I read features off, if you used this feature (or something very similar), raise your hand.
For the features someone used: Did it end up in your final model? Does this match the class’s overall intuition?
List other features: I’d like everyone who turned in Assignment 4 to tell me all of your other features that ended up in final models.
We now have… a list of features that ended up being used in models.
So let’s… go through how several of them were created. Actually do it: re-create it in real time, or show us your code. Everyone who turned in the homework will show the class at least one feature. No one can show a second feature until everyone has had a chance to show at least one.
Comments? Questions?
What tools were used? Did anyone use any additional tools? How else could you have created features?
Now let’s… Make a supermodel!
Comments or Questions About Assignment 5?
Final Thoughts?
If you enjoyed today’s class… At some point in the next 2 years, we’ll be offering a Feature Engineering Design Studio course…
Next Class: Monday, March 4. Automated Feature Creation and Selection. Assignment Due: None.
Excel: The plan is to go as far as we can by 5pm. We will continue after the next class session. Vote on which topics you most want to hear about.
Topics
- Using average, count, sum, stdev (Assignment 4 data set)
- Relative and absolute referencing (made-up data)
- Copy and paste values only (made-up data)
- Using sort, filter (Assignment 4 data set)
- Using countif (Assignment 4 data set)
- Making scatterplot (Jan. 28 class data set)
- Making histogram (Assignment 4 data set)
- Z-test (made-up data)
- 2-sample t-test (made-up data)
- Other topics?
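For anyone who wants to check spreadsheet results outside Excel, here is a minimal Python sketch of a few of the topics above: the average, count, sum, stdev, and countif calculations, plus the test statistics for a z-test and a 2-sample t-test. It is an illustration under stated assumptions, not the method used in class. The numbers are made up (they are not the Assignment 4 data set), and the helper names `countif`, `z_statistic`, and `welch_t_statistic` are hypothetical. The t statistic shown is the unequal-variance (Welch) form, which is only one of the 2-sample variants.

```python
import math
import statistics

# Made-up values for illustration only (NOT the Assignment 4 data set).
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

average = statistics.mean(values)     # comparable to Excel's AVERAGE
count = len(values)                   # comparable to COUNT on a numeric range
total = sum(values)                   # comparable to SUM
sample_sd = statistics.stdev(values)  # sample standard deviation (n - 1), like STDEV


def countif(cells, predicate):
    """Hypothetical helper: count the cells that satisfy a condition."""
    return sum(1 for cell in cells if predicate(cell))


n_above_4 = countif(values, lambda v: v > 4)


def z_statistic(sample, mu0, sigma):
    """z statistic for a one-sample test with a known population sigma."""
    standard_error = sigma / math.sqrt(len(sample))
    return (statistics.mean(sample) - mu0) / standard_error


def welch_t_statistic(a, b):
    """t statistic for two independent samples, unequal variances assumed."""
    var_a = statistics.variance(a)  # sample variance (n - 1)
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


# Assumed null-hypothesis mean (4.0) and known sigma (2.0), chosen for the example.
z = z_statistic(values, mu0=4.0, sigma=2.0)
t = welch_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])

print(average, count, total, round(sample_sd, 3), n_above_4)
print(round(z, 3), round(t, 3))
```

The sketch stops at the test statistics. Turning them into p-values requires the normal or t distribution, which Excel's built-in test functions handle for you.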
The End