Core Methods in Educational Data Mining

Core Methods in Educational Data Mining EDUC691 Spring 2019

Feature Engineering
Not just throwing spaghetti at the wall and seeing what sticks

Construct Validity Matters!
Crap features will give you crap models
Crap features = reduced generalizability/more over-fitting
Nice discussion of this in the Sao Pedro paper

What’s a good feature?
A feature that is potentially meaningfully linked to the construct you want to identify

Assignment C2

What tools did you use?
Packages (e.g., Excel, Python)
Features of Packages (e.g., Pivot Tables)
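As one illustration of the kind of tool use this slide asks about, here is a minimal sketch of a pivot-table-style aggregation in plain Python. The log rows, the column names (`student`, `action`, `seconds`), and the two derived features are all hypothetical, made up for illustration rather than taken from the assignment data.

```python
from collections import defaultdict

# Hypothetical interaction log: one row per student action.
log = [
    {"student": "s1", "action": "attempt", "seconds": 12.0},
    {"student": "s1", "action": "hint", "seconds": 3.0},
    {"student": "s1", "action": "attempt", "seconds": 9.0},
    {"student": "s2", "action": "attempt", "seconds": 30.0},
    {"student": "s2", "action": "hint", "seconds": 2.0},
    {"student": "s2", "action": "hint", "seconds": 4.0},
]


def student_features(rows):
    """Aggregate action-level rows into one feature dict per student."""
    totals = defaultdict(lambda: {"n": 0, "seconds": 0.0, "hints": 0})
    for row in rows:
        t = totals[row["student"]]
        t["n"] += 1
        t["seconds"] += row["seconds"]
        if row["action"] == "hint":
            t["hints"] += 1
    return {
        student: {
            # Mean time per action, in seconds.
            "mean_seconds": t["seconds"] / t["n"],
            # Fraction of the student's actions that were hint requests.
            "hint_rate": t["hints"] / t["n"],
        }
        for student, t in totals.items()
    }


features = student_features(log)
```

Whether features like these are any good depends on whether they plausibly relate to the construct being modeled, not on how easy they were to compute.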

Let’s…
Form groups of 3
What features did you generate?
How did you generate them?
Did it end up in your final model? In what direction?
Choose 2 features per group that are the coolest/most interesting/most novel
Be ready to share with rest of class
Does this match the class’s overall intuition?

Let’s…
Go through how you created some features
Actually do it… Re-create it in real-time, or show us your code…
We’ll have multiple volunteers
One feature per customer, please…
Does this match the class’s overall intuition?

Was feature engineering beneficial?

What was your process for feature engineering for CA2?
How did you decide what features to create?

Baker’s feature engineering process
1. Brainstorming features
2. Deciding what features to create
3. Creating the features
4. Studying the impact of features on model goodness
5. Iterating on features if useful
6. Go to 3 (or 1)

What’s useful?
1. Brainstorming features
2. Deciding what features to create
3. Creating the features
4. Studying the impact of features on model goodness
5. Iterating on features if useful
6. Go to 3 (or 1)

What’s missing?
1. Brainstorming features
2. Deciding what features to create
3. Creating the features
4. Studying the impact of features on model goodness
5. Iterating on features if useful
6. Go to 3 (or 1)

How else could it be improved?

IDEO tips for Brainstorming
1. Defer judgment
2. Encourage wild ideas
3. Build on the ideas of others
4. Stay focused on the topic
5. One conversation at a time
6. Be visual
7. Go for quantity
http://www.openideo.com/fieldnotes/openideo-team-notes/seven-tips-on-better-brainstorming

Your thoughts?

Deciding what features to create
Trade-off between the effort to create a feature and how likely it is to be useful
Worth biasing in favor of features that are different from anything else you’ve tried before
These explore a different part of the space

General thoughts about feature engineering?

Automated Feature Generation
What are the advantages of automated feature generation, as compared to feature engineering?
What are the disadvantages?

Automated Feature Selection
What are the advantages of automated feature selection, as compared to having a domain expert decide? (as in Sao Pedro paper from Monday)
What are the disadvantages?

A connection to make
Correlation filtering
Eliminating collinearity in statistics
In this case, increasing interpretability and reducing over-fitting go together
At least to some positive degree
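To make the idea of correlation filtering concrete, here is a minimal sketch in plain Python. It assumes one simple variant: walk through the features in order and keep a feature only if its absolute Pearson correlation with every already-kept feature is below a cutoff. The feature names, the data, and the 0.8 cutoff are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # A constant feature has no defined correlation; treat as 0.
    return cov / (sd_x * sd_y)


def correlation_filter(features, cutoff=0.8):
    """Keep each feature only if it is not too correlated with any kept one."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < cutoff for k in kept):
            kept.append(name)
    return kept


# Hypothetical features: the second is just the first in different units.
features = {
    "time_on_task_s": [1, 2, 3, 4, 5],
    "time_on_task_ms": [1000, 2000, 3000, 4000, 5000],
    "hints_used": [3, 1, 4, 1, 5],
}
kept = correlation_filter(features, cutoff=0.8)
```

Note that this variant depends on feature order: whichever member of a collinear pair comes first is the one that survives.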

Outer-loop forward selection
What are the advantages and disadvantages to doing this?
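For discussion purposes, here is a minimal sketch of forward selection, assuming "outer-loop" means that the selection loop wraps around model fitting and evaluation: at each step, every remaining feature is tried, the one that most improves model goodness is added, and the loop stops when nothing helps. The `score` callable stands in for "fit a model on this feature subset and return its cross-validated goodness"; the toy scorer and feature names below are hypothetical.

```python
def forward_select(candidates, score, min_gain=1e-9):
    """Greedy forward selection around an arbitrary scoring function.

    score(subset) is assumed to fit and evaluate a model on the given
    list of features and return a goodness value (higher is better).
    """
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        # Try adding each remaining feature to the current set.
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda t: t[0])
        if top_score - best <= min_gain:
            break  # No candidate improves the model; stop.
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Hypothetical stand-in for model fitting: each feature adds a fixed
# amount of goodness, and every feature costs a small complexity penalty.
VALUE = {"pause_length": 0.10, "hints_used": 0.06, "random_noise": 0.0}


def toy_score(subset):
    return 0.5 + sum(VALUE[f] for f in subset) - 0.01 * len(subset)


selected, best = forward_select(VALUE, toy_score)
```

With a real model, each call to `score` is a full fit-and-evaluate cycle, so the procedure costs on the order of the square of the number of candidate features in model fits, and the repeated peeking at goodness is itself a route to over-fitting.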

Knowledge Engineering
What is knowledge engineering?

Knowledge Engineering
What is the difference between knowledge engineering and EDM?

Knowledge Engineering
What is the difference between good knowledge engineering and bad knowledge engineering?

Knowledge Engineering
What is the difference between (good) knowledge engineering and EDM?
What are the advantages and disadvantages of each?

How can they be integrated?

FCBF: What variables will be kept? (Cutoff = 0.65)
What variables emerge from this table?
Table columns: G, H, I, J, K, L, Predicted
Table values (cell positions lost in the transcript): .7 .8 .4 .3 .72 .6 .5 .38 .82 .1 .75 .65 .42
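The exercise above can be worked by hand, but the procedure can also be sketched in code. This is a simplified, correlation-based version and rests on several assumptions: features are ranked by absolute correlation with the predicted variable, the best remaining feature is kept, and every other feature whose correlation with it is at or above the cutoff is dropped. The published FCBF algorithm uses symmetrical uncertainty rather than Pearson correlation. The features A to D and all correlation values below are hypothetical and are not the table from the slide.

```python
def fcbf_style_filter(target_corr, pair_corr, cutoff=0.65):
    """Simplified FCBF-style selection from precomputed correlations.

    target_corr: feature -> correlation with the predicted variable.
    pair_corr:   frozenset({f1, f2}) -> correlation between two features.
    """
    # Rank features from most to least correlated with the target.
    remaining = sorted(target_corr, key=lambda f: abs(target_corr[f]),
                       reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Drop everything too correlated with the feature just kept.
        remaining = [f for f in remaining
                     if abs(pair_corr[frozenset((best, f))]) < cutoff]
    return kept


# Hypothetical correlations, for illustration only.
target_corr = {"A": 0.90, "B": 0.85, "C": 0.50, "D": 0.30}
pair_corr = {
    frozenset(("A", "B")): 0.70,
    frozenset(("A", "C")): 0.20,
    frozenset(("A", "D")): 0.10,
    frozenset(("B", "C")): 0.30,
    frozenset(("B", "D")): 0.40,
    frozenset(("C", "D")): 0.90,
}
kept = fcbf_style_filter(target_corr, pair_corr, cutoff=0.65)
```

In this made-up example, B is dropped despite being the second-best predictor, because it is too correlated with A; that is the trade the method makes.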

Other questions, comments, concerns about the textbook?

Next Class
Clustering
Wednesday, March 20
Baker, R.S. (2018) Big Data and Education. Ch. 7, V1, V2, V3, V4, V5.
Bowers, A.J. (2010) Analyzing the Longitudinal K-12 Grading Histories of Entire Cohorts of Students: Grades, Data Driven Decision Making, Dropping Out and Hierarchical Cluster Analysis. Practical Assessment, Research & Evaluation (PARE), 15(7), 1-18.
Lee, J., Recker, M., Bowers, A.J., Yuan, M. (2016). Hierarchical Cluster Analysis Heatmaps and Pattern Analysis: An Approach for Visualizing Learning Management System Interaction Data. Poster presented at the annual International Conference on Educational Data Mining (EDM).

The End