Published by Vanessa Dawson. Modified over 9 years ago.
1
Core Methods in Educational Data Mining HUDK4050 Fall 2014
2
Demo of using Java
3
Activity
4
Second task
Break into *different* 3-4 person groups than last time
No overlap allowed
5
Second task Let’s take a quick look at homework C2
6
Second task
Make up features for Assignment C2
You need to
– Come up with a new feature
– Justify how you would create it from the data set
– Justify why it would work
7
I need a volunteer
8
Your task is to write down the features suggested, and the counts for thumbs up/thumbs down
9
Now…
Each group needs to read their favorite feature to the class and justify it
Who thinks this feature will improve prediction of off-task behavior? Who doesn’t?
Thumbs up, thumbs down!
10
Questions or comments?
11
Special Request
Bring a print-out of your Assignment C2 solution to class on the day it’s due
– Next Tuesday
12
Textbook
13
Automated Feature Generation What are the advantages of automated feature generation, as compared to feature engineering? What are the disadvantages?
14
Automated Feature Selection
What are the advantages of automated feature selection, as compared to having a domain expert decide? (as in the Sao Pedro paper from Monday)
What are the disadvantages?
15
A connection to make
16
Correlation filtering
Eliminating collinearity in statistics
In this case, increasing interpretability and reducing over-fitting go together
– At least to some positive degree
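To make the idea of correlation filtering concrete, here is a minimal sketch in Python. It assumes a simple greedy rule: walk through the features in order and keep a feature only if its absolute Pearson correlation with every already-kept feature is below a threshold. The feature names, the data values, and the 0.9 threshold are all made up for illustration; they are not from the course data set.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_filter(features, threshold):
    """Keep a feature only if it is not too correlated with any kept feature.

    `features` maps a feature name to its list of values; features are
    considered in dictionary order, so earlier features win ties.
    """
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical features: the second is the first in different units,
# so the two are perfectly collinear.
features = {
    "time_on_step": [1.0, 2.0, 3.0, 4.0, 5.0],
    "time_on_step_ms": [1000.0, 2000.0, 3000.0, 4000.0, 5000.0],
    "hints_requested": [2.0, 0.0, 1.0, 0.0, 2.0],
}

kept = correlation_filter(features, threshold=0.9)
print(kept)
```

Which member of a collinear pair survives depends on the ordering here; a real filter might instead prefer the feature that is easier to interpret or more strongly related to the predicted variable.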
17
Outer-loop forward selection What are the advantages and disadvantages to doing this?
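For discussion purposes, forward selection can be sketched as a greedy loop: start with no features, try adding each remaining feature, keep the one that improves the model's score the most, and stop when nothing improves it. The sketch below assumes a caller-supplied `score_fn`; in practice this would typically be something like cross-validated model performance, but here `toy_score` is a made-up stand-in with invented feature names and gains so that the loop can run on its own.

```python
def forward_selection(candidates, score_fn):
    """Greedy forward selection over a list of candidate feature names.

    `score_fn(subset)` must return a number where higher is better.
    Returns the selected features (in the order added) and the final score.
    """
    selected = []
    best_score = score_fn(selected)
    while True:
        # Score every one-feature extension of the current subset.
        trials = [(score_fn(selected + [f]), f)
                  for f in candidates if f not in selected]
        if not trials:
            break
        trial_score, trial_feature = max(trials)
        if trial_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(trial_feature)
        best_score = trial_score
    return selected, best_score


def toy_score(subset):
    """Hypothetical score: fixed gain per feature minus a complexity penalty."""
    gains = {"pauses": 0.30, "hints": 0.20, "noise": 0.0}
    return sum(gains[f] for f in subset) - 0.05 * len(subset)


selected, score = forward_selection(["pauses", "hints", "noise"], toy_score)
print(selected, round(score, 2))
```

The loop calls `score_fn` once per candidate per round, which hints at one cost of the approach: with a real model, each call means refitting and re-evaluating.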
18
Knowledge Engineering What is knowledge engineering?
19
Knowledge Engineering What is the difference between knowledge engineering and EDM?
20
Knowledge Engineering What is the difference between good knowledge engineering and bad knowledge engineering?
21
Knowledge Engineering What is the difference between (good) knowledge engineering and EDM? What are the advantages and disadvantages of each?
22
How can they be integrated?
23
FCBF: What variables will be kept? (Cutoff = 0.65)
What variables emerge from this table?
Columns: G, H, I, J, K, L, Predicted
G: .7 .8 .4 .3 .72
H: .8 .7 .6 .5 .38
I: .8 .3 .4 .82
J: .8 .1 .75
K: .5 .65
L: .42
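One way to work through an exercise like this is a small script. The sketch below assumes a simplified, correlation-based reading of FCBF: repeatedly keep the remaining variable most strongly correlated with the predicted variable, then discard every remaining variable whose correlation with it is at or above the cutoff. The original FCBF algorithm uses symmetric uncertainty rather than plain correlation, and whether a value exactly at the cutoff is dropped is an assumption here. The variables A-D and all of their correlations are made up and are not the values from the slide.

```python
def fcbf_style_filter(target_corr, pair_corr, cutoff):
    """Greedy FCBF-style filter on precomputed correlations.

    target_corr: variable name -> correlation with the predicted variable
    pair_corr:   frozenset({a, b}) -> correlation between variables a and b
    cutoff:      variables correlated at or above this with a kept variable
                 are discarded (assumed convention)
    """
    # Consider variables from most to least correlated with the target.
    remaining = sorted(target_corr, key=lambda v: abs(target_corr[v]), reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Drop everything too similar to the variable just kept.
        remaining = [v for v in remaining
                     if abs(pair_corr[frozenset((best, v))]) < cutoff]
    return kept


# Hypothetical correlations for four made-up variables.
target_corr = {"A": 0.60, "B": 0.55, "C": 0.40, "D": 0.30}
pair_corr = {
    frozenset(("A", "B")): 0.90,
    frozenset(("A", "C")): 0.20,
    frozenset(("A", "D")): 0.10,
    frozenset(("B", "C")): 0.30,
    frozenset(("B", "D")): 0.70,
    frozenset(("C", "D")): 0.80,
}

kept = fcbf_style_filter(target_corr, pair_corr, cutoff=0.65)
print(kept)
```

In this made-up example B is discarded even though it is the second-best predictor on its own, because it is nearly redundant with A.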
24
Other questions, comments, concerns about textbook?
25
If you enjoyed today’s class…
Next fall, I’ll be offering a Feature Engineering Design Studio course…
– Learn the feature engineering process in detail
– Create a model important to your research
– Submit a journal paper
26
Special Session
Thursday 9/24, 3pm-4:30pm, Grace Dodge Hall 545
An Inappropriately Brief Introduction to Frequentist Statistics
27
What if you can’t attend? Email me; I will send you the slides
28
Should you attend?
Not mandatory
Not necessary if you’ve taken a stats class that covers topics like Z, F, and Chi-squared tests
29
Next Class
Tuesday, September 29
Advanced Detector Evaluation and Validation
Baker, R.S. (2015) Big Data and Education. Ch. 2, V5, V6.
Rosenthal, R., Rosnow, R.L. (1991) Essentials of Behavioral Research: Methods and Data Analysis, 2nd edition. Ch. 22: Meta-Analysis.
Rupp, A.A., Gushta, M., Mislevy, R.J., Shaffer, D.W. (2010) Evidence-Centered Design of Epistemic Games: Measurement Principles for Complex Learning Environments. The Journal of Technology, Learning, and Assessment, 8 (4), 4-47.
30
The End