Special Topics in Educational Data Mining HUDK5199 Spring term, 2013 January 23, 2013.

Wow Welcome! There’s a lot of you – It’s great to see so much interest in EDM

Administrative Stuff Is everyone signed up for class? If not, and you want to receive credit, please talk to me after class

Class Schedule

Updated versions will be available on the course webpage PDF files are also available there for publicly available readings Other readings will be made available in a course Dropbox

Class Schedule After I made the schedule – Two days a week, 100 minutes per session I was told that I’m teaching double the content of a typical class at TC Oops! – Newbie mistake!

My Solution I have cut the course schedule to be only 50% more content than usual These cuts also coincide with when I need to travel to the LAK, CREA, and AERA conferences – Might as well solve two problems at once… See the course schedule…

Required Texts Witten, I.H., Frank, E. (2011) Data Mining: Practical Machine Learning Tools and Techniques.

Readings This is a graduate class I expect you to decide what is crucial for you And what you should skim to be prepared for class discussion and for when you need to know it in 8 years

Readings That said

Readings and Participation It is expected that you come to class – I will not be taking attendance It is expected that you be prepared for class by skimming the readings to the point where you can participate effectively in class discussion – I will not be giving quizzes This is your education, make the most of it!

Course Goals This course covers methods from the emerging area of educational data mining. You will learn how to execute these methods in standard software packages And the limitations of existing implementations of these methods. Equally importantly, you will learn when and why to use these methods.

Course Goals Discussion of how EDM differs from more traditional statistical and psychometric approaches will be a key part of this course In particular, we will study how many of the same statistical and mathematical approaches are used in different ways in these research communities.

Assignments There will be 10 homeworks You choose 8 of them to complete – 4 from the first 5 (i.e., HW 1-5) – 4 from the second 5 (i.e., HW 6-10)

Assignments Homeworks will be due at least 3 hours before the beginning of class (i.e., noon) on the due date Since you have a choice of homeworks, extensions will only be granted for instructor error or extreme circumstances – Outside of these situations, late = 0 credit Homeworks will be due before the class session where their topic is discussed

Why? These are not your usual homeworks Most homework is assigned after the topic is discussed in class, to reinforce what is learned This homework is due before the topic is discussed in class, to enable us to talk more concretely about the topic in class

These homeworks These homeworks will not require flawless, perfect execution They will require personal discovery and learning from text resources Giving you a base to learn more from class discussion

Because of that You must be prepared to discuss your work in class You do not need to create slides But be prepared – to have your assignment projected – to discuss aspects of your assignment in class

I’m not your textbook I want you to learn what you can from the readings and homework And then we’ll leverage my experience in discussing the issues the readings and homeworks bring forth

Homework All assignments for this class are individual assignments – You must turn in your own work – It cannot be identical to another student’s work – The goal is to get diverse solutions we can discuss in class However, you are welcome to discuss the readings or technical details of the assignments with each other

Examples Buford can’t figure out the UI for the software tool. Alpharetta helps him with the UI. – OK! Deanna is struggling to understand the item parameter in PFA to set up the mathematical model. Carlito explains it to her. – OK!

Examples Fernando and Evie do the assignment together from beginning to end, but write it up separately. – Not OK Giorgio and Hannah do the assignment separately, but discuss their (fairly different) approaches over lunch – OK!

Plagiarism and Cheating: Boilerplate Slide Don’t do it If you have any questions about what it is, talk to me before you turn in an assignment that involves either of these University regulations will be followed to the letter That said, I am not really worried about this problem in this class

Grading 8 of 10 Assignments – 10% each (up to a maximum of 80%) Class participation 20% PLUS: For every homework, there will be a special bonus of 20% for the best hand‐in. “Best” will be defined in each assignment.

Examinations None

Accommodations for Students with Disabilities See syllabus and then see me

Questions Any questions on the syllabus, schedule, or administrative topics?

Who are you And why are you here? What kind of methods do you use in your research/work? What kind of methods do you see yourself wanting to use in the future?

This Class

EDM “Educational Data Mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students, and the settings which they learn in.” (

EDM is… “… escalating the speed of research on many problems in education.” “Not only can you look at unique learning trajectories of individuals, but the sophistication of the models of learning goes up enormously.” Arthur Graesser, Editor, Journal of Educational Psychology

EDM is… “… great.” Me

Types of EDM method (Baker & Yacef, 2009)
Prediction
– Classification
– Regression
– Density estimation
Clustering
Relationship mining
– Association rule mining
– Correlation mining
– Sequential pattern mining
– Causal data mining
Distillation of data for human judgment
Discovery with models

Types of EDM method (Baker & Siemens, in preparation)
Prediction
– Classification
– Regression
– Latent Knowledge Estimation
Structure Discovery
– Clustering
– Factor Analysis
– Domain Structure Discovery
– Network Analysis
Relationship mining
– Association rule mining
– Correlation mining
– Sequential pattern mining
– Causal data mining
Distillation of data for human judgment
Discovery with models

Prediction Develop a model which can infer a single aspect of the data (predicted variable) from some combination of other aspects of the data (predictor variables) Which students are off-task? Which students will fail the class?
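
As a rough illustration of what "infer a predicted variable from predictor variables" can mean in practice, here is a minimal Python sketch. It is not taken from the course: the data, the feature names (idle_seconds, actions_per_minute), and the choice of a one-rule "decision stump" classifier are all assumptions made for the example.

```python
# Hypothetical data (not from the course): one row per student.
# Predictor variables = (idle_seconds, actions_per_minute).
# Predicted variable = 1 if a human coder labeled the student off-task.
X = [(5, 12), (8, 10), (40, 2), (55, 1), (12, 9), (60, 3)]
y = [0, 0, 1, 1, 0, 1]


def fit_stump(rows, labels):
    """Pick the one feature and threshold that misclassify the fewest rows."""
    best = None  # (errors, feature_index, threshold, label_above)
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds are midpoints between neighbouring values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for label_above in (0, 1):
                errors = sum(
                    (label_above if row[j] > threshold else 1 - label_above) != label
                    for row, label in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, threshold, label_above)
    return best[1:]  # (feature_index, threshold, label_above)


def predict(model, row):
    """Apply the learned rule to one row of predictor variables."""
    feature_index, threshold, label_above = model
    return label_above if row[feature_index] > threshold else 1 - label_above


model = fit_stump(X, y)
predictions = [predict(model, row) for row in X]
print("learned rule:", model)
print("training predictions:", predictions)
```

A real off-task detector would use many more features and be validated on students it was not trained on; the sketch only shows the shape of the task.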

Structure Discovery Find structure and patterns in the data that emerge “naturally” No specific target or predictor variable

Structure Discovery Different kinds of structure discovery algorithms find… different kinds of structure – Clustering: commonalities between data points – Factor analysis: commonalities between variables – Domain structure discovery: structural relationships between data points (typically items) – Network analysis: network relationships between data points (typically people)
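
To make the clustering case concrete (commonalities between data points, with no target variable), here is a small sketch of plain k-means in Python. The student data, the two feature names, and the starting centroids are hypothetical, and k-means is only one of many clustering algorithms that could stand in here.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    # clusters holds the assignment made in the final iteration.
    return centroids, clusters


# Hypothetical students: (hints_requested, avg_seconds_per_step).
students = [(1, 10), (2, 12), (1, 11), (8, 40), (9, 42), (10, 38)]

# Assumption: start from the first and last student as initial centroids.
centroids, clusters = kmeans(students, [students[0], students[-1]])
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

Note that no label is supplied anywhere: the two groups emerge only from how the points sit relative to each other.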

Relationship Mining Discover relationships between variables in a data set with many variables – Association rule mining – Correlation mining – Sequential pattern mining – Causal data mining
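
For association rule mining, the two standard quantities are support (how often an itemset occurs) and confidence (how often the consequent occurs when the antecedent does). The sketch below computes both in Python for a single candidate rule; the behavior labels and the five student records are made up for illustration.

```python
# Hypothetical records: the set of behaviors observed for each student.
transactions = [
    {"gamed_system", "low_posttest"},
    {"gamed_system", "low_posttest", "off_task"},
    {"off_task"},
    {"gamed_system"},
    {"low_posttest"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    return sum(itemset <= record for record in records) / len(records)


def confidence(antecedent, consequent, records):
    """Of the records containing the antecedent, the fraction that also
    contain the consequent."""
    return support(antecedent | consequent, records) / support(antecedent, records)


# Candidate rule: IF gamed_system THEN low_posttest.
rule_support = support({"gamed_system", "low_posttest"}, transactions)
rule_confidence = confidence({"gamed_system"}, {"low_posttest"}, transactions)
print("support:", rule_support)
print("confidence:", rule_confidence)
```

A full association rule miner would search over many candidate rules and keep those above chosen support and confidence thresholds; this sketch scores only one rule.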

Discovery with Models Pre-existing model (developed with EDM prediction methods… or clustering… or knowledge engineering) Applied to data and used as a component in another analysis
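
A minimal sketch of the idea in Python, under loud assumptions: off_task_detector is a hypothetical stand-in for a pre-existing model (a real one would have been developed and validated separately), and the log data and post-test scores are invented. The point is only the two-stage shape: the model's output becomes a variable in a second analysis, here a Pearson correlation.

```python
from math import sqrt


def off_task_detector(idle_seconds):
    """Hypothetical stand-in for a pre-existing model: flags a step as
    off-task when the idle time exceeds a fixed cutoff."""
    return 1 if idle_seconds > 26 else 0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical log: (idle seconds per observed step, post-test score).
students = [
    ([5, 8, 12], 0.9),
    ([40, 5, 8, 10], 0.7),
    ([40, 55, 8, 10], 0.5),
    ([40, 55, 60, 8], 0.3),
]

# Stage 1: apply the existing model to the data.
off_task_rates = [
    sum(off_task_detector(s) for s in steps) / len(steps) for steps, _ in students
]
posttest_scores = [score for _, score in students]

# Stage 2: use the model's output as a component in another analysis.
r = pearson(off_task_rates, posttest_scores)
print("off-task rates:", off_task_rates)
print("correlation with post-test:", r)
```

Any conclusion drawn in stage 2 is only as trustworthy as the model used in stage 1, which is why the pre-existing model's validity matters.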

Distillation of Data for Human Judgment Making complex data understandable by humans to leverage their judgment

Why now? Just plain more data available Education can start to catch up to research in Physics and Biology… from the year 1985

Why now? In particular, the amount of data available from educational software is orders of magnitude more than was available just a decade ago Supported by open educational databases like the PSLC DataShop (next week)

Learning Analytics A closely related community Who here has heard of Learning Analytics?

Two communities Society for Learning Analytics Research – First conference: LAK2011 International Educational Data Mining Society – First event: EDM workshop in 2005 (at AAAI) – First conference: EDM2008 – Publishing JEDM since 2009

Learning Analytics “… the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs.”

Two communities Joint goal of exploring the “big data” now available on learners and learning To promote – New scientific discoveries & to advance learning sciences – Better assessment of learners along multiple dimensions Social, cognitive, emotional, meta-cognitive, etc. Individual, group, institutional, etc. – Better real-time support for learners

Key Distinctions (Siemens & Baker, 2012)

Key Distinctions: Origins LAK – Semantic web, intelligent curriculum, social networks, outcome prediction, and systemic interventions EDM – Educational software, student modeling, course outcomes

Key Distinctions: Modes of Discovery LAK – Leveraging and supporting human judgment is key; automated discovery is a tool to accomplish this goal – Information distilled and presented to human decision-maker EDM – Automated discovery is key; leveraging human judgment is a tool to accomplish this goal – Humans provide labels which are used in classifiers

Key Distinctions: Guiding Philosophy LAK – Stronger emphasis on understanding systems as wholes, in their full complexity – “Holistic” approach EDM – Stronger emphasis on reducing to components and analyzing individual components and relationships between them

Key Distinctions: Adaptation and Personalization LAK – Greater focus on informing and empowering instructors and learners and influencing the design of the education system EDM – Greater focus on automated adaptation (e.g. by the computer with no human in the loop) and influencing the design of interactions

Questions? Comments?

Tools There are a bunch of tools you can use in this class – I don’t have strong requirements about which tools you choose to use We’ll talk about them throughout the semester You may want to think about downloading or setting up accounts for – RapidMiner (I prefer is fine, I just will not be able to give as much tech support) – SAS OnDemand for Academics – Weka – Microsoft Excel – Java – Matlab No hurry, but keep it in mind…

Next Class Monday, January 8 3pm-4:40pm Bayesian Knowledge Tracing Corbett, A.T., Anderson, J.R. (1995) Knowledge Tracing: Modeling the Acquisition of Procedural Knowledge. User Modeling and User-Adapted Interaction, 4, Baker, R.S.J.d., Corbett, A.T., Aleven, V. (2008) More Accurate Student Modeling Through Contextual Estimation of Slip and Guess Probabilities in Bayesian Knowledge Tracing. Proceedings of the 9th International Conference on Intelligent Tutoring Systems,

The End