Core Methods in Educational Data Mining HUDK4050 Fall 2015.

Factor Analysis vs. Clustering: What’s the difference?

Factor Analysis: Any Questions?

What are the general advantages of structure discovery algorithms (clustering, factor analysis) compared to supervised/prediction modeling methods? What are the disadvantages?

Important point… If you cluster in a well-known domain, you are likely to obtain well-known findings

Because of this…
Clustering is relatively popular
But somewhat prone to uninteresting papers in education research
– Where usually a lot is already known
So be thoughtful…

Bowers (2010) Any questions?

Association Rule Mining

Today’s Class The Land of Inconsistent Terminology

Association Rule Mining
Try to automatically find simple if-then rules within the data set
Another method that can be applied when you don’t know what structure there is in your data
Unlike clustering, association rules are often obviously actionable

Association Rule Metrics: Support and Confidence
What do they mean? Why are they useful?
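As a rough sketch of the standard definitions: support is the fraction of transactions that contain an itemset, and the confidence of a rule X → Y is support(X and Y together) divided by support(X). The snippet below borrows the transactions from the class example later in this deck; reading them as sixteen four-item transactions is an assumption about the slide layout, and the function names are illustrative only.

```python
# Transactions borrowed from the class example; splitting them into groups of
# four letters is an assumption about how the original slide was laid out.
RAW = "ABCF ABDG ABEF BEGH BDIJ BCDJ DEFJ ABCD DEGJ DEGJ ABCE ABCF BCDJ BCDE DEFK DEGH"
transactions = [frozenset(t) for t in RAW.split()]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(antecedent, consequent):
    """Of the transactions containing the antecedent, the share that also
    contain the consequent. Assumes the antecedent occurs at least once."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both) / support(antecedent)
```

On this data, `support("AB")` is 6/16, `confidence("A", "B")` is 1.0 (every transaction with A also has B), and `confidence("B", "A")` is only 6/11, which shows that confidence depends on the direction of the rule.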

Association Rule Metrics: Interestingness
What are some interestingness metrics? Why are they needed?
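The slides do not say which interestingness metrics were discussed in class. Lift is one commonly used option, shown here purely as an illustration: it divides the observed support of X and Y together by the support expected if X and Y were independent. The transactions are the class example read as four-letter groups (an assumption), and the function names are hypothetical.

```python
# Class-example transactions, assumed to be sixteen groups of four items each.
RAW = "ABCF ABDG ABEF BEGH BDIJ BCDJ DEFJ ABCD DEGJ DEGJ ABCE ABCF BCDJ BCDE DEFK DEGH"
transactions = [frozenset(t) for t in RAW.split()]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def lift(antecedent, consequent):
    """Observed co-occurrence divided by the co-occurrence expected under
    independence. Values near 1 suggest the rule adds little information.
    Assumes both sides occur at least once in the data."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both) / (support(antecedent) * support(consequent))
```

Here `lift("A", "B")` is 16/11, about 1.45, while `lift("D", "E")` falls slightly below 1: D → E holds in more than half of the D transactions, but E is already so common that the rule is no better than chance.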

Why is interestingness needed?
Possible to generate large numbers of trivial associations
– Students who took a course took its prerequisites (Vialardi et al., 2009)
– Students who do poorly on the exams fail the course (El-Halees, 2009)

Association Rule Metrics What do Merceron & Yacef argue?

Association Rule Metrics What do Luna-Bazaldua and colleagues argue?

Association Rule Metrics What are the merits and drawbacks to each of these approaches?

Arbitrary Cut-offs
The association rule mining community differs from most other methodological communities by treating cut-offs for support and confidence as arbitrary
Researchers typically adjust them to find a desirable number of rules to investigate, ordering from best-to-worst…
Rather than arbitrarily saying that all rules over a certain cut-off are “good”
What are the strengths and weaknesses of this approach?

Any questions on the Apriori algorithm?

Let’s do an example Volunteer please?

Someone pick: Support, Confidence

Generate Frequent Itemset
ABCF  ABDG  ABEF  BEGH
BDIJ  BCDJ  DEFJ  ABCD
DEGJ  DEGJ  ABCE  ABCF
BCDJ  BCDE  DEFK  DEGH
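A minimal level-wise sketch of this step, in the spirit of Apriori: count candidates of size k, keep those that meet the support threshold, join them into size k+1 candidates, and prune any candidate with an infrequent subset. It assumes the letters on this slide are sixteen transactions of four items each, and the threshold of 5 transactions is a made-up value, not the one chosen in class.

```python
from itertools import combinations

# Assumption: the slide's letter runs are sixteen four-item transactions.
RAW = "ABCF ABDG ABEF BEGH BDIJ BCDJ DEFJ ABCD DEGJ DEGJ ABCE ABCF BCDJ BCDE DEFK DEGH"
transactions = [frozenset(t) for t in RAW.split()]


def apriori(transactions, min_count):
    """Return {itemset: count} for itemsets found in >= min_count transactions."""
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count each candidate of size k and keep the frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical threshold: an itemset must appear in at least 5 of 16 transactions.
frequent = apriori(transactions, min_count=5)
```

With this threshold the frequent pairs are AB, BC, BD, DE, and DJ, and no three-item set survives. Dropping the threshold to 4 lets ABC and BCD through, which is the kind of adjustment the "re-try with lower support" step invites.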

Was the choice of support level appropriate?
ABCF  ABDG  ABEF  BEGH
BDIJ  BCDJ  DEFJ  ABCD
DEGJ  DEGJ  ABCE  ABCF
BCDJ  BCDE  DEFK  DEGH

Re-try with lower support
ABCF  ABDG  ABEF  BEGH
BDIJ  BCDJ  DEFJ  ABCD
DEGJ  DEGJ  ABCE  ABCF
BCDJ  BCDE  DEFK  DEGH

Generate Rules From Frequent Itemset
ABCF  ABDG  ABEF  BEGH
BDIJ  BCDJ  DEFJ  ABCD
DEGJ  DEGJ  ABCE  ABCF
BCDJ  BCDE  DEFK  DEGH
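One way to sketch this step: for a given frequent itemset, try every non-empty proper subset as the antecedent, put the remaining items in the consequent, and keep the rules whose confidence meets the threshold. The transactions are read as four-letter groups (an assumption), and the 0.8 confidence threshold is a hypothetical value, not the one picked in class.

```python
from itertools import combinations

# Assumption: the slide's letter runs are sixteen four-item transactions.
RAW = "ABCF ABDG ABEF BEGH BDIJ BCDJ DEFJ ABCD DEGJ DEGJ ABCE ABCF BCDJ BCDE DEFK DEGH"
transactions = [frozenset(t) for t in RAW.split()]


def count(itemset):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def rules_from_itemset(itemset, min_confidence):
    """List (antecedent, consequent, confidence) for rules built from one
    itemset. Assumes the itemset occurs in at least one transaction."""
    itemset = frozenset(itemset)
    whole = count(itemset)
    rules = []
    for size in range(1, len(itemset)):
        for combo in combinations(sorted(itemset), size):
            antecedent = frozenset(combo)
            conf = whole / count(antecedent)
            if conf >= min_confidence:
                rules.append((antecedent, itemset - antecedent, conf))
    return rules


# Hypothetical confidence threshold of 0.8.
bc_rules = rules_from_itemset("BC", 0.8)
abc_rules = rules_from_itemset("ABC", 0.8)
```

From the itemset {B, C} only C → B passes (confidence 1.0; B → C is 7/11), and from {A, B, C} only A, C → B passes.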

Questions? Comments?

Rules in Education
What might be some reasonable applications for Association Rule Mining in education?

Assignment B8: Sequential Pattern Mining
Due Tuesday

How are people doing on Assignment B8?
One person posting to forums
Looks like this is more challenging than other basic assignments
So get rolling and get on the forums!

Final Project

By my count, we have 30 people still actively doing assignments
30*5=150
Our class is 100 minutes long, and we can’t be guaranteed that our class won’t be interrupted exactly at 2:40pm by people who think they are more important than us

Earlier optional session
Therefore, I’ll be giving over part of class on 12/10 and 12/15 to final project presentations
Please sign up on the discussion forum for what slot you would like

Next Class
Tuesday, December 1: Sequential Pattern Mining
Readings
Baker, R.S. (2014) Big Data and Education. Ch. 5, V4.
Srikant, R., Agrawal, R. (1996) Mining Sequential Patterns: Generalizations and Performance Improvements. Research Report: IBM Research Division. San Jose, CA: IBM.
Perera, D., Kay, J., Koprinska, I., Yacef, K., Zaiane, O. (2009) Clustering and Sequential Pattern Mining of Online Collaborative Learning Data. IEEE Transactions on Knowledge and Data Engineering, 21,
Shanabrook, D.H., Cooper, D.G., Woolf, B.P., Arroyo, I. (2010) Identifying High-Level Student Behavior Using Sequence-based Motif Discovery. Proceedings of the 3rd International Conference on Educational Data Mining,

Happy Thanksgiving! Enjoy your Turkey or Tofurkey or Traditional Thanksgiving Yak

The End