EDUCAUSE Annual Conference

Presentation transcript:

Who Are Your At-Risk Students? Using Data Mining to Target Intervention Efforts
Lalitha Agnihotri, Ph.D., Senior Systems Analyst, DWH
Alex Ott, Ed.D., Associate Dean, Academic & Enrollment Services
Niyazi Bodur, Ph.D., VP, Information Technology & Infrastructure
New York Institute of Technology
EDUCAUSE Annual Conference, October 16th, 2013

Presentation Description and Goals
- Learn how to improve targeted intervention by building a model that identifies and classifies at-risk students using data at your institution.
- Gain an understanding of the complete life cycle of the At-Risk Student Identification Model.

Targeted Intervention for At-Risk Students
The Goal: Early targeted intervention, based on the risk factors for each at-risk student, to improve retention.
Rationale for Key Elements:
- Early
- Targeted intervention
- Risk factors for each student

Before the Model, All We Had Was…

Students At Risk (STAR) Model Version 1.0
Data sources:
- Admissions data
- Registration/placement test data
- Survey data
Method: Combine all risk variables into an aggregated measure.
[Draft note: Alex to insert an Excel example.]
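The Version 1.0 method of combining risk variables into one aggregated measure might look roughly like the sketch below. The flag names and student records are hypothetical, since the slide names only the data sources, and the equal weighting is an assumption taken from the later "Attributes equally weighted" challenge.

```python
# Hypothetical risk flags; the slide does not list the actual variables.
RISK_FLAGS = ["low_placement_score", "late_registration", "survey_low_commitment"]

def aggregate_risk(student):
    """Count how many risk flags are set; every flag carries the same weight."""
    return sum(1 for flag in RISK_FLAGS if student.get(flag, False))

# Made-up student records for illustration only.
students = [
    {"id": "S2", "survey_low_commitment": True},
    {"id": "S1", "low_placement_score": True, "late_registration": True},
    {"id": "S3"},
]

# Rank students from highest to lowest aggregated risk.
ranked = sorted(students, key=aggregate_risk, reverse=True)
for s in ranked:
    print(s["id"], aggregate_risk(s))
```

A score like this is easy to compute by hand in a spreadsheet, but it cannot tell which flags actually matter or in which direction.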

Version 1.0 Report Output:

Major Challenges with STAR 1.0
- Limited attributes.
- Attributes of unknown strength, relevance, or even direction.
- Attributes equally weighted.
- Static Excel document: big effort to get all the attributes in one place.
Summary of major limitations: limited factors, all treated equally, a simple logic calculation applied through a manual process, and no data mining.

Data Mining

Data Mining Classification
- Given a collection of records (the training set), each record contains a set of attributes, one of which is the class.
- Goal: assign a class to previously unseen records as accurately as possible.
- Find a model for the class attribute as a function of the values of the other attributes.
- Select the model that performs best.
Table columns: Student ID, Attributes, Class
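The classification workflow described here (fit candidate models on a training set, score them on unseen records, keep the best) can be sketched in plain Python. Everything concrete below is an assumption for illustration: the attribute names, the toy records, and the two simple model types, which are not the models used in the presentation.

```python
from collections import Counter

# Toy records as (attributes, class); names and values are made up.
train = [
    ({"gpa": 3.6, "credits": 15}, "retained"),
    ({"gpa": 3.1, "credits": 12}, "retained"),
    ({"gpa": 2.9, "credits": 15}, "retained"),
    ({"gpa": 1.9, "credits": 9}, "at_risk"),
    ({"gpa": 2.2, "credits": 6}, "at_risk"),
]
held_out = [  # previously unseen records used only for scoring
    ({"gpa": 3.4, "credits": 12}, "retained"),
    ({"gpa": 2.0, "credits": 12}, "at_risk"),
    ({"gpa": 2.8, "credits": 9}, "retained"),
]

def majority_model(records):
    """Baseline: always predict the most common class in the training set."""
    label = Counter(c for _, c in records).most_common(1)[0][0]
    return lambda attrs: label

def stump_model(records, attribute):
    """Two-class threshold rule: split at the midpoint of the class means."""
    by_class = {}
    for attrs, c in records:
        by_class.setdefault(c, []).append(attrs[attribute])
    means = {c: sum(v) / len(v) for c, v in by_class.items()}
    (lo_c, lo_m), (hi_c, hi_m) = sorted(means.items(), key=lambda kv: kv[1])
    threshold = (lo_m + hi_m) / 2
    return lambda attrs: hi_c if attrs[attribute] >= threshold else lo_c

def accuracy(model, records):
    """Fraction of records whose predicted class matches the true class."""
    return sum(model(a) == c for a, c in records) / len(records)

candidates = {
    "majority": majority_model(train),
    "gpa_stump": stump_model(train, "gpa"),
    "credits_stump": stump_model(train, "credits"),
}
# Select the model that performs best on the unseen records.
best_name = max(candidates, key=lambda n: accuracy(candidates[n], held_out))
print(best_name, accuracy(candidates[best_name], held_out))
```

Scoring on records the models never saw is what makes the comparison meaningful; accuracy on the training set alone would favor models that merely memorize it.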

STAR Model: Version 2.0 with Data Mining and Automated Tools
- Built and automated the full dataset in our Data Warehouse.
- Used data mining tools (SQL Server Analysis Services) to train multiple dynamic statistical models.
- Enterprise solution.
Components shown: SQL Build Data, SSAS Modeling, DMX Prediction Query, SSRS Report

Models Trained: Logistic Regression, Naïve Bayes, Neural Network, Decision Trees, and an Ensemble of these models.
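The slide does not say how the ensemble combines its member models. One common choice is majority voting, sketched below as an assumption. The three rule functions are hypothetical stand-ins for trained models, and the attributes `gpa`, `credits`, and `absences` are invented for the example.

```python
from collections import Counter

# Hypothetical stand-ins for trained models; real models would be learned from data.
def model_a(s):
    return "at_risk" if s["gpa"] < 2.5 else "retained"

def model_b(s):
    return "at_risk" if s["credits"] < 12 else "retained"

def model_c(s):
    return "at_risk" if s["gpa"] < 2.0 or s["absences"] > 5 else "retained"

MODELS = [model_a, model_b, model_c]  # odd count, two classes: no tied votes

def ensemble_predict(student):
    """Majority vote: return the label predicted by most member models."""
    votes = Counter(model(student) for model in MODELS)
    return votes.most_common(1)[0][0]

student = {"gpa": 2.3, "credits": 15, "absences": 7}
print([m(student) for m in MODELS], "->", ensemble_predict(student))
```

Here two of the three members flag the student, so the ensemble does too, even though one member disagrees.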

Data Mining Knowledge Discovery: Big Picture
[Draft note: Lalitha to update the graphs with new labels.]

Data Mining Knowledge Discovery: Detailed Picture

Model Significance and Results
[Draft note: You can re-org.]

So How Did the Model Actually Perform?
[Draft note: Change this slide.]

Key Takeaways
- Success depends on a productive partnership between IT and the business departments.
- Data is the key. No data, no talking.
- Data mining is fundamentally a process: set a high-level goal (the blueprint), set departmental goals (the detailed drawing), then carry out the physical and data mining tools implementation to make it a reality.
- Select attributes based on (retention) research and the particulars of your school.

Questions?
Lalitha Agnihotri, lagnihot@nyit.edu
Alexander Ott, aott@nyit.edu
Niyazi Bodur, nbodur@nyit.edu