A Predictive Model for Student Retention Using Logistic Regression

Presentation transcript:

A Predictive Model for Student Retention Using Logistic Regression Fangyu Du, Sam Shi TAIR 2017

Strategic Analysis and Reporting, UNT Dallas. New Trend.

At a Glance

At a Glance 2

Structure of the Presentation: Background information of the dataset; Data preparation; Modeling; Using the results

Background Information of Dataset. Goal: predict whether students will be retained after one year, and identify the patterns behind retention.

Background Information of Dataset 2. Students included: enrolled in Fall 2014; undergraduate students only; students who graduated are excluded.

Background Information of Dataset 3

Data Preparation, Data Type

Data Preparation, Data Type 2. Measurement: Continuous (height, weight, length); Flag (yes/no); Nominal (hair color, city of residence); Ordinal (how you feel, how satisfied you are); Categorical (numbers that represent discrete categories). Role: Target (Y); Input (Xs).

Data Preparation, Auto Data Prep. Target: Y. Predictors: Xs. Recommended for use: kept in the equation. Predictor not used: discarded.

Data Preparation, Auto Data Prep 2. Predictive power of the predictors (Xs). Missing values: keep or drop (50%). Standardize continuous predictors so they are easy to compare.
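
The two preparation steps on this slide can be sketched in plain Python. This is only an illustration, not the tool's actual procedure: it assumes the "50%" means a predictor is dropped when more than half of its values are missing, and it assumes standardization means z-scores. The field names (`gpa`, `credits`, `sat`) and the records are hypothetical.

```python
from statistics import mean, pstdev

def drop_sparse_columns(rows, max_missing=0.5):
    """Keep only columns whose share of missing (None) values is at most max_missing."""
    columns = rows[0].keys()
    kept = [c for c in columns
            if sum(r[c] is None for r in rows) / len(rows) <= max_missing]
    return [{c: r[c] for c in kept} for r in rows]

def standardize(values):
    """Return z-scores computed over the non-missing values; None stays None."""
    present = [v for v in values if v is not None]
    mu, sigma = mean(present), pstdev(present)
    if sigma == 0:
        # A constant column carries no spread; map every present value to 0.
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - mu) / sigma for v in values]

# Hypothetical student records (illustrative only, not the presentation's data).
students = [
    {"gpa": 3.0, "credits": 12, "sat": None},
    {"gpa": 2.0, "credits": 15, "sat": None},
    {"gpa": 4.0, "credits": 9, "sat": 1100},
    {"gpa": None, "credits": 12, "sat": None},
]

cleaned = drop_sparse_columns(students)   # "sat" is 75% missing, so it is dropped
gpa_z = standardize([s["gpa"] for s in cleaned])
```

After standardizing, every continuous predictor has mean 0 and unit spread, which is what makes their coefficients easier to compare.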

Modeling, Selecting Algorithms: Logistic Regression; CHAID; Neural Net

Modeling, Logistic Regression. "Logistic regression is the appropriate statistical technique when the dependent variable is a categorical variable and the independent variables are metric or nonmetric variables." (Multivariate Data Analysis, Seventh Edition) Use it when Y is binary (pass/fail, win/lose, alive/dead, healthy/sick, retain/drop) and you want to know the probability of each outcome based on the predictors.
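
As a minimal sketch of how a fitted logistic regression turns predictors into a probability, the function below applies the standard logistic formula P(Y = 1) = 1 / (1 + exp(-(b0 + b1*x1 + ... + bk*xk))). The coefficient values and predictor names are hypothetical and are not the fitted values from this presentation.

```python
import math

def retention_probability(features, coefficients, intercept):
    """Logistic model: P(retain) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(coefficients[name] * value
                        for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for two standardized predictors.
coefficients = {"gpa_z": 0.8, "credits_z": 0.3}
intercept = 0.5

# An average student (both z-scores are 0) and a student with an above-average GPA.
p_average = retention_probability({"gpa_z": 0.0, "credits_z": 0.0},
                                  coefficients, intercept)
p_high_gpa = retention_probability({"gpa_z": 1.0, "credits_z": 0.0},
                                   coefficients, intercept)
```

With a positive coefficient on `gpa_z`, a higher GPA raises the predicted probability of retention; the probability of dropping is one minus that value.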

Modeling, Logistic Regression (Continued)

Modeling, Logistic Regression (Continued 2). Predictor importance

Use the Result, Students Likely to Leave. Feed in new data and get the results.

Use the Result, Students Likely to Leave (Continued). Sort by the predicted index $LP-0 (probability of dropping).
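
The sorting step can be sketched as follows, assuming each scored record carries a drop probability like the $LP-0 field. The key name `drop_probability`, the student IDs, and the scores are hypothetical.

```python
def rank_by_drop_risk(scored, top_n=None):
    """Sort scored students from highest to lowest predicted drop probability."""
    ranked = sorted(scored, key=lambda r: r["drop_probability"], reverse=True)
    return ranked if top_n is None else ranked[:top_n]

# Hypothetical model output for newly scored students.
scored_students = [
    {"student_id": "A01", "drop_probability": 0.12},
    {"student_id": "A02", "drop_probability": 0.81},
    {"student_id": "A03", "drop_probability": 0.47},
]

# The two students the model considers most likely to leave.
at_risk = rank_by_drop_risk(scored_students, top_n=2)
at_risk_ids = [r["student_id"] for r in at_risk]
```

Putting the highest-risk students at the top of the list makes it straightforward to prioritize outreach.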

Use the Result, What Matters the Most

Use the Result, Decision Tree CHAID (Chi-square automatic interaction detection)

Use the Result, Decision Tree 2 CHAID (Chi-square automatic interaction detection)
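
CHAID chooses splits by testing how strongly each predictor is associated with the target using a chi-square test. The sketch below computes only the Pearson chi-square statistic for one candidate predictor. It is not a full CHAID implementation (it omits category merging, p-value adjustment, and recursion), and the counts in the table are hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows of counts).

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows are predictor categories (e.g. full-time, part-time),
# columns are the outcome (retained, dropped).
split_table = [[40, 10],
               [20, 30]]
stat = chi_square_statistic(split_table)

# A predictor unrelated to the outcome gives a statistic of zero.
no_association = chi_square_statistic([[10, 10], [10, 10]])
```

A larger statistic (for the same table shape) indicates a stronger association with retention, which is why such a predictor would be preferred for a split.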

Summary

Thank you! Questions? Contact us anytime if you need help! Sam Shi (Director) Sam.Shi@untdallas.edu 972-338-1785 Fangyu Du Fangyu.Du@untdallas.edu 972-338-1343