Predicting Second to Third Year Retention

Presentation transcript:

Predicting Second to Third Year Retention
12/1/2018
Jinny Case, Ph.D.
Office of Institutional Research
The University of Texas at San Antonio

Outline
- Overview of UTSA
- Background
- Literature review
- Predictive modeling process
- Variables
- Population
- Results
- Application

Overview of UTSA
- Established 1969
- Over 30,000 students
- Over 4,500 FTIC students in fall 2017
- 95% in-state (48% Bexar County)
- HSI
- Majority minority
- Over 40% first generation
- Over 40% Pell recipients
- Mission of access and excellence

Background
- Matriculation model
- First term GPA model
- Second to third year retention model

Purpose
- To determine probability of retention to the third year for students who made it to their second year
- Develop a manageable target list of students likely to leave between their second and third year
- Work with advising to contact students

Retention Rates Retention Dashboard

Methodology
- Model Development
- Model Training
- Model Evaluation
- Model Application
- Model Improvement

Literature
- Demographic and pre-matriculation variables impacting first year retention also influence second to third year retention (Nora, 2005).
- Post-matriculation academic, financial, and social variables exert additional influence above and beyond pre-matriculation characteristics (Nora, 2005).

Model Building
Development Sample Selection
- Historical second-year enrollment (fall 2012-fall 2014)
- First time, full time only
Variable Selection
- Demographic
- Academic
- Financial
Data Preparation
- Data cleaning
- Missing data
- Dummy coding

Variable selection
Third Year Enrollment
Demographics
- Gender
- Ethnicity
- First Generation
- Residency
Academic Preparation
- High School Rank
- Test Scores (SAT/ACT)
- AP
- Developmental Courses
Financial Variables
- Scholarship
- Pell Status
- Lived on Campus
Academic Performance
- First year GPA
- Degree Sought
- Changed Major
- Hours Earned
- Hours Enrolled

Variable Coding

Variable            Valid Range         Variable Type   Reference group
First Generation    0=No, 1=Yes         Dichotomous     Not first generation
Race/Ethnicity      0=No, 1=Yes                         White
  (Black, Hispanic, Asian, White, Other)
Sex                 0=Male, 1=Female                    Male
Alamo Area          0=No, 1=Yes                         Not in Alamo Area
Program                                                 BA
  (BBA, BS, BA, UND, Other)
AP                  0=No, 1=Yes                         No AP credit
Class Rank                                              Missing Rank
  (Top ten, next fifteen, second quarter, third quarter, fourth quarter, missing)

Variable Coding

Variable                 Valid Range    Variable Type   Reference group
SAT/ACT quartile         0=No, 1=Yes    Dichotomous     SAT/ACT Missing
  (Top 25, middle fifty, bottom 25, missing)
Pell paid first year                                    No Pell
Pell paid second year
Scholarship first year   0+             Continuous
On campus                                               Not living on campus
Developmental Math       0=No, 1=Yes                    Not in Dev. Math
Developmental English                                   Not in Dev. English
Changed Major                                           Did not change major

Variable Coding

Variable                                Valid Range    Variable Type   Reference group
First Year GPA                          0=No, 1=Yes    Dichotomous     Missing
  (< 1.0, 1.0-1.99, 2.0-2.49, 2.5-2.99, 3.0-3.49, 3.5-4.0, Missing)
Hours earned first year                                                Less than 24 hours earned
  (< 24, 24-29, 30)
Hours Earned to Hours Attempted Ratio   0-1            Continuous
Hours Enrolled                          1+
Started as Freshman                                                    No

Dependent Variable = Retained to Third Year (0=No, 1=Yes)
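
The dummy coding described in these tables can be sketched roughly as follows. This is a minimal illustration in Python, not the SPSS syntax used for the talk. The band names are the ones that appear in the descriptive statistics, the helper name `gpa_dummies` is hypothetical, and a missing GPA is treated as the reference group (all dummies 0), as the table states.

```python
# First Year GPA bands as (dummy name, lower bound inclusive, upper bound exclusive).
# The last band is left open-ended so that a GPA of exactly 4.0 falls inside it.
GPA_BANDS = [
    ("ltONE", 0.0, 1.0),
    ("ONETOTWO", 1.0, 2.0),
    ("TWOTOTWOFOURNINE", 2.0, 2.5),
    ("TWOFIVETOTWONINE", 2.5, 3.0),
    ("THREETOTHREEFOUR", 3.0, 3.5),
    ("THREEFIVETOFOUR", 3.5, float("inf")),
]


def gpa_dummies(gpa):
    """Return one 0/1 indicator per GPA band; a missing GPA (None) is the reference group."""
    row = {name: 0 for name, _, _ in GPA_BANDS}
    if gpa is None:
        return row  # reference group: every dummy stays 0
    for name, low, high in GPA_BANDS:
        if low <= gpa < high:
            row[name] = 1
    return row


example = gpa_dummies(2.7)  # only TWOFIVETOTWONINE is set to 1
```

Each student ends up with at most one band set to 1, so the coefficient on a band is read relative to the reference group.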

Descriptive Statistics

Variable                 Mean        SD
RETAINED2YR              0.83        0.380
FIRSTGEN                 0.52        0.500
BLACK                    0.11        0.314
HISPANIC                 0.56        0.496
ASIAN                    0.06        0.233
OTHER                    0.07        0.261
MALE                     0.46        0.498
BBA                      0.10        0.306
BS                       [missing]   0.499
UND                      0.24        0.427
ALAMO_AREA               0.48        [missing]
TOP_TEN                  0.25        0.434
NEXT_FIFTEEN             0.40        0.490
SECOND_QUARTER           0.21        0.410
THIRD_QUARTER            [missing]   0.238
FOURTH_QUARTER           0.01        0.082
TOP25                    0.2490      0.43247
MIDDLEFIFTY              0.4895      0.49993
BOTTOM25                 0.2379      0.42583
PELL                     0.60        0.489
PELL2                    0.56        0.497
ON_CAMPUS                0.36        0.479
THIRTY_HOURS_EARNED      0.22        0.411
HOURS_EARNED24_29        0.50        0.500
EARNED_ATT_RATIO         0.883       0.15408
DEV_MATH                 0.26        0.437
DEV_ENG                  0.05        0.222
ltONE                    0.011       0.10399
ONETOTWO                 0.105       0.30716
TWOTOTWOFOURNINE         0.181       0.38492
TWOFIVETOTWONINE         0.256       0.43623
THREETOTHREEFOUR         0.278       0.44785
THREEFIVETOFOUR          0.169       0.37473
ON_PLUS_OFF_CAMPUS1YR    13.64       1.96496
SAME_MAJOR               0.658       0.47460
AP                       0.21        0.406
SCHOLARSHIP_YEAR1        1359.67     3483.461

Variance Inflation Factor (VIF)
Run linear regression in SPSS to get the VIFs. I had VIFs of over 5 on the SAT/ACT groups and class rank, so I combined the lowest 25th percentile on SAT/ACT with missing SAT/ACT and used that as the reference group. I also combined the 11-25 percent class rank group with missing class rank and used that as the reference group. This resolved the multicollinearity problems.
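
To make the VIF check concrete, here is a rough sketch in plain Python rather than SPSS. It regresses one predictor on the others and reports 1 / (1 - R^2). The helper names (`solve`, `vif`) and the twelve-student toy data are hypothetical. They only illustrate why a set of group dummies with a very small reference group inflates the VIF, and why folding a group into the reference lowers it.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def vif(columns, target):
    """VIF of columns[target]: regress it on the other columns plus an intercept."""
    y = columns[target]
    others = [c for i, c in enumerate(columns) if i != target]
    rows = [[1.0] + [c[k] for c in others] for k in range(len(y))]
    p = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yk for r, yk in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yk - fk) ** 2 for yk, fk in zip(y, fitted))
    ss_tot = sum((yk - mean_y) ** 2 for yk in y)
    return 1.0 / (1.0 - (1.0 - ss_res / ss_tot))


# Hypothetical SAT/ACT group dummies for 12 students; the 12th has a missing score.
top25 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
middle = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0]
bottom = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

vif_before = vif([top25, middle, bottom], 2)  # bottom 25 entered, missing as reference
vif_after = vif([top25, middle], 0)           # bottom 25 merged into the reference
```

On this toy data the VIF falls once the bottom group is merged into the reference, which is the same direction of change the slide reports.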

Model Training

Model Checking: Results with Training Data

Variable             Exp(B)   S.E.        Wald        Sig.
Intercept            0.811    0.395       0.282       0.595
FIRSTGEN             0.969    0.081       0.150       0.699
BLACK                1.495    0.139       8.351       0.004***
HISPANIC             1.518    0.100       17.462      0.000***
ASIAN                1.383    0.178       3.334       0.068
OTHER                1.128    0.154       0.609       0.435
MALE                 1.203    0.076       5.897       0.015**
BBA                  1.187    0.156       1.213       0.271
BS                   0.909    0.102       0.867       0.352
UND                  0.844    0.113       2.253       0.133
ALAMO_AREA           1.542    0.084       26.557      [missing]
TOP25                0.588    0.129       16.952      [missing]
MIDDLEFIFTY          0.835    0.096       3.488       0.062
STARTED_FR           1.145    0.215       0.393       0.531
TOP_TEN              1.100    0.108       0.783       0.376
SECOND_QUARTER       0.853    0.093       2.941       0.086
THIRD_QUARTER        0.835    0.145       1.549       0.213
FOURTH_QUARTER       0.771    0.363       0.512       0.474
PELL                 0.811    0.126       2.766       0.096
PELL2                1.488    0.121       10.719      0.001**
ON_CAMPUS            1.001    0.086       0.000       0.986
THIRTY_HOURS_EARN    1.432    0.134       7.201       0.007**
HOURS_EARNED24_29    1.462    [missing]   15.822      0.000***
DEV_MATH             0.890    0.093       1.555       0.212
DEV_ENG              0.433    0.142       34.628      [missing]
ltONE                0.041    0.393       66.377      [missing]
ONETOTWO             0.289    0.164       57.466      [missing]
TWOTOTWOFOURNINE     0.607    0.146       11.694      0.001***
TWOFIVETOTWONINE     0.980    0.137       0.021       0.884
THREETOTHREEFOUR     1.063    0.132       [missing]   0.645
On_Off_Campus_YR1    1.126    0.020       35.269      [missing]
SAME_MAJOR           0.783    0.079       9.502       0.002***
AP                   1.363    0.107       8.331       0.004***
SCHOLARSHIP_YEAR1    1.000    [missing]   1.011       0.315

**p<.05, ***p<.005
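
The columns of this table are linked by simple arithmetic: B is the natural log of Exp(B), the Wald statistic is (B / S.E.) squared, and Sig. is the upper tail of a chi-square distribution with one degree of freedom. The sketch below works this through for the BLACK row. The function name is hypothetical, and small differences from the printed values are expected because the table is rounded.

```python
import math


def wald_from_odds_ratio(exp_b, se):
    """Recover B, the Wald chi-square, and its 1-df p-value from Exp(B) and S.E."""
    b = math.log(exp_b)
    wald = (b / se) ** 2
    # For 1 degree of freedom, P(chi-square > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return b, wald, p_value


# BLACK row: Exp(B) = 1.495, S.E. = 0.139 (table reports Wald = 8.351, Sig. = 0.004)
b, wald, p_value = wald_from_odds_ratio(1.495, 0.139)
```

An Exp(B) above 1 means higher odds of being retained than the reference group, holding the other variables constant. An Exp(B) below 1 means lower odds.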

Model Training
Training Data Set
- Subset of full dataset (fall 2012-fall 2013)
- N=6,221
Model Fitting
- Used logistic regression
- Estimated coefficients with training data
Test Data
- Hold-out dataset of 2014 cohort
- Used to validate predictive accuracy of training model
- Dummy coding
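
The train-then-hold-out design can be sketched as follows. This is a toy stand-in, not the SPSS model: it uses a single predictor (first-year GPA), fits by plain gradient ascent, and runs on invented records. The function names and data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit an intercept and one slope by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Hypothetical records: (second-year cohort, first-year GPA, retained to third year)
records = [
    (2012, 1.2, 0), (2012, 3.4, 1), (2012, 2.1, 1), (2012, 2.8, 1), (2012, 1.8, 0),
    (2013, 3.1, 1), (2013, 2.4, 0), (2013, 3.7, 1), (2013, 1.5, 1), (2013, 2.0, 0),
    (2014, 3.3, 1), (2014, 1.4, 0), (2014, 2.9, 1), (2014, 2.2, 1),
]

# Train on the fall 2012 and fall 2013 cohorts; hold out 2014 for validation.
train = [r for r in records if r[0] in (2012, 2013)]
test = [r for r in records if r[0] == 2014]

b0, b1 = fit_logistic([r[1] for r in train], [r[2] for r in train])
test_probs = [sigmoid(b0 + b1 * r[1]) for r in test]  # scored with training coefficients
```

The point of the split is that the 2014 cohort never influences the coefficients, so accuracy on it gives a fairer picture of how the model will behave on a future cohort.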

Model Training: Checking for Outliers
Checked for outlying cases with potentially large residuals or high leverage using two criteria:
- Cook's distance greater than 1
- Standardized residual with absolute value greater than 3
Only eight cases met the residual criterion and none met the Cook's D criterion, so all cases were included in the final model.
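
As a small illustration of the residual screen, the sketch below assumes the standardized residual is the Pearson residual, (y - p) / sqrt(p(1 - p)). That is an assumption about the SPSS output, not something the slides state. Cook's distance is left out because it also needs each case's leverage from the fitted model. The cases are hypothetical.

```python
import math


def pearson_residual(y, p):
    """Pearson residual for a 0/1 outcome y and a fitted probability p."""
    return (y - p) / math.sqrt(p * (1.0 - p))


# Hypothetical (actual outcome, predicted probability of retention) pairs
cases = [(1, 0.90), (0, 0.95), (1, 0.40), (0, 0.30)]

# Flag cases whose residual exceeds 3 in absolute value
flagged = [(y, p) for y, p in cases if abs(pearson_residual(y, p)) > 3]
```

Here the only flagged case is a student who left despite a predicted retention probability of .95, which is the kind of case the screen is meant to surface.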

Model Training Results
- Null model correctly classified 82.5% of cases in training data
- Our model correctly classified 83.8% of cases in training data
- Hosmer and Lemeshow test is non-significant, indicating good model fit

Model Training: Setting the classification cut point
- The default logistic regression classification cut point in most software packages is .50; i.e., if a student's model-generated probability of retention is >= .50, the student is predicted to be retained.
- At that cut point, this model correctly classifies 98.3% of retained students but only 15% of non-retained students.
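
The effect of the cut point on each group can be seen with a short sketch. The probabilities and outcomes below are invented and `class_rates` is a hypothetical helper. The same function is evaluated at the default .50 and at a higher cut point.

```python
def class_rates(probs, actual, cut=0.50):
    """Share of retained and of non-retained students classified correctly at a cut point."""
    predicted = [1 if p >= cut else 0 for p in probs]
    retained_hits = sum(1 for a, f in zip(actual, predicted) if a == 1 and f == 1)
    left_hits = sum(1 for a, f in zip(actual, predicted) if a == 0 and f == 0)
    n_retained = sum(actual)
    n_left = len(actual) - n_retained
    return retained_hits / n_retained, left_hits / n_left


# Hypothetical predicted probabilities and actual outcomes (1 = retained)
probs = [0.95, 0.90, 0.85, 0.80, 0.72, 0.60, 0.88, 0.70, 0.65, 0.40]
actual = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

at_default = class_rates(probs, actual)       # cut point .50
at_higher = class_rates(probs, actual, 0.74)  # cut point .74
```

Because most students are retained, most predicted probabilities sit well above .50, so the default cut point labels nearly everyone as retained. Raising it trades some accuracy on retained students for better detection of students who leave.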

Model Training: Determine balanced CCR
This procedure determined that the cutoff point to maximize correct classification is .74
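
The procedure itself is not reproduced in this transcript, so the following is only one plausible sketch of a cut-point search. It assumes that CCR means correct classification rate and that "balanced" means averaging the rate for retained students with the rate for non-retained students. The data and the name `balanced_ccr` are hypothetical.

```python
def balanced_ccr(probs, actual, cut):
    """Average of the correct-classification rates for retained and non-retained students."""
    retained = [p for p, a in zip(probs, actual) if a == 1]
    left = [p for p, a in zip(probs, actual) if a == 0]
    retained_rate = sum(p >= cut for p in retained) / len(retained)
    left_rate = sum(p < cut for p in left) / len(left)
    return (retained_rate + left_rate) / 2.0


# Hypothetical predicted probabilities and actual outcomes (1 = retained)
probs = [0.95, 0.90, 0.85, 0.80, 0.72, 0.60, 0.88, 0.70, 0.65, 0.40]
actual = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

# Scan candidate cut points from .01 to .99 and keep the first one with the best score
cuts = [i / 100 for i in range(1, 100)]
best_cut = max(cuts, key=lambda c: balanced_ccr(probs, actual, c))
```

A search like this tends to land well above .50 when retained students are the large majority, which is consistent with the .74 reported on the slide.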

Manually adjusting cut point
You manually adjust the cut point in your code here at the bottom and also save predicted values to a file that can be used on a new validation dataset

Model Predictive Accuracy
- Overall model accuracy with the training data = 80%
- Overall model accuracy with the test data = 80%

Training Model            Actually Retained   Actually Not Retained
Predicted Retained        4492                614
Predicted Not Retained    613                 475

Test Model                Actually Retained   Actually Not Retained
Predicted Retained        2796                410
Predicted Not Retained    387                 313

Here the overall model predictive accuracy decreased a bit when we manually adjusted the cut point, but now the model is correctly classifying 45% of the not retained students and 87% of the retained students. We are erring in favor of accurately predicting students who may drop out because the cost of contacting students who will be retained anyway is negligible.
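
The percentages can be checked directly from the two classification tables. The sketch below assumes the rows are predictions and the columns are actual outcomes, as labeled, and the function name is hypothetical. Computed from the counts as printed, the training rates come out near 88% for retained and 44% for not retained students, close to the 87% and 45% quoted above.

```python
def summarize(tp, fp, fn, tn):
    """Rates from a 2x2 classification table.

    tp: actually retained, predicted retained
    fp: actually not retained, predicted retained
    fn: actually retained, predicted not retained
    tn: actually not retained, predicted not retained
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "retained_correct": tp / (tp + fn),
        "not_retained_correct": tn / (tn + fp),
    }


training = summarize(tp=4492, fp=614, fn=613, tn=475)
test = summarize(tp=2796, fp=410, fn=387, tn=313)
```

The training and test figures are nearly identical, which suggests the cut point chosen on the training data carries over to the hold-out cohort.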

Potential Model Application
Future Prediction
- Apply model to Fall 2015 cohort data
Application List of Students
- Export list of students and their predicted probabilities of being retained to 3rd year
- Can be used by advising to target students at some risk of not returning
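
A scoring-and-export step along these lines might look like the sketch below. It is illustrative only: it uses the intercept and just four of the odds ratios from the results table, so the probabilities are not the model's actual output. The student records are invented, and the .74 cut point is the one reported earlier.

```python
import csv
import io
import math

# Exp(B) values copied from a few rows of the results table. Using only a
# subset of the predictors is an assumption made to keep the example short.
INTERCEPT_EXP_B = 0.811
ODDS_RATIOS = {"ALAMO_AREA": 1.542, "DEV_ENG": 0.433, "AP": 1.363, "On_Off_Campus_YR1": 1.126}
CUT_POINT = 0.74


def retention_probability(student):
    """Convert odds ratios back to coefficients and apply the logistic function."""
    z = math.log(INTERCEPT_EXP_B)
    for name, odds_ratio in ODDS_RATIOS.items():
        z += math.log(odds_ratio) * student.get(name, 0)
    return 1.0 / (1.0 + math.exp(-z))


students = [  # hypothetical records for a new cohort
    {"id": "A", "ALAMO_AREA": 1, "AP": 1, "DEV_ENG": 0, "On_Off_Campus_YR1": 15},
    {"id": "B", "ALAMO_AREA": 0, "AP": 0, "DEV_ENG": 1, "On_Off_Campus_YR1": 12},
    {"id": "C", "ALAMO_AREA": 1, "AP": 0, "DEV_ENG": 0, "On_Off_Campus_YR1": 13},
]

# Lowest predicted probability first, so advisors see the highest-risk students on top
scored = sorted((retention_probability(s), s["id"]) for s in students)
target_list = [sid for p, sid in scored if p < CUT_POINT]

# Write the full scored list as CSV (to an in-memory buffer here instead of a file)
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["student_id", "predicted_probability"])
for p, sid in scored:
    writer.writerow([sid, f"{p:.3f}"])
```

Exporting the probabilities rather than only the flag lets advising decide how far down the list to go, depending on how many students they can contact.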

Resources
Nora, A. (2005). Student persistence and degree attainment beyond the first year in college. In A. Seidman (Ed.), College student retention: Formula for student success (pp. 129-153). Westport, CT: Praeger Publishers.