Determining Factors of GPA
Natalie Arndt, Allison Mucha
MA 331, 12/6/07



Objectives
- Determine important factors related to a Stevens student's GPA
- Make use of methods and analytic techniques discussed in class
- Observe differences (or lack thereof) between engineering and science students

Initial Variable Ideas
- Years at school
- Hours work / week
- Hours sleep / night
- Cleanliness rating
- Which SAT score was higher
- Number of siblings
- Expected graduation year

Final Variable Ideas
- Gender
- (Primary) major
- # Semesters
- # Credits / semester
- GPA each semester
- Cumulative # credits
- Cumulative GPA
Gender: ____________  Major: ____________
Semester | Credits | GPA for Semester
Total credits earned: ______  Cumulative GPA: ____

Data Collection Method
- Voluntary survey
- Anonymous
- Sent out to several subsets of the general student body
- Only full-time (≥12 credits), undergraduate Stevens students considered
- Alumni who satisfied these conditions during their time at Stevens also considered

Lurking Variables
- Influence of extracurricular activities
- Changes in curriculum from year to year are certainly a factor
- Personal issues, medical problems, and stressful situations are unaccounted for
- Differences within the same course over time (professor, size, textbook, etc.)
- Large variability to begin with

Data Collected
- 28 students participated in the survey
- Combined 154 semesters' worth of data
- 18 males, 10 females
- 19 engineering, 8 science, 1 art
- GPA ranged from [value missing] to [value missing]
- Credits ranged from 12.0 (imposed) to 25.5
- Cumulative credits ranged from 33.0 to 177.0

After Data Was Collected …
- All names removed; obs category created to link the records belonging to one individual
- Semester 0 refers to cumulative data
- Primary major used to create categorical school column
- Number of credits per semester used to create load category
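The two derived columns can be sketched in R as follows. This is only an illustration under stated assumptions: the data frame `survey`, the lookup `school_of`, and the 3-credit bin edges are hypothetical. The bin edges are inferred from the sample rows on the Data Compilation slide (13.0 and 14.0 fall in load a, 15.0 to 17.5 in b, 18.0 to 20.0 in c), and the deck does not state the actual cut points.

```r
# Hypothetical excerpt: majors and credits as they appear in the sample rows
survey <- data.frame(
  major   = c("Engineering Management", "Computer Science",
              "Electrical Engineering", "Computer Science"),
  credits = c(17.0, 13.0, 20.0, 19.0)
)

# Assumed lookup from primary major to school code (E = engineering, S = science)
school_of <- c("Engineering Management" = "E",
               "Electrical Engineering" = "E",
               "Computer Science"       = "S")
survey$school <- factor(unname(school_of[as.character(survey$major)]))

# Assumed 3-credit bins starting at the 12-credit full-time minimum:
# [12,15) = a, [15,18) = b, [18,21) = c, [21,24) = d, [24,27) = e
survey$load <- cut(survey$credits,
                   breaks = c(12, 15, 18, 21, 24, 27),
                   labels = c("a", "b", "c", "d", "e"),
                   right  = FALSE)

print(survey)
```

Cumulative rows (semester 0) carry no load in the deck's table, so a real pipeline would need to set `load` to `NA` for them; the sketch leaves that out.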

Data Compilation
obs  gender  major                   school  sem  credits  load  GPA
2    Male    Engineering Management  E       1    17.0     b
     Male    Engineering Management  E       4    17.5     b
     Male    Engineering Management  E       2    18.0     c
     Male    Engineering Management  E       3    18.5     c
     Male    Engineering Management  E       5    20.0     c
     Male    Engineering Management  E       0    101.0    N/A   3.947
…
20   Male    Computer Science        S       3    13.0     a
     Male    Computer Science        S       4    13.0     a
     Male    Computer Science        S       1    15.0     b
     Male    Computer Science        S       2    19.0     c
     Male    Computer Science        S       0    69.0     N/A   3.884
…
26   Female  Electrical Engineering  E       1    15.0     b
     Female  Electrical Engineering  E       2    14.0     a
     Female  Electrical Engineering  E       3    20.0     c
     Female  Electrical Engineering  E       4    20.0     c
     Female  Electrical Engineering  E       0    69.0     N/A   3.592

Preliminary Analysis
[Figures: two distribution plots, captioned "somewhat normal" and "skewed, left-tailed (by semester)"]

Initial Regressions
Semester data: GPA = [coefficients missing]*credits, R^2 = [value missing]
Cumulative data: GPA = [coefficients missing]*credits, R^2 = [value missing]

Residual Plots
[Figures: residual plots for semester data and cumulative data]

Comparisons by Gender
[Figures: semester data (Male, Female); cumulative data (Male, Female)]

Comparisons by School
[Figures: semester data (Engineering, Science); cumulative data (Engineering)]

Comparisons by Load
[Figures: Load A, Load B, Load C, Load D, Load E]

Stepwise Regression
> stepwise = step(lm(gpa~credits+school+gender+sem),direction="both")
Start:  AIC=
gpa ~ credits + school + gender + sem
          Df Sum of Sq RSS AIC
- gender
- sem
- credits
- school
Step:  AIC=
gpa ~ credits + school + sem
          Df Sum of Sq RSS AIC
- sem
- credits
- school
+ gender
Step:  AIC=
gpa ~ credits + school
          Df Sum of Sq RSS AIC
+ sem
- school
- credits
+ gender
Call:
lm(formula = gpa ~ credits + school)
Coefficients:
(Intercept)  credits  schoolE  schoolS
> summary(stepwise)
Call:
lm(formula = gpa ~ credits + school)
Residuals:
Min  1Q  Median  3Q  Max
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)                               <2e-16 ***
credits
schoolE
schoolS
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: on 122 degrees of freedom
Multiple R-Squared: , Adjusted R-squared:
F-statistic: on 3 and 122 DF, p-value:
> anova(stepwise)
Analysis of Variance Table
Response: gpa
          Df Sum Sq Mean Sq F value Pr(>F)
credits
school
Residuals
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
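A minimal sketch of the same selection call, runnable without the survey file. The data frame `sim` is simulated placeholder data (an assumption, not the survey responses), so its output says nothing about real GPAs; only the column names and the `step()` call mirror the slide. Passing `data =` and `trace = 0` are choices made for this sketch, not something the slide shows.

```r
# Simulated placeholder data: NOT the survey responses.
set.seed(1)
n <- 126  # 122 residual degrees of freedom + 4 coefficients in the reported fit
sim <- data.frame(
  credits = sample(seq(12, 25.5, by = 0.5), n, replace = TRUE),
  school  = factor(rep(c("A", "E", "S"), times = c(6, 80, 40))),
  gender  = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  sem     = sample(1:8, n, replace = TRUE)
)
# Arbitrary generating model, capped at 4.0; chosen only so the code runs
sim$gpa <- pmin(4, 3.0 + 0.02 * sim$credits + rnorm(n, sd = 0.3))

# Full model, then AIC-based selection in both directions
full     <- lm(gpa ~ credits + school + gender + sem, data = sim)
stepwise <- step(full, direction = "both", trace = 0)

print(formula(stepwise))   # terms kept by the search
print(summary(stepwise))   # t-tests for individual coefficients
print(anova(stepwise))     # sequential F-tests for whole terms
```

For `lm` fits, `step()` ranks candidate models by `extractAIC()`, so a term is dropped whenever removing it lowers that criterion; it is not a significance test, which is why the `summary()` and `anova()` p-values are inspected separately afterwards.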

Important Variables
- Both forward and stepwise regression return credits and school as most important variables
- Gender and semester deemed insignificant using AIC
- Summary returns that credits is marginally significant (10%)
- ANOVA returns that school is marginally significant (10%)

Observations & Conclusions
- Intercept: 2.96
- Engineering majors: add 0.09
- Science majors: add 0.27
- Add 0.02 to GPA per credit
- At the same credit load, the science majors represented in our study average a GPA 0.18 points higher than the engineering majors
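The fitted equation can be checked with a short worked example. The function `predict_gpa` and the vector `coefs` are hypothetical names introduced here; the coefficient values are the rounded ones from the slide. Treating the third school (art) as the baseline level is an inference from the coefficient names `schoolE` and `schoolS`, not something the deck states.

```r
# Rounded coefficients from the slide; baseline school assumed to be art ("A")
coefs <- c(intercept = 2.96, schoolE = 0.09, schoolS = 0.27, credits = 0.02)

predict_gpa <- function(credits, school) {
  # school is one of "A" (assumed baseline), "E" (engineering), "S" (science)
  offset <- c(A = 0, E = coefs[["schoolE"]], S = coefs[["schoolS"]])
  coefs[["intercept"]] + unname(offset[school]) + coefs[["credits"]] * credits
}

eng <- predict_gpa(15, "E")  # 2.96 + 0.09 + 0.02 * 15
sci <- predict_gpa(15, "S")  # 2.96 + 0.27 + 0.02 * 15
print(c(engineering = eng, science = sci, gap = sci - eng))
```

Because the credit term is identical for both students, the gap equals the difference of the two school coefficients, 0.27 - 0.09 = 0.18, whatever credit load is plugged in.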

Recommendations
- Create a more refined study that allows us to focus on a specific area, rather than manipulating several variables at once
- Draw data from a significantly larger sample
- Find appropriate methodology to remove effect of lurking variables

Questions?