Marathon Miles per Hour at Napa Valley 2015 Marathon by Age and Gender

Gamma Regression

Data Description
Y = Race speed (miles per hour) at the Napa Valley 2015 Marathon (3/1/2015) for 1882 runners (977 M, 905 F)
Predictor variables:
  Age in years (18-76)
  Gender (1 = Male, 0 = Female)
  Age x Gender interaction
Distribution: Gamma (strictly positive values, skewed right)
Potential link functions:
  Inverse link (canonical)
  Log link

Gamma Distribution – Likelihood Function
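
One common way to write the Gamma likelihood for a GLM uses the mean and a dispersion parameter. The parametrization below is an assumption (the symbols \mu_i and \phi are not taken from the slide); it is the standard form in which E(Y_i) = \mu_i and Var(Y_i) = \phi \mu_i^2.

```latex
% Gamma density with mean \mu_i and dispersion \phi (shape 1/\phi, scale \mu_i \phi)
f(y_i;\mu_i,\phi) = \frac{1}{y_i\,\Gamma(1/\phi)}
  \left(\frac{y_i}{\mu_i\phi}\right)^{1/\phi}
  \exp\!\left(-\frac{y_i}{\mu_i\phi}\right), \qquad y_i > 0

% Log-likelihood for n independent runners
\ell(\boldsymbol{\mu},\phi) = \sum_{i=1}^{n}\left[
  \frac{1}{\phi}\log\!\left(\frac{y_i}{\mu_i\phi}\right)
  - \frac{y_i}{\mu_i\phi} - \log y_i - \log\Gamma(1/\phi)\right]
```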

Gamma Distribution – Inverse Link

Gamma Distribution – Inverse Link
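
Under the inverse link, the linear predictor models the reciprocal of the mean speed. A sketch for the predictors listed in the data description, with coefficient names \beta_0, ..., \beta_3 chosen here for illustration:

```latex
% Inverse (canonical) link: linear predictor equals 1/mean
\frac{1}{\mu_i} = \beta_0 + \beta_1\,\mathrm{Age}_i + \beta_2\,\mathrm{Gender}_i
  + \beta_3\,\mathrm{Age}_i \times \mathrm{Gender}_i

% Equivalent statement for the mean; the linear predictor must stay positive
\mu_i = \frac{1}{\beta_0 + \beta_1\,\mathrm{Age}_i + \beta_2\,\mathrm{Gender}_i
  + \beta_3\,\mathrm{Age}_i \times \mathrm{Gender}_i}
```

Because the model is linear in 1/\mu_i, a positive coefficient corresponds to a lower mean speed.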

Estimating φ
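
A common estimate of the Gamma dispersion parameter (written φ here) is the Pearson chi-square statistic divided by the residual degrees of freedom; this is also what R's summary() reports for a Gamma glm. The sketch below checks this on simulated stand-in data. All numeric values are hypothetical and are not the Napa results.

```r
# Simulated stand-in data (hypothetical values, not the Napa data)
set.seed(1)
n   <- 500
age <- sample(18:76, n, replace = TRUE)
mu  <- exp(1.85 - 0.002 * age)           # assumed true mean speed (mph)
phi <- 0.03                              # assumed true dispersion
mph <- rgamma(n, shape = 1 / phi, scale = mu * phi)

fit <- glm(mph ~ age, family = Gamma(link = "log"))

# Pearson-based estimate: sum of squared Pearson residuals / residual df
pearson.X2 <- sum(residuals(fit, type = "pearson")^2)
phi.hat    <- pearson.X2 / df.residual(fit)

# Dispersion reported by summary() for comparison
phi.summary <- summary(fit)$dispersion
```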

Gamma Distribution – Log Link
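
Under the log link, the linear predictor models the logarithm of the mean speed, so effects are multiplicative on the mph scale. A sketch with the same illustrative coefficient names \beta_0, ..., \beta_3 (assumed notation, not taken from the slide):

```latex
% Log link: linear predictor equals log(mean)
\log \mu_i = \beta_0 + \beta_1\,\mathrm{Age}_i + \beta_2\,\mathrm{Gender}_i
  + \beta_3\,\mathrm{Age}_i \times \mathrm{Gender}_i

% Mean on the original scale; always positive for any coefficient values
\mu_i = \exp\!\left(\beta_0 + \beta_1\,\mathrm{Age}_i + \beta_2\,\mathrm{Gender}_i
  + \beta_3\,\mathrm{Age}_i \times \mathrm{Gender}_i\right)
```

With Gender coded 0 for females, e^{\beta_1} is the multiplicative change in mean speed per additional year of age for females, and e^{\beta_1 + \beta_3} is the corresponding change for males.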

Results – Inverse Link
The main effects of Age and Gender are significant; the interaction is not. Deviance, goodness-of-fit, and likelihood ratio tests are described below.

Results – Log Link
The main effects of Age and Gender are significant; the interaction is not. Deviance, goodness-of-fit, and likelihood ratio tests are described below.

Deviance
Deviance measures the discrepancy between the observed and fitted values for a model. The models under both links (inverse and log) provide a good fit (both goodness-of-fit p-values > 0.50).
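
One way such a goodness-of-fit p-value can be obtained is to divide the deviance by the estimated dispersion and compare the result with a chi-square distribution on the residual degrees of freedom. Whether the slides used exactly this test is an assumption; the sketch below runs it on simulated stand-in data with hypothetical coefficient values.

```r
# Simulated stand-in data (hypothetical values, not the Napa data)
set.seed(2)
n    <- 400
age  <- sample(18:76, n, replace = TRUE)
male <- rbinom(n, 1, 0.5)                          # 1 = male, 0 = female
mu   <- exp(1.85 - 0.002 * age + 0.15 * male)      # assumed true mean (mph)
mph  <- rgamma(n, shape = 1 / 0.03, scale = mu * 0.03)

fit <- glm(mph ~ age + male, family = Gamma(link = "log"))

# Scaled deviance: residual deviance divided by the estimated dispersion
phi.hat    <- summary(fit)$dispersion
scaled.dev <- deviance(fit) / phi.hat

# Upper-tail chi-square p-value on the residual degrees of freedom
p.value <- pchisq(scaled.dev, df = df.residual(fit), lower.tail = FALSE)
```

Because the dispersion is estimated from the same residuals, the scaled deviance tends to sit close to its degrees of freedom, so a large p-value from this test is weak evidence on its own; residual plots are a useful complement.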

Additive Model Results – Log Link

R Program

# Read the Napa Valley 2015 marathon data
napaf2015 <- read.csv("http://www.stat.ufl.edu/~winner/data/napa_marathon_fm2015.csv", header=TRUE)
attach(napaf2015); names(napaf2015)
gender <- factor(Gender)

# Null model (family=Gamma uses the inverse link by default)
napa.mod1 <- glm(mph~1,family=Gamma); summary(napa.mod1); deviance(napa.mod1)

# Age only: inverse link, then log link
napa.mod2 <- glm(mph~Age,family=Gamma); summary(napa.mod2)
napa.mod3 <- glm(mph ~ Age, family=Gamma(link="log")); summary(napa.mod3)

# Gender only (inverse link)
napa.mod4 <- glm(mph~gender,family=Gamma); summary(napa.mod4)

# Additive models: inverse link, then log link
napa.mod5 <- glm(mph~Age + gender,family=Gamma); summary(napa.mod5)
napa.mod6 <- glm(mph ~ Age + gender, family=Gamma(link="log")); summary(napa.mod6)

# Interaction models: inverse link, then log link
napa.mod7 <- glm(mph~Age*gender,family=Gamma); summary(napa.mod7)
napa.mod8 <- glm(mph ~ Age*gender, family=Gamma(link="log")); summary(napa.mod8)

# Fitted curves by gender on the mph scale (log link, hard-coded coefficients)
age1 <- min(Age):max(Age)
yhat.F <- exp(1.8494178 - 0.0018116*age1)
yhat.M <- exp((1.8494178+0.1521938) - (0.0018116+0.0013388)*age1)
plot(Age,mph,col=gender)
lines(age1,yhat.F,col=1)
lines(age1,yhat.M,col=2)

# Likelihood ratio tests of the interaction (inverse link, then log link)
anova(napa.mod5,napa.mod7,test="Chisq")
anova(napa.mod6,napa.mod8,test="Chisq")

# Check linearity in Age on each link scale, separately by gender
par(mfrow=c(2,2))
plot(Age[Gender=="F"],log(mph[Gender=="F"]))
plot(Age[Gender=="M"],log(mph[Gender=="M"]))
plot(Age[Gender=="F"],1/mph[Gender=="F"])
plot(Age[Gender=="M"],1/mph[Gender=="M"])
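
The fitted curves above are built from coefficients typed in by hand. A possible alternative is to let predict() compute them from the fitted model, which avoids transcription errors. The sketch below uses simulated stand-in data with hypothetical values; the data frame names dat and grid are illustrative and not part of the original program.

```r
# Simulated stand-in data (hypothetical values, not the Napa data)
set.seed(3)
n   <- 300
dat <- data.frame(
  Age    = sample(18:76, n, replace = TRUE),
  gender = factor(sample(c("F", "M"), n, replace = TRUE))
)
mu      <- exp(1.85 - 0.002 * dat$Age + 0.15 * (dat$gender == "M"))
dat$mph <- rgamma(n, shape = 1 / 0.03, scale = mu * 0.03)

fit <- glm(mph ~ Age * gender, family = Gamma(link = "log"), data = dat)

# Grid of every age for each gender; type = "response" returns fitted mph
grid      <- expand.grid(Age = min(dat$Age):max(dat$Age),
                         gender = levels(dat$gender))
grid$yhat <- predict(fit, newdata = grid, type = "response")

# Hand computation for females (the reference level), for comparison
b             <- coef(fit)
ages.F        <- grid$Age[grid$gender == "F"]
yhat.F.manual <- exp(b[["(Intercept)"]] + b[["Age"]] * ages.F)
```

The same grid can then be passed to lines() for each gender, in place of the hard-coded yhat.F and yhat.M.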