Trashball: A Logistic Regression Classroom Activity Christopher Morrell (Joint work with Richard Auer) Mathematics and Statistics Department Loyola University.

Slides:



Advertisements
Similar presentations
732G21/732G28/732A35 Lecture computer programmers with different experience have performed a test. For each programmer we have recorded whether.
Advertisements

Simple Logistic Regression
Logistic Regression Example: Horseshoe Crab Data
Simple Linear Regression and Correlation
Objectives (BPS chapter 24)
Multiple Logistic Regression RSQUARE, LACKFIT, SELECTION, and interactions.
Logistic Regression Multivariate Analysis. What is a log and an exponent? Log is the power to which a base of 10 must be raised to produce a given number.
Statistics for Business and Economics
Statistics 350 Lecture 1. Today Course outline Stuff Section
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 14-1 Chapter 14 Introduction to Multiple Regression Basic Business Statistics 11 th Edition.
EPI 809/Spring Multiple Logistic Regression.
REGRESSION AND CORRELATION
An Introduction to Logistic Regression
Generalized Linear Models
Logistic regression for binary response variables.
Review for Final Exam Some important themes from Chapters 9-11 Final exam covers these chapters, but implicitly tests the entire course, because we use.
Unit 4c: Taxonomies of Logistic Regression Models © Andrew Ho, Harvard Graduate School of EducationUnit 4c – Slide 1
Logistic Regression In logistic regression the outcome variable is binary, and the purpose of the analysis is to assess the effects of multiple explanatory.
STAT E-150 Statistical Methods
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS & Updated by SPIROS VELIANITIS.
Unit 4c: Taxonomies of Logistic Regression Models © Andrew Ho, Harvard Graduate School of EducationUnit 4c – Slide 1
1 of 27 PSYC 4310/6310 Advanced Experimental Methods and Statistics © 2013, Michael Kalsher Michael J. Kalsher Department of Cognitive Science Adv. Experimental.
Inference for regression - Simple linear regression
Chapter 13: Inference in Regression
Simple Linear Regression
BPS - 3rd Ed. Chapter 211 Inference for Regression.
Chapter 14 Introduction to Multiple Regression
CHAPTER 14 MULTIPLE REGRESSION
Shonda Kuiper Grinnell College. Statistical techniques taught in introductory statistics courses typically have one response variable and one explanatory.
Statistics for clinicians Biostatistics course by Kevin E. Kip, Ph.D., FAHA Professor and Executive Director, Research Center University of South Florida,
© 2001 Prentice-Hall, Inc. Statistics for Business and Economics Simple Linear Regression Chapter 10.
Introduction to Linear Regression
April 6 Logistic Regression –Estimating probability based on logistic model –Testing differences among multiple groups –Assumptions for model.
Session 10. Applied Regression -- Prof. Juran2 Outline Binary Logistic Regression Why? –Theoretical and practical difficulties in using regular (continuous)
Introduction to Probability and Statistics Thirteenth Edition Chapter 12 Linear Regression and Correlation.
University of Warwick, Department of Sociology, 2014/15 SO 201: SSAASS (Surveys and Statistics) (Richard Lampard) Week 7 Logistic Regression I.
AN INTRODUCTION TO LOGISTIC REGRESSION ENI SUMARMININGSIH, SSI, MM PROGRAM STUDI STATISTIKA JURUSAN MATEMATIKA UNIVERSITAS BRAWIJAYA.
Chap 14-1 Copyright ©2012 Pearson Education, Inc. publishing as Prentice Hall Chap 14-1 Chapter 14 Introduction to Multiple Regression Basic Business Statistics.
April 4 Logistic Regression –Lee Chapter 9 –Cody and Smith 9:F.
Assessing Binary Outcomes: Logistic Regression Peter T. Donnan Professor of Epidemiology and Biostatistics Statistics for Health Research.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 13 Multiple Regression Section 13.3 Using Multiple Regression to Make Inferences.
Analysis of Two-Way Tables Moore IPS Chapter 9 © 2012 W.H. Freeman and Company.
BPS - 5th Ed. Chapter 221 Two Categorical Variables: The Chi-Square Test.
Multiple regression. Example: Brain and body size predictive of intelligence? Sample of n = 38 college students Response (Y): intelligence based on the.
Test of Goodness of Fit Lecture 43 Section 14.1 – 14.3 Fri, Apr 8, 2005.
Lecture 10 Chapter 23. Inference for regression. Objectives (PSLS Chapter 23) Inference for regression (NHST Regression Inference Award)[B level award]
Logistic Regression. Linear Regression Purchases vs. Income.
Multiple Logistic Regression STAT E-150 Statistical Methods.
1 1 Slide © 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole.
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 14-1 Chapter 14 Introduction to Multiple Regression Statistics for Managers using Microsoft.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 11 Analyzing the Association Between Categorical Variables Section 11.2 Testing Categorical.
Logistic Regression. Linear regression – numerical response Logistic regression – binary categorical response eg. has the disease, or unaffected by the.
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc.. Chap 14-1 Chapter 14 Introduction to Multiple Regression Basic Business Statistics 10 th Edition.
Heart Disease Example Male residents age Two models examined A) independence 1)logit(╥) = α B) linear logit 1)logit(╥) = α + βx¡
Logistic regression (when you have a binary response variable)
Introduction to Multiple Regression Lecture 11. The Multiple Regression Model Idea: Examine the linear relationship between 1 dependent (Y) & 2 or more.
1 Week 3 Association and correlation handout & additional course notes available at Trevor Thompson.
Multiple Regression Learning Objectives n Explain the Linear Multiple Regression Model n Interpret Linear Multiple Regression Computer Output n Test.
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.Chap 14-1 Statistics for Managers Using Microsoft® Excel 5th Edition Chapter.
Chapter 9 Minitab Recipe Cards. Contingency tests Enter the data from Example 9.1 in C1, C2 and C3.
BPS - 5th Ed. Chapter 221 Two Categorical Variables: The Chi-Square Test.
Jump to first page Inferring Sample Findings to the Population and Testing for Differences.
Beginners statistics Assoc Prof Terry Haines. 5 simple steps 1.Understand the type of measurement you are dealing with 2.Understand the type of question.
BPS - 5th Ed. Chapter 231 Inference for Regression.
BINARY LOGISTIC REGRESSION
John Loucks St. Edward’s University . SLIDES . BY.
Generalized Linear Models
Logistic Regression.
Introduction to Logistic Regression
Regression and Categorical Predictors
Presentation transcript:

Trashball: A Logistic Regression Classroom Activity Christopher Morrell (Joint work with Richard Auer) Mathematics and Statistics Department Loyola University Maryland CAUSE Webinar: February 28, 2012

Background  Statistical methods or linear models courses initially discuss continuous numerical response variables.  Can also consider response variables that are either binary or categorical.  More introductory linear models and statistical methods books now include chapters or sections devoted to logistic regression (for example, see Kleinbaum et al. (2008), Kutner et. al. (2003), Ott, and Longnecker (2010)).

Trashball: The Activity  Students attempt to toss a ball into a trashcan.  The outcome or response variable is whether or not the ball ends up in the trashcan - “ShotMade.”  Success depends on the distance from the trashcan (and other factors).

Equipment

Design and Explanatory Variables  Four factors in the design of the experiment in the Fall of 2003: o distance from the trashcan (from 5 to 12 feet), o orientation of the trashcan, o gender of the student, and o type of ball used (tennis ball or racquetball).

Design and Explanatory Variables (Continued)  The settings of the explanatory variables make up the design of the experiment.  The various factors should be balanced across the experiment so that interaction terms can be estimated.  This may be difficult to achieve; some of the students were absent on the day of the experiment.  To increase the sample size, each student makes three attempts from varying combinations of distance, orientation, and type of ball.  The repeated observations may induce some non- independence and this should be mentioned during the execution of the experiment.

Design Results Gender Male Female Total Raquet Tennis Total Gender Male Female Total Narrow Wide Total

Design Results Orientation Narrow WideTotal Raquet Tennis Total 21 42

Design Results Shot Distance (in feet) Total Raquet Tennis Total

Design Results Shot Distance (in feet) Total Narrow Wide Total

Design Results Shot Distance (in feet) Total Male Female Total

Classroom Presentation  As Trashball is described, the students begin to realize that the response variable has only two outcomes.  Assumptions of linear regression not valid.  Linear regression may lead to predictions that are negative or greater then one.  We are actually trying to model the probability of a success - the results must be values between 0 and 1.  Introduce the logistic regression function and explain its properties.  Enter the data into Minitab as the activity progresses.  The results are immediately displayed using a classroom projection system.

Results of Activity  The activity was conducted in subsequent offerings of Experimental Research Methods, a junior/senior level course taken by majors and minors.  Here I present the actual data collected from the activity in the fall of Figure 1. Shot Made vs. Distance. Jitter is added to the points to show the repeated observations. The Lowess curve is overlaid.

Minitab Output 1. Logistic regression for shot made with distance. Binary Logistic Regression: ShotMade versus Distance Link Function: Logit Response Information Variable Value Count ShotMade 1 25 (Event) 0 17 Total 42 Logistic Regression Table Odds 95% CI Predictor Coef SE Coef Z P Ratio Lower Upper Constant Distance

Minitab Output 1. (Continued) Log-Likelihood = Test that all slopes are zero: G = , DF = 1, P-Value = Goodness-of-Fit Tests Method Chi-Square DF P Pearson Deviance Hosmer-Lemeshow Table of Observed and Expected Frequencies: (See Hosmer-Lemeshow Test for the Pearson Chi-Square Statistic) Group Value Total 1 Obs Exp Obs Exp Total  The goodness of fit tests indicate that this model provides an adequate description of this data.

Fitted Logistic Model P(Shot made | distance x) = Figure 2. The fitted linear and logistic regression models.

Minitab Output 2. Multiple Logistic regression Binary Logistic Regression: ShotMade versus Distance, Ball, Orientation, Gender Logistic Regression Table Odds 95% CI Predictor Coef SE Coef Z P Ratio Lower Upper Constant Distance Ball Orientation Gender Log-Likelihood = Test that all slopes are zero: G = , DF = 4, P-Value = 0.001

Minitab Output 3. Final multiple logistic regression Binary Logistic Regression: ShotMade versus Distance, Orientation Logistic Regression Table Odds 95% CI Predictor Coef SE Coef Z P Ratio Lower Upper Constant Distance Orientation Log-Likelihood = Test that all slopes are zero: G = , DF = 2, P-Value = Goodness-of-Fit Tests Method Chi-Square DF P Pearson Deviance Hosmer-Lemeshow

Observed proportion and modeled probabilities by orientation and distance. Shot Distance (in feet) Observed Probability Wide/ Shallow Narrow/ Deep

Conclusions  Chapters or sections on logistic regression are appearing more frequently in texts on statistical methods/linear models.  Trashball may be used to motivate the use of logistic regression to model a binary response variable.  Our students had fun!  When conducting a mid-semester evaluation of the course, one student responded “More Trashball.”  Suggested modifications: o Use balls that are more different (tennis and table tennis). o Hand student uses to toss the ball (writing hand, other hand). JSE paper: