Copyright © 2011 Pearson Education, Inc. Association between Categorical Variables Chapter 5.



5.1 Contingency Tables: Which hosts send more buyers to Amazon.com? To answer this question, we must gather data on two categorical variables, Host and Purchase. Host identifies the originating site (MSN, RecipeSource, or Yahoo); Purchase indicates whether or not the visit results in a sale.

5.1 Contingency Tables: Consider Two Categorical Variables Simultaneously. A contingency table shows counts of cases on one categorical variable contingent on the value of another, for every combination of both variables. Cells in a contingency table are mutually exclusive.

5.1 Contingency Tables: Contingency Table for Web Shopping

5.1 Contingency Tables: Marginal and Conditional Distributions. Marginal distributions appear in the margins of a contingency table and represent the totals (frequencies) for each categorical variable separately. Conditional distributions refer to counts within a row or column of a contingency table (restricted to cases satisfying a condition).
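
The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the visit records are hypothetical counts invented for the example (they are not the web-shopping data from the slides), and the names `visits`, `cells`, and `conditional_purchase_given_host` are made up here.

```python
from collections import Counter

# Hypothetical (host, purchase) records; NOT the counts shown on the slides.
visits = (
    [("MSN", "Yes")] * 3 + [("MSN", "No")] * 97
    + [("RecipeSource", "Yes")] * 1 + [("RecipeSource", "No")] * 99
    + [("Yahoo", "Yes")] * 6 + [("Yahoo", "No")] * 194
)

# Contingency table: one count per (host, purchase) combination.
cells = Counter(visits)

# Marginal distributions: totals for each variable taken separately.
host_totals = Counter(host for host, _ in visits)
purchase_totals = Counter(purchase for _, purchase in visits)


def conditional_purchase_given_host(host):
    """Share of each Purchase value, restricted to visits from one host."""
    total = host_totals[host]
    return {p: cells[(host, p)] / total for p in purchase_totals}


for host in host_totals:
    dist = conditional_purchase_given_host(host)
    print(host, {p: f"{100 * share:.1f}%" for p, share in dist.items()})
```

Comparing the conditional percentages across hosts, rather than the raw counts, is what makes hosts with very different traffic volumes comparable.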

5.1 Contingency Tables: Conditional Distribution of Purchase for Each Host (Column Counts and Percentages)

5.1 Contingency Tables: Conditional Distribution. The conditional distribution reveals that the percentage of purchases among visitors from RecipeSource is much lower than for MSN and Yahoo. Host and Purchase are associated.

5.1 Contingency Tables: Segmented Bar Charts. Used to display conditional distributions. A segmented bar chart divides the bars in a bar chart into segments that are proportional to the percentage in each category of a second variable.

5.1 Contingency Tables: Contingency Table of Purchase by Region

5.1 Contingency Tables: Segmented Bar Chart Shows Association

5.1 Contingency Tables: Mosaic Plots. An alternative to the segmented bar chart. A mosaic plot is a plot in which the size of each tile is proportional to the count in a cell of a contingency table.
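
One way to see why tile size tracks the cell count is to compute the tile geometry directly. The sketch below assumes a common layout (column width equal to the column's marginal share, tile height equal to the conditional share within that column); the slides do not specify a layout, and the counts, labels, and the function name `mosaic_tiles` are hypothetical.

```python
# Hypothetical counts[column][row]; not the shirt data from the slides.
counts = {
    "Style A": {"Small": 20, "Medium": 50, "Large": 30},
    "Style B": {"Small": 10, "Medium": 30, "Large": 60},
}


def mosaic_tiles(counts):
    """Return {(column, row): (x, y, width, height)} inside the unit square."""
    n = sum(sum(col.values()) for col in counts.values())
    tiles = {}
    x = 0.0
    for column, col_cells in counts.items():
        column_total = sum(col_cells.values())
        width = column_total / n           # column width = marginal share
        y = 0.0
        for row, count in col_cells.items():
            height = count / column_total  # tile height = conditional share
            tiles[(column, row)] = (x, y, width, height)
            y += height
        x += width
    return tiles


for key, (x, y, w, h) in mosaic_tiles(counts).items():
    print(key, f"area={w * h:.3f}")
```

With this layout, each tile's area (width times height) equals the cell count divided by the grand total, so unequal tile heights across columns signal association.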

5.1 Contingency Tables: Contingency Table of Shirt Size by Style

5.1 Contingency Tables: Mosaic Plot Shows Association

4M Example 5.1: CAR THEFT. Motivation: Should insurance companies vary the premiums for different car models (are some cars more likely to be stolen than others)?

4M Example 5.1: CAR THEFT. Method: Data obtained from the National Highway Traffic Safety Administration (NHTSA) on car theft for seven popular models (two categorical variables: type of car and whether the car was stolen).

4M Example 5.1: CAR THEFT. Mechanics

4M Example 5.1: CAR THEFT. Message: The Dodge Intrepid is more likely to be stolen than other popular models. The data suggest that higher premiums for theft insurance should be charged for models that are more likely to be stolen.

5.2 Lurking Variables and Simpson's Paradox: Association Is Not Necessarily Causation. Lurking variable: a concealed variable that affects the apparent relationship between two other variables. Simpson's paradox: a change in the association between two variables when data are separated into groups defined by a third variable.
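
The reversal can be illustrated with a small numeric sketch. The counts below are hypothetical and chosen only to make the effect visible; the airline and destination labels are placeholders, not the airlines or data from the example that follows.

```python
# Hypothetical (on_time, flights) counts; NOT the data from the slides.
flights = {
    "Airline A": {"Destination X": (90, 100), "Destination Y": (600, 1000)},
    "Airline B": {"Destination X": (850, 1000), "Destination Y": (50, 100)},
}


def on_time_rate(pairs):
    """Pooled on-time rate over a collection of (on_time, flights) pairs."""
    pairs = list(pairs)
    return sum(o for o, _ in pairs) / sum(t for _, t in pairs)


# Association ignoring destination (the aggregated table).
overall = {airline: on_time_rate(d.values()) for airline, d in flights.items()}

# Association within each destination (the table split by the third variable).
by_destination = {
    dest: {airline: on_time_rate([flights[airline][dest]]) for airline in flights}
    for dest in flights["Airline A"]
}

print("overall:", {a: f"{r:.1%}" for a, r in overall.items()})
for dest, rates in by_destination.items():
    print(dest, {a: f"{r:.1%}" for a, r in rates.items()})
```

In this made-up data, Airline A has the higher on-time rate at each destination yet the lower rate overall, because most of its flights go to the destination where delays are common. Destination plays the role of the lurking variable.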

4M Example 5.2: AIRLINE ARRIVALS. Motivation: Does it matter which of two airlines a corporate CEO chooses when flying to meetings if he wants to avoid delays?

4M Example 5.2: AIRLINE ARRIVALS. Method: Data obtained from the US Bureau of Transportation Statistics on flight delays for two airlines (two categorical variables: airline and whether the flight arrived on time).

4M Example 5.2: AIRLINE ARRIVALS. Mechanics

4M Example 5.2: AIRLINE ARRIVALS. Mechanics: Is destination a lurking variable?

4M Example 5.2: AIRLINE ARRIVALS. Mechanics: This is Simpson's paradox.

4M Example 5.2: AIRLINE ARRIVALS. Message: The CEO should book on US Airways, as it is more likely to arrive on time regardless of destination.

5.3 Strength of Association: Chi-Squared Statistic. A measure of association in a contingency table. It is calculated by comparing the observed contingency table to an artificial table with the same marginal totals but no association.
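
The transcript does not preserve the formula itself, so the standard textbook form is given here as a reminder. The symbols are introduced for this note: n is the total number of cases, and "expected" is the count in the artificial no-association table for the cell in row i and column j.

```latex
\text{expected}_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{n},
\qquad
\chi^2 = \sum_{i,j} \frac{(\text{observed}_{ij} - \text{expected}_{ij})^2}{\text{expected}_{ij}}
```

When the observed table matches the no-association table exactly, every term is zero and the statistic is 0.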

5.3 Strength of Association: Contingency Table

5.3 Strength of Association: Calculating the Chi-Squared Statistic
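
The worked tables for this calculation did not survive in the transcript. As a stand-in, here is a minimal sketch of the standard computation (observed versus expected counts under no association). The function name `chi_squared` and the 2x2 counts are hypothetical, not the music-sharing data from the slides.

```python
def chi_squared(table):
    """Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were not associated,
            # keeping the same marginal totals as the observed table.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical 2x2 table of counts.
observed = [[30, 20], [20, 30]]
print(round(chi_squared(observed), 2))
```

The sketch assumes every row and column total is positive; a zero total would make an expected count zero and the division undefined.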

5.3 Strength of Association: Cramér's V. Derived from the chi-squared statistic. Ranges in value from 0 (variables are not associated) to 1 (variables are perfectly associated).
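
The slide does not show how V is derived, so the sketch below uses the usual definition, V = sqrt(chi-squared / (n × min(rows − 1, columns − 1))). The function name `cramers_v` and the input values are hypothetical and chosen only for illustration.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V from a chi-squared statistic, sample size, and table shape."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Hypothetical inputs: chi-squared of 9.0 from 100 cases in a 2x2 table.
print(cramers_v(9.0, 100, 2, 2))
```

Dividing by n and by the smaller table dimension minus one is what rescales chi-squared, which grows with sample size, onto the 0-to-1 range.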

5.3 Strength of Association: Calculating Cramér's V. V = 0.20 for our example. There is a weak association between group (students or staff) and attitude toward sharing copyrighted music.

5.3 Strength of Association: Checklist for Chi-Squared and Cramér's V. Verify that the variables are categorical. Verify that there are no obvious lurking variables.

4M Example 5.3: REAL ESTATE. Motivation: Do people who heat their homes with gas prefer to cook with gas as well? What heating systems and appliances should a developer select for newly built homes?

4M Example 5.3: REAL ESTATE. Method: The developer contacts homeowners to obtain the data. Two categorical variables: type of fuel used for home heating (gas or electric) and type of fuel used for cooking (gas or electric).

4M Example 5.3: REAL ESTATE. Mechanics: Chi-squared = 98.62; Cramér's V = 0.47.
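
As a rough consistency check on these two numbers (not a figure from the slides): both variables have two categories, so the table is 2×2 and, under the usual definition of Cramér's V, the factor min(rows − 1, columns − 1) equals 1. The sample size n is not given in the transcript; the value below is only what the two reported statistics imply.

```latex
V = \sqrt{\frac{\chi^2}{n \cdot \min(r-1,\, c-1)}} = \sqrt{\frac{\chi^2}{n}}
\quad\Longrightarrow\quad
n \approx \frac{\chi^2}{V^2} = \frac{98.62}{0.47^2} = \frac{98.62}{0.2209} \approx 446
```

Because V is rounded to two decimals, any sample size from roughly 437 to 456 homeowners would be consistent with the reported values.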

4M Example 5.3: REAL ESTATE. Message: Homeowners prefer gas to electric heat by about 2 to 1. The developer should build about two-thirds of new homes with gas heat. Put electric appliances in all homes with electric heat and in half of the homes with gas heat (assuming that buyers for new homes have the same preferences).

Best Practices: Use contingency tables to find and summarize association between two categorical variables. Be on the lookout for lurking variables. Use plots to show association. Exploit the absence of association.

Pitfalls: Don't interpret association as causation. Don't display too many numbers in a table.