Bivariate Statistics (continued) & Multivariate Statistics, Chapter 11


1 Bivariate Statistics (continued) & Multivariate Statistics, Chapter 11

2 Today’s Topics Bivariate Statistics (Measures of Association) Trivariate Statistics (control variables, partials) Constructing tables & interpreting descriptive statistics Different Styles of Presentation Preparation for Lab activities and work on Final Assignment

3 Recall (Lecture 2) *Types of variables* independent variable (cause); dependent variable (effect); intervening variable (occurs between the independent and the dependent variable temporally); control variable (temporal occurrence varies; illustrations later today)

4 Causal Relationships proposed for testing (NOT like assumptions) 5 characteristics of causal hypothesis (p.128) – at least 2 variables – cause-effect relationship (cause must come before effect) – can be expressed as prediction – logically linked to research question + a theory – falsifiable

5 Types of Correlations & Causal Relationships between Two Variables X = independent variable, Y = dependent variable. Positive Correlation (Direct relationship) – when X increases Y increases or vice versa. Negative Correlation (Indirect or inverse relationship) – when X increases Y decreases or vice versa. Independence – no relationship (null hypothesis). Co-variation – vary together (a type of association but not necessarily causal). (Diagrams: X and Y linked with a negative (-) sign; X and Y linked with a positive (+) sign.)

6 Interpreting a Relationship between two variables Do the patterns in the tables mean that there is a relationship between the two variables? If there is a relationship, how strong is it? Are the results statistically significant? Are the results meaningful ? Methods for studying relationships – Create Cross-tabulations & percentaged tables – Create Graphs, charts (like scattergrams or plots) – Use a set of statistics called measures of association

7 Using Graphs to show relations (royal wedding and electricity use)

8 Five Common Measures of Association between Two Variables

9 General Idea of Statistical Significance In general English ‘significance’ means important or meaningful but this is NOT how the term is used in statistics Tests of statistical significance show you how likely a result is due to chance.

10 Interpreting Statistical Significance Levels “The most common level, used to mean something is good enough to be believed, is .95. This means that the finding has a 95% chance of being true. However, this value is also used in a misleading way. No statistical package will show you "95%" or ".95" to indicate this level. Instead it will show you ".05," meaning that the finding has a five percent (.05) chance of not being true, which is the converse of a 95% chance of being true. To find the significance level, subtract the number shown from one. For example, a value of ".01" means that there is a 99% (1-.01=.99) chance of it being true.”

11 Example Source: http://www.surveysystem.com/signif.htm

12 Discussion of Example “Refer to the preceding table. The chi (pronounced kie like pie) squares at the bottom of the table show two rows of numbers. The top row numbers of 0.07 and 24.4 are the chi square statistics themselves. The meaning of these statistics may be ignored for the purposes of this article. The second row contains values .795 and .001. These are the significance levels.”

13 Discussion “In this table, there is probably no difference in purchases of gasoline X by people in the city center and the suburbs, because the probability is .795 (i.e., there is only a 20.5% chance that the difference is true). In contrast the high significance level for type of vehicle (.001 or 99.9%) indicates there is almost certainly a true difference in purchases of Brand X by owners of different vehicles in the population from which the sample was drawn.”
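As a rough sketch of where significance levels such as .795 and .001 come from, the following computes a chi-square statistic and its p-value for a 2x2 table. The counts are hypothetical (the example table is not reproduced in this transcript), the function name `chi_square_2x2` is illustrative, and the closed-form p-value used here applies only to a table with 1 degree of freedom. Strictly speaking, the p-value is the probability of seeing a difference at least this large by chance if there were no difference in the population.

```python
import math

def chi_square_2x2(table):
    """Return (chi-square statistic, p-value) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square tail probability has a closed form
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows = area (city centre, suburbs),
# columns = (bought Brand X, did not buy Brand X)
similar_groups = [[50, 50], [52, 48]]
different_groups = [[70, 30], [40, 60]]

chi2_similar, p_similar = chi_square_2x2(similar_groups)
chi2_different, p_different = chi_square_2x2(different_groups)
print(f"similar groups:   chi2={chi2_similar:.2f}, p={p_similar:.3f}")
print(f"different groups: chi2={chi2_different:.2f}, p={p_different:.5f}")
```

In the first hypothetical table the p-value is well above .05, so the small difference could easily be due to chance; in the second it is far below .001.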

14 Correlation Coefficient statistics which can help to describe data sets which contain variables measured at the interval and ratio levels. measures of association between two (or more) variables. tests whether a relationship exists between two variables. indicates both the strength of the association and its direction (direct or inverse). Pearson product-moment correlation coefficient, written as r, can describe a linear relationship between two variables.

15 Correlation Coefficient Is there a relationship between: the budget of the police department and the crime rate? the hours of batting practice and a player's batting average? The value of r ranges from -1.0 to +1.0: 0.0 indicates no linear relationship between the two variables, and values close to positive or negative 1.0 indicate a strong linear relationship between the two variables.
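A minimal sketch of how r could be computed by hand, assuming two equal-length lists of interval- or ratio-level measurements. The batting figures are hypothetical and the function name `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2 cases)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: weekly hours of batting practice and batting average
hours = [1, 2, 3, 4, 5]
average = [0.210, 0.230, 0.250, 0.260, 0.300]

r_batting = pearson_r(hours, average)
print(f"r = {r_batting:.2f}")  # positive: more practice goes with a higher average
```

A positive r here describes a direct relationship; reversing the order of one list would give a negative r of the same size.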

16 Reading and Constructing Tables Some Important Factors in Interpretation of Tables – percentages vs. “raw” frequencies, need to know absolute number of cases (N=) – Importance of direction of calculation of percentages (for bivariate and multivariate statistics) – collapsing categories (recall treatment of missing cases) – Standardization (using rates or ratios) to summarize data – Introducing a third variable (“intervening” or “control variable”) to see what effect this has on apparent relationship between two variables

17 Using Measures of Central Tendency & Standardization to Study Earnings by Sex Babbie, E. (1995). The practice of social research. Belmont, CA: Wadsworth

18 In his book on reading statistical tables, Say it with Figures, Hans Zeisel presents the following data:

Automobile Accidents by Sex
------------------------------------------
          Per Cent Accident Free
Women     68% (6,950)
Men       56% (7,080)
------------------------------------------

Automobile Accidents by Sex and Distance Driven
----------------------------------------------------------------
          Per Cent Accident Free, by Distance
          Under 10,000 km     10,000 km & Over
Women     75% (5,035)         55% (1,915)
Men       75% (2,070)         51% (5,010)
----------------------------------------------------------------

Women in this study have fewer accidents than men because women tend to drive shorter distances than men, and people who drive shorter distances tend to have fewer accidents.
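The effect of the control variable can be sketched by comparing the percentage-point gap between women and men before and after splitting by distance driven. The percentages are those transcribed from Zeisel's tables above; the function name `epsilon` is illustrative and simply takes the difference between the highest and lowest percentage.

```python
# Per cent accident free, as transcribed from Zeisel's tables
overall = {"Women": 68, "Men": 56}
by_distance = {
    "Under 10,000 km": {"Women": 75, "Men": 75},
    "10,000 km & over": {"Women": 55, "Men": 51},
}

def epsilon(percentages):
    """Percentage-point difference between the highest and lowest category."""
    values = list(percentages.values())
    return max(values) - min(values)

overall_gap = epsilon(overall)
partial_gaps = {distance: epsilon(pcts) for distance, pcts in by_distance.items()}

print(f"Gap without control: {overall_gap} points")
for distance, gap in partial_gaps.items():
    print(f"Gap among drivers {distance}: {gap} points")
```

The gap of 12 points in the two-variable table shrinks to 0 and 4 points within the two distance categories, which is the pattern the closing sentence of the slide describes.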

19 Discussion of other ways to present same data & talk about observations styles of presentation of cross-tabulations – Conventions & ease of interpretation of different ways of calculating percentages and presenting bivariate relations Examination of frequency distributions of univariate statistics Introduction of third variable Examples of ways of describing the relationships in words

20 Table 1a: Experience of Accidents in the Past Year by Gender of Driver (with column frequencies & percentages for independent variable) In this study 68% of the women had been accident free as compared to 56% of the men. Are women better drivers than men?

21 Table 1b: Automobile Accidents by Gender of Driver (with row frequencies & percentages for independent variable)

22 Commentary Based on findings from this survey female drivers are more likely than men to have been accident-free in the past year. The majority of women in the sample (68%) had not had accidents in the past year. A smaller percentage of male drivers (56%) declared they had been accident free in the past year.

23 Table 2a: Experience of Car Accidents by Gender of Driver (with row frequencies & percentages for dependent variable)

24 Table 2b: Experience of Car Accidents by Gender of Driver (with column frequencies & percentages for dependent variable)

25 Comments on Tables 2a & 2b Tables 2a and 2b present findings from this survey which indicate that 63% of drivers who experienced at least one car accident in the past year were male and only 37% were female. Does this mean men are more likely to experience car accidents than women?
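The direction in which percentages are calculated changes the question a table answers. The sketch below uses hypothetical counts (not the survey data behind the slides) and illustrative function names to contrast percentaging within the independent variable (gender) with percentaging within the dependent variable (accident experience).

```python
# Hypothetical counts, for illustration only
counts = {
    "Women": {"Accident free": 60, "At least one accident": 40},
    "Men": {"Accident free": 90, "At least one accident": 110},
}

def percent_within_groups(table):
    """Percentage of each group falling in each outcome (each group sums to 100)."""
    result = {}
    for group, outcomes in table.items():
        total = sum(outcomes.values())
        result[group] = {outcome: 100 * n / total for outcome, n in outcomes.items()}
    return result

def percent_within_outcomes(table):
    """Percentage of each outcome made up by each group (each outcome sums to 100)."""
    outcome_totals = {}
    for outcomes in table.values():
        for outcome, n in outcomes.items():
            outcome_totals[outcome] = outcome_totals.get(outcome, 0) + n
    return {
        outcome: {group: 100 * table[group][outcome] / total for group in table}
        for outcome, total in outcome_totals.items()
    }

within_gender = percent_within_groups(counts)
within_outcome = percent_within_outcomes(counts)

# "How likely is a driver of each gender to have had an accident?"
print(within_gender["Women"]["At least one accident"], within_gender["Men"]["At least one accident"])
# "What share of drivers with accidents are men?" (also reflects how many men are in the sample)
print(round(within_outcome["At least one accident"]["Men"], 1))
```

In these made-up counts men are about 73% of drivers with accidents, yet their own accident rate is 55% against 40% for women. The first figure is inflated by there being twice as many men in the sample, which is why percentages are normally calculated within categories of the independent variable when comparing likelihoods.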

26 Table 3a: Experience of Car Accidents by Gender of Driver and Distance Driven Annually

27 Comments on Table 3a If we control for “distance driven annually”, the difference between the percentages of accident-free drivers by gender disappears entirely for people who drive shorter distances annually. In the case of people who drive longer distances annually (10,000 km or more) there is only a small difference between men and women. Overall, people who drive shorter distances are more likely to be accident-free than people who drive longer distances, regardless of gender.

28 continued People who drove less than 10,000 km annually had the same likelihood of being accident free regardless of gender. Three quarters (75%) of female drivers who drove less than 10,000 km annually remained accident-free, as did three quarters (75%) of male drivers.

29 continued Female drivers who drove longer distances were only slightly more likely than male drivers to have remained accident-free: 55% of the females who drove 10,000 km annually or more had remained accident-free as compared to 51% of male drivers who drove 10,000 km or more.

30 Table 3b: Experience of car accident(s) by Gender of Driver and Distance Driven Annually (row percentages—but a different breakdown)

31 Comments on Table 3B In this study male drivers who drove longer distances comprise 55% of drivers who had experienced at least one accident in the past year. What would you say about other patterns in this table concerning the trends related to the three variables? (Having at least one accident last year, gender and distance driven?)

32 Table 1a: Frequency Distribution of Gender of Drivers Males represent a slightly higher proportion of the respondents (55%) than females (45%) in the sample survey.

33 Table 1b: Frequency Distribution of Drivers with and without accidents in the past year The majority of drivers (61%) had not had any accidents in the past year.

34 Table1c: Frequency Distribution of Distance Driven Annually Over half of the drivers in the sample (59%) drive 10,000 km or over annually.

35 Table 2c : Experience of Car Accident(s) in Past Year by Distance Driven Annually (row percentages) Driving longer distances annually increased the likelihood of experiencing at least one car accident in the past year. Almost three quarters of drivers who experienced at least one accident in the past year drove longer distances.

36 More comments on 2c Driving shorter distances decreased the likelihood of having had a car accident in the past year. However, there were more people in the sample who drove longer distances (13,391 in the category “10,000 km or more” as compared to 9,473 who drove less than 10,000 km annually). The percentage of drivers who drove longer distances was larger: 59% of the drivers in the survey drove 10,000 km or more.

37 Table 2d: Distance Driven Annually by Gender (Column percentages)

38 Comments on Table 2d Women drivers in the sample drove shorter distances annually than men. Two-thirds of women in the sample (66%) drove less than 10,000 km annually, while 78% of men in the sample drove 10,000 km or more.

39 Table 2e: Distance Driven Annually by Gender (row percentages)

40 Table 3: Experience of car accident(s) by Gender of Driver and Distance Driven Annually (column percentages)

41 Table 3: Experience of Car Accidents by Gender of Driver and Distance Driven Annually

42 Comments on 2e The majority of drivers who drove less than 10,000 km annually were women (71%). Almost three quarters (74%) of drivers who drove 10,000 km or more were male.

43 Comparison with Zeisel’s Tables

Automobile Accidents by Sex
------------------------------------------
          Per Cent Accident Free
Women     68% (6,950)
Men       56% (7,080)
------------------------------------------

Automobile Accidents by Sex and Distance Driven
----------------------------------------------------------------
          Per Cent Accident Free, by Distance
          Under 10,000 km     10,000 km & Over
Women     75% (5,035)         55% (1,915)
Men       75% (2,070)         51% (5,010)
----------------------------------------------------------------

Women in this study have fewer accidents than men because women tend to drive shorter distances than men, and people who drive shorter distances tend to have fewer accidents.

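44 Ways of Presenting Percentaged Tables

Table 1. Percentage in support of strike by type of school
----------------------------------------------------------
Type of School     Percent supporting strike
Secondary          60% (800)
Elementary         30% (1000)
----------------------------------------------------------
N = 1800

Parts of the table labelled on the slide: serial number (“Table 1.”); descriptive caption; dependent variable (percent supporting strike); independent variable (type of school); categories of the independent variable (Secondary, Elementary); one category of the dichotomous dependent variable; marginals for the independent variable (the counts in parentheses); total sample (N = 1800).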

45 Percentaged Tables (cont’d)

Table 2. Percentage who support strike by type of school and sex
----------------------------------------------------------------
                   Sex
                   Female                  Male
Type of School     Per cent supporting     Per cent supporting
                   strike                  strike
Secondary          60% (400)               60% (400)
Elementary         30% (900)               30% (100)
----------------------------------------------------------------
Female = .30 : Male = .30          N = 1800

Parts of the table labelled on the slide: dependent variable; independent variable; control variable; categories of control variable.

46 Using SPSS to Construct Tables General portal to SPSS tutorials online: http://www.spsstools.net/spss.htm Crosstabs (two variables) and ‘layering’ (introducing variables): http://calcnet.mth.cmich.edu/org/spss/V16_materials/Video_Clips_v16/14crosstabs/14crosstabs.swf

47 Example of Raw Data Partials & Reading Marginals Regan, T. (1985). In search of sobriety: Identifying factors contributing to the recovery from alcoholism. Kentville, NS.

48 Another Example Research Question: Is TRUST related to RACE? TRUST = dependent variable RACE = independent variable Source: http://www.ssric.org/trd/text/chapter8

49 Comments At first glance, RACE differences appear to be very important (overall, 58% of those surveyed said people cannot be trusted, but the epsilon statistic -- the difference between the highest and lowest percentage -- is 22). Also note that few Respondents said “Depends” – most had a definite opinion here.

50 Recoding “Let’s do some recoding: RACE should be recoded into a different variable called RACER (Race Recoded). Whites and Blacks will stay the same, but Other is eliminated by recoding it as missing. Let’s also recode TRUST into a different variable called TRUSTR to eliminate the “Depends” category. Don’t forget to create new value labels after you recode. Now run the crosstabs for TRUSTR and RACER. Your output should look like Figure 8-4.” (See http://www.ssric.org/trd/text/chapter8 for details)
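The recoding step is done in SPSS in the source, but the same logic can be sketched in a few lines of Python. The category labels, the sample cases, and the `recode` helper are all hypothetical; only the variable names RACE, TRUST, RACER and TRUSTR come from the quoted exercise.

```python
MISSING = None  # stand-in for a system-missing value

def recode(values, mapping):
    """Map each original code to its new code; codes not in the mapping become missing."""
    return [mapping.get(value, MISSING) for value in values]

# Hypothetical cases (labels are illustrative, not the survey's actual codes)
race = ["White", "Black", "Other", "White", "Black"]
trust = ["Can trust", "Cannot trust", "Can trust", "Depends", "Cannot trust"]

# RACER keeps White and Black; "Other" is recoded as missing
racer = recode(race, {"White": "White", "Black": "Black"})
# TRUSTR drops the "Depends" category
trustr = recode(trust, {"Can trust": "Can trust", "Cannot trust": "Cannot trust"})

# A crosstab of TRUSTR by RACER uses only cases that are valid on both variables
valid_cases = [
    (r, t) for r, t in zip(racer, trustr)
    if r is not MISSING and t is not MISSING
]
print(valid_cases)
```

Recoding into new variables, rather than overwriting RACE and TRUST, keeps the original codes available if a different grouping is wanted later.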

51 Example: Race & Trust Recoded

52 Comments “When "pp", a percentage point difference (epsilon), is this high, it’s “interesting” (actually, anything higher than 10-12 is interesting) even if you don't yet know whether it is statistically significant. Here you have a pp difference of 24. And here’s how you might describe what you’ve found so far: “Although most Respondents (62%) say that other people cannot be trusted, over 80% of the Black respondents said this compared to 58% of the Whites in this sample.” Or, “Fewer than one-fifth of Blacks said that people can be trusted, compared to more than two-fifths of Whites.””

53 Question Can you have confidence that race is the causal factor here? While it may indeed be true that race is explanatory, you won't really have confidence in this conclusion until you have failed to account for this variation in any other way.

54 Introducing a Control Variable – In the next example EDUC was recoded as EDUC2 into two categories, those with high school or less (0-12 years), and those with more than high school (13+ years). After these recodes, let's see what happens when we do crosstabs – You still have the two columns of your independent variable (RACER), but you can compare TRUSTR for people who have no college education (0-12 years) with those who do (13+ years).
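Layering a crosstab by a control variable amounts to building a separate two-variable table for each category of the control. A minimal sketch, assuming each case is an (EDUC2, RACER, TRUSTR) tuple; the cases, the labels, and the function name `layered_percentages` are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from collections import defaultdict

def layered_percentages(records, target):
    """For each (control, independent) cell, the percentage of cases
    whose dependent-variable value equals `target`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for control, independent, dependent in records:
        totals[(control, independent)] += 1
        if dependent == target:
            hits[(control, independent)] += 1
    return {cell: 100 * hits[cell] / totals[cell] for cell in totals}

# Hypothetical cases: (EDUC2, RACER, TRUSTR), ten per cell
records = (
    [("0-12 years", "White", "Can trust")] * 3
    + [("0-12 years", "White", "Cannot trust")] * 7
    + [("0-12 years", "Black", "Can trust")] * 1
    + [("0-12 years", "Black", "Cannot trust")] * 9
    + [("13+ years", "White", "Can trust")] * 5
    + [("13+ years", "White", "Cannot trust")] * 5
    + [("13+ years", "Black", "Can trust")] * 3
    + [("13+ years", "Black", "Cannot trust")] * 7
)

trusting = layered_percentages(records, "Can trust")
for (education, race), pct in sorted(trusting.items()):
    print(f"{education:<11} {race:<6} {pct:.0f}% say people can be trusted")
```

Reading across race within one education level shows the effect of race holding education constant; reading across education levels within one race shows the effect of education holding race constant.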

55 Trust by Race Controlling for Education

56 Comments "Whites are more likely than blacks to think people can be trusted holding education constant (50.1% vs. 31.3% and 32.3% vs. 9.3%). Those with more education are more likely than those with less education to say people can be trusted holding race constant (50.1% vs. 32.3% and 31.3% vs. 9.3%). Both education and race are related to trust of people."

57 Comments “So what is more important, race or years of education? Just as you can’t stop with a crosstab of only two variables when you want to test out your hypotheses, you also can’t stop with just one control variable. Some of the other “major demographic variables” that might explain social differences include sex, social class, income, occupation, marital status, age, political ideology, and religion.”

58 Discussion of Lab Activities & Demonstration of Accessing Data from Class Survey

