Week 3 Lecture Notes PSYC2021: Winter 2019.


1 Week 3 Lecture Notes PSYC2021: Winter 2019

2 The following program (set of slides) is rated low intensity in terms of introducing or reviewing statistical symbols. However, viewer discretion is advised.

3 Frequency Distribution
The pattern of variation of a variable is called its “distribution”. A frequency distribution is an organized tabulation of the number of individuals (number of observations) located in each category on the scale of measurement. It takes a disorganized set of scores and places them in order from highest to lowest, grouping together individuals who all have the same score.
A frequency distribution can be structured either as a table or as a graph, but in either case, the distribution presents the same two elements:
- The set of categories that make up the original measurement scale.
- A record of the frequency, or number of individuals, in each category.
Thus, a frequency distribution presents a picture of how the individual scores (observations) are distributed on the measurement scale.

4 Describing Categorical Data
Descriptive Statistics:
- Frequency: counts
- Relative Frequency (Proportion): frequency (count) of a category divided by the total count. Note: a proportion is a number between 0 and 1.
- Percentage: proportion (relative frequency) x 100
Table: Frequency Distribution Table (e.g., frequency; relative frequency (proportion); percentage). Additionally, we can obtain cumulative percentages (see an example in upcoming slides).
Graphical Display:
- Bar Chart (separate bars to display distinct categories)
- Pie Chart (displays categories as slices of a pie; fractions of a whole)

5 Describing A Categorical Data: Canadian Data Set Attitudes Toward Learning (ASETS, 2008)
The Access and Support to Education and Training Survey (ASETS) 2008 asked a random sample of Canadians aged 18 to 64: “To what extent do you agree or disagree with the statement: Learning New Things is Fun.” The responses ranged from 1 to 5 (1 = Strongly Agree, 2 = Somewhat Agree, 3 = Somewhat Disagree, 4 = Strongly Disagree, 5 = Neither Agree nor Disagree).
We can access this data from: Computing in the Humanities and Social Sciences - U of T
Click on: Access and Support to Education and Training Survey, 2008 (ASETS)

6 A Categorical Data: Attitudes Toward Learning (ASETS, 2008)
Click on Data

7 A Categorical Data: Attitudes Toward Learning (ASETS, 2008)
Click on Codebooks > SDA codebooks

8 A Categorical Data: Attitudes Toward Learning (ASETS, 2008)
Click on Sequential Variable List

9 Attitudes Toward Learning New Things is Fun (ASETS, 2008)
Click on Attitudes Towards Learning

10 Attitudes Toward Learning New Things is Fun (ASETS, 2008)
Click on item: al_g Learning new things is fun
What do you think Canadians’ responses to this item were? Do you think they all agreed, or only some of them?

11 Attitudes Toward Learning New Things is Fun (ASETS, 2008)
Do you expect to obtain the same answers (responses) from a different sample of Canadians in 2008? Do you expect to obtain the same responses from the same sample of Canadians in 2018?

12 Summarizing and Describing a Categorical Variable: Attitudes Toward Learning New Things is Fun (ASETS, 2008)
For Variable Selection: Click on “Attitudes toward learning”, select “al_g02”, and then “Copy to Row”.
On the right-side menu: Change Weight to “No Weight” and, for Chart Option, select “Bar chart”. You can also select the “Question text”.
Click on “Run the Table”.

13 Frequency Distribution Table Attitudes Toward Learning New Things is Fun (ASETS, 2008)
The majority (66.8%) of the respondents strongly agreed with the statement.

14 Frequency Distribution Table Attitudes Toward Learning New Things is Fun (ASETS, 2008)
Frequency Table: Count the number of cases corresponding to each category and put them into a table. A frequency table records the totals and uses the category names to label each row. The table on the right describes the distribution of Canadian responses to the statement “Learning new things is fun”, because it names the possible categories and tells how frequently each occurs (how cases are distributed across the categories). Example: 15,712 participants strongly agreed with the statement.
Relative Frequency: Divide the count by the total number of cases. This gives the fraction (proportion) of the whole. Example: 15712/23519 = 0.668
Percentage: Multiply the proportion by 100. Example: 0.668 x 100 = 66.8%
The majority (66.8%) of the respondents strongly agreed with the statement.
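As a quick check of the arithmetic on this slide, here is a minimal Python sketch (it is not the R code used in these slides). The Strongly Agree count and the total come from this slide; the other four counts are assumed from the column totals of the contingency table on slide 42, and all variable names are illustrative.

```python
# Counts per response category (assumed from the slide 42 column totals).
counts = {
    "Strongly Agree": 15712,
    "Somewhat Agree": 7317,
    "Somewhat Disagree": 217,
    "Strongly Disagree": 82,
    "Neither Agree Nor Disagree": 191,
}

# Total number of valid responses.
total = sum(counts.values())

# Relative frequency (proportion) = count / total; percentage = proportion x 100.
rel_freq = {category: n / total for category, n in counts.items()}
percent = {category: 100 * p for category, p in rel_freq.items()}

# Print a small frequency distribution table.
print(f"{'Category':<28}{'Count':>7}{'Prop.':>8}{'Pct.':>8}")
for category, n in counts.items():
    print(f"{category:<28}{n:>7}{rel_freq[category]:>8.3f}{percent[category]:>7.1f}%")
print(f"{'Total':<28}{total:>7}")
```

The proportions should sum to 1 and the percentages to 100, which is a useful sanity check on any frequency table.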

15 Visualizing Categorical Variable with Bar Chart: Bar Chart of Attitudes Toward Learning New Things is Fun (ASETS, 2008)
Bar Chart: Displays the distribution of a categorical variable. It shows the frequency (count) for each category next to each other for easy comparison.
- The height of the bar shows the count for its category.
- It is better to have spaces between bars to indicate that these are freestanding bars that could be arranged in any order.
- The bars are the same width, so their heights determine the areas. These areas are proportional to the counts in each category.
Note: The bar chart stays true to the Area Principle.
Area Principle: The area occupied by a part of the graph should correspond to the magnitude of the value it represents.

16 Visualizing Categorical Variable with Pie Chart: Attitudes Toward Learning New Things is Fun (ASETS, 2008)
Pie Chart: Displays the whole group of cases as a circle. It slices the circle into pieces whose sizes are proportional to each category's fraction of the whole.
The majority (66.8%) of the respondents strongly agreed with the statement.

17 What could be a possible visual problem (or a confusion) with this bar graph?

18 Important Consideration Regarding Bar Graphs
There should be spaces between adjacent bars. For nominal scales, separate bars emphasize that the scale consists of separate, distinct categories. For ordinal scales, separate bars are used because you cannot assume that the categories are all the same size.

19 Export Learning New Things is Fun Data in a Text File
The link to SDA at CHASS to access this data:

20 Export Learning New Things is Fun Data in a Text File
In data file, choose: “csv file”.
At the bottom of this page, select variables from two components:
- Demographic variables
- Attitudes towards learning
Click on continue at the bottom of this page.

21 Export Learning New Things is Fun Data in a Text File
For demographic variables, choose: Sexes: sex of the main respondents
For Attitudes toward learning, choose: al_g02: learning new things is fun
Click on continue at the bottom of this page.

22 Export Learning New Things is Fun Data in a Text File
Click on Create the Files. Note the “Files to create” and the “individual variables specified (including partial groups)”. You should see the variables of interest to export in CSV file format.

23 Export Learning New Things is Fun Data in a Text File
Right click on Data file, and save link as … Note: this format might be different on a Mac. I saved the file as “Learning_New_Fun”.

24 Export Learning New Things is Fun Data in a Text File
Open the saved excel file (CSV file)

25 Export Learning New Things is Fun Data in a Text File
Save the CSV file as a Text file (I believe this works better for Mac users when reading data in R).

26 Export Learning New Things is Fun Data in a Text File
Put the text file “learning_new_fun.txt” into your Rdata folder so that R on your computer can locate it. Note: I created a sub-folder in my Rdata folder and named it PSYC. I put the data sets for our course in my Rdata/PSYC path. You don’t need to do this; but, if you do, just make sure that RStudio is pointing to the PSYC folder. In other words, set your working directory in R by going to Tools > Global Options and browsing to the folder where you store the data sets.

27 Read (Import) Learning New Things is Fun Data into R

28 Frequency Distribution Table in R Responses to Learning New Things is Fun

29 Cleanup the Frequency Distribution Table in R Valid Responses to Learning New Things is Fun

30 Change a Variable Name in R: Change Name from al_g02 to Learning.New.Fun

31 Change a Category Names in R Relabel Categories in Learning.New.Fun

32 Obtain Counts: Add Margins to a Table in R

33 Relative Frequency Distribution in R Proportion of Participants Responding
66.81% of respondents (about 67%) strongly agreed with the statement that “learning new things is fun”. 31.11% of respondents (about 31%) somewhat agreed with the statement that “learning new things is fun”. Cumulative Percentage: 97.92% (about 98%) either strongly agreed or somewhat agreed with the statement: “learning new things is fun”.
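The cumulative percentage quoted above is a running total of the category percentages, taken in the order the categories are coded. A minimal Python sketch of that calculation follows (not the R code from the slides); the counts other than Strongly Agree are assumed from the column totals of the slide 42 contingency table, and the names are illustrative.

```python
from itertools import accumulate

# (label, count) pairs in coding order 1..5; counts assumed from slide 42 totals.
counts = [
    ("Strongly Agree", 15712),
    ("Somewhat Agree", 7317),
    ("Somewhat Disagree", 217),
    ("Strongly Disagree", 82),
    ("Neither Agree Nor Disagree", 191),
]

total = sum(n for _, n in counts)

# Running totals of the counts, converted to cumulative percentages.
running_counts = list(accumulate(n for _, n in counts))
cumulative_pct = [100 * r / total for r in running_counts]

for (label, n), cum in zip(counts, cumulative_pct):
    print(f"{label:<28}{100 * n / total:>7.2f}%{cum:>8.2f}%")
```

Accumulating the counts first and dividing once per category avoids adding up rounded percentages, so the last cumulative value comes out at exactly 100.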

34 Obtain Bar Plot in R Bar Plot of Responses to Learning New Things is Fun

35 Bar Plot of Responses to Learning New Things is Fun in R using Colors, Labels

36 Obtain Pie Chart in R Pie Chart of Responses to Learning New Things is Fun

37 Summarizing and Describing a Categorical Variable (ASETS, 2008) Sex of the Respondents
The link to SDA at CHASS to access this data: For Variable Selection: Click on “Demographic Variables” and select “sexs” and then “Copy to Row” On the right-side menu: Change Weight to “No Weight” and for Chart Option, select “Bar chart”.

38 Summarizing and Describing a Categorical Variable (ASETS, 2008) SDA output: Frequency Distribution of Sex of the Respondents

39 Frequency Distribution Table in R Sex of the Respondents

40 Bar Plot of Frequency of Sex of the Respondents in R

41 Exploring Relationship Between Two Categorical Variables
Use either the row or the column percentages to compare the groups. That is, find the conditional distribution of one variable within each level of the other variable.
When the distribution of one variable differs across the categories of the other variable, we say that the variables are dependent (the variables are associated; the variables are related).
When the distribution of one variable is the same for all categories of the other variable, we say that the variables are independent (the variables are not associated; the variables are not related).
Note: The points made above are an informal method of comparing distributions. We will see a formal way of testing for independence in chapter 12 (Significance test regarding the independence of two categorical variables: Chi-square test of independence).

42 Joint Frequency Distribution of Responses and Sex of Respondents
Contingency Table: Cross-tabulations
Contingency Table: Classification with respect to two categorical variables. Cross-tabulations (crosstabs) are joint frequency distributions of two categorical variables. One can be considered an explanatory variable and the other a response variable, if you like.
The data are summarized in the two-way table below. This table is called a 2 x 5 (read as “2-by-5”) contingency table (two rows and five columns). It presents count data classified on two scales, or dimensions, of classification: Sex of respondents, and Attitudes Toward Learning New Things is Fun.

Sex      Strongly  Somewhat  Somewhat  Strongly  Neither Agree   Total
         Agree     Agree     Disagree  Disagree  Nor Disagree
Male       6745      3812       126        47        112         10842
Female     8967      3505        91        35         79         12677
Total     15712      7317       217        82        191         23519

43 Association Between Opinion Regarding Attitudes Toward Learning New Things is Fun and the Sex of the Respondents
Research Question (the following questions address the same investigation):
- Is there an association between attitudes toward learning new things is fun and the sex of the respondents?
- Is there a relationship between attitudes toward learning new things is fun and the sex of the respondents?
- Do opinions regarding attitudes toward learning new things is fun depend on the sex of the respondents?
- Do opinions regarding attitudes toward learning new things is fun differ between males and females?
- Do males and females differ in their conditional distribution on attitudes toward learning new things is fun?
Response variable: attitudes toward learning new things is fun. Type: Categorical (Strongly Agree, Somewhat Agree, Somewhat Disagree, Strongly Disagree, Neither Agree Nor Disagree)
Explanatory variable: Sex of the respondents. Type: Categorical (Male, Female)

44 Examine Association Between Two Categorical Variables: Example of Conditional Distribution of Responses
Consider the conditional distribution of responses regarding learning new things is fun on sex of the respondents.
The link to SDA at CHASS to access this data:

45 Examine Association Between Two Categorical Variables: Compare Distribution of Responses for Learning New Things is Fun for Males and Females
The SDA output for this analysis: the conditional distribution of responses regarding learning new things is fun on sex.
How do the response percentages regarding the statement learning new things is fun differ between males and females? Compare the row percentages.

46 Examine Association Between Two Categorical Variables: Compare Distribution of Responses for Learning New Things is Fun for Males and Females
The SDA output for this analysis: the conditional distribution of responses regarding learning new things is fun on sex.
How do the response percentages regarding the statement learning new things is fun differ between males and females? Compare the row percentages.
- Women (70.7%) are more likely than men (62.2%) to strongly agree with the statement.
- However, men (35.2%) are more likely than women (27.6%) to somewhat agree with the statement.
- There is not much of a difference between the sexes in the likelihood of somewhat disagreeing, strongly disagreeing, or neither agreeing nor disagreeing.

47 Construct a Contingency Table in R Conditional Distribution of Responses to Learning New Things is Fun on Sex

48 Add Margins to a Contingency Table in R
The third row shows: Unconditional Distribution of Responses to Learning New Things (Regardless of the sex of the respondents) The last column shows: Unconditional Distribution of Sex – that is the count only for males and females (Regardless of the responses)
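The margins described above are just row sums and column sums of the contingency table. Here is a minimal Python sketch of that step (not the R code from the slides). The cell counts are taken from the slide 42 table, with the male Somewhat Agree count assumed to be 3812 so that the male row adds up to its stated total of 10842; names are illustrative.

```python
categories = [
    "Strongly Agree", "Somewhat Agree", "Somewhat Disagree",
    "Strongly Disagree", "Neither Agree Nor Disagree",
]

# Cell counts by sex, in the same order as `categories`.
table = {
    "Male":   [6745, 3812, 126, 47, 112],
    "Female": [8967, 3505, 91, 35, 79],
}

# Last column: unconditional (marginal) distribution of sex.
row_totals = {sex: sum(row) for sex, row in table.items()}

# Third row: unconditional (marginal) distribution of responses.
col_totals = [sum(col) for col in zip(*table.values())]

# Bottom-right cell: number of valid responses overall.
grand_total = sum(row_totals.values())

for sex, row in table.items():
    print(sex, row, row_totals[sex])
print("Total", col_totals, grand_total)
```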

49 Construct a Clustered Bar Chart in R Conditional Distribution of Responses to Learning New Things is Fun on Sex

50

51 Construct a Clustered Bar Chart in R Conditional Distribution of Responses to Learning New Things is Fun on Sex

52 Calculate Joint Percentages in R
Example: The percentage of respondents who are male and strongly agreed with the statement learning new things is fun: (6745/23519) = 0.287 x 100 ≅ 29%
There were 23,519 participants (valid responses).
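A joint percentage divides each cell count by the grand total, so all ten cells together add to 100%. The following Python sketch illustrates this under the same assumptions as the table on slide 42 (male Somewhat Agree taken as 3812); it is not the R code from the slides, and the names are illustrative.

```python
# Cell counts by sex; columns are Strongly Agree, Somewhat Agree,
# Somewhat Disagree, Strongly Disagree, Neither Agree Nor Disagree.
table = {
    "Male":   [6745, 3812, 126, 47, 112],
    "Female": [8967, 3505, 91, 35, 79],
}

grand_total = sum(sum(row) for row in table.values())

# Joint percentage: each cell as a share of ALL valid responses.
joint_pct = {
    sex: [100 * n / grand_total for n in row]
    for sex, row in table.items()
}

for sex, row in joint_pct.items():
    print(sex, [round(p, 1) for p in row])
```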

53 Calculate Row Percentages in R (Row Percentages Add to 100%): Conditional Distribution of Responses to Learning New Things is Fun on Sex
Examples:
The percentage of male respondents who strongly agreed with the statement is: (6745/10842) = 0.622 x 100 ≅ 62%
Among females, the conditional percentage of respondents who somewhat agreed is: (3505/12677) = 0.276 x 100 ≅ 28%
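Row percentages divide each cell by its own row total, which gives the distribution of responses within each sex. A minimal Python sketch, again assuming the slide 42 counts (male Somewhat Agree taken as 3812) and using illustrative names rather than the R code from the slides:

```python
# Cell counts by sex; columns are Strongly Agree, Somewhat Agree,
# Somewhat Disagree, Strongly Disagree, Neither Agree Nor Disagree.
table = {
    "Male":   [6745, 3812, 126, 47, 112],
    "Female": [8967, 3505, 91, 35, 79],
}

# Row percentage: each cell as a share of its own row (sex) total.
row_pct = {
    sex: [100 * n / sum(row) for n in row]
    for sex, row in table.items()
}

for sex, row in row_pct.items():
    # Each row should add to 100%.
    print(sex, [round(p, 1) for p in row], round(sum(row), 1))
```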

54 Calculate Column Percentages in R (Column Percentages Add to 100%): Conditional Distribution of Sex on Responses to Learning New Things is Fun
Examples:
Among respondents who strongly agreed, the percentage that were male is: (6745/15712) = 0.429 x 100 ≅ 43%
Among respondents who somewhat agreed, the conditional percentage that were female is: (3505/7317) = 0.479 x 100 ≅ 48%
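Column percentages divide each cell by its column total instead, which gives the distribution of sex within each response category. A minimal Python sketch under the same assumptions (slide 42 counts, male Somewhat Agree taken as 3812, illustrative names, not the R code from the slides):

```python
# Cell counts by sex; columns are Strongly Agree, Somewhat Agree,
# Somewhat Disagree, Strongly Disagree, Neither Agree Nor Disagree.
table = {
    "Male":   [6745, 3812, 126, 47, 112],
    "Female": [8967, 3505, 91, 35, 79],
}

# Column totals: number of respondents giving each response.
col_totals = [sum(col) for col in zip(*table.values())]

# Column percentage: each cell as a share of its own column (response) total.
col_pct = {
    sex: [100 * n / t for n, t in zip(row, col_totals)]
    for sex, row in table.items()
}

for sex, row in col_pct.items():
    print(sex, [round(p, 1) for p in row])
```

Within each column the male and female percentages add to 100%, whereas within a row they generally do not.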

55 Compare Distribution of Responses for Learning New Things is Fun for Males and Females: Compare the Conditional Distribution (Row Proportions) with the Unconditional Distribution (Marginal Proportion)

56 Compare Distribution of Responses for Learning New Things is Fun for Males and Females: Compare the Conditional Distribution (Row Proportions) with the Unconditional Distribution (Marginal Proportion)
Compare the conditional distribution of attitudes toward learning new things is fun for females with the unconditional distribution of attitudes toward learning new things. For example, the proportion of females who strongly agreed was about 0.71, compared with about 0.67 for all subjects.
Compare the conditional distribution of attitudes toward learning new things is fun for males with the unconditional distribution of attitudes toward learning new things. For example, the proportion of males who strongly agreed was about 0.62, compared with about 0.67 for all subjects.
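The comparison above can be written out as a short calculation: the conditional proportion for each sex is set beside the marginal proportion for everyone. This Python sketch uses only the Strongly Agree counts and row totals from the slide 42 table; it is an informal illustration with made-up names, not the R code from the slides and not a significance test.

```python
# Strongly Agree counts and row totals by sex (from the slide 42 table).
strongly_agree = {"Male": 6745, "Female": 8967}
row_totals = {"Male": 10842, "Female": 12677}

# Unconditional (marginal) proportion: all who strongly agreed / all respondents.
marginal = sum(strongly_agree.values()) / sum(row_totals.values())

# Conditional proportion: those who strongly agreed within each sex.
conditional = {sex: strongly_agree[sex] / row_totals[sex] for sex in row_totals}

for sex, p in conditional.items():
    print(f"{sex}: conditional {p:.2f} vs marginal {marginal:.2f} "
          f"(difference {p - marginal:+.2f})")
```

If the two variables were exactly independent in the sample, every conditional proportion would equal the marginal one and the differences would all be zero.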

57 Distribution of Responses for Learning New Things is Fun for Males and Females
There appear to be no differences in attitudes toward learning new things is fun between males and females. There appears to be no association between attitudes toward learning new things is fun and the sex of the subjects. Attitudes toward learning new things is fun may not depend on the sex of the subject. The sex of respondents does not appear to explain attitudes toward learning new things is fun.

58 Please see weekly practice problems folder on our course page (weebly). These are not to be handed in; they are only for your practice.

