LEVEL OF MEASUREMENT Data is generally represented as numbers, but the numbers do not always have the same meaning and cannot be used in the same way.


LEVEL OF MEASUREMENT

Data is generally represented as numbers, but the numbers do not always have the same meaning and cannot always be used in the same way. To distinguish the different ways in which numbers are used, we traditionally identify the level of measurement of a variable as nominal, ordinal, interval, or ratio.

Nominal Level Variables

For nominal level variables, the numbers are shorthand for the categories of a variable, e.g. 1 represents married persons, 2 represents divorced persons, 3 represents persons who have never been married, etc. The assignment of numbers to the categories is arbitrary and can be changed with no loss of meaning. The only legitimate mathematical operation we can perform on nominal level data is to count the number of times each category appears in the data set.

Slide 1
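The point that nominal codes are arbitrary can be illustrated with a minimal Python sketch. The responses and the alternative coding scheme below are made up for illustration; only the 1/2/3 marital status codes come from the text. Counting by category gives the same result under either scheme.

```python
from collections import Counter

# Hypothetical responses using the coding from the text:
# 1 = married, 2 = divorced, 3 = never married.
responses = [1, 3, 1, 2, 3, 3, 1]
labels = {1: "married", 2: "divorced", 3: "never married"}


def count_by_label(values, value_labels):
    """Count how often each category label appears."""
    return Counter(value_labels[v] for v in values)


counts = count_by_label(responses, labels)

# Reassign the numeric codes arbitrarily (an assumed alternative scheme).
recode = {1: 30, 2: 10, 3: 20}
recoded_responses = [recode[v] for v in responses]
recoded_labels = {recode[code]: label for code, label in labels.items()}
recoded_counts = count_by_label(recoded_responses, recoded_labels)

# The category counts do not depend on which numbers were used as codes.
print(counts == recoded_counts)
```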

Ordinal Level Variables

Ordinal level variables are usually associated with labels as well, but the numbers are assigned to the categories in order. The order can run from low to high, e.g. 1 is assigned to high school graduates, 2 to junior college graduates, 3 to college graduates, 4 to graduates with a master's degree, etc. It can also run from high to low: 1 is assigned to graduates with a master's degree, 2 to college graduates, 3 to junior college graduates, 4 to high school graduates, etc.

The ordering of the numbers tracks the hierarchy of the labels. Though we can change the actual numbers used, the number assigned to each higher degree must consistently be larger than the number used for lower ranked degrees (or smaller, if the codes are ordered from high to low).

Slide 2
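The constraint on renumbering ordinal categories can be checked mechanically. The following Python sketch uses a hypothetical helper, keeps_hierarchy, which accepts a mapping from old codes to new codes and reports whether the new codes run consistently low to high or consistently high to low.

```python
def keeps_hierarchy(recode):
    """Return True if the new codes preserve the rank order of the old codes.

    `recode` maps each old code to its new code. The order is preserved if
    the new codes are strictly increasing (low to high) or strictly
    decreasing (high to low) when the old codes are sorted.
    """
    new_codes = [recode[old] for old in sorted(recode)]
    pairs = list(zip(new_codes, new_codes[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing


# Old codes: 1 = high school, 2 = junior college, 3 = college, 4 = master's.
spread_out = {1: 10, 2: 20, 3: 30, 4: 40}   # still low to high
reversed_codes = {1: 4, 2: 3, 3: 2, 4: 1}   # consistently high to low
scrambled = {1: 2, 2: 1, 3: 3, 4: 4}        # breaks the hierarchy

print(keeps_hierarchy(spread_out), keeps_hierarchy(reversed_codes), keeps_hierarchy(scrambled))
```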

The legitimate mathematical operations we can perform on ordinal data are sorting or ranking, as well as counting.

Interval Level Variables

Interval level variables have the additional characteristic that the difference between adjacent numbers is the same everywhere on the scale, e.g. the difference between 1 and 2 years of age is the same amount as the difference between 21 and 22 years of age, or 50 and 51, or 65 and 66. Likewise, the difference between a height of 60 inches and a height of 55 inches is the same as the difference between a height of 72 inches and a height of 67 inches. For interval level variables, it is mathematically legitimate to do arithmetic (add, subtract, multiply, and divide) as well as to count the values and to sort or rank them.

Ratio Level Variables

Ratio level variables have the additional property of a true zero value, so that ratios between values are meaningful. Practically speaking, ratio level data is treated the same as interval level data. The commonly cited example of a variable that is interval but not ratio is temperature in degrees Fahrenheit or Celsius, where zero does not mean the absence of temperature.

Slide 3
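A small Python sketch can make the interval versus ratio distinction concrete. The temperatures and ages are made-up values. On a scale without a true zero, the ratio of two readings changes when the unit changes; on a scale with a true zero it does not.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


# Interval scale: Celsius and Fahrenheit have arbitrary zero points.
cool_c, warm_c = 10.0, 20.0
cool_f, warm_f = celsius_to_fahrenheit(cool_c), celsius_to_fahrenheit(warm_c)

ratio_c = warm_c / cool_c   # 2.0 in Celsius
ratio_f = warm_f / cool_f   # 68 / 50 in Fahrenheit, so not 2.0

# Ratio scale: age has a true zero, so the ratio survives a change of unit.
young_years, old_years = 20, 40
ratio_years = old_years / young_years
ratio_months = (old_years * 12) / (young_years * 12)

print(ratio_c, ratio_f, ratio_years, ratio_months)
```

The temperature ratio depends on the unit, so "twice as warm" is not a meaningful statement on these scales, while "twice as old" is.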

Quantitative and Categorical Variables

The distinction between nominal and interval levels of variables is substantial. Computing an average marital status (treating a nominal level variable as interval) does not produce a meaningful result, and can be downright embarrassing. Presenting a count of all of the possible ages of the subjects in a data set (using only the nominal level property of an interval variable) does not communicate as much information as saying the average age was 27.5, with a range from 21 to 57.

The differences in the use of data at these levels have led many authors to collapse the number of levels of measurement to two, substituting terms like quantitative or metric for interval, and categorical, qualitative, or non-metric for nominal.

In practice, ordinal level variables are sometimes treated as quantitative and at other times as categorical. The numeric codes for scale variables (1=disagree, 2=neutral, 3=agree) are generally treated as quantitative data and averaged. The numeric codes for year in school (1=freshman, 2=sophomore, 3=junior, 4=senior) are often not used, and comparisons are made using the categorical labels, e.g. the number (count) of seniors with some characteristic versus the number of juniors with the same characteristic.

Slide 4
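To see why an average marital status is not meaningful, consider this Python sketch. All data values are hypothetical. The mean of the nominal codes changes when the arbitrary codes are reassigned, whereas the mean and range of ages summarize the data in a stable way.

```python
from statistics import mean

# Hypothetical marital status codes: 1 = married, 2 = divorced, 3 = never married.
marital = [1, 3, 1, 2, 3, 3, 1]
mean_original = mean(marital)

# The same respondents under an equally valid, assumed alternative coding:
# 3 = married, 1 = divorced, 2 = never married.
alternative = {1: 3, 2: 1, 3: 2}
mean_recoded = mean(alternative[code] for code in marital)

# Hypothetical ages (a quantitative variable).
ages = [21, 22, 25, 30, 57]
age_summary = (mean(ages), min(ages), max(ages))

# The "average marital status" depends on the arbitrary coding scheme.
print(mean_original, mean_recoded, age_summary)
```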

When ordinal level variables are used as quantitative variables, we are emphasizing the rank order of the categories, e.g. 3 ranks higher than 2 or 1, and 2 ranks higher than 1. Since the ranks themselves are interval level data, it is argued that arithmetic on the ordinal values is acceptable.

Multiple Variables Measuring the Same Construct

The same construct can be represented by variables at different levels of measurement. Education can be represented as years of school (quantitative), or as a diploma such as high school, college, or post-graduate (categorical, though we could come up with a numbering scheme that made it quantitative).

The implication of these different representations is that we cannot base a correct conclusion on the name of the variable or the construct it represents. A correct understanding of a variable's level of measurement requires that we look at the numbers in the data set and the coding scheme (numeric codes and labels) applied to the variable.

The authors of the text for this course use the labels quantitative and categorical. We will use their terminology in the first set of homework problems.

Slide 5

Slide 6 The introductory statement in the question indicates: the data set to use (GSS2000R); the task to accomplish (determining how the variable can be used); and the variable to use in the analysis, employment status [wrkstat]. We will answer the question based on the way the data is presented in the SPSS data set. We will not consider changing the coding of the variable.

Slide 7 There are two statements for each problem. One or both might be correct.

Slide 8 In the Data View of the data editor, we see that wrkstat contains numbers, but we cannot tell whether they are measures or codes.

Slide 9 First, to see what labels have been assigned to the variable, we click on the Variable View tab. Second, in the row for wrkstat, we look in the Missing column to see which numeric codes are used for missing data. These values will not be used in the analysis, and we ignore their labels when determining the level of measurement.

Slide 10 To see which values have been assigned labels, we look in the Variable View tab and click on the right side of the cell in the row for wrkstat, in the column called Values.

Slide 11 When we clicked on the right end of the cell, the Value Labels dialog box opened. Ignoring the 0 and 9 which were coded as missing data, we see eight entries for work status.
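The step of setting aside the missing-data codes before judging the labels can be sketched in Python. The codes 0 and 9 and the count of eight valid entries come from the slide. The label texts below are assumptions for illustration (only retired, school, and keeping house are named in these slides), and the dictionary stands in for what the Value Labels dialog box shows.

```python
# Assumed value labels for wrkstat; label texts are illustrative placeholders.
value_labels = {
    0: "not applicable",
    1: "working full time",
    2: "working part time",
    3: "temporarily not working",
    4: "unemployed, laid off",
    5: "retired",
    6: "school",
    7: "keeping house",
    8: "other",
    9: "no answer",
}

# Codes listed in the Missing column for wrkstat (from the slide).
missing_codes = {0, 9}

# Keep only the labels that describe valid (non-missing) values.
valid_labels = {
    code: label
    for code, label in value_labels.items()
    if code not in missing_codes
}

print(len(valid_labels))
```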

Slide 12 To use the variable as quantitative, we examine the labels for order. I tried to think of them as describing the amount worked, but that clearly doesn’t work for retired, school, and keeping house. Since I find no plausible order, the variable is categorical rather than quantitative. To close the dialog box, click on the OK button.

Slide 13 The labels for this variable do not imply any order or rank. In fact, the numeric codes could be reassigned to different categories with no loss of meaning. The statement that "Employment status can be used as a quantitative variable" is not correct and the check box is not marked. Since the variable has been assigned category labels in SPSS, the researchers who created the data set expected it to be used as a categorical variable. The statement that "Employment status can be used as a categorical variable" is correct and the check box is marked.

Slide 14 The second problem asks the same pair of questions about the variable number of hours worked in the past week [hrs1].

Slide 15 In the visible rows of the Data View, we see values that range from 38 to 60. Based on the variable label and the data values shown, my initial assessment is that this is a quantitative variable.

Slide 16 The variable hrs1 uses three numeric codes for missing data: -1, 98, and 99.

Slide 17 The only numbers assigned labels are the codes for missing data. There are no value labels for the valid values of hrs1, so hrs1 is not a categorical variable.

Slide 18 Since the variable has not been assigned any category labels in SPSS, the researchers who created the data set expected it to be used as a quantitative variable. The statement that "Number of hours worked in the past week can be used as a quantitative variable" is correct and the check box is marked. To use it as a categorical variable would require us to recode the variable into meaningful categories. The statement that "Number of hours worked in the past week can be used as a categorical variable" is not correct and the check box is not marked.
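As a rough Python sketch of both uses of hrs1: the missing codes -1, 98, and 99 come from the slides, while the data values, the cut points, and the category names are assumptions chosen only for illustration. The first part treats the variable as quantitative; the second shows the kind of recoding that would be needed before treating it as categorical.

```python
from collections import Counter
from statistics import mean

# Missing-data codes for hrs1 (from the slides).
MISSING = {-1, 98, 99}

# Hypothetical values as they might appear in the Data View.
hrs1 = [40, 38, -1, 60, 20, 98, 45, 99]

# Quantitative use: drop the missing codes, then do arithmetic.
valid_hours = [h for h in hrs1 if h not in MISSING]
mean_hours = mean(valid_hours)


def hours_band(hours):
    """Recode hours into assumed categories (cut points are illustrative)."""
    if hours < 35:
        return "under 35"
    if hours <= 40:
        return "35 to 40"
    return "more than 40"


# Categorical use: only possible after recoding into meaningful categories.
band_counts = Counter(hours_band(h) for h in valid_hours)

print(mean_hours, dict(band_counts))
```

Leaving a code such as 98 or 99 in the data would inflate the mean, which is why the missing codes are removed first.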

Slide 19 The third problem asks the same pair of questions about self-employment [wrkslf].

Slide 20 In the Data View, we see very restricted options for values: 2 and 9.

Slide 21 The variable wrkslf uses three numeric codes for missing data: 0, 8, and 9.

Slide 22 If we eliminate the codes for missing data (0, 8, and 9), there are only two valid values (1 and 2). Although labels have been assigned to the values, a variable with only two categories can be considered ordered (and hence quantitative) if the categories are opposites, i.e. one category implies the possession of a characteristic and the other category implies its absence.

Slide 23 Although labels have been assigned to the values, a variable with only two categories can be considered ordered if the categories are opposites, i.e. one category implies the possession of a characteristic and the other category implies its absence. The statement that "Self-employment can be used as a quantitative variable" is correct and the check box is marked. Since the variable has been assigned category labels in SPSS, the researchers who created the data set expected it to be used as a categorical variable. The statement that "Self-employment can be used as a categorical variable" is correct and the check box is marked.
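One way to see why a two-category variable can be used quantitatively is to recode it as 0 and 1, in which case its mean equals the proportion of cases in the category coded 1. The Python sketch below uses hypothetical data. The valid codes 1 and 2 and the missing codes 0, 8, and 9 come from the slides; the assumption that 1 means self-employed is made here only for illustration.

```python
from statistics import mean

# Missing-data codes for wrkslf (from the slides).
MISSING = {0, 8, 9}

# Hypothetical data; assume 1 = self-employed, 2 = works for someone else.
wrkslf = [2, 9, 2, 1, 2, 0, 1, 2, 8]

valid = [v for v in wrkslf if v not in MISSING]

# Dummy-code the variable: 1 if self-employed, 0 otherwise.
self_employed = [1 if v == 1 else 0 for v in valid]

# The mean of a 0/1 variable is the proportion of cases coded 1.
mean_dummy = mean(self_employed)
proportion = valid.count(1) / len(valid)

print(mean_dummy, proportion)
```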

Slide 24 The fourteenth problem asks the same pair of questions about how many in the family earned money [earnrs].

Slide 25 In the first few rows of the Data View, we see that possible values for earnrs range from 0 through 3, and it is likely that there are higher values.

Slide 26 In the Variable View, we see that the variable earnrs uses only one numeric code for missing data: 9.

Slide 27 If we eliminate the code for missing data, there is only one value with a label, though we found other values for earnrs in the Data View. The labeling for this variable indicates that the highest code is used for 8 or more earners. This is done to eliminate higher number codes that have low frequencies. We can use earnrs as a categorical variable, but we would probably want to assign labels to the other values.

Slide 28 Assigning a label to a single value for a variable does not alter the order or rank of the other values of the variable. The variable can be treated as quantitative. The statement that "How many in family earned money can be used as a quantitative variable" is correct and the check box is marked. Since the variable has been assigned category labels in SPSS, the researchers who created the data set expected it to be used as a categorical variable. The statement that "How many in family earned money can be used as a categorical variable" is correct and the check box is marked.
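The effect of a single labeled top code can be sketched in Python. The missing code 9 and the "8 or more" top code come from the slides; the data values and the helper describe are hypothetical. The values still sort and rank correctly for quantitative use, and each value can also be given a display label for categorical use.

```python
from statistics import median

MISSING = {9}     # missing-data code for earnrs (from the slides)
TOP_CODE = 8      # labeled as 8 or more earners (from the slides)

# Hypothetical values as they might appear in the Data View.
earnrs = [0, 1, 2, 3, 8, 9, 2, 1]

valid = [v for v in earnrs if v not in MISSING]

# Quantitative use: the top code keeps its rank above all lower counts,
# so order-based statistics such as the median are unaffected by the label.
median_earners = median(valid)


def describe(value):
    """Return a display label for each value (for categorical use)."""
    return "8 or more" if value == TOP_CODE else str(value)


display_labels = [describe(v) for v in sorted(set(valid))]

print(median_earners, display_labels)
```

Because 8 stands for "8 or more", a mean computed from these codes would understate the true average whenever any family has more than eight earners.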

Slide 29 The tenth problem asks the same pair of questions about the highest academic degree [degree].

Slide 30 The variable degree uses three numeric codes for missing data: 7, 8, and 9.

Slide 31 If we eliminate the codes for missing data (7, 8, and 9), there are five valid values. They are ordered by level of academic achievement and the number of years it takes to complete the degree. Graduate degrees take more years of school than bachelor's degrees, which take more years of school than junior college degrees, etc.

Slide 32 While labels have been assigned to the values for this variable, the labels follow the order of the numeric codes. The order or rank of the response set supports the use of the variable as quantitative. The statement that "Highest academic degree can be used as a quantitative variable" is correct and the check box is marked. Since the variable has been assigned category labels in SPSS, the researchers who created the data set expected it to be used as a categorical variable. The statement that "Highest academic degree can be used as a categorical variable" is correct and the check box is marked.
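The reasoning applied across the five problems can be summarized as a small decision rule. The Python sketch below is one reading of the slides, not an SPSS feature: the class VariableInfo and the function usable_as are hypothetical names, and whether the labels are ordered remains the analyst's judgment, supplied here as a flag.

```python
from dataclasses import dataclass


@dataclass
class VariableInfo:
    name: str
    has_value_labels: bool   # labels on valid (non-missing) values
    labels_ordered: bool     # analyst's judgment after reading the labels
    is_dichotomous: bool     # exactly two valid, opposite categories


def usable_as(var):
    """Return (quantitative, categorical) following the logic in the slides."""
    # Value labels signal that the variable was meant to be categorical.
    categorical = var.has_value_labels
    # Unlabeled numbers, ordered labels, or a dichotomy support quantitative use.
    quantitative = (
        not var.has_value_labels or var.labels_ordered or var.is_dichotomous
    )
    return quantitative, categorical


variables = [
    VariableInfo("wrkstat", True, False, False),
    VariableInfo("hrs1", False, False, False),
    VariableInfo("wrkslf", True, False, True),
    VariableInfo("earnrs", True, True, False),
    VariableInfo("degree", True, True, False),
]

results = {var.name: usable_as(var) for var in variables}
for name, (quantitative, categorical) in results.items():
    print(name, "quantitative:", quantitative, "categorical:", categorical)
```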

Slide 33 When we have finished all of the questions, we click on the Submit button at the bottom of the assignment.

Slide 34 BlackBoard asks us to verify that we want to submit the assignment. Click on the OK button.

Slide 35 Once the assignment is graded, we have the option to review the results. Click on the OK button.

Slide 36 Correct answers are marked with a green check. Incorrect answers are marked with a red X. Feedback is included to help you understand the reasons why the answers were correct or incorrect.