Univariate Description Heibatollah Baghi, Mastaneh Badii, and Farrokh Alemi, Ph.D. This lecture was organized by Dr. Alemi. It is based on the work done by Dr. Baghi and Mastaneh Badii.
Table 1: Grades from 50 Students We will use these fifty data points. How would you describe these data? What is the 30-second description, the so-called elevator speech, you can give about these data? This is what we are going to learn today: how to describe the data.
Levels of Measurement Nominal-level data is merely descriptive (e.g., religion, country name, region). Any assigned numerical value is merely for convenience (e.g., Christian = 1, Jewish = 2, Buddhist = 3). An important notion in statistics is the LEVEL of measurement. It is important to consider four levels of data, because the level determines what sort of statistical operation can be done with the data. This is VERY IMPORTANT, so I will repeat it: the level of data determines what sort of statistical operation can be done with the data. NOMINAL level data is . . . Note that if numbers are assigned to nominal-level descriptive data, which we call coding, it would be THEORETICALLY possible to average the religions of a group, but this would be a meaningless statistic. There is no AVERAGE religion of a diverse group.
Ordinal-level data has rank order, though the intervals between data points cannot be considered equal (e.g., high/medium/low income). ORDINAL level data is . . . It is the most common type of data in fields such as psychology, where there are often no objective means to measure many of the concepts studied. Thus, psychologists ask for people’s opinions on what are called Likert scales: strongly agree, agree, indifferent, disagree, strongly disagree.
Interval-level data has equal intervals between data points. Interval-level data has meaningful intervals between numerical scores.
Ratio-level data is interval data that has a true zero.
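To make the link between level of measurement and permissible statistics concrete, here is a minimal Python sketch. All of the data are hypothetical and are not from the lecture. The pairing of each level with a summary (mode for nominal, median for ordinal, mean for interval and ratio) follows the usual textbook convention and is an assumption about how one might apply the idea, not a rule stated on the slides.

```python
from statistics import mean, median_low, mode

# Hypothetical example data (not from the lecture).
religion = ["Christian", "Jewish", "Buddhist", "Christian", "Christian"]  # nominal
income_rank = [1, 2, 2, 3, 1, 2]           # ordinal codes: 1 = low, 2 = medium, 3 = high
temperature_c = [20.5, 22.0, 19.5, 21.0]   # interval: zero is arbitrary
weight_kg = [60.0, 72.5, 81.0]             # ratio: zero means "no weight"

# Nominal: only counting makes sense, so report the most frequent category.
nominal_summary = mode(religion)

# Ordinal: order is meaningful, so report a middle category.
# median_low returns an actual data value instead of averaging two codes.
ordinal_summary = median_low(income_rank)

# Interval and ratio: equal intervals make the arithmetic mean meaningful.
interval_summary = mean(temperature_c)
ratio_summary = mean(weight_kg)

print(nominal_summary, ordinal_summary, interval_summary, ratio_summary)
```

Averaging the numeric codes for religion would run without error, which is exactly why the level of measurement has to be checked by the analyst and not by the software.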
Table 1: Grades from 50 Students So let's go back to our task. How would you describe these numbers?
Ungrouped Frequency Distribution of Heart Rate Scores [Table: heart rate scores from 56 to 73, each with its frequency] First, sort and tally the data. To tally the data, you need to assume that the categories are mutually exclusive and collectively exhaustive of all possibilities.
Heart Rate Frequency Distribution Here we see software doing the same. Frequency is the count of data points in a category. Percent is 100 times the frequency divided by the total number of observations. Cumulative percent is the sum of the percents for all categories at or below a given category.
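The three columns described above can be computed in a few lines. The following is a minimal Python sketch; the function name `frequency_table` and the ten heart rate values are hypothetical and chosen only for illustration, since the lecture's own data table is not reproduced here.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative percent), sorted by value."""
    counts = Counter(values)      # tally: each value falls in exactly one category
    total = len(values)           # total number of observations
    rows = []
    running = 0                   # observations at or below the current value
    for value in sorted(counts):
        freq = counts[value]
        running += freq
        rows.append((value, freq, 100 * freq / total, 100 * running / total))
    return rows

# Hypothetical heart rates (not the lecture's data).
heart_rates = [56, 60, 60, 65, 65, 65, 65, 66, 70, 73]

print(f"{'Score':>5} {'Frequency':>9} {'Percent':>7} {'Cum. pct':>8}")
for value, freq, pct, cum_pct in frequency_table(heart_rates):
    print(f"{value:>5} {freq:>9} {pct:>7.1f} {cum_pct:>8.1f}")
```

The cumulative percent in the last row is always 100, which is a quick check that no observation was dropped during the tally.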
Choose the Type of Chart that Best Describes a Variable Characteristic A bar chart is used for nominal or ordinal data.
It consists of a horizontal dimension (X-axis) and a vertical dimension (Y-axis). Categories are along the X-axis. Frequencies or percentages are displayed on the Y-axis.
Histograms Are Used for Interval or Ratio Level Data
Histogram for Student Grades
1. Find the lowest and highest score.
2. Find the range of scores.
3. Decide on the number of intervals (e.g., 5).
4. Divide the range by the number of intervals to get the interval width.
5. Determine the lowest class interval.
6. List all class intervals.
7. Tally the number of scores that fall in each class interval.
8. Convert each tally to a frequency.
Histogram of Student Grades
1. The lowest and highest scores: 51 and 99
2. The range of scores: 99 - 51 = 48
3. Number of intervals: 5
4. Interval width: 48 / 5 ≈ 10
5. Lowest class interval: 51-60
6. All class intervals: 51-60, 61-70, 71-80, 81-90, 91-100
7. Tally the number of cases that fall within each interval
8. Frequency counts: 2, 9, 14, 21, 4
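The grouping steps for the student grades can be sketched in Python as follows. The function names `class_intervals` and `tally` are hypothetical, and the ten scores are made up for illustration (the 50 grades in Table 1 are not reproduced here). One assumption goes beyond the slide: the width is computed from the range plus one and rounded up, so that for integer scores the highest score always falls inside the last interval.

```python
import math

def class_intervals(lowest, highest, n_intervals):
    """Return inclusive (low, high) class intervals for integer scores."""
    # Assumption: add 1 to the range and round up so the top score always fits.
    width = math.ceil((highest - lowest + 1) / n_intervals)
    return [
        (lowest + i * width, lowest + (i + 1) * width - 1)
        for i in range(n_intervals)
    ]

def tally(scores, intervals):
    """Count how many scores fall within each inclusive interval."""
    return [sum(low <= s <= high for s in scores) for low, high in intervals]

# Lowest score, highest score, and number of intervals from the slide.
intervals = class_intervals(51, 99, 5)
print(intervals)

# Hypothetical scores (not the lecture's 50 grades).
scores = [51, 58, 63, 70, 71, 75, 80, 85, 90, 99]
for (low, high), freq in zip(intervals, tally(scores, intervals)):
    print(f"{low}-{high}: {freq}")
```

With the slide's values of 51, 99, and 5, this produces the same five class intervals listed above, from 51-60 through 91-100.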
The histogram shows the midpoint of each interval (the interval width is 5). For example, the first interval is 48-52, and the midpoint shown in the histogram is 50. The second interval is 53-57, with a midpoint of 55, and so on.
Describe Data Central Tendency You can describe data using measures of central tendency, such as the mean or the median.
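As a small illustration of these two measures, the sketch below uses Python's standard `statistics` module. The ten grades are hypothetical and are not the lecture's data.

```python
from statistics import mean, median

# Hypothetical grades (not from Table 1).
grades = [51, 62, 70, 75, 78, 81, 84, 88, 90, 99]

grade_mean = mean(grades)      # sum of the scores divided by their count
grade_median = median(grades)  # middle score; average of the two middle scores when the count is even

print(f"mean = {grade_mean}, median = {grade_median}")
```

The median depends only on the order of the scores, so it is less affected by a single extreme value than the mean is.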
Describe Data Variability You can describe data using measures of variability, such as the range or the standard deviation.
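Both measures of variability can be computed with the standard library, as in this minimal sketch. The grades are hypothetical. Whether to report the sample or the population standard deviation is not specified in the lecture, so both are shown here.

```python
from statistics import pstdev, stdev

# Hypothetical grades (not from Table 1).
grades = [51, 62, 70, 75, 78, 81, 84, 88, 90, 99]

score_range = max(grades) - min(grades)  # highest score minus lowest score
sample_sd = stdev(grades)                # divides the squared deviations by n - 1
population_sd = pstdev(grades)           # divides the squared deviations by n

print(f"range = {score_range}")
print(f"sample SD = {sample_sd:.2f}, population SD = {population_sd:.2f}")
```

The range uses only the two most extreme scores, while the standard deviation uses every score's distance from the mean.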
Describe Data Shape You can describe data by the shape of its distribution, such as unimodal, bimodal, skewed, or J-shaped, and by its kurtosis.
Normal This is a normal distribution, which has a unimodal shape.
Bi-modal
Negatively skewed
Positively skewed
J-shaped
Rectangular
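The direction of skew in the shapes above can also be checked numerically. The sketch below computes the moment-based skewness coefficient (third central moment divided by the second central moment raised to the power 1.5). This particular formula is one common choice and is an assumption here, since the lecture does not name one. The function name `skewness` and all three data sets are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: positive for a long right tail, negative for a long left tail.

    Assumes the values are not all identical (otherwise the variance is zero).
    """
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n  # second central moment (variance)
    m3 = sum((v - m) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]            # positively skewed
left_tailed = [-v for v in right_tailed]      # mirror image: negatively skewed

print(skewness(symmetric), skewness(right_tailed), skewness(left_tailed))
```

A value near zero indicates a roughly symmetric distribution; the sign indicates on which side the longer tail lies.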
Data can be described by the central tendency, variability, and shape of the frequency distribution. The normal distribution is common and allows us to examine how rare an observed value is.
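To illustrate how a normal distribution lets us judge how rare a value is, here is a minimal Python sketch using `statistics.NormalDist` (available in Python 3.8 and later). The mean of 78, the standard deviation of 10, and the observed grade of 99 are hypothetical values chosen for illustration. The sketch also assumes the grades are approximately normally distributed, which would need to be checked against the histogram.

```python
from statistics import NormalDist

# Hypothetical parameters (not estimated from the lecture's data).
mean_grade = 78.0
sd_grade = 10.0
observed = 99.0

dist = NormalDist(mu=mean_grade, sigma=sd_grade)

# z-score: how many standard deviations the observation lies above the mean.
z = (observed - mean_grade) / sd_grade

# Probability of a grade at least this high under the assumed normal distribution.
upper_tail = 1 - dist.cdf(observed)

print(f"z = {z:.2f}, P(grade >= {observed:.0f}) = {upper_tail:.4f}")
```

The smaller the tail probability, the rarer the observed value is under the assumed distribution.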