Anthony J Greene: Distributions of Variables. I. Properties of Variables; II. Nominal Data & Bar Charts; III. Ordinal Data; IV. Interval & Ratio Data, Histograms.

1 Anthony J Greene: Distributions of Variables
I. Properties of Variables
II. Nominal Data & Bar Charts
III. Ordinal Data
IV. Interval & Ratio Data, Histograms & Frequency Distributions
V. Cumulative Frequency Distributions & Percentile Ranks

2 Variables
Variable: A characteristic that takes on multiple values; i.e., it varies from one person or thing to another.

3 Variables: Cause and Effect
The Independent Variable
The Dependent Variable

4 Distributions
The distribution of population data is called the population distribution, or the distribution of the variable. The distribution of sample data is called a sample distribution.

5 Variables

6 Variables: Kinds of Variables (any of which can be an independent or dependent variable)
Qualitative variable: A nonnumerically valued variable.
Quantitative variable: A numerically valued variable.
Discrete variable: A quantitative variable whose possible values form a finite (or countably infinite) set of numbers.
Continuous variable: A quantitative variable whose possible values form some interval of numbers.

7 Quantitative Variables
Discrete data: Data obtained by observing values of a discrete variable.
Continuous data: Data obtained by observing values of a continuous variable.

8 The Four Scales
Nominal: Categories
Ordinal: Sequence
Interval: Mathematical scale without a true zero
Ratio: Mathematical scale with a true zero

9 The Four Scales
Nominal: Classes or categories. Also called a categorical scale. E.g., Catholic, Methodist, Jewish, Hindu, Buddhist, … Qualitative data.

10 The Four Scales
Ordinal: Sequential categories, e.g., 1st, 2nd, 3rd, …, with no indication of the distance between classes. Discrete data.

11 The Four Scales
Interval: Data where equal spacing in the variable corresponds to equal spacing in the scale. E.g., 1940s, 1950s, 1960s, …, or SAT scores. Discrete or continuous.

12 The Four Scales
Ratio: An interval scale with a mathematically meaningful zero. E.g., latencies of 1252 ms, 1856 ms, …, or mg of Prozac. Discrete or continuous.

13 The Four Scales
Nominal: No mathematical operations
Ordinal: <, >, =
Interval: +, -, and ordinal operations
Ratio: ×, ÷, and interval operations

14 Nominal Variables
Classes: Categories for grouping data.
Frequency: The number of observations that fall in a class.
Frequency distribution: A listing of all classes along with their frequencies.
Relative frequency: The ratio of the frequency of a class to the total number of observations.
Relative-frequency distribution: A listing of all classes along with their relative frequencies.

15 Frequencies of Nominal Variables

16 Sample Pie Charts and Bar Charts of Nominal Data

17 Frequency Bar Charts
Frequency bar chart: A graph that displays the independent variable (the categories) on the horizontal axis and the frequencies (the dependent variable) on the vertical axis. The frequency is represented by a vertical bar whose height is equal to the frequency of cases that fall within a given class of the I.V.

18 Frequency Charts of Nominal Data

19 Relative Frequency Bar Charts
Relative-frequency bar chart: A graph that displays the I.V. (the categories) on the horizontal axis and the relative frequencies (the D.V.) on the vertical axis. The relative frequency of each class is represented by a vertical bar whose height is equal to the relative frequency of the class. The difference between this and a frequency bar chart is that the proportion (always between zero and one) or the equivalent percentage (between 0% and 100%) is shown instead of the number of cases that fall into a given class.

20 Relative Frequency Charts of Nominal Data

21 Probability Distribution and Probability Bar Chart
Frequency distributions and charts for a whole population.
Probability distribution: A listing of the possible values and corresponding probabilities of a discrete random variable, or a formula for the probabilities.
Probability bar chart: A graph of the probability distribution that displays the possible values of a discrete random variable on the horizontal axis and the probabilities of those values on the vertical axis. The probability of each value is represented by a vertical bar whose height is equal to the probability.

22 Probability Charts of Nominal Data

23 Bar Chart

24 The Bar Graph: Nominal Data

25 Sum of the Probabilities of a Discrete Random Variable
For any discrete random variable, X, the sum of the probabilities of its possible values equals 1; in symbols, ΣP(X = x) = 1.
For example: Republicans 32.5%, Democrats 45.0%, Other 22.5%; 0.325 + 0.450 + 0.225 = 1.00, or 100%.
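
The rule that relative frequencies sum to 1 can be checked with a short Python sketch. The counts below are hypothetical: the slide gives only percentages, so a sample of 40 was assumed because it reproduces 32.5%, 45.0% and 22.5% exactly.

```python
from fractions import Fraction

# Hypothetical counts (not from the slides): chosen only so that the
# relative frequencies match the quoted percentages for a sample of 40.
counts = {"Republicans": 13, "Democrats": 18, "Other": 9}

n = sum(counts.values())  # total number of observations

# Relative frequency of a class = class frequency / total observations.
relative = {party: Fraction(f, n) for party, f in counts.items()}

for party, p in relative.items():
    print(f"{party:12s} f = {counts[party]:2d}  relative frequency = {float(p):.3f}")

# The relative frequencies of all classes add up to exactly 1.
total = sum(relative.values())
print("sum of relative frequencies =", float(total))
```

Exact fractions are used here so that the sum is exactly 1 rather than subject to floating-point rounding.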

26 Ordinal Variables
Note that “Rank” is the ordinal variable. “Mortality” is a ratio variable but can easily be downgraded to an ordinal variable, with a loss of information.

27 Distributions and Charts for Ordinal Data
Frequency distributions, relative frequency distributions, and probability distributions are done exactly as they were for nominal data. Bar charts are used.

28 Distribution of Education Level
Level          P(x)
Elementary     0.03
High School    0.45
Associates     0.12
Bachelors      0.28
Masters        0.10
Doctorate      0.02

29 Interval and Ratio Data
Frequency: The number of observations that fall in a class.
Frequency distribution: A listing of all classes along with their frequencies.
Relative frequency: The ratio of the frequency of a class to the total number of observations.
Relative-frequency distribution: A listing of all classes along with their relative frequencies.

30 Histograms
Frequency histogram: A graph that displays the independent variable on the horizontal axis and the frequencies (the dependent variable) on the vertical axis. The frequency is represented by a vertical bar whose height is equal to the frequency of cases that fall within a given range of the I.V.

31 Interval and Ratio Variables: Years of Education; Avg. Income (in thousands)

32 Enrollment in Milwaukee Public Elementary Schools

33 Relative Frequency Distribution of Enrollments in MPS

34 Probability distribution of a randomly selected elementary-school student

35 Probability distribution of the age of a randomly selected student

36 Probability Histogram

37 Another Example

38 Frequency vs. Relative Frequency

39 Frequency vs. Relative Frequency
This is also the Probability Distribution.

40 More Examples: Frequency Histogram

41 More Examples: Grouped Frequency Histogram

42 Grouped Frequency Histogram

43

44 Proportions and Frequency

45 Frequency Groupings
9 intervals with each interval 5 points wide. The frequency column (f) lists the number of individuals with scores in each of the class intervals.

46 Groupings: There had to be a catch
What to do with the in-betweens? This is only a concern for continuous variables.
Real limits: those in the “14” bar are really from 13.5 to 14.5.
Upper real limits & lower real limits: For the case of whole numbers, simply add 0.5 to the highest observed score and subtract 0.5 from the lowest observed score (these observed scores are the “apparent limits”).

47 Understanding Real Limits

48 Real Limits & Apparent Limits

49 Frequency & Cumulative Frequency
I.Q. Range   Real Limits    Frequency   Cuml. Freq.
< 52         0 – 52.5       1           1
52-67        52.5-67.5      4           5
68-84        67.5-84.5      11          16
85-100       84.5-100.5     34          50
101-116      100.5-116.5    34          84
117-132      116.5-132.5    11          95
133-148      132.5-148.5    4           99
> 148        148.5 +        1           100
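
A cumulative frequency column is just a running total of the frequency column. The following Python sketch rebuilds it from the frequencies in the I.Q. table; the variable names are illustrative and not part of the original slides.

```python
from itertools import accumulate

# Class labels and frequencies copied from the I.Q. table.
iq_ranges = ["<52", "52-67", "68-84", "85-100",
             "101-116", "117-132", "133-148", ">148"]
frequencies = [1, 4, 11, 34, 34, 11, 4, 1]

# Running total: each entry counts the scores in that class or any lower class.
cumulative = list(accumulate(frequencies))
n = cumulative[-1]  # the last running total is the number of observations

for label, f, cf in zip(iq_ranges, frequencies, cumulative):
    # With n = 100 the cumulative frequency is also the cumulative percentage.
    print(f"{label:>8s}  f = {f:3d}  cum f = {cf:3d}  cum % = {100 * cf / n:5.1f}")
```

Plotting the cumulative column against the upper real limits gives the ogive named on the later slide.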

50 Frequency (Normal Distribution)

51 Cumulative Frequency (Ogive)

52 Computing Percentile Ranks
Pounds (x)   Real Limits   Freq. (f)   Relative Freq.   Cuml. Freq.   %ile
0            0-0.5         8           0.200            0.200         20.0
1            0.5-1.5       17          0.425            0.625         62.5
2            1.5-2.5       11          0.275            0.900         90.0
3            2.5-3.5       3           0.075            0.975         97.5
4            3.5-4.5       1           0.025            1.000         100

53-56 Computing Percentile Ranks
Remember that each value has real limits.
What is the 90th %ile? 2.5, because 90% of the scores are at or below “2”, but “2” includes all values from 1.5 to 2.5.
What is the 20th %ile? 0.5

57 Computing Percentile Ranks
What about the in-betweens?
What is the 80th %ile?
What %ile corresponds to 2 lbs?

58 Linear Interpolation

59-62 Linear Interpolation and Percentiles
What is the 80th %ile? It lies in the “2” interval, whose cumulative percentages run from 62.5 to 90.0. The 80th %ile is 80 - 62.5 = 17.5 points into that 27.5-point span: 17.5/27.5 ≈ 0.64. The interval is 1.0 lb wide, so 1.5 + 1(0.64) = 2.14.
What %ile corresponds to 2 lbs? 2 lbs is halfway into the interval (0.5), so it is halfway between 62.5 and 90.0. Since 27.5% of the scores are in this interval, we need to go up 0.5(27.5%) = 13.75%. 62.5% + 13.75% = 76.25%.
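
Both directions of the interpolation (percentile to score, and score to percentile rank) can be sketched in Python using the real limits and frequencies from the pounds table. The function names are hypothetical, and the sketch assumes scores are spread evenly within each interval, which is the assumption linear interpolation makes.

```python
# Each row: (lower real limit, upper real limit, frequency), from the pounds table.
rows = [(0.0, 0.5, 8), (0.5, 1.5, 17), (1.5, 2.5, 11), (2.5, 3.5, 3), (3.5, 4.5, 1)]
n = sum(f for _, _, f in rows)  # 40 observations


def score_at_percentile(pct):
    """Score at or below which pct percent of the observations fall."""
    below = 0.0  # cumulative percentage at the interval's lower real limit
    for lower, upper, f in rows:
        share = 100.0 * f / n  # percentage of scores inside this interval
        if pct <= below + share:
            fraction = (pct - below) / share  # how far into the interval
            return lower + fraction * (upper - lower)
        below += share
    return rows[-1][1]


def percentile_rank(score):
    """Percentage of observations at or below the given score."""
    below = 0.0
    for lower, upper, f in rows:
        share = 100.0 * f / n
        if score <= upper:
            fraction = (score - lower) / (upper - lower)
            return below + fraction * share
        below += share
    return 100.0


print(round(score_at_percentile(90), 3))  # upper real limit of the "2" interval
print(round(score_at_percentile(80), 3))  # 1.5 + 17.5/27.5, about 2.136
print(round(percentile_rank(2.0), 2))     # 62.5 + 0.5 * 27.5 = 76.25
```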

63 The Stem & Leaf Diagram

64 Stem & Leaf Plots

65 Comparison of Frequency Histogram vs. Stem & Leaf Diagram

66 The Blocked Frequency Histogram

67 The Frequency Distribution Polygon, or Line Graph

68 Grouped Frequency Polygon

69 The Normal Distribution

70 Variants on the Normal Distribution

71 Comparing Two Distributions
Number of sentences recalled from each category

72 Comparing Distributions

73 Distributions

74 Variables and Distributions: In-Class Exercise

75-78 The Math You’ll Need To Know
Calculate, for X = 1, 2, 0, 4:
ΣX = 7
ΣX² = 21
(ΣX)² = 49

79-87 The Math You’ll Need To Know
Calculate, for the paired scores:
X    Y
1    3
3    1
0    -2
2    -4
ΣX = 6
ΣY = -2
ΣX ΣY = -12
ΣXY = -2
ΣX² = 14
(ΣX)² = 36
ΣY² = 30
(ΣY)² = 4
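
The summation exercise turns on the order of operations: ΣX² squares each score before summing, (ΣX)² sums first and then squares, and ΣXY multiplies each pair before summing, while ΣX ΣY multiplies the two sums. A minimal Python sketch using the paired scores from the slide (the variable names are illustrative):

```python
# Paired scores from the slide.
X = [1, 3, 0, 2]
Y = [3, 1, -2, -4]

sum_x = sum(X)                                       # ΣX
sum_y = sum(Y)                                       # ΣY
product_of_sums = sum_x * sum_y                      # ΣX ΣY: sum first, then multiply
sum_of_products = sum(x * y for x, y in zip(X, Y))   # ΣXY: multiply pairs, then sum
sum_x_squared = sum(x ** 2 for x in X)               # ΣX²: square each score, then sum
squared_sum_x = sum_x ** 2                           # (ΣX)²: sum, then square
sum_y_squared = sum(y ** 2 for y in Y)               # ΣY²
squared_sum_y = sum_y ** 2                           # (ΣY)²

print(sum_x, sum_y, product_of_sums, sum_of_products)
print(sum_x_squared, squared_sum_x, sum_y_squared, squared_sum_y)
```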

88 The Math You’ll Need To Know
The Mean: M = Σx/n, where n = sample size.
X = 1, 4, 8, 3

89-92 The Math You’ll Need To Know
Calculate, for X = 1, 4, 8, 3 with M = 4:
Σ(x - M) = 0
Σ(x - M)² = 26
Σ(x² - M²) = 26
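
The last two results agree for a reason: expanding Σ(x - M)² gives Σx² - 2MΣx + nM², and since Σx = nM this reduces to Σx² - nM², which is Σ(x² - M²). The Python sketch below checks the three quantities numerically for the scores on the slide (variable names are illustrative).

```python
# Scores from the slide.
X = [1, 4, 8, 3]
n = len(X)
M = sum(X) / n  # the mean, Σx/n

# Deviations from the mean always sum to zero.
sum_of_deviations = sum(x - M for x in X)

# Sum of squared deviations from the mean.
sum_of_squared_deviations = sum((x - M) ** 2 for x in X)

# Sum of (score squared minus mean squared); equals Σx² - nM².
sum_of_squares_minus_mean_sq = sum(x ** 2 - M ** 2 for x in X)

print(M, sum_of_deviations, sum_of_squared_deviations, sum_of_squares_minus_mean_sq)
```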

93-95 The Math You’ll Need To Know
Calculate: s_p = 13, n_1 = 8, n_2 = 10

96 What Type of Data?
Years spent in the military

97 What Type of Data?
Military rank: Lieutenant, Captain, Major, Lt. Colonel, Colonel, General

98 What Type of Data?
Branch of service: Army, Air Force, Navy, Marine Corps, Coast Guard

99 What Type of Data?
Time taken to complete a 30-mile bicycle race

100 What Type of Data?
Finishing place in a 30-mile bicycle race

101 Frequency Dist. & Percentile
Raw scores: 15, 18, 21, 23, 27, 33, 33, 35, 36, 36, 39, 41, 44, 47, 49, 50

102-110 Frequency Dist. & Percentile
Compute the 52%ile.

X        f    Cum f   Cum proportion
10-19    2    2       0.125
20-29    3    5       0.3125
30-39    6    11      0.6875
40-49    4    15      0.9375
50-59    1    16      1.0

The 52%ile is somewhere between 30-39.
That interval is from 0.3125 – 0.6875.
That interval is 0.375 wide.
To get from 0.3125 to 0.52 we go 0.2075 into the interval.
That’s 0.553 of the way into the interval (0.2075/0.375).
The real limits are from 29.5 to 39.5 (a range of 10).
The 52%ile is 29.5 + 5.53 = 35.03.
This process is called linear interpolation.
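
The whole procedure, from raw scores to an interpolated percentile, can be sketched in Python. The function name and its parameters are hypothetical; the class intervals (10 points wide, starting at 10) follow the slide's table, and whole-number scores are assumed so that each lower real limit is the apparent limit minus 0.5.

```python
# Raw scores from the slide.
raw_scores = [15, 18, 21, 23, 27, 33, 33, 35, 36, 36, 39, 41, 44, 47, 49, 50]


def grouped_percentile(scores, pct, start=10, width=10):
    """Interpolated score at the given percentile of a grouped distribution."""
    n = len(scores)
    target = pct / 100.0  # percentile as a cumulative proportion
    cum = 0               # cumulative frequency below the current interval
    lower = start         # apparent lower limit of the current interval
    while lower <= max(scores):
        # Frequency of the class interval [lower, lower + width - 1].
        f = sum(lower <= s < lower + width for s in scores)
        if f and (cum + f) / n >= target:
            # Fraction of the way into this interval, in cumulative-proportion terms.
            fraction = (target - cum / n) / (f / n)
            # Start from the lower real limit and move that fraction of the width.
            return (lower - 0.5) + fraction * width
        cum += f
        lower += width
    raise ValueError("percentile must be between 0 and 100")


# 52nd percentile: 29.5 + (0.2075 / 0.375) * 10, about 35.03
print(round(grouped_percentile(raw_scores, 52), 2))
```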

