Summarizing Data Graphical Methods. Histogram Stem-Leaf Diagram Grouped Freq Table Box-whisker Plot.


1 Summarizing Data Graphical Methods

2 Histogram Stem-Leaf Diagram Grouped Freq Table Box-whisker Plot

3 Measures of Central Location: 1. Mean 2. Median

4 Measures of Variability (Dispersion, Spread): 1. Range 2. Inter-Quartile Range 3. Variance, standard deviation 4. Pseudo-standard deviation

5 Descriptive techniques for Multivariate data In most research situations data is collected on more than one variable (usually many variables)

6 Graphical Techniques The scatter plot The two dimensional Histogram

7 The Scatter Plot For two variables X and Y we will have a measurement on each variable for each case: $(x_i, y_i)$, where $x_i$ = the value of X for case i and $y_i$ = the value of Y for case i.

8 To construct a scatter plot we plot the points $(x_i, y_i)$ for each case on the X-Y plane.

9 Data Set #3 The following table gives data on Verbal IQ, Math IQ, Initial Reading Achievement Score, and Final Reading Achievement Score for 23 students who have recently completed a reading improvement program.

Student  Verbal IQ  Math IQ  Initial Reading Achievement  Final Reading Achievement
1        86         94       1.1                          1.7
2        104        103      1.5                          1.7
3        86         92       1.5                          1.9
4        105        100      2.0                          2.0
5        118        115      1.9                          3.5
6        96         102      1.4                          2.4
7        90         87       1.5                          1.8
8        95         100      1.4                          2.0
9        105        96       1.7                          1.7
10       84         80       1.6                          1.7
11       94         87       1.6                          1.7
12       119        116      1.7                          3.1
13       82         91       1.2                          1.8
14       80         93       1.0                          1.7
15       109        124      1.8                          2.5
16       111        119      1.4                          3.0
17       89         94       1.6                          1.8
18       99         117      1.6                          2.6
19       94         93       1.4                          1.4
20       99         110      1.4                          2.0
21       95         97       1.5                          1.3
22       102        104      1.7                          3.1
23       102        93       1.6                          1.9

10

11 (84,80)
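As a rough illustration of the construction described on slides 7 and 8, the sketch below pairs Verbal IQ with Math IQ from Data Set #3 and draws the points on a coarse character grid. The helper `text_scatter` and its grid size are hypothetical choices made for this example; a plotting library would normally be used instead.

```python
# Data Set #3: Verbal IQ (x) and Math IQ (y) for the 23 students, in student order.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]
math_iq = [94, 103, 92, 100, 115, 102, 87, 100, 96, 80, 87, 116,
           91, 93, 124, 119, 94, 117, 93, 110, 97, 104, 93]

# One point (x_i, y_i) per case.
points = list(zip(verbal_iq, math_iq))


def text_scatter(points, width=40, height=15):
    """Render points on a character grid (hypothetical helper, not from the slides)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        # Row 0 is the top of the plot, so the y direction is flipped.
        row = (height - 1) - round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[row][col] = "*"
    return "\n".join("".join(line) for line in grid)


print(points[9])  # student 10 -> (84, 80), the point labelled on slide 11
print(text_scatter(points))
```

Points that fall in the same grid cell share one `*`, so the text plot may show fewer than 23 marks.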

12

13 Some Scatter Patterns

14

15

16 Circular: no relationship between X and Y. Unable to predict Y from X.

17

18

19 Ellipsoidal: positive relationship between X and Y. Increases in X correspond to increases in Y (but not always). Major axis of the ellipse has positive slope.

20

21 Example: Verbal IQ, Math IQ

22

23 Some More Patterns

24

25

26 Ellipsoidal (thinner ellipse): stronger positive relationship between X and Y. Increases in X correspond to increases in Y (more frequently). Major axis of the ellipse has positive slope. Minor axis of the ellipse is much smaller.

27

28 Increased strength in the positive relationship between X and Y. Increases in X correspond to increases in Y (almost always). Minor axis of the ellipse is extremely small relative to the major axis of the ellipse.

29

30

31 Perfect positive relationship between X and Y. Y is perfectly predictable from X. Data falls exactly along a straight line with positive slope.

32

33

34 Ellipsoidal: negative relationship between X and Y. Increases in X correspond to decreases in Y (but not always). Major axis of the ellipse has negative slope.

35

36 The strength of the relationship can increase until changes in Y can be perfectly predicted from X

37

38

39

40

41

42 Some Non-Linear Patterns

43

44

45 In a linear pattern, Y increases with respect to X at a constant rate. In a non-linear pattern, the rate at which Y increases with respect to X is variable.
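A small sketch of the distinction, using two illustrative functions that are assumptions for this example and not taken from the slides: for equally spaced X values, a linear pattern gives equal successive changes in Y, while a non-linear pattern does not.

```python
def successive_changes(ys):
    """Change in Y between consecutive, equally spaced X values."""
    return [b - a for a, b in zip(ys, ys[1:])]


xs = [0, 1, 2, 3, 4, 5]
linear = [2 * x + 1 for x in xs]   # illustrative linear pattern: Y = 2X + 1
nonlinear = [x ** 2 for x in xs]   # illustrative non-linear pattern: Y = X^2

print(successive_changes(linear))     # [2, 2, 2, 2, 2]  constant rate
print(successive_changes(nonlinear))  # [1, 3, 5, 7, 9]  variable rate
```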

46 Growth Patterns

47

48

49 Growth patterns frequently follow a sigmoid curve. Growth at the start is slow. It then speeds up. It slows down again as it reaches its limiting size.
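One common sigmoid curve is the logistic function. The sketch below assumes that form, with made-up values for the limiting size, growth rate, and midpoint, purely to show the slow, fast, slow pattern in the per-step growth.

```python
import math


def logistic(t, limit=100.0, rate=1.0, midpoint=5.0):
    """Logistic curve: one possible sigmoid; all parameter values are illustrative."""
    return limit / (1.0 + math.exp(-rate * (t - midpoint)))


sizes = [logistic(t) for t in range(11)]
# Growth per time step: small at the start, largest near the midpoint, small again at the end.
growth = [b - a for a, b in zip(sizes, sizes[1:])]

for t, g in enumerate(growth):
    print(f"t={t}->{t + 1}: growth {g:.2f}")
```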

50 Measures of strength of a relationship (Correlation): Pearson's correlation coefficient (r); Spearman's rank correlation coefficient (rho, $\rho$)

51 Assume that we have collected data on two variables X and Y. Let $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$ denote the pairs of measurements on the two variables X and Y for n cases in a sample (or population).

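52 From this data we can compute summary statistics for each variable. The means: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$

53 The standard deviations: $s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$ and $s_y = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}}$

54 These statistics give information about each variable separately but give no information about the relationship between the two variables.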
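As a sketch of these per-variable summaries, the code below computes the mean and sample standard deviation of Verbal IQ and Math IQ from Data Set #3 with Python's `statistics` module. It assumes the sample standard deviation uses the $n-1$ divisor, which is what `statistics.stdev` does.

```python
import statistics

# Data Set #3: Verbal IQ (x) and Math IQ (y) for the 23 students.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]
math_iq = [94, 103, 92, 100, 115, 102, 87, 100, 96, 80, 87, 116,
           91, 93, 124, 119, 94, 117, 93, 110, 97, 104, 93]

mean_x = statistics.mean(verbal_iq)
mean_y = statistics.mean(math_iq)
s_x = statistics.stdev(verbal_iq)  # sample standard deviation (divisor n - 1, an assumption)
s_y = statistics.stdev(math_iq)

print(f"Verbal IQ: mean {mean_x:.2f}, sd {s_x:.2f}")
print(f"Math IQ:   mean {mean_y:.2f}, sd {s_y:.2f}")
```

Each pair of numbers describes one variable on its own; nothing here says how the two variables move together.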

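55 Consider the statistics: $S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$, $S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$, $S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$, where $\bar{x}$ and $\bar{y}$ are the sample means.

56 The first two statistics, $S_{xx}$ and $S_{yy}$, are used to measure variability in each variable; they are used to compute the sample standard deviations $s_x = \sqrt{S_{xx}/(n-1)}$ and $s_y = \sqrt{S_{yy}/(n-1)}$.

57 The third statistic, $S_{xy}$, is used to measure correlation. If two variables are positively related, the sign of $x_i - \bar{x}$ will agree with the sign of $y_i - \bar{y}$.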

58 When $x_i - \bar{x}$ is positive, $y_i - \bar{y}$ will be positive: when $x_i$ is above its mean, $y_i$ will be above its mean. When $x_i - \bar{x}$ is negative, $y_i - \bar{y}$ will be negative: when $x_i$ is below its mean, $y_i$ will be below its mean. The product $(x_i - \bar{x})(y_i - \bar{y})$ will be positive for most cases.

59

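60 This implies that the statistic $S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$ will be positive: most of the terms in this sum will be positive.

61 On the other hand, if two variables are negatively related, the sign of $x_i - \bar{x}$ will be opposite to the sign of $y_i - \bar{y}$.

62 When $x_i - \bar{x}$ is positive, $y_i - \bar{y}$ will be negative: when $x_i$ is above its mean, $y_i$ will be below its mean. When $x_i - \bar{x}$ is negative, $y_i - \bar{y}$ will be positive: when $x_i$ is below its mean, $y_i$ will be above its mean. The product $(x_i - \bar{x})(y_i - \bar{y})$ will be negative for most cases.

63 Again, this implies that the statistic $S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$ will be negative: most of the terms in this sum will be negative.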
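The sign argument can be checked on Data Set #3. The sketch below counts how many of the products $(x_i - \bar{x})(y_i - \bar{y})$ are positive and how many are negative for Verbal IQ (x) and Math IQ (y); the variable names are choices made for this example.

```python
# Data Set #3: Verbal IQ (x) and Math IQ (y) for the 23 students.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]
math_iq = [94, 103, 92, 100, 115, 102, 87, 100, 96, 80, 87, 116,
           91, 93, 124, 119, 94, 117, 93, 110, 97, 104, 93]

n = len(verbal_iq)
mean_x = sum(verbal_iq) / n
mean_y = sum(math_iq) / n

# One product of deviations per case.
products = [(x - mean_x) * (y - mean_y) for x, y in zip(verbal_iq, math_iq)]
positive = sum(1 for p in products if p > 0)
negative = sum(1 for p in products if p < 0)
s_xy = sum(products)

print(positive, negative)  # 19 4: most products are positive
print(s_xy > 0)            # True: the sum of the products is positive
```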

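64 Pearson's correlation coefficient is defined as: $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$

65 The denominator, $\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}$, is always positive.

66 The numerator, $\sum (x_i - \bar{x})(y_i - \bar{y})$, is positive if there is a positive relationship between X and Y and negative if there is a negative relationship between X and Y. This property carries over to Pearson's correlation coefficient r.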

67 Properties of Pearson's correlation coefficient r
1. The value of r is always between -1 and +1.
2. If the relationship between X and Y is positive, then r will be positive.
3. If the relationship between X and Y is negative, then r will be negative.
4. If there is no relationship between X and Y, then r will be zero (in a sample, close to zero).
5. The value of r will be +1 if the points $(x_i, y_i)$ lie on a straight line with positive slope.
6. The value of r will be -1 if the points $(x_i, y_i)$ lie on a straight line with negative slope.
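A minimal sketch of properties 5 and 6, plus a case where r is zero, using a hand-written `pearson_r` helper (a name chosen for this example) and made-up data. Note that r measures linear association: the third data set has a clear V-shaped pattern, yet r is zero.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the sums of squares and products of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xy / math.sqrt(s_xx * s_yy)


xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [3 * x + 2 for x in xs])     # points on a line with positive slope
r_down = pearson_r(xs, [-3 * x + 2 for x in xs])  # points on a line with negative slope
r_flat = pearson_r(xs, [2, 1, 0, 1, 2])           # symmetric V shape: no linear trend

print(round(r_up, 6), round(r_down, 6), round(r_flat, 6))
```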

68 r =1

69 r = 0.95

70 r = 0.7

71 r = 0.4

72 r = 0

73 r = -0.4

74 r = -0.7

75 r = -0.8

76 r = -0.95

77 r = -1

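78 Computing formulae for the statistics: $S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$, $S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{(\sum y_i)^2}{n}$, $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}$

79

80 To compute $S_{xx}$, $S_{yy}$ and $S_{xy}$, first compute $\sum x_i$, $\sum y_i$, $\sum x_i^2$, $\sum y_i^2$ and $\sum x_i y_i$. Then substitute these sums into the computing formulae.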
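For reference, the computing formula for the sum of squared deviations $S_{xx} = \sum (x_i - \bar{x})^2$ follows from expanding the square and using $\sum x_i = n\bar{x}$. The derivation below is a standard one supplied here as a sketch; the same steps with $(x_i - \bar{x})(y_i - \bar{y})$ give the formula for the sum of products.

```latex
\begin{aligned}
S_{xx} &= \sum_{i=1}^{n} (x_i - \bar{x})^2 \\
       &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
       &= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
       &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
        = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}
\end{aligned}
```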

81 Example: Verbal IQ, Math IQ

82 Data Set #3 (Verbal IQ, Math IQ, Initial and Final Reading Achievement Scores for the 23 students; the same table as on slide 9)

83

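84 Now, with x = Verbal IQ, y = Math IQ and n = 23: $\sum x_i = 2244$, $\sum y_i = 2307$, $\sum x_i^2 = 221494$, $\sum y_i^2 = 234363$, $\sum x_i y_i = 227199$. Hence $S_{xx} = \sum x_i^2 - \frac{(\sum x_i)^2}{n} = 221494 - \frac{2244^2}{23} = 2557.65$, $S_{yy} = \sum y_i^2 - \frac{(\sum y_i)^2}{n} = 234363 - \frac{2307^2}{23} = 2960.87$, $S_{xy} = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n} = 227199 - \frac{(2244)(2307)}{23} = 2116.04$

85 Thus Pearson's correlation coefficient is: $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{2116.04}{\sqrt{(2557.65)(2960.87)}} = 0.769$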

86 Thus r = 0.769 Verbal IQ and Math IQ are positively correlated. If Verbal IQ is above (below) the mean then for most cases Math IQ will also be above (below) the mean.
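The value r = 0.769 can be reproduced from Data Set #3. The sketch below first accumulates the five raw sums, then forms the corrected sums of squares and products by subtracting the squared (or cross) totals divided by n, and finally takes the ratio; the variable names are choices made for this example.

```python
import math

# Data Set #3: Verbal IQ (x) and Math IQ (y) for the 23 students.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]
math_iq = [94, 103, 92, 100, 115, 102, 87, 100, 96, 80, 87, 116,
           91, 93, 124, 119, 94, 117, 93, 110, 97, 104, 93]

n = len(verbal_iq)

# Step 1: the raw sums.
sum_x = sum(verbal_iq)
sum_y = sum(math_iq)
sum_xx = sum(x * x for x in verbal_iq)
sum_yy = sum(y * y for y in math_iq)
sum_xy = sum(x * y for x, y in zip(verbal_iq, math_iq))

# Step 2: corrected sums of squares and products.
s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Step 3: Pearson's correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)
print(round(r, 3))  # 0.769
```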

87 Is the improvement in reading achievement (RA) related to either Verbal IQ or Math IQ? improvement in RA = Final RA – Initial RA

88 The Data Correlation between Math IQ and RA Improvement Correlation between Verbal IQ and RA Improvement
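A sketch of how the two correlations on this slide could be computed from Data Set #3: improvement is Final RA minus Initial RA for each student, and `pearson_r` is a hand-written helper named for this example. The resulting values are printed, not quoted here, since the transcript does not preserve the figures shown on the slide.

```python
import math

# Data Set #3, in student order.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]
math_iq = [94, 103, 92, 100, 115, 102, 87, 100, 96, 80, 87, 116,
           91, 93, 124, 119, 94, 117, 93, 110, 97, 104, 93]
initial_ra = [1.1, 1.5, 1.5, 2.0, 1.9, 1.4, 1.5, 1.4, 1.7, 1.6, 1.6, 1.7,
              1.2, 1.0, 1.8, 1.4, 1.6, 1.6, 1.4, 1.4, 1.5, 1.7, 1.6]
final_ra = [1.7, 1.7, 1.9, 2.0, 3.5, 2.4, 1.8, 2.0, 1.7, 1.7, 1.7, 3.1,
            1.8, 1.7, 2.5, 3.0, 1.8, 2.6, 1.4, 2.0, 1.3, 3.1, 1.9]


def pearson_r(xs, ys):
    """Pearson's r from the sums of squares and products of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xy / math.sqrt(s_xx * s_yy)


# Improvement in RA = Final RA - Initial RA, one value per student.
improvement = [f - i for i, f in zip(initial_ra, final_ra)]

r_math = pearson_r(math_iq, improvement)
r_verbal = pearson_r(verbal_iq, improvement)
print(f"Math IQ vs RA improvement:   r = {r_math:.3f}")
print(f"Verbal IQ vs RA improvement: r = {r_verbal:.3f}")
```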

89 Scatterplot: Math IQ vs RA Improvement

90 Scatterplot: Verbal IQ vs RA Improvement

