
1 Organizing and describing Data

2 Instructor: W.H. Laverty
  Office: 235 McLean Hall
  Phone: 966-6096
  Lectures: M W F 11:30am - 12:20pm, Arts 143
  Lab: M 3:30 - 4:20, Thorv 105
  Evaluation:
    Assignments, Labs, Term tests - 40% (a term test approximately every 2nd week)
    Final Examination - 60%

3 Techniques for continuous variables
  Continuous variables are measurements that vary over a continuum (Weight, Blood Pressure, etc.), as opposed to categorical variables (Gender, Religion, Marital Status, etc.).

4 The Grouped frequency table: The Histogram

5 To construct:
  A grouped frequency table
  A histogram

6 1. Find the maximum and minimum of the observations.
  2. Choose non-overlapping intervals of equal width (the class intervals) that cover the range between the maximum and the minimum.
  3. The endpoints of the intervals are called the class boundaries.
  4. Count the number of observations in each interval (the cell frequency, f).
  5. Calculate the relative frequency: relative frequency = f/N.

7 Data Set #3
  The following table gives data on Verbal IQ, Math IQ, Initial Reading Achievement Score, and Final Reading Achievement Score for 23 students who have recently completed a reading improvement program.

  Student  Verbal IQ  Math IQ  Initial Reading  Final Reading
                               Achievement      Achievement
        1         86       94              1.1            1.7
        2        104      103              1.5            1.7
        3         86       92              1.5            1.9
        4        105      100              2.0            2.0
        5        118      115              1.9            3.5
        6         96      102              1.4            2.4
        7         90       87              1.5            1.8
        8         95      100              1.4            2.0
        9        105       96              1.7            1.7
       10         84       80              1.6            1.7
       11         94       87              1.6            1.7
       12        119      116              1.7            3.1
       13         82       91              1.2            1.8
       14         80       93              1.0            1.7
       15        109      124              1.8            2.5
       16        111      119              1.4            3.0
       17         89       94              1.6            1.8
       18         99      117              1.6            2.6
       19         94       93              1.4            1.4
       20         99      110              1.4            2.0
       21         95       97              1.5            1.3
       22        102      104              1.7            3.1
       23        102       93              1.6            1.9

8 In this example the upper endpoint is included in the interval. The lower endpoint is not.
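
The counting rule on this slide (upper endpoint included, lower endpoint excluded) can be sketched in Python using the Verbal IQ column of Data Set #3. The class boundaries used here (width 10, starting at 70) are an assumption chosen for illustration; the boundaries of the original table are not recoverable from the transcript. The function name is hypothetical.

```python
# Verbal IQ for the 23 students in Data Set #3.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]


def grouped_frequency_table(data, lower, width, n_classes):
    """Return rows (a, b, f, f/N) for the class intervals (a, b].

    The upper endpoint b is included in the interval; the lower endpoint a is not.
    """
    n = len(data)
    rows = []
    for i in range(n_classes):
        a = lower + i * width
        b = a + width
        f = sum(1 for x in data if a < x <= b)  # cell frequency
        rows.append((a, b, f, f / n))           # relative frequency = f/N
    return rows


# Assumed boundaries: 70, 80, ..., 120 (covers the minimum 80 and the maximum 119).
table = grouped_frequency_table(verbal_iq, lower=70, width=10, n_classes=5)
for a, b, f, rel in table:
    print(f"({a}, {b}]  f = {f:2d}  f/N = {rel:.3f}")
```

With these assumed boundaries the observation 80 falls in (70, 80] rather than (80, 90], which is exactly the endpoint convention stated on the slide.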

9 Histogram – Verbal IQ

10 Histogram – Math IQ

11 Example
  In this example we compare the time to metabolize two drugs, A and B. 120 cases were given drug A and 120 cases were given drug B. Data on the time to metabolize each drug are given on the next two slides.

12 Drug A

13 Drug B

14 Grouped frequency tables

15 Histogram – drug A (time to metabolize)

16 Histogram – drug B (time to metabolize)

19 To construct a grouped frequency table:
  1. Find the maximum and minimum of the observations.
  2. Choose non-overlapping intervals of equal width (the class intervals) that cover the range between the maximum and the minimum.
  3. The endpoints of the intervals are called the class boundaries.
  4. Count the number of observations in each interval (the cell frequency, f).
  5. Calculate the relative frequency: relative frequency = f/N.

20 To draw a histogram:
  Above each class interval, draw a vertical bar whose height is proportional to either the cell frequency (f) or the relative frequency (f/N).
  [Figure: horizontal axis: class interval; vertical axis: frequency (f) or relative frequency (f/N)]
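
As a rough illustration of the drawing rule, the sketch below prints a text histogram in which each bar's height equals the cell frequency. It uses the Verbal IQ values from Data Set #3 with assumed class boundaries 70, 80, ..., 120 (upper endpoint included); the boundaries and the function name `text_histogram` are illustrative choices, not taken from the slides.

```python
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]

# Assumed class boundaries; each interval is (a, b].
bounds = [70, 80, 90, 100, 110, 120]
intervals = list(zip(bounds, bounds[1:]))
freqs = [sum(1 for x in verbal_iq if a < x <= b) for a, b in intervals]
labels = [f"{a}-{b}" for a, b in intervals]


def text_histogram(frequencies, labels):
    """Return the lines of a vertical bar chart; bar height = frequency."""
    width = max(len(label) for label in labels)
    lines = []
    # Work from the tallest level down to level 1.
    for level in range(max(frequencies), 0, -1):
        cells = [("#" if f >= level else " ").center(width) for f in frequencies]
        lines.append(" ".join(cells).rstrip())
    # Final line: the class intervals along the horizontal axis.
    lines.append(" ".join(label.center(width) for label in labels))
    return lines


lines = text_histogram(freqs, labels)
print("\n".join(lines))
```

Plotting f/N instead of f would divide every bar height by N = 23, so the shape of the histogram would be unchanged.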

21 Some comments about histograms
  The width of the class intervals should be chosen so that the number of intervals with a frequency less than 5 is small. This means that the width of the class intervals can decrease as the sample size increases.

22 If the width of the class intervals is too small, the frequency in each interval will be either 0 or 1. The histogram will look like this:

23 If the width of the class intervals is too large, one class interval will contain all of the observations. The histogram will look like this:

24 Ideally one wants the histogram to appear as seen below. This will be achieved by making the width of the class intervals as small as possible and only allowing a few intervals to have a frequency less than 5.
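
The effect of the class width can be checked numerically. The sketch below counts, for a few candidate widths, how many class intervals have a frequency below 5, using the Verbal IQ values from Data Set #3. Aligning the boundaries to multiples of the width and the particular widths tried (1, 10, 60) are assumptions made for illustration.

```python
import math

verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]


def class_frequencies(data, width):
    """Frequencies for intervals (a, a + width] on a grid of multiples of width."""
    # First boundary: the largest multiple of width strictly below the minimum.
    start = math.ceil(min(data) / width) * width - width
    n_classes = math.ceil((max(data) - start) / width)
    freqs = [0] * n_classes
    for x in data:
        # Upper endpoint included: x == start + k * width lands in class k - 1.
        freqs[math.ceil((x - start) / width) - 1] += 1
    return freqs


for width in (1, 10, 60):
    freqs = class_frequencies(verbal_iq, width)
    sparse = sum(1 for f in freqs if f < 5)
    print(f"width {width:2d}: {len(freqs):2d} classes, {sparse:2d} with f < 5")
```

For this data set, width 1 gives 40 classes whose frequencies are all 0, 1 or 2 (a few IQ values repeat), width 60 puts all 23 observations in one class, and width 10 sits in between with only two sparse classes.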

25 As the sample size increases, the histogram will approach a smooth curve. This is the histogram of the population.

26 N = 25

27 N = 100

28 N = 500

29 N = 2000

30 N = ∞

31 Comment: the proportion of area under a histogram between two points estimates the proportion of cases in the sample (and the population) between those two values.

32 Example: The following histogram displays the birth weight (in kg) of n = 100 births.

33 Find the proportion of births that have a birth weight less than 0.34 kg.

34 Proportion = (1+1+3+10+11+19+17)/100 = 0.62
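
The same arithmetic as a short Python check. The seven frequencies are the ones quoted on the slide for the bars below the cutoff; the variable names are illustrative.

```python
# Frequencies of the class intervals lying below the cutoff, as read from the slide.
frequencies_below = [1, 1, 3, 10, 11, 19, 17]
n = 100  # total number of births

proportion = sum(frequencies_below) / n
print(proportion)  # 0.62
```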

35 The Characteristics of a Histogram
  Central Location (average)
  Spread (Variability, Dispersion)
  Shape

36 Central Location

37 Spread, Dispersion, Variability

38 Shape – Bell Shaped (Normal)

39 Shape – Positively skewed

40 Shape – Negatively skewed

41 Shape – Platykurtic

42 Shape – Leptokurtic

43 Shape – Bimodal

44 The Stem-Leaf Plot An alternative to the histogram

45 Each number in a data set can be broken into two parts: a stem and a leaf.

46 Example: Verbal IQ = 84
  Stem = tens digit = 8
  Leaf = units digit = 4

47 Example: Verbal IQ = 104
  Stem = tens digits = 10
  Leaf = units digit = 4
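
For whole-number data such as these IQ scores, the split can be written with Python's built-in `divmod`. This is a minimal sketch; the function name `split_stem_leaf` is hypothetical.

```python
def split_stem_leaf(value):
    """Split a whole number into (stem, leaf): the tens part and the units digit."""
    return divmod(value, 10)


print(split_stem_leaf(84))   # (8, 4)
print(split_stem_leaf(104))  # (10, 4)
```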

48 To construct a stem-leaf diagram
  Make a vertical list of all the stems.
  Then, behind each stem, make a horizontal list of its leaves.

49 Example: the data on N = 23 students
  Variables:
    Verbal IQ
    Math IQ
    Initial Reading Achievement Score
    Final Reading Achievement Score

51 We now construct a stem-leaf diagram of Verbal IQ.

52 A vertical list of the stems:
   8
   9
  10
  11
  12
  We now list the leaves behind each stem.

53 Stems: 8, 9, 10, 11, 12
  Verbal IQ observations:
  86 104 86 105 118 96 90 95 105 84 94 119
  82 80 109 111 89 99 94 99 95 102 102

55  8 | 6 6 4 2 0 9
    9 | 6 0 5 4 9 4 9 5
   10 | 4 5 5 9 2 2
   11 | 8 9 1
   12 |

56 The leaves may be arranged in order:
    8 | 0 2 4 6 6 9
    9 | 0 4 4 5 5 6 9 9
   10 | 2 2 4 5 5 9
   11 | 1 8 9
   12 |
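
The ordered diagram can be reproduced with a few lines of Python. This is a sketch under the assumption that the data are whole numbers with the units digit as the leaf; the function name `stem_leaf` and the `|` separator are illustrative choices.

```python
from collections import defaultdict

verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]


def stem_leaf(data, stems):
    """Return one line per stem, with that stem's leaves in increasing order."""
    leaves = defaultdict(list)
    for x in sorted(data):            # sorting first puts the leaves in order
        stem, leaf = divmod(x, 10)    # stem = tens part, leaf = units digit
        leaves[stem].append(leaf)
    lines = []
    for stem in stems:                # list every stem, even one with no leaves
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {row}".rstrip())
    return lines


diagram = stem_leaf(verbal_iq, stems=range(8, 13))
print("\n".join(diagram))
```

Passing the stems explicitly keeps the empty stem 12 in the list, as on the slide.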

57 The stem-leaf diagram is equivalent to a histogram:
    8 | 0 2 4 6 6 9
    9 | 0 4 4 5 5 6 9 9
   10 | 2 2 4 5 5 9
   11 | 1 8 9
   12 |

59 Rotating the stem-leaf diagram, we have:
  [Figure: the rotated stem-leaf diagram as a histogram; horizontal axis labels 80, 90, 100, 110, 120]

60 The two-part stem-leaf diagram
  Sometimes you want to break each stem into two parts:
    * for leaves 0, 1, 2, 3, 4
    an unmarked stem for leaves 5, 6, 7, 8, 9

61 Stem-leaf diagram for Initial Reading Achievement
   1. | 01234444455556666677789
   2. | 0
  This diagram as it stands does not give an accurate picture of the distribution.

62 We try breaking the stems into two parts:
   1.* | 012344444
   1.  | 55556666677789
   2.* | 0
   2.  |

63 The five-part stem-leaf diagram
  If the two-part stem-leaf diagram is not adequate, you can break each stem into five parts:
    * for leaves 0, 1
    t for leaves 2, 3
    f for leaves 4, 5
    s for leaves 6, 7
    an unmarked stem for leaves 8, 9

64 We try breaking the stems into five parts:
   1.* | 01
   1.t | 23
   1.f | 444445555
   1.s | 66666777
   1.  | 89
   2.* | 0
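
Both splits follow the same rule: divide the ten possible leaf digits into equal groups and give each group its own row. The sketch below applies that rule to the leaves exactly as they are listed on the Initial Reading Achievement slide. The marker lists, in particular the blank marker for the last group, are assumptions based on how the stems appear in this transcript, and `split_stems` is a hypothetical helper name.

```python
# Leaves copied as listed on the Initial Reading Achievement slide.
leaves_by_stem = {"1.": "01234444455556666677789", "2.": "0"}

# Assumed markers, one per group of leaf digits (blank = unmarked stem).
TWO_PART = ["*", " "]                  # leaves 0-4, leaves 5-9
FIVE_PART = ["*", "t", "f", "s", " "]  # leaves 0-1, 2-3, 4-5, 6-7, 8-9


def split_stems(leaves_by_stem, markers):
    """Split each stem into len(markers) rows of equal leaf-digit ranges."""
    group_size = 10 // len(markers)    # 5 digits per row, or 2 digits per row
    rows = []
    for stem, leaves in leaves_by_stem.items():
        for i, marker in enumerate(markers):
            group = [d for d in leaves if int(d) // group_size == i]
            rows.append((stem + marker, "".join(group)))
    return rows


for title, markers in (("two-part", TWO_PART), ("five-part", FIVE_PART)):
    print(title)
    for label, leaves in split_stems(leaves_by_stem, markers):
        print(f"  {label} | {leaves}")
```

Unlike the slide, this sketch also prints the empty rows after the last occupied one (for example `2.t`), which makes the gaps in the distribution visible.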

65 Stem-leaf diagrams: Verbal IQ, Math IQ, Initial RA, Final RA

66 Some Conclusions
  Math IQ and Verbal IQ seem to have approximately the same distribution: "bell shaped", centered about 100.
  Final RA seems to be larger than Initial RA and more spread out.
    There is improvement in RA.
    The amount of improvement is quite variable.
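
A quick numerical check of the reading-achievement conclusions, using only the minimum and maximum of each column of Data Set #3. This is a rough sketch rather than a formal comparison; the variable names are illustrative.

```python
# Initial and Final Reading Achievement scores, students 1 to 23 (Data Set #3).
initial = [1.1, 1.5, 1.5, 2.0, 1.9, 1.4, 1.5, 1.4, 1.7, 1.6, 1.6, 1.7,
           1.2, 1.0, 1.8, 1.4, 1.6, 1.6, 1.4, 1.4, 1.5, 1.7, 1.6]
final = [1.7, 1.7, 1.9, 2.0, 3.5, 2.4, 1.8, 2.0, 1.7, 1.7, 1.7, 3.1,
         1.8, 1.7, 2.5, 3.0, 1.8, 2.6, 1.4, 2.0, 1.3, 3.1, 1.9]

# Per-student improvement, rounded to one decimal to hide floating-point noise.
improvement = [round(f - i, 1) for i, f in zip(initial, final)]

print("Initial RA: min", min(initial), "max", max(initial))
print("Final RA:   min", min(final), "max", max(final))
print("Improvement: min", min(improvement), "max", max(improvement))
print("Students with Final RA above Initial RA:",
      sum(1 for d in improvement if d > 0), "of", len(improvement))
```

Final RA spans 1.3 to 3.5 while Initial RA spans only 1.0 to 2.0, and the individual improvements range from -0.2 to 1.6, consistent with the remarks about spread and variable improvement.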

67 Next Topic Numerical Measures - Location

