Last Time Hypothesis Testing –1-sided vs. 2-sided Paradox Big Picture Goals –Hypothesis Testing –Margin of Error –Sample Size Calculations Visualization.

Presentation on theme: "Last Time Hypothesis Testing –1-sided vs. 2-sided Paradox Big Picture Goals –Hypothesis Testing –Margin of Error –Sample Size Calculations Visualization."— Presentation transcript:

1 Last Time Hypothesis Testing –1-sided vs. 2-sided Paradox Big Picture Goals –Hypothesis Testing –Margin of Error –Sample Size Calculations Visualization –Histograms

2 Administrative Matters Midterm I, coming Tuesday, Feb. 24 Excel notation to avoid actual calculation –So no computers or calculators Bring sheet of formulas, etc.

3 Administrative Matters Midterm I, coming Tuesday, Feb. 24 Excel notation to avoid actual calculation –So no computers or calculators Bring sheet of formulas, etc. No blue books needed

4 Administrative Matters Midterm I, coming Tuesday, Feb. 24 Excel notation to avoid actual calculation –So no computers or calculators Bring sheet of formulas, etc. No blue books needed (will just write on my printed version)

5 Administrative Matters Midterm I, coming Tuesday, Feb. 24 Material Covered: HW 1 – HW 5

6 Administrative Matters Midterm I, coming Tuesday, Feb. 24 Material Covered: HW 1 – HW 5 –Note: due Thursday, Feb. 19

7 Administrative Matters Midterm I, coming Tuesday, Feb. 24 Material Covered: HW 1 – HW 5 –Note: due Thursday, Feb. 19 –Will ask grader to return Mon. Feb. 23

8 Administrative Matters Midterm I, coming Tuesday, Feb. 24 Material Covered: HW 1 – HW 5 –Note: due Thursday, Feb. 19 –Will ask grader to return Mon. Feb. 23 –Can pickup in my office (Hanes 352)

9 Administrative Matters Midterm I, coming Tuesday, Feb. 24 Material Covered: HW 1 – HW 5 –Note: due Thursday, Feb. 19 –Will ask grader to return Mon. Feb. 23 –Can pickup in my office (Hanes 352) –So today’s HW not included

10 Reading In Textbook Approximate Reading for Today’s Material: Pages 261-262, 9-14, 270-276, 30-34 Approximate Reading for Next Class: Pages 279-282, 34-43

11 Big Picture Hypothesis Testing (Given dist’n, answer “yes-no”) Margin of Error (Find dist’n, use to measure error) Choose Sample Size (for given amount of error) Need better prob. tools

12 Big Picture Margin of Error Choose Sample Size Need better prob tools Start with visualizing probability distributions (key to “alternate representation”)

13 Histograms Idea: show rectangles, where area represents

14 Histograms Idea: show rectangles, where area represents: (a)Distributions: probabilities (b)Lists (of numbers): # of observations

15 Histograms Idea: show rectangles, where area represents: (a) Distributions: probabilities (b) Lists (of numbers): # of observations Note: will study these in parallel for a while (several concepts apply to both)

16 Histograms Idea: show rectangles, where area represents: (a)Distributions: probabilities (b)Lists (of numbers): # of observations Caution: There are variations not based on areas, see bar graphs in text But eye perceives area, so sensible to use it

17 Histograms Steps for Constructing Histograms: 1.Pick class intervals that contain full dist’n

18 Histograms Steps for Constructing Histograms: 1.Pick class intervals that contain full dist’n a. Prob. dist’ns: If possible values are: x = 0, 1, …, n, get good picture from choice: [-½, ½), [½, 1.5), [1.5, 2.5), …, [n-½, n+½) where [1.5, 2.5) is “all #s ≥ 1.5 and < 2.5” (called a “half open interval”)

19 Histograms Steps for Constructing Histograms: 1.Pick class intervals that contain full dist’n a. Prob. dist’ns b. Lists: e.g. 2.3, 4.5, 4.7, 4.8, 5.1 Start with [1,3), [3,7) As above use half open intervals (to break ties)

20 Histograms Steps for Constructing Histograms: 1.Pick class intervals that contain full dist’n a. Prob. dist’ns b. Lists: e.g. 2.3, 4.5, 4.7, 4.8, 5.1 Start with [1,3), [3,7) Can use anything for class intervals But some choices better than others…

21 Histograms Steps for Constructing Histograms: 1.Pick class intervals that contain full dist’n 2.Find “probabilities” or “relative frequencies” for each class (a) Probs: use f(x) for [x-½, x+½), etc. (b) Lists: [1,3): rel. freq. = 1/5 = 20% [3,7): rel. freq. = 4/5 = 80%

22 Histograms Steps for Constructing Histograms: 1.Pick class intervals that contain full dist’n 2.Find “probabilities” or “relative frequencies” for each class 3.Above each interval, draw rectangle where area represents class frequency
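
Steps 1 and 2 for the list example can be sketched in a few lines of Python. This is only an illustration, not part of the course materials (which use Excel); the function name `relative_frequencies` is made up here. It uses half-open intervals, so a value sitting exactly on a bin edge is counted in the bin to its right.

```python
def relative_frequencies(data, edges):
    """Share of observations in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(edges) - 1):
            # Half-open test: left edge included, right edge excluded.
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    return [c / len(data) for c in counts]


data = [2.3, 4.5, 4.7, 4.8, 5.1]   # list from the slides
edges = [1, 3, 7]                  # class intervals [1,3) and [3,7)
print(relative_frequencies(data, edges))  # [0.2, 0.8], i.e. 20% and 80%
```

A value of exactly 3 would land in [3,7), which is the tie-breaking the half-open convention is meant to provide.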

23 Histograms 3.Above each interval, draw rectangle where area represents class frequency (a) Probs: If width = 1, then area = width x height = height So get area = f(x), by taking height = f(x)

24 Histograms 3.Above each interval, draw rectangle where area represents class frequency (a) Probs: If width = 1, then area = width x height = height So get area = f(x), by taking height = f(x) E.g. Binomial Distribution

25 Binomial Prob. Histograms From Class Example 5 http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg5.xls Construct Prob. Histo: Create column of x values Compute f(x) values Make bar plot

26 Binomial Prob. Histograms Make bar plot –“Insert” tab –Choose “Column” –Right Click – Select Data (Horizontal – x’s, “Add series”, Probs) –Resize, and move by dragging –Delete legend –Click and change title –Right Click on Bars, Format Data Series: Border Color, Solid Line, Black Series Options, Gap Width = 0

27 Binomial Prob. Histograms From Class Example 5 http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg5.xls Construct Prob. Histo: Create column of x values Compute f(x) values Make bar plot Make several, for interesting comparison
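
The same three steps (column of x values, column of f(x) values, bar plot) can be sketched outside Excel. The values n = 10 and p = 0.3 below are assumptions chosen for illustration; the transcript does not say which parameters Class Example 5 uses. Because each bin [x-½, x+½) has width 1, the bar height equals the probability f(x).

```python
from math import comb


def binomial_pmf(n, p):
    """f(x) = C(n, x) p^x (1-p)^(n-x) for x = 0, 1, ..., n."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


n, p = 10, 0.3  # assumed values, not taken from the class spreadsheet
probs = binomial_pmf(n, p)

# Text version of the probability histogram: one row per bin [x-1/2, x+1/2).
for x, fx in enumerate(probs):
    bar = "#" * round(fx * 100)
    print(f"[{x - 0.5:4.1f}, {x + 0.5:4.1f})  height = {fx:.4f}  {bar}")
```

The heights add up to 1, matching the idea that total area under a probability histogram is 100%.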

28 Binomial Prob. Histograms From Class Example 5a

29 Binomial Prob. Histograms From Class Example 5a Compare Different p

30 Binomial Prob. Histograms From Class Example 5a Compare Different p: Surprisingly similar “mound” shape

31 Binomial Prob. Histograms From Class Example 5a Compare Different p: Surprisingly similar “mound” shape (will exploit this fact)

32 Binomial Prob. Histograms From Class Example 5a Compare Different p: Centerpoint moves as p grows

33 Binomial Prob. Histograms From Class Example 5a Compare Different p: Centerpoint moves as p grows (will quantify, and use this, too)

34 Binomial Prob. Histograms Important point: Binomial shows common shape across p

35 Binomial Prob. Histograms Important point: Binomial shows common shape across p Mound Shape (like dumping dirt out of a truck)

36 Binomial Prob. Histograms Important point: Binomial shows common shape across p Mound Shape (like dumping dirt out of a truck) What about n?

37 Binomial Prob. Histograms From Class Example 5b Compare Different n

38 Binomial Prob. Histograms From Class Example 5b Compare Different n: Again very similar mound shape

39 Binomial Prob. Histograms From Class Example 5b Compare Different n: Again very similar mound shape (will exploit this fact)

40 Binomial Prob. Histograms From Class Example 5b Compare Different n: Center does not appear to move

41 Binomial Prob. Histograms From Class Example 5b Compare Different n: Center does not appear to move, but check axes!

42 Binomial Prob. Histograms From Class Example 5b Compare Different n: Center does not appear to move, but check axes! (will quantify, and use this, too)

43 Binomial Prob. Histograms From Class Example 5b Compare Different n: But width of bump does seem to change

44 Binomial Prob. Histograms From Class Example 5b Compare Different n: But width of bump does seem to change (will quantify, and use this, too)

45 Binomial Prob. Histograms Important point: Binomial shows common shape across p & n Mound Shape (like dumping dirt out of a truck)

46 Binomial Prob. Histograms Important point: Binomial shows common shape across p & n Mound Shape (like dumping dirt out of a truck) Question for later: How can we put this to work?
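
A rough numerical check of the two observations (center moves with p, width of the bump changes with n) might look like the sketch below. The parameter values are assumptions, and the "width" used here (number of x values with probability of at least 1%) is a crude stand-in invented for this illustration; the slides promise a proper quantification later.

```python
from math import comb


def binomial_pmf(n, p):
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def center_and_width(n, p, threshold=0.01):
    """Most likely x, and how many x values carry at least `threshold` probability."""
    probs = binomial_pmf(n, p)
    center = max(range(n + 1), key=lambda x: probs[x])
    width = sum(1 for fx in probs if fx >= threshold)
    return center, width


# Fixed n, different p: the center of the mound shifts.
for p in (0.2, 0.5, 0.8):
    print("n = 20, p =", p, "->", center_and_width(20, p))

# Fixed p, different n: the mound covers more x values as n grows.
for n in (10, 50, 200):
    print("p = 0.5, n =", n, "->", center_and_width(n, 0.5))
```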

47 And now for something (sort of) different Recall survey from first class meeting

48 And now for something (sort of) different Recall survey from first class meeting Display Results?

49 And now for something (sort of) different Recall survey from first class meeting Display Results? Use “bar graph”

50 And now for something (sort of) different Bar Graph from Survey, on major

51 And now for something (sort of) different Bar Graph from Survey, on major Business biggest (true for many years)

52 And now for something (sort of) different Bar Graph from Survey, on major Business biggest Biology 2nd (fairly new)

53 And now for something (sort of) different Bar Graph from Survey, on major Business biggest Biology 2nd Variety of others Welcome!

54 And now for something (sort of) different Bar Graph from Survey, on major Labels, not Class Intervals

55 And now for something (sort of) different Bar Graph from Survey, on major Thin bars Now OK

56 And now for something (sort of) different Bar Graph from Survey, on major Study Counts, not rel. freq.

57 And now for something (sort of) different Bar Graph from Survey, on major Study Counts, not rel. freq. (not areas)

58 And now for something (sort of) different Bar Graph from Survey, on year

59 And now for something (sort of) different Bar Graph from Survey, on year Distribution makes sense?

60 And now for something (sort of) different Bar Graph from Survey, on year Different color stresses different data

61 And now for something (sort of) different Bar Graph from Survey, on year Shorter & fewer labels appear as horizontal

62 Histograms Steps for Constructing Histograms: 1.Pick class intervals that contain full dist’n 2.Find “probabilities” or “relative frequencies” for each class 3.Above each interval, draw rectangle where area represents class frequency

63 Histograms HW: 5.21b (make & print an Excel plot)

64 Histograms 3.Above each interval, draw rectangle where area represents class frequency (a) Probs

65 Histograms 3.Above each interval, draw rectangle where area represents class frequency (a) Probs (b) Lists

66 Histograms 3.Above each interval, draw rectangle where area represents class frequency (a) Probs (b) Lists: e.g. 2.3, 4.5, 4.7, 4.8, 5.1 same e.g. as above

67 Histograms 3.Above each interval, draw rectangle where area represents class frequency (a) Probs (b) Lists: e.g. 2.3, 4.5, 4.7, 4.8, 5.1

68 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: horizontal axis from 1 to 7]

69 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) [Figure: horizontal axis from 1 to 7]

70 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) From above discussion [Figure: horizontal axis from 1 to 7]

71 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) From above discussion (will see: not very good) [Figure: horizontal axis from 1 to 7]

72 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) [Figure: horizontal axis from 1 to 7]

73 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) [Figure: horizontal axis from 1 to 7]

74 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) Total Frequency = 100% [Figure: horizontal axis from 1 to 7, vertical axis from 5 to 20]

75 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) Total Frequency = 100% So each is 20% [Figure: horizontal axis from 1 to 7, vertical axis from 5 to 20]

76 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) Total Frequency = 100% 20% = Area [Figure: horizontal axis from 1 to 7, vertical axis from 5 to 20]

77 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) Total Frequency = 100% 20% = Area = 2 * height [Figure: horizontal axis from 1 to 7, vertical axis from 5 to 20]

78 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) Total Frequency = 100% 20% = Area = 2 * ht = 2 * (10% / unit) [Figure: horizontal axis from 1 to 7, vertical axis from 5 to 20]

79 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) Total Frequency = 100% 20% = Area = 2 * ht = 2 * (10% / unit) [Figure: horizontal axis from 1 to 7, vertical axis from 5 to 20, labeled "% per unit"]

80 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) Total Frequency = 100% 20% = Area = 4 * ht [Figure: horizontal axis from 1 to 7, vertical axis from 5 to 20, labeled "% per unit"]

81 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) Total Frequency = 100% 20% = Area = 4 * ht = 4 * (5% / unit) [Figure: horizontal axis from 1 to 7, vertical axis from 5 to 20, labeled "% per unit"]

82 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) Total Frequency = 100% 20% = Area = 4 * ht = 4 * (5% / unit) [Figure: horizontal axis from 1 to 7, vertical axis from 5 to 20, labeled "% per unit"]

83 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) 20% = Area = 4 * ht = 4 * (5% / unit) [Figure: horizontal axis from 1 to 7, vertical axis from 5 to 20, labeled "% per unit"]

84 Histograms Rectangles - area represents class frequency Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [1,3), [3,7) [Figure: horizontal axis from 1 to 7, vertical axis from 5 to 20, labeled "% per unit"]
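
The height calculation (step 3) can be sketched as follows. The helper name `histogram_heights` is made up for this illustration. One plausible reading of the "4 * (5% / unit)" slides is that each of the four observations in [3,7) contributes a block of area 20% spread over width 4, so height 5% per unit each, and the four blocks stack to 20% per unit; the code computes the stacked height directly as (class frequency) / (bin width).

```python
def histogram_heights(data, edges):
    """Bar heights in % per unit, so that bar area = % of observations in the bin."""
    n = len(data)
    heights = []
    for left, right in zip(edges[:-1], edges[1:]):
        count = sum(1 for x in data if left <= x < right)  # half-open bin
        area = 100 * count / n                             # class frequency in %
        heights.append(area / (right - left))              # height = area / width
    return heights


data = [2.3, 4.5, 4.7, 4.8, 5.1]
print(histogram_heights(data, [1, 3, 7]))  # [10.0, 20.0]
```

So the bar over [1,3) has height 10% per unit (area 20%), and the bar over [3,7) has height 20% per unit (area 80%).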

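85 Histograms Note: This histogram hides structure in data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: histogram, horizontal axis from 1 to 7, vertical axis from 5 to 20, labeled "% per unit"]

86 Histograms Quite sparse region Data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: histogram, horizontal axis from 1 to 7, vertical axis from 5 to 20, labeled "% per unit"]

87 Histograms Quite dense region Data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: histogram, horizontal axis from 1 to 7, vertical axis from 5 to 20, labeled "% per unit"]

88 Histograms Endpoints way off Data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: histogram, horizontal axis from 1 to 7, vertical axis from 5 to 20, labeled "% per unit"]

89 Histograms General Major Challenge: Choice of Class Intervals [Figure: histogram, horizontal axis from 1 to 7, vertical axis from 5 to 20, labeled "% per unit"]

90 Histograms Try for “better” choice: Data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: horizontal axis from 1 to 7]

91 Histograms Try for “better” choice: Data: 2.3, 4.5, 4.7, 4.8, 5.1 Class Intervals [2,4), [4,5), [5,6) [Figure: horizontal axis from 1 to 7]

92 Histograms Now build histogram as above (areas): Data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: horizontal axis from 1 to 7, vertical axis with ticks at 30 and 60, labeled "% per unit"]

93 Histograms Now build histogram as above (areas): Data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: horizontal axis from 1 to 7, vertical axis with ticks at 30 and 60, labeled "% per unit"]

94 Histograms Now build histogram as above (areas): Data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: horizontal axis from 1 to 7, vertical axis with ticks at 30 and 60, labeled "% per unit"]

95 Histograms Now build histogram as above (areas): Data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: horizontal axis from 1 to 7, vertical axis with ticks at 30 and 60, labeled "% per unit"]

96 Histograms Now build histogram as above (areas): Data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: horizontal axis from 1 to 7, vertical axis with ticks at 30 and 60, labeled "% per unit"]

97 Histograms Note: much better visual impression Data: 2.3, 4.5, 4.7, 4.8, 5.1 [Figure: histogram, horizontal axis from 1 to 7, vertical axis with ticks at 30 and 60, labeled "% per unit"]

98 Histograms Note: much better visual impression Histogram better reflects “structure in data” [Figure: histogram, horizontal axis from 1 to 7, vertical axis with ticks at 30 and 60, labeled "% per unit"]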

99 Histograms General Comments: Total area under histogram is 100%

100 Histograms General Comments: Total area under histogram is 100% So label vertical axis as “% per unit”

101 Histograms General Comments: Total area under histogram is 100% So label vertical axis as “% per unit” Synonym for “Class Interval” is “bin”

102 Histograms General Comments: Total area under histogram is 100% So label vertical axis as “% per unit” Synonym for “Class Interval” is “bin” (think of relative frequency as counting observations that “fall into bins”)

103 Histograms General Comments: Total area under histogram is 100% So label vertical axis as “% per unit” Synonym for “Class Interval” is “bin” (think of relative frequency as counting observations that “fall into bins”) Choice of bins is critical

104 Histograms General Comments: Total area under histogram is 100% So label vertical axis as “% per unit” Synonym for “Class Interval” is “bin” (think of relative frequency as counting observations that “fall into bins”) Choice of bins is critical Common Simplification: Equally spaced

105 Histograms General Comments: Choice of bins is critical Common Simplification: Equally spaced But still have choice of binwidth (also very challenging)
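
To see the general comments in action, the sketch below recomputes the small example with several bin choices and checks that the total area is 100% each time. The helper names are made up for this illustration, and the equally spaced bin sets (width 1 and width 3 over [1,7)) are assumptions added here to show the effect of binwidth; only the [2,4), [4,5), [5,6) choice comes from the slides.

```python
def histogram_heights(data, edges):
    """Bar heights in % per unit for half-open bins [edges[i], edges[i+1])."""
    n = len(data)
    heights = []
    for left, right in zip(edges[:-1], edges[1:]):
        count = sum(1 for x in data if left <= x < right)
        heights.append(100 * count / n / (right - left))
    return heights


def total_area(heights, edges):
    """Sum of width * height over all bars (should be 100 if bins cover the data)."""
    return sum(h * (r - l) for h, l, r in zip(heights, edges[:-1], edges[1:]))


data = [2.3, 4.5, 4.7, 4.8, 5.1]
bin_choices = {
    "slides: [2,4), [4,5), [5,6)": [2, 4, 5, 6],
    "equal width 1 (assumed)": [1, 2, 3, 4, 5, 6, 7],
    "equal width 3 (assumed)": [1, 4, 7],
}
for label, edges in bin_choices.items():
    heights = histogram_heights(data, edges)
    print(label, [round(h, 1) for h in heights], "area:", round(total_area(heights, edges), 6))
```

The narrow bins show the gap between 2.3 and the cluster near 4.5 to 5.1, while the width-3 bins merge everything into two broad bars, which is the binwidth sensitivity the slides warn about.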

106 Histograms HW: C15 For the data: 0.8, 2.1, 2.6, 0.9, 2.2, 0.8, 2.2, 0.9 a) Make histograms using the bins: i. [0,1), [1,2), [2,3) ii. [0.5,1.5), [1.5,2.5), [2.5,3.5) iii. [0,1), [1,3) (Interesting to look at differences)

107 Histograms HW: C15 For the data: 0.8, 2.1, 2.6, 0.9, 2.2, 0.8, 2.2, 0.9 a) Make histograms using the bins: i. [0,1), [1,2), [2,3) ii. [0.5,1.5), [1.5,2.5), [2.5,3.5) iii. [0,1), [1,3) b) Why are bins [0,2), [1,3) inappropriate here? c) Why are bins [1,2), [2,5) inappropriate here?

108 Histogram Real Data Example Buffalo Snow Fall Data Annual totals (in inches)

109 Histogram Real Data Example Buffalo Snow Fall Data Annual totals (in inches) For Buffalo, N.Y.

110 Histogram Real Data Example Buffalo Snow Fall Data Annual totals (in inches) For Buffalo, N.Y. 63 years, ranging from ~30 to ~120

111 Histogram Real Data Example Buffalo Snow Fall Data Annual totals (in inches) For Buffalo, N.Y. 63 years, ranging from ~30 to ~120 A lot of snow, due to “lake effect”

112 Histogram Real Data Example Buffalo Snow Fall Data Annual totals (in inches) For Buffalo, N.Y. 63 years, ranging from ~30 to ~120 A lot of snow, due to “lake effect” Any patterns in data?

113 Histogram Real Data Example Buffalo Snow Fall Data Data Available in Class Example 6 Left hand column of spreadsheet: http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg6.xls

114 Histogram Real Data Example Buffalo Snow Fall Data Data Available in Class Example 6 Left hand column of spreadsheet: http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg6.xls Now do histogram analysis Using Excel

115 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Data Tab

116 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Data Tab Push Data Analysis Button

117 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Data Tab Push Data Analysis Button Pulls up:

118 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Data Tab Push Data Analysis Button Pulls up: Choose:

119 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Pulls Up:

120 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Pulls Up: Link input data

121 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Pulls Up: Link input data Empty for default

122 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Pulls Up: Link input data Empty for default Choose here

123 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Pulls Up: Link input data Empty for default Choose here And location

124 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Pulls Up: Link input data Empty for default Choose here And location Get Histo Plot

125 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Manually Chart Result???

126 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Manually Chart Result??? Twiddle Output (similar to above): Delete Series Legend

127 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Manually Chart Result??? Twiddle Output (similar to above): Delete Series Legend Format Data Series – Gap Width → 0

128 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Manually Chart Result??? Twiddle Output (similar to above): Delete Series Legend Format Data Series – Gap Width → 0 Format Data Series – Border Color → Black

129 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Manually Chart Result??? Twiddle Output (similar to above): Delete Series Legend Format Data Series – Gap Width → 0 Format Data Series – Border Color → Black Chart Tools – Design – Choose Titled

130 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Manually Chart Result??? Twiddle Output (similar to above): Delete Series Legend Format Data Series – Gap Width → 0 Format Data Series – Border Color → Black Chart Tools – Design – Choose Titled Type in Title

131 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Result:

132 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Result: Unround numbers for bin edges

133 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Result: Unround numbers for bin edges Hard to interpret

134 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Data centered around 90

135 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Data centered around 90 Most data between 50 and 130

136 Histogram Real Data Example Buffalo Snow Fall Data – Excel Default Histo Data centered around 90 Most data between 50 and 130 Asymmetric Distribution

137 Histogram Real Data Example Buffalo Snow Fall Data – Smaller binwidth

139 Histogram Real Data Example Buffalo Snow Fall Data – Smaller binwidth Chosen by me Binwidth = 5, << ~13 from EXCEL default

140 Histogram Real Data Example Buffalo Snow Fall Data – Smaller binwidth Chosen by me Binwidth = 5, << ~13 from EXCEL default Nicer edge numbers

141 Histogram Real Data Example Buffalo Snow Fall Data – Smaller binwidth Chosen by me Binwidth = 5, << ~13 from EXCEL default Nicer edge numbers Data centered around 84 (now more precise)

142 Histogram Real Data Example Buffalo Snow Fall Data – Smaller binwidth Chosen by me Binwidth = 5, << ~13 from EXCEL default Nicer edge numbers Data centered around 84 (now more precise) Bar graph rougher (fewer points in each bin)

143 Histogram Real Data Example Buffalo Snow Fall Data – Smaller binwidth Chosen by me Binwidth = 5, << ~13 from EXCEL default Nicer edge numbers Data centered around 84 (now more precise) Bar graph rougher (fewer points in each bin) Suggests 3 main groups

144 Histogram Real Data Example Buffalo Snow Fall Data – Smaller binwidth Chosen by me Binwidth = 5, << ~13 from EXCEL default Nicer edge numbers Data centered around 84 (now more precise) Bar graph rougher (fewer points in each bin) Suggests 3 main groups (called “modes” or “clusters”)

145 Histogram Real Data Example Buffalo Snow Fall Data – Smaller binwidth Chosen by me Binwidth = 5, << ~13 from EXCEL default Nicer edge numbers Data centered around 84 (now more precise) Bar graph rougher (fewer points in each bin) Suggests 3 main groups (called “modes” or “clusters”) (can’t see this above: bin width is important)

146 Histogram Real Data Example Buffalo Snow Fall Data – Larger binwidth

148 Histogram Real Data Example Buffalo Snow Fall Data – Larger binwidth Chosen by me Binwidth = 30, >> ~13 from EXCEL default

149 Histogram Real Data Example Buffalo Snow Fall Data – Larger binwidth Chosen by me Binwidth = 30, >> ~13 from EXCEL default Bar graph is “smooth” (since many points in each bin)

150 Histogram Real Data Example Buffalo Snow Fall Data – Larger binwidth Chosen by me Binwidth = 30, >> ~13 from EXCEL default Bar graph is “smooth” (since many points in each bin) Only one mode (cluster)???

151 Histogram Real Data Example Buffalo Snow Fall Data – Larger binwidth Chosen by me Binwidth = 30, >> ~13 from EXCEL default Bar graph is “smooth” (since many points in each bin) Only one mode (cluster)??? Quite symmetric?

152 Histogram Real Data Example Buffalo Snow Fall Data – Larger binwidth Chosen by me Binwidth = 30, >> ~13 from EXCEL default Bar graph is “smooth” (since many points in each bin) Only one mode (cluster)??? Quite symmetric? (different from above: bin width is important)

153 Histogram Real Data Example HW: 1.28 [data in ta01_005.xls] ((c) loses bump near 50) 1.36 [data in ex01_036.xls] ((a) 4 (b) 2 (c) 1) 1.37 1.39

154 Research Corner Histo Bin Width (serious issue)

155 Research Corner Histo Bin Width (serious issue) Interesting Data Set: Hidalgo Stamps

156 Research Corner Histo Bin Width (serious issue) Interesting Data Set: Hidalgo Stamps Famous among postage stamp collectors

157 Research Corner Histo Bin Width (serious issue) Interesting Data Set: Hidalgo Stamps Famous among postage stamp collectors Printed in Mexico, 1800’s, over ~70 years

158 Research Corner Histo Bin Width (serious issue) Interesting Data Set: Hidalgo Stamps Famous among postage stamp collectors Printed in Mexico, 1800’s, over ~70 years Very different paper thicknesses…

159 Research Corner Histo Bin Width (serious issue) Interesting Data Set: Hidalgo Stamps Famous among postage stamp collectors Printed in Mexico, 1800’s, over ~70 years Very different paper thicknesses… How many paper sources?

160 Research Corner Histo Bin Width (serious issue) Interesting Data Set: Hidalgo Stamps Famous among postage stamp collectors Printed in Mexico, 1800’s, over ~70 years Very different paper thicknesses… How many paper sources? Unknown, since records are lost

161 Research Corner Histo Bin Width (serious issue) Interesting Data Set: Hidalgo Stamps Famous among postage stamp collectors Printed in Mexico, 1800’s, over ~70 years Very different paper thicknesses… How many paper sources? Unknown, since records are lost Study histogram of stamp thicknesses

162 Research Corner Movie over binwidth

163 Research Corner Movie over binwidth Shows very wide range

164 Research Corner Movie over binwidth Shows very wide range (much different visual impressions)

165 Research Corner Movie over binwidth Shows very wide range (much different visual impressions) How many bumps?

166 Research Corner Movie over binwidth Shows very wide range (much different visual impressions) How many bumps? Answer published in literature: 2, 3, 5, 7, 10

167 Research Corner Movie over binwidth Shows very wide range (much different visual impressions) How many bumps? Answer published in literature: 2, 3, 5, 7, 10 Very challenging question

168 Research Corner How many bumps? Believe in 2?

169 Research Corner How many bumps? Believe in 3?

170 Research Corner How many bumps? Believe in 5?

171 Research Corner How many bumps? Believe in 7?

172 Research Corner How many bumps? Believe in 10?

173 Big Picture Margin of Error Choose Sample Size Need better prob tools Start with visualizing probability distributions

174 Big Picture Margin of Error Choose Sample Size Need better prob tools Start with visualizing probability distributions, Next exploit constant shape property of Bi

175 Big Picture Start with visualizing probability distributions, Next exploit constant shape property of Binom’l

176 Big Picture Start with visualizing probability distributions, Next exploit constant shape property of Binom’l Centerpoint feels p

177 Big Picture Start with visualizing probability distributions, Next exploit constant shape property of Binom’l Centerpoint feels p Spread feels n

179 Big Picture Start with visualizing probability distributions, Next exploit constant shape property of Binom’l Centerpoint feels p Spread feels n Now quantify these ideas, to put them to work

180 Notions of Center Will later study “notions of spread”

181 Notions of Center Textbook: Sections 4.4 and 1.2

182 Notions of Center Textbook: Sections 4.4 and 1.2 Recall parallel development: (a)Probability Distributions (b)Lists of Numbers

183 Notions of Center Textbook: Sections 4.4 and 1.2 Recall parallel development: (a) Probability Distributions (b) Lists of Numbers Study 1st, since easier

184 Notions of Center (b)Lists of Numbers “Average” or “Mean”

185 Notions of Center (b) Lists of Numbers “Average” or “Mean” of $x_1, x_2, \ldots, x_n$: Mean = $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

186 Notions of Center (b) Lists of Numbers “Average” or “Mean” of $x_1, x_2, \ldots, x_n$: Mean = $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ ($\bar{x}$ is common notation)

187 Notions of Center (b) Lists of Numbers “Average” or “Mean” of $x_1, x_2, \ldots, x_n$: Mean = $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ (as before) Greek sigma for sum means “sum over i = 1,…,n”
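
As a small illustration of the mean of a list, here it is applied to the five numbers from the histogram example (reusing that list is a choice made here, not something the slides do). The course itself uses the Excel function AVERAGE; Python's standard library offers `statistics.mean` for the same purpose.

```python
def mean(xs):
    """Ordinary average: add up the list and divide by its length."""
    return sum(xs) / len(xs)


data = [2.3, 4.5, 4.7, 4.8, 5.1]
print(round(mean(data), 2))  # 21.4 / 5, about 4.28
```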

188 Notions of Center HW: C16: for the data of 1.57, find the mean using the Excel function AVERAGE (10.03)

189 Notions of Center Generalization of Mean: “Weighted Average”

190 Notions of Center Generalization of Mean: “Weighted Average” Idea: allow non-equal weights on the $x_i$'s: $\sum_{i=1}^{n} w_i x_i$

191 Notions of Center Generalization of Mean: “Weighted Average” Idea: allow non-equal weights on the $x_i$'s: $\sum_{i=1}^{n} w_i x_i$

192 Notions of Center Generalization of Mean: “Weighted Average” Idea: allow non-equal weights on the $x_i$'s: $\sum_{i=1}^{n} w_i x_i$ Where $w_i \ge 0$, $\sum_{i=1}^{n} w_i = 1$

193 Notions of Center Generalization of Mean: “Weighted Average” E.g.: ordinary mean has each $w_i = \frac{1}{n}$

194 Notions of Center Generalization of Mean: “Weighted Average” E.g.: ordinary mean has each $w_i = \frac{1}{n}$ (constant weights)

195 Notions of Center Generalization of Mean: “Weighted Average” Intuition: Corresponds to finding balance point of weights on number line

199 Notions of Center HW: C17: Calculate (and think about as “balance point”) weighted average of 1, 2, 3, 10 for the weights: a. ¼, ¼, ¼, ¼ (ordinary avg.) (4) b. 0.1, 0.1, 0.1, 0.7 (more on 10) (7.6) c. 0.3, 0.3, 0.3, 0.1 (less on 10) (2.8) d. 1/3, 1/3, 1/3, 0 (none on 10) (2) e. 0, 1, 0, 0 (all on 2) (2)
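
The balance-point intuition can be checked numerically. The sketch below assumes the usual definition of a weighted average (nonnegative weights that sum to 1, result equal to the sum of weight times value); the function name is made up for this illustration. It uses the weights from part (b) above, whose stated answer is 7.6, and verifies that the weighted deviations from that point cancel out.

```python
def weighted_average(xs, ws):
    """Sum of w_i * x_i, assuming weights are nonnegative and sum to 1."""
    if len(xs) != len(ws):
        raise ValueError("need one weight per value")
    if any(w < 0 for w in ws) or abs(sum(ws) - 1) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")
    return sum(w * x for w, x in zip(ws, xs))


xs = [1, 2, 3, 10]
ws = [0.1, 0.1, 0.1, 0.7]          # part (b): more weight on 10
m = weighted_average(xs, ws)

# Balance check: weight times signed distance from m should sum to zero.
imbalance = sum(w * (x - m) for w, x in zip(ws, xs))
print(round(m, 6), abs(imbalance) < 1e-9)
```

With equal weights of ¼ the same function reduces to the ordinary mean of the four numbers.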

