Published by こうだい おおかわち. Modified over 5 years ago.
1
Quantitative Data
Who? Cans of cola.
What? Weight (g) of contents.
368, 351, 355, 367, 352, 369, 370, 369,
370, 355, 354, 357, 366, 353, 373, 365,
355, 356, 362, 354, 353, 378, 368, 349
2
Weight of Contents
Note that a histogram is a special kind of bar chart: the height of each bar is the frequency (the number of items, in this example cans) whose weights fall between the values given on the horizontal number line.
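The bar heights described above can be reproduced by counting cans per bin. The following is a minimal sketch, not the method used to draw the slide's histogram. The 340 to 380 g range and 10 g bin width are the axis settings given in the later JMP slide. The helper name `bin_counts` is hypothetical, and the rule for a value that lies exactly on a bin edge (for example 370 g) is an assumption.

```python
# Weights (g) of the contents of the 24 cans, in the order given on slide 1.
weights = [368, 351, 355, 367, 352, 369, 370, 369,
           370, 355, 354, 357, 366, 353, 373, 365,
           355, 356, 362, 354, 353, 378, 368, 349]


def bin_counts(data, start, stop, width):
    """Count the values falling in each bin [left, left + width).

    Assumption: a value exactly on a bin edge is counted in the bin
    to its right. Other software may use the opposite convention.
    """
    counts = {}
    for left in range(start, stop, width):
        right = left + width
        counts[(left, right)] = sum(left <= x < right for x in data)
    return counts


hist10 = bin_counts(weights, 340, 380, 10)

# Print a rough text histogram: one '#' per can.
for (left, right), count in hist10.items():
    print(f"{left}-{right} g: {'#' * count} ({count})")
```

With these settings the four bars have heights 1, 11, 8 and 4, which add up to the 24 cans.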
3
Weight of Contents
If we choose to split the stems on the stem plot, so that the bins are 5 g wide instead of 10 g, we get a very different picture of the distribution of the data. The shape is bimodal. This is often an indication that two distinct phenomena are involved.
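A split stem plot for the 24 weights can be sketched as follows. This is an illustration under stated assumptions rather than the slide's own plot. Stems are tens of grams, leaves are single grams, and a stem marked `*` holds leaves 5 to 9. The function name `split_stem_plot` is hypothetical.

```python
# Weights (g) of the contents of the 24 cans, in the order given on slide 1.
weights = [368, 351, 355, 367, 352, 369, 370, 369,
           370, 355, 354, 357, 366, 353, 373, 365,
           355, 356, 362, 354, 353, 378, 368, 349]


def split_stem_plot(data):
    """Return {stem label: leaves} with each stem split into two 5 g bins.

    Leaves 0-4 go on the plain stem, leaves 5-9 on the stem marked '*'.
    """
    rows = {}
    for stem in range(min(data) // 10, max(data) // 10 + 1):
        rows[f"{stem}"] = ""
        rows[f"{stem}*"] = ""
    for x in sorted(data):
        stem, leaf = divmod(x, 10)
        label = f"{stem}*" if leaf >= 5 else f"{stem}"
        rows[label] += str(leaf)
    return rows


plot = split_stem_plot(weights)
for label, leaves in plot.items():
    print(f"{label:<3} | {leaves}")
```

The two longest rows are `35` and `36*`, with only a single leaf on `36` between them. That gap is the two-peaked shape the slide describes.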
4
Weight of Contents
Who? Cans of cola.
What? Weight of contents (g); type of cola (Regular or Diet).
In fact there are two different kinds of cola, regular and diet. There are twelve cans of each.
5
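Weight of Contents
Regular Cola
36  | 2
36* | 5678899
37  | 003
37* | 8

Diet Cola
34  |
34* | 9
35  | 123344
35* | 55567

Above are split stem plots for the two types of cola (stems are tens of grams, leaves are single grams, and * marks leaves 5 to 9). With two types of cola, we would like to compare their distributions.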
6
Comparing Distributions
How do the distributions compare in terms of Shape? Center? Spread?
7
Comparing Groups
        Regular    Diet
Min:    362 g      349 g
QL:     366.5 g    352.5 g
Med:    368.5 g    354 g
QU:     370 g      355 g
Max:    378 g      357 g
Here are the five-number summaries for each group.
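A five-number summary can be computed as in the sketch below. It takes the quartiles as the medians of the lower and upper halves of the sorted data. That is one common textbook convention and it reproduces the Regular values on the slide, but it is an assumption that the slide used exactly this rule. The two lists are the 24 weights from slide 1 split by type. The split is inferred from the minima and maxima on the slide, since the full table of labels is not shown.

```python
from statistics import median

# Assumption: the 24 weights from slide 1, split so that the group minima
# and maxima match the slide (Regular 362-378 g, Diet 349-357 g).
regular = [362, 365, 366, 367, 368, 368, 369, 369, 370, 370, 373, 378]
diet = [349, 351, 352, 353, 353, 354, 354, 355, 355, 355, 356, 357]


def five_number_summary(data):
    """Return (min, lower quartile, median, upper quartile, max).

    Quartiles are the medians of the lower and upper halves of the sorted
    data; for an odd count the middle value is left out of both halves.
    """
    xs = sorted(data)
    half = len(xs) // 2
    lower, upper = xs[:half], xs[len(xs) - half:]
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])


for name, group in (("Regular", regular), ("Diet", diet)):
    print(name, five_number_summary(group))
```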
8
Comparing Groups
From the box plots, the contents of Regular cans tend to weigh more than the contents of Diet cans. Note that there is a potential outlier among the Regular cans: the weight of 378 g is separated from the rest of the box plot. Where are the boundaries?
The upper boundary for Regular is upper quartile + 1.5*IQR = 370 + 1.5*3.5 = 375.25 g. Note that 378 g is above that upper boundary and so is highlighted. The upper whisker extends out to the highest value below the boundary (373 g).
The lower boundary for Regular is lower quartile – 1.5*IQR = 366.5 – 5.25 = 361.25 g. The minimum value (362 g) is above this boundary and so is the end of the lower whisker.
For Diet cola, no points fall outside the boundaries.
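The boundary rule (quartile plus or minus 1.5 times the IQR) can be checked with a short sketch. The quartiles 366.5 g and 370 g are taken from the five-number summary for Regular. The function name `outlier_fences` and the sorted list of Regular weights are assumptions made for illustration.

```python
def outlier_fences(q_lower, q_upper):
    """Return (lower boundary, upper boundary) using the 1.5 * IQR rule."""
    iqr = q_upper - q_lower
    return q_lower - 1.5 * iqr, q_upper + 1.5 * iqr


# Assumption: the twelve Regular weights, i.e. the slide-1 values of 362 g
# and above (the slide gives Regular min 362 g and max 378 g).
regular = [362, 365, 366, 367, 368, 368, 369, 369, 370, 370, 373, 378]

low, high = outlier_fences(366.5, 370)

# Points outside the boundaries are flagged as potential outliers.
outliers = [x for x in regular if x < low or x > high]

# Whiskers end at the most extreme values still inside the boundaries.
whisker_low = min(x for x in regular if x >= low)
whisker_high = max(x for x in regular if x <= high)

print(low, high)                   # 361.25 375.25
print(outliers)                    # [378]
print(whisker_low, whisker_high)   # 362 373
```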
9
Comparing Groups
            Regular    Diet
Med:        368.5 g    354 g
Mean:       368.8 g    353.7 g
Range:      16 g       8 g
IQR:        3.5 g      2.5 g
Std dev:    4.03 g     2.23 g
From the numerical summaries: on average, the contents of Regular cola weigh more than the contents of Diet cola. The various measures of spread give a somewhat mixed message. The sample range for Regular cola is twice that for Diet, and the sample standard deviation for Regular is almost twice that for Diet. However, the IQR for Regular is not much greater than that for Diet. What is happening? The one potential outlier for Regular (378 g) inflates the range and the standard deviation but has little influence on the IQR. If you go back to the box plots, the widths of the boxes are fairly similar (the IQRs are similar); it is the one large Regular value that makes the range and standard deviation larger.
10
JMP
The data table is arranged so that rows are cases (Who?) and columns are variables (What?). Before entering data into JMP, answer the questions Who? and What?
11
JMP – Data Table
      Weight   Type of Cola
1     368      R
2     367
...
11    378
12
13    351      D
14    355
...
23    353
24    349
There are 24 cases (cans of cola), so there are 24 rows of data. There are two variables for each can: weight and type. Weight is a numerical (quantitative) variable (JMP puts numbers on the right of columns). Type is a categorical (character/nominal in JMP-speak) variable (JMP puts characters on the left of columns).
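The "rows are cases, columns are variables" layout can be sketched outside JMP as a list of (weight, type) rows. This is an illustration only. The rule used to label each can (360 g or more is Regular) is an assumption inferred from the five-number summaries, not something the slides state. The row order, Regular first and then Diet in slide-1 order within each type, is chosen to match the rows visible in the data table.

```python
# Weights (g) of the contents of the 24 cans, in the order given on slide 1.
weights = [368, 351, 355, 367, 352, 369, 370, 369,
           370, 355, 354, 357, 366, 353, 373, 365,
           355, 356, 362, 354, 353, 378, 368, 349]

# Assumption: cans of 360 g or more are Regular ("R"), the rest Diet ("D").
labelled = [(w, "R" if w >= 360 else "D") for w in weights]

# One row per case (can), one column per variable (weight, type).
# Regular rows first, keeping the slide-1 order within each type.
table = ([row for row in labelled if row[1] == "R"]
         + [row for row in labelled if row[1] == "D"])

print(f"{'':>3} {'Weight':>6}  Type of Cola")
for i, (weight, kind) in enumerate(table, start=1):
    print(f"{i:>3} {weight:>6}  {kind}")
```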
12
JMP – Analyze
Analyze – Distribution
Y, Columns: Weight
13
JMP – Output
Distribution Weight Stack
Display Options: Horizontal Layout
Histogram Options: Count Axis
14
JMP – Output
JMP will automatically select the bins. You can change these by right-clicking on the Weight axis and choosing Axis Settings:
Minimum: 340
Maximum: 380
Increment: 10
15
Note that JMP computes quartiles a little differently. This is OK. If you are asked to use JMP or are given JMP output, use the values that JMP gives you.
Rounding summaries: round summaries to one more decimal place than the original data have. For the standard deviation, round to two more decimal places than the original data. Round final answers only. If you wanted to compute the sample standard deviation by hand (why, I have no idea), you should not round any intermediate steps in the calculation, to avoid round-off error.
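Different quartile conventions can give slightly different answers on the same data. The sketch below compares the median-of-halves rule with the interpolating `"exclusive"` method of Python's `statistics.quantiles`. That method is used here only as an example of another convention, and no claim is made that it matches JMP's calculation exactly. The last lines apply the rounding rule to the standard deviation. The Regular weights are inferred from slide 1 as an assumption.

```python
from statistics import median, quantiles, stdev

# Assumption: the twelve Regular weights (slide-1 values of 362 g and above).
regular = [362, 365, 366, 367, 368, 368, 369, 369, 370, 370, 373, 378]

# Convention 1: quartiles as medians of the lower and upper halves.
xs = sorted(regular)
half = len(xs) // 2
hinges = (median(xs[:half]), median(xs[len(xs) - half:]))

# Convention 2: interpolated quartiles (statistics.quantiles, "exclusive").
q1, _, q3 = quantiles(regular, n=4, method="exclusive")

print("median of halves:", hinges)    # (366.5, 370.0)
print("interpolated:    ", (q1, q3))  # (366.25, 370.0)

# Rounding: keep full precision during the calculation, round at the end.
# The data are whole grams, so the std dev is reported to two decimals.
sd = stdev(regular)
print("std dev:", round(sd, 2))       # 4.03
```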
16
JMP – Analyze
Analyze – Distribution
Y, Columns: Weight
By: Type of Cola
17
JMP – Output
Distribution Weight Uniform Scaling Stack
Display Options: Horizontal Layout
Histogram Options: Count Axis
19
JMP – Analyze
Analyze – Fit Y by X
Y, Response: Weight
X, Factor: Type of Cola
Note: Y is numerical/continuous; X is character/nominal.
20
JMP – Output
One-way analysis of Weight by Type of Cola
Display Options – Box Plots, Mean Lines, Grand Mean
Highlight potential outliers (click on them; hold down Shift to select more than one)
Means and Std Dev
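The quantities behind the Mean Lines, Grand Mean and Means and Std Dev options can be sketched by hand. This is a rough illustration, not a reproduction of JMP's report. The group lists are inferred from slide 1 as an assumption, and the rounding follows the rule given earlier in the slides.

```python
from statistics import mean, stdev

# Assumption: the 24 weights from slide 1, split by type using the group
# minima and maxima shown in the five-number summaries.
groups = {
    "Regular": [362, 365, 366, 367, 368, 368, 369, 369, 370, 370, 373, 378],
    "Diet": [349, 351, 352, 353, 353, 354, 354, 355, 355, 355, 356, 357],
}

# Per-group count, mean and sample standard deviation.
summary = {
    name: (len(ws), round(mean(ws), 1), round(stdev(ws), 2))
    for name, ws in groups.items()
}

# Grand mean: the mean of all 24 weights, ignoring type.
all_weights = [w for ws in groups.values() for w in ws]
grand_mean = mean(all_weights)

for name, (n, m, sd) in summary.items():
    print(f"{name:<8} n={n}  mean={m} g  std dev={sd} g")
print(f"Grand mean = {round(grand_mean, 1)} g")
```

The grand mean of about 361.2 g falls between the two group means and is not close to the weight of any can, which is another sign that the two types should be summarized separately.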