 Methods to take large amounts of data and present it in a concise form › Want to present height of females and males in STA 220 › Could measure everyone.

2  Methods to take large amounts of data and present it in a concise form
› Want to present the heights of females and males in STA 220
› Could measure everyone and graph the results
› More interested in a single value that describes the most likely representation of the height of the students in the class
This is called a measure of centrality

3  Once you have your measure of centrality, you may want or need to know: is the data repeatable? › This would be a measure of variability

4  3 common measures of centrality › Mean › Median › Mode

5  Mean › Mathematical average of all the data

6  Example › Suppose Suzy is taking Chemistry. There is a lab quiz every other week. Near the end of the semester, Suzy wants to determine her quiz average. Her quiz scores are: 78, 92, 83, 95, 98, 87, and 93.

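7  Mathematical shorthand:
› Data points are often referred to as x_i, where i is 1…n, n being the total number of data points
› For Suzy's quiz scores, n = 7 and x_1 = 78, x_2 = 92, x_3 = 83, x_4 = 95, x_5 = 98, x_6 = 87, and x_7 = 93.
› The mean is denoted by $\bar{x}$, called x-bar: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
› For Suzy's quizzes, $\bar{x} = \frac{78 + 92 + 83 + 95 + 98 + 87 + 93}{7} = \frac{626}{7} \approx 89.43$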
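The x-bar calculation for Suzy's quizzes can be checked with a few lines of Python. This is only a minimal sketch; the function name `mean` and the variable names are illustrative choices, not part of the slides.

```python
def mean(values):
    """Arithmetic mean: add up every data point and divide by n."""
    if not values:
        raise ValueError("mean requires at least one data point")
    return sum(values) / len(values)


# Suzy's seven quiz scores from the example slide.
quiz_scores = [78, 92, 83, 95, 98, 87, 93]
suzy_mean = mean(quiz_scores)

print(f"n = {len(quiz_scores)}, sum = {sum(quiz_scores)}, x-bar = {suzy_mean:.2f}")
# prints: n = 7, sum = 626, x-bar = 89.43
```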

8  The median is the middle value of the dataset, such that half of all data points are less than or equal to that value AND half of all data points are greater than or equal to that value.

9  To find the median:
1. Rearrange the data from smallest to largest
2. If n is odd, calculate (n+1)/2
3. If n is even, calculate n/2
4. Count through the sorted dataset until you reach the data point in the position you calculated in step 2 or 3
5. If the number of data points, n, is odd, then you are done. If n is even, then compute the mean of the data points in position n/2 and position (n/2)+1.

10  Example
› Given the following salary information from a group of engineers, determine the median salary: $75,400; $83,600; $45,700; $43,900; $62,100; $90,500; $55,800.
› First reorder the data in increasing order: 43,900; 45,700; 55,800; 62,100; 75,400; 83,600; 90,500
› Since n = 7 is odd, compute (n+1)/2 = (7+1)/2 = 4
› The fourth data point is the median: 43,900; 45,700; 55,800; [62,100]; 75,400; 83,600; 90,500
› The median salary is $62,100

11  Example
› A group of students are taking the following number of credit hours: 12, 17, 15, 14, 9, 16, 18, 16, 14, 12. Find the median number of credit hours being taken by this group of students.
› Put the data in increasing order: 9, 12, 12, 14, 14, 15, 16, 16, 17, 18
› Since n = 10 is even, compute n/2 = 10/2 = 5
› Next, identify the data points in the fifth and sixth positions: 9, 12, 12, 14, [14], [15], 16, 16, 17, 18
› Compute the mean of the fifth and sixth data points: (14 + 15)/2 = 14.5
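The median procedure and both worked examples can be sketched in Python as follows. The function name `median` and the variable names are illustrative; the only subtlety is that the slides count positions from 1 while Python lists count from 0.

```python
def median(values):
    """Median by the sort-and-count procedure from the slides."""
    ordered = sorted(values)          # step 1: smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one data point")
    if n % 2 == 1:
        position = (n + 1) // 2       # step 2: 1-based middle position
        return ordered[position - 1]  # convert to a 0-based index
    position = n // 2                 # step 3: 1-based position n/2
    # step 5: mean of the points in positions n/2 and (n/2)+1
    return (ordered[position - 1] + ordered[position]) / 2


salaries = [75400, 83600, 45700, 43900, 62100, 90500, 55800]
credit_hours = [12, 17, 15, 14, 9, 16, 18, 16, 14, 12]

salary_median = median(salaries)
credit_median = median(credit_hours)
print(salary_median, credit_median)  # prints: 62100 14.5
```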

12  The mode is the number that appears the most often in the dataset.
Example: Here are the number of cavities found in a class of 1st graders:
› 0, 1, 0, 1, 0, 5, 5, 3, 4, 0, 0, 2, 0, 1, 0, 3, 2, 4, 7, 1. Find the mode.
› 0 occurs 7 times, while 1 occurs 4 times; 2, 3, 4, and 5 each occur twice; and 7 occurs once. As 0 occurs the most often, it is the mode.
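The tally in the cavities example can be reproduced with `collections.Counter` from the Python standard library. This is a sketch under one assumption: the helper `modes` (a name chosen here for illustration) returns a list, so that ties for the highest count are all reported.

```python
from collections import Counter


def modes(values):
    """Return every value that ties for the highest frequency, sorted."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


cavities = [0, 1, 0, 1, 0, 5, 5, 3, 4, 0, 0, 2, 0, 1, 0, 3, 2, 4, 7, 1]
cavity_counts = Counter(cavities)
cavity_modes = modes(cavities)

# Show the frequency of each value, then the mode(s).
for value in sorted(cavity_counts):
    print(f"{value} occurs {cavity_counts[value]} time(s)")
print("mode(s):", cavity_modes)
```

For this dataset the list contains only 0; a dataset such as 1, 1, 2, 2, 3 would return both 1 and 2.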

13  Comparing Mean, Median, Mode
› Mean
  Strong Points
    Uses all of the data
  Weak Points
    Sensitive to extremes. Test scores: 34, 92, 95, 94, 89 have a mean of 80.8. If the professor dropped the lowest test score, 34, then the mean would be 92.5.
    May not be an actual, observable value. For example, the average family has 1.6 children. What does it mean to have 0.6 of a child?

14  Comparing Mean, Median and Mode
› Median and Mode
  Strong Points
    Not sensitive to extremes. In the test score example from before, the median would be 92: 34, 89, [92], 94, 95.
    The mode is an observable value; the median is an observable value whenever n is odd
  Weak Points
    The value may not be unique. In the case of the mode, it is possible to have several values that appear the most.
    Neither uses all of the data values. The mode keys in on frequency, while the median just looks at the middle of the dataset.
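The difference in sensitivity can be seen by computing both statistics on the test scores with and without the low score of 34. The sketch below uses `mean` and `median` from Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

test_scores = [34, 92, 95, 94, 89]
without_lowest = sorted(test_scores)[1:]   # drop the lowest score, 34

mean_all = mean(test_scores)               # 404 / 5
mean_without = mean(without_lowest)        # 370 / 4
median_all = median(test_scores)           # middle of 34, 89, 92, 94, 95
median_without = median(without_lowest)    # mean of 92 and 94

print(f"mean:   {mean_all} -> {mean_without}")
print(f"median: {median_all} -> {median_without}")
```

Dropping one extreme score moves the mean by 11.7 points (80.8 to 92.5) but moves the median by only 1 point (92 to 93).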

15  In 1995, the mean salary of an MLB player was $1,080,000 while the median salary of an MLB player was $275,000.
› Recall the median is the point where half of the data points are above and half are below. Thus, at least half of the players in the MLB earned $275,000 or less
› A mean of $1,080,000 tells you that there are players earning millions of dollars, but this may not be representative of all players in the MLB

16  The Corps of Engineers wants to dredge a harbor in Hackensack, NJ. The EPA has these guidelines for harbor dredging:
› The sediment is tested for the presence of PCBs.
› If PCBs < 25 parts per billion, then it's OK to dredge and dump.
› If 25 ppb ≤ PCBs ≤ 50 ppb, then it's OK to dredge and dump, but then a cap must be placed on the dump pile.
› If PCBs > 50 ppb, then the harbor cannot be dredged and dumped.

17  6 samples are taken, and the average PCBs was 46.5 ppb. The Corps of Engineers should be allowed to dredge and dump the harbor, then cap the dump site…or should they?
The actual samples were: 66, 74, 81, 55, 1, 2.
› The average is (66 + 74 + 81 + 55 + 1 + 2)/6 = 279/6 = 46.5 ppb
› The median is (55 + 66)/2 = 60.5 ppb, which is above the 50 ppb limit
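The harbor example can be worked through in Python to show how the choice of statistic changes the outcome. This is an illustration only: the function `dredging_decision` is a hypothetical helper that encodes the three guideline bands listed on the previous slide (treating exactly 50 ppb as falling in the "cap" band is an assumption), and the slides do not say which statistic the EPA actually applies.

```python
from statistics import mean, median


def dredging_decision(pcb_ppb):
    """Hypothetical helper mapping a PCB level (ppb) to a guideline band."""
    if pcb_ppb < 25:
        return "dredge and dump"
    if pcb_ppb <= 50:  # assumption: exactly 50 ppb falls in the cap band
        return "dredge and dump, then cap"
    return "do not dredge"


samples = [66, 74, 81, 55, 1, 2]
pcb_mean = mean(samples)      # 279 / 6
pcb_median = median(samples)  # mean of 55 and 66

print(f"mean   = {pcb_mean} ppb -> {dredging_decision(pcb_mean)}")
print(f"median = {pcb_median} ppb -> {dredging_decision(pcb_median)}")
```

The two very low samples (1 and 2) pull the mean under 50 ppb, while four of the six samples are individually above the limit.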

18  Measures of variability describe the spread of the data
All measures of variability are greater than or equal to 0
› Measures close to 0 indicate that the data is highly consistent and repeatable
4 measures of variability: Range, Average Deviation, Variance, Standard Deviation

19  Range
› Difference between the largest data point in the dataset and the smallest data point in the dataset
› $\text{Range} = x_{\max} - x_{\min}$
Example
› Suppose the daily low temperatures for the past week have been -3, -7, -2, 0, 2, 4. What is the range?
› Range = 4 - (-7) = 11
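A short Python check of the temperature example, using the built-in `max` and `min`. The function name `data_range` is an illustrative choice (it avoids shadowing Python's built-in `range`).

```python
def data_range(values):
    """Range: largest data point minus smallest data point."""
    return max(values) - min(values)


low_temps = [-3, -7, -2, 0, 2, 4]
temp_range = data_range(low_temps)

print(f"{max(low_temps)} - ({min(low_temps)}) = {temp_range}")
# prints: 4 - (-7) = 11
```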

20  Average Deviation
› The average deviation of the data from its mean value.
› There are 4 steps:
1. Compute the mean of the dataset, x-bar
2. Calculate the absolute value of the difference between each data point, x_i, and the mean value, x-bar
3. Add up all of the values calculated in step 2
4. Divide the sum from step 3 by the total number of data points, n

21  Average Deviation, Example › Suppose you have the following four data points in your dataset: 1, 2, 4, 5. Find the average deviation.

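22  Average Deviation
› In mathematical shorthand, the average deviation can be expressed as: $\text{Average deviation} = \frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$
› A good method is to make a table:

x_i          |x_i - x-bar|    Result
1            |1 - 3|          2
2            |2 - 3|          1
4            |4 - 3|          1
5            |5 - 3|          2
12/4 = 3                      6/4 = 1.5

The mean is 12/4 = 3 and the average deviation is 6/4 = 1.5.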
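The four steps for the average deviation can be sketched in Python and checked against the table for the data points 1, 2, 4, 5. The function and variable names are illustrative.

```python
def average_deviation(values):
    """Mean of the absolute differences between each point and x-bar."""
    n = len(values)
    if n == 0:
        raise ValueError("average deviation requires at least one data point")
    x_bar = sum(values) / n                              # step 1: mean
    abs_deviations = [abs(x - x_bar) for x in values]    # step 2: |x_i - x-bar|
    total = sum(abs_deviations)                          # step 3: add them up
    return total / n                                     # step 4: divide by n


data = [1, 2, 4, 5]
avg_dev = average_deviation(data)
print(avg_dev)  # prints: 1.5
```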

23  Variance
› Similar to average deviation
1. Compute the mean of the dataset, x-bar
2. Calculate the difference between each data point, x_i, and the mean value, x-bar
3. Square all of the values in step 2
4. Add up all the values in step 3
5. Divide the sum in step 4 by the total number of data points

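24  Variance, Example
› Good idea to make a table similar to the one we used for average deviation

x_i          x_i - x-bar      (x_i - x-bar)^2
1            1 - 3 = -2       4
2            2 - 3 = -1       1
4            4 - 3 = 1        1
5            5 - 3 = 2        4
12/4 = 3                      10/4 = 2.5

The mean is 12/4 = 3 and the variance is 10/4 = 2.5.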

25  Variance › Mathematical shorthand: $\text{Variance} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$
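The five variance steps, dividing by the total number of data points n as the slides describe, can be sketched in Python. The function name `population_variance` is an illustrative choice; the result is cross-checked against `statistics.pvariance` from the standard library, which uses the same divide-by-n definition.

```python
from statistics import pvariance


def population_variance(values):
    """Mean of the squared differences between each point and x-bar."""
    n = len(values)
    if n == 0:
        raise ValueError("variance requires at least one data point")
    x_bar = sum(values) / n                        # step 1: mean
    differences = [x - x_bar for x in values]      # step 2: x_i - x-bar
    squared = [d ** 2 for d in differences]        # step 3: square each one
    total = sum(squared)                           # step 4: add them up
    return total / n                               # step 5: divide by n


data = [1, 2, 4, 5]
variance = population_variance(data)
print(variance)  # prints: 2.5
```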

26  Standard Deviation
› The standard deviation is just the square root of the variance
› By taking the square root, the units of the standard deviation are the same as the original units of the data
› In the previous example: $\sqrt{2.5} \approx 1.58$
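A minimal Python sketch of the standard deviation for the same four data points (1, 2, 4, 5), taking the square root of the divide-by-n variance. The function name is illustrative. Note that the standard library offers two versions: `statistics.pstdev` divides by n, matching these slides, while `statistics.stdev` divides by n - 1 and would give a different answer.

```python
import math
from statistics import pstdev


def standard_deviation(values):
    """Square root of the divide-by-n variance."""
    n = len(values)
    if n == 0:
        raise ValueError("standard deviation requires at least one data point")
    x_bar = sum(values) / n
    variance = sum((x - x_bar) ** 2 for x in values) / n
    return math.sqrt(variance)


data = [1, 2, 4, 5]
std_dev = standard_deviation(data)
print(round(std_dev, 2))  # prints: 1.58
```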

