Chapter 3: Data Description

Chapter 2: Organize data into charts/graphs. Chapter 3: Take those charts and graphs and come up with a summary: Measures of Central Tendency (mean, median, mode, and midrange), Measures of Variation (range, variance, and standard deviation), Measures of Position (percentiles, quartiles, and deciles).

3-1: Measures of Central Tendency Summarizing data using ‘middle’ values: Mean, Median, Mode and Mid-Range

What do we mean by the AVERAGE? Think on your own first and jot down a couple of ideas. Now, let’s discuss: why are there so many different options?

Statistic: a value obtained by using a SAMPLE. Parameter: a value obtained by using a POPULATION. Symbols for these come on the next slide.

MEAN: aka arithmetic average or average. Symbols: $\bar{X}$ (sample), $\mu$ (population). Sum of all the values divided by the total number of values (n: sample, N: population).

MEAN: (continued) Rounding rule: round to one more decimal place than what occurs in the raw data. Ex…if all data values are to the tenths, then the mean should be rounded to the hundredths. See p. 112, ex. #1 and 2.

MEDIAN: the halfway point in the data set. Symbol: MD. Arrange the data in order from smallest to largest and find the middle. What do you do if there is an even number of data values? Take the 2 middle values and average them together.

MODE: the value that occurs most often. There may be no mode, one mode (unimodal), two modes (bimodal), or many modes (multimodal). No mode: when NO value occurs more often than the others. We do not say it is 0.

MIDRANGE: Find the highest and the lowest values, add them together, and divide by 2. Symbol: MR

EXAMPLE: These values represent the number of short-term parking spaces at 15 different airports. Let’s find the Mean, Median, Mode and Midrange. 750 3400 1962 700 203 900 8662 260 1479 5905 9239 690 9822 2516 700

Mean: 3145.9 Median: 1479 Mode: 700 Midrange: 5012.5 Show calculator method
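The four answers above can be checked with a short Python sketch. It assumes the data set contains 700 twice (the stated mode of 700 and mean of 3145.9 both require it); the variable names are illustrative only.

```python
from statistics import mean, median, multimode

# Short-term parking spaces at 15 airports.
# Assumption: 700 occurs twice, as the stated mode requires.
spaces = [750, 3400, 1962, 700, 203, 900, 8662, 260, 1479,
          5905, 9239, 690, 9822, 2516, 700]

# Rounding rule: one more decimal place than the raw (whole-number) data.
data_mean = round(mean(spaces), 1)

# Middle value of the sorted data (15 values, so the 8th one).
data_median = median(spaces)

# multimode returns every value tied for most frequent.
# If no value repeats it returns all values, which the slides call "no mode".
data_modes = multimode(spaces)

# Midrange: (lowest + highest) / 2.
data_midrange = (min(spaces) + max(spaces)) / 2

print(data_mean, data_median, data_modes, data_midrange)
```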

WEIGHTED MEAN: multiply each value by its corresponding weight and sum, then divide by the sum of the weights. Ex…Grades in college classes
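As a sketch of the college-grades case, the weighted mean can be computed as below. The grade points and credit hours are made-up numbers for illustration, not data from the slides, and `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical semester: grade points (A=4, B=3, C=2, D=1) weighted by credit hours.
grade_points = [4, 2, 3, 1]
credit_hours = [3, 3, 4, 2]

gpa = weighted_mean(grade_points, credit_hours)
print(round(gpa, 1))
```

A plain mean of the four grades would treat a 2-credit class the same as a 4-credit class; the weights are what correct for that.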

Grouped Frequency Tables: do these on the calculator (otherwise LOTS of work). Find the midpoint of each class; that’s what goes in the first column. Second column: enter the frequency values. Stat → Calc → 1-Variable Stats (leave list 1 and list 2 alone) → Calculate. Let’s try one and evaluate the results (p. 124: 13).
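What the calculator does with the two columns can be sketched in Python: each class midpoint is weighted by its frequency. The class boundaries and frequencies below are hypothetical and are not the textbook exercise.

```python
# Hypothetical grouped frequency table: (lower boundary, upper boundary, frequency).
classes = [
    (5.5, 10.5, 1),
    (10.5, 15.5, 2),
    (15.5, 20.5, 3),
    (20.5, 25.5, 5),
    (25.5, 30.5, 4),
    (30.5, 35.5, 3),
    (35.5, 40.5, 2),
]

# First column on the calculator: the midpoint of each class.
midpoints = [(low + high) / 2 for low, high, _ in classes]
# Second column: the frequencies.
frequencies = [freq for _, _, freq in classes]

# Grouped mean: sum of (midpoint * frequency) divided by the total frequency.
n = sum(frequencies)
grouped_mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n
print(grouped_mean)
```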

3-2: Measures of Variation We all need to know more than JUST the mean

Two experimental brands of outdoor paint are tested to see how long each will last before fading. Six cans of each brand constitute a small population. The results (in months) are shown. Find the mean and range of each group.

The average for both brands is the same, but the range for Brand A is much greater than the range for Brand B. Think about: Which brand would you buy? Why? I would buy… Look at the graph on p. 128.

Range, Variance and Standard Deviation are used to help us make better decisions about our data. RANGE: highest value – lowest value. How can that be helpful to us? The larger the range, the larger the spread of the values…

Range is useful, but can’t tell us everything… Variance and standard deviation are usually MORE helpful. They are based on the distance EACH value is from the MEAN.

Uses of the Variance and Standard Deviation: to determine the spread of the data, and to determine the consistency of a variable. Let’s try one by hand because…

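Let’s go back to our paint problem. Let’s use column 2. Find the mean, then fill in the columns Value – Mean and (Value – Mean)². Population symbols: the variance is the sum of the last column divided by N, $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$. The standard deviation is the square root of the variance, $\sigma = \sqrt{\sigma^2}$.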
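The by-hand table (value, value minus mean, squared deviation) can be sketched in Python. The six values below are made up for illustration, since the paint table itself is not part of this transcript; they are treated as a small population, so the sum of squares is divided by N.

```python
from math import sqrt
from statistics import pvariance

# Hypothetical months-to-fade for six cans (a small population).
months = [10, 60, 50, 30, 40, 20]

N = len(months)
mu = sum(months) / N                      # population mean

# Build the by-hand table: value, value - mean, (value - mean)^2.
table = [(x, x - mu, (x - mu) ** 2) for x in months]
for row in table:
    print(row)

# Population variance: sum of the last column divided by N.
variance = sum(sq for _, _, sq in table) / N
# Population standard deviation: square root of the variance.
std_dev = sqrt(variance)

print(round(variance, 1), round(std_dev, 1))

# Cross-check against the standard library's population variance.
assert abs(variance - pvariance(months)) < 1e-9
```

Squaring keeps the positive and negative deviations from cancelling: without it the "Value – Mean" column always sums to zero.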

Rounding Rule: same as for mean – one more decimal place past what is given in the data. Why do we square the Value – Mean column… Symbols for Population: $\sigma^2$ (variance), $\sigma$ (standard deviation). Symbols for Sample: $s^2$ (variance), $s$ (standard deviation).

How to do it on the calculator… If it is one list of data, use 1 VAR stats. If it is two lists of data and you want them separately, use 1 VAR stats (L2). A frequency table is the only time (so far) where you use 2 VAR stats.

Showed grouped on calculator… do #20 p. 144 together. Coefficient of Variation: used when you want to compare 2 different sets of data with different units (all you need is mean & SD). $CVar = \frac{s}{\bar{X}} \cdot 100\%$ (samples), $CVar = \frac{\sigma}{\mu} \cdot 100\%$ (populations). 4th hour: $\bar{X} = 6.8$, $s = 1.2$. 7th hour: $\bar{X} = 7.2$, $s = 1.5$. $\frac{1.2}{6.8} \cdot 100 = 17.6\%$ and $\frac{1.5}{7.2} \cdot 100 = 20.8\%$, so 7th hour had more variation than 4th.
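The 4th-hour versus 7th-hour comparison can be reproduced with a few lines of Python. The function name is a hypothetical helper; the means and standard deviations are the ones given on the slide.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

cv_4th = coefficient_of_variation(1.2, 6.8)   # 4th hour: mean 6.8, s = 1.2
cv_7th = coefficient_of_variation(1.5, 7.2)   # 7th hour: mean 7.2, s = 1.5

print(round(cv_4th, 1), round(cv_7th, 1))
print("7th hour varies more" if cv_7th > cv_4th else "4th hour varies more")
```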

Range rule of thumb: used to ESTIMATE the standard deviation, $s \approx \frac{\text{range}}{4}$ (this is ONLY an approximation). Chebyshev’s Theorem: The proportion of values from a data set that will fall within k standard deviations of the mean will be at least $1 - \frac{1}{k^2}$, where k is a number greater than 1. (What does this mean?)

At least ¾ of the data values will fall within 2 standard deviations of the mean. OR: $\bar{X} \pm 2(SD)$ gives the lower and upper boundaries of the interval that reaches 2 standard deviations on either side of the mean. Let’s look at an example...

The mean price of houses in a certain neighborhood is $50,000, and the standard deviation is $10,000. Find the price range for which at least 75% of the houses will sell. Chebyshev’s Theorem states that at least 75% of a data set will fall within 2 standard deviations of the mean. 50,000 – 2(10,000) = 30,000 and 50,000 + 2(10,000) = 70,000, so at least 75% of the houses will sell for between $30,000 and $70,000. OR….sometimes you have to find k first

$k = \frac{\text{value} - \text{mean}}{\text{standard deviation}}$, and then use that in $1 - \frac{1}{k^2}$ ….look at example on p. 141
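Both directions of Chebyshev’s Theorem (from k to an interval, and from a value back to k) can be sketched as below, using the house-price numbers. The function names are hypothetical, and taking the absolute value when finding k is an added convenience so that values below the mean also work.

```python
def chebyshev_min_proportion(k):
    """At least this proportion of the data lies within k standard deviations."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def chebyshev_interval(mean, sd, k):
    """Lower and upper boundaries k standard deviations from the mean."""
    return mean - k * sd, mean + k * sd

def k_for_value(value, mean, sd):
    """Number of standard deviations a value is from the mean.
    Assumption: abs() is used so the result is a distance in either direction."""
    return abs(value - mean) / sd

# House prices: mean $50,000, standard deviation $10,000, k = 2.
print(chebyshev_min_proportion(2))            # proportion guaranteed within 2 SDs
print(chebyshev_interval(50_000, 10_000, 2))  # price boundaries

# Going the other way: find k for a given price, then the guaranteed proportion.
k = k_for_value(70_000, 50_000, 10_000)
print(k, chebyshev_min_proportion(k))
```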

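Empirical Rule: If the data is bell shaped, then it follows this pattern: approximately 68% of the values fall within 1 standard deviation of the mean, approximately 95% within 2, and approximately 99.7% within 3.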

3-3: Measures of Position -Standard Scores (z – scores) and Percentiles

Used to locate RELATIVE position within a data set. In your past: doctor visits when you were young, or Iowa Assessment Scores. Median = 50th percentile (not the same as %).

Z-scores (Standard Scores): You’ve heard that you can’t compare apples to oranges, but with this you can…sort of. $z = \frac{\text{value} - \text{mean}}{\text{standard deviation}}$ (have you seen this before?) EX: You scored a 78 on a test in English that had a mean of 70 with an sd of 3.5. In Science you scored an 85 on a test that had a mean of 80 with an sd of 2. Compare your relative position within each class. Z-score (English) = $\frac{78 - 70}{3.5} = 2.29$. Z-score (Science) = $\frac{85 - 80}{2} = 2.5$. Comparatively, you scored better in your science class.
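The English versus Science comparison can be checked with a short sketch; `z_score` is a hypothetical helper name and the numbers are the ones from the example.

```python
def z_score(value, mean, sd):
    """Number of standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

z_english = z_score(78, 70, 3.5)
z_science = z_score(85, 80, 2)

print(round(z_english, 2), round(z_science, 2))
# The larger z-score marks the better relative position.
print("Science" if z_science > z_english else "English")
```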

If you turn ALL the data into z-scores, then you have ‘adjusted’ the mean to 0 and the sd to 1. Therefore, the z-score = the number of sd’s the value is from the mean (Chebyshev’s thm). Very useful to us in the future. Percentiles: measures of position most often used in the education and health care fields. They help compare an individual to a group, and they divide the data set into 100 equal parts.

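Percentile graphs can be constructed, but we aren’t going to do that because computers can do that for us. We do need to be able to find percentile values though, using this formula: $\text{Percentile} = \frac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \cdot 100$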

A teacher gives a 20-point test to 10 students. Find the percentile rank of a score of 12. 18, 15, 12, 6, 8, 2, 3, 5, 20, 10 Step 1: Arrange the data in order from smallest to largest: 2, 3, 5, 6, 8, 10, 12, 15, 18, 20. Step 2: Plug into the formula: 6 values are below 12, so $\frac{6 + 0.5}{10} \cdot 100 = 65$. A student whose score was 12 did better than 65% of the class = 65th percentile.
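The percentile-rank calculation can be sketched as below. It assumes the formula (values below + 0.5) / total × 100, which reproduces the slide’s answer of 65; the function name is hypothetical.

```python
def percentile_rank(data, x):
    """Percentile rank of x: (number of values below x + 0.5) / n * 100."""
    below = sum(1 for value in data if value < x)
    return (below + 0.5) * 100 / len(data)

scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]

# Sorting is only needed when counting by hand; the count itself ignores order.
print(sorted(scores))
rank_of_12 = percentile_rank(scores, 12)
print(rank_of_12)
```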

What about if we need to go backwards? A teacher gives a 20-point test to 10 students. Find the value corresponding to the 25th percentile. 18, 15, 12, 6, 8, 2, 3, 5, 20, 10 Step 1: Order the data from smallest to largest: 2, 3, 5, 6, 8, 10, 12, 15, 18, 20. Step 2: Use the formula $c = \frac{n \cdot p}{100} = \frac{10 \cdot 25}{100} = 2.5$. Step 3: Always round up…this is the position in your data, not the ANSWER: 2.5 rounds up to 3, and the 3rd value is 5. The value 5 corresponds to the 25th percentile.
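Going backwards from a percentile to a value can be sketched as follows. It assumes the position formula c = n · p / 100 and follows the slide’s "always round up" rule; some textbooks handle a whole-number c differently, and this sketch does not. The function name is hypothetical.

```python
import math

def value_at_percentile(data, p):
    """Value at the p-th percentile using position c = n * p / 100, rounded up."""
    if not 0 < p <= 100:
        raise ValueError("p must be greater than 0 and at most 100")
    ordered = sorted(data)              # Step 1: order smallest to largest
    c = len(ordered) * p / 100          # Step 2: position formula
    position = math.ceil(c)             # Step 3: round up (a position, not the answer)
    return ordered[position - 1]        # positions count from 1, Python lists from 0

scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
p25_value = value_at_percentile(scores, 25)
print(p25_value)
```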

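Percentiles/Deciles/Quartiles – Relationship: Percentiles P1 – P100. Deciles D1 – D10 (D1 = P10, D2 = P20, …). Quartiles Q1 – Q4 (Q1 = P25, Q2 = P50 = median, Q3 = P75). Interquartile Range: IQR = Q3 – Q1. Identifying Outliers: multiply the IQR by 1.5, add that to Q3, and subtract it from Q1. Anything below Q1 – 1.5(IQR) or above Q3 + 1.5(IQR) will be considered an outlier.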
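The outlier check can be sketched in Python using the ten test scores from the percentile example. One assumption to flag: Q1 and Q3 are taken here as the medians of the lower and upper halves of the sorted data, which is one common classroom convention; calculators and software may use slightly different quartile rules.

```python
from statistics import median

def quartiles(data):
    """Q1, median, Q3 using the median-of-halves convention (an assumption)."""
    ordered = sorted(data)
    n = len(ordered)
    lower_half = ordered[: n // 2]            # values below the middle
    upper_half = ordered[(n + 1) // 2 :]      # values above the middle
    return median(lower_half), median(ordered), median(upper_half)

scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]

q1, q2, q3 = quartiles(scores)
iqr = q3 - q1

# Fences: subtract 1.5 * IQR from Q1 and add it to Q3.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in scores if x < lower_fence or x > upper_fence]
print(q1, q2, q3, iqr, (lower_fence, upper_fence), outliers)
```

With these scores nothing falls outside the fences, so the list of outliers is empty.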

3-4: Exploratory Data Analysis: BOXPLOTS or Stem and Leaf Plots. Already looked at Stem and Leaf. Boxplots (review): 5 important values: Minimum, Q1, Median, Q3, Maximum.

Boxplot – aka Box and Whisker Plot. It looks like a box with whiskers (segments) attached to each end of the box. Let’s draw one…