Unit 1 – Descriptive Statistics Throughout the course of these lectures we will work within this same scenario: We are a team of junior climate scientists.

Presentation transcript:

Unit 1 – Descriptive Statistics Throughout the course of these lectures we will work within this same scenario: We are a team of junior climate scientists who have been tasked by our superiors to gather and analyze the yearly temperature data for region CA105 (Tracy, CA). Our first task was to gather daily temperature measures for 15 consecutive days using precisely calibrated monitoring equipment at 1:00pm each day.

Unit 1 – Descriptive Statistics Data Set 1: Temperature (F) at 1:00pm for region CA105 (June 1 – June 15, 2015)

Unit 1 – Descriptive Statistics Lecture Notes – Part 1 Measures of Center: Mean, Median, Mode. Measures of Spread: Range, Interquartile Range, Standard Deviation.

Measures of Center Mean (Average) The mean is the average of the data values. That is, if the total of all the values were divided evenly among the data points, the mean is how much each point would get. X-bar is the symbol we use for the mean. To quickly calculate the mean, enter the data set into L1, then press STAT ►CALC ►1-Var Stats

Measures of Center Median (Middle) The Median is the Middle data point or, in the case of a data set with an even number of data points, the average of the two middle data points. M is the symbol we use for the median. To quickly calculate the Median, enter the data set into L1, then press STAT ►CALC ►1-Var Stats

Measures of Center Mode (Most Common) The Mode is the most frequent data point(s). The Mode is unique because there can be more than one in a given data set. The Mode is pretty much useless. There isn’t a shortcut to find the mode; however, sorting the list makes repeated values easier to spot. To sort List 1 Ascending: STAT ►EDIT ►SortA(L1)
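The three measures of center can also be checked off the calculator. The following is a minimal Python sketch, assuming a hypothetical 15-value list `temps`. It is not the real Data Set 1; the values were only chosen so that Q1 = 87, Q3 = 100 and the largest value is 120, the numbers quoted later in the 1.5 IQR Test slides.

```python
import statistics

# Hypothetical temperatures (F). NOT the real Data Set 1; made up for illustration.
temps = [80, 84, 86, 87, 90, 92, 95, 95, 96, 98, 100, 100, 104, 108, 120]

mean = statistics.mean(temps)        # sum of the values divided by how many there are
median = statistics.median(temps)    # middle value of the sorted list
modes = statistics.multimode(temps)  # every value tied for most frequent (Python 3.8+)

print("mean   =", round(mean, 2))
print("median =", median)
print("modes  =", modes)
```

`multimode` is used instead of `mode` because, as noted above, a data set can have more than one mode.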

Measures of Spread Range (Spread) The Range is the simplest way to measure the spread of a data set. To quickly calculate the Range, use the 1-Var Stats printout and subtract maxX – minX.

Measures of Spread Interquartile Range (IQR) The Interquartile Range is the distance between Quartiles 1 and 3 (IQR = Q3 – Q1). The best way to think of this is that Q1 and Q3 are the “Medians of the Median”: Q1 is the median of the lower half of the data and Q3 is the median of the upper half. These are sometimes easy to find by hand and sometimes a little complicated (even number of data points). Use the 1-Var Stats printout as a shortcut.
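As a rough sketch of the "median of each half" idea, the Python below computes the Range and IQR by hand. The helper name `quartiles` and the list `temps` are assumptions made for illustration (the list is hypothetical, not the real Data Set 1). Other software may use a slightly different quartile rule and give slightly different answers.

```python
import statistics

def quartiles(data):
    """Q1 and Q3 as the medians of the lower and upper halves.

    When the number of points is odd, the overall median is left out of
    both halves. Needs at least 2 data points.
    """
    if len(data) < 2:
        raise ValueError("need at least 2 data points")
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]    # values below the median position
    upper = s[-half:]   # values above the median position
    return statistics.median(lower), statistics.median(upper)

# Hypothetical temperatures (F). NOT the real Data Set 1.
temps = [80, 84, 86, 87, 90, 92, 95, 95, 96, 98, 100, 100, 104, 108, 120]

data_range = max(temps) - min(temps)   # maxX - minX
q1, q3 = quartiles(temps)
iqr = q3 - q1

print("range =", data_range)
print("Q1 =", q1, " Q3 =", q3, " IQR =", iqr)
```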

Measures of Spread Standard Deviation (σ “sigma”) The Standard Deviation is the most common measure of spread. Notice that in the 1-Var Stats printout we read the Standard Deviation from Sx (the symbol s) rather than σx (sigma). We will discuss why at a later date.

Measures of Spread Standard Deviation (σ “sigma”)
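For readers who want to see the arithmetic behind the printout, here is a minimal Python sketch of both versions of the standard deviation: s (divide by n − 1, the calculator's Sx) and sigma (divide by n, the calculator's σx). The list `temps` is hypothetical, not the real Data Set 1.

```python
import math
import statistics

# Hypothetical temperatures (F). NOT the real Data Set 1.
temps = [80, 84, 86, 87, 90, 92, 95, 95, 96, 98, 100, 100, 104, 108, 120]

n = len(temps)
mean = sum(temps) / n

# Sum of squared distances from the mean.
sum_sq = sum((x - mean) ** 2 for x in temps)

s = math.sqrt(sum_sq / (n - 1))   # sample standard deviation (Sx)
sigma = math.sqrt(sum_sq / n)     # population standard deviation (sigma-x)

# Cross-check against the standard library.
lib_s = statistics.stdev(temps)
lib_sigma = statistics.pstdev(temps)

print("s     =", round(s, 3))
print("sigma =", round(sigma, 3))
```

Because s divides by a smaller number, it is always a little larger than sigma for the same data.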

Unit 1 – Descriptive Statistics Lecture Notes – Part 2 Outliers, 1.5 IQR Test, Resistant Measure, Not Resistant

Outliers Outliers are data points which are far enough away from the rest of the data set to be considered abnormal. The test that is typically applied to determine if a data point is an outlier is called the 1.5 IQR Test

1.5 IQR Test To conduct the 1.5 IQR Test, first find the IQR (Interquartile Range). IQR = Q3 – Q1. IQR = 100 – 87 = 13 Next, multiply the IQR by 1.5: 1.5 x 13 = 19.5

1.5 IQR Test cont. Now take that value (19.5) and do this: 1st: Subtract it from Q1: 87 – 19.5 = 67.5 2nd: Add it to Q3: 100 + 19.5 = 119.5 Any data point that falls within this interval (67.5 to 119.5) is not an outlier. Any data point that falls outside of this interval is considered an outlier.
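The 1.5 IQR Test above can be sketched in a few lines of Python. Q1 = 87 and Q3 = 100 come from the slides; the function name `iqr_fences` and the list `temps` are assumptions for illustration (the list is hypothetical, not the real Data Set 1).

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (low, high) cutoffs of the 1.5 IQR Test."""
    step = k * (q3 - q1)          # 1.5 x IQR
    return q1 - step, q3 + step

low, high = iqr_fences(87, 100)   # Q1 and Q3 from the slides
print("interval:", low, "to", high)

# Hypothetical temperatures (F). NOT the real Data Set 1.
temps = [80, 84, 86, 87, 90, 92, 95, 95, 96, 98, 100, 100, 104, 108, 120]

# Anything outside the interval is flagged as an outlier.
outliers = [x for x in temps if x < low or x > high]
print("outliers:", outliers)
```

With these cutoffs a reading of 120 sits just above 119.5, so it is flagged.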

Resistant vs. Not Resistant Outliers are important because they can influence the behavior of other statistics. Some Statistical measures are “Resistant” – that is, they are not influenced by an outlier. Some are “Not Resistant” – they are influenced by outliers

Resistant vs. Not Resistant The following statistical measures ARE resistant: Median, IQR. The following statistical measures are NOT resistant: Mean, Range, Standard Deviation.

Resistant vs. Not Resistant The following statistical measures ARE resistant: Median, IQR. The Median and the IQR simply are not impacted by the presence of an outlier. Try changing 120 to a different value, for example, 110, and note that both the Median and IQR remain the same. This is because both values measure the “middleness” of the data set, so changing the extremes has no impact on them.

Resistant vs. Not Resistant The following statistical measures are NOT resistant: Mean, Range, Standard Deviation. All 3 of these values are impacted by the presence of an outlier, but we typically don’t worry much about the Range. The impact on the Mean and Standard Deviation is the most important. Try changing our outlier to 110 to see what happens to both the mean and standard deviation.

Resistant vs. Not Resistant Why does this matter? Outliers cause “skew” in our data set, which will be discussed later. For now, try looking back at the other 3 data sets we have worked with. Do any of those data sets have outliers? Do any have no outliers? What do you notice about the relationship between the Median and the Mean when there is an outlier vs. when there isn’t?

Resistant vs. Not Resistant You should notice that for a data set with no outliers, the Median and Mean are very close together. In a data set with a high outlier, the Mean > Median. In a data set with a low outlier, the Mean < Median. Talk to your neighbor about why this is the case. In either case, what will be the impact of the outlier on standard deviation?
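The "change 120 to 110" experiment can be tried in Python as well. This is only a sketch: the helpers `quartiles` and `summary` and the list `with_outlier` are assumptions for illustration, and the data are hypothetical (not the real Data Set 1).

```python
import statistics

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves (needs 2+ points)."""
    s = sorted(data)
    half = len(s) // 2
    return statistics.median(s[:half]), statistics.median(s[-half:])

def summary(data):
    """Collect the measures of center and spread in one dictionary."""
    q1, q3 = quartiles(data)
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(data),   # sample standard deviation, s
    }

# Hypothetical temperatures (F). NOT the real Data Set 1.
with_outlier = [80, 84, 86, 87, 90, 92, 95, 95, 96, 98, 100, 100, 104, 108, 120]
changed = [110 if x == 120 else x for x in with_outlier]

before = summary(with_outlier)
after = summary(changed)

for name in before:
    print(f"{name:>6}: {before[name]:.2f} -> {after[name]:.2f}")
```

In this made-up data set the median and IQR stay put, while the mean, range and standard deviation all shrink when the high value is pulled in.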

Unit 1 – Descriptive Statistics Lecture Notes – Part 3 1.5 IQR Test Shortcut Additive Transformations

1.5 IQR Shortcut We’ll learn more about Box and Whisker Plots later but we might as well see them now. Steps: 1.►STAT PLOT 2.Stat Plot 1 ► Turn On ► Type: Modified Box Plot 3.►Zoom ►

1.5 IQR Shortcut Modified Box Plot Now press Trace. The following will be displayed: Min Q1 Med Q3 Max Outlier(s)

Additive Transformation We just got bad news from our project manager – apparently our equipment wasn’t calibrated correctly. After some testing, it was found that all of the temperature readings were 4 degrees too high. To adjust our data set, we simply use the formula: y = x – 4 Where x is the old data and y is the new data.

Additive Transformation y = x – 4 Predict: What will happen to each measure? Center: Mean, Median, Mode. Spread: Range, IQR, Standard Deviation. What will happen to the outliers?

Additive Transformation y = x – 4 Mean = decreases by 4; Median = decreases by 4; Mode = decrease(s) by 4; Range = no change; IQR = no change; Standard Deviation = no change; Outliers = decrease by 4
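These predictions can be checked with a short Python sketch of y = x – 4. The list `temps` is hypothetical (not the real Data Set 1) and the helper `quartiles` is an assumption for illustration.

```python
import math
import statistics

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves (needs 2+ points)."""
    s = sorted(data)
    half = len(s) // 2
    return statistics.median(s[:half]), statistics.median(s[-half:])

# Hypothetical temperatures (F). NOT the real Data Set 1.
temps = [80, 84, 86, 87, 90, 92, 95, 95, 96, 98, 100, 100, 104, 108, 120]
adjusted = [x - 4 for x in temps]          # y = x - 4

# Measures of center move with the data.
mean_shift = statistics.mean(adjusted) - statistics.mean(temps)
median_shift = statistics.median(adjusted) - statistics.median(temps)
print("mean shift  :", round(mean_shift, 6))
print("median shift:", median_shift)
print("modes       :", statistics.multimode(temps), "->", statistics.multimode(adjusted))

# Measures of spread stay the same.
q1, q3 = quartiles(temps)
aq1, aq3 = quartiles(adjusted)
print("range:", max(temps) - min(temps), "->", max(adjusted) - min(adjusted))
print("IQR  :", q3 - q1, "->", aq3 - aq1)
print("stdev:", round(statistics.stdev(temps), 3), "->", round(statistics.stdev(adjusted), 3))
```

Every point slides down by the same amount, so the distances between points (and therefore every measure of spread) do not change.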

Unit 1 – Descriptive Statistics Lecture Notes – Part 4 Multiplicative Transformation

We just got even worse news from our project manager – apparently our equipment was really acting up. After some additional testing, it was found that all of the temperature readings were 10% too high and need to be multiplied by 0.9 to correct for the error. To adjust our data set, we simply use the formula: y = 0.9x

Multiplicative Transformation Predict: What will happen to each measure? Center: Mean, Median, Mode. Spread: Range, IQR, Standard Deviation. What will happen to the outliers?

Multiplicative Transformation Mean = decreases by 10% ► 82.5; Median = decreases by 10% 91 ► 81.9; Mode = decrease(s) by 10% 91 and 96 ► 81.9 and 86.4; Range = decreases by 10% 40 ► 36; IQR = decreases by 10% 13 ► 11.7; Standard Deviation = decreases by 10% ►; Outliers = decrease by 10% 116 ► 104.4
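A short Python sketch of y = 0.9x shows the same pattern. The list `adjusted` is hypothetical (not the real data); it was chosen so that its median, modes, range, IQR and high value match the starting numbers listed above (91, 91 and 96, 40, 13, 116). The helper `quartiles` is an assumption for illustration.

```python
import math
import statistics

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves (needs 2+ points)."""
    s = sorted(data)
    half = len(s) // 2
    return statistics.median(s[:half]), statistics.median(s[-half:])

# Hypothetical temperatures (F) after the "minus 4" correction. NOT the real data.
adjusted = [76, 80, 82, 83, 86, 88, 91, 91, 92, 94, 96, 96, 100, 104, 116]
scaled = [0.9 * x for x in adjusted]       # y = 0.9x

q1, q3 = quartiles(adjusted)
sq1, sq3 = quartiles(scaled)

mean_scaled = statistics.mean(scaled)
median_scaled = statistics.median(scaled)
range_scaled = max(scaled) - min(scaled)
iqr_scaled = sq3 - sq1
stdev_ratio = statistics.stdev(scaled) / statistics.stdev(adjusted)

print("mean  :", round(statistics.mean(adjusted), 2), "->", round(mean_scaled, 2))
print("median:", statistics.median(adjusted), "->", round(median_scaled, 2))
print("range :", max(adjusted) - min(adjusted), "->", round(range_scaled, 2))
print("IQR   :", q3 - q1, "->", round(iqr_scaled, 2))
print("stdev ratio (new / old):", round(stdev_ratio, 4))
```

Multiplying stretches or shrinks the distances between points too, which is why the measures of spread change here but did not change under y = x – 4.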

Unit 1.1 Concept Check Using Flashcards, Notes, Warmups, Homework, etc., check with a partner for the remainder of the period that you each understand all of the following concepts.
Center vs. Spread: Mean, Median, Mode, Range, IQR, Standard Deviation, Outlier, Resistant vs. Not Resistant, Outliers’ effect on the Mean, Median, Mode, Range, IQR, Standard Deviation
1.5 IQR Test: Box and Whisker Plot, Additive Transformations (+ Impact on each measure), Multiplicative Transformation (+ Impact on each measure)
Calculator Skills: Using Lists, Unarchiving Lists, Sorting Lists, 1-Var Stats, Stat Plots, Modified Box Plot, Trace, StatZoom, Side by Side Box Plots