Unit 1 – Data Analysis, Newton - AP Statistics

Unit outline:
Introduction: Making Sense of Data
1.1: Analyzing Categorical Data
1.2: Displaying Quantitative Data with Graphs
    Unit 1 Quiz on August 18th
1.3: Describing Quantitative Data with Numbers
    Unit 1 Test on September 1st

Introduction: Making Sense of Data
Key Terms:
Data Analysis
Individuals
Variables
- Categorical Variables
- Quantitative Variables
- Distribution
- Inference

Introduction: Making Sense of Data
Day        Low  High  Humidity  Precipitation  Air Quality
Monday     63   78    45%       Light          Good
Tuesday    66   88    13%       No             Fair
Wednesday  58   90    10%       No             Poor
Thursday   59   92    8%        No             Poor
Friday     60   97    18%       No             Fair
Saturday   63   96    15%       No             Fair
Sunday     64   98    8%        No             Poor
Identify the individuals and the variables.

Introduction: Making Sense of Data
Using the same weather table (Day, Low, High, Humidity, Precipitation, Air Quality), classify each variable as categorical or quantitative.

1.1: Analyzing Categorical Data
Key Terms:
Frequency
Relative Frequency
Bar Graphs
Pie Charts
Two-way Tables
- Marginal Distributions
- Conditional Distributions
- Side-by-side Bar Graphs
- Segmented Bar Graphs
Association
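
As a rough illustration of the first two key terms, the sketch below tallies the Air Quality column of the weather table from the Introduction. Frequency is the count for each category; relative frequency is that count divided by the total. The variable names are illustrative choices, not part of the original slides.

```python
from collections import Counter

# Air Quality column from the weekly weather table (Monday through Sunday).
air_quality = ["Good", "Fair", "Poor", "Poor", "Fair", "Fair", "Poor"]

# Frequency: how many days fall into each category.
frequency = Counter(air_quality)

# Relative frequency: each count divided by the total number of days.
total = len(air_quality)
relative_frequency = {level: count / total for level, count in frequency.items()}

for level, count in frequency.items():
    print(f"{level}: frequency = {count}, relative frequency = {relative_frequency[level]:.1%}")
```

These counts are what a bar graph would plot; the relative frequencies are what a pie chart would show.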

1.2: Displaying Quantitative Data with Graphs
Key Terms:
Dot plot
Stem Plots
Histograms
“SOCS”
- Shape [Skew Left or Right, Symmetric]
- Outliers [1.5 IQR Test]
- Center [Mean, Median, Mode]
- Spread [Range, IQR, Standard Deviation]
Additional Topics:
Bimodal or Multimodal

1.2: Displaying Quantitative Data with Graphs
For each column of the weather table from the Introduction (Day, Low, High, Humidity, Precipitation, Air Quality), identify the most appropriate graphing technique.

1.3: Describing Quantitative Data with Numbers
Key Terms:
Measures of Center
- Mean
- Median
1.5 IQR Test
Five Number Summary
- Quartiles
Boxplot
Measures of Spread (Variability)
- Range and IQR
- Standard Deviation and Variance
Organizing a Statistics Problem:
I. State the question you are answering.
II. Plan how to answer the question with tools.
III. Create graphs and do calculations.
IV. Conclude using the problem’s setting.
Additional Topics:
Resistant Measures
Transformations

Mean (Average)
The mean is the average of the data values: add up all of the values and divide by how many there are. That is, if the total were divided evenly among all of the data points, the mean is how much each would get. x̄ (“x-bar”) is the symbol we use for the mean. To quickly calculate the mean, enter the data set into L1, then press STAT ► CALC ► 1-Var Stats.
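
For readers working away from the calculator, here is a minimal sketch of the same calculation. It uses the High column of the weather table from the Introduction as sample data; that choice, and the variable names, are assumptions made for illustration.

```python
# High temperatures from the weekly weather table (Monday through Sunday).
highs = [78, 88, 90, 92, 97, 96, 98]

# Mean = sum of the data values divided by the number of data values.
mean_high = sum(highs) / len(highs)

print(round(mean_high, 2))
```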

Median (Middle)
The Median is the middle data point once the data are sorted or, in the case of a data set with an even number of data points, the average of the two middle data points. M is the symbol we use for the median. To quickly calculate the Median, enter the data set into L1, then press STAT ► CALC ► 1-Var Stats.
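
The odd and even cases described above can be sketched as follows. The `median` helper is a hypothetical function written for this note, and the data are the High temperatures from the weather table in the Introduction.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# High temperatures from the weekly weather table (Monday through Sunday).
highs = [78, 88, 90, 92, 97, 96, 98]

median_all_week = median(highs)       # 7 values: a single middle value
median_mon_to_sat = median(highs[:6])  # 6 values: average of the two middle values

print(median_all_week, median_mon_to_sat)
```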

Mode (Most Common)
The Mode is the most frequent data point(s). The Mode is unique because there can be more than one in a given data set. The Mode is pretty much useless. There isn’t a shortcut to find the mode; however, you can sort a list, which helps you find the repeated values faster. To sort List 1 ascending: STAT ► EDIT ► SortA(L1)
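
Outside the calculator, one way to find every mode at once is Python's `statistics.multimode` (available in Python 3.8 and later). The sketch below applies it to two columns of the weather table from the Introduction; using those columns is an illustrative choice.

```python
import statistics

# Low temperatures and Air Quality ratings from the weekly weather table.
lows = [63, 66, 58, 59, 60, 63, 64]
air_quality = ["Good", "Fair", "Poor", "Poor", "Fair", "Fair", "Poor"]

# multimode returns a list because a data set can have more than one mode.
low_modes = statistics.multimode(lows)
air_quality_modes = statistics.multimode(air_quality)

print(low_modes)
print(air_quality_modes)
```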

Range (Spread)
The Range is the simplest way to measure the spread of a data set: it is the distance between the largest and smallest values. To quickly calculate the Range, use the 1-Var Stats printout and subtract maxX – minX.
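
A minimal sketch of the same subtraction, using the High column of the weather table from the Introduction as assumed sample data:

```python
# High temperatures from the weekly weather table (Monday through Sunday).
highs = [78, 88, 90, 92, 97, 96, 98]

# Range = largest value minus smallest value (maxX - minX on the printout).
data_range = max(highs) - min(highs)

print(data_range)
```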

Interquartile Range (IQR)
The Interquartile Range is the distance between Quartiles 1 and 3: IQR = Q3 – Q1. The best way to think of this is that Q1 and Q3 are the “Medians of the Median”: Q1 is the median of the lower half of the data and Q3 is the median of the upper half. These are sometimes easy to find by hand and sometimes a little complicated (for example, with an even number of data points). Use the 1-Var Stats printout as a shortcut.
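
The "median of each half" idea can be sketched as below. This assumes the common classroom convention of leaving the overall median out of both halves when the number of data points is odd; other software may use a different quartile rule and give slightly different values. The helper names are hypothetical, and the data are the High temperatures from the weather table in the Introduction.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]   # values below the median position
    upper = ordered[-half:]  # values above the median position
    return median(lower), median(upper)

# High temperatures from the weekly weather table (Monday through Sunday).
highs = [78, 88, 90, 92, 97, 96, 98]

q1, q3 = quartiles(highs)
iqr = q3 - q1

print(q1, q3, iqr)
```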

Standard Deviation ( σ “sigma”)
The Standard Deviation is the most common measure of spread. Notice that in the 1-Var Stats printout, s is the symbol for Standard Deviation, rather than sigma. We will discuss why at a later date.
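
As a preview only, the sketch below computes both versions by hand and checks them against Python's `statistics` module. It assumes the standard definitions: s divides the summed squared deviations by n – 1, while σ divides by n. The data are the High temperatures from the weather table in the Introduction, chosen here for illustration.

```python
import math
import statistics

# High temperatures from the weekly weather table (Monday through Sunday).
highs = [78, 88, 90, 92, 97, 96, 98]

n = len(highs)
mean = sum(highs) / n

# Sum of squared distances from the mean.
squared_deviations = sum((x - mean) ** 2 for x in highs)

s = math.sqrt(squared_deviations / (n - 1))  # sample standard deviation
sigma = math.sqrt(squared_deviations / n)    # population standard deviation

# The library functions should agree with the hand calculation.
print(math.isclose(s, statistics.stdev(highs)))
print(math.isclose(sigma, statistics.pstdev(highs)))
```

For the same data, s is always a little larger than σ because it divides by a smaller number.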

Outliers
Outliers are data points which are far enough away from the rest of the data set to be considered abnormal. The test that is typically applied to determine if a data point is an outlier is called the 1.5 IQR Test.

1.5 IQR Test
To conduct the 1.5 IQR Test, first find the IQR (Interquartile Range): IQR = Q3 – Q1. In our example, IQR = 100 – 87 = 13. Next, multiply the IQR by 1.5: 1.5 x 13 = 19.5.

Now take that value (19.5) and do this:
1st: Subtract it from Q1: 87 – 19.5 = 67.5
2nd: Add it to Q3: 100 + 19.5 = 119.5
Any data point that falls within this interval (67.5 to 119.5) is not an outlier. Data points that fall outside of this interval are considered outliers.
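
The two steps above can be sketched as a small check. Q1 = 87 and Q3 = 100 come from the slides; the function name `is_outlier` and the test values 120 and 110 (the values mentioned in the resistance slides) are illustrative choices.

```python
# Quartiles from the worked example on the slides.
q1 = 87
q3 = 100

iqr = q3 - q1       # 100 - 87
step = 1.5 * iqr    # 1.5 x IQR

lower_fence = q1 - step
upper_fence = q3 + step

def is_outlier(value):
    """A point is an outlier if it falls outside the interval between the fences."""
    return value < lower_fence or value > upper_fence

print(lower_fence, upper_fence)
print(is_outlier(120), is_outlier(110))
```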

1.5 IQR Shortcut
Steps:
1. ► STAT PLOT
2. Stat Plot 1 ► Turn On ► Type: Modified Box Plot
3. ► Zoom ►

Modified Box Plot
Now press Trace. The following will be displayed: Min, Q1, Med, Q3, Max, and any Outlier(s).

Resistant vs. Not Resistant
Outliers are important because they can influence the behavior of other statistics. Some statistical measures are “Resistant”: they are not influenced by an outlier. Others are “Not Resistant”: they are influenced by outliers.

Additive Transformation
We just got bad news from our project manager: apparently our equipment wasn’t calibrated correctly. After some testing, it was found that all of the temperature readings were 4 degrees too high. To adjust our data set, we simply use the formula y = x – 4, where x is the old data and y is the new data.

Resistant vs. Not Resistant
The following statistical measures ARE resistant: Median, IQR.
The following statistical measures are NOT resistant: Mean, Range, Standard Deviation.

Resistant vs. Not Resistant
The following statistical measures ARE resistant: Median, IQR.
The Median and the IQR simply are not impacted by the presence of an outlier. Try changing 120 to a different value, for example 110, and note that both the Median and IQR remain the same. This is because both values measure the “middleness” of the data set. Changing the extremes has no impact on them.

Resistant vs. Not Resistant
The following statistical measures are NOT resistant: Mean, Range, Standard Deviation.
All 3 of these values are impacted by the presence of an outlier, but we typically don’t worry much about the Range. The impact on the Mean and Standard Deviation is the most important. Try changing our outlier to 110 to see what happens to both the mean and standard deviation.
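
The experiment of swapping 120 for 110 can be sketched as follows. The class data set is not reproduced in this transcript, so the list below is hypothetical: nine made-up readings whose largest value is 120. The `iqr` helper uses the "median of each half" rule and is likewise an illustrative assumption.

```python
import statistics

def iqr(values):
    """IQR via the median of each half (middle value excluded when n is odd)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return statistics.median(ordered[-half:]) - statistics.median(ordered[:half])

# Hypothetical readings (not the class data set); 120 plays the role of the outlier.
with_120 = [85, 87, 90, 95, 95, 100, 100, 102, 120]
with_110 = [85, 87, 90, 95, 95, 100, 100, 102, 110]

for label, data in (("outlier = 120", with_120), ("outlier = 110", with_110)):
    print(
        label,
        "| median:", statistics.median(data),
        "| IQR:", iqr(data),
        "| mean:", round(statistics.mean(data), 2),
        "| std dev:", round(statistics.stdev(data), 2),
    )
```

In this made-up data set the median and IQR print the same values on both lines, while the mean and standard deviation both shift when the extreme value moves.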

Resistant vs. Not Resistant
Why does this matter? Outliers cause “skew” in our data set, which will be discussed later. For now, try looking back at the other 3 data sets we have worked with. Do any of those data sets have outliers? Do any have no outliers? What do you notice about the relationship between the Median and the Mean when there is an outlier vs. when there isn’t?

Resistant vs. Not Resistant
You should notice that for a data set with no outliers, the Median and Mean are very close together. In a data set with a high outlier, Mean > Median. In a data set with a low outlier, Mean < Median. Talk to your neighbor about why this is the case. In either case, what will be the impact of the outlier on standard deviation?

Multiplicative Transformation
We just got even worse news from our project manager: apparently our equipment was really acting up. After some additional testing, it was found that all of the temperature readings were 10% too high and need to be multiplied by 0.9 to correct for the error. To adjust our data set, we simply use the formula y = 0.9x.

Additive Transformation
y = x – 4
Predict: What will happen to each measure?
Center: Mean, Median, Mode
Spread: Range, IQR, Standard Deviation
What will happen to the outliers?

Additive Transformation
y = x – 4
Mean = decreases by 4
Median = decreases by 4
Mode = decrease(s) by 4
Range = no change
IQR = no change
Standard Deviation = no change
Outliers = decrease by 4
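
These results can be checked numerically. The sketch below applies y = x – 4 to the High temperatures from the weather table in the Introduction; that table is used here only as convenient sample data, and the `summary` helper is a hypothetical name.

```python
import math
import statistics

def summary(data):
    """Collect a few measures of center and spread for comparison."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "range": max(data) - min(data),
        "stdev": statistics.stdev(data),
    }

# High temperatures from the weekly weather table (Monday through Sunday).
highs = [78, 88, 90, 92, 97, 96, 98]

# Additive transformation: subtract 4 from every reading.
shifted = [x - 4 for x in highs]

before = summary(highs)
after = summary(shifted)

for name in before:
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f}")
```

Subtracting a constant slides every point the same distance, so the measures of center move by 4 while the distances between points, and therefore the measures of spread, stay put.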

Multiplicative Transformation
Predict: What will happen to each measure?
Center: Mean, Median, Mode
Spread: Range, IQR, Standard Deviation
What will happen to the outliers?

Multiplicative Transformation
Mean = decreases by 10% ► 82.5
Median = decreases by 10%: 91 ► 81.9
Mode = decrease(s) by 10%: 91 and 96 ► 81.9 and 86.4
Range = decreases by 10%: 40 ► 36
IQR = decreases by 10%: 13 ► 11.7
Standard Deviation = decreases by 10%
Outliers = decrease by 10%: 116 ► 104.4
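
The arithmetic on this slide can be reproduced with y = 0.9x. In the sketch below, the "before" values are the ones listed on the slide; the second half uses the High temperatures from the weather table in the Introduction as assumed sample data to show that the standard deviation scales by the same factor.

```python
import math
import statistics

# "Before" values taken from the slide.
before = {"Median": 91, "Range": 40, "IQR": 13, "Outlier": 116}

# Multiplicative transformation: y = 0.9x, rounded to one decimal place.
after = {name: round(0.9 * value, 1) for name, value in before.items()}
print(after)

# Illustration with sample data: the standard deviation also scales by 0.9.
highs = [78, 88, 90, 92, 97, 96, 98]
scaled = [0.9 * x for x in highs]

print(math.isclose(statistics.stdev(scaled), 0.9 * statistics.stdev(highs)))
```

Unlike the additive case, multiplying stretches or shrinks the distances between points, so measures of center and measures of spread all change by the same 10%.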