Introduction to the Practice of Statistics Fifth Edition Chapter 1: Looking at Data—Distributions Copyright © 2005 by W. H. Freeman and Company Modifications.

Presentation transcript:

Introduction to the Practice of Statistics, Fifth Edition. David S. Moore, George P. McCabe. Chapter 1: Looking at Data—Distributions. Copyright © 2005 by W. H. Freeman and Company. Modifications and Additions by M. Leigh Lunsford.

Technology Requirements TI-83 SPSS Excel Data Analysis Excel Macros Data Sets in SPSS and Excel Format on CD See my website for more details:

The Science of Learning from Data The Collection and Analysis of Data What is Statistics?? Experimental Design Chapter 3 Descriptive Statistics (Data Exploration) Chapters 1, 2 Inferential Statistics Chapters Probability Chapter 4

Chapter 1 - Looking at Data 1.1 Displaying Distributions with Graphs 1.2 Describing Distributions with Numbers 1.3 Density Curves and Normal Distributions

Section 1.1 Displaying Distributions with Graphs

Data Basics

Variable Types

An Example (p. 5)

Graphs for Categorical Vars. Bar Graphs Pie Charts Educational Level Example (page 7): –A Bar Graph by Hand –A Pie Chart by Hand Homework: Try to do these in Excel!

Graphs for Quantitative Data Stemplots (Stem and Leaf Plots) –Generally for small data sets Histograms Time Plots (if applicable) Let’s look at an example to see what types of questions one may ask and how these plots help to visualize the answers!

Example 1.7 Page 14 Descriptive and Inferential Stats 1. What percent of the 60 randomly chosen fifth grade students have an IQ score of at least 120? 2. Based on this data, approximately what percent of all fifth grade students have an IQ score of at least 120? 3. What is the average IQ score of the fifth grade students in this sample? 4. Based on this data, what is the average IQ score of all fifth grade students (i.e. the population) from which the sample was drawn? Inferential? 2 and 4. Descriptive? 1 and 3.

Let’s Make a Stemplot! An Example (Ex. 1.7 p.14) Data in Table 1.3 p. 14 (and on next slide)

Stem and Leaf Plot for Example IQ Test Scores for 60 Randomly Chosen 5th Grade Students. Generated Using the Descriptive Statistics Menu on Megastat. Stem and Leaf plot for iq: stem unit = 10, leaf unit = 1. [Plot columns: Frequency, Stem, Leaf; values not recovered]

Now Let’s Make a Histogram! Use the Same Data in Example 1.7 (Data in Table 1.3) We will start by hand….using class widths of 10 starting at 80… Let’s try using Megastat (Excel file on Disk)! Compare the Stemplot to the Histogram!

Histogram for Example: iq. [Frequency table columns: lower, upper, midpoint, width, frequency, percent, cumulative frequency, cumulative percent; values not recovered] Compare this Histogram to the Stem & Leaf Plot we Generated Earlier!

Recall Our Earlier Question 1: What percent of the 60 randomly chosen fifth grade students have an IQ score of at least 120? Numerically? (11+9+2)/60 = 0.367, or 36.7%. Summing the rounded class percents gives 18.3% + 15% + 3.3% = 36.6%; the 0.1% gap is rounding. How to Represent Graphically? The Grey Shaded Region corresponds to this 36.6% of the data.
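
The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch and not part of the original slides (which use Megastat and the TI-83); the variable names are made up for illustration, and the counts 11, 9 and 2 are the ones quoted on the slide.

```python
# Counts of students in the three classes with IQ of at least 120, as quoted on the slide.
counts_at_least_120 = [11, 9, 2]
n_students = 60

# Exact percent from the raw counts.
exact_percent = 100 * sum(counts_at_least_120) / n_students

# Percent obtained by rounding each class percent to one decimal first, then summing.
sum_of_rounded = sum(round(100 * c / n_students, 1) for c in counts_at_least_120)

print(f"exact: {exact_percent:.1f}%  sum of rounded class percents: {sum_of_rounded:.1f}%")
```

The two answers differ only because each class percent was rounded before adding.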

What is Different From the Histogram we Generated In Class??

Let’s Look at the Distribution we Just Created: Overall Pattern: Shape (modes, tails (skewness), symmetry) Center (mean, median) Spread (range, IQR, standard deviation) Deviations: Outliers Descriptors we will be interested in for data and population distributions.

Overall Pattern: Shape, Center, Spread? Deviations: Outliers? Example 1.9 page 18-19

Data Analysis – An Interesting Example (Example 1.10, p. 9-10) 80 Calls

Overall Pattern: Shape, Center, Spread? Deviations: Outliers?

Time Plots – For Data Collected Over Time… Example: Mississippi River Discharge p.19 (data p. 21)

Example – Dealing with Seasonal Variation

Extra Slides from Homework Problem 1.19 Problem 1.20 Problem 1.21 Problem 1.31 Problem 1.36 Problem

Problem 1.19, page 30

Problem 1.20, page 31

Problem 1.21, page 31

Problem 1.31, page 36

Problem 1.36, page 38

Problems 1.37 – 1.39

Section 1.2 Describing Distributions with Numbers

Types of Measures Measures of Center: –Mean, Median, Mode Measures of Spread: –Range (Max-Min), Standard Deviation, Quartiles, IQR

Means and Medians Consider the following sample of test scores from one of Dr. L.’s recent classes (max score = 100): 65, 65, 70, 75, 78, 80, 83, 87, 91, 94 What is the Average (or Mean) Test Score? What is the Median Test Score?
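
As a quick check on the hand calculation, here is a minimal Python sketch using the standard `statistics` module. It is an illustration added to the transcript, not the method used in class.

```python
from statistics import mean, median

# Test scores from the slide.
scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]

mean_score = mean(scores)      # sum of the scores divided by 10
median_score = median(scores)  # average of the 5th and 6th ordered scores

print("mean =", mean_score, "median =", median_score)
```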

Consider the following sample of test scores from one of Dr. L.’s recent classes (max score = 100): 65, 65, 70, 75, 78, 80, 83, 87, 91, 94 Draw a Stem and Leaf Plot (Shape, Center, Spread?) Find the Mean and the Median Let’s Use our TI-83 Calculators! –Enter data into a list via Stat|Edit –Stat|Calc|1-Var Stats What happens to the Mean and Median if the lowest score was 20 instead of 65? What happens to the Mean and Median if a low score of 20 is added to the data set (so we would now have 11 data points?) What can we say about the Mean versus the Median?
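
The questions on this slide can be explored with a short Python sketch. It is an addition to the transcript: the `stemplot` helper is a hypothetical function written for this illustration (stem unit 10, leaf unit 1), and the two modified data sets simply follow the "what if" questions above.

```python
from collections import defaultdict
from statistics import mean, median

scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]

def stemplot(values):
    """Return stemplot rows (stem unit = 10, leaf unit = 1) as strings."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    # Include empty stems so that gaps in the data stay visible.
    return [
        f"{stem} | {''.join(str(leaf) for leaf in leaves.get(stem, []))}"
        for stem in range(min(leaves), max(leaves) + 1)
    ]

lowest_replaced = [20] + scores[1:]  # lowest score is 20 instead of 65
low_score_added = scores + [20]      # a score of 20 is added (11 data points)

print("\n".join(stemplot(scores)))
for label, data in [
    ("original", scores),
    ("lowest replaced by 20", lowest_replaced),
    ("20 added", low_score_added),
]:
    print(f"{label}: mean = {mean(data):.1f}, median = {median(data)}")
```

Comparing the printed lines shows how much more the mean moves than the median when a single low score appears.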

Quartiles: Measures of Position

A Graphical Representation of Position of Data (It really gives us an indication of how the data is spread among its values!)

Using Measures of Position to Get Measures of Spread And what was the range again???

5 Number Summary, IQR, Box Plot, and where Outliers would be for Test Score Data: 65, 65, 70, 75, 78, 80, 83, 87, 91, 94 What do we notice about symmetry?
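
A sketch of the same summary in Python follows. It assumes the quartiles are taken as the medians of the lower and upper halves of the ordered data and that the outlier fences sit 1.5 × IQR beyond the quartiles; calculators and software packages may use slightly different quartile conventions, so treat the helper below as illustrative only.

```python
from statistics import median

scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]

def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles as medians of the two halves."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]
    upper = data[n // 2 + n % 2 :]  # skip the middle observation when n is odd
    return min(data), median(lower), median(data), median(upper), max(data)

summary = five_number_summary(scores)
q1, q3 = summary[1], summary[3]
iqr = q3 - q1

# Assumed 1.5 * IQR rule for flagging suspected outliers.
fences = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
outliers = [x for x in scores if x < fences[0] or x > fences[1]]

print("five-number summary:", summary)
print("IQR:", iqr, "fences:", fences, "outliers:", outliers)
```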

Histograms of Flower Lengths Problem 1.58 Generated via Minitab

Box Plot and 5-Number Summary for Flower Length Data. Generated via Box Plot Macro for Excel. [Table columns: Bihai, Red, Yellow; rows: Median, quartiles, Min or In Fence, Max or In Fence; values not recovered] Outliers?

Remember this histogram from the Service Call Length Data on page 9? How do you expect the Mean and Median to compare for this data? Mean 196.6, Median 103.5

Box Plot for Call Length Data

More on Measures of Spread: Data Range (Max – Min), IQR (75% quartile minus 25% quartile; the range of the middle 50% of the data), Standard Deviation (Variance) –Measures how the data deviate from the mean….hmm…how can we do this? Recall the Sample Test Score Data: 65, 65, 70, 75, 78, 80, 83, 87, 91, 94 Recall the Sample Mean (X bar) was 78.8…

Computing Variance and Std. Dev. by Hand and Via the TI-83: Recall the Sample Test Score Data: 65, 65, 70, 75, 78, 80, 83, 87, 91, 94 Recall the Sample Mean (X bar) was 78.8. What does the number 4.2 measure? How about -13.8?
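
The by-hand computation can be mirrored step by step in Python. This sketch is an addition to the transcript and uses the sample formulas (dividing by n - 1); the variable names are illustrative.

```python
from math import sqrt

scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]

n = len(scores)
xbar = sum(scores) / n

# Each deviation is one score's signed distance from the mean,
# e.g. 83 - xbar is about 4.2 and 65 - xbar is about -13.8.
deviations = [x - xbar for x in scores]

# Sample variance: sum of squared deviations divided by n - 1.
variance = sum(d ** 2 for d in deviations) / (n - 1)
s = sqrt(variance)

print(f"mean = {xbar:.1f}, variance = {variance:.1f}, std. dev. = {s:.1f}")
```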

Consider (again!) the following sample of test scores from one of Dr. L.’s recent classes (max score = 100): 65, 65, 70, 75, 78, 80, 83, 87, 91, 94 What happens to the standard deviation and the location of the 1 st and 3 rd quartiles if the lowest score was 20 instead of 65? What happens to the standard deviation and the location of the 1st and 3rd quartiles if a low score of 20 is added to the data set (so we would now have 11 data points?) What can we say about the effect of outliers on the standard deviation and the quartiles of a data set? Effects of Outliers on the Standard Deviation
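
One way to experiment with these questions is the Python sketch below. It is an illustration only: the `quartiles` helper is hypothetical and assumes Q1 and Q3 are the medians of the lower and upper halves of the ordered data, which may differ slightly from what a given calculator reports.

```python
from statistics import median, stdev

scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (assumed convention)."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]
    upper = data[n // 2 + n % 2 :]  # skip the middle observation when n is odd
    return median(lower), median(upper)

lowest_replaced = [20] + scores[1:]  # lowest score is 20 instead of 65
low_score_added = scores + [20]      # a score of 20 is added (11 data points)

for label, data in [
    ("original", scores),
    ("lowest replaced by 20", lowest_replaced),
    ("20 added", low_score_added),
]:
    q1, q3 = quartiles(data)
    print(f"{label}: s = {stdev(data):.1f}, Q1 = {q1}, Q3 = {q3}")
```

The standard deviation reacts strongly to the single low score, while the quartiles move little or not at all.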

Example 1.18: Stemplots of Annual Returns for Stocks (a) and Treasury bills (b) On page 53 of text. What are the stem and leaf units????

Consider (again!) the following sample of test scores from one of Dr. L.’s recent classes (max score = 100): 65, 65, 70, 75, 78, 80, 83, 87, 91, 94 Xbar=78.8 s=10.2 (rounded) Suppose we “curve” the grades by adding 5 points to every test score (i.e. Xnew=Xold+5). What will be new mean and standard deviation? Suppose we “curve” the grades by multiplying every test score times 1.5 (i.e. Xnew=1.5*Xold). What will be the new mean and standard deviation? Suppose we “curve” the grades by multiplying every test score times 1.5 and adding 5 points (i.e. Xnew=1.5*Xold+5). What will be the new mean and standard deviation? Effects of Linear Transformations on the Mean And Standard Deviation
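
The three "curves" can be tried numerically with the sketch below (an addition to the transcript; `transform` is a made-up helper name). For Xnew = a*Xold + b, the mean becomes a*mean + b, while the standard deviation becomes |a| times the old one and is unaffected by b.

```python
from statistics import mean, stdev

scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]

def transform(values, a, b):
    """Apply the linear transformation x_new = a * x_old + b to every value."""
    return [a * x + b for x in values]

curves = {
    "add 5": (1, 5),
    "times 1.5": (1.5, 0),
    "times 1.5 plus 5": (1.5, 5),
}

print(f"original: mean = {mean(scores):.1f}, s = {stdev(scores):.1f}")
for label, (a, b) in curves.items():
    curved = transform(scores, a, b)
    print(f"{label}: mean = {mean(curved):.1f}, s = {stdev(curved):.1f}")
```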

Box Plots for Problems

Section 1.3 Density Curves and Normal Distributions

Basic Ideas One way to think of a density curve is as a smooth approximation to the irregular bars of a histogram. It is an idealization that pictures the overall pattern of the data but ignores minor irregularities. Oftentimes we will use density curves to describe the distribution of a single quantitative continuous variable for a population (sometimes our curves will be based on a histogram generated via a sample from the population). –Heights of American Women –SAT Scores The bell-shaped normal curve will be our focus!

Shape? Center? Spread? Density Curve Page 64 Sample Size =105

Shape? Center? Spread? Density Curve Page 65 Sample Size=72 Guinea pigs

1. What proportion (or percent) of seventh graders from Gary, Indiana scored below 6? 2. What is the probability (i.e. how likely is it?) that a randomly chosen seventh grader from Gary, Indiana will have a test score less than 6? Two Different but Related Questions! Example 1.22 Page 66 Sample Size = 947

Relative “area under the curve” VERSUS Relative “proportion of data” in histogram bars. Page 67 of text

Shape? Center? Spread? The classic “bell shaped” Density curve.

A “skewed” density curve. Median separates area under curve into two equal areas (i.e. each has area ½) What is the geometric interpretation of the mean?

The mean as “center of mass” or “balance point” of the density curve

The normal density curve! Shape? Center? Spread? Area Under Curve? How does the magnitude of the standard deviation affect a density curve? How does the standard deviation affect the shape of the normal density curve? Assume Same Scale on Horizontal and Vertical (not drawn) Axes.

The distribution of heights of young women (X) aged 18 to 24 is approximately normal with mean mu=64.5 inches and standard deviation sigma=2.5 inches (i.e. X~N(64.5,2.5)). Let’s draw the density curve for X and observe the 68-95-99.7 rule (aka the “Empirical Rule”)!
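
The rule can be checked numerically for this height distribution. The sketch below is an addition to the transcript and uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); it computes the area within 1, 2 and 3 standard deviations of the mean.

```python
from statistics import NormalDist

# Heights of young women, X ~ N(64.5, 2.5), as given on the slide.
heights = NormalDist(mu=64.5, sigma=2.5)

areas = {}
for k in (1, 2, 3):
    low = heights.mean - k * heights.stdev
    high = heights.mean + k * heights.stdev
    # Area under the density curve between low and high.
    areas[k] = heights.cdf(high) - heights.cdf(low)
    print(f"within {k} sd ({low} to {high} inches): {areas[k]:.4f}")
```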

Example 1.23, page 72 How many standard deviations from the mean height is the height of a woman who is 68 inches? Who is 58 inches?
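
A z-score is just (x - mu) / sigma. The small Python sketch below (an addition, with an illustrative function name) applies it to the two heights in the question, using mu = 64.5 and sigma = 2.5 from the previous slide.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

z_68 = z_score(68, mu=64.5, sigma=2.5)
z_58 = z_score(58, mu=64.5, sigma=2.5)

print("68 inches: z =", z_68)
print("58 inches: z =", z_58)
```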

The Standard Normal Distribution (mu=0 and sigma=1) Horizontal axis in units of z-score! Notation: Z~N(0,1)

Let’s find some proportions (probabilities) using normal distributions! Example 1.25 (page 75) Example 1.26 (page 76) (slides follow) Let’s draw the distributions by hand first!

Example 1.25, page 75 TI-83 Calculator Command: Distr|normalcdf Syntax: normalcdf(left, right, mu, sigma) = area under curve from left to right mu defaults to 0, sigma defaults to 1 Infinity is 1E99 (use the EE key), Minus Infinity is -1E99

Example 1.26, page 76 Let’s find the same probabilities using z-scores! On the TI-83: normalcdf(720,820,1026,209)
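
For readers without a TI-83, the calculator command can be imitated in Python. The `normalcdf` function below is a hypothetical stand-in written for this illustration (it is not a library function); it is built on `statistics.NormalDist` and mirrors the calculator's argument order and defaults.

```python
from statistics import NormalDist

def normalcdf(left, right, mu=0.0, sigma=1.0):
    """Area under the N(mu, sigma) density curve between left and right."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(right) - dist.cdf(left)

# Direct computation, as in normalcdf(720,820,1026,209) on the calculator.
p_direct = normalcdf(720, 820, 1026, 209)

# Same probability via z-scores and the standard normal curve.
z_left = (720 - 1026) / 209
z_right = (820 - 1026) / 209
p_via_z = normalcdf(z_left, z_right)

# Using 1E99 as "infinity", as on the TI-83, covers the whole curve.
whole_curve = normalcdf(-1e99, 1e99)

print(f"direct: {p_direct:.4f}  via z-scores: {p_via_z:.4f}  whole curve: {whole_curve}")
```

Both routes give the same area, which is the point of standardizing.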

The Inverse Problem: Given a normal density proportion or probability, find the corresponding z-score! What is the z-score such that 90% of the data has a z-score less than that z-score? (1) Draw picture! (2) Understand what you are solving for! (3) Solve approximately! (we will also use the invNorm key on the next slide) Now try working Example 1.30 page 79! (slide follows)

TI-83: Use Distr|invNorm Syntax: invNorm(area,mu,sigma) gives value of x with area to left of x under normal curve with mean mu and standard deviation sigma. invNorm(0.9,505,110)=? invNorm(0.9)=? Page 79 How can we use our TI-83s to solve this??
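
The inverse lookup can also be sketched in Python. The `inv_norm` function below is a hypothetical helper that mimics the calculator's invNorm(area,mu,sigma) syntax using `NormalDist.inv_cdf` from the standard library; it is an illustration, not the calculator's own algorithm.

```python
from statistics import NormalDist

def inv_norm(area, mu=0.0, sigma=1.0):
    """Value x such that the area to the left of x under N(mu, sigma) equals area."""
    return NormalDist(mu, sigma).inv_cdf(area)

z_90 = inv_norm(0.9)            # z-score with 90% of the area to its left
x_90 = inv_norm(0.9, 505, 110)  # same percentile for mean 505 and sd 110

print(f"invNorm(0.9) = {z_90:.4f}")
print(f"invNorm(0.9, 505, 110) = {x_90:.1f}")
```

The second answer is simply 505 + 110 times the first, which ties the calculator result back to z-scores.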

How can we tell if our data is “approximately normal?” Box plots and histograms should show essentially symmetric, unimodal data. Normal Quantile plots are also used!
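
To show what goes into a normal quantile plot, here is a rough Python sketch. It is an assumption-laden illustration: it pairs each ordered observation with the standard normal quantile at the plotting position (i - 0.5)/n, which is one common choice but not necessarily the one used by the Excel macro or the textbook. It is applied to the test score data from Section 1.2 purely as an example.

```python
from statistics import NormalDist

scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]

def normal_quantile_points(values):
    """Pair each ordered observation with a standard normal quantile.

    Uses plotting positions (i - 0.5) / n for i = 1..n (an assumed convention).
    """
    data = sorted(values)
    n = len(data)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(data)]

points = normal_quantile_points(scores)
for z, x in points:
    print(f"z = {z:6.3f}   score = {x}")
```

If the data are approximately normal, plotting these (z, score) pairs gives points that lie close to a straight line.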

Histogram and Normal Quantile Plot for Breaking Strengths (in pounds) of Semiconductor Wires (Pages 19 and 81 of text)

Histogram and Normal Quantile Plot for Survival Time of Guinea Pigs (in days) in a Medical Experiment (Pages 38 (data table), 65 and 82 of text)

Using Excel to Generate Plots Example Problem 1.30 page 35 –Generate Histogram via Megastat –Get Numerical Summary of Data via Megastat or Data Analysis Addin –Generate Normal Quantile Plot via Macro (plot on next slide)

Normal Quantile plot for Problem 1.30 page 35

Extra Slides from Homework Problem 1.80 Problem 1.82 Problem Problem Problem Problem Problem Problem 1.135

Problem 1.80 page 84

Problem 1.83 page 85

Problem page 90

Problem page 90

Problem page 92

Problem page 92

Problem page 94

Problem page 95-96