Practice of Statistics


1 Practice of Statistics
Introduction to the Practice of Statistics Fourth Edition Chapter 1: Exploring Data

2 Resources/supplies: Our textbook is The Practice of Statistics (Starnes, Yates, Moore, 4th ed.). Pay careful attention to the examples, the calculator procedures, and the AP Exam tips that are located in the page margins. I will refer to this book as TPS4e throughout the year. This textbook is well-aligned to the AP Statistics curriculum and the sample problems and activities will prepare you well for the AP Statistics exam. Companion Web Site:

3 Resources/supplies: You will need a TI-84+ graphing calculator. (I have a class set to use in the classroom.) I will be demonstrating problems using the TI-84 all year, and tips on how to use this calculator are provided throughout the TPS4e textbook. The textbook also explains how to use the TI-89 as well as the TI-Nspire. You will receive a packet with instructions for the TI-84 graphing calculator. Keep it in your binder since you will refer to it throughout the year. I recommend a large 2½” binder since I will provide a large number of AP practice problems, handouts, and additional documents that you will find at my website and that you may find helpful to print.

4 Resources/supplies:
Vocab Flash Cards
Free Study Resources for AP Tests
Textbook Website
Free Response Questions
Online Writing Lab, Quick Writing Reference
Matching types of inference
Khan Academy

5 The AP Exam
I will be preparing you for the Advanced Placement Statistics Exam taking place on Thursday, May 12, 2016 at 12 noon. The exam is made up of 2 sections for a total of 3 hours.
Section I: 40 multiple-choice (MC) questions, 90 minutes, 50% of the exam score. No penalty for guessing.
Section II: 6 Free Response (FR) questions, 90 minutes, 50% of the exam score. Questions 1-5 take about 13 minutes each and together count for 75% of the Section II score. The last question is an “Investigative Task” that should take about 25 minutes and is worth 25% of the Section II score.

6 The AP Exam
You CAN be successful on this exam IF you put forth the effort ALL YEAR LONG. I will provide you with LOTS of preparation materials as well as insight from the grading of the exam. I need you to provide the effort...

7 Topic Outline: The topics for AP Statistics are divided into 4 major themes:
1. Exploratory Analysis ( %)
2. Planning and Conducting a Study (10-15%)
3. Probability (20-30%)
4. Statistical Inference (30-40%)

8 What is Statistics?
The Science of Learning from Data
The Collection and Analysis of Data
Descriptive Statistics (Data Exploration): Chapters 1, 2, 3
Experimental Design: Chapter 4
Probability: Chapters 5, 6, 7
Inferential Statistics: Chapters 8-12

9 Branches of Statistics:
Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions. There are two branches:
Inferential Statistics: involves using a sample to draw conclusions about a population. A basic tool in the study of inferential statistics is probability.
Descriptive Statistics: involves the organization, summarization, and display of data.

10 Chapter 1: Exploring Data
Introduction Data Analysis: Making Sense of Data The Practice of Statistics, 4th edition - For AP* STARNES, YATES, MOORE

11 Chapter 1 Exploring Data
Introduction: Data Analysis: Making Sense of Data 1.1 Analyzing Categorical Data 1.2 Displaying Quantitative Data with Graphs 1.3 Describing Quantitative Data with Numbers

12 Introduction Data Analysis: Making Sense of Data
Learning Objectives
After this section, you should be able to…
DEFINE “Individuals” and “Variables”
DISTINGUISH between “Categorical” and “Quantitative” variables
DEFINE “Distribution”
DESCRIBE the idea behind “Inference”

13 Statistics is the science of data.
Data Analysis is the process of organizing, displaying, summarizing, and asking questions about data.
Definitions:
Individuals: objects (people, animals, things) described by a set of data.
Variable: any characteristic of an individual.
Categorical Variable: places an individual into one of several groups or categories.
Quantitative Variable: takes numerical values for which it makes sense to find an average.

14 Dotplot of MPG Distribution
A variable generally takes on many different values. In data analysis, we are interested in how often a variable takes on each value.
Definition: Distribution: tells us what values a variable takes and how often it takes those values.
Example: dotplot of the MPG distribution (variable of interest: MPG).

15 How to Explore Data
Examine each variable by itself. Then study relationships among the variables.
Start with a graph or graphs.
Add numerical summaries.

16 Data Analysis: Population and Sample
A population is the collection of all outcomes, responses, measurements, or counts that are of interest. A sample is a subset of a population.
Collect data from a representative Sample... Make an Inference about the Population. Perform Data Analysis, keeping probability in mind…

17 Activity: Hiring Discrimination
Follow the directions on Page 5 Perform 5 repetitions of your simulation. Turn in your results to your teacher. Teacher: Right-click (control-click) on the graph to edit the counts. Data Analysis

18 Introduction Data Analysis: Making Sense of Data
Summary
In this section, we learned that…
A dataset contains information on individuals.
For each individual, data give values for one or more variables.
Variables can be categorical or quantitative.
The distribution of a variable describes what values it takes and how often it takes them.
Inference is the process of making a conclusion about a population based on a sample set of data.

19 Looking Ahead…
In the next Section… We’ll learn how to analyze categorical data: Bar Graphs, Pie Charts, Two-Way Tables, Conditional Distributions. We’ll also learn how to organize a statistical problem.

20 Practice:
CW #1: pg. 7, Exc. 2, 4, 6
HW #2: pg. 7, Exc. 1, 3, 5, 7, 8

21 Recall Our Earlier Question 1
1. What percent of the 60 randomly chosen fifth grade students have an IQ score of at least 120? Numerically? How to Represent Graphically? The grey shaded region corresponds to this share of the data. From the counts: (11+9+2)/60 = 0.367, or 36.7%. Summing the rounded bar percentages gives 18.3% + 15% + 3.3% = 36.6%; the small difference is rounding error.
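The arithmetic on this slide can be checked with a short sketch. The counts 11, 9, and 2 and the rounded percentages are taken from the slide; the variable names are only illustrative.

```python
# Counts for the three histogram bars at or above IQ 120, as given on the slide.
counts_at_least_120 = [11, 9, 2]
n_students = 60

# Exact proportion from the raw counts.
proportion = sum(counts_at_least_120) / n_students
print(f"{proportion:.3f}")  # 0.367
print(f"{proportion:.1%}")  # 36.7%

# Adding the already-rounded bar percentages loses a little precision.
rounded_parts = [18.3, 15.0, 3.3]
print(round(sum(rounded_parts), 1))  # 36.6
```

Working from the counts rather than from rounded percentages avoids the 36.6% versus 36.7% discrepancy.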

22 What is Different From the Histogram we Generated In Class??

23 Let’s Look at the Distribution we Just Created:
Overall Pattern: Shape (modes, tails (skewness), symmetry) Center (mean, median) Spread (range, IQR, standard deviation) Deviations: Outliers Descriptors we will be interested in for data and population distributions.

24 Overall Pattern: Shape, Center, Spread? Deviations: Outliers? Example 1.9 page 18-19

25 Data Analysis – An Interesting Example (Example 1.10, p. 9-10)
80 Calls

26 Overall Pattern: Shape, Center, Spread? Deviations: Outliers?

27

28 Time Plots – For Data Collected Over Time…
Example: Mississippi River Discharge p.19 (data p. 21)

29

30

31

32

33 Example – Dealing with Seasonal Variation

34

35

36 Extra Slides from Homework
Problem 1.19 Problem 1.20 Problem 1.21 Problem 1.31 Problem 1.36 Problem

37 Problem 1.19, page 30

38 Problem 1.20, page 31

39 Problem 1.21, page 31

40 Problem 1.31, page 36

41 Problem 1.36, page 38

42 Problems 1.37 – 1.39

43 Section 1.2 Describing Distributions with Numbers

44 Types of Measures Measures of Center: Mean, Median, Mode
Measures of Spread: Range (Max-Min), Standard Deviation, Quartiles, IQR

45 Means and Medians Consider the following sample of test scores from one of Dr. L.’s recent classes (max score = 100): 65, 65, 70, 75, 78, 80, 83, 87, 91, 94 What is the Average (or Mean) Test Score? What is the Median Test Score?
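As a check on the hand calculation, here is a minimal Python sketch using the standard library's `statistics` module. The slides themselves use a TI calculator, so Python is an assumption made only for illustration.

```python
from statistics import mean, median

# Test scores from the slide.
scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]

# Mean: sum of the scores divided by how many there are.
print(mean(scores))    # 78.8

# Median: with 10 sorted values, the average of the 5th and 6th (78 and 80).
print(median(scores))  # 79.0
```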

46

47

48 Draw a Stem and Leaf Plot (Shape, Center, Spread?)
Consider the following sample of test scores from one of Dr. L.’s recent classes (max score = 100): 65, 65, 70, 75, 78, 80, 83, 87, 91, 94 Draw a Stem and Leaf Plot (Shape, Center, Spread?) Find the Mean and the Median Let’s Use our TI-83 Calculators! Enter data into a list via Stat|Edit Stat|Calc|1-Var Stats What happens to the Mean and Median if the lowest score was 20 instead of 65? What happens to the Mean and Median if a low score of 20 is added to the data set (so we would now have 11 data points?) What can we say about the Mean versus the Median?
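The two "what happens if" questions can be explored with a small sketch. It is written in Python rather than on the TI-83, and the names `replaced` and `appended` are hypothetical labels for the two modified data sets.

```python
from statistics import mean, median

scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]

# Scenario 1: the lowest score is 20 instead of 65 (still 10 data points).
replaced = [20] + scores[1:]

# Scenario 2: a low score of 20 is added (now 11 data points).
appended = scores + [20]

for label, data in [("original", scores),
                    ("replaced", replaced),
                    ("appended", appended)]:
    print(label, round(mean(data), 2), median(data))
```

The mean drops from 78.8 to 74.3 and to about 73.45, while the median stays at 79 in the first scenario and moves only to 78 in the second. This is the sense in which the median is resistant to extreme values and the mean is not.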

49 Quartiles: Measures of Position

50

51 A Graphical Representation of Position of Data
(It really gives us an indication of how the data is spread among its values!)

52 Using Measures of Position to Get Measures of Spread
And what was the range again???

53

54 5 Number Summary, IQR, Box Plot, and where Outliers would be for Test Score Data:
65, 65, 70, 75, 78, 80, 83, 87, 91, 94 What do we notice about symmetry?
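A sketch of the five-number summary for these scores follows. It assumes the textbook-style convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data (excluding the overall median when the count is odd) and the usual 1.5 × IQR rule for flagging outliers; software packages sometimes use slightly different quartile rules. The function name `five_number_summary` is hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # For an odd count, leave the middle value out of both halves.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]
minimum, q1, med, q3, maximum = five_number_summary(scores)

iqr = q3 - q1
low_fence = q1 - 1.5 * iqr    # values below this would be flagged as outliers
high_fence = q3 + 1.5 * iqr   # values above this would be flagged as outliers
outliers = [x for x in scores if x < low_fence or x > high_fence]

print(minimum, q1, med, q3, maximum)
print(iqr, low_fence, high_fence, outliers)
```

Under these assumptions the summary is 65, 70, 79, 87, 94 with IQR = 17, so the fences sit at 44.5 and 112.5 and no score is flagged as an outlier.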

55 Histograms of Flower Lengths Problem 1.58 Generated via Minitab

56 Box Plot and 5-Number Summary for Flower Length Data Generated via Box Plot Macro for Excel
                   Bihai    Red     Yellow
Median             47.12    39.16   36.11
Q1                 46.71    38.07   35.45
Min or In Fence    46.34    37.4    34.57
Max or In Fence    50.26    43.09   38.13
Q3                 48.245   41.69   36.82
Outliers?

57 Remember this histogram from the Service Call Length Data on page 9?
How do you expect the Mean and Median to compare for this data? Mean 196.6, Median 103.5

58 Box Plot for Call Length Data

59 More on Measures of Spread
Data Range (Max – Min)
IQR (75% quartile minus 25% quartile; the range of the middle 50% of the data)
Standard Deviation (Variance): measures how the data deviates from the mean… hmm… how can we do this?
Recall the Sample Test Score Data: 65, 65, 70, 75, 78, 80, 83, 87, 91, 94. Recall the Sample Mean (X bar) was 78.8…

60 Computing Variance and Std. Dev. by Hand and Via the TI83:
Recall the Sample Test Score Data: 65, 65, 70, 75, 78, 80, 83, 87, 91, 94. Recall the Sample Mean (X bar) was 78.8.
[Figure: number line from 65 to 95 marking the mean 78.8 and the scores 65 and 83, with their deviations from the mean, -13.8 and 4.2.]
What does the number 4.2 measure? How about -13.8?
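The by-hand computation can be sketched step by step. This uses the sample formulas (dividing by n − 1), which is what the rounded value s = 10.2 quoted elsewhere in the slides corresponds to; the comparison with `statistics.stdev` is only a cross-check.

```python
from math import sqrt
from statistics import stdev

scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]
n = len(scores)
xbar = sum(scores) / n                      # sample mean, 78.8

# Deviation of each score from the mean, e.g. 83 - 78.8 = 4.2 and 65 - 78.8 = -13.8.
deviations = [x - xbar for x in scores]

# Sample variance: sum of squared deviations divided by n - 1.
variance = sum(d ** 2 for d in deviations) / (n - 1)

# Sample standard deviation: square root of the variance.
s = sqrt(variance)

print(round(variance, 1), round(s, 1))      # 104.4 10.2
print(round(stdev(scores), 1))              # 10.2, library cross-check
```

The deviations always sum to zero, which is why they are squared before averaging.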

61

62 Effects of Outliers on the Standard Deviation
Consider (again!) the following sample of test scores from one of Dr. L.’s recent classes (max score = 100): 65, 65, 70, 75, 78, 80, 83, 87, 91, 94 What happens to the standard deviation and the location of the 1st and 3rd quartiles if the lowest score was 20 instead of 65? What happens to the standard deviation and the location of the 1st and 3rd quartiles if a low score of 20 is added to the data set (so we would now have 11 data points?) What can we say about the effect of outliers on the standard deviation and the quartiles of a data set?
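A sketch for experimenting with these questions is below. The quartile rule is an assumption (medians of the lower and upper halves of the sorted data, excluding the overall median for an odd count), and `quartiles`, `replaced`, and `appended` are hypothetical names.

```python
from statistics import median, stdev

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]
    upper = xs[half + 1:] if len(xs) % 2 else xs[half:]
    return median(lower), median(upper)

scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]
replaced = [20] + scores[1:]   # lowest score is 20 instead of 65
appended = scores + [20]       # a score of 20 is added (11 data points)

for label, data in [("original", scores),
                    ("replaced", replaced),
                    ("appended", appended)]:
    q1, q3 = quartiles(data)
    print(label, round(stdev(data), 1), q1, q3)
```

Under these assumptions the standard deviation roughly doubles in both scenarios (from about 10.2 to about 21.1 and 20.2), while the quartiles barely move: Q1 stays at 70 or shifts to 65, and Q3 stays at 87.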

63

64

65 Stemplots of Annual Returns for Stocks (a) and Treasury bills (b)
Example 1.18: Stemplots of Annual Returns for Stocks (a) and Treasury bills (b) On page 53 of text. What are the stem and leaf units????

66

67

68 Effects of Linear Transformations on the Mean And Standard Deviation
Consider (again!) the following sample of test scores from one of Dr. L.’s recent classes (max score = 100): 65, 65, 70, 75, 78, 80, 83, 87, 91, 94. Xbar=78.8, s=10.2 (rounded).
Suppose we “curve” the grades by adding 5 points to every test score (i.e. Xnew=Xold+5). What will be the new mean and standard deviation?
Suppose we “curve” the grades by multiplying every test score by 1.5 (i.e. Xnew=1.5*Xold). What will be the new mean and standard deviation?
Suppose we “curve” the grades by multiplying every test score by 1.5 and adding 5 points (i.e. Xnew=1.5*Xold+5). What will be the new mean and standard deviation?
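The three curving schemes can be tried numerically with the sketch below. The helper `transform` is a hypothetical name; the general fact it illustrates is that for Xnew = a*Xold + b the mean becomes a*Xbar + b while the standard deviation becomes |a|*s, so adding a constant does not change the spread.

```python
from statistics import mean, stdev

scores = [65, 65, 70, 75, 78, 80, 83, 87, 91, 94]

def transform(data, a=1.0, b=0.0):
    """Apply the linear transformation x_new = a * x_old + b to every value."""
    return [a * x + b for x in data]

xbar, s = mean(scores), stdev(scores)

for a, b in [(1, 5), (1.5, 0), (1.5, 5)]:
    new = transform(scores, a, b)
    # Compare the directly computed values with the shortcut formulas.
    print(a, b,
          round(mean(new), 1), round(a * xbar + b, 1),
          round(stdev(new), 1), round(abs(a) * s, 1))
```

The three cases give means of 83.8, 118.2, and 123.2 and standard deviations of about 10.2, 15.3, and 15.3.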

69 Box Plots for Problems 1.62-1.64

70 Section 1.3 Density Curves and Normal Distributions

71 Basic Ideas
One way to think of a density curve is as a smooth approximation to the irregular bars of a histogram. It is an idealization that pictures the overall pattern of the data but ignores minor irregularities. Oftentimes we will use density curves to describe the distribution of a single quantitative continuous variable for a population (sometimes our curves will be based on a histogram generated via a sample from the population).
Examples: Heights of American Women, SAT Scores
The bell-shaped normal curve will be our focus!

72 Density Curve Shape? Center? Spread? Page 64 Sample Size =105

73 Page 65 Density Curve Shape? Center? Spread? Sample Size=72 Guinea pigs

74 Two Different but Related Questions!
1. What proportion (or percent) of seventh graders from Gary, Indiana scored below 6?
2. What is the probability (i.e. how likely is it?) that a randomly chosen seventh grader from Gary, Indiana will have a test score less than 6?
Example 1.22, Page 66. Sample Size = 947

75 Relative “area under the curve” VERSUS
Relative “proportion of data” in histogram bars. Page 67 of text

76

77 The classic “bell shaped”
Density curve. Shape? Center? Spread?

78 Median separates area under curve into two equal areas
(i.e. each has area ½) A “skewed” density curve. What is the geometric interpretation of the mean?

79 The mean as “center of mass” or “balance point” of the density curve

80

81 The normal density curve!
Assume Same Scale on Horizontal and Vertical (not drawn) Axes. Shape? Center? Spread? Area Under Curve? How does the standard deviation affect the shape of the normal density curve? How does the magnitude of the standard deviation affect a density curve?

82 The 68-95-99.7 Rule (aka the “Empirical Rule”)
The distribution of heights of young women (X) aged 18 to 24 is approximately normal with mean mu=64.5 inches and standard deviation sigma=2.5 inches (i.e. X~N(64.5,2.5)). Let’s draw the density curve for X and observe the empirical rule!
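As a numerical companion to the sketch of the curve, the areas within 1, 2, and 3 standard deviations of the mean can be computed with Python's `statistics.NormalDist`. Python is an assumption here, since the slides use a TI calculator.

```python
from statistics import NormalDist

# Heights of young women: X ~ N(64.5, 2.5), in inches.
heights = NormalDist(mu=64.5, sigma=2.5)

for k in (1, 2, 3):
    low = heights.mean - k * heights.stdev
    high = heights.mean + k * heights.stdev
    # Area under the density curve between low and high.
    area = heights.cdf(high) - heights.cdf(low)
    print(f"within {k} SD: {low} to {high} inches, area = {area:.4f}")
```

The areas come out to about 0.6827, 0.9545, and 0.9973, which is the 68-95-99.7 pattern. In inches, the three intervals are 62 to 67, 59.5 to 69.5, and 57 to 72.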

83 How many standard deviations from the mean height is the height of a woman who is 68 inches? Who is 58 inches? Example 1.23, page 72
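Both questions are z-score calculations, z = (x − mu) / sigma, using the N(64.5, 2.5) model for women's heights given in the slides. A minimal sketch, where the function name `z_score` is illustrative:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

print(z_score(68, 64.5, 2.5))  # 1.4
print(z_score(58, 64.5, 2.5))  # -2.6
```

A woman who is 68 inches tall is 1.4 standard deviations above the mean height; one who is 58 inches tall is 2.6 standard deviations below it.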

84

85 The Standard Normal Distribution
(mu=0 and sigma=1) Notation: Z~N(0,1) Horizontal axis in units of z-score!

86

87 Let’s find some proportions (probabilities) using normal distributions!
Example 1.25 (page 75), Example 1.26 (page 76) (slides follow). Let’s draw the distributions by hand first!

88 Example 1.25, page 75 TI-83 Calculator Command: Distr|normalcdf
Syntax: normalcdf(left, right, mu, sigma) = area under the curve from left to right.
mu defaults to 0, sigma defaults to 1.
Infinity is 1E99 (use the EE key); Minus Infinity is -1E99.

89 Example 1.26, page 76 On the TI-83: normalcdf(720,820,1026,209) Let’s find the same probabilities using z-scores!
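For readers without a TI-83, the same calculation can be sketched in Python. The function below is a hypothetical stand-in that mimics the calculator's normalcdf(left, right, mu, sigma) syntax using `statistics.NormalDist`; it is not the calculator's own implementation.

```python
from statistics import NormalDist

def normalcdf(left, right, mu=0.0, sigma=1.0):
    """Area under the N(mu, sigma) density curve between left and right."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(right) - dist.cdf(left)

# Direct calculation, mirroring normalcdf(720,820,1026,209).
p_direct = normalcdf(720, 820, 1026, 209)

# Same probability via z-scores and the standard normal curve.
z_left = (720 - 1026) / 209
z_right = (820 - 1026) / 209
p_z = normalcdf(z_left, z_right)

print(round(p_direct, 4), round(p_z, 4))

# As on the calculator, -1E99 can stand in for minus infinity.
print(round(normalcdf(-1e99, 1.0), 4))
```

Both routes give the same area, roughly 0.09. Answers read from a printed table of z-scores can differ slightly because the z-scores are rounded to two decimals first.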

90 The Inverse Problem
Given a normal density proportion or probability, find the corresponding z-score! What is the z-score such that 90% of the data has a z-score less than that z-score? Draw picture! Understand what you are solving for! Solve approximately! (we will also use the invNorm key on the next slide) Now try working Example 1.30 page 79! (slide follows)

91 TI-83: Use Distr|invNorm
Syntax: invNorm(area, mu, sigma) gives the value of x with the given area to the left of x under the normal curve with mean mu and standard deviation sigma. How can we use our TI-83s to solve this?? invNorm(0.9,505,110)=? invNorm(0.9)=? Page 79
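The inverse calculation can also be sketched in Python. The function `inv_norm` below is a hypothetical analogue of the calculator's invNorm built on `statistics.NormalDist.inv_cdf`; the defaults mu = 0 and sigma = 1 are chosen to match the calculator syntax.

```python
from statistics import NormalDist

def inv_norm(area, mu=0.0, sigma=1.0):
    """Value x such that the area to the left of x under N(mu, sigma) equals area."""
    return NormalDist(mu, sigma).inv_cdf(area)

z90 = inv_norm(0.9)             # z-score with 90% of the area to its left
x90 = inv_norm(0.9, 505, 110)   # the same percentile on the N(505, 110) scale

print(round(z90, 3), round(x90, 1))
```

The z-score is about 1.28, and the corresponding value on the N(505, 110) scale is about 646, which agrees with un-standardizing by hand: x = mu + z * sigma.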

92 How can we tell if our data is “approximately normal?”
Box plots and histograms should show essentially symmetric, unimodal data. Normal Quantile plots are also used!

93 Histogram and Normal Quantile Plot for Breaking Strengths (in pounds) of Semiconductor Wires
(Pages 19 and 81 of text)

94 Histogram and Normal Quantile Plot for Survival Time of Guinea Pigs (in days) in a Medical Experiment
(Pages 38 (data table), 65 and 82 of text)

95 Using Excel to Generate Plots
Example Problem 1.30 page 35 Generate Histogram via Megastat Get Numerical Summary of Data via Megastat or Data Analysis Addin Generate Normal Quantile Plot via Macro (plot on next slide)

96 Normal Quantile plot for Problem 1.30 page 35

97 Extra Slides from Homework
Problem 1.80 Problem 1.82 Problem Problem Problem Problem Problem Problem 1.135

98 Problem 1.80 page 84

99 Problem 1.83 page 85

100 Problem page 90

101 Problem page 90

102 Problem page 92

103 Problem page 92

104 Problem page 94

105 Problem page 95-96

