Week 3 Chapters 5, 7, 12.

Presentation transcript:

Week 3 Chapters 5, 7, 12

Chapter 5: Outliers, Fences, Box plots

Outliers (p.95) An outlier is a value that is located very far away from almost all of the other values; that is, an observation that is unusually large or small relative to the other values in a data set. Outliers occur in two ways: 1. The value was observed, recorded, or entered into the computer incorrectly. 2. The value is correct and represents a rare event.

Detecting Outliers
Determine the fences. Fences are cutoff points for outliers.
Lower Fence = Q1 - 1.5(IQR)
Upper Fence = Q3 + 1.5(IQR)
Here IQR = Q3 - Q1. If a data value is less than the lower fence or greater than the upper fence, it is considered an outlier.
Example: The following data represent income (in thousands of dollars) for a sample of 12 students from Cornell University, 5 years after graduation.
35 29 44 72 34 64 41 50 54 104 39 58

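Cont. Example
Sorted data: 29 34 35 39 41 44 50 54 58 64 72 104
Min = 29, Max = 104
Q1 = (35+39)/2 = 37
Median = (44+50)/2 = 47
Q3 = (58+64)/2 = 61
IQR = Q3 - Q1 = 61 - 37 = 24
Lower Fence = Q1 - 1.5 x IQR = 37 - 1.5(24) = 1
Upper Fence = Q3 + 1.5 x IQR = 61 + 1.5(24) = 97
Outlier: 104 (greater than the upper fence)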

Cont. Example [Box plot of the data on an axis from 20 to 120, with LF, Q1, M, Q3, and UF marked. The box runs from Q1 = 37 to Q3 = 61, with a line at the median, 47. The "whiskers" represent the lowest and highest values within LF and UF. Any outliers (here, 104) are drawn as separate points.]
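The fence calculation in this example can be sketched in a few lines of Python. This is only an illustration: it assumes the median-of-halves rule for quartiles (which reproduces Q1 = 37 and Q3 = 61 here; other software may use a different quartile rule), and the function names are made up for this sketch.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule (an assumption)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # values below the middle position
    upper = data[half + n % 2:]   # skips the middle value when n is odd
    return median(lower), median(data), median(upper)

def fences(values):
    """Return (lower fence, upper fence) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Income data (thousands of dollars) from the slide
incomes = [35, 29, 44, 72, 34, 64, 41, 50, 54, 104, 39, 58]

q1, med, q3 = quartiles(incomes)
low, high = fences(incomes)
outliers = [x for x in incomes if x < low or x > high]

# Whiskers end at the lowest and highest values that lie within the fences
inside = [x for x in incomes if low <= x <= high]
whiskers = (min(inside), max(inside))

print("Q1, median, Q3:", q1, med, q3)
print("fences:", low, high)
print("outliers:", outliers)
print("whisker ends:", whiskers)
```

Note that the upper whisker stops at 72, the largest value inside the fences, not at the maximum of 104.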

Box Plots and Skewness (p.91)
NOTE: Statisticians often draw box plots vertically (this may happen on your test). Remember: Q1 is at the bottom, Q3 is at the top.
Which of the vertical box plots is skewed right? #3
Which of the vertical box plots has the highest median? #1
Which of the vertical box plots has the biggest range? All the same
Which of the vertical box plots has the lowest IQR? #2

Chapter 7: Scatterplots, Association, and Correlation

Scatter plot A scatter plot is the most common display for comparing two quantitative variables. Just by looking at one, you can see patterns, trends, and relationships.

Direction of the relationship A pattern that runs from the upper left to the lower right is said to be negative. A pattern running the other way (from the lower left to the upper right) is called positive.

Strength of relationship Strength: how much scatter there is around the overall pattern. A lot of scatter indicates a weak relationship; very little scatter indicates a strong relationship.

Correlation coefficient The correlation coefficient (r) is a measure of the linear relationship between two quantitative variables. It measures the strength and direction of the association. The correlation coefficient varies from -1 to +1. A value of -1 indicates perfect negative correlation, and +1 indicates perfect positive correlation.
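The slides do not give a formula for r, so the sketch below uses the standard Pearson formula (the sum of the products of deviations divided by the square root of the product of the sums of squared deviations). The data sets are made up purely to show the two extreme cases and a scattered one.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)

# Made-up data for illustration only
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # points fall exactly on a rising line
falling = [10, 8, 6, 4, 2]   # points fall exactly on a falling line
scattered = [3, 1, 4, 2, 5]  # rising overall, but with scatter

r_rising = correlation(x, rising)
r_falling = correlation(x, falling)
r_scattered = correlation(x, scattered)
print(r_rising, r_falling, r_scattered)
```

The first two values come out at the extremes of the range, while the scattered data gives a positive value strictly between 0 and 1.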

Chapter 12: Sample Surveys

Sample VS Population “We’d like to know about an entire population of individuals, but examining all of them is usually impractical, if not impossible. So we settle for examining a smaller group of individuals—a sample—selected from the population” --Page 303 & 304 Sample Survey: “…ask questions of a small group of people in the hope of learning something about the entire population.” Example: You’re bringing pizza to a party, and you have two options – Pizza Hut or Papa Johns. Instead of calling the 100 friends that might show up, you decide to call 5 and ask for their preference.

Sample VS Population, cont. Biased Survey: “Sampling methods that tend to over- or underemphasize some characteristics of the population.” Example: If you want to know the proportion of Americans that consider themselves Republican, it would be a bad idea to survey people in Utah alone. Recent polls show that Utah is the most Republican state in the country.

Sample VS Population, cont. Randomizing: “[Protects us from bias, by] making sure that, on average, the sample looks like the rest of the population.” Example: It’s final exam week at Cornell University, and 1,200 calculus students are taking a standardized test in the library at 4pm. You want to sample 30 students to find out if the test was easier or harder than expected. Which is the better idea (with respect to eliminating bias): sampling the first 30 students to finish, or randomly choosing the students for the survey? Why?

4 Types of Random Samples
Type 1: Simple Random Sample (SRS)
When choosing a sample of size n from a given population, the sample is called a simple random sample if every possible sample of size n has an equal chance of being selected.
In an SRS, every person is equally likely to be chosen. However, a sample chosen so that everyone is equally likely to be selected is not necessarily an SRS.
Example 1: Suppose you want to select a sample of 100 students from a school where there are 100 males and 100 females. Choose your sample this way: flip a coin; if heads, choose all the males; if tails, choose all the females. Every person has an equal chance of being selected (50%), BUT THIS IS NOT SRS!!! Why?
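One way to explore Example 1 is to simulate the coin-flip design and look at which samples it can actually produce. The sketch below is only an illustration: the student labels, the seed, and the number of trials are arbitrary choices made for this example.

```python
import random

males = [f"M{i}" for i in range(100)]
females = [f"F{i}" for i in range(100)]

def coin_flip_sample(rng):
    """Heads: take all the males. Tails: take all the females."""
    return males if rng.random() < 0.5 else females

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 10_000
samples = [coin_flip_sample(rng) for _ in range(trials)]

# How often does one particular student (M0) end up in the sample?
share_with_m0 = sum("M0" in s for s in samples) / trials

# How many different samples did the design ever produce?
distinct_samples = {frozenset(s) for s in samples}

# Did any sample ever contain both a male and a female student?
ever_mixed = any("M0" in s and "F0" in s for s in samples)

print("share of samples containing M0:", share_with_m0)
print("number of distinct samples:", len(distinct_samples))
print("any mixed sample:", ever_mixed)
```

Each student shows up about half the time, yet only two of the many possible samples of size 100 can ever occur, and a sample mixing males and females never does.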

4 Types of Random Samples Type 1: Simple Random Sample (SRS), cont. Example 2: A better way to sample 100 students from a school with 200 students: Use Minitab to assign each student a unique number from 1 to 200 in random order, then choose the students numbered 1 through 100 to be in your sample.
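The slide uses Minitab; the same idea can be sketched in Python as a stand-in. Shuffling the roster plays the role of assigning the random numbers, and taking the first 100 names plays the role of picking students 1 through 100. The roster names and the seed are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Shuffle a copy of the population and keep the first n individuals."""
    rng = random.Random(seed)
    shuffled = list(population)   # copy, so the original roster is untouched
    rng.shuffle(shuffled)         # random order = random numbering 1..N
    return shuffled[:n]

# Hypothetical roster of 200 students
students = [f"student_{i}" for i in range(1, 201)]
sample = simple_random_sample(students, 100, seed=1)

print(len(sample), "students selected, for example:", sample[:5])
```

Python's `random.sample(students, 100)` does the same job in one call; the shuffle version is shown because it mirrors the steps on the slide.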

4 Types of Random Samples
Type 2: Stratified Sample
“First [slice the population] into homogeneous groups, called strata, before the sample is selected. Then SRS is used within each stratum; combine the selections from each stratum into one large sample.” -- page 310
Example from book: page 310, football example
Idea: Suppose we want to know how people at Akron University feel about using funds to support the football team, and suppose men and women feel differently about using funds in this way. If the school is 60% men and 40% women, we want our sample to reflect this. So, if we are going to sample 100 people, we should separate the men and women, then randomly sample exactly 60 men (60% of 100) and 40 women (40% of 100).
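A minimal sketch of the football example in Python: take an SRS inside each stratum, sized in proportion to that stratum's share of the population. The campus of 600 men and 400 women is a made-up size chosen only to match the 60%/40% split, and the function name is hypothetical.

```python
import random

def stratified_sample(strata, total, seed=None):
    """Take an SRS within each stratum, proportional to the stratum's size."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding may need adjusting for other splits
        k = round(total * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical campus: 60% men, 40% women
campus = {
    "men": [f"man_{i}" for i in range(600)],
    "women": [f"woman_{i}" for i in range(400)],
}
picked = stratified_sample(campus, 100, seed=2)

print({name: len(people) for name, people in picked.items()})
```

With these numbers the sample always contains exactly 60 men and 40 women; only the particular individuals vary from run to run.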

4 Types of Random Samples
Type 3: Cluster Sample
“Splitting the population into representative clusters can make sampling more practical. Then we could simply select one or a few clusters at random and [include all observations in these clusters in our sample].”
Example from book: page 311, Sentence Length
Idea: Suppose we want to know the length of the average sentence in the textbook. SRS is complicated in this case, because we would have to number every sentence in the book individually. However, if we believe each page is “representative” of the entire book, then we can just choose a few pages at random and combine the sentences found on these pages as one sample. Each page is a “cluster” of “representative” sentences.

4 Types of Random Samples Type 3: Cluster Sample, cont.
IN CLUSTER SAMPLE: divide the population into representative “clusters”; main goal: makes sampling easier.
IN STRATIFIED SAMPLE: divide the population into nonrepresentative (homogeneous) “strata”; main goal: gives a more representative sample.
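The sentence-length idea can be sketched as follows: pick a few pages at random and keep every sentence on them. The "book" below is randomly generated stand-in data (page number mapped to a list of sentence lengths in words), so the resulting average means nothing beyond illustrating the mechanics.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every observation in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    observations = [obs for key in chosen for obs in clusters[key]]
    return chosen, observations

# Hypothetical book: 300 pages, each with 8 to 15 sentences of 5 to 30 words
data_rng = random.Random(3)
book = {
    page: [data_rng.randint(5, 30) for _ in range(data_rng.randint(8, 15))]
    for page in range(1, 301)
}

pages, lengths = cluster_sample(book, 5, seed=4)
mean_length = sum(lengths) / len(lengths)

print("pages sampled:", pages)
print("sentences in sample:", len(lengths))
print("average sentence length:", round(mean_length, 1))
```

Only the pages are chosen at random; once a page is chosen, all of its sentences go into the sample, which is what distinguishes this from the stratified design.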

4 Types of Random Samples Type 4: Systematic Sample Some samples select individuals systematically; for example, choose every 10th person in an alphabetical list. This is not SRS (why?), but it is still representative as long as the order of the list is not related to the variable(s) you’re measuring.
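A small sketch of "every 10th person" in Python. The random starting position is an assumption added here (the slide does not say where to start); it is what gives each person the same chance of selection. The roster is hypothetical.

```python
import random

def systematic_sample(ordered_list, step, seed=None):
    """Start at a random position among the first `step` entries (assumed),
    then take every `step`-th individual after that."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return ordered_list[start::step]

# Hypothetical alphabetical roster of 200 people
roster = [f"person_{i:03d}" for i in range(200)]
chosen = systematic_sample(roster, 10, seed=5)

print(len(chosen), "people selected, starting with", chosen[0])
```

Two people who are next to each other on the list can never both be selected, which is one way to see that not every sample of 20 is possible under this design.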