Statistics Chapter 1 Introduction to Statistics


Keep this in mind: Statistics has very logical answers. Statistics can be linked with psychology and sociology. Keep an open mind; there are always two sides to a coin (positive and negative).

Quick Talk “1 out of 3 people cheat in a relationship” Discuss this statement What does it mean? Do you believe it? Why or why not?

Based on your discussion, why do you think it’s important to know about statistics?

Potential answers: Know how authentic the statement is. Don’t get “cheated” or “tricked” by misleading claims. Know where the data comes from. Know how effective something is.

What is statistics? Statistics: the study of how to collect, organize, analyze, and interpret numerical information from data

So then, what is needed? Individual: people or objects included in the study Variable: characteristic of the individual to be measured or observed

Quick Talk Think about dating. What “variables” do people look for when finding a boyfriend or girlfriend? List them

Variables come in two types:
Quantitative variable: has a value or numerical measurement for which operations such as addition or averaging make sense (usually has numbers)
Qualitative variable: describes an individual by placing the individual into a category or group, such as male or female

Based on your list, identify each variable as quantitative or qualitative.

Sample Answers Age (quantitative) Weight (quantitative) Height (quantitative) Race (qualitative) Income (quantitative) Looks (qualitative) Body type (qualitative) Personality (qualitative)

Data: Population data: data from “every” individual of interest. Sample data: data from “only some” of the individuals of interest.

Quick Talk: Compare the definitions. Which type of data are you more likely to be able to collect? Why?

Parameter: a numerical measure that describes an aspect of a population Statistic: numerical measure that describes an aspect of a sample

Easy way to remember Population with parameter Sample with statistic
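A small Python sketch can make the pairing concrete. The population below is made-up data (1,000 random ages), and all names are chosen just for this illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population: the ages of 1,000 individuals (made-up data).
population_ages = [random.randint(18, 80) for _ in range(1000)]

# Parameter: a numerical measure computed from the whole population.
parameter_mean = statistics.mean(population_ages)

# Statistic: the same measure computed from a sample of 50 individuals.
sample_ages = random.sample(population_ages, 50)
statistic_mean = statistics.mean(sample_ages)

print(f"Population mean age (parameter): {parameter_mean:.2f}")
print(f"Sample mean age (statistic):     {statistic_mean:.2f}")
```

The two numbers are usually close but not equal, because the sample contains only some of the individuals.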

Group work: Example #1 A car dealer wants to know what type of car people drive in the desert. He sent out 5000 surveys to randomly chosen people living in the desert. A) Identify the individuals of the study and the variable. B) Do the data comprise a sample? If so, what is the underlying population? C) Is the variable qualitative or quantitative? D) Identify a quantitative variable that might be of interest. E) Is a numerical measure computed from the random sample a statistic or a parameter?

Answer A) Individuals: people living in the desert. Variable: type of car. B) The data comprise a sample of the population of all people living in the desert. C) Qualitative. D) Income, age. E) Statistic: it is computed from sample data.

Example #2 Television station QUE wants to know the proportion of TV owners in Virginia who watch the station’s new program at least once a week. The station asked a group of 1000 TV owners in Virginia if they watch the program at least once a week. A) Identify the individuals of the study and the variable. B) Do the data comprise a sample? If so, what is the underlying population? C) Is the variable qualitative or quantitative? D) Identify a quantitative variable that might be of interest. E) Is the proportion of viewers in the sample a statistic or a parameter?

Levels of Measurement
Nominal level of measurement: applies to data that consist of names, labels, or categories. There are no implied criteria by which the data can be ordered from smallest to largest.
Ordinal level of measurement: applies to data that can be arranged in order. However, differences between data values either cannot be determined or are meaningless.
Interval level of measurement: applies to data that can be arranged in order. In addition, differences between data values are meaningful.
Ratio level of measurement: applies to data that can be arranged in order. In addition, both differences between data values and ratios of data values are meaningful. Data at the ratio level have a true zero (a value of zero means none of the quantity is present).

Example: Identify the level of measurement. A) Taos, Acoma, Zuni, and Cochiti are names of four Native American pueblos from the population of names of all Native American pueblos in Arizona and New Mexico. B) In a high school graduating class of 600 students, Jeff ranked 1st, Melissa ranked 38th, Patrick ranked 150th, and Ashley ranked 3rd, where 1 is the highest rank. C) Body temperatures of trout in the Yellowstone River. D) Lengths of sharks swimming in the Pacific Ocean.

Answer A) nominal B) ordinal C) interval D)ratio

Example #2 Name the levels of measurement. A) My name is Mr. Liu. B) I am 28 years old. C) High school 1999-2003, College 2003-2007, Masters 2007-2008. D) I make $35,000 after tax. E) I ranked 100th in high school, 58th in college, 27th in Masters. F) Some of my friends’ names are Michael, Katherine, Patrick, Ashley, Sarah, Mya, Chris. G) I am 5’8”.

Answers A) Nominal B) Ratio C) Interval D) Ratio E) Ordinal F) Nominal G) Ratio

Homework Practice Pg 10-11 #1-13 odd

Quick Talk: Mr. Liu looked at the grades of the first 15 male students (which average to a C) and concluded that all the students in the school must have a C average. Discuss why this conclusion might not be correct. What is wrong with this study?

Things to remember: If there is a study or data collection, it cannot be BIASED in any way. You need a decent sample size and fair randomness. Fair = equal chance.

1st type of data collection: Simple random sample. A simple random sample of n measurements from a population is a subset of the population selected in such a manner that every sample of size n from the population has an equal chance of being selected. In particular, every individual has the same chance of being selected.

Simple random sample example: Suppose I assign a number to each of the 40 students here and randomly choose 5 numbers. Would the number 7 be as likely to be selected as the number 37? Could all 5 numbers be odd? Could the sample ever be 27, 28, 29, 30, 31?

How to Draw a Random Sample 1) Number all members of the population sequentially 2) Use a table, calculator, or computer to select random numbers from the numbers assigned to the population members 3) Create the sample by using population members with numbers corresponding to those randomly selected
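The three steps above can be sketched in Python. This is only an illustration using the 40-student classroom from the previous slide; the variable names are made up for this sketch.

```python
import random

# Step 1: number all members of the population sequentially (40 students).
student_numbers = list(range(1, 41))

# Step 2: let the computer select 5 random numbers.
# random.sample draws without replacement, so no student is picked twice
# and every possible group of 5 is equally likely.
chosen = random.sample(student_numbers, 5)

# Step 3: the sample consists of the students whose numbers were selected.
print("Students in the sample:", sorted(chosen))
```

Running it several times gives different groups of 5. A run such as 27, 28, 29, 30, 31 is possible, just as likely as any other specific group.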

Read Example 3 on pg 13: Random-Number Table. It is one of the ways to create “randomness” in terms of numbers. Using random numbers this way is a form of simulation.

Another way: Random Integer (randInt) on a TI-83 or TI-84 calculator
1) Go to MATH
2) Slide over to PRB
3) Choose #5. It should show randInt(
4) If you want ONE random number out of a total of 500, type randInt(1,500). This gives a random integer between 1 and 500, inclusive.
5) If you want 30 random numbers out of a total of 500, type randInt(1,500,30). The numbers can repeat, so skip any duplicates when you are drawing a sample.
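For anyone without the calculator, a rough Python counterpart is sketched below. The helper name rand_int is made up for this sketch; it only mimics the calculator command and is not a built-in.

```python
import random

def rand_int(low, high, count=1):
    """Return `count` random integers between low and high, inclusive.

    Like the calculator command, the values may repeat.
    """
    return [random.randint(low, high) for _ in range(count)]

one_number = rand_int(1, 500)          # like randInt(1,500)
thirty_numbers = rand_int(1, 500, 30)  # like randInt(1,500,30), repeats possible

# When the numbers pick sample members, repeats are unwanted.
# random.sample draws 30 distinct numbers from 1..500 instead.
thirty_distinct = random.sample(range(1, 501), 30)

print(one_number)
print(thirty_numbers)
print(sorted(thirty_distinct))
```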

2nd type of data collection Simulation (usually with number): a numerical facsimile or representation of a real-world phenomenon Note: Productive in studying nuclear reactors, cloud formation, cardiology, highway design, production control, shipbuilding, airplane design, war games, economics, and electronics.

Quick Talk: Why do you think it is important to use simulation as a data collection method? (think about the application field we just discussed)

Group Activity: In your group, simulate tossing a coin 10 times with each of three methods. One person will record. One person will toss a coin (head or tail). One person will use a calculator (1 = head, 2 = tail). One person will use the random-number table from the back of the book (odd = head, even = tail). You should have a total of 30 trials. Answer these questions: What is the theoretical probability of getting a head? What is the experimental probability of getting a head?

Answer Theoretical probability: 50% Experimental probability: depends on your group
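The calculator version of the activity (1 = head, 2 = tail) can also be run on a computer. The sketch below is an illustration only; the function name is made up. It shows how the experimental probability tends to move toward the theoretical 50% as the number of trials grows.

```python
import random

def experimental_probability_of_heads(trials, rng):
    """Simulate `trials` coin tosses (1 = head, 2 = tail) and return the share of heads."""
    tosses = [rng.randint(1, 2) for _ in range(trials)]
    return tosses.count(1) / trials

rng = random.Random()  # unseeded, so each run gives different results
for trials in (30, 1000, 100000):
    share = experimental_probability_of_heads(trials, rng)
    print(f"{trials:>6} trials: experimental probability of head = {share:.3f}")
```

With only 30 trials the result can easily be 40% or 60%, which is why each group's answer differs.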

Sampling: Different ways to create “randomness”
Stratified sampling: Divide the entire population into distinct subgroups called strata. The strata are based on a specific characteristic such as age, income, education level, and so on. All members of a stratum share the specific characteristic. Draw random samples from each stratum.
Systematic sampling: Number all members of the population sequentially. Then, from a starting point selected at random, include every kth member of the population in the sample.
Cluster sampling: Divide the entire population into pre-existing segments or clusters. The clusters are often geographic. Make a random selection of clusters. Include every member of each selected cluster in the sample.
Multistage sampling: Use a variety of sampling methods to create successively smaller groups at each stage. The final sample consists of clusters.
Convenience sampling: Create a sample by using data from population members that are readily available (potential to have lots of bias).

Vocabulary dealing with sampling
Sampling frame: a list of individuals from which a sample is actually selected
Undercoverage: results from omitting population members from the sampling frame
Sampling error: the difference between measurements from a sample and corresponding measurements from the respective population. It is caused by the fact that the sample does not perfectly represent the population.
Nonsampling error: the result of poor sample design, sloppy data collection, faulty measuring instruments, bias in questionnaires, and so on

Note: Remember, is it usually possible to collect data from the whole population? We have to use a sample to predict the population, and a sample is not a perfect representation of the population! Sampling errors do not represent mistakes! They are just the consequences of using samples instead of populations. Nonsampling errors do occur, so be aware of them! Avoid bias and sloppy data collection, which lead to false conclusions (for example, false positives).
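Three of these methods are sketched below in Python. The population is hypothetical (60 students numbered 1 to 60, 20 in each of grades 9, 10, and 11), and the function names are made up for this illustration. Grade level serves as the stratum in one function and as the cluster in another, purely to keep the example short.

```python
import random

# Hypothetical population: 60 students, 20 per grade level (9, 10, 11).
population = [{"id": i, "grade": 9 + (i - 1) // 20} for i in range(1, 61)]

def stratified_sample(pop, key, per_stratum, rng):
    """Split pop into strata by `key`, then draw a random sample from each stratum."""
    strata = {}
    for person in pop:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

def systematic_sample(pop, k, rng):
    """Pick a random starting point among the first k members, then take every kth member."""
    start = rng.randrange(k)
    return pop[start::k]

def cluster_sample(pop, key, n_clusters, rng):
    """Randomly select whole clusters and include every member of each selected cluster."""
    clusters = sorted({person[key] for person in pop})
    chosen = set(rng.sample(clusters, n_clusters))
    return [person for person in pop if person[key] in chosen]

rng = random.Random()
print([p["id"] for p in stratified_sample(population, "grade", 4, rng)])
print([p["id"] for p in systematic_sample(population, 6, rng)])
print([p["id"] for p in cluster_sample(population, "grade", 1, rng)])
```

Notice the difference: the stratified sample contains a few students from every grade, while the cluster sample contains every student from one grade.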
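To see that sampling error is a consequence of sampling and not a mistake, here is a minimal Python sketch. The population is made-up data (5,000 simulated test scores), and the names are invented for this illustration. It measures the sampling error of the sample mean for different sample sizes.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: 5,000 test scores (made-up data).
population = [rng.gauss(70, 10) for _ in range(5000)]
population_mean = statistics.mean(population)

def sampling_error(n):
    """Sample mean minus population mean for one random sample of size n."""
    sample = rng.sample(population, n)
    return statistics.mean(sample) - population_mean

# Average size of the sampling error over 200 samples of each size.
average_error = {}
for n in (10, 100, 1000):
    errors = [abs(sampling_error(n)) for _ in range(200)]
    average_error[n] = statistics.mean(errors)
    print(f"sample size {n:>4}: average sampling error = {average_error[n]:.3f}")
```

Every sample gives some error even though nothing was done wrong, and larger samples tend to give smaller errors. Nonsampling error (bias, sloppy measurement) does not shrink this way.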

Homework Practice Pg 17-19 #1-5, 7, 13, 15

Quick Talk Why is planning a good experimental design important? Think about what we learned.

More types of data collection techniques
1) Experiment (most stringent and restrictive)
2) Observational study, such as a census (somewhat convenient)
3) Survey (most convenient way to collect data)

Basic guidelines for planning a statistical study
1) Identify the individuals or objects of interest.
2) Specify the variables as well as protocols for taking measurements or making observations.
3) Determine if you will use an entire population or a representative sample. Decide on a viable sampling method.
4) In your data collection plan, address issues of ethics, subject confidentiality, and privacy. If you are collecting data at a business, store, college, or other institution, be sure to be courteous and to obtain permission as necessary.
5) Collect the data.
6) Use appropriate descriptive statistics methods and make decisions using appropriate inferential statistics methods.
7) Finally, note any concerns you might have about your data collection methods and list any recommendations for future studies.

Quick talk: You are a researcher in a biotech company. You are trying to find out the efficacy and the effectiveness of a vaccine. How would you conduct the experiment? Note: statistics is needed for ALL medical companies to test the effectiveness of their technology or medicine

3rd type of data collection: Experiments (zombie creation?!?!) Completely randomized experiment: one in which a random process is used to assign each individual to one of the treatments Block: a group of individuals sharing some common features that might affect the treatment Randomized block experiment: individuals are first sorted into blocks, and then a random process is used to assign each individual in the block to one of the treatments.
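The two designs can be sketched in Python as follows. The subjects, the age-group blocks, and the function names are all hypothetical, chosen only to illustrate the definitions above.

```python
import random

def completely_randomized(individuals, treatments, rng):
    """Shuffle the individuals, then deal them out to the treatments in turn."""
    shuffled = list(individuals)
    rng.shuffle(shuffled)
    return {person: treatments[i % len(treatments)]
            for i, person in enumerate(shuffled)}

def randomized_block(individuals, block_of, treatments, rng):
    """Sort individuals into blocks, then randomize treatments within each block."""
    blocks = {}
    for person in individuals:
        blocks.setdefault(block_of(person), []).append(person)
    assignment = {}
    for members in blocks.values():
        assignment.update(completely_randomized(members, treatments, rng))
    return assignment

# Hypothetical subjects: (name, age group). Age group is the blocking feature.
subjects = [(f"S{i}", "under 40" if i <= 6 else "40 and over") for i in range(1, 13)]
treatments = ["vaccine", "placebo"]

rng = random.Random()
simple = completely_randomized(subjects, treatments, rng)
blocked = randomized_block(subjects, lambda s: s[1], treatments, rng)

for subject, treatment in sorted(blocked.items()):
    print(subject, "->", treatment)
```

In the blocked design each age group is guaranteed to be split evenly between vaccine and placebo. In the completely randomized design the overall split is even, but one age group could end up mostly in one treatment by chance.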

Two types of groups
Experimental group: treatment is deliberately imposed on the individuals in order to observe a possible change in the response or variable being measured (the group receiving the actual treatment).
Control group: receives a dummy treatment, enabling the researchers to control for the placebo effect. It is used to account for the influence of other known or unknown variables that might be an underlying cause of a change in response in the experimental group (the group receiving the fake treatment).

Placebo effect (aka sugar pill effect): occurs when a subject receives no treatment but (incorrectly) believes he or she is in fact receiving treatment and responds favorably. (In the control group only)

Remember, a good experiment needs: 1) Randomization: used to assign individuals to the treatment groups. This helps prevent bias in selecting members for each group. 2) Replication (re-test): repeating the same experiment on many individuals to reduce the possibility that the difference in response between the groups occurred by chance alone. Note: Many experiments are also “double-blind”. This means that neither the individuals nor the observers know which subjects are receiving the treatment. It controls the biases that a doctor might have about a patient.

Quick talk: What is the downside of using experiments as a data collection technique? What is the downside of using sampling as a data collection technique?

4th type of data collection: Census (a type of observational study): measurements or observations from the entire population are used. The US does one every 10 years, although it is still impossible to reach everyone (e.g. homeless people), so estimates are used. Sample (sampling; used with observational studies): measurements or observations from part of the population are used (the most common approach).

Observational study: observations and measurements of individuals are conducted in a way that doesn’t change the response or the variable being measured.

5th type of data collection: Survey. A survey is a useful tool for gathering data without experimenting. It is a type of observational study.

Downside of surveys
Nonresponse: Individuals either cannot be contacted or refuse to participate. This can result in significant undercoverage of a population.
Truthfulness of response: Respondents may lie intentionally or inadvertently.
Faulty recall: Respondents may not accurately remember.
Hidden bias: The question may be worded in such a way as to elicit a specific response.
Vague wording: Words such as “often”, “seldom”, and “occasionally” mean different things to different people.
Interviewer influence: Factors such as tone of voice, body language, dress, gender, authority, and ethnicity of the interviewer might influence responses.
Voluntary response: Individuals with strong feelings about a subject are more likely than others to respond. Such a study is interesting but not reflective of the population.

Downside of all data collection
Lurking variable: one for which no data have been collected but that has influence on the other variables in the study.
Confounding: Two variables are confounded when the effects of one cannot be distinguished from the effects of the other. Confounding variables may be part of the study, or they may be outside lurking variables.

Quick Talk: Have you ever used Yelp or a similar app? Discuss whether there was ever a case where the comment section “influenced” one of your decisions. How does it relate to the “downside of surveys”?

Review We have learned different data collection methods. Look in your notes, and in your group list all the collection methods.

Answer experiment, census, simulation, sampling (w/ survey)

Which type of data collection do you think is the most appropriate for the following studies? 1) Study of the effect of stopping the cooling process of a nuclear reactor 2) Study of the amount of time students spend watching TV while studying 3) Study of the effects of a weight-loss pill given to women 4) Study of the credits earned by each student enrolled at the high school at the end of 1st semester

Group work: Comment on the usefulness of the data (both positive and negative). 1) An interviewer asks the interviewees if they have taken drugs this year. 2) Jessica saw some data showing that cities with more low-income housing have more homeless people. Does building low-income housing cause homelessness? 3) You look at the reviews on Yelp to determine the quality of a restaurant. 4) An extensive study on cancer is conducted using only men over 40.

Homework Practice Pg 26-27 #1-5 odd

Review Practice P29 #1-9