Inference.ppt - © Aki Taanila1 Sampling Probability sample Non probability sample Statistical inference Sampling error.

Slides:



Advertisements
Similar presentations
“Students” t-test.
Advertisements

1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS St. Edward’s University.
Hypothesis: It is an assumption of population parameter ( mean, proportion, variance) There are two types of hypothesis : 1) Simple hypothesis :A statistical.
Hypothesis Testing A hypothesis is a claim or statement about a property of a population (in our case, about the mean or a proportion of the population)
1 1 Slide STATISTICS FOR BUSINESS AND ECONOMICS Seventh Edition AndersonSweeneyWilliams Slides Prepared by John Loucks © 1999 ITP/South-Western College.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 9 Hypothesis Testing Developing Null and Alternative Hypotheses Developing Null and.
Probability & Statistical Inference Lecture 6
EPIDEMIOLOGY AND BIOSTATISTICS DEPT Esimating Population Value with Hypothesis Testing.
PSY 307 – Statistics for the Behavioral Sciences
Fundamentals of Hypothesis Testing. Identify the Population Assume the population mean TV sets is 3. (Null Hypothesis) REJECT Compute the Sample Mean.
1/55 EF 507 QUANTITATIVE METHODS FOR ECONOMICS AND FINANCE FALL 2008 Chapter 10 Hypothesis Testing.
Topic 2: Statistical Concepts and Market Returns
Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-Hall, Inc. Chap 8-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter.
Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Statistics for Business and Economics 7 th Edition Chapter 9 Hypothesis Testing: Single.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 8-1 Chapter 8 Fundamentals of Hypothesis Testing: One-Sample Tests Statistics.
Chapter 9 Hypothesis Testing.
Ch. 9 Fundamental of Hypothesis Testing
Chapter 8 Introduction to Hypothesis Testing
BCOR 1020 Business Statistics
AM Recitation 2/10/11.
Estimation and Hypothesis Testing Faculty of Information Technology King Mongkut’s University of Technology North Bangkok 1.
Probability Distributions and Test of Hypothesis Ka-Lok Ng Dept. of Bioinformatics Asia University.
Two Sample Tests Ho Ho Ha Ha TEST FOR EQUAL VARIANCES
Chapter 10 Hypothesis Testing
Confidence Intervals and Hypothesis Testing - II
Statistical inference: confidence intervals and hypothesis testing.
Fundamentals of Hypothesis Testing: One-Sample Tests
Section 9.1 Introduction to Statistical Tests 9.1 / 1 Hypothesis testing is used to make decisions concerning the value of a parameter.
Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University ECON 4550 Econometrics Memorial University of Newfoundland.
1 GE5 Lecture 6 rules of engagement no computer or no power → no lesson no SPSS → no lesson no homework done → no lesson.
Copyright © Cengage Learning. All rights reserved. 13 Linear Correlation and Regression Analysis.
Section 10.1 ~ t Distribution for Inferences about a Mean Introduction to Probability and Statistics Ms. Young.
Sullivan – Fundamentals of Statistics – 2 nd Edition – Chapter 11 Section 2 – Slide 1 of 25 Chapter 11 Section 2 Inference about Two Means: Independent.
1 Level of Significance α is a predetermined value by convention usually 0.05 α = 0.05 corresponds to the 95% confidence level We are accepting the risk.
T-distribution & comparison of means Z as test statistic Use a Z-statistic only if you know the population standard deviation (σ). Z-statistic converts.
More About Significance Tests
Week 8 Fundamentals of Hypothesis Testing: One-Sample Tests
Chapter 9 Hypothesis Testing and Estimation for Two Population Parameters.
Chapter 8 Introduction to Hypothesis Testing
STA Statistical Inference
1 1 Slide IS 310 – Business Statistics IS 310 Business Statistics CSU Long Beach.
Testing of Hypothesis Fundamentals of Hypothesis.
Chapter 9 Fundamentals of Hypothesis Testing: One-Sample Tests.
Confidence intervals and hypothesis testing Petter Mostad
1 Chapter 9 Hypothesis Testing. 2 Chapter Outline  Developing Null and Alternative Hypothesis  Type I and Type II Errors  Population Mean: Known 
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 8-1 Chapter 8 Fundamentals of Hypothesis Testing: One-Sample Tests Statistics.
Chap 8-1 A Course In Business Statistics, 4th © 2006 Prentice-Hall, Inc. A Course In Business Statistics 4 th Edition Chapter 8 Introduction to Hypothesis.
Lecture 9 Chap 9-1 Chapter 2b Fundamentals of Hypothesis Testing: One-Sample Tests.
Interval Estimation and Hypothesis Testing Prepared by Vera Tabakova, East Carolina University.
Economics 173 Business Statistics Lecture 4 Fall, 2001 Professor J. Petry
Statistical Inference for the Mean Objectives: (Chapter 9, DeCoursey) -To understand the terms: Null Hypothesis, Rejection Region, and Type I and II errors.
Copyright ©2013 Pearson Education, Inc. publishing as Prentice Hall 9-1 σ σ.
Fall 2002Biostat Statistical Inference - Confidence Intervals General (1 -  ) Confidence Intervals: a random interval that will include a fixed.
Chap 8-1 Fundamentals of Hypothesis Testing: One-Sample Tests.
© Copyright McGraw-Hill 2004
WS 2007/08Prof. Dr. J. Schütze, FB GW KI 1 Hypothesis testing Statistical Tests Sometimes you have to make a decision about a characteristic of a population.
Formulating the Hypothesis null hypothesis 4 The null hypothesis is a statement about the population value that will be tested. null hypothesis 4 The null.
Sullivan – Fundamentals of Statistics – 2 nd Edition – Chapter 11 Section 1 – Slide 1 of 26 Chapter 11 Section 1 Inference about Two Means: Dependent Samples.
Sampling and Statistical Analysis for Decision Making A. A. Elimam College of Business San Francisco State University.
Statistical Inference Statistical inference is concerned with the use of sample data to make inferences about unknown population parameters. For example,
PART 2 SPSS (the Statistical Package for the Social Sciences)
Jump to first page Inferring Sample Findings to the Population and Testing for Differences.
HYPOTHESIS TESTING FOR DIFFERENCES BETWEEN MEANS AND BETWEEN PROPORTIONS.
Uncertainty and confidence Although the sample mean,, is a unique number for any particular sample, if you pick a different sample you will probably get.
Chapter 7 Inference Concerning Populations (Numeric Responses)
Statistical Inference for the Mean Objectives: (Chapter 8&9, DeCoursey) -To understand the terms variance and standard error of a sample mean, Null Hypothesis,
T-TEST. Outline  Introduction  T Distribution  Example cases  Test of Means-Single population  Test of difference of Means-Independent Samples 
Chapter 9 Hypothesis Testing Understanding Basic Statistics Fifth Edition By Brase and Brase Prepared by Jon Booze.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
4-1 Statistical Inference Statistical inference is to make decisions or draw conclusions about a population using the information contained in a sample.
Presentation transcript:

inference.ppt - © Aki Taanila1 Sampling Probability sample Non probability sample Statistical inference Sampling error

inference.ppt - © Aki Taanila2 Probability sample Goal: A representative sample = miniature of the population You can use simple random sampling, systematic sampling, stratified sampling, clustered sampling or combination of these methods to get a probability sample Probability sample  You can draw conclusions about the whole population

inference.ppt - © Aki Taanila3 Simple Random Population Sample

inference.ppt - © Aki Taanila4 Systematic Select picking interval e.g. every fifth Choose randomly one among the first five (or whatever the picking interval is) Pick out every fifth (or whatever the picking interval is) beginning from the chosen one

inference.ppt - © Aki Taanila5 Stratified Population Sample Proportional allocation Even allocation Compare groups Guarantee that all the groups are represented like in the population Sample

inference.ppt - © Aki Taanila6 Cluster Divide population into the clusters (schools, districts,…) Choose randomly some of the clusters Draw sample from the chosen clusters using appropriate sampling method (or investigate chosen groups in whole) Sample

inference.ppt - © Aki Taanila7 Non probability Sample When a sample is not drawn randomly it is called a non probability sample For example, when you use elements most available, like in self-selecting surveys or street interviews In the case of a non probability sample you should not draw conclusions about the whole population

inference.ppt - © Aki Taanila8 Statistical inference Statistical inference: Drawing conclusions about the whole population on the basis of a sample Precondition for statistical inference: A sample is randomly selected from the population (=probability sample)

inference.ppt - © Aki Taanila9 Sampling Error Population mean 40,8 Sample 1 mean 40,5 Sample 2 mean 40,3 Sample 3 mean 41,4 Different samples from the same population give different results Due to chance

inference.ppt - © Aki Taanila10 Sampling distributions Mean Normal distribution T-distribution Proportion Normal distribution

inference.ppt - © Aki Taanila11 Sampling distribution Most of the statistical inference methods are based on sampling distributions You can apply statistical inference without knowing sampling distributions Still, it is useful to know, at least the basic idea of sampling distribution In the following the sampling distributions of mean and proportion are presented as examples of sampling distributions

inference.ppt - © Aki Taanila12 Denotations Population parameters are denoted using Greek letters  (mean),  (standard deviation),  (proportion) Sample values are denoted x (mean), s (standard deviation), p (proportion) , ,  Population Sample x, s, p Parameters Estimates for parameters

inference.ppt - © Aki Taanila13 Mean calculated from a sample is usually the best guess for population mean. But different samples give different sample means! It can be shown that sample means from samples of size n are normally distributed: Term is called standard error (standard deviation of sample means). Sampling Distribution of Mean 1

inference.ppt - © Aki Taanila14 Sampling Distribution of Mean 2 Sample mean comes from the normal distribution above. Knowing normal distribution properties, we can be 95% sure that sample mean is in the range:

inference.ppt - © Aki Taanila15 Confidence interval for mean Based on the previous slide, we can be 95% sure that population mean is in the range:

inference.ppt - © Aki Taanila16 If population standard deviation is unknown then it can be shown that sample means from samples of size n are t-distributed with n-1 degrees of freedom As an estimate for standard error we can use Sampling Distribution of Mean σ unknown

inference.ppt - © Aki Taanila17 Confidence interval for mean σ unkown Based on the previous slide, we can be 95% sure that population mean is in the range:

inference.ppt - © Aki Taanila18 T-distribution T-distribution is quite similar to normal distribution, but the exact shape of t-distribution depends on sample size When sample size increases then t-distribution approaches normal distribution T-distribution’s critical values can be calculated with Excel =TINV(probability;degrees of freedom) In the case of error margin for mean degrees of freedom equals n – 1 (n=sample size) Ex. Critical value for 95% confidence level when sample size is 50: =TINV(0,05;49) results 2,00957

inference.ppt - © Aki Taanila19 Sampling Distribution of Proportion Proportion calculated from a sample is usually the best guess for population proportion. But different samples give different sample proportions! It can be shown that proportions from samples of size n are normally distributed Standard error (standard deviation of sample proportions) is As an estimate for standard error we use

inference.ppt - © Aki Taanila20 Error margin for proportion Based on the sampling distribution of proportion we can be 95% sure that population proportion is (95% confidence interval)

inference.ppt - © Aki Taanila21 Parameter Estimation Parameter and its estimate Error margin

inference.ppt - © Aki Taanila22 Parameter estimation Objective is to estimate the unknown population parameter using the value calculated from the sample The parameter may be for example mean or proportion

inference.ppt - © Aki Taanila23 Error margin A value calculated from the sample is the best guess when estimating corresponding population value Estimate is still uncertain due to sampling error Error margin is a measure of uncertainty Using error margin you can state confidence interval: estimate + error margin

inference.ppt - © Aki Taanila24 Error margin for mean -  known If population standard deviation  is known then error margin for population mean is We can be 95% sure that population mean is (95% confidence interval):

inference.ppt - © Aki Taanila25 Error margin for mean -  unknown If population standard deviation is unknown then error margin for population mean is We can be 95% sure that population mean is (95% confidence interval):

inference.ppt - © Aki Taanila26 Confidence level Confidence level can be selected to be different from 95% If population standard deviation  is known then critical value can be calculated from normal distribution Ex. In Excel =-NORMSINV(0,005) gives the critical value for 99% confidence level (0,005 is half of 0,01) If population standard deviation  is unknown then critical value can be calculated from t-distribution Ex. In Excel =TINV(0,01;79) gives critical value when sample size is 80 and confidence level is 99%

inference.ppt - © Aki Taanila27 Error margin for proportion Error margin for proportion is We can be 95% sure that population proportion is (95% confidence interval)

inference.ppt - © Aki Taanila28 Hypothesis testing Null hypothesis Alternative hypothesis 2-tailed or 1 –tailed P-value

inference.ppt - © Aki Taanila29 Hypothesis 1 Hypothesis is a belief concerning a parameter Parameter may be population mean, proportion, correlation coefficient,... I believe that mean weight of cereal packages is 300 grams!

inference.ppt - © Aki Taanila30 Hypothesis 2 Null hypothesis is prevalent opinion, previous knowledge, basic assumption, prevailing theory,... Alternative hypothesis is rival opinion Null hypothesis is assumed to be true as long as we find evidence against it If a sample gives strong enough evidence against null hypothesis then alternative hypothesis comes into force.

inference.ppt - © Aki Taanila31 Hypothesis examples H0: Mean height of males equals 174. H1: Mean height is bigger than 174. H0: Half of the population is in favour of nuclear power plant. H1: More than half of the population is in favour of nuclear power plant. H0: The amount of overtime work is equal for males and females. H1: The amount of overtime work is not equal for males and females. H0: There is no correlation between interest rate and gold price. H1: There is correlation between interest rate and gold price.

inference.ppt - © Aki Taanila32 2-tailed Test Use 2-tailed if there is no reason for 1-tailed. In 2-tailed test deviations (from the null hypothesis) to the both directions are interesting. Alternative hypothesis takes the form ”different than”.

inference.ppt - © Aki Taanila33 1-tailed Test In 1-tailed test we know beforehand that only deviations to one direction are possible or interesting. Alternative hypothesis takes the form ”less than” or ”greater than”.

inference.ppt - © Aki Taanila34 Reject null hypothesis! Sample mean is only 45! Population Prevalent opinion is that mean age in that group is 50 (null hypothesis) Mean age = 45 Random sample Logic behind hypothesis testing

inference.ppt - © Aki Taanila35 Risk of being wrong Not Guilty until proved otherwise! Null hypothesis remains valid until proved otherwise! Sometimes it happens that innocent person is proved guilty. Same may happen in hypothesis testing: We may reject null hypothesis although it is true. (there is always a risk of being wrong when we reject null hypothesis; risk is due to sampling error).

inference.ppt - © Aki Taanila36 Significance Level When we reject the null hypothesis there is a risk of drawing a wrong conclusion Risk of drawing a wrong conclusion (called p-value or observed significance level) can be calculated Researcher decides the maximum risk (called significance level) he is ready to take Usual significance level is 5%

inference.ppt - © Aki Taanila37 P-value We start from the basic assumption: The null hypothesis is true P-value is the probability of getting a value equal to or more extreme than the sample result, given that the null hypothesis is true Decision rule: If p-value is less than 5% then reject the null hypothesis; if p-value is 5% or more then the null hypothesis remains valid In any case, you must give the p-value as a justification for your decision.

inference.ppt - © Aki Taanila38 Steps in hypothesis testing! 1. Set the null hypothesis and the alternative hypothesis. 2. Calculate the p-value. 3. Decision rule: If the p-value is less than 5% then reject the null hypothesis otherwise the null hypothesis remains valid. In any case, you must give the p-value as a justification for your decision.

inference.ppt - © Aki Taanila39 Testing mean Null hypothesis: Mean equals x 0 Alternative hypothesis (2-tailed): Mean is different from x 0 Alternative hypothesis (1-tailed): Mean is less than x 0 Alternative hypothesis (1-tailed): Mean is bigger than x 0

inference.ppt - © Aki Taanila40 Testing mean -  known p-value Calculate standardized sample mean Calculate the p-value that indicates, how likely it is to get this kind of value if we assume that null hypothesis is true In Excel you can calculate the p-value: =NORMSDIST(-ABS(z))

inference.ppt - © Aki Taanila41 Testing mean -  unknown p-value Calculate standardized sample mean Calculate the p-value that indicates, how likely it is to get this kind of value if we assume that null hypothesis is true In Excel you can calculate the p-value: =TDIST(ABS(t),degrees of freedom,tails); in this case degrees of freedom equals n-1 and tails defines whether you use one-tailed (1) or two-tailed (2) test

inference.ppt - © Aki Taanila42 Testing Proportion In the following p 0 is a value between 0 and 1 Null hypothesis: Proportion equals p 0 *100% Alternative hypothesis (2-tailed): Proportion is different from p 0 *100% Alternative hypothesis (1-tailed): Proportion is less than p 0 *100% Alternative hypothesis (1-tailed): Proportion is bigger than p 0 *100%

inference.ppt - © Aki Taanila43 Testing proportion p-value Calculate standardized sample proportion Calculate the p-value that indicates, how likely it is to get this kind of value if we assume that null hypothesis is true In Excel you can calculate the p-value: =NORMSDIST(-ABS(z))

inference.ppt - © Aki Taanila44 Comparing two group means Null hypothesis: Group means are equal Alternative hypothesis (2-tailed): Group means are not equal Alternative hypothesis (1-tailed): Mean in a group is bigger than in another group

inference.ppt - © Aki Taanila45 Comparing two group means - Selecting appropriate t-test If we have an experiment, in which observations are paired (e.g. group1: salesmen’s monthly sales before training and group2: same salesmen’s monthly sales after training), then we should use paired sample t-test. If we compare two independent groups with equal variances then we should use independent samples t-test for equal variances. If we compare two independent groups with unequal variances then we should use independent samples t-test for unequal variances.

inference.ppt - © Aki Taanila46 Comparing two group means t-test p-value Calculate the p-value using function =TTEST(group1;group2;tail;type) Group1 refers to cells containing data for group1 and group2 refers to cells containing data for group2 Tail may be 1 (1-tailed test) or 2 (2-tailed test). Type may be 1 (paired t-test), 2 (independent samples t-test for equal variances) or 3 (independent samples t-test for unequal variances).

inference.ppt - © Aki Taanila47 Equal or unequal variances? Independent samples t-test is calculated differently depending on whether we assume population variances equal or unequal If sample standard deviations are near each other then you can use equal variances test In most cases both ways give almost the same p-value If you are unsure about which one to use then you can test whether the variances are equal or not by using F-test You should use 2-tailed test with the following hypothesis H0: Variances are equal H1: Variances are unequal

inference.ppt - © Aki Taanila48 Equal or unequal variances p-value F-test is included in Tools-menu’s Data Analysis – tools As an output you get among other things p-value for 1-tailed test You have to multiply p-value by two to get p-value for 2-tailed test If 2-tailed p-value is less than 0,05 (5%) then you should reject H0 and use t-test for unequal variances

inference.ppt - © Aki Taanila49 Testing cross tabulation Null hypothesis: No relationship in the population Alternative hypothesis: Relationship in the population

inference.ppt - © Aki Taanila50 Testing cross tabulation p-value See See SPSS instructions

inference.ppt - © Aki Taanila51 Testing correlation Null hypothesis: Correlation coefficient equals 0 (no correlation) Alternative hypothesis (2-tailed): Correlation coefficient is different from 0 Alternative hypothesis (1-tailed): Correlation coefficient is less than 0 Alternative hypothesis (1-tailed): Correlation coefficient is bigger than 0

inference.ppt - © Aki Taanila52 Testing correlation p-value See See SPSS instructions