PROBABILITY & STATISTICAL INFERENCE LECTURE 6 MSc in Computing (Data Analytics)


Lecture Outline
• Quick Recap
• Testing the difference between two sample means
• Practical Hypothesis Testing
• Analysis of Variance

General Steps in Hypothesis Testing
1. From the problem context, identify the parameter of interest.
2. State the null hypothesis, H0.
3. Specify an appropriate alternative hypothesis, H1.
4. Choose a significance level, α.
5. Determine an appropriate test statistic.
6. State the rejection region for the statistic.
7. Compute any necessary sample quantities, substitute these into the equation for the test statistic, and compute that value.
8. Decide whether or not H0 should be rejected and report that in the problem context.

Types of questions that can be answered with two-sample hypothesis tests
• A manufacturing plant wants to compare the defect rates of items coming off two different process lines.
• Are the test results of patients who received a drug better than the test results of those who received a placebo?
• Is there a significant (rather than merely random) difference in the average cycle time to deliver a pizza from Pizza Company A vs. Pizza Company B?

Difference in Means of Two Normal Distributions, Variances Known

Test Assumptions

Example

The P-value is the exact significance level of a statistical test; that is, the probability, when the null hypothesis is true, of obtaining a value of the test statistic that is at least as extreme as the one actually observed.
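The P-value definition above can be made concrete with a minimal sketch of the two-sample z-test for known variances. The function name, the `delta0` parameter, and the summary figures are assumptions made up for illustration; they are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Two-sided z-test of H0: mu1 - mu2 = delta0 with known standard deviations."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = (mean1 - mean2 - delta0) / standard_error
    # Two-sided P-value: probability, under H0, of a statistic at least as extreme as |z|.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical summary data, invented for this illustration only.
z_stat, p_val = two_sample_z_test(
    mean1=50.2, mean2=48.9, sigma1=2.0, sigma2=2.5, n1=25, n2=30
)
alpha = 0.05
print(f"z = {z_stat:.3f}, P-value = {p_val:.4f}, reject H0: {p_val < alpha}")
```

The decision then follows the usual rule: reject H0 when the P-value is smaller than the chosen significance level.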

Confidence Interval on a Difference in Means, Variances Known
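A short sketch of how such an interval could be computed, assuming the standard form (difference in sample means) ± z_{α/2} × (standard error). The function name and the input figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean1, mean2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 when both standard deviations are known."""
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 point of N(0, 1)
    margin = z_crit * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Hypothetical summary data, invented for this illustration only.
lower, upper = z_confidence_interval(50.2, 48.9, 2.0, 2.5, 25, 30)
print(f"95% CI for mu1 - mu2: ({lower:.3f}, {upper:.3f})")
```

If the interval excludes zero, the two-sided test at the matching significance level rejects the hypothesis of equal means.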

Example

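Difference in Means of Two Normal Distributions, Variances Unknown
We wish to test:
$$H_0: \mu_1 - \mu_2 = \Delta_0 \quad \text{versus} \quad H_1: \mu_1 - \mu_2 \neq \Delta_0$$
The pooled estimator of $\sigma^2$:
$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$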

Difference in Means of Two Normal Distributions, Variances unknown
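As a rough illustration of the pooled-variance test, the sketch below computes the pooled estimator and the t statistic for two small samples. It assumes equal population variances; the function name is hypothetical, and the data are the two groups used in the sum-of-squares slides later in this lecture.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2, delta0=0.0):
    """Return (t, pooled variance, degrees of freedom) for H0: mu1 - mu2 = delta0."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq, s2_sq = variance(sample1), variance(sample2)  # sample variances (n - 1 divisor)
    pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    t = (mean(sample1) - mean(sample2) - delta0) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, pooled_var, n1 + n2 - 2


group_a = [6, 4, 2]
group_b = [14, 12, 10]
t_stat, s_p_sq, df = pooled_t_statistic(group_a, group_b)
print(f"t = {t_stat:.3f}, pooled variance = {s_p_sq}, df = {df}")
```

The statistic is compared with a t distribution on n1 + n2 - 2 degrees of freedom. The standard library has no t distribution, so the critical value or P-value would come from tables or a statistics package.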

Example

Confidence Interval on the Difference in Means, Variance Unknown

Example

Practical Hypothesis Testing
1. From the problem context, identify the parameter of interest.
2. State the null hypothesis, H0.
3. Specify an appropriate alternative hypothesis, H1.
4. Choose a significance level, α.
5. Calculate the P-value using a software package of choice.
6. Decide whether or not H0 should be rejected and report that in the problem context.
Reject H0 when the P-value is less than α. (Golden rule: reject H0 for small P-values.)

Some Research
• Look up the correct formula for the hypothesis test comparing two proportions.
• What are the assumptions of the test?
• Find an example of its use in research.

Analysis of Variance

Introduction
• In the previous section we were concerned with the analysis of data where we compared the means of two samples.
• Frequently, data contain more than two samples; for example, an experiment may compare several treatments.
• In this lecture we introduce a statistical analysis that allows us to compare the means of more than two samples. The method is called 'Analysis of Variance', or ANOVA for short.

Total Sum of Squares
Data set: 14, 12, 10, 6, 4, 2
Group A: 6, 4, 2
Group B: 14, 12, 10
Overall mean: 8
Total sum of squares: $SS_T = (14-8)^2 + (12-8)^2 + (10-8)^2 + (6-8)^2 + (4-8)^2 + (2-8)^2 = 112$

Between Group Variation
• Sum of squares of the model: $SS_M = n_a(\mu - \mu_a)^2 + n_b(\mu - \mu_b)^2 = 3(8-4)^2 + 3(8-12)^2 = 96$

Within Group Variation
• Sum of squares of the error: $SS_E = (14-12)^2 + (12-12)^2 + (10-12)^2 + (6-4)^2 + (4-4)^2 + (2-4)^2 = 16$
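The three sums of squares above can be checked with a few lines of code. This is a minimal sketch using only the standard library; the variable names are chosen here for illustration.

```python
from statistics import mean

groups = {"A": [6, 4, 2], "B": [14, 12, 10]}
all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# Total variation: every observation against the overall mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)
# Between-group variation: each group mean against the overall mean, weighted by group size.
ss_model = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
# Within-group variation: every observation against its own group mean.
ss_error = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

print(ss_total, ss_model, ss_error)
```

The results agree with the slide values (112, 96 and 16) and show the partition of the total sum of squares into the model and error parts, 112 = 96 + 16.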

Structure of the Data
Group   Observations               Total   Mean
1       x_11  x_12  ...  x_1n      x_1.    x̄_1.
2       x_21  x_22  ...  x_2n      x_2.    x̄_2.
...     ...                        ...     ...
a       x_a1  x_a2  ...  x_an      x_a.    x̄_a.
Total

ANOVA Table
Source   Degrees of Freedom   Sum of Squares   Mean Square   F-Stat
Model    a - 1                SS_M             SS_M/(a-1)    MS_M/MS_E
Error    n - a                SS_E             SS_E/(n-a)
Total    n - 1                SS_T             SS_T/(n-1)
Where: n is the sample size and a is the number of groups

ANOVA Table – Original Example
Source   Degrees of Freedom   Sum of Squares   Mean Square   F-Stat
Model    2 - 1 = 1            96               96            24
Error    6 - 2 = 4            16               4
Total    6 - 1 = 5            112
Where: n is the sample size and a is the number of groups
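The table for the original example can be rebuilt from the raw data. The sketch below is one possible way to do it; the function name and the dictionary layout are assumptions made for this illustration, not part of the lecture.

```python
from statistics import mean


def one_way_anova_table(groups):
    """Return the rows of a one-way ANOVA table as a dict keyed by source."""
    all_values = [x for g in groups for x in g]
    n, a = len(all_values), len(groups)  # n: total sample size, a: number of groups
    grand_mean = mean(all_values)
    ss_model = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_model = ss_model / (a - 1)
    ms_error = ss_error / (n - a)
    return {
        "Model": {"df": a - 1, "ss": ss_model, "ms": ms_model, "f": ms_model / ms_error},
        "Error": {"df": n - a, "ss": ss_error, "ms": ms_error},
        "Total": {"df": n - 1, "ss": ss_model + ss_error},
    }


table = one_way_anova_table([[6, 4, 2], [14, 12, 10]])
for source, row in table.items():
    print(source, row)
```

The F statistic is referred to an F distribution with (a - 1, n - a) degrees of freedom. The P-value is not computed here because the standard library has no F distribution.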

Model Assumptions
• Independence of observations within and between samples
• Normality of the sampling distribution
• Equal variance (also called the homoscedasticity assumption)

The ANOVA Equation
• We can describe the observations in the above table using the following equation:
Where: n is the sample size and k is the number of groups

ANOVA Hypotheses We wish to test the hypotheses: The analysis of variance partitions the total variability into two parts.
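For reference, a sketch of the standard single-factor formulation, written in notation assumed here (x for observations, a for the number of groups, n_i for the size of group i) rather than taken from the slides:

```latex
% One-way (single-factor) ANOVA model: overall mean + treatment effect + random error
x_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, \dots, a, \quad j = 1, \dots, n_i

% Hypotheses: no treatment effects versus at least one non-zero effect
H_0: \tau_1 = \tau_2 = \dots = \tau_a = 0
\qquad \text{versus} \qquad
H_1: \tau_i \neq 0 \text{ for at least one } i

% Partition of the total variability into between-group and within-group parts
SS_T = SS_M + SS_E
```

In the small two-group example earlier in the lecture, this partition is 112 = 96 + 16.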

Example

Graphical Display of Data Figure 13-1 (a) Box plots of hardwood concentration data. (b) Display of the model in Equation 13-1 for the completely randomized single-factor experiment

Example
• We can use ANOVA to test the hypothesis that different hardwood concentrations do not affect the mean tensile strength of the paper. The hypotheses are:
• The ANOVA table is below:

Example
• The p-value is less than 0.05, so H0 can be rejected, and we can conclude that at least one of the hardwood concentrations affects the mean tensile strength of the paper.

Test Model Assumptions
• Use Bartlett's test to check the homoscedasticity assumption.
• Bartlett's test (Snedecor and Cochran, 1983) is used to test whether k samples have equal variances.
• Bartlett's test is sensitive to departures from normality. That is, if your samples come from non-normal distributions, then Bartlett's test may simply be testing for non-normality. The Levene test is an alternative to the Bartlett test that is less sensitive to departures from normality.

Bartlett Test for Equal Variance
• The hypotheses for the Bartlett test are as follows:
• The Bartlett test statistic follows a chi-squared distribution.
• Interpret the p-value as in any other hypothesis test.
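To show what the test statistic measures, here is a minimal sketch of the usual textbook form of Bartlett's statistic, which compares the log of the pooled variance with the logs of the individual sample variances. The function name is hypothetical, and the P-value step is left out because the standard library has no chi-squared distribution.

```python
from math import log
from statistics import variance


def bartlett_statistic(*samples):
    """Return Bartlett's statistic and its chi-squared degrees of freedom (k - 1)."""
    k = len(samples)
    sizes = [len(s) for s in samples]
    variances = [variance(s) for s in samples]  # sample variances (n - 1 divisor)
    total = sum(sizes)
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (total - k)
    numerator = (total - k) * log(pooled) - sum(
        (n - 1) * log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (total - k)) / (3 * (k - 1))
    return numerator / correction, k - 1


# The two groups from the sum-of-squares slides have identical sample variances,
# so the statistic is zero and gives no evidence against equal variances.
stat_equal, dof = bartlett_statistic([6, 4, 2], [14, 12, 10])
# Hypothetical samples with different spreads, invented for illustration.
stat_unequal, _ = bartlett_statistic([1, 2, 3], [2, 4, 6])
print(stat_equal, stat_unequal, dof)
```

In practice a statistics package would normally be used. SciPy, for example, provides `scipy.stats.bartlett` and `scipy.stats.levene`, which also return P-values.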

If the Assumption of Equal Variance is not met
• If the assumption of equal variance is not met, use Welch's ANOVA.
• Assignment for next week:
• Investigate the difference between the standard ANOVA and Welch's ANOVA.

Demo

Confidence Interval about the mean For 20% hardwood, the resulting confidence interval on the mean is

Confidence Interval on the Difference of Two Treatments For the hardwood concentration example,

An Unbalanced Experiment

Multiple Comparisons Following the ANOVA
• The least significant difference (LSD) is
If the sample sizes are different in each treatment:
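A small sketch of Fisher's LSD in its usual textbook form, LSD = t × sqrt(MS_E × (1/n_i + 1/n_j)), which reduces to t × sqrt(2 × MS_E / n) when every treatment has the same number of observations n. The function name is hypothetical, and the critical value is passed in as an argument because the standard library has no t distribution. The value 2.776 is the tabulated upper 2.5% point of t with 4 degrees of freedom, matching the error degrees of freedom of the small two-group example.

```python
from math import sqrt


def least_significant_difference(t_crit, ms_error, n_i, n_j):
    """Fisher's LSD for comparing treatments i and j with sample sizes n_i and n_j."""
    return t_crit * sqrt(ms_error * (1 / n_i + 1 / n_j))


# Two-group example from the sum-of-squares slides: MS_E = 16 / 4 = 4, three observations per group.
lsd = least_significant_difference(t_crit=2.776, ms_error=4.0, n_i=3, n_j=3)
mean_a, mean_b = 4.0, 12.0
difference = abs(mean_a - mean_b)
print(f"LSD = {lsd:.3f}, |difference| = {difference:.1f}, significant: {difference > lsd}")
```

Two treatment means are declared different when the absolute difference between them exceeds the LSD.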

Example: Multi-comparison Test

Demo

Exercises