CHAPTER 2 Building Empirical Model. Basic Statistical Concepts Consider this situation: The tension bond strength of portland cement mortar is an important.

Presentation transcript:

CHAPTER 2 Building Empirical Model

Basic Statistical Concepts Consider this situation: The tension bond strength of portland cement mortar is an important characteristic of the product. An engineer is interested in comparing the strength of a modified formulation, in which polymer latex emulsions have been added during mixing, to the strength of unmodified mortar. The experimenter has collected 10 observations on the strength of each mortar. The data are shown in Table 2.1.

• Each observation is called a run.
• Fluctuation (noise) in the observations is called experimental error.
• The presence of error implies that the response variable is a random variable (which can be discrete or continuous).

Dot diagram for data in Table 2.1 What can you conclude from the dot diagram? Where is the general location or central tendency?

Other graphical methods… Histogram For fairly numerous data

Other graphical methods… Box plot (or box-and-whisker plot): displays the median, the upper quartile (75%) and the lower quartile (25%).

Probability Distributions The probability structure of a random variable, y, is described by its probability distribution. If y is discrete, the distribution is given by the probability function of y, p(y). If y is continuous, it is given by the probability density function, f(y).

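Mean, Variance and Expected Value The mean, μ, of a probability distribution is a measure of its central tendency or location: $\mu = \int_{-\infty}^{\infty} y f(y)\,dy$ if y is continuous, and $\mu = \sum_{\text{all } y} y\,p(y)$ if y is discrete. We may also express the mean in terms of the expected value of the random variable y, $\mu = E(y)$, where E denotes the expected value operator.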

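The variability or dispersion of a probability distribution can be measured by the variance, defined as $\sigma^2 = \int_{-\infty}^{\infty} (y-\mu)^2 f(y)\,dy$ if y is continuous, and $\sigma^2 = \sum_{\text{all } y} (y-\mu)^2 p(y)$ if y is discrete. Note that the variance can be expressed entirely in terms of expectation because $\sigma^2 = E[(y-\mu)^2]$. Finally, the variance is used so extensively that it is convenient to define a variance operator, V, such that $V(y) = E[(y-\mu)^2] = \sigma^2$.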
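As a small illustration of the expectation and variance operators for a discrete random variable, the sketch below uses a made-up probability function (the values and probabilities are assumptions, not data from the slides). It also checks the standard identity V(y) = E(y²) − μ².

```python
# Hypothetical discrete distribution: values of y and their probabilities p(y).
values = [0, 1, 2, 3]
probs = [0.1, 0.2, 0.3, 0.4]


def expectation(g, values, probs):
    """Return E[g(y)], the sum of g(y) * p(y) over all values of y."""
    return sum(g(y) * p for y, p in zip(values, probs))


# Mean: mu = E(y)
mu = expectation(lambda y: y, values, probs)

# Variance from the definition: V(y) = E[(y - mu)^2]
var = expectation(lambda y: (y - mu) ** 2, values, probs)

# Same variance from the shortcut identity: E(y^2) - mu^2
var_alt = expectation(lambda y: y * y, values, probs) - mu ** 2

print(mu, var, var_alt)
```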

Inferences About Differences In Means, Randomized Design
• Hypothesis testing
• Choice of sample size
• Confidence intervals
• The case where $\sigma_1^2 \neq \sigma_2^2$
• The case where $\sigma_1^2$ and $\sigma_2^2$ are known
• Comparing a single mean to a specified value

Hypothesis testing Let's reconsider the portland cement experiment. In general, we can consider the 2 formulations involved (unmodified and modified mortar) as 2 levels of the factor "formulation". Let $y_{11}, y_{12}, y_{13}, \ldots, y_{1n_1}$ represent the $n_1$ observations from the first factor level, and $y_{21}, y_{22}, y_{23}, \ldots, y_{2n_2}$ represent the $n_2$ observations from the second factor level.

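We describe the results of an experiment with a model. A simple statistical model is $y_{ij} = \mu_i + \varepsilon_{ij}$, for $i = 1, 2$ and $j = 1, 2, \ldots, n_i$, where
$y_{ij}$ = the jth observation from factor level i
$\mu_i$ = the mean of the response at factor level i
$\varepsilon_{ij}$ = a normal random variable representing the random error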

1) Statistical hypothesis
• A statistical hypothesis is a statement either about the parameters of a probability distribution or about the parameters of a model.
• The decision-making procedure about a hypothesis is called hypothesis testing.
• For example, in the portland cement experiment, we may think that the mean tension bond strengths of the two mortar formulations are equal. This may be stated formally as $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$.

Power = the probability of rejecting the null hypothesis, $H_0$, when the alternative hypothesis, $H_1$, is true. Equivalently, Power = 1 − β, where β is the probability of a type II error.

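2) The two-sample t-Test The appropriate test statistic for comparing two treatment means in a completely randomized design is $t_0 = \dfrac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, where $\bar{y}_1$ and $\bar{y}_2$ are the sample means and $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$ is the pooled estimate of the common variance, computed from the sample variances $S_1^2$ and $S_2^2$.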

To determine whether to reject $H_0: \mu_1 = \mu_2$, we compare $t_0$ to the t distribution with $n_1 + n_2 - 2$ degrees of freedom. If $|t_0| > t_{\alpha/2,\, n_1+n_2-2}$, where $t_{\alpha/2,\, n_1+n_2-2}$ is the upper α/2 percentage point of the t distribution with $n_1 + n_2 - 2$ degrees of freedom, we reject $H_0$ and conclude that the mean strengths of the two formulations of portland cement mortar differ. This test procedure is called the two-sample t-test. For the one-sided alternative hypothesis $H_1: \mu_1 > \mu_2$, $H_0$ would be rejected if $t_0 > t_{\alpha,\, n_1+n_2-2}$. For $H_1: \mu_1 < \mu_2$, $H_0$ would be rejected if $t_0 < -t_{\alpha,\, n_1+n_2-2}$.
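The two-sample t-test with a pooled variance estimate can be sketched as follows. The two samples are invented for illustration (they are not the Table 2.1 data), the function name is hypothetical, and the critical value is read from a standard t table rather than computed.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2):
    """Return (t0, degrees of freedom) for the pooled two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = variance(sample1)  # sample variance S1^2 (divisor n1 - 1)
    s2_sq = variance(sample2)  # sample variance S2^2 (divisor n2 - 1)
    # Pooled variance estimate Sp^2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    t0 = (mean(sample1) - mean(sample2)) / math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t0, n1 + n2 - 2


# Hypothetical observations from two factor levels.
level_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
level_2 = [2.0, 3.0, 4.0, 5.0, 6.0]

t0, df = pooled_t_statistic(level_1, level_2)

# Upper 0.025 percentage point of t with 8 degrees of freedom (from a t table).
t_crit = 2.306
reject_h0 = abs(t0) > t_crit  # two-sided test at alpha = 0.05
print(t0, df, reject_h0)
```

With these made-up samples the statistic does not exceed the critical value, so the two-sided test would not reject the null hypothesis.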

Example: From the portland cement data,

3) P-values One way to report the results of a hypothesis test is to state that the null hypothesis was or was not rejected at a specified α-value or level of significance. For example, in the portland cement mortar formulation, we can say that $H_0: \mu_1 = \mu_2$ was rejected at the 0.05 level of significance. This is an inadequate conclusion because it gives no idea of where the computed value lies within the rejection region. Moreover, some decision makers might be uncomfortable with α = 0.05. To overcome these difficulties, the P-value approach is used.

The P-value is the smallest level of significance that would lead to rejection of the null hypothesis, that is, the smallest level α at which the data are significant. Reporting it lets the reader judge how significant the data are. It is not easy to compute an exact P-value by hand; however, it can be approximated from tables. For the portland cement mortar example, the degrees of freedom are 18. In the t-distribution table, the smallest tail area probability listed is 0.0005, with critical value $t_{0.0005,18}$. Now $|t_0| > t_{0.0005,18}$ ($H_0$ is rejected), so, because the alternative hypothesis is two-sided, the P-value must be less than 2(0.0005) = 0.001.
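Where a table only bounds the P-value, it can also be approximated numerically. The sketch below is one possible approach using only the Python standard library: it integrates the t density with Simpson's rule. The function names and step count are assumptions chosen for illustration, not part of the original slides.

```python
import math


def t_pdf(x, df):
    """Probability density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=20000):
    """Approximate P(T > t) for t >= 0.

    By symmetry P(T > 0) = 0.5, so the tail is 0.5 minus the integral of the
    density from 0 to t, evaluated here with Simpson's rule (steps is even).
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def two_sided_p_value(t0, df):
    """P-value for a two-sided alternative: twice the upper-tail area of |t0|."""
    return 2 * t_upper_tail(abs(t0), df)


# Example: a statistic equal to the tabulated 0.025 point for 18 degrees of
# freedom should give a two-sided P-value of about 0.05.
print(two_sided_p_value(2.101, 18))
```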

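4) Normal probability plot
• A normal probability plot is a graphical technique for determining whether sample data conform to a hypothesized distribution, based on a subjective visual examination of the data.
• How to interpret?
• How to construct? Rank the observations from smallest to largest and plot the jth ranked observation against its cumulative frequency (j − 0.5)/n, where j = 1, 2, 3, …, n.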
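The construction step can be sketched in a few lines. The function name and the sample values are hypothetical; the sketch returns, for each ranked observation, the cumulative frequency (j − 0.5)/n and the corresponding standard normal quantile, which is what would be plotted on normal probability paper.

```python
from statistics import NormalDist


def normal_plot_points(data):
    """Return (observation, cumulative frequency, normal quantile) per ranked point."""
    ordered = sorted(data)
    n = len(ordered)
    points = []
    for j, y in enumerate(ordered, start=1):
        cum_freq = (j - 0.5) / n               # plotting position of the jth point
        z = NormalDist().inv_cdf(cum_freq)     # standard normal quantile
        points.append((y, cum_freq, z))
    return points


# Hypothetical sample; if the data are roughly normal, the observations
# plotted against z should fall close to a straight line.
for y, cum_freq, z in normal_plot_points([3, 1, 2, 5, 4]):
    print(y, cum_freq, round(z, 3))
```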

Choice of sample size The choice of sample size and the probability of type II error, β, are closely related. Suppose we are testing $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$, and that the means are not equal, so that $\delta = \mu_1 - \mu_2 \neq 0$. Because $H_0: \mu_1 = \mu_2$ is not true, we are concerned about wrongly failing to reject $H_0$. β depends on the true difference in means, δ. The graph of β versus δ is called the operating characteristic curve, or O.C. curve. Generally, the β error decreases as the sample size increases, so a given δ is easier to detect with a larger sample.

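Example of an O.C. curve for the case where the variances $\sigma_1^2$ and $\sigma_2^2$ are unknown but equal, and α = 0.05. The horizontal axis is $d = |\mu_1 - \mu_2| / (2\sigma) = |\delta| / (2\sigma)$. From the curve:
• The greater the difference in means, the smaller the β error.
• As the sample size increases, β gets smaller.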

How to use the O.C. curve to calculate sample size? Suppose that δ = 0.1; therefore $d = |\delta| / (2\sigma) = 0.05/\sigma$. If σ = 0.25, then d = 0.2. If we want to reject the null hypothesis 95% of the time when $\mu_1 - \mu_2 = 0.1$, then β = 0.05 and d = 0.2 yield n* = 15 on the curve. Since $n^* = 2n - 1$, we have $n = (n^* + 1)/2 = 8$.
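The arithmetic around the O.C. curve can be checked with a short sketch. The value n* = 15 is still read from the curve (it is taken from the slide, not computed here), and the helper names are hypothetical. The relation n* = 2n − 1 is the one assumed for this family of curves.

```python
import math


def oc_parameter(delta, sigma):
    """Horizontal-axis parameter of the O.C. curve: d = |delta| / (2 * sigma)."""
    return abs(delta) / (2 * sigma)


def sample_size_from_n_star(n_star):
    """Convert the chart value n* = 2n - 1 into the per-sample size n."""
    return math.ceil((n_star + 1) / 2)


d = oc_parameter(delta=0.1, sigma=0.25)   # d for the slide's example
n = sample_size_from_n_star(15)           # n* = 15 is read from the O.C. curve
print(d, n)
```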

Confidence intervals
• A confidence interval is an interval within which the value of the parameter or parameters in question would be expected to lie.
• Recall an interval such as $L \le \mu \le U$, constructed so that $P(L \le \mu \le U) = 1 - \alpha$.

• L and U are called the lower and upper confidence limits.
• 1 − α is called the confidence coefficient. If α = 0.05, Equation 8.29 is called a 95% confidence interval for μ.

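How to calculate a confidence interval? $\bar{y}_1 - \bar{y}_2 - t_{\alpha/2,\, n_1+n_2-2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar{y}_1 - \bar{y}_2 + t_{\alpha/2,\, n_1+n_2-2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ is a 100(1 − α) percent confidence interval for $\mu_1 - \mu_2$, where $S_p$ is the pooled standard deviation of the two samples.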
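A minimal sketch of this interval, assuming equal variances and a critical value supplied from a t table, is shown below. The samples and the function name are hypothetical and are not the portland cement data.

```python
import math
from statistics import mean, variance


def diff_confidence_interval(sample1, sample2, t_crit):
    """Return (lower, upper) limits for mu1 - mu2 using the pooled variance.

    t_crit is the upper alpha/2 point of t with n1 + n2 - 2 degrees of
    freedom, looked up by the caller.
    """
    n1, n2 = len(sample1), len(sample2)
    sp_sq = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    half_width = t_crit * math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    diff = mean(sample1) - mean(sample2)
    return diff - half_width, diff + half_width


# Hypothetical samples; 2.306 is the tabulated t value for alpha/2 = 0.025, 8 df.
lower, upper = diff_confidence_interval(
    [1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 6.0], t_crit=2.306
)
print(lower, upper)
```

Because this interval contains zero, it agrees with a two-sided test at the same α that fails to reject equality of the means.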

Example From the portland cement mortar example discussed earlier, the 95% confidence interval estimate for the difference in mean tension bond strength has a half-width of 0.27 kgf/cm². In other words, the difference in mean strength is estimated by $\bar{y}_1 - \bar{y}_2$, and the accuracy of this estimate is ±0.27 kgf/cm².

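The case where $\sigma_1^2 \neq \sigma_2^2$ If we are testing $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$ and cannot assume the variances are equal, the test statistic becomes $t_0 = \dfrac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$, with the degrees of freedom calculated as $\nu = \dfrac{\left(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}\right)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}}$.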
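For the unequal-variance case, a sketch using the Welch-Satterthwaite approximation for the degrees of freedom is given below. This is the commonly used form and is assumed here to be the one the slide intends; the samples and function name are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (t0, approximate degrees of freedom) without pooling variances."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # S1^2 / n1
    v2 = variance(sample2) / n2  # S2^2 / n2
    t0 = (mean(sample1) - mean(sample2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (generally not an integer)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t0, df


# Hypothetical samples with visibly different spreads.
t0, df = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
print(t0, df)
```

The resulting degrees of freedom are usually rounded down before looking up a critical value in a t table.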

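The case where $\sigma_1^2$ and $\sigma_2^2$ are known If both variances are known, then the hypotheses $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$ may be tested using the statistic $Z_0 = \dfrac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$, which follows the standard normal distribution if $H_0$ is true. $H_0$ is rejected if $|Z_0| > Z_{\alpha/2}$.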
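When both variances are known, the test reduces to a standard normal comparison, which the Python standard library can evaluate directly. The sketch below uses invented samples and assumed known variances; the function name is hypothetical.

```python
import math
from statistics import NormalDist, mean


def z_test_known_variances(sample1, sample2, var1, var2):
    """Return (Z0, two-sided P-value) when both population variances are known."""
    n1, n2 = len(sample1), len(sample2)
    z0 = (mean(sample1) - mean(sample2)) / math.sqrt(var1 / n1 + var2 / n2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))
    return z0, p_value


# Hypothetical samples; the variances 2.5 are assumed to be known in advance.
z0, p_value = z_test_known_variances(
    [1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 6.0], var1=2.5, var2=2.5
)
alpha = 0.05
reject_h0 = abs(z0) > NormalDist().inv_cdf(1 - alpha / 2)  # compare with Z_{alpha/2}
print(z0, p_value, reject_h0)
```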

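Comparing a single mean to a specified value If we are testing $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$ and the variance $\sigma^2$ is known, the test statistic is $Z_0 = \dfrac{\bar{y} - \mu_0}{\sigma/\sqrt{n}}$, and the 100(1 − α) percent confidence interval is $\bar{y} - Z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{y} + Z_{\alpha/2}\,\sigma/\sqrt{n}$.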

SUMMARY

Regression model

Regression model & Empirical model Suppose there is a single dependent variable or response, y, that depends on k independent or regressor variables, for example $x_1, x_2, x_3, \ldots, x_k$. The relationship between y and the k regressors is characterized by a mathematical model called a regression model. The regression model is the basis of an empirical model. *An empirical model is created from experimental observations.

Linear Regression Model Suppose we wish to develop an empirical model which relates the viscosity of a polymer, y, to the temperature, $x_1$, and the catalyst feed rate, $x_2$: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$. This is a multiple linear regression model. Why? Because it has more than one regressor and is linear in the parameters. Here β = regression coefficients, x = predictor variables or regressors, and ε = random error. In general, any regression model that is linear in the parameters is a linear regression model, regardless of the shape of the surface that it generates (for example, a model with interaction terms). Estimating the parameters in a multiple linear regression model is called model fitting. The typical method is the method of least squares.
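A least squares fit of a two-regressor model can be sketched with plain Python by forming and solving the normal equations (X'X)b = X'y. The data below are invented so that the fit is exact (they are not viscosity measurements), and the function name is hypothetical; in practice a numerical library would normally be used.

```python
def fit_linear_model(rows, y):
    """Least squares estimates [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    X = [[1.0] + list(r) for r in rows]  # add the intercept column
    p = len(X[0])
    # Normal equations: A = X'X and g = X'y
    A = [[sum(xi[r] * xi[c] for xi in X) for c in range(p)] for r in range(p)]
    g = [sum(xi[r] * yi for xi, yi in zip(X, y)) for r in range(p)]
    # Gaussian elimination with partial pivoting on the augmented matrix [A | g]
    M = [A[r] + [g[r]] for r in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, p):
            factor = M[r][col] / M[col][col]
            for c in range(col, p + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution
    b = [0.0] * p
    for r in reversed(range(p)):
        b[r] = (M[r][p] - sum(M[r][c] * b[c] for c in range(r + 1, p))) / M[r][r]
    return b


# Hypothetical (x1, x2) settings and responses generated from y = 2 + 3*x1 - x2.
settings = [(1, 1), (2, 1), (1, 2), (3, 2), (2, 3)]
response = [4.0, 7.0, 3.0, 9.0, 5.0]
coefficients = fit_linear_model(settings, response)
print(coefficients)
```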

Least squares estimation of the parameters

Matrix Approach To Multiple Linear Regression

Properties of the least squares estimators and estimation of σ 2

Hypothesis Testing In Multiple Regression
• Test for significance of regression
• Test on individual regression coefficients and groups of coefficients

Test for significance of regression

Test on individual regression coefficients and groups of coefficients The model might be more effective with the inclusion of additional variables or with the deletion of one or more regressors. To decide, we test individual regression coefficients or groups of regression coefficients.

Why C 22 = ? Because of covariance matrix, C

Confidence interval in multiple regression
• On individual regression coefficients
• On the mean response

Confidence interval in multiple regression- On individual regression coefficient

Confidence interval in multiple regression- On the mean response

Thank you…

Quiz
1. Discuss one function of a regression model.
2. Define residual. List 2 plots that can be constructed using residual values.
3. Justify the importance of the test for significance of a regression model. If $H_0$ is accepted, what does it mean?
4. An experiment was conducted to examine the effect of T and P on growth. Given $f_0 = MS_R / MS_E$, $MS_E = 4.356$, and an f value from the f table of 2.3, propose an appropriate conclusion if the $MS_R$ value is less than 4.