Statistical tests for replicated experiments


1 Statistical tests for replicated experiments
Normal probability plots are a less formal diagnostic tool for detecting effects. F-tests and t-tests provide a statistical test of factor effects.

2 Statistical tests for replicated experiments
Statistical tests are possible for unreplicated designs (unreplicated pilot studies are essential tools in sample size calculations). We will first focus on statistical tests for replicated designs.

3 Statistical tests for replicated experiments--Example
Response: pulse rate of subject.
Factors: Treatment (Energy Drink, Placebo), Setting (Moderate, Difficult), Machine (Stair climber, Recumbent bike).
n=4. Open Minitab worksheet.

4 Statistical tests for replicated experiments--Example
Point out website notes on this.

5 Statistical tests for replicated experiments--Example
T=8.25 and the variance of the first cell being 8.25 is strictly a coincidence.

6 Statistical tests for replicated experiments--Example
AB is the largest negative effect, but it’s on the line.

7 Statistical tests for replicated experiments
Effect sizes depend on the measurement scale. Statistical tests are based on standardized effects. To compute standardized effects, start with an estimate of experimental error. E.g., pulse varies depending on the unit of measurement. We need to know the variation in a single observation; here we have 8 estimates already (one sample variance per cell)!

8 Statistical tests for replicated experiments
Experimental error can be summarized by the square root of the variance of the background noise (the standard deviation). The experimental error measures variation in a single observation.

9 Statistical tests for replicated experiments
The variance is best estimated by the Mean Square for Pure Error (MSPE). Like the pooled sample variance in a two-sample test, MSPE assumes the same magnitude of error in every cell.
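The pooling idea can be sketched in a few lines of Python. This is only an illustration: the function name `mean_square_pure_error` and the pulse readings are hypothetical (they are not the course data), and the sketch assumes each cell holds replicate observations taken at one factor combination.

```python
from statistics import variance

def mean_square_pure_error(cells):
    """Pool the within-cell sample variances, weighting each by its degrees of freedom."""
    ss_pure = sum((len(cell) - 1) * variance(cell) for cell in cells)
    df_pure = sum(len(cell) - 1 for cell in cells)
    return ss_pure / df_pure, df_pure

# Hypothetical pulse readings (not the course data): 4 cells, 3 replicates each.
cells = [
    [70, 72, 74],
    [80, 83, 86],
    [90, 91, 92],
    [60, 64, 68],
]
mspe, df = mean_square_pure_error(cells)
print(f"MSPE = {mspe:.2f} on {df} df")
```

With equal replication in every cell, as in the example design, this is simply the average of the cell variances.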

10 Statistical tests for replicated experiments--Example
The standard deviation for each run is ~3 beats per minute. The standard deviation does not depend on factor settings.

11 Statistical tests for replicated experiments
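While the standard deviation for a single response is the square root of MSPE, the standard deviation of an effect (its standard error) is $\sqrt{MSPE/(n\,2^{k-2})}$. The smaller value comes from the hidden replication in a factorial experiment: each effect is a difference between two averages of $n\,2^{k-1}$ observations.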

12 Statistical tests for replicated experiments
We divide an effect in a k-factor experiment with n replications (e.g., A) by its standard error to compute a t-test statistic: $T = A / \sqrt{MSPE/(n\,2^{k-2})}$. T represents the number of standard errors the effect lies from 0. Effects are averages and so have a smaller standard error than a single observation.

13 Statistical tests for replicated experiments
Test statistics for other effects are computed similarly. U-do-it: Calculate the T-statistics of all effects for the Exercise data. Use ANOVAEXAMPLEDATA.mpj for answers. With k=3 and n=4, use the same denominator for all effects: $\sqrt{MSPE/(n\,2^{k-2})} = \sqrt{10.62/(2 \cdot 4)} = 1.152$.
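The arithmetic for the common denominator can be checked with a short Python sketch. The values of k, n and MSPE come from the slide; the effect estimates in `effects` are hypothetical placeholders, so the real answers still have to come from the worksheet.

```python
import math

k = 3          # number of two-level factors (from the slide)
n = 4          # replicates per cell (from the slide)
mspe = 10.62   # mean square for pure error (from the slide)

# Standard error shared by every effect in a replicated 2^k design.
se_effect = math.sqrt(mspe / (n * 2 ** (k - 2)))
print(round(se_effect, 3))  # 1.152

# Hypothetical effect estimates, for illustration only (not the course results).
effects = {"A": 6.0, "B": -1.2, "AB": -2.3}

# Each T-statistic is the effect divided by the shared standard error.
t_stats = {name: estimate / se_effect for name, estimate in effects.items()}
for name, t in t_stats.items():
    print(f"{name}: T = {t:.2f}")
```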

14 Statistical tests for replicated experiments
When an effect is negligible, T has a t-distribution. The shape of the t-distribution curve depends on the number of replicates and number of effects ("degrees of freedom" = $2^k(n-1)$). The t-distribution curves have slightly more spread than the bell-shaped ("normal") curve. In our example, $2^3(4-1) = 24$ df.

15 Statistical tests for replicated experiments--t curve for 3-factor design
T with 24 df, and T with 8 df.

16 Statistical tests for replicated experiments
If |T| is larger than the 99.5th or 97.5th percentile of the t distribution, an effect is significant (at the two-sided 0.01 or 0.05 level, respectively). These percentiles are commonly found in textbooks (but please use a computer package instead).

17 Statistical tests for replicated experiments--t critical value for 3-factor design (n=4)
2.064=t(.975;24). Do t(.975) in Minitab. Refer to handout.

18 Statistical tests for replicated experiments
Sometimes, twice the area to the right of |T| is reported as a p-value. Small p-values (<0.05 or <0.01 typically) suggest that a standardized effect is distinguishable from background noise. You definitely need a computer to compute p-values. In the following example, the p-value for the M effect is 2*.122=.244.
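For readers without Minitab at hand, the critical value and the two-sided p-value can be approximated with the Python standard library. This is a rough numerical sketch, not how a statistics package does it: the helper names `t_cdf`, `t_quantile` and `two_sided_p` are made up for this example, the CDF is obtained by Simpson's rule after substituting t = sqrt(df)·tan(θ), and the quantile by bisection.

```python
import math

def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, by Simpson's rule after t = sqrt(df) * tan(theta)."""
    upper = math.atan(abs(x) / math.sqrt(df))
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * math.cos(i * h) ** (df - 1)
    const = math.gamma((df + 1) / 2) / (math.sqrt(math.pi) * math.gamma(df / 2))
    half = const * total * h / 3  # P(0 < T < |x|)
    return 0.5 + half if x >= 0 else 0.5 - half

def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on the CDF."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def two_sided_p(t_stat, df):
    """Twice the area to the right of |T|."""
    return 2 * (1 - t_cdf(abs(t_stat), df))

df = 2 ** 3 * (4 - 1)                    # 24 df for k=3, n=4
print(round(t_quantile(0.975, df), 3))   # 2.064
print(round(two_sided_p(2.064, df), 3))  # 0.05
```

An effect is then flagged when |T| exceeds the critical value, or equivalently when its p-value falls below the chosen level.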

19 Statistical test for replicated experiments--Example

20 Statistical tests for replicated experiments--Example
U-do-it: Compute p-values for the remaining effects. Which effects are significant? Are these the same effects that the probability plot detected? Compare |T| to t(.975;24).

21 Statistical tests for replicated experiments
F tests for individual effects are equivalent to t-tests ($F_{1,n}$ has the same distribution as $t_n^2$). F tests allow several comparisons to be tested simultaneously. F distributions have an additional parameter (numerator df) representing the number of independent comparisons being made. Show simultaneous effects of main effects, etc., in Minitab.
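The equivalence for a single effect can be written out as a short derivation. It assumes the usual sum of squares for one effect in a replicated $2^k$ design, with $\hat{E}$ denoting the effect estimate (a symbol introduced here for illustration); the numerical check uses the example's 24 error df.

```latex
\begin{align*}
SS_{\text{effect}} &= n\,2^{k-2}\,\hat{E}^{2} \quad (1 \text{ df}) \\
F &= \frac{SS_{\text{effect}}/1}{MSPE}
   = \frac{\hat{E}^{2}}{MSPE/(n\,2^{k-2})}
   = \left(\frac{\hat{E}}{\sqrt{MSPE/(n\,2^{k-2})}}\right)^{2}
   = T^{2} \\
F_{.95;\,1,\,24} &= \left(t_{.975;\,24}\right)^{2} = 2.064^{2} \approx 4.26
\end{align*}
```

So rejecting when $F$ exceeds its 95th percentile is the same decision as rejecting when $|T|$ exceeds the 97.5th percentile of the t distribution.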

22 Statistical tests for replicated experiments
An F statistic is always positive. The F distribution's mean is always greater than 1 (for denominator df d > 2, the mean is d/(d-2)).

23 Statistical tests for replicated experiments
Hypothesis testing can be extended to combine estimates of error from both pure error and negligible effects. Negligible effects can be selected a priori or from effects plots. Degrees of freedom for t-tests and F-tests should be adjusted accordingly. Be careful of data snooping.

24 Statistical tests for replicated experiments
Error estimated from negligible terms (Lack of Fit) is similar to MSPE.
Source DF SS MS
Residual Error
Lack of Fit
Pure Error

25 Testing without Replication Lenth’s Test
Lenth (1989) developed a somewhat more formal test of effects. Denote the effects by $e_i$, i=1,…,m ($m = 2^k - 1$ for $2^k$ designs). We assume that the $e_i$'s are iid $N(0, \tau^2)$, where $\tau$ is their common standard error.

26 Experimental Error Estimates
Lenth develops two estimates of the common standard error, $\tau$, of the $e_i$'s: $s_0$ and PSE. S-naught ($s_0$) is tau-hat from Minitab output. PSE stands for pseudo-standard error.
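The formulas for the two estimates did not survive in this transcript. The sketch below uses the definitions as usually stated for Lenth's method, which should be checked against the original paper or the Minitab documentation: $s_0 = 1.5 \cdot \text{median}|e_i|$, and PSE $= 1.5 \cdot \text{median}$ of those $|e_i|$ smaller than $2.5\,s_0$. The function name `lenth_pse` and the effect values are hypothetical.

```python
from statistics import median

def lenth_pse(effects):
    """Return (s0, PSE) for a list of effect estimates from an unreplicated design."""
    abs_effects = [abs(e) for e in effects]
    s0 = 1.5 * median(abs_effects)
    # Trim effects that look active before re-estimating the noise level.
    trimmed = [a for a in abs_effects if a < 2.5 * s0]
    pse = 1.5 * median(trimmed)
    return s0, pse

# Hypothetical effects for a 2^3 design (m = 7); one is clearly large.
effects = [0.5, -1.0, 1.5, -2.0, 10.0, 0.8, -0.3]
s0, pse = lenth_pse(effects)
print(f"s0 = {s0:.2f}, PSE = {pse:.2f}")
```

In this made-up example the large effect inflates nothing, because the trimming step removes it before PSE is computed; that is the sense in which PSE is the more robust of the two.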

27 Margins of Error
Though both are consistent estimates, PSE is more robust. The following terms (margins of error) are used to test effects: the margin of error (ME) and the simultaneous margin of error (SME). The df is ad hoc. We use a Sidak-like correction for SME.

28 Testing with Margins of Error
The df term was developed from a study of the empirical distribution of $PSE^2$. ME is a $1-\alpha$ confidence bound for the absolute value of a single effect. SME is an exact (since the effects are independent) simultaneous $1-\alpha$ confidence bound for all m effects.
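The margin-of-error formulas themselves are missing from this transcript. As commonly stated for Lenth's method (an assumption here, to be verified against Lenth (1989) or the course handout), they take the following form, with $d$ the ad hoc degrees of freedom and $\gamma$ the Sidak-style adjusted percentile:

```latex
\begin{align*}
d &= m/3 \\
ME &= t_{1-\alpha/2;\,d} \times PSE \\
SME &= t_{\gamma;\,d} \times PSE, \qquad \gamma = \frac{1 + (1-\alpha)^{1/m}}{2}
\end{align*}
```

An effect whose absolute value exceeds ME is flagged individually; one that exceeds SME remains flagged after allowing for all m comparisons.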

