Inverse Transformation Scale Experimental Power Graphing


ITSEPG
Inverse Transformation Scale Experimental Power Graphing
USER GUIDE
Prepared by:
Dr. Ricaro V. Nunes, Universidade Estadual do Oeste do Parana
Dr. Gene M. Pesti, University of Georgia
Version 1.0

Home
❷ Spreadsheet Introduction
❸ Spreadsheet Calculation of Test Power
❹ Spreadsheet Inputs & Notes
❺ Spreadsheets EP Data Inverse Transformation

Introduction
Determining experimental power for planning purposes is, by its very nature, a risky business. Experiments are planned because we don't know what the results will be! We hope that the means and variances of control and treated groups will be similar to those in previous experiments, but that can't be assumed. There is always the possibility that any imposed treatments will affect individuals differently, changing the variation between treatment groups. Sometimes the data is not "normally" distributed, e.g., frequency distributions do not resemble bell-shaped curves. To properly analyze non-normal data, some transformation is needed to make it appear normal. QQ Plots may be helpful to see if a given transformation will make the data more "normal" and improve the accuracy of probability estimates using analyses of variance.
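As a rough illustration of the QQ Plot check described above, the following sketch computes the squared correlation between sorted data and normal quantiles, before and after a log10 transformation. It is only a sketch under stated assumptions: the data are synthetic right-skewed values rather than real measurements, the function name qq_r2 is hypothetical, and the plotting positions (i + 0.5) / n are one common choice, not necessarily the one used by the workbook.

```python
import math
import random
from statistics import NormalDist, mean

def qq_r2(values):
    """Squared correlation between sorted data and normal quantiles (QQ plot R^2)."""
    xs = sorted(values)
    n = len(xs)
    # Theoretical normal quantiles at plotting positions (i + 0.5) / n (an assumption).
    qs = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mx, mq = mean(xs), mean(qs)
    sxy = sum((x - mx) * (q - mq) for x, q in zip(xs, qs))
    sxx = sum((x - mx) ** 2 for x in xs)
    sqq = sum((q - mq) ** 2 for q in qs)
    return sxy ** 2 / (sxx * sqq)

random.seed(1)
# Synthetic, right-skewed data (hypothetical values for illustration only).
raw = [random.lognormvariate(1.0, 0.8) for _ in range(210)]
logged = [math.log10(x) for x in raw]

r2_raw = qq_r2(raw)
r2_log = qq_r2(logged)
print(f"QQ R^2 raw: {r2_raw:.3f}, after log10: {r2_log:.3f}")
```

A higher R² after the transformation suggests the transformed data lie closer to a straight line on the QQ Plot, that is, closer to normal.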

Introduction
Whenever non-normal data is transformed to make it normal for analyses of variance, the scales are changed and it becomes difficult to visualize how differences in the original data relate to differences in the transformed data. The example below is for Blood Uric Acid levels in chickens. The QQ Plot R² is improved from 0.91 to 0.99 by taking the log10 of the original data. When the log of an individual value (the Minimum or Maximum, for instance) is taken and 10 is raised to that power (the inverse transformation), the result is exactly the same as the starting value. However, this is not true for the mean or other descriptive statistics. Not only are the mean and the inverse transformed mean different, but the lower and upper 90% confidence limits for the inverse transformed mean are no longer equally distant from the mean (see the table below).

Descriptive statistics for the blood uric acid of 210 broiler chickens (X)

Variable    Minimum  Maximum  Std Error  Std Deviation  Lower 90% CL for Mean  Interval       Mean   Upper 90% CL for Mean
X           1.711    9.374    0.092      1.327          3.550                  0.151          3.701  3.852
log10X = T  0.233    0.972    0.010      0.145          0.527                  0.017          0.543  0.560
10^T        -        -        1.023      1.397          3.364                  0.131 / 0.136  3.495  3.631

For 10^T, the interval is 0.131 below the mean and 0.136 above it; no Minimum or Maximum is listed for that row.
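The asymmetry of the inverse transformed confidence limits can be reproduced approximately from the rounded summary values in the table. This is a minimal sketch, assuming a normal (z) critical value for the 90% interval; the guide does not state which critical value it used, so the results differ slightly from the tabulated ones.

```python
import math
from statistics import NormalDist

n = 210          # number of birds, from the table
mean_t = 0.543   # mean of T = log10(X), rounded value from the table
sd_t = 0.145     # standard deviation of T, rounded value from the table

se_t = sd_t / math.sqrt(n)
z = NormalDist().inv_cdf(0.95)   # two-sided 90% interval, normal approximation (assumption)

lower_t = mean_t - z * se_t
upper_t = mean_t + z * se_t

# Inverse transformation: raise 10 to each value.
mean_x, lower_x, upper_x = 10 ** mean_t, 10 ** lower_t, 10 ** upper_t
below = mean_x - lower_x   # distance from lower limit to mean
above = upper_x - mean_x   # distance from mean to upper limit

print(f"Transformed scale: {lower_t:.3f} .. {mean_t:.3f} .. {upper_t:.3f} (symmetric)")
print(f"Original scale:    {lower_x:.3f} .. {mean_x:.3f} .. {upper_x:.3f}")
print(f"Interval below the mean: {below:.3f}, above the mean: {above:.3f}")

# An individual value survives the round trip unchanged.
round_trip = 10 ** math.log10(9.374)
```

The limits are symmetric on the log10 scale, but the upper interval is wider than the lower one after the inverse transformation, which is the pattern the table shows.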

Introduction
How then to relate differences in the transformed results to differences in the original data? Changes in the inverse transformed data are approximately linear functions of the changes in the original data (tables and figures below). With the log10 transformation, a 1% difference in the original data corresponds to approximately a 1.6% difference in the inverse transformed data. If the sine of the data has to be taken to make them normal, a 1% difference in the original data results in about a 1.074% change when the data are inverse transformed back to the original scale. Since the scales are related linearly, predicting the number of replicates needed to find a given difference should be possible by the methods used here. Note that the linear relationship is not a perfect correlation; there is an intercept. Nonetheless, it is always important to remember that the descriptive statistics for the transformed data cannot be directly converted to the original scale. For example, the standard deviation for the transformed data cannot just be transformed to the original scale. It would be more appropriate to determine the lower and upper confidence limits in the transformed data and invert those to the original scale.

Difference between original data vs transformation data

Log10

X      log10(X)  10^(log10(X))  Δ X (%)  Δ 10^(log10(X)) (%)
3.701  0.568     -              -        0.000
3.886  0.597     3.951          5        6.762
4.071  0.625     4.218          10       13.981
4.256  0.654     4.504          15       21.687
4.441  0.682     4.808          20       29.916
4.626  0.710     5.133          25       38.700
4.811  0.739     5.480          30       48.078

Sin(x)

X      sin(X)  asin(sin(X))  Δ X (%)  Δ asin(sin(X)) (%)
3.701  0.358   3.662         -        0.000
3.886  0.376   3.855         5        5.256
4.071  0.394   4.049         10       10.553
4.256  0.412   4.244         15       15.895
4.441  0.430   4.442         20       21.284
4.626  0.448   4.641         25       26.725
4.811  0.466   4.842         30       32.220

Figures: % Difference for Log10 and for Sin(x)
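The Log10 values above appear consistent with raising the transformed baseline, log10(3.701), by a given percentage and then applying the inverse transformation. The sketch below follows that reading; it is an interpretation of the table, not a documented description of how the workbook computes it, and the function name is hypothetical.

```python
import math

baseline = 3.701             # mean of the original data, from the table
t0 = math.log10(baseline)    # transformed baseline

def back_transformed_change(pct):
    """Percent change in 10**T when T is raised by pct percent (assumed reading of the table)."""
    t = t0 * (1 + pct / 100)
    return (10 ** t / baseline - 1) * 100

rows = [(pct, back_transformed_change(pct)) for pct in range(0, 35, 5)]
for pct, change in rows:
    print(f"change in T: {pct:2d}%   change after inverse transformation: {change:7.3f}%")

# Average slope over the 0-30% range, comparable to the "approximately 1.6%" in the text.
slope = back_transformed_change(30) / 30
```

Under this reading, the average slope over the tabulated range is about 1.6, but the individual steps grow slightly, so the relationship is only approximately linear.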

Introduction
When estimating experimental power, it is important to know the variation expected in the results. It is just as important to know the mean of the control or normal reference group. The example in the graph below shows how important it is to know both the mean and standard error when estimating experimental power. For planning purposes, it may be best to use the expected mean of the control or reference group. Then the estimated power will be for differences from that mean.

STEP 1 - Calculation of Test Power - For example
Null distribution: mean of 2.66, SD of 0.158
Alternative distribution: mean of 2.766 (+4%), same SD of 0.158
Reps = 4

STEP 1 - Calculation of Test Power - For example
Better Experimental Power
The results are different at 5%

STEP 2 - NOTES
❶ The two graphs included on each spreadsheet display the power of a t-test based on two samples of size n Reps each when trying to separate two distributions with given means and variances (null and alternative).
❷ Both graphs allow the choice of % difference in means of the null and alternative distributions and of the baseline (null) standard deviation.
❸ The standard deviation of the alternative distribution is then either calculated based on constant CV (Constant CV table) or set equal to that of the null distribution (Constant Variance table).
❹ The calculation of test power consists of determining the area under the probability density function of the alternative distribution outside of the bounds determining the 95% confidence interval of the null distribution (see explanation and figure).
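The calculation described in note ❹ can be sketched as follows, using the STEP 1 example values (null mean 2.66, SD 0.158, +4% difference, 4 reps). This is a simplified normal approximation, not the workbook's own t-test calculation: it assumes both distributions are sampling distributions of a mean with standard error SD / sqrt(Reps), and the function name approx_power is hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(null_mean, null_sd, pct_diff, reps, alpha=0.05, constant_cv=False):
    """Area of the alternative distribution outside the null's (1 - alpha) bounds."""
    alt_mean = null_mean * (1 + pct_diff / 100)
    # Constant CV: the SD scales with the mean. Constant variance: the SD is unchanged.
    alt_sd = null_sd * alt_mean / null_mean if constant_cv else null_sd
    root_n = math.sqrt(reps)
    # Assumption: work with distributions of sample means (standard errors).
    null = NormalDist(null_mean, null_sd / root_n)
    alt = NormalDist(alt_mean, alt_sd / root_n)
    lower = null.inv_cdf(alpha / 2)
    upper = null.inv_cdf(1 - alpha / 2)
    return alt.cdf(lower) + (1 - alt.cdf(upper))

power_4 = approx_power(2.66, 0.158, 4, reps=4)
power_16 = approx_power(2.66, 0.158, 4, reps=16)
power_4_cv = approx_power(2.66, 0.158, 4, reps=4, constant_cv=True)

print(f"Constant variance, 4 reps:  {power_4:.3f}")
print(f"Constant variance, 16 reps: {power_16:.3f}")
print(f"Constant CV, 4 reps:        {power_4_cv:.3f}")
```

Under these assumptions, more replicates narrow both distributions and raise the power, and the constant CV case gives a slightly wider alternative distribution than the constant variance case when the alternative mean is higher.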

STEP 3 - Working with ITSEPG - Input data
❶ Open the program and, after reading the information, click on the worksheet.

STEP 3 - Working with ITSEPG - Input data
Enter: Desired Test Power and Confidence Level
Confidence Level + Test Power = 1.00
These may vary (Confidence Level and Test Power): 0.01 and 0.99; 0.05 and 0.95; 0.10 and 0.90
"Depends on the rigor that the user wants"

STEP 3 - Working with ITSEPG - Input data
Enter: Mean and Standard Deviation (data from your experiment)

STEP 3 - Working with ITSEPG - Input data
Enter: % Difference in Means

STEP 3 - Working with ITSEPG - Input data
Enter: Number of Replicates
Again, change only the values in green.
IMPORTANT: Other changes occur automatically.

STEP 4 - Results
Example - Log10(x)
Results: data were inverse transformed to return to the original scale before graphing.
For example, with Confidence Level 0.05 and Desired Test Power 0.95, we have the following. With 10 replicates, the difference between means can be:
- Test Power 0.8: below 10%
- Test Power 0.9: above 10%
Graph panels: Constant CV, Constant Variance

STEP 4 - Results
Results: data were inverse transformed to return to the original scale before graphing.
Constant CV: 10 replicates are needed to find a difference between means (Test Power: 0.9).
Constant Variance: 8 replicates are needed to find a difference between means (Test Power: 0.9).
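For comparison with reading the number of replicates off the graphs, a common textbook normal approximation for two groups with equal variance gives the replicates per group needed to detect a given % difference. This is a sketch under stated assumptions, not the ITSEPG calculation: it works on the untransformed scale, it reuses the STEP 1 example values (mean 2.66, SD 0.158) purely for illustration, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def replicates_needed(mean, sd, pct_diff, alpha=0.05, power=0.90):
    """Replicates per group for a two-sided, two-sample comparison (normal approximation)."""
    delta = mean * pct_diff / 100                   # absolute difference to detect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)

for pct in (5, 10, 20):
    n = replicates_needed(2.66, 0.158, pct)
    print(f"{pct}% difference: about {n} replicates per group")
```

Halving the difference to be detected roughly quadruples the replicates required, which is why the choice of % Difference in Means matters so much at the planning stage.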

Results - Constant CV

Inputs: Confidence Level: 0.05, Target Mean: 0.75, Target SD: 0.03, Desired Test Power: 0.95
When the Test Power is 0.95 and the Confidence Level is 0.05, we found that:
With 6 replicates, we have an 18.85% difference between means.
With 10 replicates, we have a 14.815% difference between means.

Inputs: Confidence Level: 0.01, Target Mean: 0.75, Target SD: 0.03, Desired Test Power: 0.99
When the Test Power is 0.99 and the Confidence Level is 0.01, we found that:
With 6 replicates, we have a 31.826% difference between means.
With 8 replicates, we have a 23.027% difference between means.

Results - Constant Variance

Inputs: Confidence Level: 0.05, Target Mean: 0.75, Target SD: 0.03, Desired Test Power: 0.95
When the Test Power is 0.95 and the Confidence Level is 0.05, we found that:
With 6 replicates, we have an 18.85% difference between means.
With 10 replicates, we have a 14.815% difference between means.

Inputs: Confidence Level: 0.01, Target Mean: 0.75, Target SD: 0.03, Desired Test Power: 0.99
When the Test Power is 0.99 and the Confidence Level is 0.01, we found that:
With 6 replicates, we have a 27.35% difference between means.
With 10 replicates, we have a 14.815% difference between means.

Important points
❶ This ITSEPG workbook was designed for data needing a transformation and inverse transformation for estimating experimental power. The QPDOL.exe workbook may be helpful to determine if a transformation is needed to help normalize data for analysis, and if so, ITSEPG should be helpful for graphing the inverse transformed data.
❷ A basic assumption of Analysis of Variance is that variances are equal. While it may not be possible to prove variances are not equal, it is sometimes obvious that variation increases in proportion to the mean. For instance, as birds grow, they become more variable. Therefore, calculations in ITSEPG are produced for both constant absolute variation and constant coefficient of variation models.
❸ ITSEPG.xls is a planning tool. While it may be intuitive that having more replication improves the chances of finding statistically significant differences, in reality the determined p-value "is what it is." Planning an experiment with a large number of replicates should not influence the determined p-value and should not be interpreted as doing so.