Analysis of Chromium Emissions Data Nagaraj Neerchal and Justin Newcomer, UMBC and OIAA/OEI and Mohamed Seregeldin, Office of Air Quality Planning and.


1 Analysis of Chromium Emissions Data Nagaraj Neerchal and Justin Newcomer, UMBC and OIAA/OEI and Mohamed Seregeldin, Office of Air Quality Planning and Standards, EPA, RTP

2 Objective To develop a protocol (methodology) for obtaining confidence bounds for the “Mean Chromium Emissions” for each welding process and rod type combination. Incorporate all the data, including the averages, to the best of our ability.

3 About The Data
Three Welding Processes
–GMAW, SMAW, FCAW
Three Rod Types
–E308, E309, and E316
Multiple Sources of Data
–Some report individual measurements
–Some report only averages, without the original observations
–Units of reporting vary; all are converted to g/kg

4 Summary Statistics Note: Summary Statistics based only on observations with single measurement.

5 Combining Rod Types Combine E308+E316 because of the similar technology and small sample size Sample Sizes:

6 Summary Statistics After Combining Data for Rod Types Note: Summary Statistics based only on observations with single measurement.

7 Traditional Approaches
Assume Normality?
–Normality is not a good assumption for this data set at all
–Sample sizes are very small for certain combinations
–Bounds obtained assuming normality give meaningless results (e.g. negative bounds) when the data are not normally distributed
95% Confidence Intervals for the Mean:
Note: Summary Statistics based only on observations with single measurement.
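
To illustrate how a normal-theory interval can produce a negative bound, here is a minimal sketch. The four values are hypothetical right-skewed emission readings, not data from the study, and the t quantile is hard-coded for 3 degrees of freedom.

```python
import math
import statistics

# Hypothetical right-skewed emission values in g/kg (NOT from the study).
sample = [0.02, 0.03, 0.05, 0.90]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)   # sample standard deviation (n - 1 divisor)
t_crit = 3.182                  # 0.975 quantile of Student's t, 3 degrees of freedom

# Two-sided 95% interval for the mean under the normality assumption.
half_width = t_crit * sd / math.sqrt(n)
lower, upper = mean - half_width, mean + half_width

print(f"mean = {mean:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With one large value among a few small ones, the lower limit falls below zero, which is not a possible emission rate.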

8 Traditional Approaches
Transform the data to normality
–The optimal transformation for the Total Chromium data is different from the optimal transformation for the Chrom6 data.
–It is hard to transform the confidence bounds back to the original scale (the mean of the log is not the same as the log of the mean!)
Box-Cox Log-Likelihood Plots:
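
The back-transformation problem can be made concrete for the log transformation. The following is a sketch under the assumption that $Y = \log X$ is exactly normal with mean $\mu$ and variance $\sigma^2$; the slides do not state which transformation was finally chosen.

```latex
\begin{align*}
  \mathrm{E}[\log X] &\le \log \mathrm{E}[X]
      && \text{(Jensen's inequality)} \\
  \exp\bigl(\mathrm{E}[\log X]\bigr) &= e^{\mu}
      && \text{(median of } X\text{)} \\
  \mathrm{E}[X] &= e^{\mu + \sigma^{2}/2}
      && \text{(mean of } X\text{)}
\end{align*}
```

Exponentiating a confidence bound for $\mu$ therefore gives a bound for the median of $X$, not for its mean; the two differ by the factor $e^{\sigma^2/2}$.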

9 Traditional Approaches Weighted regression to incorporate the averages

10 Traditional Approaches
Weighted Regression
–The estimates have good properties (they are best linear unbiased estimators, BLUE) in general, not only for normal data
–But the confidence bounds are sensitive to the normality assumption, especially when the sample sizes are small, as in our case.
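
A minimal sketch of the weighting idea, assuming every individual measurement has the same variance: an average of n tests then has variance proportional to 1/n, so each reported value is weighted by its number of tests. The slides do not give the model details, so the intercept-only form below and the `records` values are illustrative assumptions.

```python
# Hypothetical records: (reported value in g/kg, number of tests behind it).
# A single measurement has n = 1; a reported average of five tests has n = 5.
records = [(0.12, 1), (0.08, 1), (0.10, 3), (0.15, 5)]


def weighted_mean(records):
    """Intercept-only weighted least squares: weights equal the test counts."""
    total_weight = sum(n for _, n in records)
    return sum(value * n for value, n in records) / total_weight


print(f"weighted mean = {weighted_mean(records):.4f} g/kg")
```

This yields the point estimate only; the confidence bounds around it still depend on the normality assumption.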

11 Traditional Approaches
Nonparametric Approaches?
–Nonparametric approaches usually use ranks. When only averages are reported, the information about ranks is lost completely. Therefore, the averages cannot be incorporated into nonparametric approaches.
Bootstrapping?
–Made popular by Bradley Efron in the 1980s
–Efron and Tibshirani (1993)
–Millard, S. P. and Neerchal, N. K. (2000)

12 Bootstrapping
What is Bootstrapping?
–Resampling the observed data
–It is a simulation-type method in which the observed data (not a mathematical model) are repeatedly sampled to generate representative data sets
–The only indispensable assumption is that “observations are a random sample from a single population”
–There are some fixes available when the single population assumption is violated, as in our case.
–Can be implemented in quite a few software packages, e.g. S-PLUS, SAS
–Millard and Neerchal (2000) give S-PLUS code

13 Bootstrapping - The Details
Data: $X = (X_1, X_2, X_3, \ldots, X_n)$. Statistic: $T = T(X)$.
rep #1: $X^{*(1)} = (X^*_1, X^*_2, X^*_3, \ldots, X^*_n)$, $T^*_1 = T(X^{*(1)})$
rep #2: $X^{*(2)} = (X^*_1, X^*_2, X^*_3, \ldots, X^*_n)$, $T^*_2 = T(X^{*(2)})$
...
rep #B: $X^{*(B)} = (X^*_1, X^*_2, X^*_3, \ldots, X^*_n)$, $T^*_B = T(X^{*(B)})$
Each replicate $X^{*(b)}$ is a sample of size $n$ drawn with replacement from the observed data. Bootstrapping inference is based on the distribution of the replicated values of the statistic: $T^*_1, T^*_2, \ldots, T^*_B$. For example, the bootstrap 95% upper confidence bound based on $T$ is given by the 95th percentile of the distribution of the $T^*$ values.
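
The resampling scheme above can be sketched in a few lines. This is an illustrative percentile bootstrap, not the S-PLUS code from Millard and Neerchal (2000); the function name, the number of replicates, and the sample values are assumptions made for the example.

```python
import math
import random
import statistics


def bootstrap_upper_bound(data, stat=statistics.mean, n_reps=2000,
                          level=0.95, seed=0):
    """Percentile-bootstrap upper confidence bound for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Each replicate: resample n values with replacement, then recompute T.
    replicates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_reps))
    # Take the empirical `level` quantile of the replicated statistics.
    idx = min(n_reps - 1, math.ceil(level * n_reps) - 1)
    return replicates[idx]


# Hypothetical single-test emission values in g/kg (NOT from the study).
sample = [0.02, 0.03, 0.05, 0.90, 0.04, 0.06]
ucb = bootstrap_upper_bound(sample)
print(f"sample mean = {statistics.mean(sample):.3f}, 95% UCB = {ucb:.3f}")
```

Because every replicate is an average of observed values, the bound can never fall outside the range of the data, which avoids the negative bounds seen under normality.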

14 Bootstrapping Single Tests Data Note: Columns in yellow represent the 95% upper confidence bound

15 Bootstrapping the Combined Data Group the data points according to the number of tests used in reporting the average, within each welding process and rod type combination. Then bootstrap within each such group. For example, for GMAW and E316: Note: Each color represents a separate group
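
A sketch of bootstrapping within groups follows. The slides do not say how the group resamples are recombined into one statistic; the version below assumes a mean weighted by the number of tests, and the function name and data are hypothetical.

```python
import math
import random


def grouped_bootstrap_upper_bound(groups, n_reps=2000, level=0.95, seed=0):
    """groups maps number-of-tests -> list of reported values for that group.

    ASSUMPTION: the combined statistic is the test-count-weighted mean.
    """
    rng = random.Random(seed)
    total_tests = sum(k * len(values) for k, values in groups.items())
    replicates = []
    for _ in range(n_reps):
        weighted_sum = 0.0
        for k, values in groups.items():
            # Resample with replacement inside the group only, so values
            # based on different numbers of tests are never mixed.
            resample = rng.choices(values, k=len(values))
            weighted_sum += k * sum(resample)
        replicates.append(weighted_sum / total_tests)
    replicates.sort()
    idx = min(n_reps - 1, math.ceil(level * n_reps) - 1)
    return replicates[idx]


# Hypothetical values in g/kg for one process and rod type combination.
groups = {1: [0.02, 0.03, 0.05, 0.90], 3: [0.10, 0.12], 5: [0.15]}
print(f"95% UCB = {grouped_bootstrap_upper_bound(groups):.3f}")
```

A group containing a single reported average contributes the same value to every replicate, so it shifts the estimate without adding any bootstrap variability.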

16 Bootstrapping - Results Note: Columns in yellow represent the 95% upper confidence bound

17 Final Remarks
–Normality assumption is not appropriate for either Total Chromium or Chromium6 data.
–Weighted regression model can accommodate the averages into the estimates.
–Bootstrapping the data seems to be a way to ensure that meaningful confidence bounds are obtained.
–More work is needed to study the robustness of Bootstrapping results with respect to some extreme values in the data.

