What we’ll cover today Transformations Inferential statistics

EHS 655 Lecture 11: Transformations, inferential statistics (t-test, ANOVA)

What we’ll cover today
- Transformations
- Inferential statistics: t-test, ANOVA
- Review of midterm report requirements

TRANSFORMING VARIABLES
Many inferential statistical methods assume the data are normally distributed:
- t-test
- ANOVA
- Linear regression
However, many exposures are positive and right-skewed.

One solution: log-transform data
Y_i = ln(x_i), where
- Y_i is the log-transformed data point
- x_i is the original data point
- ln is the natural logarithm
The natural log (ln) transform of a lognormally distributed variable has the properties of a normal distribution, i.e., it is bell-shaped and symmetric. The original variable is described by its geometric mean (GM) and geometric standard deviation (GSD).

Log transformation
Exposure distributions, original and transformed (figure: Rappaport and Kupper, 2008)

Log transformation

Evaluating lognormal distribution
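Quantile-quantile (Q-Q) plots, drawn for both the untransformed and the log-transformed variable. If the data are lognormal, the log-transformed points fall close to the straight reference line while the untransformed points curve away from it.
Stata: qnorm varname1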

Interpreting log transformed estimates
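- Arithmetic mean of the log-transformed exposures: mean of Y, where Y_i = ln(x_i)
- Arithmetic SD of the log-transformed exposures: SD of Y
- Geometric mean (GM): antilog of the mean of the log-transformed exposures, GM = exp(mean of Y)
- Geometric standard deviation (GSD): antilog of the SD of the log-transformed exposures, GSD = exp(SD of Y)
Stata: either pairing of transform and back-transform can be used, as long as the two match: ln() or log() with exp() (both ln() and log() are the natural log in Stata), or log10() with 10^value.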
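As a rough illustration of the GM and GSD definitions, here is a minimal Python sketch (Python rather than Stata, standard library only). The function name `geometric_summary` and the exposure values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math
import statistics

def geometric_summary(values):
    """Return (GM, GSD) for strictly positive values, using natural logs."""
    logs = [math.log(x) for x in values]   # Y_i = ln(x_i)
    mean_log = statistics.mean(logs)       # arithmetic mean of the logs
    sd_log = statistics.stdev(logs)        # sample SD of the logs (n - 1 denominator)
    # Back-transform with exp(), the antilog that matches ln()
    return math.exp(mean_log), math.exp(sd_log)

# Hypothetical right-skewed exposure measurements (illustrative only)
exposures = [1.0, 2.0, 4.0, 8.0, 16.0]
gm, gsd = geometric_summary(exposures)
am = statistics.mean(exposures)
print(f"GM = {gm:.2f}, GSD = {gsd:.2f}, arithmetic mean = {am:.2f}")
```

For these values the GM is 4 while the arithmetic mean is 6.2, which shows that back-transforming the mean of the logs does not recover the mean of the original variable.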

Caution about transformation
- Back-transformed mean ≠ mean of the original variable (the back-transformed mean is the GM)
- GM isn't easily interpreted
- It is proper to run statistical tests on the transformed values, but means are often also reported in the units of the untransformed scale
“If it ain’t broke, don’t fix it.”
Transformation is unnecessary if:
- The distribution is more or less symmetrical, with few outliers
- Variances are reasonably homogeneous
Transformation may be useful if:
- Data are markedly skewed or variances are heterogeneous

INFERENTIAL STATISTICS
Descriptive statistics applied to populations are called parameters. Inferential statistics use samples to draw conclusions about those populations. We’ll focus on two inferential approaches today:
- t-test
- ANOVA

t-test
Detects differences between means of (normally distributed) samples. A significant t-statistic indicates that the means differ.
Student’s (unpaired) t-test
- Tests the hypothesis that the means of two samples are equal; the null is H0: μ1 = μ2
- Stata: ttest varname1, by(groupvar)
Paired-sample t-test
- Tests whether two measurements on the same individual are equal
- Stata: ttest varname1 == varname2

Things we can do with a t-test
- Single-sample t-test: identify a difference between the mean of a group and a reference value
- Unpaired t-test: identify differences in mean exposures between two groups
- Paired-sample t-test: identify differences in exposure before and after an intervention in a group of subjects
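The three uses above share the same form: a difference in means divided by its standard error. The Python sketch below (standard library only) computes just the t-statistics. The function names and data are hypothetical, the unpaired version assumes equal variances (pooled variance), and turning t into a p-value would additionally need the t distribution with the appropriate degrees of freedom, which is not shown here.

```python
import math
import statistics

def one_sample_t(sample, reference):
    """t statistic for H0: the population mean equals `reference`."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - reference) / se

def unpaired_t(a, b):
    """Student's t statistic for two independent groups (pooled variance)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se

def paired_t(before, after):
    """Paired t statistic: a one-sample test on within-subject differences."""
    diffs = [x - y for x, y in zip(before, after)]
    return one_sample_t(diffs, 0.0)

# Hypothetical exposures for five subjects, before and after an intervention
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [8.0, 11.0, 7.0, 10.0, 11.0]
print("single-sample t vs reference 10:", round(one_sample_t(before, 10.0), 3))
print("unpaired t:", round(unpaired_t(before, after), 3))
print("paired t:", round(paired_t(before, after), 3))
```

Treating the same data as paired rather than unpaired removes between-subject variability from the standard error, which is why the paired statistic is larger here.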

Interpreting a single-sample t-test in Stata

Interpreting a t-test between groups in Stata

Interpreting a paired t-test in Stata

ANOVA (ANalysis Of Variance)
Technique for assessing how categorical independent variables affect a continuous dependent variable
- Like a t-test generalized to three or more means
- Tells us whether the means from k groups are the same or not
- Null hypothesis: H0: μ1 = μ2 = … = μk

Things we can do with ANOVA
- Identify differences in mean exposures between more than two groups
- Evaluate the relationship of within-worker variance to between-worker variance within an exposure group
  - Within-worker > between-worker = good exposure grouping
  - Within-worker < between-worker = poor exposure grouping

ANOVA assumptions
- Continuous dependent variable
- Independent variable consists of 2+ categorical groups
- Observations are independent of each other
- Errors are normally distributed
- Variances are the same for all groups
ANOVA is fairly robust to violations of these assumptions, but the data should not be extremely far off.

ANOVA illustrated

Generic ANOVA components

ANOVA – F-test
- Compares the variability in exposure accounted for by the predictor variable with the error variability
- Error variability (mean squared error) measures the inherent randomness of the observations
- Large differences between groups relative to the error variability = significant F-test

F-statistic
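As a rough sketch of how the F-statistic is assembled, the Python snippet below (standard library only) divides the between-group mean square by the within-group mean square (the mean squared error). The function name `one_way_anova` and the group data are hypothetical. A p-value would additionally require the F distribution with the two degrees of freedom returned here.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability explained by group membership
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    # Error variability: spread of observations around their own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within      # mean squared error
    return ms_between / ms_within, df_between, df_within

# Hypothetical exposures for three groups (illustrative only)
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Moving the group means further apart while keeping the within-group spread fixed increases the numerator only, so F grows.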

Stata ANOVA output
Stata: oneway responsevar groupvar
A bigger F is stronger evidence against the null hypothesis; judge significance from the reported p-value (Prob > F).

Stata ANOVA output

Stata ANOVA output
Stata: anova responsevar groupvar
Note the different output: we now get R², adjusted R², root MSE, etc. More in the regression lecture.

Stata ANOVA output
Stata: oneway responsevar groupvar, tabulate
The tabulate option gives results by group.

Why use ANOVA instead of t-test?
Could do t-tests for all pairs of predictor variable categories, but that is not a good idea:
- As the number of exposure groups grows, so does the number of pairwise comparisons needed
- Each comparison carries its own risk of a false positive (Type I error), and these risks accumulate
- ANOVA puts all the data into one number (F) and gives one p-value for the null hypothesis
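To make the growth in comparisons concrete, the Python sketch below counts the pairs among k groups and gives an approximate chance of at least one false positive across all of them. The function names are hypothetical, and the error-rate formula assumes the comparisons are independent, which pairwise tests that share groups are not, so treat the result as a rough indication only.

```python
import math

def n_pairwise(k):
    """Number of distinct pairs among k groups: k(k - 1) / 2."""
    return math.comb(k, 2)

def familywise_error(k, alpha=0.05):
    """Approximate chance of at least one false positive across all pairs,
    assuming (unrealistically) that the comparisons are independent."""
    m = n_pairwise(k)
    return 1 - (1 - alpha) ** m

for k in (3, 5, 10):
    print(f"{k} groups: {n_pairwise(k)} comparisons, "
          f"familywise error about {familywise_error(k):.2f}")
```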

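What if I want to know which groups are different?
Multiple comparisons are possible. After fitting the model with the anova command, use this second command (pwcompare is a postestimation command, so it needs the estimation results that anova stores; oneway does not store them).
Stata: pwcompare groupvar, effects sort mcompare(tukey)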

Multiple comparison ANOVA output

Measure of agreement between categorical and continuous variables
Stata: loneway responsevar groupvar
The intraclass correlation coefficient (ICC) is a measure of agreement, interpreted on the same scale as Cohen’s kappa.
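One common ANOVA-based estimator of the intraclass correlation can be sketched in Python as follows (standard library only). The function name `icc_oneway` and the data are hypothetical, and whether this matches the loneway output in every case has not been checked here. The estimator is ICC = (MSB − MSW) / (MSB + (n0 − 1)·MSW), where MSB and MSW are the between- and within-group mean squares and n0 is an adjusted group size that equals the common group size when the design is balanced.

```python
import statistics

def icc_oneway(groups):
    """One-way ANOVA estimator of the intraclass correlation coefficient."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ms_between = sum(n * (statistics.mean(g) - grand_mean) ** 2
                     for n, g in zip(sizes, groups)) / (k - 1)
    ms_within = sum((x - statistics.mean(g)) ** 2
                    for g in groups for x in g) / (n_total - k)
    # Adjusted group size; equals the common size for balanced data
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)

# Hypothetical repeated measurements grouped by worker (illustrative only)
by_worker = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print(f"ICC = {icc_oneway(by_worker):.2f}")
```

Values near 1 mean that most of the variability lies between groups; values near 0 (the estimate can even be negative) mean that most of it lies within groups.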

ANOVA in action
Enough with words already. Let’s see how ANOVA actually works.
Stata ANOVA commands:
- oneway responsevar groupvar
- Option (to get more detailed output by group): oneway responsevar groupvar, tabulate means standard

Resources
- Choosing statistical tests
- Stata annotated output from various tests

Review of midterm report

Example of noise exposure calculation requiring transformation
- Noise exposures (in dBA) can be summarized across individuals arithmetically. In other words, to estimate a group mean for individuals in, say, the same trade, compute the arithmetic mean.
- Averaging noise exposures within an individual (in dBA) amounts to computing a dose, which requires a temporary transformation out of the decibel scale:
LEQ_i = 10 log10[(1/N) (10^(TWA_1/10) + 10^(TWA_2/10) + … + 10^(TWA_N/10))]
where N is the total number of TWAs used to estimate the average LEQ for person i.
How to operationalize in Stata? Note the temporary transformation.
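The slide asks about Stata; as a language-neutral illustration of the same arithmetic, here is a minimal Python sketch of the LEQ formula. The function name `leq` and the TWA values are hypothetical. Each TWA is converted from decibels to a relative energy, the energies are averaged, and the average is converted back to decibels.

```python
import math

def leq(twas_dba):
    """Average level for one person from a list of TWAs in dBA."""
    n = len(twas_dba)
    # Temporary transformation: decibels -> relative energy, then average
    mean_energy = sum(10 ** (twa / 10) for twa in twas_dba) / n
    # Transform back to the decibel scale
    return 10 * math.log10(mean_energy)

# Hypothetical TWAs for one person (illustrative only)
twas = [80.0, 90.0]
print(f"LEQ = {leq(twas):.1f} dBA, arithmetic mean = {sum(twas) / len(twas):.1f} dBA")
```

The energy average is dominated by the louder measurements, so it lies above the arithmetic mean of the decibel values whenever the TWAs differ.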
