Basics of Biostatistics for Health Research
Session 4 – February 28, 2013
Dr. Scott Patten, Professor of Epidemiology
Department of Community Health Sciences & Department of Psychiatry
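Generate Commands Using Logic
    generate obese2 = .
    recode obese2 .=0 if bmi <= 30
    recode obese2 .=1 if bmi > 30
    tab obese obese2
    prtest obese2, by(sex)
This code counts people with a missing BMI as obese, which looks strange at first. Stata treats a missing value as larger than any number, so bmi > 30 is true when bmi is missing.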
Missing Values and Logical Operators
management/logical-expressions-and-missing-values/
Generate Commands Using Logic
    generate obese2 = .
    recode obese2 .=0 if bmi <= 30
    recode obese2 .=1 if bmi > 30 & bmi != .
    tab obese obese2, missing
    prtest obese2, by(sex)
This code works: people with a missing BMI stay missing instead of being counted as obese.
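The missing-value behaviour can be checked on a tiny example. The sketch below uses three made-up BMI values (one missing); the variable names obese_bad and obese_ok are hypothetical and not part of the course dataset. Note that clear drops whatever data are in memory, so try this in a fresh session.

```stata
* Toy data: three hypothetical BMI values, one of them missing
clear
input bmi
22
31
.
end

* Missing (.) counts as larger than any number,
* so the missing row is flagged as obese here
generate byte obese_bad = bmi > 30

* Restricting to non-missing bmi leaves that row missing instead
generate byte obese_ok = bmi > 30 if !missing(bmi)

list bmi obese_bad obese_ok
```

The single generate with an if qualifier is an alternative to the generate-then-recode pattern on the slide; both aim at the same result.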
Statistical Errors
Sample Size Simulation
Sample Size Calculation in STATA
Sample Size Dialogue Boxes
Let’s do a calculation!
You are planning a parallel-group RCT with treatment and control groups. Normally, 20% of people with disease X die, but you expect the new treatment to cut this in half. How many participants do you need in each group to achieve 95% power at alpha = 5%?
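The dialogue box for this problem should generate a sampsi command along the lines below. This is a sketch of the likely command, not a copy of the slide output: the two proportions are 0.20 (control) and 0.10 (treatment, half of 20%), and the test is two-sided by default. The power command in the second half is only available in Stata 13 and later and is shown as an assumed alternative.

```stata
* Two-sample comparison of proportions: 20% vs 10% mortality,
* alpha = 0.05 (two-sided by default), power = 0.95
sampsi 0.20 0.10, alpha(0.05) power(0.95)

* Stata 13 and later: the power command covers the same design
power twoproportions 0.20 0.10, alpha(0.05) power(0.95)
```

The two commands may not report exactly the same sample size, because sampsi applies a continuity correction by default.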
Output (sampsi)
Another Calculation
A QoL scale in a particular disease has a mean score of 20 and a standard deviation of 5. You are conducting a placebo-controlled trial to evaluate a treatment that is expected to improve the QoL by 2 points on this scale. You recruit n = 50 into each group. What power will you achieve?
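For this question, sampsi can be given the sample sizes and asked for the power instead. The sketch below assumes a higher score means better QoL (so the treated mean is 22) and that both groups share the standard deviation of 5; neither assumption changes the power. The display line is a rough hand check using the normal approximation.

```stata
* Two-sample comparison of means: 20 vs 22, SD 5 in both groups,
* 50 per group; supplying n1() and n2() makes sampsi report power
sampsi 20 22, sd1(5) sd2(5) n1(50) n2(50) alpha(0.05)

* Hand check: standardized difference minus the critical value,
* converted to a probability with the standard normal CDF
display normal(2/sqrt(5^2/50 + 5^2/50) - invnormal(0.975))
```

The standard error of the difference works out to 1, so the hand check is normal(2 - 1.96), which is only a little above one half.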
Output (sampsi)
Go to “
Scroll to the bottom. Right-click to download the files described as being “for PGME Students”:
–One is a dataset
–One is a data dictionary
Save them on your desktop.
Review: Comparing Proportions
We’ve looked at several procedures for comparing proportions (e.g. for obesity in men vs. women):
    generate obese = .
    recode obese .=0 if bmi <= 30
    recode obese .=1 if bmi > 30 & bmi != .
    tab obese, missing
    prtest obese, by(sex)
Epitab Commands
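Review: Comparing Proportions
The cs command gives another way to compare proportions (e.g. for obesity in men vs. women). It expects both variables to be coded 0/1, so sex is recoded first:
    recode sex (2=1) (1=0)
    cs obese sex
Note that this overwrites the original coding of sex.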
The output…
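A “non-significant” association
    generate highgluc = .
    recode highgluc .=0 if glucose <= 140
    recode highgluc .=1 if glucose > 140 & glucose != .
    generate female = sex
    recode female (1=0) (2=1)
    tab highgluc female, exact
The recode of female assumes sex still has its original 1/2 coding. If sex has already been recoded to 0/1, generate female = sex is enough.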
How does this look with cs?
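A minimal sketch of the cs version, assuming highgluc and female have been created as 0/1 variables as on the previous slide:

```stata
* Risk of high glucose in women (female = 1) vs men (female = 0)
cs highgluc female

* Same table, with Fisher's exact p-value added to the output
cs highgluc female, exact
```

The first variable is the outcome and the second is the exposure. The risk ratio and its confidence interval come from cs; the exact option supplies a p-value comparable to the one from tab highgluc female, exact.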
Review: Try the cci command to obtain the odds ratio (OR). Check your work with the cc command.
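The immediate command cci takes the four cell counts directly, in the order exposed cases, unexposed cases, exposed controls, unexposed controls. The counts below are made up for illustration only; replace them with the counts from your own table. The cc line assumes highgluc and female exist as 0/1 variables.

```stata
* Hypothetical counts, for illustration only:
*   exposed cases = 30,    unexposed cases = 70
*   exposed controls = 15, unexposed controls = 85
cci 30 70 15 85

* Hand check of the point estimate: OR = (a*d)/(b*c)
display (30*85)/(70*15)

* cc computes the same quantities from variables in memory
cc highgluc female
```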
Comparing Proportions?
–Yes: Fisher’s Exact Test
–No: Parametric Assumptions?
  –Yes: Multiple Groups?
    –Yes: ANOVA
    –No: t-test
  –No: Multiple Groups?
    –Yes: Kruskal-Wallis
    –No: Wilcoxon’s Rank-Sum
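Each branch of this decision tree corresponds to a Stata command. The sketch below uses hypothetical variable names (outcome, group and y are placeholders, not variables in the course dataset).

```stata
* Comparing proportions: two categorical variables
tabulate outcome group, exact

* Continuous outcome, parametric assumptions met
* two groups: t-test
ttest y, by(group)
* more than two groups: one-way ANOVA
oneway y group

* Continuous outcome, parametric assumptions not met
* two groups: Wilcoxon rank-sum test
ranksum y, by(group)
* more than two groups: Kruskal-Wallis test
kwallis y, by(group)
```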
Two situations we haven’t covered…
–Severely skewed distributions
–Two continuous variables
Severely Skewed Variables
Solution: Make Some Categories
For example (cigarettes per day):
–Non-smokers
–Light smokers (< 20)
–Moderate smokers (20 to 40)
–Heavy smokers (> 40)
Your task: Make a variable with these categories and do a statistical test to compare men to women.
E.g. for the recoding…
    generate smoke = .
    recode smoke .=1 if cigpday == 0
    recode smoke .=2 if cigpday > 0 & cigpday < 20
    recode smoke .=3 if cigpday >= 20 & cigpday <= 40
    recode smoke .=4 if cigpday > 40 & cigpday != .
    tab smoke, missing
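Once smoke exists, one possible way to finish the task is a cross-tabulation with a chi-squared test. This is a sketch of one reasonable choice, not necessarily the analysis shown in class; the label name smokelbl is hypothetical.

```stata
* Attach readable labels to the four categories
label define smokelbl 1 "Non-smoker" 2 "Light" 3 "Moderate" 4 "Heavy"
label values smoke smokelbl

* Cross-tabulate smoking category by sex, with column percentages
* and a chi-squared test of association
tabulate smoke sex, chi2 column
```

The chi-squared test treats the categories as unordered, so it ignores the fact that they run from no smoking to heavy smoking.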
Some output…
Two continuous variables
E.g. diastolic blood pressure and BMI.
The place to start is always a scatter plot.
STATA calls this a “two way” graph.
Start with Create
Select the two variables, then click Submit
The command produced…
Produced by our dialogue box:
    twoway (scatter diabp sysbp)
The same dialogue box can fit a line (this time select “line”):
    twoway (lfit diabp sysbp)
You can combine the two. Try it!
    twoway (scatter diabp sysbp) (lfit diabp sysbp)
To assess significance, use the regress command (can you find the menu option?):
    regress diabp sysbp
Note: the linear output
Line: y = mx + b
diabp = b + m(sysbp), where m is the coefficient on sysbp and b is the _cons value in the regress output.
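The slope and intercept can be read directly from the stored results after regress. A minimal sketch, assuming the course dataset with diabp and sysbp is in memory; the variable name diabp_hat is hypothetical.

```stata
* Fit the line diabp = b + m*sysbp
regress diabp sysbp

* Pull the slope (m) and intercept (b) out of the stored results
display "slope (m)     = " _b[sysbp]
display "intercept (b) = " _b[_cons]

* Fitted values on the line, one per observation
predict diabp_hat, xb
```

The p-value on the sysbp row of the regress table is the test of whether the slope differs from zero.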
(In Class) Assignment for Today
Assess whether there is an association between systolic blood pressure and death (you need to decide how).
We’ll define elevated systolic blood pressure as being > 140 mmHg.
–What is the risk ratio for death for people with elevated systolic blood pressure?
–Is the risk ratio statistically significant?