Basics of Biostatistics for Health Research Session 1 – February 7th, 2013 Dr. Scott Patten, Professor of Epidemiology Department of Community Health.

1 Basics of Biostatistics for Health Research Session 1 – February 7th, 2013 Dr. Scott Patten, Professor of Epidemiology Department of Community Health Sciences & Department of Psychiatry patten@ucalgary.ca

2 Objective 1: Upon completion, students will be (more) able to read, understand, and critically interpret the statistical portions of articles in the medical literature.

3 Objective 2: Given a dataset, students will be able to: –Select appropriate statistical procedures for basic analyses –Implement these analyses using typical statistical software (we will use Stata)

4 Objective 3: Upon completion, students will be (more) able to define and interpret specialized parameters found in the clinical epidemiology literature, for example: –Sensitivity –Specificity –Predictive values

5 Topics for Session 1: Why do we need statistics? Calculating a 95% confidence interval for a proportion

6 Why Do We Need Statistics? We don’t always need statistics. However, statistics are the most powerful tools for answering questions in medicine, for example…. –Determining whether treatments work –Comparing different treatments –Identifying the causes of diseases

7

8 Why Do We Need Statistics? Statistics are the most powerful tools for answering questions in medicine, for example…. –Determining whether treatments work –Comparing different treatments –Identifying the causes of diseases

9 The Power of Statistics Where does it come from? –Fundamentally, from the laws of probability A familiar example: –Flipping one coin versus flipping many coins

10 Coin Flipping First, I’ll flip a coin and you can try to guess what I got. Then I’ll ask each of you to flip a coin, and I’ll guess how many heads you get in total.

11 Coin Tossing Simulator

12 The Power of Statistics A set of observations can allow us to make statements of a sort that we generally cannot make based on a single observation –E.g. how well does a treatment work? Larger and larger sets of observations allow us to make stronger and stronger statements
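
The coin example can also be simulated in Stata itself. This is only a rough sketch, not the simulator shown in class: the seed, the 1000 flips, and the variable names (heads, flip, running_prop) are arbitrary choices made for illustration. It tracks the proportion of heads as flips accumulate, which typically wanders for the first few flips and settles near 0.5 as the number of flips grows.

```stata
* Simulate fair coin flips and watch the running proportion of heads
clear
set seed 12345
set obs 1000
* heads = 1 if a uniform random draw falls below 0.5, otherwise 0
generate heads = runiform() < 0.5
generate flip = _n
* sum() is a running total when used inside generate
generate running_prop = sum(heads) / flip
* Compare the result after 1 flip with the results after 10, 100, and 1000
list flip heads running_prop in 1/1
list flip running_prop if inlist(flip, 10, 100, 1000)
```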

13 Formal Terminology The source of the observations is a sample. The sample is a subset of a population. The observations are data. The collection of observations is a dataset.

14 Inference In general: “A conclusion reached on the basis of evidence and reasoning.” In statistics: “Making a statement about a population based on observations from a sample (a dataset).”

15 Stata’s Graphical Interface

16 Let’s do a Study! We’ll select a sample of half the class, tabulate the frequency of male/female, and estimate the proportion of women.

17 Select a Sample We’ll consider the class of N students as our population. The first step in obtaining a sample is to have a sampling frame: a list of the population. Let’s make one in Stata. For notation, I’ll type Stata commands in red. These go into the command window. To execute a command, press Enter.

18 Command Menus (an alternative to the command window) 1 2 I’ll use screen captures and add red numbers if things need to be done in more than one step.

19 Use this drop-down variable to select the new variable 1 2 Click OK

20 Let’s create a sampling frame in Stata. In the command window, tell Stata that we want to create a list with N rows: set obs 30 (instead of 30, we’ll use the # in the class) generate id = _n

21 Let’s create a sampling frame in Stata. We’ll start by typing into the command window.

22 In the command window, tell Stata that we want to create a list with N* rows: set obs 30 generate id = _n * instead of 30, we’ll use the # in the class

23 The data viewer

24 Now, let’s sample half of these: sample 50 (the 50 is a percentage). Click on the data viewer to see our sample.
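
Put together, the sampling-frame and sampling steps might look like the sketch below. The set seed line is an addition that the slides do not use; it only makes the random draw repeatable, and the seed value is arbitrary. As on the slides, 30 stands in for the actual class size.

```stata
* Build a sampling frame of 30 rows, then keep a random half
clear
set obs 30
generate id = _n
* Optional: fix the seed so the same sample is drawn each time
set seed 2013
* sample takes a percentage, so 50 keeps half of the rows
sample 50
* Check how many rows remain and which ids were selected
count
list id
```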

25 Data Collection From each member of our sample, we’ll record the person’s sex: Male = 0, Female = 1. Let’s create a variable called “sex” in which to enter our data: generate sex = .

26 The data viewer Look at the Dataset!

27 The data editor Enter the Data

28 Highlight a cell (click on it) and start entering data!
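
Typing into the data editor is what the session does. As an alternative sketch, the same values could be entered with replace commands, which has the advantage of leaving a record of what was typed. The four rows and their values here are made up purely for illustration and are not class data.

```stata
* Stand-in for the sampled class list: 4 rows only, to keep the sketch short
clear
set obs 4
generate id = _n
generate sex = .
* Made-up values for illustration (0 = male, 1 = female)
replace sex = 1 in 1
replace sex = 0 in 2
replace sex = 1 in 3
replace sex = 0 in 4
list id sex
```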

29 Closing the Data Editor Click Exit

30 Making a Table At this point, we could make a table to show the frequency of men and women in our sample, 1 2 3 4

31 Use this drop-down variable to select the new variable 1 2 Click OK

32 A few things to note… Our table doesn’t look so great. The command that our menus created is executed by Stata (see the “. tab var2” in the output window). We can do the same thing by typing tab var2 in the command line.

33 Command Line

34 Our Table is Still Very Ugly (not exactly, but something like this)

35 Renaming a Variable 1 2

36 The Variables Manager 1 Select “var2” (click) 2 Type “sex” here, under Name

37 Using the Command Window Another way to do it is just to type into the command window rename var2 sex

38 Our Table is Still Very Ugly (not exactly, but something like this)

39 Creating a Label 1 2 3 4

40 Click Here

41 Creating a Label In Stata, you need to give your label a name. Our values are 0 and 1. Our labels are men and women. Click Here 1 2 3 4

42 Creating a Label After adding women, add a second value label for men. Our labels are men and women. 1 2 3 4

43 Attaching the Label 1 2 3 4

44 Assigning the Label 1 2 3
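
The label-creation and label-assignment menus correspond to two commands that can be typed instead. In this sketch the label name sexlbl is a hypothetical choice (the slides do not show the name used in class), and the five data values are made up so that the block runs on its own.

```stata
* Made-up 0/1 values so the example is self-contained
clear
input sex
1
0
1
1
0
end
* Step 1: define a named set of value labels (0 = men, 1 = women)
label define sexlbl 0 "men" 1 "women"
* Step 2: attach that set of labels to the variable
label values sex sexlbl
* The table now shows "men" and "women" instead of 0 and 1
tabulate sex
```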

45 A Good Looking Table

46 Saving a Dataset Click Here To Save

47 Let’s do Statistics! We need to enter the Statistics menu 1 2 3 4

48 Entering the Command 1 2 3 4

49 Our Output What is the 95% confidence interval? What does it mean? What kind of statement can be made about the population (our class)? Is the statement true?
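
For reference, a command-line sketch of the same calculation follows. The data values are made up for illustration. The syntax shown is the one used by Stata 13 and earlier, which matches the date of this session; to my knowledge Stata 14 and later use ci proportions instead, so treat the version note as an assumption to check against your own installation.

```stata
* Made-up 0/1 values (1 = female) so the example is self-contained
clear
input sex
1
0
1
1
0
1
0
1
end
* Exact binomial 95% confidence interval for the proportion coded 1
* (syntax for Stata 13 and earlier)
ci sex, binomial
* In Stata 14 and later the equivalent is believed to be:
*   ci proportions sex
```

The exact binomial interval that Stata reports is the Clopper-Pearson interval, which is calculated from the number of observations and the number coded 1.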

50

51 Introducing the “do file” editor 1 2 3

52 Executing a “do” file
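
A do file is simply the commands from the session saved in order. The sketch below is one possible version, not the file shown in class. The file name session1.do, the label name sexlbl, and the seed are hypothetical, and because a do file cannot click through the data editor, the sex values are simulated here as a placeholder for real data entry.

```stata
* session1.do (hypothetical file name)
* Sampling frame and sample
clear
set obs 30
generate id = _n
set seed 2013
sample 50
* Placeholder for data entry: simulated 0/1 values, NOT real class data
generate sex = runiform() < 0.5
* Labels and table
label define sexlbl 0 "men" 1 "women"
label values sex sexlbl
tabulate sex
* Exact binomial 95% confidence interval (Stata 13 and earlier syntax)
ci sex, binomial
```

If the file is saved in Stata's working directory, typing do session1.do in the command window should run every line in sequence.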

53

54 Something More Realistic Go to “www.ucalgary.ca/~patten”. Scroll to the bottom. Right click to download the two files described as being “for PGME Students”. Save them on your desktop.

55 Open the Datafile

56 Explore the Datafile Click on the data browser in Stata. Type describe into the command bar. Open the data documentation file. Note that sex is not labeled properly and that it is coded differently than in our example.

57 Recode the Sex Variable as 0/1 Let’s use the command window: generate female = sex recode female 1=0 2=1 Double check you’ve done it right: tab female sex
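
As an alternative sketch, recode can create the new variable in a single step with its generate() option, and assert can confirm the mapping. The four values below are made up and simply assume the 1/2 coding implied by the recode rule on this slide; the real coding should be checked against the data documentation.

```stata
* Made-up values using the 1/2 coding implied by the slide
clear
input sex
1
2
2
1
end
* Create female (0/1) from sex (1/2) without overwriting the original
recode sex (1=0) (2=1), generate(female)
* Cross-tabulate to confirm that every 1 became 0 and every 2 became 1
tabulate female sex
* assert stops with an error if any row breaks the expected mapping
assert female == sex - 1
```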

58 Your Task… –Create a good label for this new variable –Make a good table of the new variable –Create a 95% exact binomial confidence interval for the proportion of females in Framingham –Interpret what this 95% confidence interval means –Create a do file that will do all of these steps automatically

59 Creating a Log File 1 2 3

60 Additional Tasks Create a log file for your calculation of the proportion of women in Framingham, and an associated 95% confidence interval.
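
A log file can also be opened and closed from the command window or a do file. This is a minimal sketch in which the file name session1_example.log is hypothetical and a display line stands in for the real analysis commands.

```stata
* Close any log that is already open; capture ignores the error if none is
capture log close
* Start a plain-text log, overwriting any earlier file of the same name
log using "session1_example.log", text replace
* Everything run from here on is recorded; the tabulate and ci commands
* for the Framingham data would go here
display "Commands run from here on are recorded in the log"
* Stop recording and close the file
log close
```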

61 Additional Tasks Calculate an estimate of the proportion of people in Framingham with greater than high school education (and 95% confidence interval) – generate and save a log file that shows this calculation.

