Basics of Biostatistics for Health Research
Session 1 – February 7th, 2013
Dr. Scott Patten, Professor of Epidemiology
Department of Community Health Sciences & Department of Psychiatry
patten@ucalgary.ca

Objective 1: Upon completion, students will be (more) able to read, understand, and critically interpret the statistical portions of articles in the medical literature.

Objective 2: Given a dataset, students will be able to:
–Select appropriate statistical procedures for basic analyses
–Implement these analyses using typical statistical software (we will use Stata)

Objective 3: Upon completion, students will be (more) able to define and interpret specialized parameters found in the clinical epidemiology literature, for example:
–Sensitivity
–Specificity
–Predictive values

Topics for Session 1:
–Why do we need statistics?
–Calculating a 95% confidence interval for a proportion

Why Do We Need Statistics?
We don’t always need statistics. However, statistics are the most powerful tools for answering questions in medicine, for example:
–Determining whether treatments work
–Comparing different treatments
–Identifying the causes of diseases

The Power of Statistics
Where does it come from?
–Fundamentally, from the laws of probability
A familiar example:
–Flipping one coin versus flipping many coins

Coin Flipping First, I’ll flip a coin and you can try to guess what I got. Then, I’ll ask you to flip a coin and I’ll guess how many you get

Coin Tossing Simulator

The Power of Statistics
A set of observations can allow us to make statements of a sort that we generally cannot make based on a single observation
–E.g. how well does a treatment work?
Larger and larger sets of observations allow us to make stronger and stronger statements
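The coin example can also be tried directly in Stata. The following is a minimal sketch and not part of the original session: the seed, the number of flips, and the variable names heads, flip and running_prop are assumptions chosen only for illustration. It simulates fair coin flips and tracks the running proportion of heads, which tends to settle near 0.5 as the number of flips grows.

```stata
* Simulate fair coin flips and watch the running proportion of heads.
clear
set seed 2013                              // arbitrary seed, only so the run is repeatable
set obs 1000                               // 1000 simulated flips (illustrative choice)
generate byte heads = runiform() < 0.5     // 1 = heads, 0 = tails
generate flip = _n                         // flip number
generate running_prop = sum(heads) / flip  // proportion of heads up to this flip

* A single flip tells us little; many flips tell us much more.
list flip heads running_prop in 1/5
list flip running_prop if inlist(flip, 10, 100, 1000)
```

A single row of the first listing is all-or-nothing (0 or 1), while the later rows of running_prop are based on many flips and move much less from one flip to the next.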

Formal Terminology
–The source of the observations is a sample
–The sample is a subset of a population
–The observations are data
–The collection of observations is a dataset

Inference
General: “A conclusion reached on the basis of evidence and reasoning.”
Statistical: “Making a statement about a population based on observations from a sample (a dataset)”

Stata’s Graphical Interface

Let’s do a Study!
–We’ll select a sample of half the class
–Tabulate the frequency of male/female
–Estimate the proportion of women

Select a Sample
We’ll consider the class of N students as our population. The first step in obtaining a sample is to have a sampling frame: a list of the population. Let’s make one in Stata. For notation, I’ll type Stata commands in red. These go into the command window. To execute a command, press Enter.

Command Menus (an alternative to the command window)
I’ll use screen captures and add red numbers if things need to be done in more than one step.

Use this drop-down to select the new variable, then click OK.

Let’s create a sampling frame in Stata. In the command window, tell Stata that we want to create a list with N rows (instead of 30, we’ll use the number of students in the class), then number the rows:
    set obs 30
    generate id = _n

The data viewer

Now, let’s sample half of these (the 50 is a percentage, so this keeps half of the rows):
    sample 50
Click on the data viewer to see our sample.
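The sampling steps can be collected into one short run. This is a sketch under stated assumptions rather than the exact commands used in class: the seed value is arbitrary and is added here only so that the same random half is drawn each time, and 30 stands in for the real class size.

```stata
* Build a sampling frame of 30 students and draw a random half.
clear
set seed 20130207     // assumption: any fixed seed makes the draw repeatable
set obs 30            // replace 30 with the actual class size
generate id = _n      // id runs from 1 to N: this list is the sampling frame
sample 50             // keep a random 50 percent of the rows
count                 // number of students left in the sample
list id               // ids of the sampled students
```

Without set seed, each run of sample 50 would generally pick a different half of the frame.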

Data Collection
From each member of our sample, we’ll record the person’s sex: Male = 0, Female = 1. Let’s create a variable called “sex” in which to enter our data:
    generate sex = .

The data viewer Look at the Dataset!

The data editor Enter the Data

Highlight a cell (click on it) and start entering data!

Closing the Data Editor Click Exit

Making a Table
At this point, we could make a table to show the frequency of men and women in our sample.

Use this drop-down to select the new variable, then click OK.

A few things to note…
–Our table doesn’t look so great
–The command that our menus created is executed by Stata (see the “. tab var2” in the output window)
–We can do the same thing by typing tab var2 in the command line

Command Line

Our Table is Still Very Ugly (not exactly, but something like this)

Renaming a Variable

The Variables Manager
1. Select “var2” (click)
2. Type “sex” here, under Name

Using the Command Window
Another way to do it is just to type into the command window:
    rename var2 sex

Our Table is Still Very Ugly (not exactly, but something like this)

Creating a Label

Click Here

Creating a Label
In Stata, you need to give your label a name. Our values are 0 and 1. Our labels are men and women.

Creating a Label
After adding women, add a second value label for men. Our labels are men and women.

Attaching the Label

Assigning the Label
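The labelling steps done through the menus can also be typed as commands. The sketch below is illustrative only: the five data values are invented, and the label name sexlbl is a hypothetical choice (any valid name would do). It defines a value label for the codes 0 and 1, attaches it to sex, and tabulates the result.

```stata
* Toy data standing in for the class sample (values are invented).
clear
input byte sex
0
1
1
0
1
end

label define sexlbl 0 "men" 1 "women"   // create the value label (name is hypothetical)
label values sex sexlbl                 // attach the label to the variable
tabulate sex                            // the table now shows men/women instead of 0/1
```

Defining the label and attaching it are separate steps, which is why the menus ask for a label name before the label can be assigned to a variable.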

A Good Looking Table

Saving a Dataset Click Here To Save

Let’s do Statistics!
We need to enter the Statistics menu.

Entering the Command

Our Output
–What is the 95% confidence interval?
–What does it mean?
–What kind of statement can be made about the population (our class)?
–Is the statement true?
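For reference, the confidence interval can also be requested from the command window. This is a hedged sketch, not the command shown in the session's screen captures: the data are invented (15 sampled students, 9 of them coded 1), and the syntax depends on the Stata release. The form below is the one used by Stata 14 and later; in Stata 13 and earlier the equivalent command is ci sex, binomial.

```stata
* Toy sample: 15 students, 9 coded as women (invented numbers).
clear
set obs 15
generate byte sex = _n <= 9      // first 9 rows are 1 (women), the rest 0 (men)

* Exact (Clopper-Pearson) binomial 95% confidence interval for the proportion.
* Stata 14 and later; older releases use:  ci sex, binomial
ci proportions sex, exact
```

One common reading of the output is that, if samples were drawn repeatedly in the same way, about 95% of the intervals constructed like this would contain the true proportion in the population.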

Introducing the “do file” editor

Executing a “do” file

Something more Realistic
Go to www.ucalgary.ca/~patten and scroll to the bottom. Right click to download the two files described as being “for PGME Students”. Save them on your desktop.

Open the Datafile

Explore the Datafile
–Click on the data browser in Stata
–Type describe into the command bar
–Open the data documentation file
–Note that sex is not labeled properly and that it is coded differently than in our example

Recode the Sex Variable as 0/1
Let’s use the command window:
    generate female = sex
    recode female 1=0 2=1
Double check you’ve done it right:
    tab female sex
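The recode-and-check pattern can be tried on a small invented dataset before applying it to the downloaded file. In this sketch the five values of sex are made up, with 1 and 2 as the original codes; the recode rules are written in the parenthesized form, which expresses the same mapping as the command above, and the final assert is an extra check added here for illustration.

```stata
* Toy data with the original 1/2 coding (values are invented).
clear
input byte sex
1
2
2
1
2
end

generate female = sex          // copy, so the original variable is left untouched
recode female (1=0) (2=1)      // 1 becomes 0, 2 becomes 1
tabulate female sex            // cross-check: every 1 should map to 0, every 2 to 1
assert female == sex - 1       // stops with an error if any row was recoded wrongly
```

Working on a copy rather than on sex itself means the cross-tabulation can always be rerun against the original codes.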

Your Task…
–Create a good label for this new variable
–Make a good table of the new variable
–Create a 95% exact binomial confidence interval for the proportion of females in Framingham
–Interpret what this 95% confidence interval means
–Create a do file that will do all of these steps automatically

Creating a Log File
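A log file can also be opened and closed with commands, which is convenient inside a do file. The following is a minimal sketch with assumptions stated plainly: the file name session1_example.log is hypothetical, the data are invented, and the log is written to Stata's current working directory.

```stata
* Record commands and output in a plain-text log file.
capture log close                               // close any log left open earlier
log using "session1_example.log", text replace  // hypothetical file name

* Anything run between "log using" and "log close" is recorded.
clear
set obs 15                         // invented toy sample
generate byte female = _n <= 9     // 9 of 15 coded 1
tabulate female

log close                          // stop recording and save the file
```

Saving these lines in the do file editor and executing them reproduces both the analysis and its log in one step.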

Additional Tasks Create a log file for your calculation of the proportion of women in Framingham, and an associated 95% confidence interval.

Additional Tasks Calculate an estimate of the proportion of people in Framingham with greater than high school education (and 95% confidence interval) – generate and save a log file that shows this calculation.
