1
6/2/2015 1: Measurement & Sampling
1: Measurement and Sampling
- What is biostatistics?
- What is measurement?
- How do we sample populations?
2
HS 167 Logistics
- Syllabus: materials (text, lab workbook, calculator)
- Calendar and assignments are on www.sjsu.edu/biostat → click HS167 (become familiar with the Web site)
- Exam 1 = 10/9, Exam 2 = 11/13, Final = Thur 12/13 2:45
- Lab 0 and Lab 1 (Tu and We labs may have additional time to complete Lab 1)
- Text (reading): pp. 1–10, 15–19 (note vocabulary on p. 11)
- Exercises: 1.1–1.6, 1.8, 1.9, 2.1–2.3, 2.11–2.13 [due at beginning of next lecture]
- Yahoo group: send email to hs167-F07-subscribe@yahoogroups.com
- Academic integrity (do your own work)
  - Odd-numbered exercises and lab work → OK to get help from friends
  - Even-numbered exercises & exams → do NOT get help from friends
- How to get a good grade:
  - Attend all classes and labs (attendance required)
  - Stay on task
  - Read text (listed to Nancy)
  - Do Lab & HWs diligently
  - Do not cut corners
3
Biostatistics
- is not merely a compilation of computational techniques
- is a way of learning from data
- is concerned with many elements of study design and analysis (not just computations)
- requires more judgment than math (pay attention to vocabulary)
- is statistics applied to biological and health problems
4
Biostatistics involves
- A data detective element
  - Uncovering patterns and clues
  - This is a combination of exploratory data analysis (EDA) and descriptive statistics
- A data judge element
  - Confirmation of clues
  - This often requires inferential methods
5
Measurement
- Measurement ≡ “assigning of numbers and codes according to prior-set rules”
- Three types of statistical measurements:
  - Categorical ≡ classify observations into named (nominal) categories, e.g., HIV classified as “positive” or “negative”
  - Ordinal ≡ ranked categories, e.g., OPINION ranked 5 = strongly agree, 4 = agree, 3 = neutral, and so on
  - Quantitative ≡ numbers with equal spacing, e.g., AGE in years, BLOOD_PRESSURE in mm Hg
6
Illustrative Example: Weight Change and Heart Disease
- Goal: to determine the effect of weight change on coronary heart disease (CHD) risk
- 115,818 women 30 to 55 years of age, free of CHD
- Body mass index (BMI, kg/m²) determined at entry to study
- Body weight determined as of age 18
- Subjects followed for 14 years
- Number of CHD onsets (fatal and nonfatal) counted (1292 cases)
- Source: Willett et al., 1995
7
Illustrative Example (cont.)
Variables
- Categorical: smoker or nonsmoker; family history of heart disease (yes or no)
- Ordinal: non-smoker, light smoker, moderate smoker, heavy smoker
- Quantitative: BMI (kg/m²); age (years); weight presently; weight at age 18
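The three measurement scales can be made concrete with a short Python sketch. The variable names (SMOKER, FAMILY_HISTORY, SMOKE_LEVEL, BMI, AGE) and the example weight and height are hypothetical, chosen only for illustration; they are not data from the Willett study.

```python
# Hypothetical illustration of the three measurement scales.
# Names and values are invented for this sketch, not taken from the study.

# Ordinal: ranked categories; the codes record order, not equal spacing.
SMOKE_LEVELS = ["non-smoker", "light smoker", "moderate smoker", "heavy smoker"]
SMOKE_RANK = {level: rank for rank, level in enumerate(SMOKE_LEVELS)}

# Measurement scale assigned to each variable.
SCALE = {
    "SMOKER": "categorical",
    "FAMILY_HISTORY": "categorical",
    "SMOKE_LEVEL": "ordinal",
    "BMI": "quantitative",
    "AGE": "quantitative",
}


def smokes_more(a: str, b: str) -> bool:
    """Ordinal values can be ranked: True if level a ranks above level b."""
    return SMOKE_RANK[a] > SMOKE_RANK[b]


def bmi(weight_kg: float, height_m: float) -> float:
    """Quantitative value: body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


print(smokes_more("heavy smoker", "light smoker"))  # ranking is meaningful
print(round(bmi(70.0, 1.75), 1))                    # arithmetic is meaningful
```

Categorical values support only equality checks, ordinal values add ranking, and quantitative values also support arithmetic such as differences and means.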
8
Variable, Value, Observation
- Observation: the unit upon which measurements are made
  - Can be an individual (e.g., a person)
  - Can be an aggregate of individuals (e.g., a region)
- Variable: the generic thing we measure
  - e.g., AGE of a person
  - e.g., HIV status of a person
- Value: a realized measurement
  - e.g., “27”
  - e.g., “positive”
9
Data Structure (Forms)
Data Collection Form
  Var1 (ID): 1
  Var2 (AGE): 27
  Var3 (SEX): F
  Var4 (HIV): Y
  Var5 (KAPOSISARC): Y
  Var6 (REPORTDATE): 4/25/89
  Var7 (OPPORTUNIS): N
One form per observation (Observation 1, Observation 2, Observation 3, Observation 4)
10
U.S. Census Form
11
Data Structure (Table)

ID  AGE  SEX  HIV  KAPOSISARC  REPORTDATE  OPPORTUNIS
--  ---  ---  ---  ----------  ----------  ----------
1   27   F    Y    Y           04/25/89    N
2   30   F    Y    N           09/11/89    Y
3   21   F    Y    Y           01/12/89    N
4   30   M    Y    Y           10/08/89    Y

Observations → rows
Variables → columns
Values → cells
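The row/column/cell layout maps directly onto a list of records. The sketch below stores the four observations from the table as Python dictionaries; the choice of a list of dicts is an assumption made for illustration, not the format used in the course labs.

```python
# The four observations from the table, one dict per row (observation).
observations = [
    {"ID": 1, "AGE": 27, "SEX": "F", "HIV": "Y", "KAPOSISARC": "Y",
     "REPORTDATE": "04/25/89", "OPPORTUNIS": "N"},
    {"ID": 2, "AGE": 30, "SEX": "F", "HIV": "Y", "KAPOSISARC": "N",
     "REPORTDATE": "09/11/89", "OPPORTUNIS": "Y"},
    {"ID": 3, "AGE": 21, "SEX": "F", "HIV": "Y", "KAPOSISARC": "Y",
     "REPORTDATE": "01/12/89", "OPPORTUNIS": "N"},
    {"ID": 4, "AGE": 30, "SEX": "M", "HIV": "Y", "KAPOSISARC": "Y",
     "REPORTDATE": "10/08/89", "OPPORTUNIS": "Y"},
]

# An observation is a row: everything recorded for one unit.
third_observation = observations[2]

# A variable is a column: the same key read across every observation.
ages = [row["AGE"] for row in observations]

# A value is a cell: one variable for one observation.
first_age = observations[0]["AGE"]

print(third_observation)
print(ages)
print(first_age)
```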
12
Illustrative Example: Cigarette Consumption and Lung Cancer
- Note: the units of observation in this data set are regions (not people)
- Variables:
  - country = name of country/region
  - cig1930 = per capita cigarette consumption, 1930
  - mortalit = lung cancer deaths per 100,000 in 1950
13
Data Quality
- An analysis is only as good as its data
- GIGO ≡ garbage in, garbage out
- Does a variable measure what it purports to?
  - Validity = freedom from systematic error
  - Objectivity = seeing things as they are without making them conform to your worldview
- Discussion on avoiding bias when questioning, e.g., consider the word “jam”
14
Ethos: Which do you choose?
- The difference is intention and method: BS has a predetermined outcome. Truth is earnest in its intent and does not bend the facts to a predetermined outcome.
- Blackburn, S. (2005). Truth. Oxford Univ. Press
- Frankfurt, H. G. (2005). On Bullshit. Princeton University Press
15
Truth Versus Perception
- “I cannot give any scientist of any age any better advice than this: The intensity of the conviction that a hypothesis is true has no bearing on whether it is true or not.” Peter Medawar (1915–1987)
- Plato’s Allegory of the Cave: We observe shadows on the wall. The truth lies outside.
16
Two Types of Statistical Studies
- Surveys: quantify population characteristics
  - e.g., % of population that is overweight
  - e.g., expected life span
- Comparative studies: determine relationships between variables
  - e.g., relationship between weight gain and heart disease risk
  - e.g., relationship between alcohol consumption and esophageal cancer risk
- We start by considering survey sampling
17
Sampling for a Survey
- We seldom (if ever) study an entire population
- Take a subset (sample) of the population
- Use characteristics of the sample to infer population characteristics
- Select a probability sample: chance determines which individuals are selected
- Avoid non-probability samples
  - Discuss volunteer bias as an example
18
Simple Random Sample (SRS)
- SRS (definition) = every possible sample of size n from the population has the same probability of being selected
  - This is the most basic type of probability sample
- SRSs have sampling independence: selection of one individual does not influence selection of any other
- SRSs can be done with replacement or without replacement (both methods are usually valid)
- Sampling fraction = n ÷ N = probability of selection, where n = sample size and N = population size
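For a very small population the SRS definition can be checked by brute force. The sketch below assumes a hypothetical population of N = 5 individuals and a sample size of n = 2, drawn without replacement; it lists every possible sample, treats each as equally likely, and confirms that each individual's chance of selection equals the sampling fraction n ÷ N.

```python
from fractions import Fraction
from itertools import combinations

N = 5  # population size (hypothetical)
n = 2  # sample size (hypothetical)
population = range(1, N + 1)

# Every possible sample of size n, drawn without replacement.
all_samples = list(combinations(population, n))

# Under an SRS, each of these samples has the same probability.
prob_each_sample = Fraction(1, len(all_samples))


def inclusion_probability(individual: int) -> Fraction:
    """Probability that a given individual ends up in the sample."""
    containing = sum(1 for sample in all_samples if individual in sample)
    return containing * prob_each_sample


sampling_fraction = Fraction(n, N)

for individual in population:
    # Each line shows the individual, its selection probability, and n / N.
    print(individual, inclusion_probability(individual), sampling_fraction)
```

Exhaustive enumeration is only practical for toy populations; it is shown here to illustrate the definition, not as a way to draw a sample.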
19
SRS Method
- Compile census listing (sampling frame), with individuals numbered 1, 2, ..., N
- Generate n random numbers between 1 and N
  - Can be done with a random number generator (lab) or with a table of random digits
- Select individuals based on the random number list
- You will take an SRS in lab this week
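The steps above can be sketched with Python's standard `random` module. This is a minimal illustration, not the procedure used in the lab: the function name `simple_random_sample`, the 600-person frame, the sample size of 6, and the seed are all hypothetical. The sketch samples without replacement, so no individual is selected twice.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw an SRS of size n, without replacement, from a sampling frame."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    N = len(frame)
    # Individuals are numbered 1..N; pick n distinct numbers in that range.
    numbers = rng.sample(range(1, N + 1), n)
    # Map each selected number back to the individual it identifies.
    return [frame[k - 1] for k in sorted(numbers)]


# Hypothetical sampling frame of N = 600 individuals.
frame = [f"person_{i:03d}" for i in range(1, 601)]

sample = simple_random_sample(frame, n=6, seed=167)
print(sample)
print("sampling fraction:", len(sample) / len(frame))
```

To sample with replacement instead, `rng.choices(range(1, N + 1), k=n)` could be used in place of `rng.sample`, at the cost of possible repeats.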