Lecture 3 Biostatistics in practice of health protection


Biostatistics Commonly the word statistics means the arranging of data into charts, tables, and graphs along with the computations of various descriptive numbers about the data. This is a part of statistics, called descriptive statistics, but it is not the most important part.

SAMPLING AND ESTIMATION Let us take Louis Harris and Associates as an example. It conducts polls on various topics, either face-to-face, by telephone, or over the internet. In one survey on health trends of adult Americans, conducted in 1991, it contacted 1,256 randomly selected adults by phone and asked them questions about diet, stress management, seat belt use, etc.

SAMPLING AND ESTIMATION One of the questions asked was “Do you try hard to avoid too much fat in your diet?” They reported that 57% of the people responded YES to this question, which was a 2% increase from a similar survey conducted in 1983. The article stated that the margin of error of the study was plus or minus 3%.

SAMPLING AND ESTIMATION This is an example of an inference made from incomplete information. The group under study in this survey is the collection of adult Americans, which consists of more than 200 million people. This is called the population.

SAMPLING AND ESTIMATION If every individual of this group were to be queried, the survey would be called a census. Yet of the millions in the population, the Harris survey examined only 1,256 people. Such a subset of the population is called a sample.

SAMPLING AND ESTIMATION Once every ten years the U.S. Census Bureau conducts a survey of the entire U.S. population. The year 2000 census cost the government billions of dollars. For the purposes of following health trends, it is not practical to conduct a census. It would be too expensive, too time-consuming, and too intrusive into people’s lives.

SAMPLING AND ESTIMATION We shall see that, if done carefully, 1,256 people are sufficient to make reasonable estimates of the opinion of all adult Americans. Samuel Johnson was aware that there is useful information in a sample. He said that you don’t have to eat the whole ox to know that the meat is tough.

SAMPLING AND ESTIMATION The people or things in a population are called units. If the units are people, they are sometimes called subjects. A characteristic of a unit (such as a person’s weight, eye color, or the response to a Harris Poll question) is called a variable.

SAMPLING AND ESTIMATION A number derived from a sample is called a statistic, whereas a number derived from the population is called a parameter.

SAMPLING AND ESTIMATION Parameters are usually denoted by Greek letters, such as π for the population percentage of a dichotomous variable, or μ for the population mean of a quantitative variable. For the Harris study the sample percentage p = 57% is a statistic. It is not the (unknown) population percentage π, which is the percentage that we would obtain if it were possible to ask the same question of the entire population.


SAMPLING AND ESTIMATION Inferences we make about a population based on facts derived from a sample are uncertain. The statistic p is not the same as the parameter π. In fact, if the study had been repeated, even if it had been done at about the same time and in the same way, it most likely would have produced a different value of p, whereas π would still be the same. The Harris study acknowledges this variability by mentioning a margin of error of ± 3%.
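The point that p changes from poll to poll while π stays fixed can be illustrated with a small simulation. This is only a sketch: the population percentage PI = 57 is a made-up value chosen for illustration (the true π is unknown), and the function name `sample_percentage` is hypothetical.

```python
import random


def sample_percentage(pi_percent, n, rng):
    """Poll n randomly chosen units; return the sample percentage p of YES answers."""
    yes = sum(1 for _ in range(n) if rng.random() < pi_percent / 100)
    return 100 * yes / n


rng = random.Random(0)   # fixed seed so the run is repeatable
PI = 57.0                # hypothetical population percentage (unknown in a real poll)
N = 1256                 # sample size of the Harris survey

# Repeat the "same" poll five times: pi never changes, p does.
polls = [sample_percentage(PI, N, rng) for _ in range(5)]
for i, p in enumerate(polls, start=1):
    print(f"poll {i}: p = {p:.1f}%   (pi = {PI}%)")
```

Each simulated poll should land within a few percentage points of PI, but rarely on exactly the same value twice.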

ERROR ANALYSIS An experiment is a procedure which results in a measurement or observation. The Harris poll is an experiment which resulted in the measurement (statistic) of 57%. An experiment whose outcome depends upon chance is called a random experiment.

ERROR ANALYSIS On repetition of such an experiment one will typically obtain a different measurement or observation. So, if the Harris poll were to be repeated, the new statistic would very likely differ slightly from 57%. Each repetition is called an execution or trial of the experiment.

ERROR ANALYSIS Suppose we made three more series of draws, and the results were + 16%, + 0%, and + 12%. The random sampling errors of the four simulations would then average out to:

ERROR ANALYSIS Note that the cancellation of the positive and negative random errors results in a small average. Actually with more trials, the average of the random sampling errors tends to zero.
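The cancellation of positive and negative errors can be checked by simulation. The sketch below assumes a hypothetical population with π = 50% and samples of size 25; neither number comes from the lecture, and `random_sampling_error` is an illustrative name.

```python
import random


def random_sampling_error(pi_percent, n, rng):
    """Return p - pi (in percentage points) for one random sample of size n."""
    yes = sum(1 for _ in range(n) if rng.random() < pi_percent / 100)
    return 100 * yes / n - pi_percent


rng = random.Random(1)   # fixed seed so the run is repeatable
PI = 50.0                # hypothetical population percentage
N = 25                   # hypothetical (small) sample size, so errors are large

errors = [random_sampling_error(PI, N, rng) for _ in range(4000)]

mean_first_4 = sum(errors[:4]) / 4          # average over only four trials
mean_all = sum(errors) / len(errors)        # average over all 4000 trials

print(f"mean of first 4 errors:  {mean_first_4:+.2f} points")
print(f"mean of all 4000 errors: {mean_all:+.2f} points")
```

Individual errors here are typically around 10 points in size, yet their average over many trials sits close to zero, which is why the plain average says little about the typical size of an error.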

ERROR ANALYSIS So in order to measure a “typical size” of a random sampling error, we have to ignore the signs. We could just take the mean of the absolute values (MA) of the random sampling errors. For the four random sampling errors above, the MA turns out to be

ERROR ANALYSIS The MA is difficult to deal with theoretically because the absolute value function is not differentiable at 0. So in statistics, and error analysis in general, the root mean square (RMS) of the random sampling errors is generally used. For the four random sampling errors above, the RMS is

ERROR ANALYSIS The RMS is a more conservative measure of the typical size of the random sampling errors in the sense that MA ≤ RMS.
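The three summaries (plain mean, MA, and RMS) can be compared side by side. The four error values below are hypothetical and are not the ones from the lecture's simulations; they are chosen only so the arithmetic is easy to follow.

```python
import math

# Hypothetical random sampling errors, in percentage points.
errors = [-10.0, 6.0, 0.0, 8.0]

# Plain mean: signs cancel, so the result understates the typical size.
mean_error = sum(errors) / len(errors)

# MA: mean of the absolute values.
ma = sum(abs(e) for e in errors) / len(errors)

# RMS: square each error, take the mean, then the square root.
rms = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"mean = {mean_error:.2f}")   # 1.00
print(f"MA   = {ma:.2f}")           # 6.00
print(f"RMS  = {rms:.2f}")          # 7.07
```

For these values MA = 24/4 = 6 and RMS = √(200/4) = √50 ≈ 7.07, so MA ≤ RMS as stated. Squaring gives the larger errors more weight, which is what makes the RMS the more conservative of the two.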

ERROR ANALYSIS For a given experiment the RMS of all possible random sampling errors is called the standard error (SE). For example, whenever we use a random sample of size n and its percentage p to estimate the population percentage π, we have SE = √(π(100% − π)/n).
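As a rough check against the Harris poll, the sketch below uses the usual standard-error formula for a percentage from a simple random sample, SE = √(π(100 − π)/n) in percentage points. Two assumptions are made that the lecture does not state: the sample percentage p = 57% is plugged in for the unknown π, and twice the SE is taken as the margin of error (a common convention). The function name `standard_error` is illustrative.

```python
import math


def standard_error(pi_percent, n):
    """SE of a sample percentage, in percentage points, for a random sample of size n."""
    return math.sqrt(pi_percent * (100 - pi_percent) / n)


n = 1256    # Harris poll sample size
p = 57.0    # sample percentage, used here in place of the unknown pi (assumption)

se = standard_error(p, n)
margin = 2 * se   # assumed convention: margin of error of about two standard errors

print(f"SE     = {se:.2f} percentage points")
print(f"margin = {margin:.2f} percentage points")
```

Under these assumptions the SE comes out near 1.4 points and the margin near 2.8 points, which is consistent with the reported margin of error of plus or minus 3%.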