Average values

Presentation transcript:

Average values


Relative Values
Processing statistical data on disease, mortality, lethality, etc. yields absolute numbers, which give the count of each phenomenon. Absolute numbers have some informative value, but their use is limited.

To characterize the level of a phenomenon, or to compare a parameter over time or with that of another territory, it is necessary to calculate relative values (parameters, factors), which are ratios of statistical numbers to one another. The basic arithmetic operation in calculating relative values is division.

In medical statistics the following kinds of relative parameters are used:
- Extensive;
- Intensive;
- Relative intensity;
- Visualization;
- Correlation.

The extensive parameter, or parameter of distribution, characterizes the parts of a phenomenon (its structure): it shows what share of the total number of cases (diseases or deaths) is made up by a particular disease included in that total.

The role of Biostatisticians
Biostatisticians play essential roles in designing studies, analyzing data and creating methods to attack research problems as diverse as:
- determination of major risk factors for heart disease, lung disease and cancer
- testing of new drugs to combat AIDS
- evaluation of potential environmental factors harmful to human health, such as tobacco smoke, asbestos or pollutants

Applications of Biostatistics
- Public health, including epidemiology, health services research, nutrition, and environmental health
- Design and analysis of clinical trials in medicine
- Genomics, population genetics, and statistical genetics in populations, in order to link variation in genotype with variation in phenotype. This has been used in agriculture to improve crops and farm animals. In biomedical research, this work can assist in finding candidate gene alleles that can cause or influence predisposition to disease in human genetics
- Ecology
- Biological sequence analysis

Applications of Biostatistics
Statistical methods are beginning to be integrated into:
- medical informatics
- public health informatics
- bioinformatics

Types of Data
- Categorical data: values belong to categories
  - Nominal data: there is no natural order to the categories, e.g. blood groups
  - Ordinal data: there is a natural order, e.g. adverse events (mild/moderate/severe/life-threatening)
  - Binary data: there are only two possible categories, e.g. alive/dead
- Numerical data: the value is a number (either measured or counted)
  - Continuous data: measurement is on a continuum, e.g. height, age, haemoglobin
  - Discrete data: a "count" of events, e.g. number of pregnancies

Measures of Frequency of Events
- Incidence: the number of new events (e.g. death or a particular disease) that occur during a specified period of time in a population at risk of developing the events.
- Incidence rate: a term related to incidence that reports the number of new events divided by the sum of the time individuals in the population were at risk of having the event (e.g. events/person-years).
- Prevalence: the number of persons in the population affected by a disease at a specific time divided by the number of persons in the population at that time.
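The incidence rate and prevalence defined above are simple ratios. The sketch below shows the arithmetic; the function names and all figures are hypothetical and chosen only for illustration.

```python
def incidence_rate(new_events: int, person_years: float) -> float:
    """New events divided by the total person-time at risk."""
    return new_events / person_years


def prevalence(existing_cases: int, population: int) -> float:
    """Proportion of the population affected at one point in time."""
    return existing_cases / population


# Hypothetical figures: 12 new cases over 4000 person-years of follow-up,
# and 150 existing cases in a population of 10,000.
rate = incidence_rate(new_events=12, person_years=4000.0)
prev = prevalence(existing_cases=150, population=10_000)

print(f"Incidence rate: {rate * 1000:.1f} per 1000 person-years")
print(f"Prevalence: {prev * 100:.1f}%")
```

Note that the incidence rate has person-time in the denominator, whereas prevalence is a dimensionless proportion.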

Measures of Association
- Relative risk and cohort studies: the relative risk (or risk ratio) is defined as the incidence of disease in the exposed group divided by the corresponding incidence of disease in the unexposed group.
- Odds ratio and case-control studies: the odds ratio is defined as the odds of exposure in the group with disease divided by the odds of exposure in the control group.
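Both measures can be read off a 2×2 table. The sketch below assumes the usual cell labelling (a, b, c, d), which is not given in the slides, and uses made-up counts.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """Incidence in the exposed divided by incidence in the unexposed.

    a: exposed, diseased      b: exposed, not diseased
    c: unexposed, diseased    d: unexposed, not diseased
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds of exposure among the diseased (a/c) divided by
    odds of exposure among the controls (b/d)."""
    return (a / c) / (b / d)


# Hypothetical counts, for illustration only.
a, b, c, d = 30, 70, 10, 90
rr = relative_risk(a, b, c, d)
odds = odds_ratio(a, b, c, d)

print(f"Relative risk: {rr:.2f}")
print(f"Odds ratio: {odds:.2f}")
```

With these counts the odds ratio is larger than the relative risk, which is typical when the disease is not rare in the exposed group.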

Measures of Association
- Absolute risk: the relative risk and odds ratio provide a measure of risk compared with a standard, whereas the following measures express risk on an absolute scale.
- Attributable risk, or risk difference, is a measure of absolute risk. It represents the excess risk of disease in those exposed, taking into account the background rate of disease. The attributable risk is defined as the difference between the incidence rates in the exposed and non-exposed groups.
- Population attributable risk describes the excess rate of disease in the total study population of exposed and non-exposed individuals that is attributable to the exposure.
- Number needed to treat (NNT): the number of patients who would need to be treated to prevent one adverse outcome. It is often used to present the results of randomized trials.
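A minimal sketch of these absolute measures follows. The slides define attributable risk as a difference of incidence rates; the formulas used here for population attributable risk (incidence in the whole population minus incidence in the unexposed) and for NNT (the reciprocal of the absolute risk reduction) are the common textbook forms and are assumptions, as are all of the numbers.

```python
def attributable_risk(incidence_exposed: float, incidence_unexposed: float) -> float:
    """Risk difference: excess incidence among the exposed."""
    return incidence_exposed - incidence_unexposed


def population_attributable_risk(incidence_total: float, incidence_unexposed: float) -> float:
    """Excess incidence in the whole population (assumed textbook form)."""
    return incidence_total - incidence_unexposed


def number_needed_to_treat(control_event_rate: float, treated_event_rate: float) -> float:
    """Reciprocal of the absolute risk reduction (assumed textbook form)."""
    risk_reduction = control_event_rate - treated_event_rate
    if risk_reduction <= 0:
        raise ValueError("treatment does not reduce the event rate")
    return 1.0 / risk_reduction


# Hypothetical incidences: 30% in the exposed, 10% in the unexposed,
# 20% in the population as a whole.
ar = attributable_risk(0.30, 0.10)
par = population_attributable_risk(0.20, 0.10)

# Hypothetical trial: 20% of controls and 15% of treated patients have the outcome.
nnt = number_needed_to_treat(0.20, 0.15)

print(f"Attributable risk: {ar:.2f}")
print(f"Population attributable risk: {par:.2f}")
print(f"Number needed to treat: {nnt:.0f}")
```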

Terms Used To Describe The Quality Of Measurements
- Reliability is the variability between subjects divided by the sum of the between-subject variability and the measurement error.
- Validity refers to the extent to which a test or surrogate measures what we think it is measuring.

Measures Of Diagnostic Test Accuracy
- Sensitivity is defined as the ability of the test to identify correctly those who have the disease.
- Specificity is defined as the ability of the test to identify correctly those who do not have the disease.
- Predictive values are important for assessing how useful a test will be in the clinical setting at the individual patient level. The positive predictive value is the probability of disease in a patient with a positive test. Conversely, the negative predictive value is the probability that the patient does not have the disease given a negative test result.
- The likelihood ratio indicates how much a given diagnostic test result will raise or lower the odds of having a disease relative to the prior probability of disease.
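These accuracy measures can all be computed from the four cells of a test-versus-disease table. The sketch below uses the standard formulas; the function name, the cell names (tp, fp, fn, tn) and the counts are hypothetical.

```python
def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy measures from true/false positives and negatives."""
    sensitivity = tp / (tp + fn)   # diseased patients correctly identified
    specificity = tn / (tn + fp)   # non-diseased patients correctly identified
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # probability of disease given a positive test
        "npv": tn / (tn + fn),     # probability of no disease given a negative test
        "lr_positive": sensitivity / (1.0 - specificity),
        "lr_negative": (1.0 - sensitivity) / specificity,
    }


# Hypothetical study: 100 diseased and 200 non-diseased patients.
measures = diagnostic_accuracy(tp=90, fp=30, fn=10, tn=170)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common the disease is in the group tested, so they would change if the same test were applied to a population with a different prevalence.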


Expressions Used When Making Inferences About Data
- Confidence intervals: the results of any study sample are an estimate of the true value in the entire population. The true value may actually be greater or less than what is observed.
- Type I error (alpha) is the probability of incorrectly concluding there is a statistically significant difference in the population when none exists.
- Type II error (beta) is the probability of incorrectly concluding that there is no statistically significant difference in a population when one exists.
- Power is a measure of the ability of a study to detect a true difference.

Kaplan-Meier Survival Curves

Why Use Statistics?

Percentage of Specimens Testing Positive for RSV (respiratory syncytial virus)

Descriptive Statistics

Distribution of Course Grades

The Normal Distribution
- Mean = median = mode
- Skew is zero
- 68% of values fall within 1 SD of the mean
- 95% of values fall within 2 SDs of the mean
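The 68% and 95% figures can be checked numerically. For a normal distribution, the fraction of values within k standard deviations of the mean equals erf(k/√2); the short sketch below evaluates this with Python's standard library (the function name is hypothetical).

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2):
    print(f"Within {k} SD: {fraction_within(k) * 100:.1f}%")
```

The exact values are slightly above the rounded slide figures, about 68.3% and 95.4%.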

Hypertension Trial

30 Day % Mortality

95% Confidence Intervals

Types of Errors (Truth versus Conclusion). Power = 1 − β

Using this parameter, it is possible to determine the structure of patients by age, social status, etc. It is customary to express this parameter as a percentage, but it can also be calculated in parts per thousand when the share of the given disease is small and the percentage would be a decimal fraction rather than a whole number.

The general formula for its calculation is the following:

\[ \frac{\text{part}}{\text{total}} \times 100 \]
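As a quick illustration of the part/total formula, the sketch below computes the structure of a set of cases in percent. The disease categories and counts are invented for the example.

```python
def extensive_parameters(counts: dict) -> dict:
    """Share of each part in the total, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * part / total for name, part in counts.items()}


# Hypothetical numbers of registered cases by disease group.
cases = {"respiratory": 50, "circulatory": 30, "other": 20}
structure = extensive_parameters(cases)

for name, share in structure.items():
    print(f"{name}: {share:.1f}%")
```

The shares always add up to 100%, which is why an extensive parameter describes structure only and says nothing about how frequent the disease is in the population.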

- The intensive parameter characterizes frequency, or how widespread a phenomenon is.
- It shows how frequently the given phenomenon occurs in the given environment.
- For example, how frequently a particular disease occurs in the population, or how frequently people die from a particular disease.
- To calculate the intensive parameter, it is necessary to know the size of the population or of the contingent.

The general formula for the calculation is the following:

\[ \frac{\text{phenomenon}}{\text{environment}} \times 100 \quad (\text{or } 1000;\ \ldots) \]

\[ \text{General mortality rate} = \frac{\text{number of deaths during the year}}{\text{size of the population}} \times 1000 \]
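The same calculation as a short sketch. The function name and the `base` argument (the multiplier, 1000 for the general mortality rate) are hypothetical, as are the population figures.

```python
def intensive_parameter(phenomenon: int, environment: int, base: int = 1000) -> float:
    """Frequency of a phenomenon per `base` units of the environment."""
    return phenomenon * base / environment


# Hypothetical town: 1200 deaths during the year in a population of 100,000.
mortality = intensive_parameter(phenomenon=1200, environment=100_000, base=1000)
print(f"General mortality rate: {mortality:.1f} per 1000 population")
```

Unlike the extensive parameter, the denominator here is the environment in which the phenomenon arises (the population), not the total of the phenomenon itself.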

- Parameters of relative intensity are numerical ratios of two or more structures composed of the same elements of the set under study.
- They show the degree of correspondence (excess or deficit) of similar attributes and serve as an auxiliary technique when direct intensive parameters cannot be obtained, or when the degree of disproportion in the structure of two or more related processes must be measured.

- The parameter of correlation characterizes the relation between values of different kinds.
- For example, the parameter of average bed occupancy, of nursing staff, etc.
- The technique for calculating the correlation parameter is the same as for the intensive parameter. However, in an intensive parameter the quantity in the numerator is included in the denominator, whereas in a correlation parameter the numerator and denominator are different quantities.

- The parameter of visualization characterizes the ratio of each of the compared values to the initial level, which is taken as 100. It is used for convenience of comparison and to show the direction of a process (increase or decrease) without showing the level or the absolute size of the phenomenon.
- It can be used to characterize the dynamics of phenomena, to compare separate territories or different population groups, and to construct graphs.

SIMULATION
Consider a box containing chips or cards, each of which is numbered either 0 or 1. We want to take a sample from this box in order to estimate the percentage of the cards that are numbered with a 1. The population in this case is the box of cards, which we will call the population box. The percentage of cards in the box that are numbered with a 1 is the parameter π.

In the Harris study the parameter π is unknown. Here, however, in order to see how samples behave, we will make our model with a known percentage of cards numbered with a 1, say π = 60%. At the same time we will estimate π, pretending that we don’t know its value, by examining 25 cards in the box.

We take a simple random sample with replacement of 25 cards from the box as follows. Mix the box of cards; choose one at random; record it; replace it; and then repeat the procedure until we have recorded the numbers on 25 cards. Although survey samples are not generally drawn with replacement, our simulation simplifies the analysis because the box remains unchanged between draws; so, after examining each card, the chance of drawing a card numbered 1 on the following draw is the same as it was for the previous draw, in this case a 60% chance.

Let’s say that after drawing the 25 cards this way, we obtain the following results, recorded in 5 rows of 5 numbers:

Based on this sample of 25 draws, we want to guess the percentage of 1’s in the box. There are 14 cards numbered 1 in the sample. This gives us a sample percentage of p = 14/25 = 0.56 = 56%. If this is all of the information we have about the population box, and we want to estimate the percentage of 1’s in the box, our best guess would be 56%. Notice that this sample value p = 56% is 4 percentage points below the true population value π = 60%. We say that the random sampling error (or simply random error) is -4%.
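The box model is easy to try on a computer. The sketch below draws 25 cards with replacement from a box in which 60% of the cards are numbered 1, then reports the sample percentage and the random sampling error. The function names and the seed are arbitrary choices, and a different seed will give a different sample.

```python
import random


def draw_sample(pi_percent: float, n: int, rng: random.Random) -> list:
    """Draw n cards with replacement; each is 1 with probability pi_percent/100."""
    return [1 if rng.random() < pi_percent / 100.0 else 0 for _ in range(n)]


def sample_percentage(cards: list) -> float:
    """Percentage of cards in the sample that are numbered 1."""
    return 100.0 * sum(cards) / len(cards)


PI = 60.0                    # known population percentage in the model
rng = random.Random(2024)    # arbitrary seed, only for reproducibility

cards = draw_sample(PI, 25, rng)
p = sample_percentage(cards)
print(f"Sample percentage p = {p:.0f}%, random sampling error = {p - PI:+.0f}%")

# The worked example from the text: 14 ones among 25 cards.
example = [1] * 14 + [0] * 11
print(f"Worked example: p = {sample_percentage(example):.0f}%, "
      f"error = {sample_percentage(example) - PI:+.0f}%")
```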

ERROR ANALYSIS
An experiment is a procedure which results in a measurement or observation. The Harris poll is an experiment which resulted in the measurement (statistic) of 57%. An experiment whose outcome depends upon chance is called a random experiment.

On repetition of such an experiment one will typically obtain a different measurement or observation. So, if the Harris poll were to be repeated, the new statistic would very likely differ slightly from 57%. Each repetition is called an execution or trial of the experiment.

Suppose we made three more series of draws, and the results were +16%, +0%, and +12%. The random sampling errors of the four simulations would then average out to:

\[ \frac{(-4\%) + 16\% + 0\% + 12\%}{4} = 6\% \]

Note that the cancellation of the positive and negative random errors results in a small average. Actually, with more trials, the average of the random sampling errors tends to zero.

So in order to measure a “typical size” of a random sampling error, we have to ignore the signs. We could just take the mean of the absolute values (MA) of the random sampling errors. For the four random sampling errors above, the MA turns out to be

\[ \mathrm{MA} = \frac{|-4\%| + |16\%| + |0\%| + |12\%|}{4} = 8\% \]

The MA is difficult to deal with theoretically because the absolute value function is not differentiable at 0. So in statistics, and error analysis in general, the root mean square (RMS) of the random sampling errors is generally used. For the four random sampling errors above, the RMS is

\[ \mathrm{RMS} = \sqrt{\frac{(-4\%)^2 + (16\%)^2 + (0\%)^2 + (12\%)^2}{4}} = \sqrt{104}\,\% \approx 10.2\% \]

The RMS is a more conservative measure of the typical size of the random sampling errors in the sense that MA ≤ RMS.
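The three summaries discussed above (plain average, MA and RMS) can be compared directly. This sketch takes the four random sampling errors as they are stated in the text, -4, +16, 0 and +12 percentage points; the variable names are arbitrary.

```python
import math

# Random sampling errors of the four simulations, in percentage points.
errors = [-4.0, 16.0, 0.0, 12.0]

mean_error = sum(errors) / len(errors)                    # signs partly cancel
ma = sum(abs(e) for e in errors) / len(errors)            # mean of absolute values
rms = math.sqrt(sum(e * e for e in errors) / len(errors)) # root mean square

print(f"Average error: {mean_error:.1f}%")
print(f"MA:            {ma:.1f}%")
print(f"RMS:           {rms:.1f}%")
```

Squaring gives large errors more weight than small ones, which is why the RMS can never be smaller than the MA.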

For a given experiment the RMS of all possible random sampling errors is called the standard error (SE). For example, whenever we use a random sample of size n and its percentage p to estimate the population percentage π, we have

\[ \mathrm{SE} = \sqrt{\frac{\pi\,(100\% - \pi)}{n}} \]
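As a rough check, the standard error of a sample percentage, taken here in its standard form √(π(100% − π)/n), can be compared with the RMS error observed over many simulated samples from the box model. The function names, the seed and the number of trials are arbitrary assumptions of this sketch.

```python
import math
import random


def standard_error(pi_percent: float, n: int) -> float:
    """Standard error of a sample percentage, in percentage points."""
    return math.sqrt(pi_percent * (100.0 - pi_percent) / n)


def simulated_rms_error(pi_percent: float, n: int, trials: int,
                        rng: random.Random) -> float:
    """RMS of the random sampling errors over many simulated samples."""
    total_squared = 0.0
    for _ in range(trials):
        ones = sum(1 for _ in range(n) if rng.random() < pi_percent / 100.0)
        p = 100.0 * ones / n
        total_squared += (p - pi_percent) ** 2
    return math.sqrt(total_squared / trials)


se = standard_error(60.0, 25)
rms = simulated_rms_error(60.0, 25, trials=20_000, rng=random.Random(1))

print(f"Theoretical SE: {se:.2f}%")
print(f"Simulated RMS:  {rms:.2f}%")
```

For π = 60% and n = 25 the formula gives √96 ≈ 9.8%, and the simulated RMS should land close to that value.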

Dynamic analysis
- The health of people and the activity of medical establishments change over time.
- Studying the dynamics of these phenomena is very important for analysing the state of health and the activity of the public health system.

Example of a dynamic line: Year, Bed occupancy (days)

Parameters applied for analysis of changes of a phenomenon
- Rate of growth: the ratio of each number of the dynamic line to the previous level, which is taken as 100%.
- Pure gain: the difference between the next and the previous number of the dynamic line.
- Rate of gain: the ratio of the pure gain to the previous number.
- Parameter of visualization: the ratio of each number of the dynamic line to the first level, which is taken as 100%.
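The four parameters can be computed together for any dynamic line. In this sketch the bed-occupancy values are invented, the function name is hypothetical, and the rate of gain is expressed in percent, which is an assumption because the slides give only the ratio.

```python
def dynamic_line_parameters(levels: list) -> dict:
    """Parameters of change for a dynamic line (a series of levels over time)."""
    pairs = list(zip(levels[:-1], levels[1:]))   # (previous, next) levels
    return {
        # each level relative to the previous one, previous = 100%
        "rate_of_growth": [100.0 * cur / prev for prev, cur in pairs],
        # absolute difference between next and previous level
        "pure_gain": [cur - prev for prev, cur in pairs],
        # pure gain relative to the previous level, in percent (assumed unit)
        "rate_of_gain": [100.0 * (cur - prev) / prev for prev, cur in pairs],
        # each level relative to the first one, first = 100%
        "visualization": [100.0 * level / levels[0] for level in levels],
    }


# Hypothetical average bed occupancy (days) in three consecutive years.
occupancy = [320.0, 336.0, 328.0]
params = dynamic_line_parameters(occupancy)

for name, values in params.items():
    print(name, [round(v, 1) for v in values])
```

The rate of growth and the rate of gain always differ by exactly 100 percentage points, since both compare a level with the one before it.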
