

Average values and their types

Averages
Averages are widely used for comparisons over time, which make it possible to characterize the main regularities in the development of a phenomenon. For example, the regular pattern of growth in children of a given age is expressed in generalized indices of physical development. Regular patterns in the dynamics (increase or decrease) of pulse rate, breathing, and clinical parameters in certain diseases are expressed in statistical indices that represent the physiological parameters of the organism.

Average Values
- Mean: the average of the data; sensitive to outlying data
- Median: the middle of the data; not sensitive to outlying data
- Mode: the most commonly occurring value
- Range: the difference between the largest observation and the smallest
- Interquartile range: the spread of the middle 50% of the data; commonly used for skewed data
- Standard deviation: a single number that measures how much the observations vary around the mean
- Symmetrical data: data that follow a normal distribution (mean = median = mode); report the mean, standard deviation, and n
- Skewed data: not normally distributed (mean ≠ median ≠ mode); report the median and interquartile range
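The measures listed above can be sketched in a few lines of Python. This is only an illustration: the `describe` helper and the `stay_days` data are hypothetical, and the standard deviation shown is the sample version (n - 1 in the denominator), which is an assumption since the slide does not say which version it means.

```python
import statistics

def describe(values):
    """Return the summary measures from the slide for a list of numbers."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "mode": statistics.mode(ordered),
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
        "sd": statistics.stdev(ordered),  # sample SD, n - 1 denominator
    }

# Hypothetical right-skewed data, e.g. length of hospital stay in days.
stay_days = [2, 3, 3, 4, 5, 6, 30]
summary = describe(stay_days)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With skewed data like this, the single outlying value pulls the mean well above the median, which is why the slide recommends reporting the median and interquartile range in that case.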

Average Values
- The limit is given by the values of the extreme variants of a variation series: lim = Vmin to Vmax

Average Values
- The amplitude is the difference between the extreme variants of a variation series: Am = Vmax - Vmin

Average Values
- The average quadratic deviation (standard deviation) characterizes the dispersion of the variants around the mean value (the internal structure of the population).

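Average quadratic deviation: simple arithmetical method
σ = [formula not preserved in the transcript]

Average quadratic deviation
d = V - M is the true deviation of a variant from the true arithmetic mean.

Average quadratic deviation: method of moments
σ = i × [remainder of the formula not preserved in the transcript]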
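Because the slide formulas did not survive, the sketch below uses the usual textbook forms as an assumption, not as the authors' exact notation: the simple method as σ = sqrt(Σd²p / n) with d = V - M, and the method of moments as σ = i · sqrt(Σa²p / n - (Σap / n)²) with a = (V - A) / i, where A is an assumed (conditional) mean, i is the interval, and p is the frequency of each variant. The function names and the pulse-rate data are hypothetical.

```python
import math

def sigma_simple(variants, freqs):
    """Standard deviation from true deviations d = V - M."""
    n = sum(freqs)
    mean = sum(v * p for v, p in zip(variants, freqs)) / n
    squared = sum((v - mean) ** 2 * p for v, p in zip(variants, freqs))
    return math.sqrt(squared / n)

def sigma_moments(variants, freqs, assumed_mean, interval):
    """Standard deviation from conditional deviations a = (V - A) / i."""
    n = sum(freqs)
    a = [(v - assumed_mean) / interval for v in variants]
    first_moment = sum(ai * p for ai, p in zip(a, freqs)) / n
    second_moment = sum(ai ** 2 * p for ai, p in zip(a, freqs)) / n
    return interval * math.sqrt(second_moment - first_moment ** 2)

# Hypothetical variation series: pulse rate (beats/min) and frequencies.
pulse = [60, 64, 68, 72, 76]
freq = [2, 5, 8, 4, 1]

s1 = sigma_simple(pulse, freq)
s2 = sigma_moments(pulse, freq, assumed_mean=68, interval=4)
print(round(s1, 3), round(s2, 3))
```

Both routes give the same value; the method of moments only simplifies hand calculation. For small samples many textbooks divide by n - 1 instead of n, which this sketch does not do.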

Average quadratic deviation is needed for:
1. Estimating how typical the arithmetic mean is (M is typical for the series if σ is less than 1/3 of the mean).
2. Obtaining the error of the mean.
3. Determining the average norm of the phenomenon under study (M±1σ), the subnorm (M±2σ), and the extreme deviations (M±3σ).
4. Constructing a sigma grid for assessing the physical development of an individual.

Average quadratic deviation
The dispersion of the variants around the mean is characterized by the average quadratic deviation (σ).

- The coefficient of variation is the relative measure of variability; it is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage.
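As a small illustration, the coefficient of variation and the "σ less than 1/3 of the mean" typicality rule mentioned earlier can be written as follows. The function names and the example numbers are hypothetical.

```python
def coefficient_of_variation(sigma, mean):
    """Standard deviation as a percentage of the arithmetic mean."""
    return sigma / mean * 100

def mean_is_typical(sigma, mean):
    """Rule from the slides: M is typical if sigma is less than 1/3 of M."""
    return sigma < mean / 3

def norm_band(mean, sigma, k=1):
    """Interval M ± k·sigma (k=1 average norm, k=2 subnorm, k=3 extremes)."""
    return (mean - k * sigma, mean + k * sigma)

# Hypothetical series: mean pulse 68 beats/min, sigma 4 beats/min.
cv = coefficient_of_variation(4, 68)
print(f"CV = {cv:.1f}%")
print(mean_is_typical(4, 68), norm_band(68, 4))
```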

Terms Used To Describe The Quality Of Measurements
- Reliability is the between-subject variability divided by the sum of the between-subject variability and the measurement error.
- Validity refers to the extent to which a test or surrogate measures what we think it is measuring.

Measures Of Diagnostic Test Accuracy
- Sensitivity is defined as the ability of the test to identify correctly those who have the disease.
- Specificity is defined as the ability of the test to identify correctly those who do not have the disease.
- Predictive values are important for assessing how useful a test will be in the clinical setting at the individual patient level. The positive predictive value is the probability of disease in a patient with a positive test. Conversely, the negative predictive value is the probability that the patient does not have the disease given a negative test result.
- The likelihood ratio indicates how much a given diagnostic test result will raise or lower the odds of having a disease relative to the prior probability of disease.
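These definitions map directly onto the four cells of a 2×2 table (true positives, false positives, false negatives, true negatives). The sketch below is illustrative only; the function name and the counts are hypothetical.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table of test result vs. disease status."""
    sensitivity = tp / (tp + fn)   # diseased who test positive
    specificity = tn / (tn + fp)   # non-diseased who test negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # P(disease | positive test)
        "npv": tn / (tn + fn),     # P(no disease | negative test)
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }

# Hypothetical study: 100 diseased and 200 non-diseased subjects.
result = diagnostic_measures(tp=90, fp=30, fn=10, tn=170)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on how common the disease is in the sample, whereas sensitivity, specificity, and the likelihood ratios do not.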

Measures Of Diagnostic Test Accuracy

Expressions Used When Making Inferences About Data
- Confidence intervals: the results of any study sample are an estimate of the true value in the entire population. The true value may actually be greater or less than what is observed; a confidence interval gives the range around the estimate that is expected to contain it.
- Type I error (alpha) is the probability of incorrectly concluding that there is a statistically significant difference in the population when none exists.
- Type II error (beta) is the probability of incorrectly concluding that there is no statistically significant difference in a population when one exists.
- Power is a measure of the ability of a study to detect a true difference.

Multivariable Regression Methods
- Multiple linear regression is used when the outcome is a continuous variable such as weight. For example, one could estimate the effect of a diet on weight after adjusting for the effect of confounders such as smoking status.
- Logistic regression is used when the outcome is binary, such as cure or no cure. Logistic regression can be used to estimate the effect of an exposure on a binary outcome after adjusting for confounders.

Survival Analysis
- Kaplan-Meier analysis measures the ratio of surviving subjects (or those without an event) divided by the total number of subjects at risk for the event. Every time a subject has an event, the ratio is recalculated. These ratios are then used to generate a curve that graphically depicts the probability of survival.
- Cox proportional hazards analysis is similar to the logistic regression method described above, with the added advantage that it accounts for time to a binary event in the outcome variable. Thus, one can account for variation in follow-up time among subjects.
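The recalculation described for the Kaplan-Meier method can be sketched as a running product of "survivors / at risk" ratios. This is a minimal illustration under simple assumptions (no confidence bands; censored subjects leave the risk set after their last follow-up time); the function name and data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time of each subject
    events -- 1 if the event occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave
    return curve

# Hypothetical follow-up in months; 0 marks a censored subject.
months = [2, 3, 3, 5, 8, 8, 12]
status = [1, 1, 0, 1, 1, 0, 0]
curve = kaplan_meier(months, status)
for t, s in curve:
    print(t, round(s, 3))
```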

Kaplan-Meier Survival Curves

Why Use Statistics?

Descriptive Statistics
- Identifies patterns in the data
- Identifies outliers
- Guides choice of statistical test

Percentage of Specimens Testing Positive for RSV (respiratory syncytial virus)

Descriptive Statistics

Distribution of Course Grades

Describing the Data with Numbers: Measures of Dispersion
- Range
- Standard deviation
- Skewness

Measures of Dispersion
- Range: highest to lowest values
- Standard deviation: how closely the values cluster around the mean value
- Skewness: refers to the symmetry of the curve

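The Normal Distribution
- Mean = median = mode
- Skew is zero
- About 68% of values fall within 1 SD of the mean
- About 95% of values fall within 2 SDs of the mean
[Figure: normal curve with the mean, median, and mode at the center and the 1σ and 2σ intervals marked]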
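The 68% and 95% figures can be checked with the standard library's `statistics.NormalDist`. The helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

def share_within(k, mean=0.0, sd=1.0):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    dist = NormalDist(mean, sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The result does not depend on the particular mean and SD, only on k. Strictly, 95% corresponds to about 1.96 SDs; 2 SDs covers slightly more.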

SIMULATION
We take a simple random sample with replacement of 25 cards from the box as follows. Mix the box of cards; choose one at random; record it; replace it; and then repeat the procedure until we have recorded the numbers on 25 cards. Although survey samples are not generally drawn with replacement, our simulation simplifies the analysis because the box remains unchanged between draws; so, after examining each card, the chance of drawing a card numbered 1 on the following draw is the same as it was for the previous draw, in this case a 60% chance.

SIMULATION
Let’s say that after drawing the 25 cards this way, we obtain the following results, recorded in 5 rows of 5 numbers:
[table of the 25 draws not preserved in the transcript]

SIMULATION
Based on this sample of 25 draws, we want to guess the percentage of 1’s in the box. There are 14 cards numbered 1 in the sample. This gives us a sample percentage of p = 14/25 = 0.56 = 56%. If this is all of the information we have about the population box, and we want to estimate the percentage of 1’s in the box, our best guess would be 56%. Notice that this sample value p = 56% is 4 percentage points below the true population value π = 60%. We say that the random sampling error (or simply random error) is -4%.
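The drawing procedure can be imitated with a short simulation. This is a sketch under stated assumptions: cards that are not numbered 1 are coded as 0 (the transcript does not say what they show), and the function names are hypothetical. The result of a random run will differ from the 14 ones recorded on the slide.

```python
import random

def draw_sample(pi=0.60, n=25, seed=None):
    """Draw n cards with replacement; each is a 1 with probability pi."""
    rng = random.Random(seed)
    return [1 if rng.random() < pi else 0 for _ in range(n)]

def sampling_error(sample, pi=0.60):
    """Sample proportion minus the true population proportion."""
    p = sum(sample) / len(sample)
    return p - pi

sample = draw_sample(seed=1)
print("sample proportion:", sum(sample) / len(sample))
print("random sampling error:", round(sampling_error(sample), 2))

# The slide's own sample: 14 ones out of 25 draws.
slide_sample = [1] * 14 + [0] * 11
print(round(sampling_error(slide_sample), 2))
```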

ERROR ANALYSIS
An experiment is a procedure that results in a measurement or observation. The Harris poll is an experiment that resulted in the measurement (statistic) of 57%. An experiment whose outcome depends upon chance is called a random experiment.

ERROR ANALYSIS
On repetition of such an experiment, one will typically obtain a different measurement or observation. So, if the Harris poll were to be repeated, the new statistic would very likely differ slightly from 57%. Each repetition is called an execution or trial of the experiment.

ERROR ANALYSIS
Suppose we made three more series of draws, and the random sampling errors were +16%, 0%, and +12%. Together with the first error of -4%, the random sampling errors of the four simulations would then average out to $(-4\% + 16\% + 0\% + 12\%)/4 = 6\%$.

ERROR ANALYSIS
- Note that positive and negative random errors cancel, which keeps the average small. In fact, with more trials, the average of the random sampling errors tends to zero.

ERROR ANALYSIS
So, to measure a “typical size” of a random sampling error, we have to ignore the signs. We could simply take the mean of the absolute values (MA) of the random sampling errors. For the four random sampling errors above, the MA turns out to be $(4\% + 16\% + 0\% + 12\%)/4 = 8\%$.

ERROR ANALYSIS
The MA is difficult to deal with theoretically because the absolute value function is not differentiable at 0. So in statistics, and in error analysis in general, the root mean square (RMS) of the random sampling errors is generally used. For the four random sampling errors above, the RMS is $\sqrt{(4^2 + 16^2 + 0^2 + 12^2)/4}\,\% = \sqrt{104}\,\% \approx 10.2\%$.

ERROR ANALYSIS
The RMS is a more conservative measure of the typical size of the random sampling errors, in the sense that MA ≤ RMS.
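The three summaries of the four random sampling errors quoted in the text (-4%, +16%, 0%, +12%) can be computed side by side. The variable names are hypothetical.

```python
import math

# Random sampling errors of the four simulations, in percentage points.
errors = [-4, 16, 0, 12]

mean_error = sum(errors) / len(errors)             # signs can cancel
ma = sum(abs(e) for e in errors) / len(errors)     # mean absolute value
rms = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"mean = {mean_error:.1f}, MA = {ma:.1f}, RMS = {rms:.1f}")
```

MA and RMS depend only on the sizes of the errors, not on their signs, and the RMS is never smaller than the MA because squaring gives extra weight to the largest errors.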

ERROR ANALYSIS
For a given experiment, the RMS of all possible random sampling errors is called the standard error (SE). For example, whenever we use a random sample of size n and its percentage p to estimate the population percentage π, we have $\mathrm{SE} = \sqrt{\pi\,(100\% - \pi)/n}$.
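For the box used in the simulation (π = 60%, n = 25), the standard error can be evaluated with the usual formula for a sample proportion, SE = sqrt(π(1 - π)/n). The formula image is missing from the transcript, so treating this as the intended expression is an assumption; the function name is hypothetical.

```python
import math

def standard_error(pi, n):
    """Standard error of a sample proportion; pi is given as a fraction."""
    return math.sqrt(pi * (1 - pi) / n)

se = standard_error(0.60, 25)
print(f"SE = {se * 100:.1f} percentage points")

# Quadrupling the sample size halves the standard error.
print(round(standard_error(0.60, 100) / se, 2))
```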

Dynamic analysis
- The health of the population and the activity of medical institutions change over time.
- Studying the dynamics of these phenomena is very important for analyzing the state of health and the activity of the public health system.

Example of a dynamic line
[Table: Year | Bed occupancy (days); values not preserved in the transcript]

Parameters applied for analysis of changes of a phenomenon
- Rate of growth: the ratio of each level of the dynamic line to the previous level, which is taken as 100%.
- Pure gain: the difference between a level of the dynamic line and the previous level.
- Rate of gain: the ratio of the pure gain to the previous level.
- Parameter of visualization: the ratio of each level of the dynamic line to the first level, which is taken as 100%.
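The four parameters can be computed for every level of a dynamic line after the first. The sketch below is illustrative: the bed-occupancy figures are invented (the slide's table did not survive), the function name is hypothetical, and expressing the rate of gain as a percentage is an assumption.

```python
def dynamic_indices(levels):
    """Compute the four parameters for each level after the first."""
    first = levels[0]
    rows = []
    for previous, current in zip(levels, levels[1:]):
        gain = current - previous
        rows.append({
            "level": current,
            "pure_gain": gain,
            "rate_of_growth": current / previous * 100,  # previous = 100%
            "rate_of_gain": gain / previous * 100,       # assumed in percent
            "visualization": current / first * 100,      # first = 100%
        })
    return rows

# Hypothetical bed occupancy (days) for three consecutive years.
occupancy = [320, 336, 328]
rows = dynamic_indices(occupancy)
for row in rows:
    print({k: round(v, 1) for k, v in row.items()})
```

With these definitions the rate of gain always equals the rate of growth minus 100.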

Measures of Association

- Absolute risk: the relative risk and odds ratio provide a measure of risk only in comparison with a standard; the measures below express risk in absolute terms.
- Attributable risk, or risk difference, is a measure of absolute risk. It represents the excess risk of disease in those exposed, taking into account the background rate of disease. The attributable risk is defined as the difference between the incidence rates in the exposed and non-exposed groups.
- Population attributable risk is used to describe the excess rate of disease in the total study population of exposed and non-exposed individuals that is attributable to the exposure.
- Number needed to treat (NNT): the number of patients who would need to be treated to prevent one adverse outcome; it is often used to present the results of randomized trials.
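A minimal sketch of these measures for a cohort laid out as exposed and non-exposed groups follows. The function names and all counts are hypothetical, and NNT is computed in the usual way as the reciprocal of the absolute risk reduction, which the slide does not spell out.

```python
def association_measures(exposed_cases, exposed_total,
                         unexposed_cases, unexposed_total):
    """Relative and absolute measures of association for a cohort."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    risk_overall = ((exposed_cases + unexposed_cases)
                    / (exposed_total + unexposed_total))
    odds_exposed = exposed_cases / (exposed_total - exposed_cases)
    odds_unexposed = unexposed_cases / (unexposed_total - unexposed_cases)
    return {
        "relative_risk": risk_exposed / risk_unexposed,
        "odds_ratio": odds_exposed / odds_unexposed,
        "attributable_risk": risk_exposed - risk_unexposed,
        "population_attributable_risk": risk_overall - risk_unexposed,
    }

def number_needed_to_treat(control_risk, treated_risk):
    """Reciprocal of the absolute risk reduction in a trial."""
    return 1 / (control_risk - treated_risk)

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 non-exposed fall ill.
measures = association_measures(30, 100, 10, 100)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")

# Hypothetical trial: adverse outcome in 20% of controls, 15% of treated.
nnt = number_needed_to_treat(0.20, 0.15)
print(f"NNT = {nnt:.0f}")
```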

Relative Values
Statistical research on morbidity, mortality, lethality, and so on yields absolute numbers, which specify how many times a phenomenon occurred. Although absolute numbers have a certain informative value, their use is limited.

Relative Values
To determine the level of a phenomenon, or to compare a parameter over time or with the same parameter in another territory, it is necessary to calculate relative values (parameters, coefficients), which are the result of a ratio between statistical numbers. The basic arithmetic operation in calculating relative values is division.

In medical statistics the following kinds of relative parameters are used:
- Extensive
- Intensive
- Relative intensity
- Visualization
- Correlation

The extensive parameter, or parameter of distribution, characterizes the parts of a phenomenon (its structure); that is, it shows what share of the total number of all diseases (or deaths) is made up by a particular disease.

Using this parameter, it is possible to determine the structure of patients by age, social status, and so on. This parameter is usually expressed as a percentage, but it can also be calculated in parts per thousand when the share of the given disease is small and the percentage would be a decimal fraction rather than an integer.

The general formula for its calculation is: $\text{extensive parameter} = \dfrac{\text{part}}{\text{total}} \times 100$

- The intensive parameter characterizes frequency or prevalence.
- It shows how frequently the given phenomenon occurs in the given environment.
- For example, it shows how frequently a particular disease occurs in the population or how frequently people die from a particular disease.
- To calculate the intensive parameter, it is necessary to know the size of the population or the contingent.

- The general formula for the calculation is: $\text{intensive parameter} = \dfrac{\text{phenomenon}}{\text{environment}} \times 100$ (or 1000, or a larger power of ten)

General mortality rate: $\dfrac{\text{number of deaths during the year}}{\text{population size}} \times 1000$
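The extensive and intensive parameters differ only in what goes into the denominator: the total of the phenomenon itself, or the environment in which it occurs. A short sketch with hypothetical function names and invented figures:

```python
def extensive_parameter(part, total):
    """Share of one part in the whole phenomenon, in percent."""
    return part / total * 100

def intensive_parameter(phenomenon, environment, base=1000):
    """Frequency of a phenomenon per `base` units of its environment."""
    return phenomenon / environment * base

# Hypothetical town: 100,000 inhabitants, 1,200 deaths during the year,
# 480 of them from cardiovascular disease.
share = extensive_parameter(480, 1200)
general_mortality = intensive_parameter(1200, 100_000, base=1000)
cause_specific = intensive_parameter(480, 100_000, base=1000)

print(f"share of cardiovascular deaths: {share:.1f}%")
print(f"general mortality rate: {general_mortality:.1f} per 1000")
print(f"cardiovascular mortality: {cause_specific:.1f} per 1000")
```

The extensive parameter describes only the structure of deaths; it cannot show whether a disease is becoming more or less frequent in the population, which is what the intensive parameter measures.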
