Download presentation
1
INTRODUCTION TO BIOSTATISTICS
DR.S.Shaffi Ahamed Asst. Professor Dept. of Family and Comm. Medicine KKUH
2
This session covers: Background and need to know Biostatistics
Definition of Statistics and Biostatistics Types of data Graphical representation of a data Frequency distribution of a data
3
Basis
4
Dynamic nature of the U n i v e r s e the very continuous change in Nature brings - uncertainty and - variability in each and every sphere of the Universe
5
We by no mean can control or over-power the factor of uncertainty but capable of measuring it in terms of Probability
6
Sources of Medical Uncertainties
Natural variation due to biological, environmental and sampling factors Natural variation among methods, observers, instruments etc. Errors in measurement or assessment or errors in knowledge Incomplete knowledge
7
Biostatistics is the science which helps in managing health care uncertainties
8
“Statistics is the science which deals with collection, classification and tabulation of numerical facts as the basis for explanation, description and comparison of phenomenon”. Lovitt
9
“BIOSTATISICS” (1) Statistics arising out of biological sciences, particularly from the fields of Medicine and public health. (2) The methods used in dealing with statistics in the fields of medicine, biology and public health for planning, conducting and analyzing data which arise in investigations of these branches.
10
Reasons to know about biostatistics:
Medicine is becoming increasingly quantitative. The planning, conduct and interpretation of much of medical research are becoming increasingly reliant on the statistical methodology. Statistics pervades the medical literature.
11
CLINICAL MEDICINE Documentation of medical history of diseases.
Planning and conduct of clinical studies. Evaluating the merits of different procedures. In providing methods for definition of “normal” and “abnormal”.
12
PREVENTIVE MEDICINE To provide the magnitude of any health problem in the community. To find out the basic factors underlying the ill-health. To evaluate the health programs which was introduced in the community (success/failure). To introduce and promote health legislation.
13
BASIC CONCEPTS Data : Set of values of one or more variables recorded on one or more observational units Sources of data Routinely kept records 2. Surveys (census) 3. Experiments 4. External source Categories of data 1. Primary data: observation, questionnaire, record form, interviews, survey, 2. Secondary data: census, medical record,registry
14
TYPES OF DATA QUALITATIVE DATA DISCRETE QUANTITATIVE
CONTINOUS QUANTITATIVE
15
QUALITATIVE Nominal Example: Sex ( M, F) Exam result (P, F)
Blood Group (A,B, O or AB) Color of Eyes (blue, green, brown, black)
16
ORDINAL Example: Response to treatment (poor, fair, good) Severity of disease (mild, moderate, severe) Income status (low, middle, high)
17
QUANTITATIVE (DISCRETE)
Example: The no. of family members The no. of heart beats The no. of admissions in a day QUANTITATIVE (CONTINOUS) Example: Height, Weight, Age, BP, Serum Cholesterol and BMI
18
Discrete data -- Gaps between possible values
Number of Children Continuous data -- Theoretically, no gaps between possible values Hb
19
Scale of measurement Qualitative variable: A categorical variable
Nominal (classificatory) scale - gender, marital status, race Ordinal (ranking) scale - severity scale, good/better/best
20
Scale of measurement Quantitative variable:
A numerical variable: discrete; continuous Interval scale : Data is placed in meaningful intervals and order. The unit of measurement are arbitrary. - Temperature (37º C -- 36º C; 38º C-- 37º C are equal) and No implication of ratio (30º C is not twice as hot as 15º C)
21
Ratio scale: Data is presented in frequency distribution in logical order. A meaningful ratio exists. - Age, weight, height, pulse rate - pulse rate of 120 is twice as fast as 60 - person with weight of 80kg is twice as heavy as the one with weight of 40 kg.
22
Scales of Measure Nominal – qualitative classification of equal value: gender, race, color, city Ordinal - qualitative classification which can be rank ordered: socioeconomic status of families Interval - Numerical or quantitative data: can be rank ordered and sizes compared : temperature Ratio - Quantitative interval data along with ratio: time, age. Nominal variables allow for only qualitative classification. That is, they can be measured only in terms of whether the individual items belong to some distinctively different categories, but we cannot quantify or even rank order those categories. For example, all we can say is that 2 individuals are different in terms of variable A (e.g., they are of different race), but we cannot say which one "has more" of the quality represented by the variable. Typical examples of nominal variables are gender, race, color, city, etc. Ordinal variables allow us to rank order the items we measure in terms of which has less and which has more of the quality represented by the variable, but still they do not allow us to say "how much more." A typical example of an ordinal variable is the socioeconomic status of families. For example, we know that upper-middle is higher than middle but we cannot say that it is, for example, 18% higher. Also this very distinction between nominal, ordinal, and interval scales itself represents a good example of an ordinal variable. For example, we can say that nominal measurement provides less information than ordinal measurement, but we cannot say "how much less" or how this difference compares to the difference between ordinal and interval scales. Interval variables allow us not only to rank order the items that are measured, but also to quantify and compare the sizes of differences between them. For example, temperature, as measured in degrees Fahrenheit or Celsius, constitutes an interval scale. We can say that a temperature of 40 degrees is higher than a temperature of 30 degrees, and that an increase from 20 to 40 degrees is twice as much as an increase from 30 to 40 degrees. Ratio variables are very similar to interval variables; in addition to all the properties of interval variables, they feature an identifiable absolute zero point, thus they allow for statements such as x is two times more than y. Typical examples of ratio scales are measures of time or space. For example, as the Kelvin temperature scale is a ratio scale, not only can we say that a temperature of 200 degrees is higher than one of 100 degrees, we can correctly state that it is twice as high. Interval scales do not have the ratio property. Most statistical data analysis procedures do not distinguish between the interval and ratio properties of the measurement scales.
23
CONTINUOUS DATA QUALITATIVE DATA wt. (in Kg.) : under wt, normal & over wt. Ht. (in cm.): short, medium & tall
24
Table 1 Distribution of blunt injured patients
according to hospital length of stay
25
CLINIMETRICS A science called clinimetrics in which qualities are converted to meaningful quantities by using the scoring system. Examples: (1) Apgar score based on appearance, pulse, grimace, activity and respiration is used for neonatal prognosis. (2) Smoking Index: no. of cigarettes, duration, filter or not, whether pipe, cigar etc., (3) APACHE( Acute Physiology and Chronic Health Evaluation) score: to quantify the severity of condition of a patient
29
INVESTIGATION
30
Frequency Distributions
“A Picture is Worth a Thousand Words”
31
Frequency Distributions
What is a frequency distribution? A frequency distribution is an organization of raw data in tabular form, using classes (or intervals) and frequencies. What is a frequency count? The frequency or the frequency count for a data value is the number of times the value occurs in the data set.
32
Frequency Distributions
data distribution – pattern of variability. the center of a distribution the ranges the shapes simple frequency distributions grouped & ungrouped frequency distributions
33
Categorical or Qualitative Frequency Distributions
What is a categorical frequency distribution? A categorical frequency distribution represents data that can be placed in specific categories, such as gender, blood group, & hair color, etc.
34
Categorical or Qualitative Frequency Distributions -- Example
Example: The blood types of 25 blood donors are given below. Summarize the data using a frequency distribution. AB B A O B O B O A O B O B B B A O AB AB O A B AB O A
35
Categorical Frequency Distribution for the Blood Types -- Example Continued
Note: The classes for the distribution are the blood types.
36
Quantitative Frequency Distributions -- Ungrouped
What is an ungrouped frequency distribution? An ungrouped frequency distribution simply lists the data values with the corresponding frequency counts with which each value occurs.
37
Quantitative Frequency Distributions – Ungrouped -- Example
Example: The at-rest pulse rate for 16 athletes at a meet were 57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, and 58. Summarize the information with an ungrouped frequency distribution.
38
Quantitative Frequency Distributions – Ungrouped -- Example Continued
Note: The (ungrouped) classes are the observed values themselves.
39
Example of a simple frequency distribution (ungrouped)
f 9 3 8 2 7 2 6 1 5 4 4 4 3 3 2 3 1 3 f = 25
40
Relative Frequency Distribution
Proportion of the total N Divide the frequency of each score by N Rel. f = f/N Sum of relative frequencies should equal 1.0 Gives us a frame of reference
41
Relative Frequency Example: The relative frequency for the ungrouped
class of 57 will be 4/16 = 0.25.
42
Relative Frequency Distribution
Note: The relative frequency for a class is obtained by computing f/n.
43
Example of a simple frequency distribution
f rel f f = rel f = 1.0
44
Cumulative Frequency and Cumulative Relative Frequency
NOTE: Sometimes frequency distributions are displayed with cumulative frequencies and cumulative relative frequencies as well.
45
Cumulative Frequency and Cumulative Relative Frequency
What is a cumulative frequency for a class? The cumulative frequency for a specific class in a frequency table is the sum of the frequencies for all values at or below the given class.
46
Cumulative Frequency and Cumulative Relative Frequency
What is a cumulative relative frequency for a class? The cumulative relative frequency for a specific class in a frequency table is the sum of the relative frequencies for all values at or below the given class.
47
Cumulative Frequency and Cumulative Relative Frequency
Note: Table with relative and cumulative relative frequencies.
48
Example of a simple frequency distribution (ungrouped)
f cf rel f rel. cf f = rel f = 1.0
49
Quantitative Frequency Distributions -- Grouped
What is a grouped frequency distribution? A grouped frequency distribution is obtained by constructing classes (or intervals) for the data, and then listing the corresponding number of values (frequency counts) in each interval.
50
Tabulate the hemoglobin values of 30 adult male patients listed below
Patient No Hb (g/dl) 1 12.0 11 11.2 21 14.9 2 11.9 12 13.6 22 12.2 3 11.5 13 10.8 23 4 14.2 14 12.3 24 11.4 5 15 25 10.7 6 13.0 16 15.7 26 12.5 7 10.5 17 12.6 27 11.8 8 12.8 18 9.1 28 15.1 9 13.2 19 12.9 29 13.4 10 20 14.6 30 13.1
51
Steps for making a table
Step1 Find Minimum (9.1) & Maximum (15.7) Step2 Calculate difference – 9.1 = 6.6 Step3 Decide the number and width of the classes (7 c.l) , ,---- Step4 Prepare dummy table – Hb (g/dl), Tally mark, No. patients
52
DUMMY TABLE Tall Marks TABLE llll ll Hb (g/dl) Hb (g/dl) Tall marks
DUMMY TABLE Tall Marks TABLE Hb (g/dl) Tall marks No. patients 9.0 – 9.9 10.0 – 10.9 11.0 – 11.9 12.0 – 12.9 13.0 – 13.9 14.0 – 14.9 15.0 – 15.9 Total Hb (g/dl) Tall marks No. patients 9.0 – 9.9 10.0 – 10.9 11.0 – 11.9 12.0 – 12.9 13.0 – 13.9 14.0 – 14.9 15.0 – 15.9 l lll llll 1 llll llll llll ll 1 3 6 10 5 2 Total - 30
53
Table Frequency distribution of 30 adult male
patients by Hb Hb (g/dl) No. of patients 9.0 – 9.9 10.0 – 10.9 11.0 – 11.9 12.0 – 12.9 13.0 – 13.9 14.0 – 14.9 15.0 – 15.9 1 3 6 10 5 2 Total 30
54
Table Frequency distribution of adult patients by
Hb and gender: Hb (g/dl) Gender Total Male Female <9.0 9.0 – 9.9 10.0 – 10.9 11.0 – 11.9 12.0 – 12.9 13.0 – 13.9 14.0 – 14.9 15.0 – 15.9 1 3 6 10 5 2 8 4 14 16 9 30 60
55
Elements of a Table Ideal table should have Number Title
Column headings Foot-notes Number – Table number for identification in a report Title,place Describe the body of the table, variables, Time period (What, how classified, where and when) Column Variable name, No. , Percentages (%), etc., Heading Foot-note(s) - to describe some column/row headings, special cells, source, etc.,
56
Table II. Distribution of 120 (Madras) Corporation divisions according to annual death rate based on registered deaths in 1975 and 1976 Figures in parentheses indicate percentages
57
DIAGRAMS/GRAPHS Discrete data --- Bar charts (one or two groups)
Continuous data --- Histogram --- Frequency polygon (curve) --- Stem-and –leaf plot --- Box-and-whisker plot
58
Example data
59
Histogram Figure 1 Histogram of ages of 60 subjects
60
Polygon
61
Example data
62
Stem and leaf plot Stem-and-leaf of Age N = 60 Leaf Unit = 1.0
63
Box plot
64
Descriptive statistics report: Boxplot
- minimum score maximum score lower quartile upper quartile median - mean the skew of the distribution: positive skew: mean > median & high-score whisker is longer negative skew: mean < median & low-score whisker is longer
65
The prevalence of different degree of Hypertension
Pie Chart Circular diagram – total -100% Divided into segments each representing a category Decide adjacent category The amount for each category is proportional to slice of the pie The prevalence of different degree of Hypertension in the population
66
Bar Graphs Heights of the bar indicates frequency
Frequency in the Y axis and categories of variable in the X axis The bars should be of equal width and no touching the other bars The distribution of risk factor among cases with Cardio vascular Diseases
67
HIV cases enrolment in USA by gender
Bar chart This shows relative trends in admissions by gender.
68
HIV cases Enrollment in USA by gender
Stocked bar chart This emphasizes the constancy of the overall admissions and shows the trends subtly.
69
Graphic Presentation of Data
the frequency polygon (quantitative data) the histogram (quantitative data) the bar graph (qualitative data)
71
General rules for designing graphs
A graph should have a self-explanatory legend A graph should help reader to understand data Axis labeled, units of measurement indicated Scales important. Start with zero (otherwise // break) Avoid graphs with three-dimensional impression, it may be misleading (reader visualize less easily
72
Any Questions
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.