INTRODUCTION TO BIOSTATISTICS

Presentation transcript:

INTRODUCTION TO BIOSTATISTICS DR.S.Shaffi Ahamed Asst. Professor Dept. of Family and Comm. Medicine KKUH

This session covers:
- Background and the need to know biostatistics
- Definition of statistics and biostatistics
- Types of data
- Graphical representation of data
- Frequency distribution of data

Basis

Dynamic nature of the Universe: the continuous change in Nature brings uncertainty and variability to each and every sphere of the Universe.

We can by no means control or overpower the factor of uncertainty, but we are capable of measuring it in terms of probability.

Sources of Medical Uncertainties
- Natural variation due to biological, environmental and sampling factors
- Natural variation among methods, observers, instruments, etc.
- Errors in measurement or assessment, or errors in knowledge
- Incomplete knowledge

Biostatistics is the science which helps in managing health care uncertainties

“Statistics is the science which deals with collection, classification and tabulation of numerical facts as the basis for explanation, description and comparison of phenomena.” (Lovitt)

“BIOSTATISTICS”
(1) Statistics arising out of the biological sciences, particularly from the fields of medicine and public health.
(2) The methods used in dealing with statistics in the fields of medicine, biology and public health for planning, conducting and analyzing data which arise in investigations in these branches.

Reasons to know about biostatistics:
- Medicine is becoming increasingly quantitative.
- The planning, conduct and interpretation of much of medical research are becoming increasingly reliant on statistical methodology.
- Statistics pervades the medical literature.

CLINICAL MEDICINE
- Documentation of the medical history of diseases.
- Planning and conduct of clinical studies.
- Evaluating the merits of different procedures.
- Providing methods for the definition of “normal” and “abnormal”.

PREVENTIVE MEDICINE
- To provide the magnitude of any health problem in the community.
- To find out the basic factors underlying ill health.
- To evaluate the health programs which were introduced in the community (success/failure).
- To introduce and promote health legislation.

BASIC CONCEPTS
Data: a set of values of one or more variables recorded on one or more observational units.
Sources of data
1. Routinely kept records
2. Surveys (census)
3. Experiments
4. External source
Categories of data
1. Primary data: observation, questionnaire, record form, interviews, survey
2. Secondary data: census, medical record, registry

TYPES OF DATA
- QUALITATIVE DATA
- DISCRETE QUANTITATIVE
- CONTINUOUS QUANTITATIVE

QUALITATIVE (NOMINAL)
Examples:
- Sex (M, F)
- Exam result (P, F)
- Blood group (A, B, O or AB)
- Color of eyes (blue, green, brown, black)

QUALITATIVE (ORDINAL)
Examples:
- Response to treatment (poor, fair, good)
- Severity of disease (mild, moderate, severe)
- Income status (low, middle, high)

QUANTITATIVE (DISCRETE)
Examples:
- The no. of family members
- The no. of heart beats
- The no. of admissions in a day
QUANTITATIVE (CONTINUOUS)
Examples: height, weight, age, BP, serum cholesterol and BMI

Discrete data: gaps between possible values. Example: number of children.
Continuous data: theoretically, no gaps between possible values. Example: Hb.

Scale of measurement
Qualitative variable: a categorical variable
- Nominal (classificatory) scale: gender, marital status, race
- Ordinal (ranking) scale: severity scale, good/better/best

Scale of measurement
Quantitative variable: a numerical variable, either discrete or continuous.
Interval scale: data are placed in meaningful intervals and order. The unit of measurement is arbitrary.
- Temperature: the differences 37 °C - 36 °C and 38 °C - 37 °C are equal.
- No implication of ratio: 30 °C is not twice as hot as 15 °C.

Ratio scale: data are presented in a frequency distribution in logical order. A meaningful ratio exists.
- Examples: age, weight, height, pulse rate
- A pulse rate of 120 is twice as fast as one of 60.
- A person weighing 80 kg is twice as heavy as one weighing 40 kg.

Scales of Measure
- Nominal: qualitative classification of equal value: gender, race, color, city
- Ordinal: qualitative classification which can be rank ordered: socioeconomic status of families
- Interval: numerical or quantitative data which can be rank ordered and whose sizes can be compared: temperature
- Ratio: quantitative interval data along with ratio: time, age

Nominal variables allow for only qualitative classification. That is, they can be measured only in terms of whether the individual items belong to some distinctively different categories, but we cannot quantify or even rank order those categories. For example, all we can say is that 2 individuals are different in terms of variable A (e.g., they are of different race), but we cannot say which one "has more" of the quality represented by the variable. Typical examples of nominal variables are gender, race, color, city, etc.

Ordinal variables allow us to rank order the items we measure in terms of which has less and which has more of the quality represented by the variable, but still they do not allow us to say "how much more." A typical example of an ordinal variable is the socioeconomic status of families. For example, we know that upper-middle is higher than middle but we cannot say that it is, for example, 18% higher. Also, this very distinction between nominal, ordinal, and interval scales itself represents a good example of an ordinal variable. For example, we can say that nominal measurement provides less information than ordinal measurement, but we cannot say "how much less" or how this difference compares to the difference between ordinal and interval scales.

Interval variables allow us not only to rank order the items that are measured, but also to quantify and compare the sizes of differences between them. For example, temperature, as measured in degrees Fahrenheit or Celsius, constitutes an interval scale. We can say that a temperature of 40 degrees is higher than a temperature of 30 degrees, and that an increase from 20 to 40 degrees is twice as much as an increase from 30 to 40 degrees.

Ratio variables are very similar to interval variables; in addition to all the properties of interval variables, they feature an identifiable absolute zero point, thus they allow for statements such as x is two times more than y. Typical examples of ratio scales are measures of time or space. For example, as the Kelvin temperature scale is a ratio scale, not only can we say that a temperature of 200 degrees is higher than one of 100 degrees, we can correctly state that it is twice as high. Interval scales do not have the ratio property. Most statistical data analysis procedures do not distinguish between the interval and ratio properties of the measurement scales.
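
As a rough illustration of the interval versus ratio distinction, the sketch below compares 30 °C and 15 °C on both the Celsius and Kelvin scales. The function name and printed layout are illustrative assumptions; the only outside fact used is the standard 273.15 offset between the two scales.

```python
# Illustrative sketch: ratios are meaningful only on a scale with a true zero.
KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in kelvin


def celsius_to_kelvin(temp_c: float) -> float:
    """Convert a Celsius reading to the Kelvin (ratio) scale."""
    return temp_c + KELVIN_OFFSET


warm_c, cool_c = 30.0, 15.0

# Ratio of the raw Celsius numbers: it depends on the arbitrary zero point.
naive_ratio = warm_c / cool_c

# Ratio on the Kelvin scale: the physically meaningful comparison.
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

# Differences, in contrast, agree on both scales (the interval property).
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

print(f"Celsius ratio: {naive_ratio:.2f}")
print(f"Kelvin ratio:  {kelvin_ratio:.3f}")
print(f"Difference:    {diff_c:.1f} C = {diff_k:.1f} K")
```

The Celsius numbers suggest "twice as hot", while on the Kelvin scale the warmer reading is only about 5% higher; the 15-degree difference is the same on both scales.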

CONTINUOUS DATA can be converted to QUALITATIVE DATA
- Wt. (in kg): underweight, normal and overweight
- Ht. (in cm): short, medium and tall

Table 1 Distribution of blunt injured patients according to hospital length of stay

CLINIMETRICS
Clinimetrics is a science in which qualities are converted to meaningful quantities by using a scoring system.
Examples:
(1) Apgar score, based on appearance, pulse, grimace, activity and respiration, is used for neonatal prognosis.
(2) Smoking index: no. of cigarettes, duration, filter or not, whether pipe, cigar, etc.
(3) APACHE (Acute Physiology and Chronic Health Evaluation) score: to quantify the severity of the condition of a patient.

INVESTIGATION

Frequency Distributions “A Picture is Worth a Thousand Words”

Frequency Distributions What is a frequency distribution? A frequency distribution is an organization of raw data in tabular form, using classes (or intervals) and frequencies. What is a frequency count? The frequency or the frequency count for a data value is the number of times the value occurs in the data set.

Frequency Distributions
- data distribution: pattern of variability
- the center of a distribution
- the ranges
- the shapes
- simple frequency distributions
- grouped and ungrouped frequency distributions

Categorical or Qualitative Frequency Distributions What is a categorical frequency distribution? A categorical frequency distribution represents data that can be placed in specific categories, such as gender, blood group, & hair color, etc.

Categorical or Qualitative Frequency Distributions -- Example Example: The blood types of 25 blood donors are given below. Summarize the data using a frequency distribution. AB B A O B O B O A O B O B B B A O AB AB O A B AB O A

Categorical Frequency Distribution for the Blood Types -- Example Continued Note: The classes for the distribution are the blood types.
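
The table for this example did not survive in the transcript, so here is a minimal sketch of how the categorical frequency distribution could be tallied from the 25 blood types listed above. The percent column and the row order are illustrative additions, not part of the original slide.

```python
from collections import Counter

# The 25 donor blood types from the example, in the order given.
blood_types = (
    "AB B A O B O B O A O B O B B B A O AB AB O A B AB O A"
).split()

# Count how often each class (blood type) occurs.
frequency = Counter(blood_types)
total = sum(frequency.values())

print(f"{'Class':<6}{'Frequency':>10}{'Percent':>9}")
for blood_type in ("A", "B", "O", "AB"):
    count = frequency[blood_type]
    print(f"{blood_type:<6}{count:>10}{100 * count / total:>8.0f}%")
print(f"{'Total':<6}{total:>10}")
```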

Quantitative Frequency Distributions -- Ungrouped What is an ungrouped frequency distribution? An ungrouped frequency distribution simply lists the data values with the corresponding frequency counts with which each value occurs.

Quantitative Frequency Distributions – Ungrouped -- Example
Example: The at-rest pulse rates for 16 athletes at a meet were 57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, and 58. Summarize the information with an ungrouped frequency distribution.

Quantitative Frequency Distributions – Ungrouped -- Example Continued Note: The (ungrouped) classes are the observed values themselves.
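
A minimal sketch of the ungrouped frequency distribution for the 16 pulse rates, assuming a plain tally with one row per observed value; the variable names and column layout are illustrative.

```python
from collections import Counter

# At-rest pulse rates of the 16 athletes from the example.
pulse_rates = [57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, 58]

# Each observed value is its own class; the count is its frequency.
ungrouped = Counter(pulse_rates)

print("Pulse  f")
for value in sorted(ungrouped):
    print(f"{value:>5}  {ungrouped[value]}")
print(f"Total  {sum(ungrouped.values())}")
```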

Example of a simple frequency distribution (ungrouped)

Scores: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1

Score   f
9       3
8       2
7       2
6       1
5       4
4       4
3       3
2       3
1       3
Σf = 25

Relative Frequency Distribution
- Proportion of the total N
- Divide the frequency of each score by N: rel. f = f/N
- The sum of the relative frequencies should equal 1.0
- Gives us a frame of reference

Relative Frequency Example: The relative frequency for the ungrouped class of 57 will be 4/16 = 0.25.

Relative Frequency Distribution Note: The relative frequency for a class is obtained by computing f/n.
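
The rule rel. f = f/n can be sketched for the pulse-rate example as follows; the dictionary names are illustrative assumptions.

```python
from collections import Counter

# At-rest pulse rates of the 16 athletes from the earlier example.
pulse_rates = [57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, 58]

n = len(pulse_rates)
frequency = Counter(pulse_rates)

# Relative frequency of each class: its frequency divided by n.
relative = {value: count / n for value, count in frequency.items()}

for value in sorted(relative):
    print(f"{value}: f = {frequency[value]}, rel. f = {relative[value]:.4f}")
print(f"Sum of relative frequencies = {sum(relative.values()):.1f}")
```

For the class 57 this reproduces the 4/16 = 0.25 quoted above, and the relative frequencies sum to 1.0.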

Example of a simple frequency distribution

Scores: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1

Score   f   rel f
9       3   .12
8       2   .08
7       2   .08
6       1   .04
5       4   .16
4       4   .16
3       3   .12
2       3   .12
1       3   .12
Σf = 25     Σ rel f = 1.0

Cumulative Frequency and Cumulative Relative Frequency NOTE: Sometimes frequency distributions are displayed with cumulative frequencies and cumulative relative frequencies as well.

Cumulative Frequency and Cumulative Relative Frequency What is a cumulative frequency for a class? The cumulative frequency for a specific class in a frequency table is the sum of the frequencies for all values at or below the given class.

Cumulative Frequency and Cumulative Relative Frequency What is a cumulative relative frequency for a class? The cumulative relative frequency for a specific class in a frequency table is the sum of the relative frequencies for all values at or below the given class.

Cumulative Frequency and Cumulative Relative Frequency Note: Table with relative and cumulative relative frequencies.

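Example of a simple frequency distribution (ungrouped)

Scores: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1

Score   f   cf   rel f   rel. cf
9       3    3   .12     .12
8       2    5   .08     .20
7       2    7   .08     .28
6       1    8   .04     .32
5       4   12   .16     .48
4       4   16   .16     .64
3       3   19   .12     .76
2       3   22   .12     .88
1       3   25   .12     1.0
Σf = 25     Σ rel f = 1.0

Note: in this table the rows run from the highest score down and the frequencies are accumulated in that order, so each cf counts the values at or above that score.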
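
The four columns of this example can be reproduced with a short sketch. It assumes, as the slide's table does, that rows are listed from the highest score down and that running totals follow that order; the variable names are illustrative.

```python
from collections import Counter
from itertools import accumulate

# The 25 scores from the example.
scores = [5, 7, 8, 1, 5, 9, 3, 4, 2, 2, 3, 4, 9,
          7, 1, 4, 5, 6, 8, 9, 4, 3, 5, 2, 1]

n = len(scores)
counts = Counter(scores)

# Rows are listed from the highest score down, as on the slide, and the
# running totals are accumulated in that same top-to-bottom order.
values = sorted(counts, reverse=True)
f = [counts[v] for v in values]
cf = list(accumulate(f))
rel_f = [freq / n for freq in f]
rel_cf = [c / n for c in cf]

print("score   f  cf  rel f  rel cf")
for row in zip(values, f, cf, rel_f, rel_cf):
    print("{:>5} {:>3} {:>3} {:>6.2f} {:>7.2f}".format(*row))
```

Sorting with `reverse=False` instead would accumulate from the lowest score up, which gives the "at or below" reading of the cumulative frequency.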

Quantitative Frequency Distributions -- Grouped What is a grouped frequency distribution? A grouped frequency distribution is obtained by constructing classes (or intervals) for the data, and then listing the corresponding number of values (frequency counts) in each interval.

Tabulate the hemoglobin values of 30 adult male patients listed below

Patient No   Hb (g/dl)   Patient No   Hb (g/dl)   Patient No   Hb (g/dl)
1            12.0        11           11.2        21           14.9
2            11.9        12           13.6        22           12.2
3            11.5        13           10.8        23
4            14.2        14           12.3        24           11.4
5                        15                       25           10.7
6            13.0        16           15.7        26           12.5
7            10.5        17           12.6        27           11.8
8            12.8        18           9.1         28           15.1
9            13.2        19           12.9        29           13.4
10                       20           14.6        30           13.1

Steps for making a table
Step 1: Find the minimum (9.1) and maximum (15.7)
Step 2: Calculate the difference: 15.7 – 9.1 = 6.6
Step 3: Decide the number and width of the classes (7 class intervals): 9.0 – 9.9, 10.0 – 10.9, ...
Step 4: Prepare a dummy table with columns Hb (g/dl), Tally marks, No. of patients
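
The four steps can be sketched in code using only the minimum and maximum quoted above. The choice to work in tenths of g/dl and to start the first class at the whole number at or below the minimum are assumptions made for this sketch, chosen so that the result matches the 9.0 – 9.9, 10.0 – 10.9, ... classes on the slide.

```python
# Step 1: minimum and maximum, stored in tenths of g/dl to avoid
# floating-point rounding at the class limits (9.1 -> 91, 15.7 -> 157).
minimum, maximum = 91, 157

# Step 2: the difference (range).
data_range = maximum - minimum  # 66 tenths = 6.6 g/dl

# Step 3: a class width of 1.0 g/dl (10 tenths), with the first class
# starting at the whole number at or below the minimum.
width = 10
start = (minimum // width) * width
n_classes = (maximum - start) // width + 1

classes = [(start + i * width, start + (i + 1) * width - 1)
           for i in range(n_classes)]

# Step 4: the dummy table, with the tally and count columns left empty.
print("Hb (g/dl)     Tally marks  No. of patients")
for lower, upper in classes:
    print(f"{lower / 10:4.1f} - {upper / 10:4.1f}")
print(f"Range = {data_range / 10:.1f}, classes = {n_classes}")
```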

DUMMY TABLE

Hb (g/dl)      Tally marks   No. of patients
9.0 – 9.9
10.0 – 10.9
11.0 – 11.9
12.0 – 12.9
13.0 – 13.9
14.0 – 14.9
15.0 – 15.9
Total

TALLY MARKS TABLE (each llll stands for a crossed bundle of five)

Hb (g/dl)      Tally marks   No. of patients
9.0 – 9.9      l             1
10.0 – 10.9    lll           3
11.0 – 11.9    llll l        6
12.0 – 12.9    llll llll     10
13.0 – 13.9    llll          5
14.0 – 14.9    lll           3
15.0 – 15.9    ll            2
Total          -             30

Table. Frequency distribution of 30 adult male patients by Hb

Hb (g/dl)      No. of patients
9.0 – 9.9      1
10.0 – 10.9    3
11.0 – 11.9    6
12.0 – 12.9    10
13.0 – 13.9    5
14.0 – 14.9    3
15.0 – 15.9    2
Total          30

Table. Frequency distribution of adult patients by Hb and gender
Columns: Hb (g/dl); Gender (Male, Female); Total
Rows: <9.0, 9.0 – 9.9, 10.0 – 10.9, 11.0 – 11.9, 12.0 – 12.9, 13.0 – 13.9, 14.0 – 14.9, 15.0 – 15.9, Total
Cell values (column alignment lost): 1 3 6 10 5 2 8 4 14 16 9 30 60

Elements of a Table
An ideal table should have: a number, a title, column headings and footnotes.
- Number: table number for identification in a report
- Title, place: describes the body of the table, the variables and the time period (what, how classified, where and when)
- Column headings: variable name, No., percentages (%), etc.
- Footnote(s): to describe some column/row headings, special cells, source, etc.

Table II. Distribution of 120 (Madras) Corporation divisions according to annual death rate based on registered deaths in 1975 and 1976
Footnote: Figures in parentheses indicate percentages

DIAGRAMS/GRAPHS
Discrete data
- Bar charts (one or two groups)
Continuous data
- Histogram
- Frequency polygon (curve)
- Stem-and-leaf plot
- Box-and-whisker plot

Example data 68 63 42 27 30 36 28 32 79 27 22 28 24 25 44 65 43 25 74 51 36 42 28 31 28 25 45 12 57 51 12 32 49 38 42 27 31 50 38 21 16 24 64 47 23 22 43 27 49 28 23 19 11 52 46 31 30 43 49 12

Histogram Figure 1 Histogram of ages of 60 subjects
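
The figure itself is not reproduced in the transcript. As a rough stand-in, the sketch below groups the 60 ages from the example data into classes and prints a text histogram. The 10-year class width is an assumption; the class intervals used in the original figure are not shown.

```python
from collections import Counter

# Ages of the 60 subjects from the example data.
ages = [68, 63, 42, 27, 30, 36, 28, 32, 79, 27,
        22, 28, 24, 25, 44, 65, 43, 25, 74, 51,
        36, 42, 28, 31, 28, 25, 45, 12, 57, 51,
        12, 32, 49, 38, 42, 27, 31, 50, 38, 21,
        16, 24, 64, 47, 23, 22, 43, 27, 49, 28,
        23, 19, 11, 52, 46, 31, 30, 43, 49, 12]

CLASS_WIDTH = 10  # assumed; the figure's actual class intervals are not shown

# Map every age to the lower limit of its class, then count per class.
grouped = Counter((age // CLASS_WIDTH) * CLASS_WIDTH for age in ages)

for lower in sorted(grouped):
    upper = lower + CLASS_WIDTH - 1
    bar = "#" * grouped[lower]
    print(f"{lower:>2}-{upper:<2} {grouped[lower]:>2} {bar}")
```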

Polygon


Stem and leaf plot

Stem-and-leaf of Age   N = 60
Leaf Unit = 1.0

Count  Stem  Leaves
    6     1  122269
   19     2  1223344555777788888
   11     3  00111226688
   13     4  2223334567999
    5     5  01127
    4     6  3458
    2     7  49
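
A minimal sketch of how such a stem-and-leaf display can be built from the 60 ages, taking the tens digit as the stem and the units digit as the leaf (leaf unit = 1). The per-row count in the first column follows the display above; the variable names are illustrative.

```python
from collections import defaultdict

# Ages of the 60 subjects from the example data.
ages = [68, 63, 42, 27, 30, 36, 28, 32, 79, 27,
        22, 28, 24, 25, 44, 65, 43, 25, 74, 51,
        36, 42, 28, 31, 28, 25, 45, 12, 57, 51,
        12, 32, 49, 38, 42, 27, 31, 50, 38, 21,
        16, 24, 64, 47, 23, 22, 43, 27, 49, 28,
        23, 19, 11, 52, 46, 31, 30, 43, 49, 12]

# Split each sorted age into stem (tens) and leaf (units).
stems = defaultdict(list)
for age in sorted(ages):
    stem, leaf = divmod(age, 10)
    stems[stem].append(leaf)

print(f"Stem-and-leaf of Age  N = {len(ages)}")
for stem in sorted(stems):
    leaves = "".join(str(leaf) for leaf in stems[stem])
    print(f"{len(stems[stem]):>3} {stem} {leaves}")
```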

Box plot

Descriptive statistics report: Boxplot
- minimum score
- maximum score
- lower quartile
- upper quartile
- median
- mean
- the skew of the distribution:
  positive skew: mean > median and the high-score whisker is longer
  negative skew: mean < median and the low-score whisker is longer
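
The quantities a boxplot reports can be sketched for the 60 ages with Python's standard `statistics` module. Note the assumption flagged in the code: `statistics.quantiles` uses its default "exclusive" method here, and other software may report slightly different quartiles.

```python
import statistics

# Ages of the 60 subjects from the example data.
ages = [68, 63, 42, 27, 30, 36, 28, 32, 79, 27,
        22, 28, 24, 25, 44, 65, 43, 25, 74, 51,
        36, 42, 28, 31, 28, 25, 45, 12, 57, 51,
        12, 32, 49, 38, 42, 27, 31, 50, 38, 21,
        16, 24, 64, 47, 23, 22, 43, 27, 49, 28,
        23, 19, 11, 52, 46, 31, 30, 43, 49, 12]

minimum, maximum = min(ages), max(ages)
median = statistics.median(ages)
mean = statistics.mean(ages)

# Quartiles with the default "exclusive" method (an assumption of this
# sketch); the middle cut point equals the median and is discarded.
lower_q, _, upper_q = statistics.quantiles(ages, n=4)

# Direction of skew from the mean/median comparison described above.
if mean > median:
    skew = "positive skew"
elif mean < median:
    skew = "negative skew"
else:
    skew = "no skew indicated"

print(f"min={minimum}, Q1={lower_q}, median={median}, Q3={upper_q}, max={maximum}")
print(f"mean={mean:.2f} -> {skew}")
```

For these ages the mean exceeds the median, which points to a positive skew.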

Pie Chart
- Circular diagram; the total is 100%
- Divided into segments, each representing a category
- Decide which categories are adjacent
- The size of each slice of the pie is proportional to the amount for its category
Figure: The prevalence of different degrees of hypertension in the population

Bar Graphs
- The height of each bar indicates the frequency
- Frequency is on the Y axis and the categories of the variable are on the X axis
- The bars should be of equal width and should not touch the other bars
Figure: The distribution of risk factors among cases with cardiovascular diseases

Bar chart
Figure: HIV cases enrollment in USA by gender
This shows relative trends in admissions by gender.

Stacked bar chart
Figure: HIV cases enrollment in USA by gender
This emphasizes the constancy of the overall admissions and shows the trends subtly.

Graphic Presentation of Data
- the frequency polygon (quantitative data)
- the histogram (quantitative data)
- the bar graph (qualitative data)

General rules for designing graphs
- A graph should have a self-explanatory legend
- A graph should help the reader to understand the data
- Axes should be labeled and units of measurement indicated
- Scales are important. Start with zero (otherwise use a // break)
- Avoid graphs with a three-dimensional impression; they may be misleading (the reader visualizes them less easily)

Any Questions