Published by Stewart Barker. Modified over 8 years ago.
1
Dr. Shaikh Shaffi Ahamed, Ph.D., Associate Professor, Dept. of Family & Community Medicine
2
Statistics is the science of conducting studies to collect, organize, summarize, analyze, present, interpret and draw conclusions from data.
Data: any values (observations or measurements) that have been collected.
3
What Is Statistics?
1. Collecting Data, e.g., Sample, Survey, Observe, Simulate
2. Characterizing Data, e.g., Organize/Classify, Count, Summarize
3. Presenting Data, e.g., Tables, Charts, Statements
4. Interpreting Results, e.g., Infer, Conclude, Specify Confidence
Why? Data Analysis and Decision-Making
4
Biostatistics
(1) Statistics arising out of the biological sciences, particularly from the fields of medicine and public health.
(2) The methods used in the fields of medicine, biology and public health for planning and conducting investigations and for analyzing the data which arise from them.
5
BASIC CONCEPTS
Data: Set of values of one or more variables recorded on one or more observational units (singular: datum)
Categories of data
1. Primary data: observation, questionnaire, interviews & survey
2. Secondary data: census, medical records, registry
Sources of data
1. Routinely kept records
2. Surveys (census)
3. Experiments
4. External source
6
Datasets and Data Tables
Dataset: Data for a set of variables collected on a group of persons.
Data Table: A dataset organized into a table, with one column for each variable and one row for each person.
7
Typical Data Table
OBS  AGE  BMI   FFNUM  TEMP (°F)  GENDER  EXERCISE LEVEL  QUESTION
1    26   23.2  0      61.0       0       1               1
2    30   30.2  9      65.5       1       3               2
3    32   28.9  17     59.6       1       3               4
4    37   22.4  1      68.4       1       2               3
5    33   25.5  7      64.5       0       3               5
6    29   22.3  1      70.2       0       2               2
7    32   23.0  0      67.3       0       1               1
8    33   26.3  1      72.8       0       3               1
9    32   22.2  3      71.5       0       1               4
10   33   29.1  5      63.2       1       1               4
11   26   20.8  2      69.1       0       1               3
12   34   20.9  4      73.6       0       2               3
13   31   36.3  1      66.3       0       2               5
14   31   36.4  0      66.9       1       1               5
15   27   28.6  2      70.2       1       2               2
16   36   27.5  2      68.5       1       3               3
17   35   25.6  143    67.8       1       3               4
18   31   21.2  11     70.7       1       1               2
19   36   22.7  8      69.8       0       2               1
20   33   28.1  3      67.8       0       2               1
8
Definitions for Variables
AGE: Age in years
BMI: Body mass index, weight/height² in kg/m²
FFNUM: The average number of times eating "fast food" in a week
TEMP: High temperature for the day
GENDER: 1- Female, 0- Male
EXERCISE LEVEL: 1- Low, 2- Medium, 3- High
QUESTION: What is your satisfaction rating for this Biostatistics session? 1- Very Satisfied, 2- Somewhat Satisfied, 3- Neutral, 4- Somewhat dissatisfied, 5- Dissatisfied
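The coded columns only become readable once the codes are mapped back to their labels. A minimal Python sketch of that step follows; the dictionary and function names are illustrative assumptions, and the sample row is observation 1 as read from the typical data table.

```python
# Code books transcribed from the variable definitions.
GENDER = {0: "Male", 1: "Female"}
EXERCISE_LEVEL = {1: "Low", 2: "Medium", 3: "High"}
QUESTION = {
    1: "Very Satisfied",
    2: "Somewhat Satisfied",
    3: "Neutral",
    4: "Somewhat dissatisfied",
    5: "Dissatisfied",
}


def decode_row(row):
    """Return a copy of one data-table row with codes replaced by labels."""
    decoded = dict(row)
    decoded["GENDER"] = GENDER[row["GENDER"]]
    decoded["EXERCISE LEVEL"] = EXERCISE_LEVEL[row["EXERCISE LEVEL"]]
    decoded["QUESTION"] = QUESTION[row["QUESTION"]]
    return decoded


# Observation 1 of the typical data table (hypothetical dict layout).
obs1 = {"OBS": 1, "AGE": 26, "BMI": 23.2, "FFNUM": 0, "TEMP": 61.0,
        "GENDER": 0, "EXERCISE LEVEL": 1, "QUESTION": 1}

print(decode_row(obs1))
```

Numeric columns such as AGE and BMI pass through unchanged; only the categorical codes are translated.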
9
Types of variables and data
When gathering data we collect data from individual cases on particular variables. A variable is a unit of data collection whose value can vary. Variables can be classified into types according to the level of mathematical scaling that can be carried out on the data. There are four types of data or levels of measurement:
1. Nominal
2. Ordinal
3. Interval
4. Ratio
10
Scales of Measurement
11
Nominal data
A type of categorical data in which objects fall into unordered categories. Studies measuring nominal data must ensure that the categories are mutually exclusive and that the system of measurement is exhaustive. Variables that have only two responses, e.g. Yes or No, are known as dichotomies.
12
Examples of Nominal data
- Type of car: BMW, Mercedes, Lexus, Toyota, etc.
- Ethnicity: White British, Afro-Caribbean, Asian, Arab, Chinese, other, etc.
- Smoking status: smoker, non-smoker
13
Ordinal data
Ordinal data comprise categories that can be rank ordered. As with nominal data, the distance between categories cannot be calculated, but the categories can be ranked above or below each other.
14
Examples of Ordinal Data:
- Grades in exam: A+, A, B+, B, C+, C, D+, D, and Fail.
- Degree of illness: none, mild, moderate, acute, chronic.
- Opinion of students about stats classes: very unhappy, unhappy, neutral, happy, ecstatic!
15
Interval variables
Examples:
- Fahrenheit temperature scale: zero is arbitrary, so 40 degrees is not twice as hot as 20 degrees.
- IQ tests: there is no such thing as zero IQ, and an IQ of 120 is not twice as intelligent as an IQ of 60.
Question: Can we assume that attitudinal data represent real, quantifiable measured categories (i.e. that 'very happy' is twice as happy as plain 'happy', or that 'very unhappy' means no happiness at all)? Statisticians are not in agreement on this.
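The Fahrenheit example can be checked numerically. The sketch below assumes the Kelvin scale, which is not mentioned in the slides, as an illustration of a temperature scale with a true zero: differences on the Fahrenheit scale are meaningful, but the ratio 40/20 is not.

```python
def fahrenheit_to_kelvin(temp_f):
    """Convert degrees Fahrenheit to kelvin (a scale with an absolute zero)."""
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15


# Differences are meaningful on an interval scale.
difference_f = 40.0 - 20.0

# The naive ratio suggests "twice as hot" ...
naive_ratio = 40.0 / 20.0

# ... but on a scale with a true zero the ratio is only about 1.04.
kelvin_ratio = fahrenheit_to_kelvin(40.0) / fahrenheit_to_kelvin(20.0)

print(f"difference: {difference_f} degrees F")
print(f"naive ratio: {naive_ratio:.2f}, ratio in kelvin: {kelvin_ratio:.3f}")
```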
16
Ratio variables
The distance between any two adjacent units of measurement (intervals) is the same and there is a meaningful zero point. Can be discrete or continuous data.
Examples:
- Income: someone earning SR20,000 earns twice as much as someone who earns SR10,000.
- Height
- Weight
- Age
17
Hierarchical data order
These levels of measurement can be placed in hierarchical order.
18
Hierarchical data order
Nominal data are the least complex and give a simple measure of whether objects are the same or different. Ordinal data maintain the principles of nominal data but add a measure of order to what is being observed. Interval data build on ordinal data by allowing us to measure the distance between objects. Ratio data add to interval data by including an absolute zero.
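The hierarchy means each level inherits every property of the levels below it and adds one more. A small Python sketch of that idea follows; the names `LEVELS`, `ADDED_PROPERTY` and `properties` are illustrative assumptions, and the property labels paraphrase the description of each level.

```python
# Levels of measurement, from least to most informative.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# What each level adds on top of the previous one.
ADDED_PROPERTY = {
    "nominal": "same or different",
    "ordinal": "order",
    "interval": "distance between values",
    "ratio": "absolute zero",
}


def properties(level):
    """List every property available at the given level of measurement."""
    index = LEVELS.index(level)
    return [ADDED_PROPERTY[name] for name in LEVELS[: index + 1]]


for level in LEVELS:
    print(f"{level:>8}: {', '.join(properties(level))}")
```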
19
Terminology
- Categorical variables
- Quantitative variables
- Nominal variables
- Ordinal variables
- Binary data
- Discrete and continuous data
- Interval and ratio variables
20
Categorical data The objects being studied are grouped into categories based on some qualitative trait. The resulting data are merely labels or categories.
21
Categorical data
- Ordinal data
- Nominal data
22
Examples of categorical data Eye color: blue, brown, black, green, etc. Smoking status: smoker, non-smoker Attitudes towards the death penalty: Strongly disagree, disagree, neutral, agree, strongly agree.
23
Binary Data
A type of categorical data in which there are only two categories.
Examples:
- Smoking status: smoker, non-smoker
- Attendance: present, absent
- Result of an exam: pass, fail
- Status of student: undergraduate, postgraduate
24
Examples: Nominal data (Binary) & Ordinal data
- What is your gender? (please tick) Male / Female
- Did you enjoy the teaching session? (please tick) Yes / No
- What is your level of satisfaction with the new curriculum at the medical school? (please tick) Very satisfied / Somewhat satisfied / Neutral / Somewhat dissatisfied / Very dissatisfied
25
QUANTITATIVE DATA
The objects being studied are 'measured' based on some quantitative trait. The resulting data are a set of numbers.
Examples:
- Pulse rate
- Height
- Age
- Exam marks
- Time to complete a statistics test
- Number of cigarettes smoked
26
Quantitative data
- Continuous
- Discrete
27
Discrete Data: Only certain values are possible (there are gaps between the possible values). Implies counting.
Continuous Data: Theoretically, with a fine enough measuring device, any value within a range is possible (no gaps between the possible values). Implies measuring.
28
Discrete data -- Gaps between possible values (e.g., number of children)
Continuous data -- Theoretically, no gaps between possible values (e.g., Hb)
29
Examples of Discrete Data: Number of children in a family Number of students passing a stats exam Number of crimes reported to the police Number of bicycles sold in a day. Generally, discrete data are counts. We would not expect to find 2.2 children in a family or 88.5 students passing an exam or 127.2 crimes being reported to the police or half a bicycle being sold in one day.
30
Examples of Continuous Data: Age (in years), Height (in cm), Weight (in kg), Sys. BP, Hb, etc. Generally, continuous data come from measurements.
31
Variables
- Category: Nominal, Ordinal
- Quantity: Discrete (counting), Continuous (measuring)
32
QUANTITATIVE DATA converted to QUALITATIVE DATA
- Wt. (in kg): underweight, normal & overweight
- Ht. (in cm): short, medium & tall
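Turning a measurement into a category amounts to comparing it with cut-off values. The slide gives no cut-offs, so the sketch below uses purely hypothetical thresholds of 160 cm and 175 cm for height; they are placeholders for illustration, not clinical standards.

```python
def height_category(height_cm, short_below=160.0, tall_from=175.0):
    """Map a height in cm to 'short', 'medium' or 'tall'.

    The default cut-offs are hypothetical placeholders, not taken
    from the slides or from any clinical guideline.
    """
    if height_cm < short_below:
        return "short"
    if height_cm < tall_from:
        return "medium"
    return "tall"


for height in (152.0, 168.5, 181.0):
    print(height, "->", height_category(height))
```

The conversion discards information: two people of 161 cm and 174 cm both become "medium", which is why the quantitative form is usually kept for analysis.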
33
Clinimetrics is a science in which qualities are converted to meaningful quantities by using a scoring system.
Examples:
(1) Apgar score: based on appearance, pulse, grimace, activity and respiration; used for neonatal prognosis.
(2) Smoking Index: no. of cigarettes, duration, filter or not, whether pipe, cigar, etc.
(3) APACHE (Acute Physiology and Chronic Health Evaluation) score: used to quantify the severity of a patient's condition.
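As an illustration of such a scoring system, the Apgar score can be sketched in Python. The slide lists only the five signs; the sketch relies on the conventional rating of each sign as 0, 1 or 2 (total 0 to 10), and the function name and sample ratings are hypothetical.

```python
APGAR_SIGNS = ("appearance", "pulse", "grimace", "activity", "respiration")


def apgar_score(ratings):
    """Sum the five Apgar signs, each conventionally rated 0, 1 or 2."""
    for sign in APGAR_SIGNS:
        if ratings[sign] not in (0, 1, 2):
            raise ValueError(f"{sign} must be rated 0, 1 or 2")
    return sum(ratings[sign] for sign in APGAR_SIGNS)


# Hypothetical ratings for one newborn, for illustration only.
newborn = {"appearance": 1, "pulse": 2, "grimace": 2,
           "activity": 1, "respiration": 2}

print("Apgar score:", apgar_score(newborn))
```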
34
Data types – important?
Why do we need to know what type of data we are dealing with? The data type or level of measurement influences the type of statistical analysis techniques that can be used when analysing data.
35
Frequency Distributions
What is a frequency distribution? A frequency distribution is an organization of raw data in tabular form, using classes (or intervals) and frequencies.
What is a frequency count? The frequency or the frequency count for a data value is the number of times the value occurs in the data set.
36
Frequency Distributions
- data distribution: pattern of variability
- the center of a distribution
- the ranges
- the shapes
- simple frequency distributions
- grouped & ungrouped frequency distributions
37
Categorical or Qualitative Frequency Distributions
What is a categorical frequency distribution? A categorical frequency distribution represents data that can be placed in specific categories, such as gender, blood group, hair color, etc.
38
Categorical or Qualitative Frequency Distributions -- Example
Example: The blood types of 25 blood donors are given below. Summarize the data using a frequency distribution.
AB B A O B O B O A O O B O A O B O B B B B O B B B A O AB AB O A O AB AB O A B AB O A A B AB O A
39
Categorical Frequency Distribution for the Blood Types -- Example Continued
Note: The classes for the distribution are the blood types.
40
Quantitative Frequency Distributions -- Ungrouped
What is an ungrouped frequency distribution? An ungrouped frequency distribution simply lists the data values with the corresponding frequency counts with which each value occurs.
41
Quantitative Frequency Distributions – Ungrouped -- Example
Example: The at-rest pulse rates for 16 athletes at a meet were 57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, and 58. Summarize the information with an ungrouped frequency distribution.
42
Quantitative Frequency Distributions – Ungrouped -- Example Continued Note: The (ungrouped) classes are the observed values themselves.
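One way to build the ungrouped frequency distribution for the 16 pulse rates is to count how often each observed value occurs. The Python sketch below uses `collections.Counter` from the standard library; the variable names are illustrative.

```python
from collections import Counter

# At-rest pulse rates of the 16 athletes from the example.
pulse_rates = [57, 57, 56, 57, 58, 56, 54, 64,
               53, 54, 54, 55, 57, 55, 60, 58]

# Each observed value is its own class; the count is its frequency.
frequency = Counter(pulse_rates)

print("Pulse rate  Frequency")
for value in sorted(frequency):
    print(f"{value:>10}  {frequency[value]:>9}")
print(f"{'Total':>10}  {sum(frequency.values()):>9}")
```

Values that never occur, such as 59, simply do not appear as classes.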
43
Example of a simple frequency distribution (ungrouped)
Data: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1
Value  f
9      3
8      2
7      2
6      1
5      4
4      4
3      3
2      3
1      3
Σf = 25
44
Relative Frequency Distribution Note: The relative frequency for a class is obtained by computing f/n.
45
Example of a simple frequency distribution
Data: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1
Value  f  rel f
9      3  .12
8      2  .08
7      2  .08
6      1  .04
5      4  .16
4      4  .16
3      3  .12
2      3  .12
1      3  .12
Σf = 25, Σ rel f = 1.0
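The relative frequencies follow from dividing each frequency f by the number of observations n. A short Python sketch using the 25 values of the example (variable names are illustrative):

```python
from collections import Counter

scores = [5, 7, 8, 1, 5, 9, 3, 4, 2, 2, 3, 4, 9,
          7, 1, 4, 5, 6, 8, 9, 4, 3, 5, 2, 1]

n = len(scores)
frequency = Counter(scores)

# Relative frequency of a class = f / n.
relative_frequency = {value: f / n for value, f in frequency.items()}

print("Value  f  rel f")
for value in sorted(frequency, reverse=True):
    print(f"{value:>5}  {frequency[value]}  {relative_frequency[value]:.2f}")
print(f"sum of rel f = {sum(relative_frequency.values()):.1f}")
```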
46
Cumulative Frequency and Cumulative Relative Frequency Note: Table with relative and cumulative relative frequencies.
47
Example of a simple frequency distribution (ungrouped)
Data: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1
Value  f  cf  rel f  rel. cf
9      3  3   .12    .12
8      2  5   .08    .20
7      2  7   .08    .28
6      1  8   .04    .32
5      4  12  .16    .48
4      4  16  .16    .64
3      3  19  .12    .76
2      3  22  .12    .88
1      3  25  .12    1.0
Σf = 25, Σ rel f = 1.0
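Cumulative frequencies are running totals of f, and cumulative relative frequencies are those totals divided by n. The sketch below accumulates from the largest value downward, matching the order of the table; `itertools.accumulate` is a standard-library function and the variable names are illustrative.

```python
from collections import Counter
from itertools import accumulate

scores = [5, 7, 8, 1, 5, 9, 3, 4, 2, 2, 3, 4, 9,
          7, 1, 4, 5, 6, 8, 9, 4, 3, 5, 2, 1]

n = len(scores)
frequency = Counter(scores)

# Classes in the same order as the table: 9 down to 1.
values = sorted(frequency, reverse=True)
f = [frequency[v] for v in values]

cf = list(accumulate(f))        # running total of the frequencies
rel_cf = [c / n for c in cf]    # cumulative relative frequency

print("Value  f  cf  rel. cf")
for v, fi, ci, ri in zip(values, f, cf, rel_cf):
    print(f"{v:>5}  {fi}  {ci:>2}  {ri:.2f}")
```

The last cumulative frequency always equals n, and the last cumulative relative frequency equals 1.0, which is a quick check on the table.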
48
Quantitative Frequency Distributions -- Grouped
What is a grouped frequency distribution? A grouped frequency distribution is obtained by constructing classes (or intervals) for the data, and then listing the corresponding number of values (frequency counts) in each interval.
49
Tabulate the hemoglobin values of 30 adult male patients listed below
Patient No  Hb (g/dl)   Patient No  Hb (g/dl)   Patient No  Hb (g/dl)
1           12.0        11          11.2        21          14.9
2           11.9        12          13.6        22          12.2
3           11.5        13          10.8        23          12.2
4           14.2        14          12.3        24          11.4
5           12.3        15          12.3        25          10.7
6           13.0        16          15.7        26          12.5
7           10.5        17          12.6        27          11.8
8           12.8        18          9.1         28          15.1
9           13.2        19          12.9        29          13.4
10          11.2        20          14.6        30          13.1
50
Frequency distribution of 30 adult male patients by Hb
Hb (g/dl)     No. of patients
9.0 – 9.9     1
10.0 – 10.9   3
11.0 – 11.9   6
12.0 – 12.9   10
13.0 – 13.9   5
14.0 – 14.9   3
15.0 – 15.9   2
Total         30
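The grouped distribution can be reproduced by assigning each Hb value to a class of width 1 g/dl. In the Python sketch below the class is identified by its lower limit, obtained with `math.floor`; this is one simple way to form the classes, assuming values are recorded to one decimal place as in the table.

```python
import math
from collections import Counter

# Hb values (g/dl) of the 30 patients, in patient order.
hb_values = [12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,
             11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,
             14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1]

# Class "k.0 - k.9" is identified by its lower limit k.
class_counts = Counter(math.floor(hb) for hb in hb_values)

print("Hb (g/dl)    No. of patients")
for lower in range(9, 16):
    print(f"{lower}.0 - {lower}.9  {class_counts[lower]:>3}")
print(f"Total        {sum(class_counts.values()):>3}")
```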
51
DIAGRAMS/GRAPHS
Categorical data
--- Bar diagram (one or two groups)
--- Pie diagram
Continuous data
--- Histogram
--- Frequency polygon (curve)
--- Stem-and-leaf plot
--- Box-and-whisker plot
--- Scatter diagram
52
Two-dimensional graphs: Basic Set-Up
53
Histograms
54
Frequency Polygons
55
Example data
68 63 42 27 30 36 28 32
79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31
28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21
16 24 64 47 23 22 43 27
49 28 23 19 11 52 46 31
30 43 49 12
56
Stem and leaf plot
Stem-and-leaf of Age, N = 60, Leaf Unit = 1.0
Count  Stem  Leaves
6      1     122269
19     2     1223344555777788888
11     3     00111226688
13     4     2223334567999
5      5     01127
4      6     3458
2      7     49
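A stem-and-leaf plot splits each value into a stem (here the tens digit) and a leaf (the units digit), then lists the sorted leaves beside each stem. The Python sketch below rebuilds the plot from the 60 ages of the example data; the function name `stem_and_leaf` is an illustrative assumption.

```python
from collections import defaultdict

ages = [68, 63, 42, 27, 30, 36, 28, 32, 79, 27, 22, 28, 24, 25, 44, 65,
        43, 25, 74, 51, 36, 42, 28, 31, 28, 25, 45, 12, 57, 51, 12, 32,
        49, 38, 42, 27, 31, 50, 38, 21, 16, 24, 64, 47, 23, 22, 43, 27,
        49, 28, 23, 19, 11, 52, 46, 31, 30, 43, 49, 12]


def stem_and_leaf(values):
    """Group sorted integer values by tens digit (stem); leaves are units."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    return dict(stems)


plot = stem_and_leaf(ages)
for stem in sorted(plot):
    leaves = "".join(str(leaf) for leaf in plot[stem])
    print(f"{len(plot[stem]):>3}  {stem}  {leaves}")
```

Unlike a histogram, the plot keeps every individual value recoverable from the display.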
57
Descriptive statistics report: Box plot
- minimum score
- maximum score
- lower quartile
- upper quartile
- median
- mean
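The quantities reported in a box plot can be computed with Python's standard `statistics` module. The sketch below applies it to the 60 ages of the example data. Quartiles have several competing definitions; the "inclusive" method used here is an assumption, and other software may report slightly different quartile values.

```python
import statistics

ages = [68, 63, 42, 27, 30, 36, 28, 32, 79, 27, 22, 28, 24, 25, 44, 65,
        43, 25, 74, 51, 36, 42, 28, 31, 28, 25, 45, 12, 57, 51, 12, 32,
        49, 38, 42, 27, 31, 50, 38, 21, 16, 24, 64, 47, 23, 22, 43, 27,
        49, 28, 23, 19, 11, 52, 46, 31, 30, 43, 49, 12]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(ages, n=4, method="inclusive")

summary = {
    "minimum": min(ages),
    "lower quartile": q1,
    "median": statistics.median(ages),
    "upper quartile": q3,
    "maximum": max(ages),
    "mean": statistics.fmean(ages),
}

for name, value in summary.items():
    print(f"{name:>14}: {value}")
```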
58
Application of a box-and-whisker diagram
60
Bar Graphs: The distribution of risk factors among cases with cardiovascular diseases
- The height of each bar indicates frequency
- Frequency on the Y axis and categories of the variable on the X axis
- The bars should be of equal width and should not touch one another
61
Bar chart: HIV case enrollment in the USA by gender
62
Stacked bar chart: HIV case enrollment in the USA by gender
63
Grouped Bar Graph
64
Pie Chart: The prevalence of different degrees of hypertension in the population
- Circular diagram: the total is 100%
- Divided into segments, each representing a category
- Decide which categories to place adjacent to one another
- The size of each slice is proportional to the amount for its category
65
Pie diagram: depicts the percentage represented by each alternative as a slice of a circular pie; the larger the slice, the greater the percentage.
66
General rules for designing graphs
- A graph should have a self-explanatory legend
- A graph should help the reader to understand the data
- Axes should be labeled, with units of measurement indicated
- Scales are important: start with zero (otherwise show a // break)
- Avoid graphs with a three-dimensional impression; they may be misleading (the reader visualizes them less easily)
67
Thank You