Published by Maurice Price. Modified over 9 years ago.
1
Ana Jerončić, PhD
Department for Research in Biomedicine and Health
2
E-mail: ana.jeroncic@mefst.hr
Location: main building, 5th floor, room 512
Phone: 557-862
3
1. Describing data - Central tendency and variability
2. Estimation - Accuracy, precision, standard error, confidence intervals
3. Hypothesis testing - Test statistics, P-value, choice of a statistical test
4. Interpretation of data - Causality and association, odds ratio, risk, correlation, linear regression
5. Sources of error - Type 1 and type 2 errors, power, bias, confounding
4
Critical appraisal of scientific papers, NOT implementation of data analysis!
5
To identify the best available treatment
To prevent “medical zombies”
To perform your own research
6
1. How the data should be organized prior to data analysis
2. Data types
3. Graphical & tabular techniques for description, summary statistics
   Qualitative data
   Quantitative data
7
Height measurements among 1st year medical students
157 204 184 186 197 155 169
150 193 205 150 161 169 147
167 159 187 173 146 179 201
159 147 144 204 184 192 165
146 169 198 164 182 165 173
147 166 167 180 169 174 201
146 151 203 171 186 179 152
189 204 189 200 202 147 181
145 161 173 155 203 190 164
141 163 179 195 155 197 151
197 141 146 202 149 197 203
172 143 151 200 197 192
160 173 187 172 177 179 188
8
What is the unit of measurement?
How many observations per subject?
9
Entity     Height (cm)  Weight (kg)  Age (years)  Sex (category)
Person 1   176          70           33           Male
Person 2   171          60           38           Female
Person 3   182          75           62           Male
*          *            *            *            *

Rows: OBSERVATIONS. Columns: VARIABLES. Each cell: a measurement/observation.
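The layout above (one observation per row, one variable per column) can be sketched in Python as a list of records. This is only an illustration: the key names (`height_cm`, `weight_kg`, and so on) and the helper `column` are assumptions, not part of the presentation.

```python
# Each dict is one observation (row); each key is one variable (column).
# Values come from the example table; key names are illustrative assumptions.
observations = [
    {"entity": "Person 1", "height_cm": 176, "weight_kg": 70, "age_years": 33, "sex": "Male"},
    {"entity": "Person 2", "height_cm": 171, "weight_kg": 60, "age_years": 38, "sex": "Female"},
    {"entity": "Person 3", "height_cm": 182, "weight_kg": 75, "age_years": 62, "sex": "Male"},
]


def column(rows, variable):
    """Return all values of one variable, one value per observation."""
    return [row[variable] for row in rows]


print(column(observations, "height_cm"))  # [176, 171, 182]
print(column(observations, "sex"))        # ['Male', 'Female', 'Male']
```

Keeping one row per subject makes it easy to pull out a single variable for a chart or a summary statistic.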
10
Variable              Features of variables                     Example               Descriptive statistics  Informativeness level
Categorical, nominal  Unordered/unarranged categories           Gender, urbanization  Number, proportion      Low
Ordinal               Ordered/arranged categories               Grades, scales        Median                  Medium
Numerical             Arranged categories with equal intervals  Height, weight        Mean or median          High
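The pairing of variable type with descriptive statistic can be illustrated with a small Python sketch. The function name `describe`, the level labels, and the grade values are assumptions made for this example. For ordinal data the sketch uses `statistics.median_low`, so the result is always one of the observed categories.

```python
from collections import Counter
from statistics import mean, median, median_low


def describe(values, level):
    """Return a summary suited to the measurement level of the variable."""
    if level == "nominal":
        # Only counting is meaningful: report (count, proportion) per category.
        n = len(values)
        return {cat: (k, k / n) for cat, k in Counter(values).items()}
    if level == "ordinal":
        # Order is meaningful, distances are not: report the (low) median.
        return {"median": median_low(values)}
    if level == "numerical":
        # Distances are meaningful: mean and median are both available.
        return {"mean": mean(values), "median": median(values)}
    raise ValueError(f"unknown level: {level}")


print(describe(["Male", "Female", "Male"], "nominal"))
print(describe([2, 3, 3, 4, 5], "ordinal"))    # hypothetical grades
print(describe([176, 171, 182], "numerical"))  # heights in cm
```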
11
Categorical (qualitative): nominal, ordinal
Numerical (quantitative)
12
Height, Grades, Age in years, Weight, Insulin concentration, Blood glucose
13
How many cigarettes do you smoke a day? 1-5 6-10 11-15 16-20 21 and more
14
Have you ever had a heart attack? Yes No Do you suffer from hypertension? Yes No ?
15
Gender: Male Female
16
Marital status: married divorced widowed single lives alone ?
17
Education: elementary school high school two-year college four-year college ?
18
Likert scale
Claim: Violence among the youth is becoming an increasing problem in Croatia.
1 = I agree completely, 2 = I agree, 3 = Undecided, 4 = I disagree, 5 = I strongly disagree
19
Visual analogue scale
E.g. the pain level that an examinee experiences, from “I don’t feel pain” to “I feel intolerable pain”
20
Numerical: distance is meaningful
Ordinal: attributes can be ordered
Nominal: attributes are only named; weakest
22
Person No.   Height [cm]
Person 1     148
Person 2     142
Person 3     154
Person 4     153
Person 5     160
Person 6     177
Person 7     204
Person 8     192
Person 9     191
Person 10    203
Person 11    197
Person 12    202
Person 13    177
23
Organized data are the input for graphical and tabular data representations.
25
In one study, researchers investigated the genotype of the YPEL5 gene in a population sample from Split. They obtained the following results for 10 examinees:

Individual   YPEL5 genotype
1            AA
2            B
3            BB
4
5            AB
6
7            BB
8            AA
9            AB
10           BB

Table: Frequency distribution of YPEL5 genotypes

Genotype   Frequency   Relative frequency (proportion)   Relative frequency [%] (percentage)
AA         2           0.2                               20%
AB         3           0.3                               30%
BB         5           0.5                               50%
Total      10          1.00                              100%
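A frequency distribution like the one in the table can be computed with a few lines of Python. This is a minimal sketch: the genotype list is rebuilt from the table totals (AA = 2, AB = 3, BB = 5) because the per-individual listing is incomplete in the source, and the function name `frequency_distribution` is an assumption.

```python
from collections import Counter

# Genotypes rebuilt from the table totals, not from the per-individual list,
# which has missing entries in the source.
genotypes = ["AA"] * 2 + ["AB"] * 3 + ["BB"] * 5


def frequency_distribution(values):
    """Return {category: (count, proportion, percent)}, sorted by category."""
    counts = Counter(values)
    n = len(values)
    return {cat: (k, k / n, 100 * k / n) for cat, k in sorted(counts.items())}


for cat, (k, prop, pct) in frequency_distribution(genotypes).items():
    print(f"{cat}  {k:2d}  {prop:.1f}  {pct:.0f}%")
```

The proportions always sum to 1 and the percentages to 100, which gives a quick check on a hand-built table.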
26
Bar charts are often used to display frequencies… (axes: category names; counts or percentages)
28
For nominal data, the only allowable calculation is to count the frequency of each category. We can summarize the data in a table that presents the categories and their counts, called a frequency distribution. A relative frequency distribution lists the categories and the proportion with which each occurs.
29
Nominal data have no order. However, it is sometimes useful to arrange the categories from the most frequently occurring to the least frequently occurring. This bar chart representation is called a Pareto chart. (axes: category names; counts)
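The ordering behind a Pareto chart can be sketched in Python. The marital-status values below are hypothetical, chosen only to illustrate the sorting, and `pareto_order` is an assumed helper name.

```python
from collections import Counter

# Hypothetical nominal data; the values are illustrative only.
status = ["married", "single", "married", "divorced", "married",
          "single", "widowed", "married", "single", "divorced"]


def pareto_order(values):
    """Return (category, count) pairs, most frequent category first."""
    return Counter(values).most_common()


# A crude text bar chart in Pareto order.
for cat, k in pareto_order(status):
    print(f"{cat:<10}{'#' * k} {k}")
```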
30
A chart with relative frequencies is more informative. (axes: category names; percentages)
31
Pie Charts show relative frequencies…
32
Authors can use percentages to hide the true size of the data. Saying that 50% of a sample has a certain condition when there are only four people in the sample clearly does not provide the same level of information as 50% of a sample of 400 people. Percentages should therefore be given as an additional help for the reader rather than as a replacement for the actual data.
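One simple way to follow this advice is to always print the count and the sample size next to the percentage. The helper below is a sketch; the name `report` and the output format are assumptions.

```python
def report(count, n):
    """Format a result as 'count/n (percent%)' so the sample size stays visible."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return f"{count}/{n} ({100 * count / n:.0f}%)"


print(report(2, 4))      # 2/4 (50%)
print(report(200, 400))  # 200/400 (50%)
```

Both lines show 50%, but the reader can see immediately that the first is based on four people.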
35
Height measurements among 1st year medical students

Individual   Height (cm)
1            186
2            144
3            175
4            199
5            149
6            157
7            150
8            176
9            179
10           165
11           151
12           164
13           167
14           175
15           191
16           163
17           187
18           176
19           184
20           191
21           172
22           151
23           179

Frequency distribution for quantitative data: building a histogram
36
Frequency distribution of height

Category limits [cm]   Freq.   Relative freq.   Percent relative freq.
>140; <=150            3       0.13             13%
>150; <=160            3       0.13             13%
>160; <=170            4       0.17             17%
>170; <=180            7       0.30             30%
>180; <=190            5       0.22             22%
>190; <=200            1       0.04             4%
Total                  23      1.00             100%
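The binning step behind a histogram can be sketched in Python, using the 23 listed heights and right-closed classes of width 10 cm starting at 140 (so 150 falls in ">140; <=150"). The function name `bin_counts` is an assumption. Note that on the listed values this sketch counts 3 observations in each of the two highest classes, whereas the table reports 5 and 1; the sketch does not settle which of the two is correct.

```python
import math

# The 23 height measurements (cm) listed for the 1st year medical students.
heights = [186, 144, 175, 199, 149, 157, 150, 176, 179, 165, 151, 164,
           167, 175, 191, 163, 187, 176, 184, 191, 172, 151, 179]


def bin_counts(values, start, width, n_bins):
    """Count values in right-closed classes (start, start + width], and so on."""
    counts = [0] * n_bins
    for v in values:
        idx = math.ceil((v - start) / width) - 1  # class index, 0-based
        if 0 <= idx < n_bins:                     # ignore values outside all classes
            counts[idx] += 1
    return counts


counts = bin_counts(heights, start=140, width=10, n_bins=6)
n = len(heights)
for i, k in enumerate(counts):
    lower = 140 + 10 * i
    print(f">{lower}; <={lower + 10}  {k:2d}  {k / n:.2f}  {'#' * k}")
```

The row of `#` characters is a crude text histogram: bar length is proportional to the class frequency.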
37
Several graphical methods are used when the data are quantitative (numeric). The most important of these is the histogram. The histogram is not only a powerful graphical technique for summarizing interval data, but it also helps to explain probabilities.
38
http://www.shodor.org/interactivate/activities/Histogram/
40
Qualitative
   Frequency distribution (tabular summary of data)
   Bar chart
   Pie chart
Quantitative
   Frequency distribution (tabular summary of data)
   Histogram
   Line chart (time-series plot)
   Stem and leaf display
42
To compare two variables we use:
   Scatter plot/diagram (quantitative)
   Cross table (qualitative)
43
Scatter plot showing the strong association between enzyme activity at pH 5.5 and the 5α-reductase 2-specific mRNA expression, as expressed on the basis of β-actin (n = 30; r_s = 0.81; 95% confidence interval, 0.64–0.91; P < 0.0001).
44
Linearity and direction are the two concepts we are interested in:
   Positive linear relationship
   Negative linear relationship
   Weak or non-linear relationship
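Direction and linearity can be illustrated numerically with the Pearson correlation coefficient, sketched below in plain Python. All data values are hypothetical, and `pearson_r` is an assumed helper name. Note that this is Pearson's r, not the Spearman coefficient (r_s) reported for the enzyme-activity scatter plot.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: sign gives direction, magnitude gives linear strength."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y_pos = [2.1, 3.9, 6.2, 7.8, 10.1]   # rises with x
y_neg = list(reversed(y_pos))        # falls with x
x_c = [-2, -1, 0, 1, 2]
y_c = [4, 1, 0, 1, 4]                # y = x**2: strong but non-linear

print(pearson_r(x, y_pos))   # close to +1
print(pearson_r(x, y_neg))   # close to -1
print(pearson_r(x_c, y_c))   # 0: no linear relationship despite a clear pattern
```

The last case is why a scatter plot should be inspected before a correlation coefficient is interpreted: a coefficient near zero does not rule out a non-linear relationship.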
45
Squamous cell carcinoma tumor and perilesional tissue display distinctly different scatter plots from normal tissue. Expression levels for gene subset 1 in patient 1.
46
A cross table is used to compare two qualitative variables. If the first variable has r categories and the second variable has c categories, then we have an r × c cross table.
47
YPEL5 genotype   Disease X: YES   Disease X: NO   TOTAL
AA               2                0               2
AB               1                3               4
BB               0                4               4
TOTAL            3                7               10

Based on the data presented, do you think that YPEL5 could be associated with disease X?
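A cross table such as this one can be built from paired observations by counting each (genotype, disease) combination. In the sketch below the pairs are rebuilt from the cell counts of the table, and the names `cells`, `totals`, and `diseased_share` are assumptions. With only 10 examinees the row proportions are purely descriptive and are not a test of association.

```python
from collections import Counter

# Cell counts taken from the 3 x 2 cross table (genotype by disease X).
cells = {("AA", "yes"): 2, ("AA", "no"): 0,
         ("AB", "yes"): 1, ("AB", "no"): 3,
         ("BB", "yes"): 0, ("BB", "no"): 4}

# Expand the counts back into one (genotype, disease) pair per examinee.
pairs = [pair for pair, k in cells.items() for _ in range(k)]

table = Counter(pairs)           # missing combinations count as 0
rows, cols = ["AA", "AB", "BB"], ["yes", "no"]


def totals(table, rows, cols):
    """Return (row_totals, col_totals) for the cross table."""
    row_totals = {r: sum(table[(r, c)] for c in cols) for r in rows}
    col_totals = {c: sum(table[(r, c)] for r in rows) for c in cols}
    return row_totals, col_totals


row_totals, col_totals = totals(table, rows, cols)
diseased_share = {r: table[(r, "yes")] / row_totals[r] for r in rows}

for r in rows:
    print(f"{r}: {table[(r, 'yes')]} of {row_totals[r]} with disease X "
          f"({100 * diseased_share[r]:.0f}%)")
```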
48
Room 512 (5th floor)
E-mail: ajeronci@mefst.hr
50
The results of measuring height among medical students

Individual   Height (cm)
1            186
2            144
3            175
4            199
5            149
6            157
7            150
8            176
9            179
10           165
11           151
12           164
13           167
14           175
15           191
16           163
17           187
18           176
19           184
20           191
21           172
22           151
23           179

(Plot axes: subjects; height [cm])