Introduction to statistics in medicine – Part 1 Arier Lee.



Introduction
- Who am I
- Who do I work with
- What do I do

Why do we need statistics
[Figure labels: Population, Sample]

The important role of statistics in medicine
- Statistics pervades every aspect of medical research
- Medical practice and research generate lots of data
- Research involves asking lots of questions with strong statistical aspects
- The evaluation of new treatments, procedures and preventative measures relies on statistical concepts in both design and analysis
- Statisticians are consulted at an early stage of a medical study

Research process
- Research question
- Primary and secondary endpoints
- Study design
- Sampling and/or randomisation scheme
- Power and sample size calculation
- Pre-define analysis methods
- Analyse data
- Interpret results
- Disseminate

Bias
- A form of systematic error that can affect scientific research
- Common types of bias and how to reduce them:
- Selection bias: well-defined inclusion/exclusion criteria, randomisation
- Assessment bias: blinding
- Response bias, lost-to-follow-up bias: maximise response
- Questionnaire bias: careful wording and good interviewer training

Some common data types
- Continuous: age, weight, height, blood pressure
- Percentages: % of households owning a dog
- Counts: number of pre-term babies
- Binary: yes/no, male/female, sick/healthy
- Ordinal: taste of biscuits (strongly dislike, dislike, neutral, like, strongly like)
- Nominal categorical: ethnicity (European, Maori, Pacific Islander, Chinese etc.)

Descriptive statistics for continuous data – the average
- Mean: (sum of values)/(number in group)
- Median: the middle value, 50th percentile
- Mode: the value that occurs most often
[Figure labels: median, mode = 8, mean = 11.54]
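The three averages above can be computed with Python's standard `statistics` module. This is a minimal sketch; the values below are hypothetical and chosen only for illustration (they are not the data behind the slide's figure). They are skewed to the right, so the mean ends up larger than the median and mode.

```python
import statistics

# Hypothetical values for illustration only; not the slide's data.
values = [2, 4, 8, 8, 8, 10, 12, 15, 30]

mean = statistics.mean(values)      # (sum of values) / (number in group)
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # value that occurs most often

print(f"mean={mean:.2f} median={median} mode={mode}")
```

A single large value (30 here) pulls the mean upwards while leaving the median and mode unchanged, which is why the median is often preferred for skewed data.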

Descriptive statistics for continuous data – the spread
- Range: minimum and maximum numbers
- Interquartile range: quartiles divide data into quarters
- Standard deviation: a statistic that tells us how far from the mean the data are spread, $s = \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 / (n-1)}$ (for normally distributed data, about 95% of the data lie within 2 SD of the mean)
[Figure: the ordered values 0, 1, 2, 5, 8, 8, 9, 10, 12, 14, 18, 20, 21, 23, 25, 27, 34, ... with quartiles Q1, Q2, Q3 marked]
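As a rough illustration of these measures of spread, the sketch below uses only the 17 values that are legible on the slide. The slide's list appears to be truncated, so the results describe these 17 values and may not match the original figure. Quartile conventions also differ between software packages; the code uses the default method of `statistics.quantiles`, which is an assumption rather than the presenter's choice.

```python
import math
import statistics

# The 17 values legible on the slide (the original list may be longer).
values = [0, 1, 2, 5, 8, 8, 9, 10, 12, 14, 18, 20, 21, 23, 25, 27, 34]

# Range: minimum and maximum.
data_range = (min(values), max(values))

# Quartiles split the ordered data into quarters; IQR = Q3 - Q1.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Sample standard deviation, written out as sqrt(sum((x_i - xbar)^2) / (n - 1)).
n = len(values)
xbar = sum(values) / n
sd_manual = math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))

# The library function uses the same n - 1 denominator.
sd_library = statistics.stdev(values)

print("range:", data_range)
print("Q1, Q2, Q3:", q1, q2, q3, "IQR:", iqr)
print(f"SD (manual)={sd_manual:.2f} SD (library)={sd_library:.2f}")
```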

Estimation
- Statistical inference is a process of generalising results calculated from a sample to a population
- Estimation: determine the value of a variable and its likely range (i.e. 95% confidence intervals)
- We are interested in some numerical characteristic of a population (called a parameter), e.g. the mean height or the proportion of pregnant women with hypertension
- We take a sample from the population and calculate an estimate of this parameter

Estimation – a simple example
- We want to estimate the mean height of 10-year-old boys
- Take a random sample of 100 ten-year-old boys and calculate the sample mean
- The mean height of my random sample is 141cm
- Based on our random sample, we estimate the mean height of 10-year-old boys is 141cm
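The example above can be sketched in code. Since no real measurements are given, the sample is simulated from a normal distribution; the true mean of 141 cm mirrors the slide, while the standard deviation of 7 cm is a purely hypothetical assumption. The interval uses the common normal approximation (estimate ± 1.96 × standard error); a t-based interval would be slightly wider.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

TRUE_MEAN_CM = 141.0  # mirrors the slide's example
TRUE_SD_CM = 7.0      # hypothetical value, assumed for illustration

# Simulate a random sample of 100 ten-year-old boys.
sample = [rng.gauss(TRUE_MEAN_CM, TRUE_SD_CM) for _ in range(100)]

# Point estimate of the population mean.
sample_mean = statistics.mean(sample)

# Standard error of the mean and an approximate 95% confidence interval.
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
ci_low = sample_mean - 1.96 * standard_error
ci_high = sample_mean + 1.96 * standard_error

print(f"estimate={sample_mean:.1f} cm, 95% CI=({ci_low:.1f}, {ci_high:.1f})")
```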

Distribution of data
- It is essential to know the distribution of your data so you can choose the appropriate statistical method to analyse the data
- Data can be distributed (spread out) in different ways
- Continuous data: there are many cases when the data tend to be around a central value with no bias to the left or right (the normal distribution)

Distribution of data – Normal distribution
- Many parametric methods assume the data are normally distributed
- Bell curve
- Peak at a central value
- Symmetric about the centre
- Mean = median = mode
- The distribution can be described by two parameters: mean and standard deviation

Standard deviation
- Standard deviation shows how much variation or ‘dispersion’ exists in the data
- For normally distributed data, about 95% of the data are contained within 2 standard deviations of the mean

A simulated example – birth weight
Mean = 3250 g, SD = 550 g
[Figure: histogram of birth weight]
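A simulation along these lines can be sketched as follows. It draws birth weights from a normal distribution with the slide's mean (3250 g) and SD (550 g); the number of simulated babies and the random seed are arbitrary assumptions. It then checks what fraction of the values fall within 2 standard deviations of the mean.

```python
import random
import statistics

rng = random.Random(1)  # arbitrary seed, for reproducibility only

# Simulate 10,000 birth weights (grams); the sample size is an assumption.
weights = [rng.gauss(3250, 550) for _ in range(10_000)]

m = statistics.mean(weights)
s = statistics.stdev(weights)

# Fraction of simulated values within 2 SD of the sample mean.
within_2sd = sum(m - 2 * s <= w <= m + 2 * s for w in weights) / len(weights)

print(f"mean={m:.0f} g, SD={s:.0f} g, within 2 SD: {within_2sd:.1%}")
```

For a normal distribution the fraction should come out close to 95%, which is the rule of thumb quoted on the standard deviation slide.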

Some other common distributions
- Binomial distribution: gestational diabetes (Yes/No)
- Uniform distribution: throwing a die, equal (uniform) probability for each of the six sides
- And many more…
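Both distributions are easy to simulate. In the sketch below, the die rolls illustrate the uniform distribution, and the count of "Yes" outcomes among a group of women illustrates the binomial distribution. The probability of 0.1 and the group size of 50 are hypothetical values chosen for illustration, not figures from the presentation.

```python
import random
import statistics
from collections import Counter

rng = random.Random(7)  # arbitrary seed, for reproducibility only

# Uniform: each of the six faces of a die is equally likely.
rolls = [rng.randint(1, 6) for _ in range(6000)]
face_counts = Counter(rolls)

# Binomial: number of "Yes" outcomes among N_WOMEN independent yes/no trials.
P_YES = 0.1   # hypothetical probability of gestational diabetes
N_WOMEN = 50  # hypothetical group size


def simulate_cases():
    """Count the 'Yes' outcomes in one simulated group of women."""
    return sum(rng.random() < P_YES for _ in range(N_WOMEN))


cases = [simulate_cases() for _ in range(2000)]
mean_cases = statistics.mean(cases)

print("die face counts:", sorted(face_counts.items()))
print(f"mean cases per group: {mean_cases:.2f} (expected {N_WOMEN * P_YES:.1f})")
```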

Sampling variability
- Because of random sampling, the estimated value will be just an estimate, not exactly the same as the true value
- If repeated samples are taken from a population, then each sample, and hence each sample mean and standard deviation, is different. This is known as sampling variability

Sampling variability
- In practice we do not repeat the sampling to measure sampling variability; we endeavour to obtain a random sample and use statistical theory to quantify the error
- Fundamental principle to justify that our estimate is reasonable: if it were possible to repeat a study over and over again, in the long run the estimates from each study would be distributed around the true value
- If we have a random sample, then the sampling variability depends on the size of the sample and the underlying variability of the variable being measured
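Although real studies are not repeated, a simulation can repeat the sampling and show both points: the sample means scatter around the true value, and their spread depends on the sample size and on the underlying variability. This is a minimal sketch; the population mean and SD reuse the birth weight example, while the sample sizes and number of repeats are arbitrary assumptions. Statistical theory predicts the spread of the sample means to be roughly SD/√n.

```python
import math
import random
import statistics

rng = random.Random(2024)  # arbitrary seed, for reproducibility only

POPULATION_MEAN = 3250.0  # grams, from the birth weight example


def sd_of_sample_means(sample_size, population_sd, n_repeats=2000):
    """Repeat the study n_repeats times and return the SD of the sample means."""
    means = []
    for _ in range(n_repeats):
        sample = [rng.gauss(POPULATION_MEAN, population_sd)
                  for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)


# Larger samples give less sampling variability.
sd_n25 = sd_of_sample_means(sample_size=25, population_sd=550)
sd_n100 = sd_of_sample_means(sample_size=100, population_sd=550)

# A more variable population gives more sampling variability.
sd_n100_wide = sd_of_sample_means(sample_size=100, population_sd=1100)

print(f"n=25,  SD=550:  {sd_n25:.1f} (theory {550 / math.sqrt(25):.1f})")
print(f"n=100, SD=550:  {sd_n100:.1f} (theory {550 / math.sqrt(100):.1f})")
print(f"n=100, SD=1100: {sd_n100_wide:.1f} (theory {1100 / math.sqrt(100):.1f})")
```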