Biostatistics in Practice Youngju Pak Biostatistician Peter D. Christenson Session 1: Quantitative and Inferential.


Biostatistics in Practice Youngju Pak Biostatistician Peter D. Christenson Session 1: Quantitative and Inferential Issues I

Why Statistics? "For Today’s Graduate, Just One Word: Statistics," NY Times, Aug 5, 2009. "I keep saying that the sexy job in the next 10 years will be statisticians," said Hal Varian, chief economist at Google. "I am not much given to regret, so I puzzled over this one a while. Should have taken much more statistics in college, I think. :)" - Max Levchin, PayPal co-founder, Slide founder

Who am I? Dr. Youngju Pak. Originally from South Korea. PhD-Biostatistics, MS-Stat., BA-Stat. Assistant Professor of Biostatistics at MU until 2012. Joined LA BioMED in March 2013. Practicing Biostatistics since 2000.

Who are you? Name Career Aspirations

Class webpage & Session Schedule Class Webpage: Select "Courses" at (use Explore; Chrome does not quite work with this website). All class materials are posted and will be updated on the class webpage. There will be some pop quizzes. There will be some HW assignments. The TOP THREE will be announced and rewarded at the last session.

Session 1 Objectives General quantitative needs in biological research Overview of statistical issues using a published paper How to run Statistical software, MYSTAT

General Quantitative Needs Descriptive: Appropriate summarization to meet scientific questions: e.g., changes, or % changes, or reaching threshold? mean, or minimum, or range of response? average time to death, or chances of dying by a fixed time?

General Quantitative Needs, Cont’d Inferential: Could results be spurious, a fluke, due to “natural” variations or chance? Inferential statistics: 95% confidence intervals, p-values, etc. Sensitivity/Power: How many subjects are needed?
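As a rough illustration of the 95% confidence interval mentioned above, the following sketch computes a normal-approximation interval for a mean. The data are made-up numbers, and the function name `mean_ci` is hypothetical; this is not taken from the course material or from MYSTAT.

```python
import statistics
from statistics import NormalDist

def mean_ci(values, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / n ** 0.5      # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)     # roughly 1.96 when level is 0.95
    return mean - z * se, mean + z * se

# Hypothetical score changes for 12 children (made-up numbers).
changes = [0.4, -0.1, 0.3, 0.8, 0.0, 0.5, -0.2, 0.6, 0.2, 0.1, 0.7, 0.3]
low, high = mean_ci(changes)
print(f"mean = {statistics.fmean(changes):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

With a sample this small, a t-based multiplier would give a somewhat wider interval; the normal multiplier is used here only to keep the sketch short.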

Professional Statistics Software Package Output Enter code; syntax. Stored data; accessible.

Microsoft Excel for Statistics Primarily for descriptive statistics. Limited output. No analyses for %s.

Free Statistics Software: Mystat

Free Study Size Software

Session 1 Objectives General quantitative needs in biological research Overview of statistical issues using a published paper How to run Statistical software, MYSTAT

Statistical Issues Subject selection Randomization Efficiency from study design Summarizing study results

Paper with Common Statistical Issues Case Study:

McCann, et al., Lancet 2007 Nov 3;370(9598): Food additives and hyperactive behaviour in 3-year-old and 8/9-year-old children in the community: a randomised, double-blinded, placebo-controlled trial.
– Objective: test whether intake of artificial food color and additive (AFCA) affects childhood behavior
– Target population: 3-4 and 8-9 year old children
– Study design: randomized, double-blinded, controlled, crossover trial
– Sample size: 153 (3 years), 144 (8-9 years) in Southampton, UK
– Sampling: Stratified sampling based on SES
– Baseline measure: 24h recall by the parent of the child’s pretrial diet
– Group: three groups (mix A, mix B, placebo)
– Outcomes: ADHD rating scale IV by teachers, WWP hyperactivity score by parents, classroom observation code, Conners continuous performance test II (CPTII) → GHA score

Statistical Issues Subject selection Randomization Efficiency from study design Summarizing study results

Representative or Random Samples How were the children to be studied selected (second column on the first page)? The authors purposely selected "representative" social classes. Is this better than a "randomly" chosen sample that ignores social class? Often hear: Non-random = Non-scientific.

Case Study: Participant Selection No mention of random samples.

Case Study: Participant Selection It may be that only a few schools are needed to get sufficient individuals. If, among all possible schools, there are few that are lower SES, none of these schools may be chosen. So, a random sample of schools is chosen from the lower SES schools, and another random sample from the higher SES schools.
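The stratified selection described above can be sketched as follows. The school labels, the counts per stratum, and the function name `stratified_sample` are all hypothetical; the paper's actual selection procedure is not reproduced here.

```python
import random

def stratified_sample(schools_by_stratum, n_per_stratum, seed=0):
    """Draw a separate simple random sample of schools within each SES stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    chosen = {}
    for stratum, schools in schools_by_stratum.items():
        chosen[stratum] = rng.sample(schools, n_per_stratum)
    return chosen

# Hypothetical sampling frame: few lower-SES schools, many higher-SES schools.
frame = {
    "lower_SES": ["L1", "L2", "L3"],
    "higher_SES": [f"H{i}" for i in range(1, 28)],
}
picked = stratified_sample(frame, n_per_stratum=2)
print(picked)
```

Sampling within each stratum guarantees that lower SES schools appear in the study, which a single random draw from all 30 schools would not.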

Selection by Over-Sampling It is not necessary that the % lower SES in the study is the same as in the population. There may still be too few subjects in a rare subgroup to get reliable data. Can “over-sample” a rare subgroup, and then weight overall results by proportions of subgroups in the population. The CDC NHANES( ) studies do this.
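A minimal sketch of the weighting step, assuming a rare subgroup that makes up 10% of the population but was over-sampled to 50% of the study. All numbers and the function name `weighted_mean` are made up for illustration; this is not the NHANES weighting procedure.

```python
def weighted_mean(subgroup_means, shares):
    """Combine subgroup means using the given proportions as weights."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(subgroup_means[g] * shares[g] for g in subgroup_means)

# Hypothetical subgroup means measured in the study.
sample_means = {"rare": 8.0, "common": 4.0}
sample_shares = {"rare": 0.5, "common": 0.5}       # shares in the over-sampled study
population_shares = {"rare": 0.1, "common": 0.9}   # shares in the population

naive = weighted_mean(sample_means, sample_shares)          # ignores the over-sampling
weighted = weighted_mean(sample_means, population_shares)   # re-weighted to the population
print(round(naive, 2), round(weighted, 2))
```

The naive average is pulled toward the over-sampled subgroup; weighting by population proportions corrects this while still keeping enough subjects in the rare subgroup for reliable subgroup estimates.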

Statistical Issues Subject selection Randomization Efficiency from study design Summarizing study results

Basic Study Designs
1. Prospective (longitudinal): Risk Factor (2014) → Disease status (2020)
2. Retrospective (Case-Control): Disease status (2014) → Risk Factor (2000)
3. Cross sectional: Disease status (2014) ↔ Risk Factor (2014)
4. Experimental or Randomized-Control: Risk Factor (2014) → Disease status (2020) with assignment of Risk Factor

Random Samples vs. Randomization We have been discussing the selection of subjects to study, often a random sample. An observational study would, well, just observe them. An interventional study assigns each subject to one or more treatments in order to compare treatments. Randomization refers to making these assignments in a random way.

Why Randomize? So that groups will be similar except for the intervention. So that, when enrolling, we will not unconsciously choose an “appropriate” treatment for a particular subject. Minimizes the chances of introducing bias when attempting to systematically remove it, as in plant yield example.
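One way to take the choice of treatment out of the investigator's hands is to let a random number generator assign it. The sketch below randomizes the order of three diets for a crossover study, balancing the six possible orders within blocks of six subjects. The treatment labels follow the case study, but the blocking scheme, seed, and function name are assumptions, not the trial's documented allocation method.

```python
import itertools
import random

TREATMENTS = ("mix A", "mix B", "placebo")

def randomize_orders(subject_ids, seed=2007):
    """Assign each subject a random treatment order, balanced in blocks of six."""
    rng = random.Random(seed)
    orders = list(itertools.permutations(TREATMENTS))  # the 6 possible sequences
    ids = list(subject_ids)
    assignment = {}
    for start in range(0, len(ids), len(orders)):
        block = ids[start:start + len(orders)]
        shuffled = orders[:]
        rng.shuffle(shuffled)          # random order of sequences within this block
        for sid, order in zip(block, shuffled):
            assignment[sid] = order
    return assignment

alloc = randomize_orders(range(1, 13))
for sid, order in alloc.items():
    print(sid, " -> ".join(order))
```

Because the assignment comes from the shuffle rather than from the person enrolling, there is no opportunity to match an "appropriate" treatment to a particular subject.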

Case Study: Crossover Design Each child is studied on 3 occasions under different diets. Is this better than three separate groups of children? Why, intuitively? How could you scientifically prove your intuition?

Statistical Issues Subject selection Randomization Efficiency from study design Summarizing study results

Blocked vs. Unblocked Studies AKA matched vs. unmatched. AKA paired vs. unpaired. Block = Pair = Set receiving all treatments. Set could be an individual at multiple times (pre and post), or left and right arms for sunscreen comparison; twins or family; centers in multi-center study, etc. Block ↔ Homogeneous. Blocking is efficient because treatment differences are usually more consistent among subjects than each separate treatment is.
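The efficiency claim can be checked with a small simulation, sketched here under made-up assumptions: subjects differ a lot from one another (standard deviation 10), but the difference between treatments B and A within a subject is consistent (true effect 3, noise standard deviation 1).

```python
import random
import statistics

rng = random.Random(1)
n = 30
baseline = [rng.gauss(50, 10) for _ in range(n)]        # large between-subject spread
resp_a = [b + rng.gauss(0, 1) for b in baseline]        # response under treatment A
resp_b = [b + 3 + rng.gauss(0, 1) for b in baseline]    # response under B, true effect 3

# Paired analysis: work with within-subject differences.
diffs = [y - x for x, y in zip(resp_a, resp_b)]
se_paired = statistics.stdev(diffs) / n ** 0.5

# Unpaired analysis: treat A and B as if they came from separate groups.
se_unpaired = (statistics.variance(resp_a) / n + statistics.variance(resp_b) / n) ** 0.5

print(f"estimated effect = {statistics.fmean(diffs):.2f}")
print(f"SE paired = {se_paired:.2f}, SE unpaired = {se_unpaired:.2f}")
```

The between-subject variation cancels in the differences, so the paired standard error is much smaller, and the same effect can be detected with fewer subjects.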

Potential Efficiency Due to Pairing [Figure: Unpaired (A and B in separate groups) vs. Paired (A and B in a paired set), with Δ = B - A]

Statistical Issues Subject selection Randomization Efficiency from study design Summarizing study results

Outcome Measures Generally, how were the outcome measures defined (third page)? They are more complicated here than for most studies. What are the units (e.g., kg, mmol, $, years)? Outcome measures are specific and pre-defined. Aims and goals may be more general.

Summarization of Data with Descriptive Statistics

What is the difference between Table 1 and Table 2 in terms of methods used to summarize the data?

Types of Data
Variable
– Categorical (Qualitative)
  – Nominal: Categories are mutually exclusive and unordered. Examples: Gender, Blood group, Eye colour, Marital status
  – Ordinal: Categories are mutually exclusive and ordered. Examples: Disease stage, Education level, 5-point Likert scale
– Numerical (Quantitative)
  – Counts: Integer values. Examples: Days sick per year, Number of pregnancies, Number of hospital visits
  – Measured (continuous): Takes any value in a range of values. Examples: weight in kg, height in feet, age (in years)

It is critical to identify the type of data since the choice of an appropriate statistical test as well as how to summarize the data depend on the type of the data.

Describing categorical & quantitative data
Categorical Data
– Binary, Nominal, or Ordinal data: Disease status (yes, no), Education level, The assignment of the treatment, Cancer stage, Marital Status
– Frequency tables (one, two, or multi way tables) are usually used
Quantitative Data
– Counts or Continuous Data: Weight, Blood pressure, Age, Length of hospital stay in days, The total number of ER visits per year
– Means or Medians are used for the measure of the central tendency.
– Standard deviations or percentiles are used for the measure of variability.
– When data is skewed, Medians & percentiles are better summary statistics
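To see why medians and percentiles suit skewed data, consider this sketch with hypothetical lengths of hospital stay, where one long stay dominates. The numbers are invented for illustration.

```python
import statistics

# Hypothetical lengths of hospital stay in days; one long stay skews the data.
stay_days = [1, 2, 2, 3, 3, 4, 5, 30]

mean = statistics.fmean(stay_days)
median = statistics.median(stay_days)
sd = statistics.stdev(stay_days)
q1, q2, q3 = statistics.quantiles(stay_days, n=4)   # quartiles (25th, 50th, 75th percentiles)

print(f"mean = {mean:.2f}, SD = {sd:.2f}")
print(f"median = {median}, quartiles = ({q1}, {q3})")
```

The single 30-day stay pulls the mean above every other observation and inflates the standard deviation, while the median and quartiles still describe the typical patient.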

How to display Data A picture is worth a thousand words! To get a ‘feel’ for the data. Categorical data –Frequency tables, Contingency tables (cross tables), Bar charts, Pie-charts Quantitative data –Dot plots, Histograms, Box-Whisker plots, Scatter plots

Frequency Tables

Contingency Tables (Crosstabulations)
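A frequency table counts one categorical variable; a contingency table counts combinations of two. The sketch below builds both from hypothetical records; the groups echo the case study, but the outcomes are made up.

```python
from collections import Counter

# Hypothetical records: (diet group, whether a behaviour threshold was reached).
records = [
    ("mix A", "yes"), ("mix A", "no"), ("mix A", "yes"),
    ("mix B", "no"), ("mix B", "yes"), ("mix B", "no"),
    ("placebo", "no"), ("placebo", "no"), ("placebo", "yes"),
]

freq = Counter(group for group, _ in records)   # one-way frequency table
cross = Counter(records)                        # two-way (group x outcome) counts

groups = sorted(freq)
outcomes = ["yes", "no"]
print("group    " + "  ".join(f"{o:>3}" for o in outcomes) + "  total")
for g in groups:
    row = "  ".join(f"{cross[(g, o)]:>3}" for o in outcomes)
    print(f"{g:<9}{row}  {freq[g]:>5}")
```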

Bar Charts

Pie Charts

Histograms To catch the patterns of the data Divide up the data points into several mutually exclusive intervals –Categorize the data points.
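The binning step behind a histogram can be sketched as below, using hypothetical weights in kg and equal-width, half-open intervals so that every value falls in exactly one bin. The function name and bin settings are assumptions for illustration.

```python
def histogram_counts(values, start, width, n_bins):
    """Count values in bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - start) // width)
        if 0 <= idx < n_bins:          # values outside the range are ignored
            counts[idx] += 1
    return counts

# Hypothetical weights in kg.
weights = [52, 58, 61, 63, 67, 70, 71, 74, 79, 88]
counts = histogram_counts(weights, start=50, width=10, n_bins=4)

for i, c in enumerate(counts):
    lo = 50 + 10 * i
    print(f"{lo}-{lo + 10} kg | " + "#" * c)
```

Because each interval includes its lower bound and excludes its upper bound, a value such as 70 is counted once, in the 70-80 bin.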

Scatter plots Usually used to illustrate a relationship between two variables.

Box-Whisker Plots

What have we learned today?

Assignments HW #1 is posted on the course website. Pre-Step for HW #1 –Install MYSTAT on your laptop or on a computer in your school computer lab with permission from your school (Ask Ms. Aberle for help) –Download Survey.sav (SPSS data file) from the course website (under Session 1) Submit a hard copy of the completed HW at the next session. Read the article, focusing on the contents of Tables 3 & 4 and Figure 4.