

Doing Data Science – Chapter 12: Epidemiology
Vast amounts of individual patient medical data are available
– Detailed: visits, prescriptions, outcomes, etc.
– Records cover lifetimes
– Largest databases have records on 80 million people
However, many medical studies are observational
– Not founded on data
– Results affect the actions of doctors and insurance regulators

Confounder Problem and Stratification
Confounding problem: an extraneous variable that correlates with both the dependent and the independent variable, giving an incorrect perception of cause and effect
Stratification: partitioning a case into subcases and evaluating just the subcases to reach conclusions about the top-level case
– A weighted average is one way of evaluating the subcases
Example [p ]:
– A study in which equal numbers of women (50) and men (50) received the treatment, but unequal numbers (80 women, 20 men) were in the control group
– Original causal effect is 10%
– Stratified causal effect is 5% for men and 11.25% for women
– This does NOT prove that the treatment side effects are twice as strong for women
Problem: errors in causality arise if the group sizes after stratification are too different to give meaningful statistics
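The arithmetic behind the example can be sketched in a few lines of Python. The slide gives only the group sizes and the resulting percentages, so the event counts below are hypothetical, chosen so that they reproduce the 10%, 5% and 11.25% figures. Weighting the strata by their total size is likewise an assumption, one of several reasonable choices.

```python
from fractions import Fraction

# (events, group size) per arm. Group sizes come from the slide;
# the event counts are HYPOTHETICAL, picked to reproduce its percentages.
groups = {
    "men":   {"treated": (10, 50), "control": (3, 20)},
    "women": {"treated": (10, 50), "control": (7, 80)},
}

def risk_difference(stratum):
    """Treated event rate minus control event rate within one stratum."""
    t_events, t_n = stratum["treated"]
    c_events, c_n = stratum["control"]
    return Fraction(t_events, t_n) - Fraction(c_events, c_n)

def crude_effect(groups):
    """Risk difference after pooling all strata (ignores the confounder)."""
    pooled = {
        arm: (sum(g[arm][0] for g in groups.values()),
              sum(g[arm][1] for g in groups.values()))
        for arm in ("treated", "control")
    }
    return risk_difference(pooled)

def weighted_effect(groups):
    """Average of stratum effects, weighted by stratum size (an assumption)."""
    sizes = {k: g["treated"][1] + g["control"][1] for k, g in groups.items()}
    total = sum(sizes.values())
    return sum(Fraction(sizes[k], total) * risk_difference(g)
               for k, g in groups.items())

print("crude:   ", float(crude_effect(groups)))             # 0.1
print("men:     ", float(risk_difference(groups["men"])))   # 0.05
print("women:   ", float(risk_difference(groups["women"]))) # 0.1125
print("weighted:", float(weighted_effect(groups)))          # 0.090625
```

With these counts the men's control rate rests on only 20 people, so a single additional event would shift it by five percentage points. This illustrates why very unequal stratum sizes make the stratified estimates fragile.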

Data Driven Studies
Analysis of 50 studies of drug/outcome pairs
– 5000 analyses for each pair on nine databases
– Example: ACE inhibitors (treatment for hypertension)/swelling of the heart
Results varied between databases from 3X risk to 6X risk
– For 20 of the 50 pairs, whether a risk was found depended on the database
– By adjusting the choice of database, confounders, and time windows, any study can be made to show either risk or no risk

Data Driven Studies
Observational Medical Outcomes Partnership (OMOP)
– Tests how well current methods predict things we already know
– 10 large medical databases containing records for 200 million people
– $25M
– Determined an ROC curve: the Area Under the Curve (AUC) was 0.65, not much better than the 0.5 of random guessing
– Databases are self-consistent: using a single database gave better accuracy (0.92 in one case)
– Graphs [p.302] show ~80% sensitivity at a ~10% false-positive rate
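As a reminder of what the AUC, sensitivity and false-positive figures measure, here is a minimal sketch in plain Python. The labels and scores are made-up illustration data, not OMOP results. The AUC is computed as the fraction of (positive, negative) pairs in which the positive case gets the higher score, with ties counting half. This is equivalent to the area under the ROC curve.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_and_fpr(labels, scores, threshold):
    """Sensitivity (true-positive rate) and false-positive rate at one cutoff."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / n_pos, fp / n_neg

# HYPOTHETICAL data: 1 = drug/outcome pair known to be a true risk, 0 = known not to be.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

print(auc(labels, scores))                       # 8 of 9 pairs ranked correctly
print(sensitivity_and_fpr(labels, scores, 0.5))  # (2/3, 1/3)
```

A method that scored every pair identically would get an AUC of 0.5, which is the "random" baseline against which the slide compares the 0.65 figure.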

“The epidemiologists in general don’t believe the results of this study.” In other words, they prefer to rely on observational rather than data-driven conclusions.
