Biostatistics 760 Random Thoughts.

Upcoming Classes
- Bios 761: Advanced Probability and Statistical Inference
- Bios 767: Longitudinal Data Analysis
- Bios 780: Theory and Methods for Survival Analysis
- Bios 841: Statistical Consulting

Bios 761
- Frequentist and Bayesian decision theory
- Hypothesis testing: UMP tests, etc.
- Bootstrap and other methods of inference
- High dimensional data methods

Bios 780
- Time-to-event data
- Right censoring
- Counting processes; martingales
- Semiparametric approaches
- Kaplan-Meier estimator
- Log-rank statistic
- Cox model
- Data analysis
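
As an illustration of two of the Bios 780 topics (right censoring and the Kaplan-Meier estimator), here is a minimal sketch of the product-limit estimate, S(t) = product over event times t_i <= t of (1 - d_i / n_i), where d_i is the number of events at t_i and n_i the number still at risk. The function name and the toy data are assumptions made for this example; they do not come from the course.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed, 0 if right-censored
    (hypothetical helper written for illustration only)
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)   # each subject leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Remove both events and censored subjects observed at time t.
        at_risk -= exits[t]
    return curve

# Toy data: five subjects, two of them right-censored (event flag 0).
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is how the estimator uses partial follow-up.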

Bios 841
- Consulting versus collaboration
- Bringing it all together to solve problems
- Communicating about statistics
- Three real problems
- Three journal style reports
- One final oral presentation
- Real time problem solving
- What is the role of statistical theory?

A Few War Stories
- As a student: thesis on surrogates
- As a postdoc: infectious diseases
- As a new professor: cystic fibrosis (CF)*
- Working on tenure: empirical processes
- Empirical processes and cancer*
- Chair of the DSMC for NICHD
- Artificial intelligence and NSCLC

CF Neonatal Screening
- 1992: Joined Phil Farrell’s CF study team
- 1997: Farrell, Kosorok, Laxova, et al., published in NEJM
- 2004 (Oct. 15): CDC recommended CF newborn screening; the 1997 article was judged the only valid randomized trial
- States offering CF newborn screening: 3 in 1997, 12 in 2004, all 50 today

What Role Did “Theory” Play?
- Used state-of-the-art statistical methods that were robust (GEE)
- In other CF research we have used:
  - Current status methods (parametric, robust)
  - Constrained regression estimation
  - Semiparametric bootstrap inference
  - Martingale based survival analysis
- New work using artificial intelligence

Empirical Processes and Cancer
- Non-Hodgkin’s Lymphoma Prognostic Factors Project (1993, NEJM)
- Cox proportional hazards model employed to ascertain risks of 5 prognostic factors: Age, performance Status, serum lactate dehydrogenase Level, number of extranodal disease Sites, tumor Stage
- Diagnostics show the model fits poorly
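
For reference, the Cox proportional hazards model mentioned here has the standard form below. The notation is an assumption of this note rather than the slides' own: Z is the vector of the five prognostic factors, beta the regression coefficients, and lambda_0 an unspecified baseline hazard.

```latex
\lambda(t \mid Z) = \lambda_0(t)\,\exp\!\left(\beta^{\top} Z\right)
```

The baseline hazard is left nonparametric while the covariate effects are parametric, so a poor fit usually points to the proportional hazards assumption not holding for one or more factors.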

What is the Problem?
- Poor survival function prediction
- Possibly incorrect interpretation of risk factor effects
- A model that adds a single parameter to the Cox model was developed and fit
- This new model fits well (Kosorok, Lee and Fine, 2004)
- Inference for the new model is complicated

What Does Theory Tell Us?
- We can derive valid inferential tools for the new model: estimation and bootstrap
- Robustness was also studied: we learn theoretically that the Cox model is robust to this kind of model misspecification:
  - The direction of the regression coefficients is preserved
  - Should use robust variance for Cox model
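
To make the idea of bootstrap inference concrete, here is a minimal sketch of the basic nonparametric bootstrap for a standard error. This is the textbook resampling scheme, not the specific bootstrap developed for the new model; the function name, the sample values, and the choice of the median as the estimator are all assumptions for illustration.

```python
import random
import statistics

def bootstrap_se(data, estimator, n_boot=2000, seed=0):
    """Estimate the standard error of `estimator` by resampling with replacement.

    Hypothetical helper: resamples `data` n_boot times, applies the
    estimator to each resample, and returns the standard deviation
    of the resulting replicates.
    """
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(estimator(resample))
    return statistics.stdev(replicates)

# Made-up sample; the median has no simple closed-form standard error.
sample = [4.1, 5.3, 2.8, 6.0, 5.5, 3.9, 4.7, 5.1]
se_median = bootstrap_se(sample, statistics.median)
```

Whether a resampling scheme like this gives valid inference for a particular estimator is exactly the kind of question that theory answers; the code alone cannot.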

Theory Versus Applications
- The title implies there is conflict between theory and applications
- This isn’t true!
- Theory provides a basis for correct thinking and problem solving for applications
- Applications drive new theoretical development

Theory Can Be Impractical
- Law of the iterated logarithm: needs a sample size of 10^8 (“asymptopia”).
- Sometimes higher order approximations are needed before it becomes useful.
- Sometimes computational properties of asymptotically optimal estimators are poor.
- Some hard problems take years to solve.
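
For readers who have not seen it, the law of the iterated logarithm can be stated as follows. The setup is the classical one and is an assumption of this note: X_1, X_2, ... are i.i.d. with mean 0 and variance 1, and S_n = X_1 + ... + X_n.

```latex
\limsup_{n \to \infty} \frac{S_n}{\sqrt{2 n \log\log n}} = 1 \quad \text{almost surely}
```

The factor sqrt(2 log log n) grows extremely slowly (it is only about 2.4 at n = 10^8), which gives some intuition for why the limiting behavior is hard to see at practical sample sizes.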

Why Theory is Needed
- Often it does work for practical sample sizes.
- Can reveal properties that are universally valid: simulation studies are limited to the scenarios investigated.
- Theory can lead toward methodological solutions (Cook and Kosorok, 2004 JASA).
- Theory can drive scientific discovery.
- Some results are beautiful.

Data Mining Versus Inference
- Data mining is summarizing and representing data no matter how complicated
- Inference is determining valid measures of uncertainty
- Patterns obtained from data mining can be misleading
- Inference without data mining may miss important structure

The Core of Statistics
- Statistics is the science of science
- How do we learn from our world and draw meaningful and valid conclusions from it?
- Need both data mining and valid inference
- Requires a unique kind of intuition
- Needs many different intellectual perspectives
- One of the most challenging of all fields

Everyone Needs Core Literacy
- All statisticians need to know enough theory to have core literacy about statistics and to be able to problem solve
- All statisticians need to know enough about applications to know what is important
- All biostatisticians need to know enough statistical methods to be useful in practice
- The purpose of a Ph.D. in Biostatistics is to enable the creation of new methodology

Semiparametric Inference
- The study of statistical models with parametric and/or nonparametric parts
- Can achieve trade-off between scientific meaning and model “robustness”
- Estimation and inference are often hard
- There exists an efficiency bound for parametric and some nonparametric parts
- NPMLE, testing and estimating equations

Empirical Processes
- Tools for complex model inference and high dimensional data
- Can determine universal properties of semiparametric methods:
  - Consistency
  - Rate of convergence
  - Limiting distributions
  - Valid inference (empirical process bootstrap)
- Empirical processes are everywhere

The Road Ahead
- Whatever you choose to do, the core statistical theory classes will help you.
- Be patient as you learn.
- Be willing to work hard (struggle is good).
- It takes many different kinds of thinkers with different learning styles.
- There are important discoveries to be made in both applications and theory.