Task 6 Statistical Approaches

Task 6 Statistical Approaches
Bob Youngs
NGA Workshop #6, July 19, 2004
Peer-NGA Project

Truncated Data
- Unknown number of recordings where value of $y_i < Z_{trunc}$; value of $x_i$ is unknown (Toro, 1981)

Truncated Data Statistical Model

Fit to Truncated Data Ignoring Effect

Fit Using Truncated Data Model
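The contrast between the two fits can be illustrated with a minimal sketch. It assumes a straight-line model with normal errors, in which each recorded value contributes its normal density divided by the probability of exceeding the truncation level. The slide's own model is not reproduced in this transcript, so the model form, the parameter values, and the helper names (`ols`, `truncated_loglik`, `compass_search`) are all assumptions made for illustration.

```python
import math
import random


def ols(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def truncated_loglik(params, x, y, z_trunc):
    """Log-likelihood (up to a constant) when only y >= z_trunc is recorded."""
    a, b, log_sigma = params
    sigma = math.exp(log_sigma)
    total = 0.0
    for xi, yi in zip(x, y):
        mu = a + b * xi
        r = (yi - mu) / sigma
        # Probability that a recording at xi exceeds the truncation level.
        p_obs = 0.5 * math.erfc((z_trunc - mu) / (sigma * math.sqrt(2.0)))
        total += -0.5 * r * r - log_sigma - math.log(max(p_obs, 1e-300))
    return total


def compass_search(f, start, step=0.5, tol=1e-4):
    """Derivative-free maximiser: try +/- step per coordinate, halve step on failure."""
    best, f_best = list(start), f(start)
    while step > tol:
        improved = False
        for k in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[k] += delta
                f_trial = f(trial)
                if f_trial > f_best:
                    best, f_best, improved = trial, f_trial, True
        if not improved:
            step /= 2.0
    return best


rng = random.Random(1)
# Hypothetical "true" model and truncation level (illustrative values only).
A_TRUE, B_TRUE, SIGMA_TRUE, Z_TRUNC = 0.0, 1.0, 1.0, 0.0

# Simulate the full data set, then keep only values at or above the truncation level.
x_all = [rng.uniform(-2.0, 2.0) for _ in range(2000)]
y_all = [A_TRUE + B_TRUE * xi + rng.gauss(0.0, SIGMA_TRUE) for xi in x_all]
kept = [(xi, yi) for xi, yi in zip(x_all, y_all) if yi >= Z_TRUNC]
x_obs = [p[0] for p in kept]
y_obs = [p[1] for p in kept]

# Fit 1: ignore the truncation.
a_ols, b_ols = ols(x_obs, y_obs)

# Fit 2: maximise the truncated-data likelihood, starting from the OLS fit.
resid = [yi - a_ols - b_ols * xi for xi, yi in zip(x_obs, y_obs)]
log_s0 = 0.5 * math.log(sum(r * r for r in resid) / len(resid))
a_mle, b_mle, log_s_mle = compass_search(
    lambda p: truncated_loglik(p, x_obs, y_obs, Z_TRUNC), [a_ols, b_ols, log_s0]
)

print(f"slope ignoring truncation:   {b_ols:.2f}")
print(f"slope with truncation model: {b_mle:.2f}")
```

In this setup the fit that ignores truncation flattens the slope, because low values at small x are never recorded, while the truncated-data likelihood pulls the slope back toward the value used in the simulation. The compass search is used only to keep the sketch dependency-free; any general-purpose optimiser would serve.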

Fit to Simulated Data

Fit to Truncated Simulated Data

Uncertain and Missing Predictor Variables
- Uncertain predictors
  - Magnitude
  - Distance/rupture geometry
  - Site parameters (discrete and continuous)
- Missing predictors
  - Rupture geometry (for smaller events)

Predictor Variable Uncertainty
- General model: $Y = f(X) + \varepsilon$
- Observe $W$, which is imprecisely related to $X$
- Two types of error processes
  - Error model, $W = f(X) + U$: applies when one wants $X$ but cannot measure it precisely ("classical" measurement error)
  - Regression calibration model, $X = f(W) + U$: one can measure $W$ precisely, but the quantity of interest $X$ is variable; often applies to laboratory studies
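The practical difference between the two error processes shows up when Y is regressed on the observed W. The sketch below assumes the simplest additive forms, W = X + U and X = W + U, with a linear Y model. The coefficients, variances, and the helper `slope` are illustrative assumptions, not values from the presentation.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


rng = random.Random(0)
N = 20000
BETA = 2.0      # assumed true slope of Y on X
SIGMA_U = 1.0   # assumed standard deviation of the error term U

# Error model ("classical"): X is the true predictor, W = X + U is what is observed.
x_c = [rng.gauss(0.0, 1.0) for _ in range(N)]
y_c = [1.0 + BETA * xi + rng.gauss(0.0, 0.5) for xi in x_c]
w_c = [xi + rng.gauss(0.0, SIGMA_U) for xi in x_c]
slope_classical = slope(w_c, y_c)

# Regression calibration model: W is measured precisely, X = W + U varies around it.
w_b = [rng.gauss(0.0, 1.0) for _ in range(N)]
x_b = [wi + rng.gauss(0.0, SIGMA_U) for wi in w_b]
y_b = [1.0 + BETA * xi + rng.gauss(0.0, 0.5) for xi in x_b]
slope_calibration = slope(w_b, y_b)

print(f"classical error:        slope on W = {slope_classical:.2f}")
print(f"regression calibration: slope on W = {slope_calibration:.2f}")
```

With unit variance for both X and U, the classical case attenuates the slope by the factor var(X) / (var(X) + var(U)) = 0.5, whereas in the second case the slope on W stays close to the true value and only the residual scatter grows.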

Magnitude Uncertainty (Rhoades, BSSA, 1997)
- Start with a random (mixed) effects model
- Reported magnitude contains error $\delta_i \sim N(0, s_i^2)$
- Revised mixed effects model
- Solution obtained using "standard" approaches, including analytical inversion of the variance matrix

Magnitude Uncertainty for NGA
- Models likely to be non-linear in magnitude
- Reported magnitude contains error $\delta_i \sim N(0, s_i^2)$
- Revised mixed effects model
- Variance matrix terms due to error in the magnitude of event $i$ now vary over $j$; as a result, the matrix is not analytically invertible
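The model equations from the two magnitude-uncertainty slides are not preserved in this transcript. As a hedged illustration of the argument, the sketch below writes out one plausible form. All notation here is an assumption and not necessarily that of Rhoades (1997): $y_{ij}$ for recording $j$ of event $i$, $g$ and $f$ for the ground-motion function, $b$ for a linear magnitude coefficient, $\hat{M}_i$ for the reported magnitude, $\eta_i$ and $\varepsilon_{ij}$ for the event and record terms, and $\delta_i$ treated as independent of $\hat{M}_i$ and $\eta_i$.

```latex
\begin{align*}
\text{Linear case:}\quad
y_{ij} &= g(r_{ij}) + b\,M_i + \eta_i + \varepsilon_{ij},
\qquad \eta_i \sim N(0,\tau^2),\ \ \varepsilon_{ij} \sim N(0,\sigma^2) \\
M_i &= \hat{M}_i - \delta_i, \qquad \delta_i \sim N(0, s_i^2) \\
y_{ij} &= g(r_{ij}) + b\,\hat{M}_i + (\eta_i - b\,\delta_i) + \varepsilon_{ij} \\
\mathbf{V}_i &= \sigma^2 \mathbf{I}_{n_i} + c_i\,\mathbf{1}\mathbf{1}^{\top},
\qquad c_i = \tau^2 + b^2 s_i^2 \\
\mathbf{V}_i^{-1} &= \frac{1}{\sigma^2}\left(\mathbf{I}_{n_i}
  - \frac{c_i}{\sigma^2 + n_i c_i}\,\mathbf{1}\mathbf{1}^{\top}\right) \\[6pt]
\text{Non-linear case:}\quad
f(\hat{M}_i - \delta_i, r_{ij}) &\approx f(\hat{M}_i, r_{ij}) - d_{ij}\,\delta_i,
\qquad d_{ij} = \left.\frac{\partial f}{\partial M}\right|_{(\hat{M}_i,\, r_{ij})} \\
\operatorname{Cov}(y_{ij}, y_{ik}) &\approx \tau^2 + s_i^2\, d_{ij}\, d_{ik} + \sigma^2\,[j = k]
\end{align*}
```

In the linear case the magnitude error simply inflates the event-term variance, so each event block keeps the constant-plus-diagonal structure that inverts in closed form. In the non-linear case the sensitivity $d_{ij}$ differs from recording to recording, so under this first-order sketch the magnitude-error contribution is no longer a constant within the event block and the simple closed form above does not apply directly.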

Simulation Extrapolation Approach
- Applied in cases where $W = X + U$ with $U \sim N(0, s^2)$
- Simulate a series of data sets with increasingly large measurement error: $W_{b,i}(\lambda) = W_i + \lambda^{1/2} U_{b,i}$, where the $U_{b,i}$ are simulated error terms with zero mean and variance $s^2$
- For each value of $\lambda$, average the parameters of the model, $\Theta$, over many simulations to obtain an average value

Simulation Extrapolation (continued)
- Define a functional relationship for the averaged parameter values as a function of $\lambda$
- Extrapolate back to $\lambda = -1$
[Figure: coefficients versus $\lambda$]
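A minimal sketch of the simulation extrapolation steps follows, assuming a straight-line model, a known error standard deviation s, and a quadratic extrapolation function passed exactly through the averages at λ = 0, 1, and 2. The choice of extrapolant, the grid of λ values, and the names `ols_slope`, `lagrange_eval`, and `simex_slope` are assumptions for illustration; the presentation does not specify them.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def lagrange_eval(xs, ys, x0):
    """Evaluate at x0 the polynomial that passes through the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x0 - xj) / (xi - xj)
        total += term
    return total


def simex_slope(w, y, s, lambdas=(1.0, 2.0), n_sim=50, rng=None):
    """Return (SIMEX slope, naive slope) for error-prone predictor w."""
    rng = rng or random.Random(0)
    grid = [0.0]
    means = [ols_slope(w, y)]  # lambda = 0 is the naive fit to the observed W
    for lam in lambdas:
        scale = (lam ** 0.5) * s
        fits = []
        for _ in range(n_sim):
            # W_b(lambda) = W + sqrt(lambda) * U_b, with Var(U_b) = s^2
            w_b = [wi + scale * rng.gauss(0.0, 1.0) for wi in w]
            fits.append(ols_slope(w_b, y))
        grid.append(lam)
        means.append(sum(fits) / n_sim)
    # Extrapolate the averaged slopes back to lambda = -1 (no measurement error).
    return lagrange_eval(grid, means, -1.0), means[0]


rng = random.Random(42)
N, B_TRUE, S = 5000, 1.0, 0.5   # illustrative values only
x = [rng.gauss(0.0, 1.0) for _ in range(N)]
y = [0.5 + B_TRUE * xi + rng.gauss(0.0, 0.5) for xi in x]
w = [xi + rng.gauss(0.0, S) for xi in x]

simex, naive = simex_slope(w, y, S, rng=rng)
print(f"naive slope: {naive:.2f}, SIMEX slope: {simex:.2f}")
```

The quadratic extrapolant is only an approximation to the true dependence on λ, so the correction removes most but not all of the attenuation. In practice a finer grid of λ values and a least-squares fit of the extrapolation function would usually be preferred.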

Example Application of Simulation Extrapolation Approach

Assess the Effect of Magnitude Uncertainty
Start with a "True" model:
1. Simulate PGA values from the "True" model using the NGA M-R distribution
   - Calculate the mean of the model parameters from the simulated data sets (parametric bootstrap)
2. Obtain the simulated data set whose fitted parameters are closest to the "True" model
3. Using the data set from step 2, increase sigma in M using NGA M values
   - Obtain the mean parameters from 500 simulations of uncertain M
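The procedure on this slide can be sketched in a few lines under heavy simplification. The sketch assumes a hypothetical "true" model that is linear in magnitude only, a uniform stand-in for the NGA magnitude distribution, one recording per event, and a squared-distance rule for "closest". None of these choices, nor the numerical values, come from the presentation.

```python
import random


def ols(x, y):
    """Ordinary least squares for y = c0 + c1*x; returns (c0, c1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    c1 = sxy / sxx
    return my - c1 * mx, c1


rng = random.Random(7)
# Hypothetical "true" model: ln(PGA) = C0 + C1 * M + error (illustrative values).
C0_TRUE, C1_TRUE, SIGMA = -4.0, 1.0, 0.5
SIGMA_M = 0.3   # assumed standard deviation of the magnitude error
N_REC = 300

# Stand-in for the NGA magnitude distribution (an assumption).
mags = [rng.uniform(4.5, 7.5) for _ in range(N_REC)]


def simulate_ln_pga(m_list):
    return [C0_TRUE + C1_TRUE * m + rng.gauss(0.0, SIGMA) for m in m_list]


# Step 1: parametric bootstrap, i.e. simulate from the "true" model and refit.
boot = []
for _ in range(200):
    y_sim = simulate_ln_pga(mags)
    boot.append((ols(mags, y_sim), y_sim))
mean_c1_boot = sum(fit[1] for fit, _ in boot) / len(boot)


# Step 2: keep the simulated data set whose fitted parameters are closest to truth.
def distance(fit):
    return (fit[0] - C0_TRUE) ** 2 + (fit[1] - C1_TRUE) ** 2


best_fit, y_best = min(boot, key=lambda item: distance(item[0]))

# Step 3: perturb the magnitudes and refit, 500 times, then average.
c1_uncertain = []
for _ in range(500):
    m_noisy = [m + rng.gauss(0.0, SIGMA_M) for m in mags]
    c1_uncertain.append(ols(m_noisy, y_best)[1])
mean_c1_uncertain = sum(c1_uncertain) / len(c1_uncertain)

print(f"bootstrap mean of c1:           {mean_c1_boot:.3f}")
print(f"mean c1 with uncertain M (500): {mean_c1_uncertain:.3f}")
```

Comparing the two means isolates the effect of magnitude uncertainty from ordinary sampling scatter, which is why the data set closest to the "true" model is used as the starting point.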

Simulated Data

Missing Predictor Variables
- Site classification variables
  - VS30, NEHRP categories, other site categories, depth to VS of 1.5 km/sec
- Rupture geometry variables
  - Directivity variables
  - Hanging wall/footwall determinations
  - Confined to smaller events/distant recordings where the effect is believed to be minimal?

Reason for Missing Predictors
- Independent of all data
- Dependent on the value of the missing predictor
- Dependent on the values of other predictors

Pattern of Missing Predictors
- Univariate
- Monotone
- Special
- Random

Missing Data Methods
- Complete-case analysis
  - Easily implemented
  - Valid inferences when missing predictors depend upon data
  - May lead to elimination of a lot of useful information
  - Useful starting result

Missing Data Methods
- Imputation
  - Missing X’s estimated from correlations with other X’s, or with X’s and Y’s
  - Typically down-weight imputed observations
- Multiple imputation
  - Simulate multiple data sets incorporating uncertainty in the estimated missing X’s
  - Provides a method for incorporating the effect of imputation uncertainty on estimation
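A minimal sketch of multiple imputation follows, assuming a single predictor X that is missing completely at random and is imputed from its regression on Y. The completed-data estimates are then pooled with Rubin's combining rules. Imputation-model uncertainty is approximated here by bootstrapping the complete cases, and the sample sizes, coefficients, and the helper `fit_line` are illustrative assumptions.

```python
import random


def fit_line(x, y):
    """OLS for y = a + b*x; returns (a, b, residual variance, variance of b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    s2 = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return a, b, s2, s2 / sxx


rng = random.Random(3)
N, B_TRUE, N_IMP = 1000, 2.0, 20   # illustrative values only
x = [rng.gauss(0.0, 1.0) for _ in range(N)]
y = [1.0 + B_TRUE * xi + rng.gauss(0.0, 1.0) for xi in x]
missing = [rng.random() < 0.4 for _ in range(N)]   # X missing independent of all data

complete = [(xi, yi) for xi, yi, m in zip(x, y, missing) if not m]

estimates, variances = [], []
for _ in range(N_IMP):
    # Bootstrap the complete cases so each imputation uses a different imputation model.
    sample = [rng.choice(complete) for _ in complete]
    a_imp, b_imp, s2_imp, _ = fit_line([p[1] for p in sample], [p[0] for p in sample])
    # Draw each missing X from its fitted conditional distribution given Y.
    x_filled = [
        a_imp + b_imp * yi + rng.gauss(0.0, s2_imp ** 0.5) if m else xi
        for xi, yi, m in zip(x, y, missing)
    ]
    _, b_hat, _, var_b = fit_line(x_filled, y)
    estimates.append(b_hat)
    variances.append(var_b)

# Rubin's rules: pooled estimate, within- and between-imputation variance.
q_bar = sum(estimates) / N_IMP
within = sum(variances) / N_IMP
between = sum((q - q_bar) ** 2 for q in estimates) / (N_IMP - 1)
total_var = within + (1.0 + 1.0 / N_IMP) * between

print(f"pooled slope: {q_bar:.2f}, pooled standard error: {total_var ** 0.5:.3f}")
```

The between-imputation term is what carries the uncertainty in the estimated missing X’s into the final standard error; a single imputation would report only the within-imputation part.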

Missing Data Methods
- Maximum likelihood
  - Need a model for the joint distribution of Y and X, including missing X’s
  - Random missing patterns will need iterative approaches
- Bayesian simulation methods
  - e.g. Gibbs sampler
  - Computer intensive (multiple thousands of simulations)

Missing/Uncertain Data
- If missing X’s are estimated from an external model (e.g. VS30), this becomes an uncertain predictor problem
- Simulation methods appear to be useful for both problems
- Implement these methods at a later stage of model development to obtain final coefficients and their uncertainty
- Develop an implementation of each developer’s final model to quantify the effects of missing/uncertain data and provide parameter uncertainty