Task 6 Statistical Approaches

Presentation transcript:

Task 6 Statistical Approaches
Bob Youngs
NGA Workshop #6, July 19, 2004
Peer-NGA Project

Truncated Data
- An unknown number of recordings have y_i < Z_trunc; the value of x_i for these recordings is unknown (Toro, 1981)

Truncated Data Statistical Model
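A minimal sketch of how a truncated-data likelihood differs from a naive fit. It assumes a normal model with known standard deviation and no predictor variable, which is a simplification of the regression setting on these slides. All names and numerical values are illustrative assumptions, not taken from the presentation.

```python
import math
import random


def normal_survival(t):
    """P(Z > t) for a standard normal Z."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def truncated_loglik(mu, ys, z_trunc, sigma):
    """Log-likelihood of mu (up to a constant) when only y >= z_trunc is recorded."""
    n = len(ys)
    ss = sum((y - mu) ** 2 for y in ys)
    # Each recorded value is renormalised by the probability of being recorded.
    return -ss / (2.0 * sigma ** 2) - n * math.log(normal_survival((z_trunc - mu) / sigma))


def fit_truncated_mean(ys, z_trunc, sigma, lo=-5.0, hi=5.0, iters=60):
    """Maximise the truncated log-likelihood over mu by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if truncated_loglik(m1, ys, z_trunc, sigma) < truncated_loglik(m2, ys, z_trunc, sigma):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


# Illustrative values (assumptions, not from the slides).
TRUE_MU, SIGMA, Z_TRUNC = 0.0, 1.0, 0.0
rng = random.Random(0)
full_sample = [rng.gauss(TRUE_MU, SIGMA) for _ in range(4000)]
recorded = [y for y in full_sample if y >= Z_TRUNC]  # values below Z_TRUNC are never seen

naive_mean = sum(recorded) / len(recorded)            # ignores the truncation
truncated_mean = fit_truncated_mean(recorded, Z_TRUNC, SIGMA)
print(f"naive mean: {naive_mean:.3f}, truncated-model mean: {truncated_mean:.3f}")
```

The naive mean is pulled upward because low values are never recorded, while the truncated likelihood accounts for the unrecorded part of the distribution.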

Fit to Truncated Data Ignoring Effect

Fit Using Truncated Data Model

Fit to Simulated Data

Fit to Truncated Simulated Data

Uncertain and Missing Predictor Variables
- Uncertain predictors
  - Magnitude
  - Distance/rupture geometry
  - Site parameters (discrete and continuous)
- Missing predictors
  - Rupture geometry (for smaller events)

Predictor Variable Uncertainty
- General model: Y = f(X) + ε
- Observe W, which is imprecisely related to X
- Two types of error processes
  - Error model, W = f(X) + U: applies when one wants X but cannot measure it precisely (“classical” measurement error)
  - Regression calibration model, X = f(W) + U: one can measure W precisely, but the quantity of interest X is variable (often applies to laboratory studies)
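As a worked illustration of why the classical error model matters, consider the special case of a linear model with additive error, W = X + U, where U is independent of X and ε. The linear form and the symbols α, β, and σ_X² are assumptions introduced here for illustration. Under those assumptions, regressing Y on W gives a slope that is attenuated toward zero:

```latex
\begin{align}
Y &= \alpha + \beta X + \varepsilon, \qquad W = X + U, \qquad U \sim N(0, s^2), \\
\hat{\beta}_{\text{naive}} &\xrightarrow{\;p\;} \beta \, \frac{\sigma_X^2}{\sigma_X^2 + s^2}.
\end{align}
```

For example, with σ_X² = 1 and s² = 0.25 the naive slope converges to 0.8β. In the second error process with X = W + U and U independent of W, the slope from regressing Y on W remains consistent in the linear case, which is one reason the two processes are treated separately.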

Magnitude Uncertainty (Rhoades, BSSA, 1997)
- Start with random (mixed) effects model
- Reported magnitude contains error δ_i ~ N(0, s_i^2)
- Revised mixed effects model
- Solution obtained using “standard” approaches, including analytical inversion of variance matrix

Magnitude Uncertainty for NGA
- Models likely to be non-linear in magnitude
- Reported magnitude contains error δ_i ~ N(0, s_i^2)
- Revised mixed effects model
- Variance matrix terms due to error in magnitude i now vary over j; as a result, the matrix is not analytically invertible

Simulation Extrapolation Approach
- Applied in cases where W = X + U with U ~ N(0, s^2)
- Simulate a series of data sets with increasingly large measurement error: W_b,i(λ) = W_i + λ^(1/2) U_b,i, where U_b,i are simulated error terms with mean 0 and variance s^2
- For each value of λ, average the parameters of the model Θ over many simulations to obtain an average value

Simulation Extrapolation (continued)
- Define a functional relationship for the averaged parameters as a function of λ
- Extrapolate back to λ = -1
[Figure: coefficients plotted against λ, with the extrapolation back to λ = -1]
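The simulation extrapolation steps can be sketched as follows for a single slope parameter. This is a minimal illustration under stated assumptions: a straight-line model, a known measurement-error standard deviation s, three values of λ (0, 1, 2), and a quadratic extrapolant through those three points. The function names and all numerical values are hypothetical and are not taken from the presentation.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def lagrange_at(points, x):
    """Evaluate the polynomial through `points` at x (a quadratic for 3 points)."""
    value = 0.0
    for i, (xi, yi) in enumerate(points):
        weight = 1.0
        for j, (xj, _) in enumerate(points):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        value += weight * yi
    return value


def simex_slope(ws, ys, s, lambdas=(1.0, 2.0), n_sim=50, rng=None):
    """Return the slope extrapolated to lambda = -1 and the (lambda, slope) points."""
    rng = rng or random.Random(0)
    points = [(0.0, ols_slope(ws, ys))]  # lambda = 0 is the naive fit
    for lam in lambdas:
        total = 0.0
        for _ in range(n_sim):
            # W_b,i(lambda) = W_i + sqrt(lambda) * U_b,i
            w_b = [w + (lam ** 0.5) * rng.gauss(0.0, s) for w in ws]
            total += ols_slope(w_b, ys)
        points.append((lam, total / n_sim))  # average over simulations
    return lagrange_at(points, -1.0), points


# Illustrative data (assumed values): true slope 1, error s.d. 0.5 on the predictor.
BETA, S = 1.0, 0.5
data_rng = random.Random(42)
x_true = [data_rng.gauss(0.0, 1.0) for _ in range(2000)]
w_obs = [x + data_rng.gauss(0.0, S) for x in x_true]
y_obs = [BETA * x + data_rng.gauss(0.0, 0.2) for x in x_true]

naive_slope = ols_slope(w_obs, y_obs)
simex_estimate, curve = simex_slope(w_obs, y_obs, S)
print(f"naive: {naive_slope:.3f}, SIMEX: {simex_estimate:.3f}, true: {BETA}")
```

A quadratic extrapolant is only an approximation to the true dependence on λ, so the extrapolated value reduces the attenuation bias rather than removing it exactly.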

Example Application of Simulation Extrapolation

Assess the Effect of Magnitude Uncertainty
1. Start with a “True” model
   - Simulate PGA values from the “True” model using the NGA M-R distribution
   - Calculate the mean of the model parameters from the simulated data sets (parametric bootstrap)
2. Obtain the simulated data set whose fitted parameters are closest to the “True” model
3. Using the data set from 2, increase sigma in M using NGA M values
   - Obtain mean parameters from 500 simulations of uncertain M
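The procedure on this slide can be sketched with a deliberately simplified stand-in model. The sketch assumes a straight-line relation between ln(PGA) and magnitude only (no distance term), a uniform magnitude distribution in place of the NGA M-R distribution, and a single constant magnitude standard deviation. Every coefficient and name below is a hypothetical placeholder, not an NGA value.

```python
import random


def fit_line(ms, ys):
    """Least-squares fit of y = c0 + c1 * m; returns (c0, c1)."""
    n = len(ms)
    mean_m = sum(ms) / n
    mean_y = sum(ys) / n
    c1 = (sum((m - mean_m) * (y - mean_y) for m, y in zip(ms, ys))
          / sum((m - mean_m) ** 2 for m in ms))
    return mean_y - c1 * mean_m, c1


# Hypothetical "true" model: ln(PGA) = C0 + C1 * M + noise.
TRUE_C0, TRUE_C1, SIGMA_Y, SIGMA_M = -4.0, 1.0, 0.5, 0.3
rng = random.Random(7)
mags = [rng.uniform(5.0, 7.5) for _ in range(200)]  # stand-in magnitude distribution


def simulate(ms):
    """Simulate ln(PGA) values from the hypothetical true model."""
    return [TRUE_C0 + TRUE_C1 * m + rng.gauss(0.0, SIGMA_Y) for m in ms]


# Simulate data sets from the true model and average the fitted parameters
# (parametric bootstrap).
datasets = [simulate(mags) for _ in range(200)]
fits = [fit_line(mags, ys) for ys in datasets]
mean_c1 = sum(c1 for _, c1 in fits) / len(fits)


def distance(fit):
    """Squared distance of fitted parameters from the true parameters."""
    return (fit[0] - TRUE_C0) ** 2 + (fit[1] - TRUE_C1) ** 2


# Keep the simulated data set whose fit is closest to the true model.
best = min(range(len(fits)), key=lambda i: distance(fits[i]))
base_ys, base_c1 = datasets[best], fits[best][1]

# Add magnitude error to that data set and average the fits over 500 simulations.
noisy_slopes = []
for _ in range(500):
    noisy_mags = [m + rng.gauss(0.0, SIGMA_M) for m in mags]
    noisy_slopes.append(fit_line(noisy_mags, base_ys)[1])
mean_noisy_c1 = sum(noisy_slopes) / len(noisy_slopes)

print(f"bootstrap mean slope: {mean_c1:.3f}")
print(f"selected data set slope: {base_c1:.3f}, with uncertain M: {mean_noisy_c1:.3f}")
```

Comparing the slope from the selected data set with the mean slope under uncertain magnitudes isolates the effect of magnitude error from ordinary sampling variability.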

Simulated Data

Missing Predictor Variables
- Site classification variables
  - VS30, NEHRP categories, other site categories, depth to VS of 1.5 km/sec
- Rupture geometry variables
  - Directivity variables
  - Hanging wall/footwall determinations
  - Confined to smaller events/distant recordings where effect is believed to be minimal?

Reason for Missing Predictors
- Independent of all data
- Dependent on value of the missing predictor
- Dependent on the values of other predictors

Pattern of Missing Predictors
- Univariate
- Monotone
- Special
- Random

Missing Data Methods
- Complete-case analysis
  - Easily implemented
  - Valid inferences when missing predictors depend upon data
  - May lead to elimination of a lot of useful information
  - Useful starting result

Missing Data Methods
- Imputation
  - Missing X’s estimated from correlations with other X’s, or with X’s and Y’s
  - Typically down-weight imputed observations
- Multiple imputation
  - Simulate multiple data sets incorporating uncertainty in estimated missing X’s
  - Provides a method for incorporating the effect of imputation uncertainty on estimation
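A minimal sketch of multiple imputation, assuming one predictor X with values missing at random and one fully observed companion variable Z that is correlated with X. The quantity estimated is simply the mean of X, and the imputation model is refit on a bootstrap resample for each imputation so that uncertainty in that model carries through. The pooling step uses Rubin's combining rules. The variable names and simulated values are illustrative assumptions, not part of the presentation.

```python
import random
import statistics


def fit_line(zs, xs):
    """Least-squares fit of x = a + b * z; returns (a, b, residual s.d.)."""
    n = len(zs)
    mz = sum(zs) / n
    mx = sum(xs) / n
    b = (sum((z - mz) * (x - mx) for z, x in zip(zs, xs))
         / sum((z - mz) ** 2 for z in zs))
    a = mx - b * mz
    rss = sum((x - a - b * z) ** 2 for z, x in zip(zs, xs))
    return a, b, (rss / (n - 2)) ** 0.5


def multiply_impute_mean(zs, xs, n_imputations=20, rng=None):
    """Pool the mean of x over imputed data sets; missing x values are None."""
    rng = rng or random.Random(0)
    complete = [(z, x) for z, x in zip(zs, xs) if x is not None]
    estimates, within = [], []
    for _ in range(n_imputations):
        # Refit the imputation model on a bootstrap resample of complete cases.
        boot = [rng.choice(complete) for _ in complete]
        a, b, sd = fit_line([z for z, _ in boot], [x for _, x in boot])
        # Impute each missing x as prediction plus random residual.
        filled = [x if x is not None else a + b * z + rng.gauss(0.0, sd)
                  for z, x in zip(zs, xs)]
        estimates.append(statistics.fmean(filled))
        within.append(statistics.variance(filled) / len(filled))
    pooled = statistics.fmean(estimates)
    within_var = statistics.fmean(within)
    between_var = statistics.variance(estimates)
    # Rubin's rules: total variance = within + (1 + 1/m) * between.
    total_var = within_var + (1.0 + 1.0 / n_imputations) * between_var
    return pooled, total_var, within_var


# Illustrative data (assumed values): true mean of x is 1.0, 40% of x missing.
data_rng = random.Random(3)
z_obs = [data_rng.gauss(0.0, 1.0) for _ in range(500)]
x_full = [1.0 + 0.8 * z + data_rng.gauss(0.0, 0.5) for z in z_obs]
x_obs = [x if data_rng.random() > 0.4 else None for x in x_full]

pooled_mean, total_variance, within_variance = multiply_impute_mean(z_obs, x_obs)
print(f"pooled mean: {pooled_mean:.3f}, total variance: {total_variance:.5f}")
```

The between-imputation term is what distinguishes multiple imputation from single imputation: it inflates the variance to reflect that the missing values were estimated rather than observed.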

Missing Data Methods
- Maximum likelihood
  - Need a model for the joint distribution of Y and X, including missing X’s
  - Random missing patterns will need iterative approaches
- Bayesian simulation methods
  - e.g. Gibbs sampler
  - Computer intensive (multiple thousands of simulations)
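A minimal sketch of a Gibbs sampler for a regression with missing predictor values. It assumes a toy joint model: X ~ N(0, 1), Y = bX + e with e ~ N(0, sigma^2), sigma known, and a flat prior on b. The sampler alternates between drawing the missing X values and drawing b, each from its full conditional distribution. The model, function name, and numerical values are assumptions chosen to keep the conditionals in closed form.

```python
import random


def gibbs_slope(xs, ys, sigma, n_iter=2000, burn_in=500, rng=None):
    """Posterior draws of b in y = b*x + e when some x values are None."""
    rng = rng or random.Random(0)
    missing = [i for i, x in enumerate(xs) if x is None]
    x_cur = [0.0 if x is None else x for x in xs]
    # Start from the complete-case estimate of b.
    sxy = sum(x * y for x, y in zip(xs, ys) if x is not None)
    sxx = sum(x * x for x in xs if x is not None)
    b = sxy / sxx
    draws = []
    for it in range(n_iter):
        # Missing x_i given y_i and b: combine the N(0, 1) prior with the likelihood.
        denom = sigma ** 2 + b ** 2
        post_sd = (sigma ** 2 / denom) ** 0.5
        for i in missing:
            x_cur[i] = rng.gauss(b * ys[i] / denom, post_sd)
        # b given the completed data: normal around the least-squares slope.
        sxx = sum(x * x for x in x_cur)
        sxy = sum(x * y for x, y in zip(x_cur, ys))
        b = rng.gauss(sxy / sxx, sigma / sxx ** 0.5)
        if it >= burn_in:
            draws.append(b)
    return draws


# Illustrative data (assumed values): true b = 1.5, 40% of x missing at random.
TRUE_B, SIGMA = 1.5, 0.5
data_rng = random.Random(11)
x_full = [data_rng.gauss(0.0, 1.0) for _ in range(300)]
y_obs = [TRUE_B * x + data_rng.gauss(0.0, SIGMA) for x in x_full]
x_obs = [x if data_rng.random() > 0.4 else None for x in x_full]

b_draws = gibbs_slope(x_obs, y_obs, SIGMA)
posterior_mean = sum(b_draws) / len(b_draws)
print(f"posterior mean of b: {posterior_mean:.3f} (true value {TRUE_B})")
```

The spread of the retained draws gives the parameter uncertainty directly, including the contribution from the missing predictor values.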

Missing/Uncertain Data
- If missing X’s are estimated from an external model (e.g. VS30), this becomes an uncertain predictor problem
- Simulation methods appear to be useful for both problems
- Implement these methods at a later stage of model development to obtain final coefficients and their uncertainty
- Develop an implementation of each developer’s final model to quantify the effects of missing/uncertain data and provide parameter uncertainty