Statistical Inference and Regression Analysis: GB.3302.30 Professor William Greene Stern School of Business IOMS Department Department of Economics.



Statistics and Data Analysis Part 10 – Advanced Topics

Advanced topics: Nonlinear Least Squares; Nonlinear Models – ML Estimation; Poisson Regression; Binary Choice; End of course.

Statistics and Data Analysis Nonlinear Least Squares

Lanczos 1 Data

Nonlinear Regression

Nonlinear Least Squares: There are no explicit solutions to these equations in the form of b_i = a function of (y, x).

Strategy for Nonlinear LS

NLS Strategy
Pick a starting value b.
A. Compute y_i^0 and x_i^0.
B. Regress y_i^0 on x_i^0. This obtains a new b.
Return to step A, or exit if the new b is the same as the old b.

Lanczos 1: First Iteration. Now, repeat the iteration using this as b.

This is the correct answer

Gauss-Marquardt Algorithm
Starting with b^0:
A. Compute regressors x_i^0. Compute residuals e_i^0 = y_i - f(x_i, b^0).
B. New b^1 = b^0 + slopes in the regression of e_i^0 on x_i^0.
Return to A, or exit if the estimates have converged. This is equivalent to our earlier method.
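The iteration above can be sketched in a few lines of Python. This is a minimal illustration, not the Lanczos 1 computation from the slides: the model f(x, b) = b[0]·exp(-b[1]·x), the synthetic noise-free data, and the starting values are all assumptions chosen for the example. The pseudo-regressors x_i^0 are taken to be the derivatives of f with respect to the parameters, evaluated at the current b, which is the usual linearization.

```python
import numpy as np

def f(x, b):
    # Hypothetical regression function, chosen only for illustration.
    return b[0] * np.exp(-b[1] * x)

def pseudo_regressors(x, b):
    # x_i^0: derivatives of f with respect to b[0] and b[1], evaluated at b.
    d0 = np.exp(-b[1] * x)
    d1 = -b[0] * x * np.exp(-b[1] * x)
    return np.column_stack([d0, d1])

def gauss_newton(x, y, b_start, tol=1e-10, max_iter=100):
    b = np.asarray(b_start, dtype=float)
    for _ in range(max_iter):
        X0 = pseudo_regressors(x, b)   # step A: regressors at the current b
        e0 = y - f(x, b)               # step A: residuals at the current b
        # Step B: slopes in the least squares regression of e0 on X0.
        step, *_ = np.linalg.lstsq(X0, e0, rcond=None)
        b = b + step
        if np.max(np.abs(step)) < tol:  # exit when the estimates stop changing
            break
    return b

# Synthetic, noise-free data generated from assumed "true" parameters.
x = np.linspace(0.0, 3.0, 30)
b_true = np.array([2.0, 1.3])
y = f(x, b_true)

b_hat = gauss_newton(x, y, b_start=[1.5, 1.0])
print(b_hat)
```

Regressing y_i^0 = e_i^0 + x_i^0'b on x_i^0 returns b plus the same slopes, which is why the two descriptions of the method coincide.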

Statistics and Data Analysis Maximum Likelihood: Poisson

Application: Doctor Visits. German Individual Health Care data: N = 27,236. Model for number of visits to the doctor: Poisson regression. Variables: Age, Health Satisfaction, Marital Status, Income, Kids.

Poisson Regression

Nonlinear Least Squares

Maximum Likelihood Estimation: This defines a class of estimators based on the particular distribution assumed to have generated the observed random variable. The main advantage of ML estimators is that, among all consistent and asymptotically normal (CAN) estimators, MLEs have optimal asymptotic properties.

Setting up the MLE: The distribution of the observed random variable is written as a function of the parameters to be estimated: P(y_i | data, β) = probability density given the parameters. The likelihood function is constructed from the density. Construction: the joint probability density function of the observed sample of data, which is generally the product of the individual densities when the data are a random sample.

Likelihood for the Poisson Regression

Newton’s Method
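As a hedged sketch of how Newton's method applies to the Poisson regression, the code below assumes the standard specification λ_i = exp(x_i'β), for which the log likelihood is Σ_i [-λ_i + y_i x_i'β - ln y_i!], the gradient is Σ_i (y_i - λ_i) x_i and the Hessian is -Σ_i λ_i x_i x_i'. The data are simulated; the sample size, coefficients, and function name `poisson_mle_newton` are assumptions for illustration and have nothing to do with the doctor visits data.

```python
import numpy as np

def poisson_mle_newton(X, y, tol=1e-10, max_iter=50):
    """Maximize the Poisson log likelihood with lambda_i = exp(x_i' beta)."""
    # Start at the constant-only solution (assumes column 0 of X is the constant).
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())
    for _ in range(max_iter):
        lam = np.exp(X @ beta)
        gradient = X.T @ (y - lam)              # first derivatives
        hessian = -(X * lam[:, None]).T @ X     # second derivatives
        step = np.linalg.solve(hessian, gradient)
        beta = beta - step                      # Newton update: beta - H^{-1} g
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated data under assumed coefficients.
rng = np.random.default_rng(1)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ beta_true))

beta_hat = poisson_mle_newton(X, y)
print(beta_hat)
```

Because the Poisson log likelihood is globally concave in β, the iteration typically converges in a handful of steps from a reasonable starting value.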

Properties of the MLE
Consistent: Not necessarily unbiased, however.
Asymptotically normally distributed: Proof based on central limit theorems.
Asymptotically efficient: Among the possible estimators that are consistent and asymptotically normally distributed.
Invariant: The MLE of g(θ) is g(the MLE of θ).

Computing the Asymptotic Variance
We want to estimate {-E[H]}^(-1). Three ways:
(1) Just compute the negative of the actual second derivatives matrix and invert it.
(2) Insert the maximum likelihood estimates into the known expected values of the second derivatives matrix. Sometimes (1) and (2) give the same answer (for example, in the Poisson regression model).
(3) Since -E[H] is the variance of the first derivatives, estimate this with the sample variance (i.e., mean square) of the first derivatives. This will almost always be different from (1) and (2).
Since they are estimating the same thing, in large samples all three will give the same answer.
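The comparison can be illustrated for the Poisson regression, again assuming λ_i = exp(x_i'β) and simulated data with made-up coefficients. In this model the second derivatives matrix, -Σ_i λ_i x_i x_i', does not involve y, so estimators (1) and (2) are the same expression; estimator (3) uses the outer products of the first derivatives, g_i = (y_i - λ_i) x_i. A sketch under those assumptions:

```python
import numpy as np

# Simulated data under assumed coefficients (not the slides' data).
rng = np.random.default_rng(0)
n = 20000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ beta_true))

# Fit by Newton's method, starting from the constant-only solution.
beta = np.array([np.log(y.mean()), 0.0])
for _ in range(50):
    lam = np.exp(X @ beta)
    H = -(X * lam[:, None]).T @ X
    step = np.linalg.solve(H, X.T @ (y - lam))
    beta = beta - step
    if np.max(np.abs(step)) < 1e-10:
        break

lam = np.exp(X @ beta)

# (1) and (2): invert the negative Hessian. For the Poisson model the actual
# and expected Hessians coincide because H does not depend on y.
H = -(X * lam[:, None]).T @ X
V_hessian = np.linalg.inv(-H)

# (3): invert the sum of outer products of the first derivatives.
g = (y - lam)[:, None] * X
V_outer = np.linalg.inv(g.T @ g)

se_hessian = np.sqrt(np.diag(V_hessian))
se_outer = np.sqrt(np.diag(V_outer))
print(se_hessian)
print(se_outer)
```

With a sample this large the two sets of standard errors should be close but not identical, which is the point made in the slide.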

Poisson Regression Iterations

MLE NLS

Using the Model. Partial Effects

Effect of Income Depends on Age

Effect of Income | Age
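One way to see why the effect of income depends on age in a Poisson regression: with E[y | x] = exp(x'β), the partial effect of a variable is ∂E[y | x]/∂x_k = exp(x'β)·β_k, so it scales with the conditional mean, which moves with every other variable. The sketch below uses hypothetical coefficients and a three-variable index; none of the numbers are estimates from the doctor visits data.

```python
import numpy as np

# Hypothetical coefficients, NOT estimates from the doctor visits data.
beta = {"const": 0.5, "age": 0.02, "income": -0.4}

def expected_visits(age, income):
    # Conditional mean of the Poisson model: exp(x' beta).
    return np.exp(beta["const"] + beta["age"] * age + beta["income"] * income)

def income_effect(age, income):
    # Partial effect of income: d E[y|x] / d income = exp(x' beta) * beta_income.
    return expected_visits(age, income) * beta["income"]

ages = np.array([25.0, 40.0, 55.0])
effects = income_effect(ages, income=0.35)
for a, e in zip(ages, effects):
    print(f"age {a:.0f}: partial effect of income = {e:.4f}")
```

Under these assumed signs the income effect is negative and grows in magnitude with age, because the expected number of visits rises with age.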

Statistics and Data Analysis Binary Choice

Case Study: Credit Modeling
1992 American Express analysis of:
Application process: Acceptance or rejection; Y = 0 (reject) or 1 (accept).
Cardholder behavior:
Loan default (D = 0 or 1).
Average monthly expenditure (E = $/month).
General credit usage/behavior (C = number of charges).
13,444 applications in November, 1992.

Proportion for Bernoulli: In the AmEx data, the true population acceptance rate is θ. Y = 1 if the application is accepted, 0 if not. E[y] = θ. E[(1/N) Σ_i y_i] = E[p_accept] = θ. This is the estimator.

Some Evidence: Homeowners. Does the acceptance rate depend on home ownership?

A Test of Independence: In the credit card example, are Own/Rent and Accept/Reject independent? Hypothesis: Ownership and Acceptance are independent. Formal hypothesis, based only on the laws of probability: Prob(Own, Accept) = Prob(Own)Prob(Accept) (and likewise for the other three possibilities). Rejection region: Joint frequencies that do not look like the products of the marginal frequencies.

Contingency Table Analysis
The Data: Frequencies
         Reject   Accept    Total
Rent      1,845    5,469    7,314
Own       1,100    5,030    6,130
Total     2,945   10,499   13,444
Step 1: Convert to Actual Proportions (each count divided by 13,444)
         Reject   Accept    Total
Rent      0.137    0.407    0.544
Own       0.082    0.374    0.456
Total     0.219    0.781    1.000

Independence Test
Step 2: Expected proportions assuming independence: If the factors are independent, then the joint proportions should equal the product of the marginal proportions.
[Rent, Reject] 0.544 x 0.219 = 0.119
[Rent, Accept] 0.544 x 0.781 = 0.425
[Own, Reject] 0.456 x 0.219 = 0.100
[Own, Accept] 0.456 x 0.781 = 0.356

Comparing Actual to Expected: It appears that the acceptance rate is dependent on home ownership.

When is the Chi Squared Large? Critical values come from the chi squared table. Degrees of freedom = (R-1)(C-1). [Table: critical chi squared values by degrees of freedom]
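The independence test can be reproduced from the four cell counts in the contingency table. The sketch below assumes the usual Pearson statistic, the sum over cells of (observed - expected)² / expected, and compares it with 3.841, the 5% critical value of the chi squared distribution with 1 degree of freedom; the choice of the 5% level is an assumption, since the slides do not state one.

```python
import numpy as np

# Observed frequencies: rows = Rent, Own; columns = Reject, Accept.
counts = np.array([[1845.0, 5469.0],
                   [1100.0, 5030.0]])

n = counts.sum()
row_totals = counts.sum(axis=1, keepdims=True)   # Rent, Own
col_totals = counts.sum(axis=0, keepdims=True)   # Reject, Accept

# Expected counts under independence: (row total x column total) / N.
expected = row_totals @ col_totals / n

# Pearson chi squared statistic.
chi2 = ((counts - expected) ** 2 / expected).sum()

# Degrees of freedom = (R-1)(C-1).
df = (counts.shape[0] - 1) * (counts.shape[1] - 1)

# 5% critical value for 1 degree of freedom (assumed significance level).
critical_5pct = 3.841
reject_independence = chi2 > critical_5pct

print(f"N = {n:.0f}, df = {df}, chi squared = {chi2:.2f}")
print("Reject independence at 5%:", reject_independence)
```

The statistic is far above the critical value, which agrees with the slides' conclusion that acceptance depends on home ownership.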

Analyzing Default: Do renters default more often (at a different rate) than owners? To investigate, we study the cardholders (only). [Table: cross-tabulation of OWNRENT by DEFAULT (0, 1, All)]

Hypothesis Test

More Formal Model of Acceptance and Default

Probability Models

Likelihood Function

American Express, 1992

Logistic Model for Acceptance

Probit Default Model

Think statistically. Build models. Thank you.