Stat 305 2009 Lab 7.


Two methods of finding estimators:
1. Method of moments estimator
2. Maximum likelihood estimator

How do we compare estimators? Which one is better? How do we measure the goodness of an estimator of θ?

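Closeness of an estimator $\hat{\theta}$ to θ

Distance function $d(\hat{\theta}, \theta)$: random, because $\hat{\theta}$ depends on the sample.

Mean squared error: $\mathrm{MSE}(\hat{\theta}) = E[d(\hat{\theta}, \theta)^2]$. In one dimension, $\mathrm{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$, which in general is $\neq \mathrm{Var}(\hat{\theta})$. Mean squared error = Var + bias², i.e. $\mathrm{MSE}(\hat{\theta}) = \mathrm{Var}(\hat{\theta}) + \mathrm{bias}^2$, where $\mathrm{bias} = E(\hat{\theta}) - \theta$.

In one dimension, $\hat{\theta}_1$ is better than $\hat{\theta}_2$ if $\hat{\theta}_1$ has a smaller MSE than $\hat{\theta}_2$ for all θ.

The explicit form of the MSE is hard to get. How do we estimate it? In simulation, we can generate B independent random samples of size n and compute the estimate $\hat{\theta}_i$ from the i-th sample, for i = 1, …, B. Because θ is the true value used to generate the data, the MSE can be estimated by $\frac{1}{B}\sum_{i=1}^{B}(\hat{\theta}_i - \theta)^2$.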
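The simulation recipe on this slide can be sketched in R as below. This is a minimal illustration under assumptions that are not in the lab: the data are taken to be N(θ, 1) and the estimator is the sample mean. It also checks numerically that MSE = Var + bias² when the variance is computed with divisor B.

```r
# Sketch: estimate the MSE of an estimator by simulation.
# Assumptions (not from the slides): N(theta, 1) data, estimator = sample mean.
set.seed(305)
theta <- 2      # true value, known because we generate the data ourselves
n <- 50         # size of each sample
B <- 1000       # number of independent samples

theta_hat <- numeric(B)
for (i in 1:B) {
  x <- rnorm(n, mean = theta, sd = 1)
  theta_hat[i] <- mean(x)
}

mse_hat  <- mean((theta_hat - theta)^2)             # (1/B) * sum (theta_hat_i - theta)^2
bias_hat <- mean(theta_hat) - theta                 # estimated bias
var_hat  <- mean((theta_hat - mean(theta_hat))^2)   # variance with divisor B, not B - 1

# The two numbers agree up to floating-point rounding
c(mse = mse_hat, var_plus_bias2 = var_hat + bias_hat^2)
```

Note that var() in R divides by B - 1, so it would make the decomposition hold only approximately.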

Assignment 2

(a) For n = 200, write an R function that simulates the incomes of n individuals from the Pareto distribution. Return Yn and Zn.

Hints: Let U be a uniformly distributed random variable over [0, 1]. Then $1000\,U^{-1/\alpha}$ follows the Pareto distribution.
(i) Generate a random sample, u, of size n from the uniform distribution over [0, 1]: runif(n)
(ii) Transform u into $1000\,u^{-1/\alpha}$, and call the transformed sample x.

(a) Function skeleton:

    myfunction1a = function(n) {
      # Define Yn and Zn
      Yn = ???
      Zn = ???
      return(c(Yn, Zn))
    }
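Hint steps (i) and (ii) for part (a) can be sketched as follows. The function name rpareto_sketch and its arguments are hypothetical, and the default alpha = 1.16 is simply the value quoted in part (c2). Computing Yn and Zn from x is left as in the skeleton.

```r
# Hypothetical helper: draw n Pareto incomes by the inverse-transform hint.
# Assumptions: alpha = 1.16 (value quoted in part (c2)), scale = 1000.
rpareto_sketch <- function(n, alpha = 1.16, scale = 1000) {
  u <- runif(n)                 # (i) uniform sample on [0, 1]
  x <- scale * u^(-1 / alpha)   # (ii) transform into Pareto incomes
  x
}

set.seed(1)
x <- rpareto_sketch(200)
summary(x)   # every simulated income is at least the scale, 1000
```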

(b) Write an R function that repeats (a) K = 1000 times. Then we have 1000 different values for both Y200 and Z200.

    Y = rep(0, 1000)
    Z = rep(0, 1000)
    for (i in 1:1000) {
      # Call the function once per iteration so that Y[i] and Z[i]
      # are computed from the same simulated sample
      out = myfunction1a(200)
      Y[i] = out[1]
      Z[i] = out[2]
    }

(b1) Compare the box-plots of Y200 and Z200. Output the sample means and sample standard deviations for Y200 and Z200.

Plot the box-plots of Y200 and Z200 either in one call,

    boxplot(Y, Z)

or separately, on the same scale. For example:

    boxplot(Y, ylim=c(0.9,1.8))
    boxplot(Z, ylim=c(0.9,1.8))

(b2) Plot the histogram for sqrt(200)(Y200 − α). With the value of α stored in alpha, define

    newy200 = sqrt(200) * (Y - alpha)
    hist(newy200, freq=F)

(c1) Repeat the exercise in (b) for n = 400, 600, 800, 1000. What does the histogram of sqrt(1000)(Y1000 − α) look like?

    # Plot 4 histograms in the same window.
    par(mfrow=c(2,2))
    hist(newy400, freq=F, main="n=400")
    hist(newy600, freq=F)
    hist(newy800, freq=F)
    hist(newy1000, freq=F)

(c2) If you want to estimate α, which estimator, Yn or Zn, would you prefer? Compare their estimated MSEs, using the true value α = 1.16 and B = 1000 simulated samples.
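A comparison of this kind might look like the sketch below. The transcript does not define Yn and Zn, so two stand-in estimators of α are used purely for illustration: a maximum likelihood estimator with the scale 1000 treated as known, and a method of moments estimator based on E(X) = 1000α/(α − 1). These are assumptions and are not necessarily the assignment's Yn and Zn.

```r
# Sketch: compare two estimators of alpha by simulated MSE.
# "mle" and "mom" are stand-ins, NOT necessarily the lab's Yn and Zn.
set.seed(305)
alpha <- 1.16   # true value
n <- 200        # sample size
B <- 1000       # number of simulated samples

est <- matrix(0, nrow = B, ncol = 2, dimnames = list(NULL, c("mle", "mom")))
for (i in 1:B) {
  x <- 1000 * runif(n)^(-1 / alpha)             # Pareto sample via the hint in (a)
  est[i, "mle"] <- n / sum(log(x / 1000))       # MLE with known scale 1000
  est[i, "mom"] <- mean(x) / (mean(x) - 1000)   # solves mean = 1000 * a / (a - 1)
}

mse <- colMeans((est - alpha)^2)   # estimated MSE of each estimator
mse                                # prefer the estimator with the smaller value
```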

(d) Simulate the incomes earned by a population of n = 10000 individuals as in part (a). For p = 1, 2, ..., 100, calculate the total income earned by the p% of the population with lowest income. Obtain Q(p), the proportion of income (earned by the whole population) owned by the p% of the people with lowest income. Plot the Lorenz curve, i.e., Q(p) against p.

The theoretical value of Q(p) is the Lorenz curve L(p):

$$L(p) = \frac{E\left[X \, 1\{X \le X_{(p)}\}\right]}{E[X]},$$

where $X_{(p)}$ is the (p/100)th population quantile of the distribution of X, i.e. $P(X \le X_{(p)}) = p/100$.

(d) For p = 1, 2, ..., 100, calculate the total income earned by the p% of the population with lowest income. The sample version of the Lorenz curve L(p) is

$$Q(p) = \frac{\frac{1}{n}\sum_{i=1}^{[np/100]} x_{(i)}}{\frac{1}{n}\sum_{i=1}^{n} x_{(i)}},$$

where $x_{(i)}$ is the i-th ordered datum, so $x_{([np/100])}$ is the [np/100]th ordered datum, i.e. p% of the data are less than or equal to it. [z] is the largest integer less than or equal to z. For example, [1.89] = 1, [0.23] = 0 and [−1.34] = −2.

(d) The numerator of Q(p) is a partial sum of the ordered data. Data sorting: sort(x) (default sorting: ascending). With x sorted and n = 10000:

When p = 1: x[1] + ... + x[n/100], the sum of the first 100 ordered x.
When p = 2: x[1] + ... + x[n/100] + x[n/100+1] + ... + x[2n/100].
When p = 100: x[1] + ... + x[n], the sum of all x.

(d) Obtain Q(p) , the proportion of income (earned by the whole population) owned by p% of the people with lowest income.

(d) Plot the Lorenz curve, i.e., Q(p) against p .
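One possible way to put the steps of part (d) together is sketched below. It assumes α = 1.16 (the value quoted in part (c2)) and uses cumsum() to obtain all partial sums of the ordered data at once, with floor() playing the role of [np/100].

```r
# Sketch of part (d): empirical Lorenz curve for simulated Pareto incomes.
# Assumption: alpha = 1.16, as quoted in part (c2).
set.seed(305)
alpha <- 1.16
n <- 10000
x <- 1000 * runif(n)^(-1 / alpha)   # incomes, simulated as in part (a)

xs <- sort(x)                       # ascending order by default
p <- 1:100
k <- floor(n * p / 100)             # [np/100]
Q <- cumsum(xs)[k] / sum(xs)        # partial sums divided by total income

plot(p, Q, type = "l",
     xlab = "p (% of population, lowest incomes first)",
     ylab = "Q(p)", main = "Lorenz curve (simulated)")
abline(0, 1 / 100, lty = 2)         # reference line of perfect equality
```

The further the solid curve falls below the dashed line, the more unequal the simulated income distribution.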


The explicit form of L(p) for the Pareto distribution is $L(p) = 1 - (1 - p/100)^{1 - 1/\alpha}$, for α > 1.

(e) Find the Gini coefficient G for the simulated data obtained in (d).

G = 1 − 2 · (area under the Lorenz curve). For the Pareto distribution, G = 1/(2α − 1).

Approximate the area under the Lorenz curve by the average of Q(p).
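Part (e) can be sketched as below, again assuming α = 1.16 and recomputing Q(p) so that the snippet runs on its own. Because the Pareto distribution has infinite variance when α ≤ 2, the simulated value may differ noticeably from 1/(2α − 1) and can change from seed to seed.

```r
# Sketch of part (e): Gini coefficient from the simulated Lorenz curve.
# Assumption: alpha = 1.16, as quoted in part (c2).
set.seed(305)
alpha <- 1.16
n <- 10000
x <- 1000 * runif(n)^(-1 / alpha)
Q <- cumsum(sort(x))[floor(n * (1:100) / 100)] / sum(x)

G_hat <- 1 - 2 * mean(Q)            # area under the curve ~ average of Q(p)
G_theory <- 1 / (2 * alpha - 1)     # Pareto value from the slide

c(simulated = G_hat, theoretical = G_theory)
```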