Stat 305 2009 Lab 7.


Two methods of finding estimators:
1. Method of moments estimator
2. Maximum likelihood estimator
How do we compare estimators? Which one is better? How do we measure the goodness of an estimator of θ?

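Closeness: how close is the estimator $\hat{\theta}$ to θ?

Distance function $d(\hat{\theta}, \theta)$: random, because $\hat{\theta}$ depends on the sample.

Mean Squared Error: take the expectation of the random squared distance, $\mathrm{MSE}(\hat{\theta}) = E\,\|\hat{\theta} - \theta\|^2$.
In general, $\mathrm{MSE}(\hat{\theta}) \neq \mathrm{Var}(\hat{\theta})$.
In one dimension, $\mathrm{MSE}(\hat{\theta}) = E(\hat{\theta} - \theta)^2 = \mathrm{Var}(\hat{\theta}) + \mathrm{bias}^2$, where $\mathrm{bias} = E(\hat{\theta}) - \theta$.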

In one dimension, $\hat{\theta}_1$ is better than $\hat{\theta}_2$ if $\hat{\theta}_1$ has a smaller MSE than $\hat{\theta}_2$ for all θ.

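The explicit form of the MSE is often hard to get. How can we estimate it?
In simulation, we can generate B independent random samples of size n and compute the estimate $\hat{\theta}_i$ from the i-th sample, for i = 1, …, B. Because the true value θ is known in a simulation, the MSE can be approximated by
$\widehat{\mathrm{MSE}} = \frac{1}{B}\sum_{i=1}^{B} (\hat{\theta}_i - \theta)^2$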
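The simulation recipe above can be sketched in R as follows. This is only an illustration under assumptions that are not part of the lab: the data are taken to be Uniform(0, θ), the two estimators compared are the method of moments estimator 2·mean(x) and the maximum likelihood estimator max(x), and the values of theta, n and B are arbitrary choices.

```r
set.seed(305)

theta <- 1     # true value (known because we simulate); illustrative choice
n     <- 50    # sample size; illustrative choice
B     <- 1000  # number of independent simulated samples

mom <- numeric(B)  # method of moments estimates
mle <- numeric(B)  # maximum likelihood estimates

for (i in seq_len(B)) {
  x <- runif(n, min = 0, max = theta)  # i-th random sample of size n
  mom[i] <- 2 * mean(x)
  mle[i] <- max(x)
}

# Simulated MSE: average squared distance to the true value
mse <- function(est, truth) mean((est - truth)^2)
mse_mom <- mse(mom, theta)
mse_mle <- mse(mle, theta)

# Decomposition MSE = Var + bias^2 (variance computed with divisor B)
var_mle  <- mean((mle - mean(mle))^2)
bias_mle <- mean(mle) - theta

print(c(mom = mse_mom, mle = mse_mle))
print(c(mse = mse_mle, var_plus_bias2 = var_mle + bias_mle^2))
```

With the divisor B in the variance, the decomposition holds exactly for the simulated values, which makes it a handy sanity check on the code.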

Assignment 2

(a) For n = 200, write an R function that simulates the incomes of n individuals from the Pareto distribution. Return Yn and Zn.
Hints: Let U be a uniformly distributed random variable over [0, 1]. Then $1000\,U^{-1/\alpha}$ follows the Pareto distribution.
(i) Generate a random sample, u, of size n from the uniform distribution over [0, 1]: runif(n)
(ii) Transform u into $1000\,u^{-1/\alpha}$, and call the transformed sample x.

(a) Skeleton of the function:
    myfunction1a = function(n) {
      # Define Yn and Zn
      Yn = ???
      Zn = ???
      return(c(Yn, Zn))
    }
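Hints (i) and (ii) in part (a) can be sketched as below. The function name simulate_pareto and its scale argument are hypothetical, and α = 1.16 is the value quoted in (c2), assumed here to apply to part (a) as well. The sketch only generates the incomes; it deliberately leaves Yn and Zn undefined, as in the skeleton.

```r
# Inverse-transform sampling: if U ~ Uniform(0, 1), then
# scale * U^(-1/alpha) follows a Pareto distribution with minimum `scale`.
simulate_pareto <- function(n, alpha, scale = 1000) {
  u <- runif(n)            # (i) uniform sample of size n
  scale * u^(-1 / alpha)   # (ii) transformed sample x
}

set.seed(305)
x <- simulate_pareto(200, alpha = 1.16)
summary(x)
```

Every simulated income is at least 1000, since u lies in (0, 1) and the exponent is negative.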

(b) Write an R function that repeats (a) K = 1000 times. Then we have 1000 different values for both Y200 and Z200.
    Y = rep(0, 1000)
    Z = rep(0, 1000)
    for (i in 1:1000) {
      out = myfunction1a(200)   # call once so Y[i] and Z[i] come from the same sample
      Y[i] = out[1]
      Z[i] = out[2]
    }

(b1) Compare the box-plots of Y200 and Z200. Output the sample means and sample standard deviations for Y200 and Z200. Plot the box-plots of Y200 and Z200.
    boxplot(Y, Z)
OR
    boxplot(Y)
    boxplot(Z)
If the box-plots are drawn separately, use the same scale! For example,
    boxplot(Y, ylim=c(0.9,1.8))
    boxplot(Z, ylim=c(0.9,1.8))

(b2) Plot the histogram for sqrt(200)(Y200 − α).
Define newy200 = sqrt(200) * (Y − α), then
    hist(newy200, freq=F)

(c1) Repeat the exercise in (b) for n = 400, 600, 800, 1000. What does the histogram of sqrt(1000)(Y1000 − α) look like?
    # Plot 4 histograms in the same window.
    par(mfrow=c(2,2))
    hist(newy400, freq=F, main="n=400")
    hist(newy600, freq=F)
    hist(newy800, freq=F)
    hist(newy1000, freq=F)

(c2) If you want to estimate α, which estimator, Yn or Zn, would you prefer? Compare their MSEs, estimated from B = 1000 simulated samples with true value α = 1.16.

(d) Simulate the incomes earned by a population of n = 10000 individuals as in part (a). For p = 1, 2, ..., 100, calculate the total income earned by the p% of the population with lowest income. Obtain Q(p), the proportion of income (earned by the whole population) owned by the p% of the people with lowest income. Plot the Lorenz curve, i.e., Q(p) against p.
The theoretical value of Q(p) is the Lorenz curve L(p):
$L(p) = \dfrac{E[X\,\mathbf{1}\{X \le X(p)\}]}{E[X]}$
where X(p) is the (p/100)th population quantile of the distribution of X, i.e. P(X ≤ X(p)) = p/100.


(d) For p = 1, 2, ..., 100, calculate the total income earned by the p% of the population with lowest income. The sample version of the Lorenz curve L(p) is
$Q(p) = \dfrac{\frac{1}{n}\sum_{i=1}^{[np/100]} x_{(i)}}{\frac{1}{n}\sum_{i=1}^{n} x_{(i)}}$
where $x_{(i)}$ is the i-th ordered data, so $x_{([np/100])}$ is the [np/100]th ordered data, i.e. p% of the data are less than or equal to it.
[z] is the largest integer less than or equal to z. For example, [1.89] = 1, [0.23] = 0 and [-1.34] = -2.

(d) The numerator of Q(p) is a partial sum of the ordered data.
Data sorting: sort(x) (default sorting: ascending). With x sorted:
When p = 1: x[1]+...+x[n/100], the sum of the first 100 ordered x
When p = 2: x[1]+...+x[n/100]+x[n/100+1]+…+x[2n/100]
When p = 100: x[1]+……+x[n], the sum of all x

(d) Obtain Q(p) , the proportion of income (earned by the whole population) owned by p% of the people with lowest income.

(d) Plot the Lorenz curve, i.e., Q(p) against p .


The explicit form of L(p) for the Pareto distribution (with α > 1) is
$L(p) = 1 - (1 - p/100)^{1 - 1/\alpha}$
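The steps of part (d) can be sketched in R as below. The value α = 1.16 is taken from (c2) and assumed to apply here; the closed form used for comparison is the standard Lorenz curve of a Pareto distribution with shape α > 1. Variable names are illustrative only.

```r
set.seed(305)

alpha <- 1.16   # assumed: the value quoted in (c2)
n     <- 10000

# Incomes via the inverse-transform hint from part (a)
x <- 1000 * runif(n)^(-1 / alpha)

p <- 1:100
k <- floor(n * p / 100)            # [np/100] for each p

# Numerator of Q(p): partial sums of the ordered data
partial_sums <- cumsum(sort(x))[k]

# Q(p): share of total income owned by the poorest p%
Q <- partial_sums / sum(x)

# Closed-form Lorenz curve of the Pareto distribution, for comparison
L_theory <- 1 - (1 - p / 100)^(1 - 1 / alpha)

# To draw the curves (left commented so the sketch has no side effects):
# plot(p, Q, type = "l", xlab = "p", ylab = "Q(p)")
# lines(p, L_theory, lty = 2)
```

Because α = 1.16 gives a very heavy tail, a single simulated curve can sit visibly away from the theoretical one; a few large incomes dominate the total.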

(e) Find the Gini coefficient G for the simulated data obtained in (d).
G = 1 − 2 · (Area under the Lorenz curve)
For the Pareto distribution, G = 1/(2α − 1).
Approximate the area under the Lorenz curve by the average of Q(p).
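A minimal sketch of part (e), again assuming α = 1.16 from (c2) and recomputing Q(p) so the block runs on its own. The names G_hat and G_theory are illustrative.

```r
set.seed(305)

alpha <- 1.16   # assumed: the value quoted in (c2)
n     <- 10000
x     <- 1000 * runif(n)^(-1 / alpha)   # simulated Pareto incomes

p <- 1:100
Q <- cumsum(sort(x))[floor(n * p / 100)] / sum(x)

# Area under the Lorenz curve approximated by the average of Q(p)
G_hat <- 1 - 2 * mean(Q)

# Theoretical Gini coefficient of the Pareto distribution
G_theory <- 1 / (2 * alpha - 1)

print(c(simulated = G_hat, theoretical = G_theory))
```

The average of Q(p) over p = 1, ..., 100 is a right-endpoint Riemann sum, so it slightly overstates the area and slightly understates G. The heavy tail at α = 1.16 adds further sampling variability.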