Exercise session # 1 Random data generation Jan Matuska November, 2006 Labor Economics.


Overview:
Graphing
Generating random variables
Generating random dummy variables from sample
Drawing from multivariate distributions
Throwing seeds
Loops and distribution of estimated coefficients

Graphing
a) Histograms
hist z2, den - histogram of variable z2 (density)
hist z2, freq - histogram of variable z2 (frequency)
dotplot z2 z3 - dot plots of both variables, drawn side by side
kdensity z2 - produces a kernel density estimate and graphs the result
b) Sample cdf-s of variables
cumul z3, gen(cz3) - generates variable cz3, the sample cdf values for z3
line cz3 z3, sort - graphs the sample cdf
scatter cz3 z3, sort - the same graph, drawn as points

Generating random variables 1
500 draws from the uniform distribution on [0,1]:
set obs 500
gen x1 = uniform()
500 draws from the standard normal distribution (mean 0, variance 1):
gen x2 = invnorm(uniform())
500 draws from the distribution N(1,2) (mean 1, variance 2, so the multiplier is the standard deviation sqrt(2)):
gen x3 = 1 + sqrt(2)*invnorm(uniform())

Generating random variables 2
500 draws from the uniform distribution between 3 and 12:
set obs 500
gen x4 = 3 + 9*uniform()
Compute 500 "z" values as 4 - 3*x4 + 8*x2:
gen z = 4 - 3*x4 + 8*x2
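A quick way to check draws like these is to compare sample moments with their theoretical values. The following is a minimal sketch, assuming it is run as a do-file; the seed 12345 is an arbitrary choice and does not come from the slides. For x4, uniform on [3,12], the theoretical mean is 7.5 and the standard deviation is 9/sqrt(12), about 2.60. For z = 4 - 3*x4 + 8*x2 with x4 and x2 independent, the mean is 4 - 3*7.5 = -18.5 and the variance is 9*6.75 + 64 = 124.75, a standard deviation of about 11.17.

```stata
* Sketch: regenerate the variables and compare sample moments with theory.
clear
set obs 500
* arbitrary seed (an assumption, not from the slides), used only for reproducibility
set seed 12345
gen x2 = invnorm(uniform())
gen x4 = 3 + 9*uniform()
gen z  = 4 - 3*x4 + 8*x2
* sample means and standard deviations should be near the theoretical values
summarize x2 x4 z
```

With only 500 draws the sample values will differ somewhat from theory. Newer Stata releases document runiform() and rnormal() for the same purpose; the older function names from the slides are kept here for consistency.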

Generating random dummy variables from sample
set obs 1000 - create data for 1000 individuals
gen smoke = uniform()<.7 - assume that there is a 70% chance that an individual smokes at time = 1
smoke = 1 if the expression is true (uniform()<0.7)
smoke = 0 if the expression is not true (uniform()>=0.7)
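The share of ones in a dummy generated this way can be checked directly. This is a minimal sketch, assuming a do-file and an arbitrary seed that is not from the slides. A comparison of the form uniform() < p is true with probability p, so the sample mean of the dummy should be close to p.

```stata
* Sketch: a dummy that equals 1 with probability 0.7, then check its share.
clear
set obs 1000
* arbitrary seed (an assumption, not from the slides)
set seed 12345
gen smoke = uniform() < 0.7
* the mean of a 0/1 variable is the share of ones; expect roughly 0.7
summarize smoke
* frequency table of smokers and non-smokers
tabulate smoke
```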

Drawing from multivariate distributions
clear
mat m=(12,20,0) - matrix of means of RHS vars: y2, y3, error
mat c=(5,-.6,0 \ -.6,119,0 \ 0,0,.1) - covariance matrix of RHS vars
drawnorm y2 y3 e, n(1000) means(m) cov(c) - draws a sample of 1000 observations from a normal distribution with the specified means and covariances
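To see whether the drawn sample matches the requested moments, and how the draws could then feed into a dependent variable, here is a hedged sketch. The variable y1 and its coefficients (1, 2 and 0.5) are hypothetical and chosen only for illustration, and the seed is arbitrary; none of these come from the slides.

```stata
* Sketch: draw the sample, inspect its moments, then build a dependent variable.
clear
* arbitrary seed (an assumption, not from the slides)
set seed 12345
mat m = (12, 20, 0)
mat c = (5, -.6, 0 \ -.6, 119, 0 \ 0, 0, .1)
drawnorm y2 y3 e, n(1000) means(m) cov(c)
* sample means should be near 12, 20 and 0
summarize y2 y3 e
* sample covariance matrix, to compare with c
correlate y2 y3 e, covariance
* hypothetical data-generating process for illustration only
gen y1 = 1 + 2*y2 + 0.5*y3 + e
reg y1 y2 y3
```

Because e is drawn with mean 0 and zero covariance with y2 and y3, the regression estimates should be close to the hypothetical coefficients.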

Throwing seeds
Setting the seed allows you to generate a particular sample again at any time.
clear
set obs 50
set seed 2 - the seed number can be any positive integer; the STATA default is […]
gen z1 = invnorm(uniform())
set seed 2
gen z2 = invnorm(uniform())
set seed […]
gen z3 = invnorm(uniform())
dotplot z1 z2 z3 - we can see that z1 and z2 are identical and different from z3
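The visual comparison in the dot plot can also be made exact with assert, which stops with an error when a condition fails for any observation. This is a minimal sketch, assuming a do-file; the third seed, 3, is an assumption, since the slide does not show which value was used.

```stata
* Sketch: same seed gives the same draws, a different seed gives different draws.
clear
set obs 50
set seed 2
gen z1 = invnorm(uniform())
set seed 2
gen z2 = invnorm(uniform())
* hypothetical different seed (the slide's value is not shown)
set seed 3
gen z3 = invnorm(uniform())
* passes silently because resetting the seed reproduces the same sequence
assert z1 == z2
* number of observations in which z1 and z3 differ
count if z1 != z3
```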

Loops and distribution of estimated coefficients
Loop:
local i=1 - i is the counter; it must be initialized before the loop
while `i'<=500 {
“commands”
local i=`i'+1
}
reg z x1 x2 - regress fits a model of the dependent variable on the other specified variables using linear regression
The loop is used to collect many estimates b1 of a coefficient, one from each simulated sample. Each estimate differs from the true coefficient because of sampling error, but the mean of all the estimates should closely approximate the true coefficient we want to recover.
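One way the loop might be filled in is sketched below, assuming it is run as a do-file in one go (the local macros and temporary names do not survive between separate interactive commands). The data-generating process z = 4 - 3*x1 + 8*x2 + error, the sample size of 100, the seed, and the use of postfile to store the estimates are all assumptions made for illustration; the slides only show the loop skeleton and the regress command. Under this hypothetical process the true coefficient on x1 is -3.

```stata
* Sketch: Monte Carlo distribution of the estimated coefficient on x1.
clear all
* arbitrary seed (an assumption)
set seed 2
tempname sim
tempfile results
* open a results file with one variable, b1, to hold the estimates
postfile `sim' b1 using `results'

local i = 1
while `i' <= 500 {
    quietly {
        drop _all
        set obs 100
        gen x1 = uniform()
        gen x2 = invnorm(uniform())
        * hypothetical data-generating process; true coefficient on x1 is -3
        gen z = 4 - 3*x1 + 8*x2 + invnorm(uniform())
        reg z x1 x2
    }
    * store this replication's estimate of the coefficient on x1
    post `sim' (_b[x1])
    local i = `i' + 1
}
postclose `sim'

* load the 500 estimates; their mean should be near -3
use `results', clear
summarize b1
hist b1, den
```

Stata also provides the simulate command for this kind of exercise; the explicit while loop is used here to match the slide.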

Thank you for your attention