Sampling Distribution of the Mean in IML


Complete code for data step version

%let obs = 10;     /* size of each sample */
%let reps = 1000;  /* number of samples */

data uniforms;
  call streaminit(54321);
  do rep = 1 to &reps;
    do i = 1 to &obs;
      x = rand("Uniform");
      output;
    end;
  end;               /* closes the outer DO loop (missing in the original) */
run;

/* one mean per sample */
proc means data=uniforms noprint;
  by rep;
  var x;
  output out=MeansUni mean=Meanx;
run;

proc univariate data=MeansUni;
  label Meanx = "Sample Mean of U(0,1) Data";
  var Meanx;
  histogram Meanx / normal;
  ods select Histogram Moments;
run;

In the IML version, each sample is stored as a row of a matrix, and each sample mean is computed as a row mean (x[,:] in the code); the MEAN function then summarizes those sample means. There are no loops. Three statements (RANDSEED, J, and RANDGEN) generate the samples. RANDGEN can fill an entire matrix with random values.

%let obs = 10;
%let reps = 1000;

proc iml;
call randseed(123);
x = j(&reps, &obs);             /* many samples (rows), each of size N */
call randgen(x, "Uniform");     /* 1. Simulate data */
s = x[,:];                      /* 2. Compute statistic for each row */
Mean = mean(s);                 /* 3. Summarize and analyze ASD */
StdDev = std(s);
call qntl(q, s, {0.05 0.95});
print Mean StdDev (q`)[colname={"5th Pctl" "95th Pctl"}];

/* compute proportion of statistics greater than 0.7 */
Prob = mean(s > 0.7);
print Prob[format=percent7.2];
quit;
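As a rough check on the printed Mean, StdDev, percentiles, and Prob, the simulated values can be compared with theory. The sketch below assumes independent U(0,1) draws and uses the normal approximation for the percentiles and the tail probability, so those figures are approximate rather than exact.

```latex
% Each X_i ~ U(0,1): E[X_i] = 1/2, Var(X_i) = 1/12. Sample size n = 10.
\[
  E[\bar{X}] = \tfrac{1}{2}, \qquad
  \operatorname{SD}(\bar{X}) = \sqrt{\frac{1/12}{10}} = \sqrt{\frac{1}{120}} \approx 0.0913
\]
% Normal approximation to the 5th and 95th percentiles
\[
  0.5 \pm 1.645 \times 0.0913 \approx (0.350,\ 0.650)
\]
% Normal approximation to the tail probability
\[
  P(\bar{X} > 0.7) \approx 1 - \Phi\!\left(\frac{0.7 - 0.5}{0.0913}\right)
  = 1 - \Phi(2.19) \approx 0.014
\]
```

With 1000 replications the simulated values should land near these figures, with some Monte Carlo noise.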

Create a data set in wide format

%let obs = 10;
%let reps = 1000;

proc iml;
call randseed(123);
x = j(&reps, &obs);             /* many samples (rows), each of size N */
call randgen(x, "Uniform");     /* 1. Simulate data */
c = "x1":"x&obs";               /* column names x1, x2, ..., x&obs */
show c;
create unif from x[colname=c];
append from x;
close unif;
quit;

proc contents data=unif;
run;

Summarize data set in wide format

%let obs = 10;

data stats (keep=mean std max);
  set unif;
  mean = mean(of x1-x&obs);     /* row statistics across x1, ..., x&obs */
  std  = std(of x1-x&obs);
  max  = max(of x:);
run;

proc means data=stats;
run;

proc univariate data=stats;
  var mean std max;
  ods select qqplot;
  qqplot mean std max;
run;

Restructure data into long format using IML

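%let obs = 3;
%let reps = 5;
%let seed = 54321;

proc iml;
reset print;                          /* print every matrix as it is created */
call randseed(&seed);
x = j(&reps, &obs);
call randgen(x, "Uniform");
print x;
rep = repeat(T(1:&reps), 1, &obs);    /* row i holds the sample number i */
rep = shape(rep, 0, 1);               /* flatten to a column, row by row */
z = shape(x, 0, 1);                   /* flatten the data the same way */
create Long var {rep z};
append;
close Long;
create Long2 var {rep x};             /* same, but passing x without reshaping */
append;
close Long2;
quit;

proc print data=long(obs=5); run;
proc print data=long2(obs=5); run;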

/**********************/
/* Answer to exercise 4.8 */
proc iml;
call randseed(123);
x = j(10000, 10);
call randgen(x, "Uniform");     * 1. Simulate data;
s = x[,<>];                     * 2. Compute statistic (maximum) for each row;
Mean = mean(s);                 * 3. Summarize and analyze ASD;
StdDev = std(s);
call qntl(q, s, {0.05 0.95});
print Mean StdDev (q`)[colname={"5th Pctl" "95th Pctl"}];
create MaxDist var {s};
append;
close MaxDist;

/* Sampling Normal Means, IML */
%let N = 31;                    /* size of each sample */
%let NumSamples = 10000;        /* number of samples */
samples = j(&NumSamples, &N, .);
call randgen(samples, "Normal");
samplemeans = samples[,:];      /* one mean per row */
/* the original named a nonexistent matrix (samplemean) in the VAR clause */
create means from samplemeans[colname={"samplemean"}];
append from samplemeans;
close means;
quit;
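Both simulations above have closed-form answers that the output can be checked against. The sketch below assumes independent draws: M is the maximum of n = 10 U(0,1) values (the exercise 4.8 statistic), and the normal sample mean is taken over N = 31 standard normal values, which is what RANDGEN generates for "Normal" with no parameters.

```latex
% Maximum of n = 10 independent U(0,1) values
\[
  P(M \le m) = m^{n}, \quad 0 \le m \le 1, \qquad
  E[M] = \frac{n}{n+1} = \frac{10}{11} \approx 0.909
\]
\[
  \operatorname{SD}(M) = \sqrt{\frac{n}{(n+1)^{2}(n+2)}} = \sqrt{\frac{10}{1452}} \approx 0.083
\]
% Quantiles follow from inverting the CDF: q_p = p^{1/n}
\[
  q_{0.05} = 0.05^{1/10} \approx 0.741, \qquad
  q_{0.95} = 0.95^{1/10} \approx 0.995
\]
% Mean of N = 31 independent N(0,1) values
\[
  \bar{X} \sim N\!\left(0, \tfrac{1}{31}\right), \qquad
  \operatorname{SD}(\bar{X}) = \frac{1}{\sqrt{31}} \approx 0.180
\]
```

The distribution of the maximum is strongly left-skewed, so unlike the sample mean it should not be expected to look normal in a histogram of MaxDist.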