Chapter 13: Creating Samples and Indexes 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.

Slides:



Advertisements
Similar presentations
How SAS implements structured programming constructs
Advertisements

Examples from SAS Functions by Example Ron Cody
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 6.
Chapter 11: Creating and Using Macro Programs 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
Iteration (Looping Constructs in VB) Iteration: Groups of statements which are repeatedly executed until a certain test is satisfied Carrying out Iteration.
1 9/29/06CS150 Introduction to Computer Science 1 Loops Section Page 255.
Professor John Peterson
CS 280 Data Structures Professor John Peterson. Big O Notation We use a mathematical notation called “Big O” to talk about the performance of an algorithm.
Programming with MATLAB. Relational Operators The arithmetic operators has precedence over relational operators.
Chapter 18: Modifying SAS Data Sets and Tracking Changes 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
ETM 607 – Random Number and Random Variates
REPETITION STRUCTURES. Topics Introduction to Repetition Structures The while Loop: a Condition- Controlled Loop The for Loop: a Count-Controlled Loop.
Chapter 10:Processing Macro Variables at Execution Time 1 STAT 541 © Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
Chapter 14: Generating Data with Do Loops OBJECTIVES Understand iterative DO loops. Construct a DO loop to perform repetitive calculations Use DO loops.
©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina Chapter 17 supplement: Review of Formatting Data STAT 541.
Chapter 15: Combining Data Horizontally 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
Chapter 4: Loops and Files
Sampling With Replacement How the program works a54p10.sas.
Vectors and Matrices In MATLAB a vector can be defined as row vector or as a column vector. A vector of length n can be visualized as matrix of size 1xn.
1 Week 2: Variables and Assignment Statements READING: 1.4 – 1.6 EECS Introduction to Computing for the Physical Sciences.
Control Structures II Repetition (Loops). Why Is Repetition Needed? How can you solve the following problem: What is the sum of all the numbers from 1.
Saeed Ghanbartehrani Summer 2015 Lecture Notes #5: Programming Structures IE 212: Computational Methods for Industrial Engineering.
 2008 Pearson Education, Inc. All rights reserved Case Study: Random Number Generation C++ Standard Library function rand – Introduces the element.
Chapter 16: Using Lookup Tables to Match Data 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
Chapter 05 (Part III) Control Statements: Part II.
Algorithm Design.
Ch. 10 For Statement Dr. Bernard Chen Ph.D. University of Central Arkansas Spring 2012.
Chapter 22: Using Best Practices 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
Advanced Program Design. Review  Step 1: Problem analysis and specification –Specification description of the problem’s inputs and output –Analysis generalize.
1 Chapter 9. To familiarize you with  Simple PERFORM  How PERFORM statements are used for iteration  Options available with PERFORM 2.
ITERATIVE STATEMENTS. Definition Iterative statements (loops) allow a set of instruction to be executed or performed several until condition are met.
Chapter 17: Formatting Data 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
CPTG286K Programming - Perl Chapter 4: Control Structures.
1 A Balanced Introduction to Computer Science, 2/E David Reed, Creighton University ©2008 Pearson Prentice Hall ISBN Chapter 13 Conditional.
Chapter 19: Introduction to Efficient SAS Programming 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
Controlling Program Flow with Decision Structures.
ARITHMETIC OPERATORS COMPARISON/RELATIO NAL OPERATORS LOGIC OPERATORS ()Parenthesis>Greater than &&And ^Exponentiation=>=
BMTRY 789 Lecture 6: Proc Sort, Random Number Generators, and Do Loops Readings – Chapters 5 & 6 Lab Problem - Brain Teaser Homework Due – HW 2 Homework.
Expressions Methods if else Statements Loops Potpourri.
Chapter 17 Supplement: Alternatives to IF-THEN/ELSE Processing STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South.
Chapter 23: Selecting Efficient Sorting Strategies 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
Chapter 21: Controlling Data Storage Space 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
ALGEBRA VOCABULARY. Vocabulary: Expression Definition: A math phrase built from constants, variables and operations EXAMPLE: 3X - 2Y.
Why Repetition? Read 8 real numbers and compute their average REAL X1, X2, X3, X4, X5, X6, X7, X8 REAL SUM, AVG READ *, X1, X2, X3, X4, X5, X6, X7, X8.
Computer Science 101 For Statement. For-Statement The For-Statement is a loop statement that is especially convenient for loops that are to be executed.
Chapter 6: Modifying and Combining Data Sets  The SET statement is a powerful statement in the DATA step DATA newdatasetname; SET olddatasetname;.. run;
Chapter 14: Combining Data Vertically 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
Loops and Arrays Chapter 19 and Material Adapted from Fluency Text book.
SAS ® 101 Based on Learning SAS by Example: A Programmer’s Guide Chapters 8, 13, & 24 By Tasha Chapman, Oregon Health Authority.
Chapter 4 – C Program Control
Chapter 3: Decisions and Loops
The Selection Structure
Chapter 19: Introduction to Efficient SAS Programming
Chapter 13: Creating Samples and Indexes
Introduction to MATLAB
Chapter 18: Modifying SAS Data Sets and Tracking Changes
AP Statistics: Chapter 7
SET statement in DATA step
Chapter 24 (4th ed.): Creating Functions
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Unary Operators ++ and --
Chapter (3) - Looping Questions.
Iteration: Beyond the Basic PERFORM
Using Macros to Solve the Collation Problem
24 Searching and Sorting.
Vectors and Matrices In MATLAB a vector can be defined as row vector or as a column vector. A vector of length n can be visualized as matrix of size 1xn.
Chapter 24: Querying Data Efficiently
Output Variables {true} S {i = j} i := j; or j := i;
REPETITION Why Repetition?
COMPUTING.
Presentation transcript:

Chapter 13: Creating Samples and Indexes 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina

2 Creating a Systematic Sample from a Known Number of Observations  Observations are chosen from data set at regular intervals SET data-set-name POINT= point-variable;  point-variable names a temporary numeric variable whose value is the observation number of the observation to be read, must be given a value before SET statement execution, and must be a variable and not a constant value

3 Creating a Systematic Sample from a Known Number of Observations (continued)  point-variable values should be positive integers less than or equal to the number of observations in the SAS data set  Assign the value of point-variable within the program so that it has a value when the SET statement begins execution.  The value of point-variable must change during DATA step execution so that another observation is selected.

4 Creating a Systematic Sample from a Known Number of Observations (continued)  Use the STOP statement to stop processing the current DATA step immediately and resume processing statements after the end of the current DATA step. data sasuser.everyevenrecord; do obsnum=2 to 136 by 2; set sasuser.original point=obsnum; set sasuser.original point=obsnum; output; output;end;stop;run;

5 Creating a Systematic Sample from an Unknown Number of Observations  When you don’t know the number of observations in the data set, use the NOBS= option in the SET statement to determine how many observations there are in a SAS data set. SET data-set-name NOBS= variable;  variable is a temporary numeric variable whose value is the number of observations in the input data set

6 Creating a Systematic Sample from an Unknown Number of Observations (continued) data sasuser.everyevenrecord; do obsnum=2 to totobs by 2; set sasuser.original point=obsnum nobs=totobs; set sasuser.original point=obsnum nobs=totobs; output; output;end;stop;run;

7 Creating a Random Sample with Replacement data sasuser.subset (drop=i totobs); samplesize=20; do i =1 to samplesize; obsnum=ceil(ranuni(0)*totobs); obsnum=ceil(ranuni(0)*totobs); set sasuser.original point=obsnum nobs=totobs; set sasuser.original point=obsnum nobs=totobs; output; output;end;stop;run;

8 Creating a Random Sample with Replacement (continued) The RANUNI function generates a number between 0 and 1. RANUNI (seed) where seed is a nonnegative integer less than 2,147,483,647  If 0 is the seed, the computer clock initializes the stream and the stream of random numbers is NOT replicable. Using a specific positive seed will produce replicable results.

9 Creating a Random Sample with Replacement (continued)  ranuni(0)*totobs Using a multiplier (positive integer) with the RANUNI function changes the outcome’s range to a number between 0 and the multiplier  obsnum=ceil(ranuni(0)*totobs); obsnum will have a value that ranges from 1 to totobs (total number of observations) because the CEIL function returns the smallest integer that is greater than or equal to the argument

10 Creating a Random Sample without Replacement data sasuser.subset (drop=obsleft samplesize); samplesize=20;obsleft=totobs; do while (samplesize>0); obsnum+1; obsnum+1; if ranuni(0)<samplesize/obsleft then do; if ranuni(0)<samplesize/obsleft then do; set sasuser.original point=obsnum nobs=totobs; set sasuser.original point=obsnum nobs=totobs; output; output; samplesize=samplesize-1; samplesize=samplesize-1; end; end; obsleft=obsleft-1; obsleft=obsleft-1;end;stop;run;

11 Creating a Random Sample without Replacement (continued)  Each observation in the original data set is considered for selection only once.  samplesize is the number of observations to read into the sample and decreases by 1 per DO loop iteration  obsleft is the number of observations in the original data set that have not yet been considered for selection and decreases by 1 per DO loop iteration  totobs is the total number of observations in the original data set  obsnum is the number of the observation considered for selection (starting value is 0 and increments by 1 per DO loop iteration)  When the IF-condition is true, the observation (as per obsnum value) is selected, and not selected otherwise.

Creating Indexes in the DATA step  Indexes can be created in a DATA step as readily as in PROC SQL data meddbind (index=(tos)); set meddb; data medcind (index=(td=(tos dos ))); set meddb; 12