BMTRY 789 Lecture 6: Proc Sort, Random Number Generators, and Do Loops
Readings – Chapters 5 & 6
Lab Problem – Brain Teaser
Homework Due – HW 2
Homework for next week – HW 3 (Libname/Cards Quiz) NO CLASS
Summer 2009 BMTRY 789 Intro to SAS Programming
Random Number Generators
Proc Sort
Primarily used to sort the observations of your data by a certain variable or collection of variables.
However, it can also be used to create a new data set, subset your data, rename, drop, or keep variables, and format or label variables.
An additional, very important feature is removing duplicate records.
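As a minimal sketch of the "more than sorting" uses listed above, the step below subsets, keeps, and renames variables while sorting. It assumes the SASHELP.CLASS sample data set that ships with SAS; the output name TEENS and the new variable name HEIGHT_IN are made up for illustration.

```sas
/* Sketch only: SASHELP.CLASS is a sample data set shipped with SAS.    */
/* TEENS and HEIGHT_IN are hypothetical names chosen for this example.  */
proc sort data=sashelp.class (keep=name age height        /* keep 3 variables      */
                              rename=(height=height_in)   /* rename one of them    */
                              where=(age >= 13))          /* subset the rows       */
          out=teens;                                      /* write a new data set  */
  by age name;                                            /* sort keys             */
run;

proc print data=teens noobs;
run;
```

Because OUT= is used, SASHELP.CLASS itself is left unchanged.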
Proc Sort Options
It is almost always a good idea to use the OUT= option when using Proc Sort to do anything except a simple sort.
Why? Because without OUT=, Proc Sort automatically writes over your data set!
Go to SAS user group paper to take a look at features and examples…
NODUPKEY
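A short sketch of OUT= together with NODUPKEY follows. The data set VISITS and its values are invented for this example; they do not come from the slides or the user group paper.

```sas
/* Hypothetical data: VISITS and its values are made up for illustration. */
data visits;
  input id visit weight;
  datalines;
2 1 150
1 2 181
1 1 180
2 2 149
;
run;

/* Plain sort, written to a new data set so VISITS is not overwritten. */
proc sort data=visits out=visits_sorted;
  by id visit;
run;

/* NODUPKEY keeps one observation per distinct value of the BY variables, */
/* so VISITS_UNIQUE ends up with one record per ID.                       */
proc sort data=visits out=visits_unique nodupkey;
  by id;
run;
```

Without OUT=, the second step would replace VISITS with the de-duplicated version and the dropped records would be lost.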
Do-Loops
Do-loops are one of the main tools of SAS programming. They exist in several forms, always terminated by an END; statement.
Do-Loops (cont.)
DO; - groups blocks of statements together
DO OVER arrayname; - processes array elements
DO VAR=start TO end; - range of numeric values
DO VAR=list-of-values;
DO WHILE (expression); - expression evaluated at the top of the loop, so the body may never execute
DO UNTIL (expression); - expression evaluated at the bottom of the loop, so the body is guaranteed to execute at least once
*Some of these forms can be combined.
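The difference between DO WHILE and DO UNTIL, and one way of combining forms, can be sketched as follows. The variable names (N, I, TOTAL) are arbitrary choices for this illustration.

```sas
data _null_;
  n = 5;

  /* WHILE tests first: the condition is already false, so the body is skipped. */
  do while (n < 5);
    put 'WHILE body ran with ' n=;
    n = n + 1;
  end;

  /* UNTIL tests last: the body runs once even though n >= 5 is already true. */
  do until (n >= 5);
    put 'UNTIL body ran with ' n=;
    n = n + 1;
  end;

  /* Combined form: iterate over a range, but stop early once TOTAL reaches 12. */
  total = 0;
  do i = 1 to 10 while (total < 12);
    total = total + i;
  end;
  put i= total=;
run;
```

Check the log after running: only the UNTIL message should appear from the first two loops.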
Iterative Do-loop
Do loops can be nested. The following example calculates how long it would take for an investment with interest compounded monthly to double:

Data interest;
  do rate = 4, 4.5, 5, 7, 9, 20;
    mrate = rate / 1200; *converts from percentage;
    months = 0;
    start = 1;
    Do While (start < 2);
      start = start * (1 + mrate);
      months = months + 1;
    End;
    years = months / 12;
    output;
  End;
  Keep rate years;
Run;
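As a cross-check on the loop (this formula is not on the original slide), the inner Do While stops at the first whole number of months m for which (1 + mrate)^m is at least 2, which has a closed form:

```latex
m = \left\lceil \frac{\ln 2}{\ln(1 + r)} \right\rceil,
\qquad r = \frac{\text{rate}}{1200},
\qquad \text{years} = \frac{m}{12}
```

For rate = 4, r = 1/300 and ln 2 / ln(1 + r) is about 208.3, so m = 209 months, or roughly 17.4 years.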
Random Number Functions
SAS can generate random observations from discrete and continuous distributions.
Binomial (n, p) - ranbin(seed, n, p)
Exponential (λ = 1) - ranexp(seed)
Standard Normal (µ = 0; σ² = 1) - rannor(seed)
Poisson (mean > 0) - ranpoi(seed, mean)
Uniform (interval (0,1)) - ranuni(seed)
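A minimal sketch that calls each of the functions listed above once per observation. The seed 123, the parameter values, and the data set name DRAWS are arbitrary choices for illustration; the line for X shows one common way to shift and scale a standard normal draw.

```sas
/* Sketch only: seed, parameters, and names are arbitrary illustrative choices. */
data draws;
  do i = 1 to 5;
    b = ranbin(123, 10, 0.3);    /* Binomial with n = 10, p = 0.3          */
    e = ranexp(123);             /* Exponential with parameter 1           */
    z = rannor(123);             /* Standard normal                        */
    x = 100 + 15 * rannor(123);  /* Normal rescaled to mean 100, SD 15     */
    p = ranpoi(123, 4);          /* Poisson with mean 4                    */
    u = ranuni(123);             /* Uniform on (0,1)                       */
    output;
  end;
run;

proc print data=draws noobs;
run;
```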
Seeds
A SEED is a number used by the random number generator to start the algorithm.
It can be any POSITIVE NUMBER or zero.
A seed of 0 = a different series of numbers each time you run the program.
Any positive seed = a repeatable series of numbers each time you run the program.
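One way to see the seed behavior for yourself is sketched below. The data set names RUN1, RUN2, and RUN3 and the seed 123 are made up for this example.

```sas
/* Same positive seed in two separate steps: the two series should match. */
data run1;
  do i = 1 to 5;
    u = ranuni(123);
    output;
  end;
run;

data run2;
  do i = 1 to 5;
    u = ranuni(123);
    output;
  end;
run;

/* PROC COMPARE reports whether RUN1 and RUN2 differ. */
proc compare base=run1 compare=run2;
run;

/* Seed of 0: rerun this step and the printed values should change each time. */
data run3;
  do i = 1 to 5;
    u = ranuni(0);
    output;
  end;
run;

proc print data=run3 noobs;
run;
```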
Practical Ex: Randomly Assign Subjects to Study Groups

Data Assign;
  Do Subj = 1 to 20;
    If RANUNI(123) LE .5 Then Group = 1;
    Else Group = 2;
    Output;
  End;
Run;

Proc Print Data = Assign;
Run;
Practical Ex: Randomly Assign Subjects to Study Groups (of equal size)

Data Random;
  Do Subj = 1 to 20;
    Group = RANUNI(0);
    Output;
  End;
Run;

* Groups=2 replaces Group with its rank group, coded 0 or 1;
Proc Rank Data = Random Groups=2 Out=Split;
  Var Group;
Run;

Proc Print Data = Split Noobs;
Run;
Here is your brain teaser, in-class assignment…
Practical Do-Loop Example
You have a SAS data set DIET which contains variables ID, DATE, and WEIGHT. There are multiple records per ID, and the records are sorted by DATE within ID.
The task is to create a new SAS data set DIET2 from DIET which contains only one record per subject, with each record containing the subject ID and the mean weight. (You could use Proc Means here, but for the purpose of this exercise we want to perform this task within the Data Step.)
Hints: Include a By ID statement after the SET statement in the DATA step, and then use the First. and Last. variables.
Example Data
Data Set DIET
ID DATE WEIGHT
110/01/ /08/ /15/ /02/ /09/ /16/ /23/92202
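One possible approach to the brain teaser, following the hints (By ID with First. and Last.), is sketched below. It is not necessarily the intended solution. The DIET values here are invented for illustration and are not the values from the slide; TOTAL, N, and MEAN_WEIGHT are hypothetical variable names.

```sas
/* Hypothetical DIET data: these values are made up, not taken from the slide. */
data diet;
  input id date :mmddyy8. weight;
  format date mmddyy8.;
  datalines;
1 10/01/92 155
1 10/08/92 158
1 10/15/92 156
2 10/02/92 210
2 10/09/92 206
;
run;

data diet2;
  set diet;
  by id;                           /* creates FIRST.ID and LAST.ID             */
  if first.id then do;             /* reset the accumulators for each subject  */
    total = 0;
    n = 0;
  end;
  if not missing(weight) then do;  /* skip missing weights                     */
    total + weight;                /* sum statements retain TOTAL and N        */
    n + 1;
  end;
  if last.id then do;              /* one output record per subject            */
    if n > 0 then mean_weight = total / n;
    output;
  end;
  keep id mean_weight;
run;

proc print data=diet2 noobs;
run;
```

The sum statements (total + weight; and n + 1;) are used because they carry values across observations without a separate RETAIN statement.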