Lesson 8 - Topics
- Creating SAS datasets from procedures
- Using ODS and data steps to make reports
- Using PROC RANK
Programs in course notes
LSB 4:11; 5:3
Making SAS Datasets From Procedures
Output from SAS PROCs can be put into SAS datasets:
1. To do further processing of the information from the output
2. To reformat output to make a report
3. To restructure the original SAS dataset or create new variables
Ways to Put Output into SAS Datasets
- Using the OUTPUT statement, available in many procedures
- Using the ODS OUTPUT statement: any output table can be put into a SAS dataset
Report We Want to Generate

Quartiles of Weight by Gender and Center

sex      clinic    N    P25    P50    P75
Male     A
Male     B
Male     C
Male     D
Female   A
Female   B
Female   C
Female   D
Program 14

LIBNAME class 'C:\SAS_Files';

* Will use SAS dataset version of TOMHS data;
DATA wt;
  SET class.tomhsp (KEEP=ptid age sex clinic wtbl wt12);
  wtchg = wt12 - wtbl;
RUN;

PROC FORMAT;
  VALUE sexF 1 = 'Male' 2 = 'Female';
RUN;
* Create report by sex and clinic of univariate info;
PROC SORT DATA = wt;
  BY sex clinic;
RUN;

PROC UNIVARIATE DATA = wt NOPRINT;
  BY sex clinic;
  VAR wt12;
  OUTPUT OUT = univinfo N = n Q1 = p25 MEDIAN = p50 Q3 = p75;
RUN;

Notes on the OUTPUT statement:
- OUT= gives the name of the new dataset.
- Each remaining item has the form statistic name = variable name.
- Dataset univinfo will have one observation for each combination of sex and clinic.
PROC PRINT DATA = univinfo;
  FORMAT sex sexF.;
RUN;

Obs   sex      clinic    n    p75    p50    p25
 1    Male     A
 2    Male     B
 3    Male     C
 4    Male     D
 5    Female   A
 6    Female   B
 7    Female   C
 8    Female   D
PROC PRINT DATA = univinfo NOOBS;
  VAR sex clinic n p25 p50 p75;
  /* The numeric format was lost from the original slide; 5.1 is an assumed placeholder */
  FORMAT p25 p50 p75 5.1;
  TITLE 'Quartiles of Weight by Gender/Center';
RUN;

Quartiles of Weight by Gender/Center

sex      clinic    N    P25    P50    P75
Male     A
Male     B
Male     C
Male     D
Female   A
Female   B
Female   C
Female   D
Using ODS to Send Output to a SAS Dataset

Syntax: ODS OUTPUT output-table = new-data-set;

* Output quantile table to a dataset;
ODS OUTPUT quantiles = qwt;
PROC UNIVARIATE DATA = wt;
  VAR wtbl wt12;
RUN;
ODS OUTPUT CLOSE;

PROC PRINT DATA = qwt;
RUN;
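The ODS OUTPUT syntax needs the name of the output table, and that name is not always obvious from the printed listing. One common way to find it is ODS TRACE, which writes the name of every output table a procedure produces to the SAS log. The sketch below assumes the dataset wt from Program 14 already exists.

```sas
/* List the ODS table names produced by PROC UNIVARIATE in the log */
ODS TRACE ON;
PROC UNIVARIATE DATA = wt;
  VAR wtbl wt12;
RUN;
ODS TRACE OFF;
```

Each table appears in the log with a Name: entry (for example, Quantiles); that name is what goes on the left side of the equals sign in ODS OUTPUT.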
Display of Output Dataset

Obs   Varname   Quantile      Estimate
  1   wtbl      100% Max
  2   wtbl      99%
  3   wtbl      95%
  4   wtbl      90%
  5   wtbl      75% Q3
  6   wtbl      50% Median
  7   wtbl      25% Q1
  8   wtbl      10%
  9   wtbl      5%
 10   wtbl      1%
 11   wtbl      0% Min
 12   wt12      100% Max
 13   wt12      99%
 14   wt12      95%
 15   wt12      90%
 16   wt12      75% Q3
 17   wt12      50% Median
 18   wt12      25% Q1
 19   wt12      10%
 20   wt12      5%
 21   wt12      1%
 22   wt12      0% Min

We would like to put the wtbl and wt12 estimates side by side.
* Separate the data into 2 datasets;
DATA wtbl wt12;
  SET qwt;
  IF varname = 'wtbl' THEN OUTPUT wtbl;
  ELSE IF varname = 'wt12' THEN OUTPUT wt12;
RUN;

* PROC DATASETS is used for changing variable names;
PROC DATASETS;
  MODIFY wtbl;
  RENAME estimate = wtbl;
  MODIFY wt12;
  RENAME estimate = wt12;
RUN;
QUIT;

* Put the 2 datasets side-by-side;
DATA all;
  MERGE wtbl wt12;
  DROP varname;
RUN;

PROC PRINT DATA = all;
RUN;
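The split, rename, and merge steps can also be collapsed into a single data step by using the WHERE= and RENAME= dataset options on the MERGE statement. This is an alternative sketch rather than the course's method; it assumes the dataset qwt created by ODS OUTPUT above, and the output name all2 is a hypothetical name chosen so that it does not overwrite all.

```sas
/* Read qwt twice: once for each variable, renaming Estimate as it is read */
DATA all2;
  MERGE qwt (WHERE = (varname = 'wtbl') RENAME = (estimate = wtbl))
        qwt (WHERE = (varname = 'wt12') RENAME = (estimate = wt12));
  DROP varname;
RUN;

PROC PRINT DATA = all2;
RUN;
```

Like the three-step version, this is a one-to-one merge with no BY statement, so it relies on both variables having the same quantile rows in the same order.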
Obs   Quantile      wtbl    wt12
  1   100% Max
  2   99%
  3   95%
  4   90%
  5   75% Q3
  6   50% Median
  7   25% Q1
  8   10%
  9   5%
 10   1%
 11   0% Min
ODS OUTPUT ParameterEstimates (persist=proc) = betas;

PROC REG DATA = wt;
  MODEL dbpchg = wtchg age sex;
RUN;

PROC REG DATA = wt;
  MODEL sbpchg = wtchg age sex;
RUN;

ODS OUTPUT CLOSE;

PROC PRINT DATA = betas;
RUN;
Display of Output Dataset - Report

Obs   Dependent   Variable    Estimate   StdErr   tValue   Probt
 1    dbpchg      Intercept
 2    dbpchg      wtchg
 3    dbpchg      age
 4    dbpchg      sex
 5    sbpchg      Intercept
 6    sbpchg      wtchg
 7    sbpchg      age
 8    sbpchg      sex
Display of Output Dataset Using BY Statement

PROC PRINT DATA = betas;
  VAR variable estimate stderr tvalue probt;
  BY dependent NOTSORTED;
  FORMAT estimate 7.3 stderr 7.3 probt pvalue5.2;
RUN;

Dependent=dbpchg

Obs   Variable    Estimate   StdErr   tValue   Probt
 1    Intercept
 2    wtchg
 3    age
 4    sex

Dependent=sbpchg

Obs   Variable    Estimate   StdErr   tValue   Probt
 5    Intercept
 6    wtchg
 7    age
 8    sex
PROC RANK
- Used to divide observations into equal-sized categories based on the values of a variable
- Creates a new variable containing the categories
- The new variable is added to the existing dataset or written to a new dataset
Example: Divide weight change into 5 equal categories (quintiles)
PROC RANK SYNTAX

PROC RANK DATA = dataset OUT = outdataset GROUPS = number-of-categories;
  VAR varname;
  RANKS newvarname;
RUN;

Most of the time you can set OUT= to the same dataset specified in DATA=.
PROC RANK produces no printed output.
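As a minimal sketch of the quintile example, the program below fills in the syntax with GROUPS=5. The dataset demo, its ten values, and the names demo_ranked and qwtchg are all hypothetical and exist only to make the example self-contained.

```sas
/* Hypothetical toy data: 10 subjects, each with one weight change value */
DATA demo;
  INPUT id wtchg;
  DATALINES;
1 -12.5
2 -9.0
3 -7.5
4 -6.0
5 -4.5
6 -3.0
7 -1.5
8 0.5
9 2.0
10 4.5
;
RUN;

/* GROUPS=5 asks for quintiles; the new variable qwtchg is coded 0 through 4 */
PROC RANK DATA = demo OUT = demo_ranked GROUPS = 5;
  VAR wtchg;
  RANKS qwtchg;
RUN;

/* PROC RANK prints nothing, so check the group sizes with PROC FREQ */
PROC FREQ DATA = demo_ranked;
  TABLES qwtchg;
RUN;
```

Writing to a separate OUT= dataset here keeps the original demo untouched, which can be convenient while checking that the grouping looks as expected.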
PROGRAM 15

LIBNAME class 'C:\SAS_Files';

DATA wtchol;
  SET class.tomhsp (KEEP=ptid clinic sex wtbl wt12 cholbl chol12);
  wtchg = wt12 - wtbl;
  cholchg = chol12 - cholbl;
RUN;

* This PROC will add a new variable to the dataset, which is the tertile
  of weight change. The new variable will be 0, 1, or 2;
PROC RANK DATA = wtchol GROUPS=3 OUT = wtchol;
  VAR wtchg;
  RANKS twtchg;
RUN;

The RANKS statement gives the name of the new variable.
PARTIAL LOG

8    DATA wtchol;
9    SET class.tomhsp (KEEP=ptid clinic sex wtbl wt12 cholbl chol12);
10   wtchg = wt12 - wtbl;
11   cholchg = chol12 - cholbl;
12   RUN;

NOTE: There were 100 observations read from the data set CLASS.TOMHSP.
NOTE: The data set WORK.WTCHOL has 100 observations and 9 variables.

19   PROC RANK DATA = wtchol GROUPS=3 OUT = wtchol;
20   VAR wtchg; RANKS twtchg;
21   RUN;

NOTE: The data set WORK.WTCHOL has 100 observations and 10 variables.
PROC FREQ DATA = wtchol;
  TABLES twtchg;
RUN;

OUTPUT:

                  Rank for Variable wtchg

                                   Cumulative    Cumulative
twtchg    Frequency     Percent     Frequency      Percent
-----------------------------------------------------------

                   Frequency Missing = 8
PROC PRINT DATA = wtchol (obs=20);
  VAR ptid wtchg twtchg;
  TITLE 'Partial Listing of Dataset wtchol with new variable added';
RUN;

Partial Listing of Dataset wtchol with new variable added

Obs   PTID   wtchg   twtchg
[listing rows not recoverable]
PROC MEANS DATA = wtchol N MEAN MIN MAX MAXDEC=2;
  VAR cholchg wtchg;
  CLASS twtchg;
  TITLE 'Mean Cholesterol Change by Tertile of Weight Change';
RUN;
Mean Cholesterol Change by Tertile of Weight Change

The MEANS Procedure

Rank for
Variable      N
wtchg       Obs   Variable    N    Mean    Minimum    Maximum
                  cholchg
                  wtchg
                  cholchg
                  wtchg
                  cholchg
                  wtchg

Notes:
- The Minimum and Maximum of wtchg within each tertile show the cutpoints for the tertiles.
- Could graph these data in an x-y plot (3 points).
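The suggested 3-point plot ties back to the first topic of this lesson: the tertile means can be saved to a dataset with the OUTPUT statement and then plotted. The following is a sketch under assumptions, not part of the course programs; it uses the dataset wtchol from Program 15, and the names tertmeans, mean_wtchg, and mean_cholchg are hypothetical.

```sas
/* Save the mean weight change and mean cholesterol change for each tertile */
PROC MEANS DATA = wtchol NWAY NOPRINT;
  CLASS twtchg;
  VAR wtchg cholchg;
  OUTPUT OUT = tertmeans MEAN = mean_wtchg mean_cholchg;
RUN;

/* Plot the 3 points: one per tertile of weight change */
PROC SGPLOT DATA = tertmeans;
  SERIES X = mean_wtchg Y = mean_cholchg / MARKERS;
  XAXIS LABEL = 'Mean weight change';
  YAXIS LABEL = 'Mean cholesterol change';
RUN;
```

NWAY keeps only the rows for the individual tertiles and drops the overall summary row, so tertmeans should contain one observation per tertile.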