ON PATHS LESS TRODDEN… Excursions in PROC TABULATE and SAS/GRAPH®
Presented by Aaron Rabushka
© Aaron Rabushka 2000
PART I: PROC TABULATE
PROC TABULATE: GENERAL
• Powerful
• Tricky
• Has its own book: The SAS Guide to TABULATE Processing
PROC TABULATE: GENERAL (cntd.)
• Often used as a big brother to PROC FREQ to display counts and sums that require more than two levels of nesting on a page
• Can aggregate any level of its elements' detail
  – label formatting can help distinguish aggregations of variables
PROC TABULATE: TYPES OF VARIABLES
Uses two kinds of variables:
• classification variables
  – CLASS statement
  – can be either numeric or character variables
  – can be grouped with a FORMAT statement
  – default statistic is N
CODE EXAMPLE
  PROC TABULATE DATA=HOLDOUT.AUDIT NOSEPS;
    CLASS ZONE DAYNUM SYSID;
PROC TABULATE: TYPES OF VARIABLES (cntd.)
• analysis variables
  – VAR statement
  – must be numeric
  – default statistic is SUM
CODE EXAMPLE
  PROC TABULATE DATA=HOLDOUT.AUDIT NOSEPS;
    CLASS ZONE DAYNUM SYSID;
    VAR CPUTIME EXCTIME;
PROC TABULATE: INTERRELATING VARIABLES
• CONCATENATION
  – can concatenate variables and statistics to each other in any dimension
• CROSSING (nesting)
  – reveals much of the power of PROC TABULATE
  – can cross variables and statistics with each other and also with formats
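A minimal sketch of the two ways of interrelating variables. It assumes the SASHELP.CLASS sample table that ships with SAS rather than the presenter's HOLDOUT.AUDIT data: a blank between elements concatenates them, and an asterisk crosses them.

```sas
/* Concatenation: SEX and ALL sit side by side in the row dimension, */
/* HEIGHT and WEIGHT sit side by side in the column dimension.       */
PROC TABULATE DATA=SASHELP.CLASS;
  CLASS SEX AGE;
  VAR HEIGHT WEIGHT;
  TABLES SEX ALL, HEIGHT WEIGHT;
RUN;

/* Crossing: AGE is nested within SEX, and each analysis variable */
/* is crossed with a statistic and with a format.                 */
PROC TABULATE DATA=SASHELP.CLASS;
  CLASS SEX AGE;
  VAR HEIGHT WEIGHT;
  TABLES SEX*AGE, HEIGHT*MEAN*F=8.1 WEIGHT*MEAN*F=8.1;
RUN;
```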
PROC TABULATE: DIMENSIONS OF DISPLAY
• Can aggregate and display the aggregations in three dimensions (in descending order of necessity):
  – column
  – row
  – page
• Dimensional specifications are separated with commas, are written in the order page, row, column, and can be quite complex
CODE EXAMPLE
  PROC TABULATE DATA=HOLDOUT.AUDIT;
    CLASS ZONE DAYNUM SYSID;
    VAR CPUTIME EXCTIME;
    TABLES ZONE,
           DAYNUM*SYSID,
           N CPUTIME*SUM EXCTIME*SUM;
PROC TABULATE: DIMENSIONS OF DISPLAY
• Can aggregate in any dimension with ALL
  – can use labels to distinguish levels of aggregation
CODE EXAMPLE
  PROC TABULATE DATA=HOLDOUT.AUDIT NOSEPS;
    CLASS ZONE DAYNUM SYSID;
    VAR CPUTIME EXCTIME;
    TABLES ZONE,
           (DAYNUM='DAY' ALL='ALL DAYS')*(SYSID ALL='ALL SYSTEMS'),
           N='NUMBER OF STEPS'*F=9. CPUTIME*SUM EXCTIME*SUM;
PROC TABULATE: DIMENSIONS OF DISPLAY (cntd.)
• Can also group data over pages with a BY statement
  – works if there is no need to aggregate across BY groups
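A sketch of BY-group processing, again assuming the SASHELP.CLASS sample table; WORK.CLASS_SORTED is a name chosen only for illustration. The input has to be sorted by the BY variable first, and ALL here totals only within each BY group, which is why this approach suits tables that need no grand total across groups.

```sas
/* BY processing expects the data in BY-variable order. */
PROC SORT DATA=SASHELP.CLASS OUT=WORK.CLASS_SORTED;
  BY SEX;
RUN;

/* One table per value of SEX; ALL sums within the BY group only. */
PROC TABULATE DATA=WORK.CLASS_SORTED;
  BY SEX;
  CLASS AGE;
  VAR HEIGHT;
  TABLES AGE ALL='ALL AGES', HEIGHT*(N MEAN);
RUN;
```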
PROC TABULATE: AVAILABLE STATISTICS
• Can display many of the statistics available through PROC MEANS
• Percentages: PCTN (the percentage that a cell comprises of a count) and PCTSUM (the percentage that a cell comprises of a sum)
  – it is IMPERATIVE and often tricky to use the correct denominator!
  – wise to test on "easy" data before applying it to the real data
CODE EXAMPLE
  PROC TABULATE DATA=HOLDOUT.AUDIT;
    CLASS ZONE DAYNUM SYSID;
    VAR CPUTIME EXCTIME;
    TABLES ZONE,
           (DAYNUM='DAY' ALL='ALL DAYS')*(SYSID ALL='ALL SYSTEMS'),
           N='NUMBER OF STEPS'*F=9. PCTN<SYSID ALL>
           CPUTIME*(SUM PCTSUM<SYSID ALL>)
           EXCTIME*(SUM PCTSUM<SYSID ALL>);
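One way to follow the advice about testing on "easy" data is to build a handful of rows whose percentages can be worked out by hand. The dataset WORK.EASY and its values are invented for illustration; only the variable names echo the example above.

```sas
/* Five made-up observations with round numbers. */
DATA WORK.EASY;
  INPUT DAYNUM SYSID $ CPUTIME;
  DATALINES;
1 A 10
1 A 30
1 B 60
2 A 25
2 B 75
;
RUN;

/* Same denominator definition as above, on data small enough to check. */
PROC TABULATE DATA=WORK.EASY;
  CLASS DAYNUM SYSID;
  VAR CPUTIME;
  TABLES DAYNUM*(SYSID ALL),
         N PCTN<SYSID ALL> CPUTIME*(SUM PCTSUM<SYSID ALL>);
RUN;
```

By hand, day 1 CPU time splits 40/60 between systems A and B and day 2 splits 25/75, so a table showing anything else suggests that the denominator is not the one intended.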
PROC TABULATE: DOWNLOADING OUTPUT
• Transfer as text
• May help to eliminate cell separators with the NOSEPS and/or FORMCHAR= options
  – pro: this can make the output easier to work with in a spreadsheet
  – con: this can make the output confusing to interpret by obscuring inter-cell relationships
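A sketch of stripping separators before a download, assuming the SASHELP.CLASS sample table. NOSEPS drops the horizontal rules between rows in the body of the table, and a FORMCHAR= string of 11 blanks replaces all of the outline and divider characters with spaces.

```sas
/* FORMCHAR= takes 11 characters; here all 11 are blanks. */
PROC TABULATE DATA=SASHELP.CLASS NOSEPS FORMCHAR='           ';
  CLASS SEX AGE;
  VAR HEIGHT;
  TABLES SEX*AGE, HEIGHT*(N MEAN);
RUN;
```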
PROC TABULATE: LIMITATIONS
• Can consume large (perhaps excessive) amounts of CPU time
• Version 8 is the first to include an output dataset (the OUT= option)
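A sketch of the output dataset, assuming a release that supports OUT= and the SASHELP.CLASS sample table; WORK.TABOUT is an illustrative name. Printing the result is the quickest way to see which variables the procedure creates.

```sas
/* Capture the table's cells as observations in WORK.TABOUT. */
PROC TABULATE DATA=SASHELP.CLASS OUT=WORK.TABOUT;
  CLASS SEX AGE;
  VAR HEIGHT;
  TABLES SEX*AGE, HEIGHT*(N MEAN);
RUN;

/* Inspect the structure of the output dataset. */
PROC PRINT DATA=WORK.TABOUT;
RUN;
```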
PART II: SAS/GRAPH
SAS/GRAPH: OUTPUT
• Connects with output equipment using appropriate GOPTIONS and JCL
• Can print output directly from the mainframe or download to a PC and process with Adobe Acrobat®
• Can also output directly to Intranet/Internet
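A sketch of routing graphics to a file with GOPTIONS. The file name and the choice of the GIF device are assumptions; the drivers available, and the JCL or FILENAME details, depend on the installation.

```sas
/* Hypothetical output file; adjust the path for the host system. */
FILENAME GRAFOUT 'slide.gif';

/* Reset graphics options, pick a device, and point it at the file. */
GOPTIONS RESET=ALL DEVICE=GIF GSFNAME=GRAFOUT GSFMODE=REPLACE;

PROC GSLIDE;
  TITLE1 HEIGHT=3 'GOPTIONS TEST SLIDE';
RUN;
QUIT;
```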
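SAS/GRAPH: COORDINATES
• SAS/GRAPH measures coordinates from the lower left corner of the page, so larger Y values sit higher on the page (as in the ANNOTATE and GREPLAY examples later in this presentation)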
SAS/GRAPH: TEXT
• Can control any text in the output
  – headings
  – axes
  – footnotes
  – annotations
• Can control many aspects of text in the output
  – font
  – color
  – weight
  – slant
SAS/GRAPH: ANNOTATIONS
• Add text to the outputs from SAS/GRAPH PROCs
• Stored in a SAS dataset that includes
  – text of the annotation
  – a value to link it to the input for the PROC
  – formatting information
SAS/GRAPH: COLORS
• Can control colors of many elements of output, including
  – text
  – figures
  – borders
• Realization is device-dependent: colors may show up differently on, e.g., monitors and printers
• SAS can assign a default sequence of colors, or programmers can select sequences that they want
SAS/GRAPH: COLORS (cntd.)
• Can use several naming schemes:
  – SAS idiomatic color names (eight pages of choices in the Version 6 manual)
  – RGB (RED/GREEN/BLUE) scheme
  – HLS (HUE/LIGHTNESS/SATURATION) scheme
  – grey-scale scheme
SAS/GRAPH: PATTERNS
• SAS can assign a default sequence or programmers can determine a sequence
• Pattern designations other than SOLID and EMPTY depend on the direction and density of their lines and sometimes on the type of output
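A sketch of one color written in each naming scheme, using PATTERN statements only as a convenient place to put a color. The specific values are illustrative, and the way each one renders remains device-dependent.

```sas
/* SAS color name: VIR (vivid red) */
PATTERN1 VALUE=SOLID COLOR=VIR;

/* RGB: CX followed by two hex digits each for red, green, blue */
PATTERN2 VALUE=SOLID COLOR=CXFF0000;

/* HLS: H followed by hex hue, lightness, and saturation */
PATTERN3 VALUE=SOLID COLOR=H07880FF;

/* Grey scale: GRAY followed by a two-digit hex lightness */
PATTERN4 VALUE=SOLID COLOR=GRAY80;
```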
CODE EXAMPLE
  * ASSIGN SOME PATTERNS AND COLORS;
  * SOLID FILLS: ;
  PATTERN1 VALUE=SOLID COLOR=PAB;
  PATTERN2 VALUE=SOLID COLOR=BLUE;
  PATTERN3 VALUE=SOLID COLOR=GREEN;
  * SLANTED LINES: ;
  PATTERN4 VALUE=L2 COLOR=DER;
  * CROSS-HATCHING: ;
  PATTERN5 VALUE=X4 COLOR=ORANGE;
SAS/GRAPH: AXES
• Can create individual definitions of axes, then assign them in SAS/GRAPH PROCs
  – similar to creating SAS TITLEs
• Can assign attributes including length, scale, color, tick-markings, and attributes of text
• Vertical or horizontal status is assigned in the PROCs rather than the AXIS definitions
CODE EXAMPLE
  * AXIS DEFINITIONS;
  AXIS1 LABEL=(HEIGHT=3 'MAINFRAME PRINTERS')
        VALUE=(HEIGHT=1)
        ORDER=('TPK' '3800' 'PRT' 'PAYR' 'STP' 'OTH');
  AXIS2 ORDER=(0 TO &MAXPAGES BY 20000)
        LABEL=(' ')
        VALUE=(HEIGHT=1);
SAS/GRAPH: LEGENDS
• SAS/GRAPH has its own legend facility
• Can also use FOOTNOTEs as legends
  – can string SPECIAL-font K's (case-sensitive) together to produce color bands
CODE EXAMPLE
  FOOTNOTE1 HEIGHT=1 BOX=3
            FONT=SPECIAL COLOR=VIR   'KKK '
            FONT=SWISSB  COLOR=BLACK 'TOPEKA '
            FONT=SPECIAL COLOR=MOR   'KKK '
            FONT=SWISSB  COLOR=BLACK '3800 '
            FONT=SPECIAL COLOR=GREEN 'KKK '
            FONT=SWISSB  COLOR=BLACK 'OTHER';
SAS/GRAPH: THE G-PREFIX
• All SAS/GRAPH PROCs begin with G-
• Requires a special set of SAS options: GOPTIONS
SAS/GRAPH: PROC GSLIDE
• Great for text screens
  – every element may need an observation in an ANNOTATE dataset
CODE EXAMPLE
  * BUILD AN ANNOTATION DATASET FOR A SLIDE: ;
  DATA NOTES;
    SET ALLDAY;
    * OUTPUT THE DATE: ;
    X = 40;
    Y = 55 - (_N_ * 5);
    FUNCTION = 'LABEL';
    COLOR = 'BLACK';
    STYLE = 'SWISSB';
    TEXT = PUT(DATE,WORDDATE18.);
    OUTPUT;
CODE EXAMPLE (cntd.)
    * OUTPUT THE NUMBER OF PAGES ON THE SAME LINE AS ITS DATE: ;
    X = 90;
    Y = 55 - (_N_ * 5);
    FUNCTION = 'LABEL';
    IF COUNT >= 100000 THEN COLOR = 'RED';
    ELSE COLOR = 'GREEN';
    STYLE = 'SWISSB';
    TEXT = PUT(COUNT,COMMA9.);
    OUTPUT;
  RUN;
CODE EXAMPLE (cntd.)
  * TEXT SLIDE FROM THE PRECEDING ANNOTATIONS: ;
  PROC GSLIDE ANNOTATE=NOTES GOUT=PANELS;
    TITLE1 HEIGHT=3 'PAGES PRINTED, BY DAY';
SAS/GRAPH: PROC GSLIDE (cntd.)
• Also generates some graphics of its own (e.g., bar charts)
SAS/GRAPH: PROC GCHART
• Provides many standard graphing capabilities:
  – star charts
  – block charts
  – pie charts
  – bar charts
SAS/GRAPH: PROC GCHART
• BAR CHARTS
  – horizontal bars (HBAR)
  – vertical bars (VBAR)
CODE EXAMPLE
  PROC GCHART DATA=PRINTOUT;
    VBAR DEVNAME / ANNOTATE=LABELS
                   SUMVAR=PAGES
                   PATTERNID=MIDPOINT
                   AUTOREF
                   MAXIS=AXIS1
                   RAXIS=AXIS2
                   CFRAME=LIGR
                   COUTLINE=DAPB;
    TITLE1 "PAGES PRINTED BY DEVICE FROM &FMDATE TO &TODATE";
    FOOTNOTE1…
SAS/GRAPH: PROC GCHART (cntd.)
• BAR CHARTS
  – can group bars with the GROUP= option and stack them with the SUBGROUP= option of the HBAR and VBAR statements
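A sketch of grouped and stacked bars, assuming the SASHELP.CLASS sample table rather than the presenter's PRINTOUT data.

```sas
PROC GCHART DATA=SASHELP.CLASS;
  /* Grouped: one cluster of age bars for each value of SEX. */
  VBAR AGE / DISCRETE GROUP=SEX SUMVAR=HEIGHT TYPE=MEAN;
  RUN;
  /* Stacked: each age bar is divided into segments by SEX. */
  VBAR AGE / DISCRETE SUBGROUP=SEX SUMVAR=WEIGHT TYPE=SUM;
  RUN;
QUIT;
```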
SAS/GRAPH: PROC GCHART
• Interesting use of the term "midpoint" to mean "category"
  – not used in its strict mathematical sense to mean the half-way point between the two ends of a quantitative continuum
  – can be used for any value, grouped or elemental, of a numeric or character variable
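A sketch of the three kinds of "midpoint", assuming the SASHELP.CLASS sample table. The range 50 TO 75 BY 5 is an illustrative choice.

```sas
PROC GCHART DATA=SASHELP.CLASS;
  /* Character variable: every distinct value is a midpoint. */
  VBAR SEX;
  RUN;
  /* Numeric variable treated as categories: one bar per value. */
  VBAR AGE / DISCRETE;
  RUN;
  /* Numeric variable grouped into ranges centered on chosen midpoints. */
  VBAR HEIGHT / MIDPOINTS=50 TO 75 BY 5;
  RUN;
QUIT;
```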
SAS/GRAPH: PROC GPLOT
• PROC GCHART cannot connect values to each other; connecting them requires PROC GPLOT
SAS/GRAPH: PROC GPLOT
• Can use SAS's default symbols, or programmers can select which to use
• Can assign SYMBOL definitions comparably to AXIS or TITLE definitions
• Can use some pre-defined SAS symbols (e.g., DOT, SQUARE, DIAMOND) or can use letters, including their analogues in SPECIAL and MARKER fonts
SAS/GRAPH: PROC GPLOT
• Joining symbols: INTERPOLATION options
  – no joining required
  – lines (straight lines)
  – splines (curved lines)
  – needles
    join symbols to the x-axis rather than to each other
    can be made to look like bars on a bar graph
CODE EXAMPLE
  SYMBOL1 VALUE=NONE WIDTH=90 INTERPOL=NEEDLE COLOR=RED;
  SYMBOL2 VALUE=N HEIGHT=2 FONT=MARKER INTERPOL=JOIN COLOR=ORANGE;
  SYMBOL3 VALUE=O HEIGHT=2 FONT=MARKER INTERPOL=JOIN COLOR=VIR;
  SYMBOL4 VALUE=P HEIGHT=2 FONT=MARKER INTERPOL=JOIN COLOR=MOR;
  PROC GPLOT DATA=TRNSDATE GOUT=PANELS;
    PLOT (PRT STP TOP _3800)*ENDDATE / HAXIS=AXIS1 VAXIS=AXIS2 OVERLAY;
    TITLE1…;
    TITLE2…;
    FOOTNOTE1…;
SAS/GRAPH: PROC GPLOT
• Several plots on a single graph
  – OVERLAY option: several plots keyed to a single set of axes
  – PLOT2 statement: one or more plots keyed to the same x-axis as the first set of plots and to a second y-axis printed on the graph's right
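A sketch of a second y-axis with PLOT2. The dataset WORK.DAILY and its values are invented for illustration: PAGES is scaled on the left axis and JOBS, which has a much smaller range, on the right.

```sas
/* Made-up daily figures on two very different scales. */
DATA WORK.DAILY;
  INPUT DAY PAGES JOBS;
  DATALINES;
1 12000 40
2 18000 55
3 15000 47
4 21000 62
;
RUN;

SYMBOL1 INTERPOL=JOIN VALUE=DOT COLOR=BLUE;
SYMBOL2 INTERPOL=JOIN VALUE=SQUARE COLOR=RED;

PROC GPLOT DATA=WORK.DAILY;
  PLOT  PAGES*DAY;   /* left-hand y-axis  */
  PLOT2 JOBS*DAY;    /* right-hand y-axis */
RUN;
QUIT;
```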
SAS/GRAPH: PROC GREPLAY Great for presenting output from several SAS/GRAPH procedures as panels on a single page
SAS/GRAPH: PROC GREPLAY Use a template partly generated through PROC GSLIDE and assigned in PROC GREPLAY
SAS/GRAPH: PROC GREPLAY Requires an internal (to the SAS job) catalog of output from preceding procedures created by GOUT (graph output) clauses in preceding procedures
CODE EXAMPLE PROC GSLIDE GOUT=PANELS; TITLE1 HEIGHT=3 "THREE PANELS ON A PAGE"; RUN;
SAS/GRAPH: PROC GREPLAY (cntd.)
• The catalog is referenced in GREPLAY with the IGOUT= (input graphics catalog) option
CODE EXAMPLE (cntd.)
  PROC GREPLAY IGOUT=PANELS TC=TEMPCAT NOFS;
SAS/GRAPH: PROC GREPLAY Need to assign proper co-ordinates for each corner of each panel
CODE EXAMPLE (cntd.)
  * DEFINE A TEMPLATE;
  TDEF THREEPAN DES='THREE PANELS'
    /* PANEL 1--UPPER HALF */
    1/ LLX=0   LLY=50  ULX=0   ULY=95
       URX=100 URY=95  LRX=100 LRY=50
    /* PANEL 2--LOWER LEFT */
    2/ LLX=0   LLY=0   ULX=0   ULY=50
       URX=50  URY=50  LRX=50  LRY=0
    /* PANEL 3--LOWER RIGHT */
    3/ LLX=50  LLY=0   ULX=50  ULY=50
       URX=100 URY=50  LRX=100 LRY=0
    /* PANEL 4--FOR FRAMING THE REST */
    4/ LLX=0   LLY=0   ULX=0   ULY=100
       LRX=100 LRY=0   URX=100 URY=100;
CODE EXAMPLE (cntd.)
  * ASSIGN THE PRECEDING TEMPLATE FOR USE.;
  TEMPLATE=THREEPAN;
  * REPLAY PRECEDING OUTPUT INTO THE TEMPLATE'S PANELS.;
  TREPLAY 1:GSLIDE 2:GCHART 3:GCHART1 4:GSLIDE1;
OUTPUT EXAMPLES SCANNED BY MAURA SCHREIER-FLEMING
QUESTIONS?