1 SAS 1-liners SAS Coding Efficiencies. 2 Overview Less is more Always aim for robust, reusable and efficient code Coding efficiency versus processing.

Slides:



Advertisements
Similar presentations
Haas MFE SAS Workshop Lecture 3:
Advertisements

DT266/2 Information Systems COBOL Revision. Chapters 1 & 2 Hutty & Spence Divisions of a Cobol Program Identification Division Program-ID. Environment.
Sorting Rows. 2 home back first prev next last What Will I Learn? In this lesson, you will learn to: –Construct a query to sort a results set in ascending.
The SAS ® System Additional Information on Statistical Analysis Programming.
Examples from SAS Functions by Example Ron Cody
Outline Proc Report Tricks Kelley Weston. Outline Examples 1.Text that spans columnsText that spans columns 2.Patient-level detail in the titlesPatient-level.
Creating a Compact Columnar Output with PROC REPORT Walter R. Young Principal Clinical Programmer Analyst Wyeth.
I OWA S TATE U NIVERSITY Department of Animal Science Modifying and Combing SAS Data Sets (Chapter in the 6 Little SAS Book) Animal Science 500 Lecture.
I OWA S TATE U NIVERSITY Department of Animal Science Getting Started Using SAS Software Animal Science 500 Lecture No. 2.
1 Creating and Tweaking Data HRP223 – 2010 October 24, 2011 Copyright © Leland Stanford Junior University. All rights reserved. Warning: This.
Databases Lab 5 Further Select Statements. Functions in SQL There are many types of functions provided. The ones that are used most are: –Date and Time.
Introduction to SQL Session 1 Retrieving Data From a Single Table.
Basic And Advanced SAS Programming
Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc.
Chapter 18: Modifying SAS Data Sets and Tracking Changes 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
Data Cleaning 101 Ron Cody, Ed.D Robert Wood Johnson Medical School Piscataway, NJ.
Welcome to SAS…Session..!. What is SAS..! A Complete programming language with report formatting with statistical and mathematical capabilities.
INTRO TO PROGRAMMING Chapter 2. M-files While commands can be entered directly to the command window, MATLAB also allows you to put commands in text files.
Introduction to Access By Mary Ann Chaney and Alicia Harkleroad.
SAS SQL SAS Seminar Series
Chapter 10:Processing Macro Variables at Execution Time 1 STAT 541 © Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
Lecture 5 Sorting, Printing, and Summarizing Your Data.
HPR Copyright © Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and.
PHP meets MySQL.
Chapter 10 Selected Single-Row Functions Oracle 10g: SQL.
©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina Chapter 17 supplement: Review of Formatting Data STAT 541.
Chapter 15: Combining Data Horizontally 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
Visual Basic.NET BASICS Lesson 4 Mathematical Operators.
SAS Efficiency Techniques and Methods By Kelley Weston Sr. Statistical Programmer Quintiles.
A Brief Introduction to PROC TRANSPOSE prepared by Voytek Grus for
© Copyright 1992–2005 by Deitel & Associates, Inc. and Pearson Education Inc. All Rights Reserved. Tutorial 5 – Dental Payment Application: Introducing.
Fall 2006AE6382 Design Computing1 Control Statements in Matlab Topics IF statement and Logical Operators Switch-Case Disp() vs fprintf() Input() Statement.
1 Lab 2 and Merging Data (with SQL) HRP223 – 2009 October 19, 2009 Copyright © Leland Stanford Junior University. All rights reserved. Warning:
Grant Brown.  AIDS patients – compliance with treatment  Binary response – complied or no  Attempt to find factors associated with better compliance.
1 Efficient SAS Coding with Proc SQL When Proc SQL is Easier than Traditional SAS Approaches Mike Atkinson, May 4, 2005.
CS146 References: ORACLE 9i PROGRAMMING A Primer Rajshekhar Sunderraman
Code Generation. 2 Overview of presentation Goal Background Dynamic SQL Method Examples.
IFS Intro to Data Management Chapter 5 Getting More Than Simple Columns.
PROC FORMAT – Not Just Another Pretty Face. PROC FORMAT, because of its name, is most often used to change the appearance of data for presentation. But.
Tips & Tricks From your fellow SAS users 9/30/2004.
Chapter 1: Overview of SAS System Basic Concepts of SAS System.
Chapter 7: Macros in SAS  Macros provide for more flexible programming in SAS  Macros make SAS more “object-oriented”, like R 1 © Fall 2011 John Grego.
FOR MONDAY: Be prepared to hand in a one-page summary of the data you are going to use for your project and your questions to be addressed in the project.
SAS Basics. Windows Program Editor Write/edit all your statement here.
Computing with SAS Software A SAS program consists of SAS statements. 1. The DATA step consists of SAS statements that define your data and create a SAS.
Chapter 6 Concatenating SAS Data Sets and Creating Summary Reports Xiaogang Su Department of Statistics University of Central Florida.
Copyright © 2004, SAS Institute Inc. All rights reserved. SASHELP Datasets A real life example Barb Crowther SAS Consultant October 22, 2004.
An Introduction to Proc Transpose David P. Rosenfeld HR Consultant, Workforce Planning & Data Management City of Toronto.
2 Copyright © 2009, Oracle. All rights reserved. Restricting and Sorting Data.
HRP Copyright © Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and.
17b.Accessing Data: Manipulating Variables in SAS ®
Chapter 17 Supplement: Alternatives to IF-THEN/ELSE Processing STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South.
Chapter 21: Controlling Data Storage Space 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.
XP Tutorial 3 New Perspectives on JavaScript, Comprehensive1 Working with Arrays, Loops, and Conditional Statements Creating a Monthly Calendar.
SHRUG, F EB 2013: N ETWORKING EXERCISE Many Ways to Solve a SAS Problem.
Chapter 6: Modifying and Combining Data Sets  The SET statement is a powerful statement in the DATA step DATA newdatasetname; SET olddatasetname;.. run;
SAS ® 101 Based on Learning SAS by Example: A Programmer’s Guide Chapters 5 & 6 By Ravi Mandal.
Session 1 Retrieving Data From a Single Table
Data Manipulation in SAS
Chapter 10 Selected Single-Row Functions Oracle 10g: SQL
Chapter 6: Modifying and Combining Data Sets
Using the Set Operators
Using the Set Operators
Chapter 7: Macros in SAS Macros provide for more flexible programming in SAS Macros make SAS more “object-oriented”, like R Not a strong suit of text ©
Combining Data Sets in the DATA step.
Lab 2 and Merging Data (with SQL)
Let’s Talk About Variable Attributes
Lab 2 HRP223 – 2010 October 18, 2010 Copyright © Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected.
Using the Set Operators
Presentation transcript:

1 SAS 1-liners SAS Coding Efficiencies

2 Overview Less is more Always aim for robust, reusable and efficient code Coding efficiency versus processing efficiency

3 Dynamic Denominators Conditional clauses are very inefficient Try a ‘condition-less’ approach for different denominator values proc freq data=sasdata.adsl; table trt1pn / out=trttots; run; data _null_; set trttots; if trt1pn>.z then call symput(‘n’||compress(trt1pn),compress(count)); run;

4 Dynamic Denominators Can use a one line statement to ‘resolve’ the denominators proc freq data=sasdata.adsl noprint; table trt1pn*agegrp1 / out=agecats; run; data agecats; set agecats; if trt1pn>.z then denom=resolve('&n'||compress(trt1pn)); percent=count/denom*100; run; Even more efficient if trt1pn>.z then percent=count/resolve('&n'||compress(trt1pn))*100;

5 ‘Dummy’ Treatment Code Assignments Create artificial treatment values to be used prior to unblinding trt1pn=ceil(ranuni(0)*x); *** WHERE X CREATES X LEVELS OF TRT1PN. Create different treatment levels The code below distributes 3 groups at approximately 25%/50%/25% trt1pn=ceil(ranuni(0)*2)+ceil(ranuni(0)*2)-1;

6 Reversing or Re-arranging Values Reverse coding common in clinical trials For example, scoring survey data if var=4 then var=1; else if var=3 then var=2; else if var=2 then var=3; else if var=1 then var=4; Efficient processing-wise but many keystrokes

7 Reversing or Re-arranging Values Can use a simple one-liner to gain the same results if var>.z then var=abs(5-var); Use Proc SQL to dynamically determine levels proc sql noprint; create table d2 as select abs((count(distinct var)+1)-var) as var from d1; quit;

8 Reversing or Re-arranging Values Add a safeguard in case one level is missing altogether proc sql noprint; create table d2 as select abs(ceil((max(var)+count(distinct var))/2+1)-var) as var from d1; quit; Code switching example, more complex than straight ‘reversal’ *** RE-CODING TREATMENT IN THE FOLLOWING MANNER FOR THE JUN2009 DMC: (4=1, 3=2, 1=3, 2=4).; newtrt=abs(trt1pn-((trt1pn 2)*5));

9 Study ‘Day’ Calculation Determine a relative ‘day’ based on two date fields In clinical trials, the initial ‘study day’ is generally considered to begin at either randomization or dosing Study Day __|__________|____________|_____________|____________|________ A common approach would be: if visdt>=randdt>.z then stydy=visdt-randdt+1; else stydy=visdt-randdt;

10 Study ‘Day’ Calculation A 1-liner can be used instead! if visdt>.z & randdt>.z then stydy=visdt- randdt+(visdt>=randdt);

11 Using Formats Instead of Conditional Clauses Allows for code re-use with minimal overhead proc format library=work; invalue incat 0 -7 = = = = 4 65-high=5; run; data adsl; set adsl; age_cat=input(age,incat.); run;

12 Transposing Multiple Variables in a Single Proc Step Often transposition on more than one variable is necessary in order to derive header and sub-header columns for summaries Common approach: two PROC TRANSPOSE steps and a merge Efficient approach: One data step plus one PROC TRANSPOSE step

13 Transposing Multiple Variables in a Single Proc Step data score; set score; trt_race=(10*trt1pn)+race; run; proc sort data=demog; by trt_race; run; proc univariate data=demog noprint; by trt_race; var score; output out=univout mean=mean_score; run; proc transpose data=univout out=univout; id trt_race; var mean_score; run;

14 Transposing Multiple Variables in a Single Proc Step This results in columns of _11, _12, _13, _14, _15, _21, _22, _23, _24, & _25 containing mean values for SCORE for each treatment/race category Can amend for as many levels of each variable as needed

15 Datetime Calculation Creating datetime variables Hour or minute missing results in entire date missing Use technique below labdttm= dhms(labdate,max(0,hour(labtime)),max(0,minute(labtime)),0);

16 Identifying Special Characters Sometimes characters from sets not recognized by SAS make their way into SAS data sets Especially from Excel,.CSV files For example, line-feed characters Often do not show up visually but can affect conditional programming

17 Identifying Special Characters data random_csv(drop=i); set random_csv; %macro repair; %do i=1 %to 255; if ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz ')=0 then do; if chars(i)^=compress(chars(i),byte(&i)) then do; put "WAR" "NING: Special Character Found at byte=&i for variable " chars(i)= "Fixing!"; chars(i)=compress(chars(i),byte(&i)); end; %end; %mend repair; array chars (*) _character_; do i=1 to dim(chars); %repair; end; run;

18 Data Set Source Flag Identification data one; do i=1 to 6; output; end; run; data two; do i=7 to 12,16; output; end; run; data three; do i=13 to 15; output; end; run; data all; set one(in=in_1) two(in=in_2) three(in=in_3); source=sum((in_1=1),(in_2=1)*2,(in_3=1)*3); run;

19 Date Imputations Sometimes date imputation rules require imputing to the end of the month when the day is unknown and the end of the year when the both day & month are unknown data test; date="00/000/2008"; output; date="00/DEC/1996"; output; date="00/JAN/1996"; output; date="00/FEB/2000"; output; date="00/FEB/1999"; output; run;

20 Date Imputations data test; set test; if index(date,'00/000/')>0 then do; dateI=tranwrd(date,'00/000/','31/DEC/'); end; else if index(date,'00/')>0 then do; dateI=upcase(put(day(input("01"|| trim(left(put(month(input(compress(tranwrd(date,'00/','01/'), '/'),date7.))+1- ((index(upcase(date),'DEC')>0)*12),z2.)))|| trim(left(put(year (input(compress(tranwrd(date,'00/','01/'),'/'),date7.))+ (index(upcase(date),'DEC')>0),4.))),ddmmyy8.)-1),z2.)|| substr(trim(left(date)),3)); put "NOTE: Imputing start date day to beginning of month! " date= dateI=; end; run;

21 Date Imputations Results in… date dateI 00/000/ /DEC/ /DEC/ /DEC/ /JAN/ /JAN/ /FEB/ /FEB/ /FEB/ /FEB/1999 Creates an actual SAS date derived as the beginning of the month (day 01) for the month that follows the month given in the partial date being imputed Once this date has been created, subtracting 1 from the SAS date value always returns the last day of the preceding month (the month of the partial date)

22 Converting Letters to Numbers data test; do letter='A','B','C','D','E','F','Z'; output; end; run; data test; set test; letternum=indexc(tranwrd('ABCDEFGHIJKLMNOPQRSTUVWXYZ',trim(le ft(upcase(letter))),'*'),'*'); run; Replaces letter value within the entire alphabetic string with a star ‘*” INDEXC is used to identify the location within the alphabetic string of this arbitrary ‘*’ character

23 Acknowledgement SAS 1-liners Stephen Hunt, ICON Clinical Research, Redwood City, CA cc07.pdf