Increase Your Productivity by Doing Less

Next Presentation: Increase Your Productivity by Doing Less
Presenter: Arthur Tabachneck
Art holds a PhD from Michigan State University, has been a SAS user since 1974, is president of the Toronto Area SAS Society, and has received such recognitions as the SAS Customer Value Award (2008), SAS-L Hall of Fame (2011) and SAS Circle of Excellence (2012). In 2013 he was recognized as the first SAS Discussion Forum participant to be awarded more than 10,000 points.
Copyright © 2010, SAS Institute Inc. All rights reserved.

Increase Your Productivity by Doing Less
Arthur Tabachneck, Thornhill, ON, Canada
Xia Ke Shan, Beijing, China
Joe Whitehurst, Atlanta, GA
Robert Virgile, Lexington, MA

Our Truth in Advertising commitment
WARNING: The paper I'm about to present:
- was originally a minor subset of a major paper I presented in the Beyond the Basics section
- was written because we thought of the idea for the original paper too late to submit to SGF2013 but, due to a cancellation, SGF had an opening for a 10 minute Coder's Corner paper
- while clearly a disturbing example of extreme opportunism, may be the most significant 10 minute paper/presentation I've had the privilege of co-authoring
#SASGF11

What the %transpose() macro is
- It's a SAS macro
- Looks and feels like PROC TRANSPOSE
- Has all of the PROC TRANSPOSE options and statements, plus two additional ones
- Can run between 9 and 12 times, and as much as 200 times or more, faster than using PROC SORT and PROC TRANSPOSE

Have you ever had to flip a SAS dataset from being tall to being wide?
i.e., from:
  idnum  date     var1
  1      2001JAN  SD
         2001FEB  EF
         2001MAR  HK
  2               GH
         2001APR  MM
         2001MAY  JH

Flipping a SAS dataset
to:
  idnum  var1_2001JAN  var1_2001FEB  var1_2001MAR  var1_2001APR  var1_2001MAY
  1      SD            EF            HK
  2      GH                                        MM            JH

If you have, you are probably already familiar with PROC TRANSPOSE
Note to self: remember to first sort the data
  proc transpose data=have out=want (drop=_:) prefix=var1_;
    by idnum;
    var var1;
    id date;
  run;
Not difficult, but you need to know which are options and which are statements and, if needed, remember to sort your data beforehand
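As a minimal sketch of the sort-then-transpose sequence described above: the data values below are hypothetical and only loosely follow the slide's example, and the yymon7. format is an assumption chosen so that the ID values display as 2001JAN, 2001FEB and so on.

```sas
/* Hypothetical sample data, deliberately entered out of order */
data have;
  input idnum date :date9. var1 $;
  format date yymon7.;   /* assumed format: displays dates as 2001JAN etc. */
  datalines;
2 01APR2001 MM
1 01FEB2001 EF
1 01JAN2001 SD
2 01MAY2001 JH
1 01MAR2001 HK
;
run;

/* BY-group processing in PROC TRANSPOSE needs the data in idnum order */
proc sort data=have out=need;
  by idnum date;
run;

/* One output row per idnum; the formatted date values become column names */
proc transpose data=need out=want (drop=_:) prefix=var1_;
  by idnum;
  var var1;
  id date;
run;
```

Skipping the PROC SORT step here should make PROC TRANSPOSE stop with a "not sorted" error, which is the trap the slide's note to self is about.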

How many of you have ever:
- forgotten to run proc sort before running another proc that required sorted data?
- run proc sort but didn't include the options that can make the process more efficient (e.g., noequals, presorted and tagsort)?
- run a proc that only used a few of a file's variables, but didn't include a keep dataset option to limit the amount of data that had to be processed?
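For readers who have not used these options, here is a short sketch of where each one goes. It uses sashelp.class purely as a stand-in input; the variable choices are illustrative and not from the presentation.

```sas
/* KEEP= limits the variables read; PRESORTED asks PROC SORT to check    */
/* whether the input is already in BY order before doing any sorting     */
proc sort data=sashelp.class (keep=sex age name) out=work.need1 presorted;
  by sex age;
run;

/* NOEQUALS drops the guarantee that observations with equal BY values   */
/* keep their original relative order, which can save resources          */
proc sort data=sashelp.class (keep=sex age name) out=work.need2 noequals;
  by sex age;
run;

/* TAGSORT keeps only the BY variables and observation numbers in the    */
/* temporary sort files; it can reduce disk use on large files, possibly */
/* at the cost of longer run time                                        */
proc sort data=sashelp.class (keep=sex age name) out=work.need3 tagsort;
  by sex age;
run;
```

Which option helps depends on the file: on a table as small as sashelp.class none of them should make a measurable difference.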

Compare the performance of the following two sets of almost identical code, run on a file with 40,000 records and 1,002 variables
  PROC SORT data=have out=need;
    by idnum date;
  run;
took 2.41 seconds CPU time
  PROC TRANSPOSE data=need out=want (drop=_:) prefix=var1_Qtr;
    by idnum;
    var var1;
    id date;
    format date Qtr1.;
  run;
took 0.74 seconds CPU time

Compare the performance of the following two sets of almost identical code, run on a file with 40,000 records and 1,002 variables
  PROC SORT data=have (keep=idnum date var1) out=need noequals;
    by idnum date;
  run;
took 0.33 seconds CPU time
  PROC TRANSPOSE data=need out=want (drop=_:) prefix=var1_Qtr;
    by idnum;
    var var1;
    id date;
    format date Qtr1.;
  run;
took 0.16 seconds CPU time
6.39 times faster

More importantly, compare the real time performance of running the two versions of PROC SORT
  PROC SORT data=have out=need;
    by idnum date;
  run;
took 109.54 seconds real time
  PROC SORT data=have (keep=idnum date var1) out=need noequals;
    by idnum date;
  run;
took 0.57 seconds real time
192.2 times faster
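The slides do not say how the timings were collected. One way to run a similar comparison is to read the "real time" and "cpu time" notes that SAS writes to the log after each step; the FULLSTIMER system option adds further detail such as memory use. The sketch below uses sashelp.class as a stand-in, so the numbers it yields will be tiny and not comparable to those on the slides.

```sas
/* Ask SAS to log detailed resource statistics for each step */
options fullstimer;

/* Version 1: sort every variable */
proc sort data=sashelp.class out=work.need_all;
  by sex age;
run;

/* Version 2: sort only the variables that are needed, with NOEQUALS */
proc sort data=sashelp.class (keep=sex age name) out=work.need_few noequals;
  by sex age;
run;

/* Return to the default level of timing detail */
options nofullstimer;
```

Real time includes I/O waits, which is presumably why dropping roughly a thousand unneeded variables shows up far more strongly there than in CPU time.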

You can achieve the same performance gains by running the following code
  %transpose(data=have, out=want (drop=_:), prefix=var1_Qtr, by=idnum,
             var=var1, id=date, sort=yes, format=date Qtr1.)
The macro's benefits
- less code, thus less chance for error
- same efficiency as the optimized code
- no need to know which PROC TRANSPOSE features are options and which are statements
- further code reductions possible by setting common defaults for the named parameters
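One possible reading of the last benefit, sketched as a thin wrapper: %flip is a hypothetical name, the default values are assumptions made for illustration, and the code presumes that the authors' %transpose macro has already been compiled and that a dataset named have exists. The same effect could also be had by editing the defaults in the %transpose header itself.

```sas
/* Hypothetical wrapper: supplies site-specific defaults so that a     */
/* typical call only has to name what differs from the usual case      */
%macro flip(data=have, out=want (drop=_:), by=idnum, id=date,
            var=, prefix=, format=, sort=yes);
  %transpose(data=&data., out=&out., by=&by., id=&id., var=&var.,
             prefix=&prefix., format=&format., sort=&sort.)
%mend flip;

/* Intended to be equivalent to the full %transpose call shown above */
%flip(var=var1, prefix=var1_Qtr, format=date Qtr1.)
```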

How the %transpose() macro works
First, all of the parameters are declared:
  %macro transpose(data=, out=, var=, prefix=, suffix=, let=, name=, label=,
                   sort_options=, id=, sort=, idlabel=, delimiter=, by=,
                   copy=, format=);
These are the same options and statements you would use with PROC TRANSPOSE, plus two new named parameters (sort and sort_options)

If &var is null, populate it with all numeric variables
  data _temp;
    set &data. (obs=1 drop=&by. &id. &copy.);
  run;
  %if %length(&var.) eq 0 %then %do;
    proc sql noprint;
      select name into :var separated by " "
        from dictionary.columns
          where libname="WORK" and memname="_TEMP" and type="num";
    quit;
  %end;
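The dictionary.columns lookup can be tried outside the macro. This sketch runs the same kind of query against sashelp.class, which is chosen only because it ships with SAS; numvars is a hypothetical macro variable name.

```sas
/* Collect the names of all numeric variables in SASHELP.CLASS.       */
/* LIBNAME and MEMNAME are stored in upper case in dictionary tables; */
/* TYPE is stored as lower-case "num" or "char".                      */
proc sql noprint;
  select name into :numvars separated by " "
    from dictionary.columns
      where libname="SASHELP" and memname="CLASS" and type="num";
quit;

/* Write the resulting list to the log */
%put NOTE: numeric variables found: &numvars.;
```

The macro applies the same query to a one-observation copy of the input from which the by, id and copy variables have been dropped, so only candidate analysis variables end up in &var.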

If the sort parameter eq 'yes', run proc sort
  %if %sysfunc(upcase("&sort.")) eq "YES" %then %do;
    proc sort data=&data. (keep=&by. &id. &var. &copy.) out=_temp &sort_options. noequals;
      by &by. &id.;
    run;
    %let data=_temp;
  %end;

Run PROC TRANSPOSE using all of the specified parameters
  proc transpose data=&data. (keep=&by. &id. &var. &copy.)
    %if %length(&delimiter.) gt 0 %then delimiter=&delimiter.;
    %if %length(&label.) gt 0 %then label=&label.;
    %if %length(&let.) gt 0 %then let;
    %if %length(&name.) gt 0 %then name=&name.;
    %if %length(&out.) gt 0 %then out=&out.;
    %if %length(&prefix.) gt 0 %then prefix=&prefix.;
    %if %length(&suffix.) gt 0 %then suffix=&suffix.;
    ;
    %if %length(&by.) gt 0 %then by &by.;;
    %if %length(&copy.) gt 0 %then copy &copy.;;
    %if %length(&id.) gt 0 %then id &id.;;
    %if %length(&idlabel.) gt 0 %then idlabel &idlabel.;;
    %if %length(&var.) gt 0 %then var &var.;;
    %if %length(&format.) gt 0 %then format &format.;;
  run;
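The doubled semicolons above are easy to misread. The first one ends the macro-level %if ... %then statement; the second is passed through as the terminator of the generated SAS statement (or as a harmless null statement when the parameter is empty). A small sketch of the same pattern, using a hypothetical %show macro and sashelp.class:

```sas
/* Hypothetical macro: emits VAR and WHERE statements only when the */
/* corresponding parameter was supplied                             */
%macro show(data=, var=, where=);
  proc print data=&data.;
    /* first ";" ends %then, second ";" ends the generated statement */
    %if %length(&var.) gt 0 %then var &var.;;
    %if %length(&where.) gt 0 %then where &where.;;
  run;
%mend show;

/* Only the VAR statement is generated here */
%show(data=sashelp.class, var=name age)

/* Both statements are generated here */
%show(data=sashelp.class, var=name age, where=age gt 12)
```

Options on the PROC TRANSPOSE statement itself use a single semicolon after each %then because they are fragments of one statement, which is closed by the lone ";" that follows them.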

Finally, delete the temporary file
  proc delete data=work._temp;
  run;
  %mend transpose;

Potential Applications
- transposing files more quickly
- using the macro as a template for creating similar macros, which incorporate sort and efficiency options, for other SAS procs that could benefit from the approach
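To make the template idea concrete, here is a sketch of how the same structure might be applied to PROC MEANS with a BY statement. %bymeans and its parameters are hypothetical and are not part of the authors' code; the temporary dataset name _temp simply mirrors the one used in %transpose.

```sas
/* Hypothetical macro built on the %transpose pattern: optionally sort */
/* only the needed variables, run the proc, then clean up              */
%macro bymeans(data=, out=, by=, var=, sort=yes);
  %local src;
  %let src=&data.;
  %if %upcase(&sort.) eq YES %then %do;
    proc sort data=&data. (keep=&by. &var.) out=_temp noequals;
      by &by.;
    run;
    %let src=_temp;
  %end;
  /* MEAN= with no names stores each mean under its original variable name */
  proc means data=&src. (keep=&by. &var.) noprint;
    by &by.;
    var &var.;
    output out=&out. mean=;
  run;
  %if %upcase(&sort.) eq YES %then %do;
    proc delete data=work._temp;
    run;
  %end;
%mend bymeans;

/* Mean height and weight by sex; SASHELP.CLASS is not sorted by sex */
%bymeans(data=sashelp.class, out=work.avg, by=sex, var=height weight)
```

As with %transpose, the caller no longer has to remember the sort, the keep= option, or the cleanup step.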

Presentation Overview
- How this paper came about ✓
- What the %transpose macro is ✓
- The macro's benefits ✓
- How the macro works ✓
- Potential applications of the macro ✓

Questions?

Your comments and questions are valued and encouraged
Contact the Authors
- Arthur Tabachneck, Ph.D., President, myQNA, Inc., Thornhill, ON; art297@rogers.com; code name: art297
- Joe Whitehurst, High Impact Technologies, Atlanta, GA; joewhitehurst@gmail.com; code name: joe whitehurst
- Robert Virgile, Robert Virgile Associates, Inc., Lexington, MA; rvirgile@verizon.net; code name: astounding
- Xia Ke Shan, Beijing, China; keshan.xia@gmail.com; code name: ksharp