1
Case Study: Using Base SAS® to Automate Quality Checks of Excel® Workbooks That Have Multiple Worksheets
Lisa Mendez, PhD & Andrew Kuligowski
2
Overview
The Process
Determine how to identify the smaller problems within the larger, overwhelming problem
Solve each problem using SAS code
Implement the code
Lessons learned
3
Background
Unfamiliar with the data – thrown into the deep end
There were 5 markets: ADHD, BNZD, CNNB, CDNE, and PAIN
Each market had 7 Excel workbooks that needed to be checked
Each workbook had multiple worksheets:
ADHD – 7 worksheets
BNZD – 24 worksheets
CNNB – 7 worksheets
CDNE – 5 worksheets
PAIN – 55 worksheets
4
The Overarching Problem
Let’s do the math!
5 markets multiplied by 7 workbooks (35 workbooks) that had a total of 98 worksheets that needed to be checked
That’s 3,430 worksheets
FOR 27 QUARTERS!!!!!
For a grand total of 92,610 worksheets
That can be just a little overwhelming…
5
Getting the Data into SAS®
XLSX Engine
Allows you to read and write Microsoft Excel files as if they were data sets in a library
Advantage is that it accesses the XLSX file directly; it does not use the Microsoft data APIs as a go-between
You must have a license for SAS/ACCESS to PC Files to use the XLSX engine
In SAS University Edition, the SAS/ACCESS product is part of the package
libname Cadhd1 XLSX "C:\Users\lmendez\Documents\RMPDC\Deliverables2017_Q2\ADHD\RMPD_Patient Tracking_ADHD_NDW_2018Q2.xlsx";
6
Getting the Data into SAS®
libname Cadhd1 XLSX "C:\Users\lmendez\Documents\RMPDC\Deliverables2017_Q2\ADHD\RMPD_Patient Tracking_ADHD_NDW_2018Q2.xlsx";
The libname statement sets up the datasets, and you will see them in the cadhd1 library, but the datasets will be empty
Names of datasets are the names of the worksheets
7
Loading the Data Using PROC SQL & SAS Dictionary tables
8
Loading the Data
Note: The library name must be in all caps in the WHERE clause: where libname="CADHD1"
9
Loading the Data
10
Loading the Data
11
Loading the Data
Macro variables (will be used in the macro)
Output from the log:
%put &snamlist_1; /* show the macro variable snamlist_1 in the log */
LOOKUP*STATE_SUBGRP*STATE_SUPERGRP*ZIP_SUBGRP_AMPH*ZIP_SUBGRP_METH*ZIP_SUBGRP_OTH_ANAL*ZIP_SUBGRP_OTH_ANTI*ZIP_SUPER
%put &n_1; /* show the macro variable n_1 in the log */
8
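The two macro variables shown in the log could be built along the following lines. This is only a sketch of one common approach, assuming the cadhd1 libname has already been assigned; it is not necessarily the code shown on the original slides.

```sas
proc sql noprint;
    /* Worksheet names for the workbook, joined into one '*'-delimited string */
    select memname
        into :snamlist_1 separated by '*'
        from dictionary.tables
        where libname = "CADHD1";   /* library names are stored in uppercase */

    /* Number of worksheets found in the workbook */
    select count(*)
        into :n_1 trimmed
        from dictionary.tables
        where libname = "CADHD1";
quit;

%put &snamlist_1;   /* show the worksheet list in the log */
%put &n_1;          /* show the worksheet count in the log */
```

The '*' delimiter is a reasonable choice here because it cannot appear in a SAS dataset name, so it is safe to split on later.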
12
Loading the Data SAS Macro
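A macro that walks the '*'-delimited list and copies each worksheet into the WORK library might look like the sketch below. The macro name load_sheets and its parameters are hypothetical, and the sketch assumes every worksheet name is already a valid SAS dataset name.

```sas
%macro load_sheets(lib=, list=, n=);
    %local i sheet;
    %do i = 1 %to &n;
        /* Pull the i-th worksheet name out of the '*'-delimited list */
        %let sheet = %scan(&list,&i,*);

        /* Read the worksheet through the XLSX libname into WORK */
        data work.&sheet;
            set &lib..&sheet;
        run;
    %end;
%mend load_sheets;

%load_sheets(lib=cadhd1, list=&snamlist_1, n=&n_1)
```

Worksheet names that contain spaces or special characters would need name literals (for example 'My Sheet'n), which this sketch does not handle.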
13
Validating Worksheet & Variable Names
Need templates to compare
Load templates each quarter
Ensure permanent template library (libname statement)
By Market
List of variable names
List of worksheet names
14
Validating Worksheet Names
Once templates are loaded, compare worksheet names
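One way to express this comparison is a full join between the template list and the current worksheet names. In the sketch below, the library tmplt, the dataset tmplt.adhd_sheets, and its variable sheet_name are all hypothetical stand-ins for the permanent template library, which is assumed to be assigned already.

```sas
proc sql;
    create table work.sheet_check as
    select coalesce(t.sheet_name, c.memname) as sheet_name length=32,
           case
               when t.sheet_name is missing then 'Not in template'
               when c.memname is missing    then 'Missing from workbook'
               else 'Match'
           end as status length=21
    from tmplt.adhd_sheets as t          /* hypothetical template dataset */
         full join
         (select memname
              from dictionary.tables
              where libname = 'CADHD1') as c
         /* Compare in uppercase so case differences are not flagged */
         on upcase(t.sheet_name) = upcase(c.memname);
quit;
```

A full join is used so that both kinds of problems show up: worksheets the template expects but the workbook lacks, and worksheets in the workbook that the template does not list.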
15
Validating Worksheet Names
Dataset created after PROC SQL compare for Worksheet Names
All worksheet names match – no errors
16
Validating Worksheet Names
Create an error report
17
Validating Variable Names
Once templates are loaded, compare variable names
Use PROC CONTENTS to get a current list of variable names
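The variable-name check can follow the same pattern. The sketch below assumes a hypothetical template dataset tmplt.adhd_vars with variables sheet_name and var_name; the PROC CONTENTS output variables memname and name are standard.

```sas
/* Current variable names for every worksheet in the workbook */
proc contents data=cadhd1._all_ noprint
    out=work.cur_vars (keep=memname name);
run;

/* Keep only the rows where the template and the workbook disagree */
proc sql;
    create table work.var_errors as
    select t.sheet_name as template_sheet,
           t.var_name   as template_var,
           c.memname    as current_sheet,
           c.name       as current_var
    from tmplt.adhd_vars as t            /* hypothetical template dataset */
         full join work.cur_vars as c
         on  upcase(t.sheet_name) = upcase(c.memname)
         and upcase(t.var_name)   = upcase(c.name)
    where t.var_name is missing or c.name is missing;
quit;
```

Renaming the columns with AS inside the SELECT keeps the template and current names apart in the output dataset.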
18
Validating Worksheet & Variable Names
Dataset created after PROC SQL compare for Variable Names
Everything matches
Note: change variable names either before PROC SQL, or in the PROC SQL statement
19
Exporting Error Report to Excel®
The macro variable ‘x’ is used to number the reports that correspond with each workbook
20
Exporting Error Report to Excel®
Used within a macro
One Excel file per Market
Multiple worksheets for each workbook checked
No errors for this workbook
Each worksheet corresponds to a workbook
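As a rough sketch of how a numbered report sheet could be written, the macro below exports one dataset to a sheet named after the workbook number x. The macro name, its parameters, the sheet naming scheme, and the dataset work.var_errors are assumptions made for illustration.

```sas
%macro export_report(ds=, x=, outfile=);
    /* Write the error dataset to its own sheet in the market's Excel file */
    proc export data=&ds
        outfile="&outfile"
        dbms=xlsx
        replace;
        sheet="Workbook_&x";   /* one sheet per workbook checked */
    run;
%mend export_report;

/* Example call; &report_file is a hypothetical macro variable holding the path
   %export_report(ds=work.var_errors, x=1, outfile=&report_file)
*/
```

Calling the macro once per workbook with x = 1, 2, 3, and so on against the same outfile should build up one Excel file per market with one sheet per workbook.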
21
Exporting Error Report to Excel®
Sample of worksheet error
22
Exporting Error Report to Excel®
Lesson learned: If there are no errors, either skip the output or write a “no error” message, because most of the workbooks do not have variable name or worksheet name errors
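The lesson above can be sketched as a guard that counts the rows in the error dataset before exporting. The macro name and parameters are hypothetical.

```sas
%macro export_if_errors(ds=, outfile=, sheet=);
    %local nerr;

    /* Count the rows in the error dataset */
    proc sql noprint;
        select count(*) into :nerr trimmed from &ds;
    quit;

    %if &nerr > 0 %then %do;
        /* Only write a report when there is something to report */
        proc export data=&ds outfile="&outfile" dbms=xlsx replace;
            sheet="&sheet";
        run;
    %end;
    %else %put NOTE: No errors found in &ds, so no report was written.;
%mend export_if_errors;
```

Writing a note to the log keeps a record that the check ran even when no Excel output is produced.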
23
Validating Data
A macro variable was created, using the same methods as before, for all the worksheet/dataset names
The macro variable was used in conjunction with a macro to execute a data step multiple times to check all the data within a worksheet/dataset
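The slides do not say which value checks were run, so the sketch below uses a purely illustrative rule (flag numeric values that are missing or negative) to show the looping pattern. The macro name, the err_ prefix, and the rule itself are assumptions, and the worksheets are assumed to have been loaded into WORK already.

```sas
%macro check_values(list=, n=);
    %local i sheet;
    %do i = 1 %to &n;
        %let sheet = %scan(&list,&i,*);

        /* One exception dataset per worksheet; err_ prefix is hypothetical */
        data work.err_&sheet;
            set work.&sheet;
            array nums {*} _numeric_;   /* all numeric variables read so far */
            length bad_var $32;
            do _j = 1 to dim(nums);
                /* Illustrative rule only: missing or negative values */
                if missing(nums{_j}) or nums{_j} < 0 then do;
                    bad_var = vname(nums{_j});
                    output;
                end;
            end;
            drop _j;
        run;
    %end;
%mend check_values;

%check_values(list=&snamlist_1, n=&n_1)
```

Each output row repeats the offending record along with the name of the variable that failed, so a single record can appear more than once.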
24
Validating Data
25
Validating Data
Similar code was written to check the products within a workbook
A pre-loaded template was used to ensure the correct products were in the correct worksheet/dataset
A macro was used, along with a data step and a PROC SQL step, to compare product names in the pre-loaded template with the product names of the current data
26
Validating Data
An exception report was created for the values check
Utilized lesson learned from previous Excel export
For these exception reports, an MS Excel workbook was created for a worksheet only if errors were found
27
Exporting Error Report to Excel®
28
Exporting Error Report to Excel®
29
Deleting Datasets
Many macros are used to create many datasets in the process of checking one workbook
To ensure there is enough space in the SAS session, PROC DATASETS is used to clean up the libraries used in the program
30
Deleting Datasets
To delete all files in a SAS data library at one time, use the KILL option
CAUTION: The KILL option deletes all members of the library immediately after the statement is submitted
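For reference, a minimal sketch of both cleanup styles is shown below. The err_ prefix is a hypothetical naming convention for intermediate datasets; WORK is used as the example library.

```sas
/* Delete only the intermediate datasets whose names start with ERR_ */
proc datasets library=work nolist;
    delete err_: ;
quit;

/* Delete every member of WORK at once; this cannot be undone */
proc datasets library=work kill nolist;
quit;
```

The targeted DELETE is the safer choice when the library also holds datasets that later steps still need.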
31
Conclusion
When faced with an overwhelming task, break it down
Solve one problem at a time
Doing research online may help provide different solutions
Find one that works for your problem and that YOU prefer
Don’t be afraid to code your program and do some steps that are not as efficient (“down and dirty”)
When utilizing macros, get the program to work before coding the macro(s)
Enhance your program for efficiency when you have more time
32
Contact Information
Name: Lisa Mendez
Company: IQVIA GS
Name: Andrew Kuligowski
Company: HSN