Lisa Mendez, PhD & Andrew Kuligowski


1 Lisa Mendez, PhD & Andrew Kuligowski
Case Study: Using Base SAS® to Automate Quality Checks of Excel® Workbooks that have Multiple Worksheets
Lisa Mendez, PhD & Andrew Kuligowski

2 Overview
The process
Determine how to identify the smaller problems within the larger, overwhelming problem
Solve each problem using SAS code
Implement the code
Lessons learned

3 Background
Unfamiliar with the data – thrown into the deep end
There were 5 markets: ADHD, BNZD, CNNB, CDNE, and PAIN
Each market had 7 Excel workbooks that needed to be checked
Each workbook had multiple worksheets, and the count varied by market:
ADHD – 7 worksheets
BNZD – 24 worksheets
CNNB – 7 worksheets
CDNE – 5 worksheets
PAIN – 55 worksheets

4 The Overarching Problem
Let’s do the math!
5 markets multiplied by 7 workbooks (35 workbooks), with a total of 98 worksheets that needed to be checked
That’s 3,430 worksheets
FOR 27 QUARTERS!!!!!
For a grand total of 92,610 worksheets
That can be just a little overwhelming…

5 Getting the Data into SAS®
XLSX Engine
Allows you to read and write Microsoft Excel files as if they were data sets in a library
The advantage is that it accesses the XLSX file directly; it does not use the Microsoft data APIs as a go-between
You must have a license for SAS/ACCESS to PC Files to use the XLSX engine
In SAS University Edition, the SAS/ACCESS product is part of the package
libname Cadhd1 XLSX "C:\Users\lmendez\Documents\RMPDC\Deliverables2017_Q2\ADHD\RMPD_Patient Tracking_ADHD_NDW_2018Q2.xlsx";

6 Getting the Data into SAS®
libname Cadhd1 XLSX "C:\Users\lmendez\Documents\RMPDC\Deliverables2017_Q2\ADHD\RMPD_Patient Tracking_ADHD_NDW_2018Q2.xlsx";
The libname statement sets up the datasets, and you will see them in the cadhd1 library, but the datasets will be empty
Names of datasets are the names of the worksheets

7 Loading the Data Using PROC SQL & SAS Dictionary tables

8 Loading the Data
Note: the library name must be in all caps in the WHERE clause, for example where libname="CADHD1"
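The code shown on this slide did not survive in the transcript. A minimal sketch of the technique it names, assuming the worksheet names are pulled from DICTIONARY.TABLES into the macro variables snamlist_1 and n_1 that appear in the log output (the '*' separator is taken from that output; the rest is an assumption, not the authors' code):

```sas
/* Sketch: collect the worksheet (member) names of the CADHD1 library. */
/* DICTIONARY.TABLES stores LIBNAME in upper case, so the literal in   */
/* the WHERE clause must be upper case as well.                        */
proc sql noprint;
   select memname
      into :snamlist_1 separated by '*'
      from dictionary.tables
      where libname = "CADHD1";
quit;

/* SQLOBS holds the number of rows returned by the SELECT above */
%let n_1 = &sqlobs;

%put &snamlist_1;
%put &n_1;
```

Building the list with SEPARATED BY lets a later macro walk through the names one at a time with %SCAN.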

9 Loading the Data

10 Loading the Data

11 Loading the Data
Macro variables (will be used in the macro)
Output from the log:
%put &snamlist_1; /* show the macro variable snamlist in the log */
LOOKUP*STATE_SUBGRP*STATE_SUPERGRP*ZIP_SUBGRP_AMPH*ZIP_SUBGRP_METH*ZIP_SUBGRP_OTH_ANAL*ZIP_SUBGRP_OTH_ANTI*ZIP_SUPER
%put &n_1; /* show the macro variable n_1 in the log */
8

12 Loading the Data
SAS Macro
%put &n_1; /* show the macro variable n_1 in the log */
8
LOOKUP*STATE_SUBGRP*STATE_SUPERGRP*ZIP_SUBGRP_AMPH*ZIP_SUBGRP_METH*ZIP_SUBGRP_OTH_ANAL*ZIP_SUBGRP_OTH_ANTI*ZIP_SUPER
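The macro itself is not in the transcript. One plausible shape, assuming it loops over the '*'-delimited list and copies each worksheet into WORK (the macro name load_sheets and its parameters are hypothetical):

```sas
/* Sketch: copy every worksheet named in a '*'-delimited list into WORK. */
/* Assumes the worksheet names are valid SAS names, as in the log above. */
%macro load_sheets(lib=, list=, n=);
   %local i sheet;
   %do i = 1 %to &n;
      %let sheet = %scan(&list, &i, *);   /* i-th name in the list */
      data work.&sheet;
         set &lib..&sheet;                /* read through the XLSX libref */
      run;
   %end;
%mend load_sheets;

%load_sheets(lib=Cadhd1, list=&snamlist_1, n=&n_1)
```

Worksheet names containing spaces or special characters would need name literals (for example 'My Sheet'n), which this sketch does not handle.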

13 Validating Worksheet & Variable Names
Need templates to compare against
Load templates each quarter
Ensure permanent template library (libname statement)
By market:
List of variable names
List of worksheet names

14 Validating Worksheet Names
Once templates are loaded, compare worksheet names
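The comparison code is not in the transcript. A minimal sketch, assuming the template is a data set with one row per expected worksheet (the library tmplt, the data set adhd_sheets, and the output table work.sheet_check are hypothetical names):

```sas
/* Hypothetical template: tmplt.adhd_sheets, with the expected worksheet */
/* name in a character variable called memname.                          */
proc sql;
   create table work.sheet_check as
   select coalesce(t.memname, c.memname) as sheet_name length=32,
          case
             when c.memname is missing then 'In template, not in workbook'
             when t.memname is missing then 'In workbook, not in template'
             else 'Match'
          end as status length=28
   from tmplt.adhd_sheets as t
        full join
        (select memname
            from dictionary.tables
            where libname = "CADHD1") as c
        on upcase(t.memname) = c.memname;
quit;
```

A full join keeps names that appear on only one side, so both missing and unexpected worksheets show up in the result.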

15 Validating Worksheet Names
Dataset created after PROC SQL compare for worksheet names
All worksheet names match – no errors

16 Validating Worksheet Names
Create an error report

17 Validating Variable Names
Once templates are loaded, compare variable names
Use PROC CONTENTS to get a current list of variable names
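The slide's code is not in the transcript. A sketch of the two steps it describes, assuming a template data set with one row per expected worksheet and variable pair (tmplt.adhd_vars, work.current_vars, and work.var_check are hypothetical names):

```sas
/* Current variable names for every worksheet in the workbook library */
proc contents data=Cadhd1._all_
              out=work.current_vars(keep=memname name)
              noprint;
run;

/* Hypothetical template: tmplt.adhd_vars with columns memname and name */
proc sql;
   create table work.var_check as
   select coalesce(t.memname, c.memname) as sheet_name length=32,
          coalesce(t.name, c.name)       as var_name   length=32,
          case
             when c.name is missing then 'In template, not in workbook'
             when t.name is missing then 'In workbook, not in template'
             else 'Match'
          end as status length=28
   from tmplt.adhd_vars as t
        full join work.current_vars as c
        on  upcase(t.memname) = upcase(c.memname)
        and upcase(t.name)    = upcase(c.name);
quit;
```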

18 Validating Worksheet & Variable Names
Dataset created after PROC SQL compare for variable names
Everything matches
Note: change variable names either before PROC SQL, or in the PROC SQL statement

19 Exporting Error Report to Excel®
The macro variable ‘x’ is used to number the reports that correspond with each workbook
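The export code is not in the transcript. A minimal sketch of how a counter such as x could number the report sheets, assuming PROC EXPORT with DBMS=XLSX is used (the macro name, the data set work.sheet_errors, and the output path are all hypothetical):

```sas
/* Sketch: write one report sheet per workbook checked, numbered by x */
%macro export_report(ds=, x=);
   proc export data=&ds
               outfile="C:\temp\ADHD_error_report.xlsx"  /* hypothetical path */
               dbms=xlsx
               replace;
      sheet="Workbook_&x";   /* sheet name carries the workbook number */
   run;
%mend export_report;

%export_report(ds=work.sheet_errors, x=1)
```

With DBMS=XLSX, exporting to an existing file under a new sheet name adds a sheet to that file, which fits the one-file-per-market layout described on the next slide.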

20 Exporting Error Report to Excel®
Used within a macro
One Excel file per market
Multiple worksheets, one for each workbook checked
Each worksheet corresponds to a workbook
No errors for this workbook

21 Exporting Error Report to Excel®
Sample of worksheet error

22 Exporting Error Report to Excel®
Lessons learned: either do not output anything when there are no errors, or output a “no error” message, because most of the workbooks do not have variable name or worksheet name errors

23 Validating Data
A macro variable was created, using the same methods as before, for all the worksheet/dataset names
The macro variable was used in conjunction with a macro to execute a data step multiple times to check all the data within a worksheet/dataset
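The transcript does not say which checks the data step performed. As an illustration only, the sketch below flags rows that contain any missing value; the macro name check_sheets, the chk_ output names, and the missing-value rule are assumptions, not the authors' checks:

```sas
/* Sketch: run the same DATA step once per worksheet in the list */
%macro check_sheets(lib=, list=, n=);
   %local i sheet;
   %do i = 1 %to &n;
      %let sheet = %scan(&list, &i, *);
      data work.chk_&i;                   /* numbered to stay within 32 characters */
         length sheet_name $32;
         set &lib..&sheet;
         sheet_name = "&sheet";           /* record which worksheet the row came from */
         if cmiss(of _all_) > 0 then output;   /* keep only rows with a missing value */
      run;
   %end;
%mend check_sheets;

%check_sheets(lib=work, list=&snamlist_1, n=&n_1)
```

Real checks would replace the CMISS test with rules specific to each market's data.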

24 Validating Data

25 Validating Data
Similar code was written to check the products within a workbook
A pre-loaded template was used to ensure the correct products were in the correct worksheet/dataset
A macro was used, along with a data step and a PROC SQL step, to compare product names in the pre-loaded template with the product names of the current data
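The product check is not shown in the transcript. A sketch of the data step plus PROC SQL pattern described above, assuming a character variable named product and a template with columns sheet_name and product stored in upper case (all of these names are hypothetical):

```sas
/* Sketch: list products in the current data that the template does not expect */
%macro check_products(lib=, sheet=, template=);
   /* Standardize the current product names before comparing */
   data work.current_products;
      set &lib..&sheet(keep=product);
      product = upcase(strip(product));
   run;

   proc sql;
      create table work.product_errors as
      select product
         from work.current_products
      except
      select product
         from &template
         where upcase(sheet_name) = upcase("&sheet");
   quit;
%mend check_products;

%check_products(lib=work, sheet=STATE_SUBGRP, template=tmplt.adhd_products)
```

Swapping the two SELECT blocks would instead list expected products that are absent from the current data.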

26 Validating Data
An exception report was created for the values check
Utilized the lesson learned from the previous Excel export
For these exception reports, an MS Excel workbook was created for a worksheet only if errors were found
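One way to apply that lesson is to count the rows in the exception data set and export only when the count is above zero. A minimal sketch (the macro name export_if_errors and its parameters are hypothetical):

```sas
/* Sketch: create the Excel workbook only when the exception data set has rows */
%macro export_if_errors(ds=, outfile=);
   %local nerr;
   proc sql noprint;
      select count(*) into :nerr trimmed
         from &ds;
   quit;

   %if &nerr > 0 %then %do;
      proc export data=&ds outfile="&outfile" dbms=xlsx replace;
      run;
   %end;
   %else %do;
      %put NOTE: &ds has no rows, so no workbook was created.;
   %end;
%mend export_if_errors;
```

A call would pass the exception data set and the target file, for example ds=work.product_errors and a path of your choosing for outfile.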

27 Exporting Error Report to Excel®

28 Exporting Error Report to Excel®

29 Deleting Datasets
Many macros are used to create many datasets in the process of checking one workbook
To ensure there is enough space in the SAS session, PROC DATASETS is used to clean up the libraries used in the program

30 Deleting Datasets
To delete all files in a SAS data library at one time, use the KILL option
CAUTION: The KILL option deletes all members of the library immediately after the statement is submitted
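A minimal sketch of the cleanup step, assuming the intermediate datasets live in WORK and the workbook was assigned to the Cadhd1 libref shown earlier:

```sas
/* Remove every member of WORK; NOLIST keeps the directory listing out of the log */
proc datasets library=work kill nolist;
quit;

/* Release the workbook when finished with it */
libname Cadhd1 clear;
```

Because KILL acts as soon as the statement is submitted, it is safest to point it only at a scratch library such as WORK, never at a library that holds templates or source files.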

31 Conclusion
When faced with an overwhelming task, break it down
Solve one problem at a time
Doing research online may help provide different solutions
Find one that works for your problem and that YOU prefer
Don’t be afraid to code your program with some steps that are not as efficient (“down and dirty”)
When utilizing macros, get the program to work before coding the macro(s)
Enhance your program for efficiency when you have more time

32 Contact Information
Name: Lisa Mendez
Company: IQVIA GS
Name: Andrew Kuligowski
Company: HSN

