1
SASHELP Datasets: A Real-Life Example
Barb Crowther, SAS Consultant
October 22, 2004
Copyright © 2004, SAS Institute Inc. All rights reserved.
2
Situation
The Canadian government requires all financial institutions to report suspicious financial transactions, including:
- Cash
- Money orders
- Casino chips
- Real estate
- ….
3
Situation: Data
- The data is extracted from many different applications and saved to a flat file as a ‘report’.
- Excluding the header and subheader, there were 9 different parts for each report.
- Data formatting was very specific and varied from part to part.
- The number of fields varied from 4 to 35.
- Field lengths varied between 1 and 400.
- Part lengths varied between 139 and 507 characters.
4
Types of Required Information ……
5
The first kicker
- Not all reports needed all parts.
- Some parts were always required, others only some of the time.
- Which parts were included depended on the data.
- But if a part was required, it had to be perfect!
6
The second kicker
- The users wanted to be able to edit each part before it was sent to the government, but because of the tool they used, they could not insert (or delete) a missing part.
- So even if all the fields were missing from the source data, the part had to be included.
7
So...
- The application had to insert each required part.
- The only information I would get is a sequence number.
8
Attempt #1: Hard-code the DATA step
For each of the nine parts...
...for each of the 10 to 35 fields per part...
...I could write a “length variable ($) n;” statement.
Oh, and by the way, did I tell you that the government regularly changes part content?
9
Attempt #2: Investigate system options

    options obs=0;
    data parta1;
       set input.parta1;
    run;
    options obs=max;

The problem with this code is that the output dataset has no observations, and I needed one even if there was no data.
10
Attempt #3: Look at the SASHELP datasets
The SASHELP datasets contain information about the current SAS session, including:
- all the members of all the libraries (SASHELP.VMEMBER)
- all the columns of each member (SASHELP.VCOLUMN)
11
VCOLUMN Contents

    Variable Name   Variable Label
    Create Date …
    Format          Column Format
    Informat        Column Informat
    Label           Column Label
    Length          Column Length
    Libname         Library Name
    Memname         Member Name
    Type            Column Type
    …
12
Accessing V* Tables
The V* tables can be read from procedures, PROC SQL, or DATA steps.

    proc print data=sashelp.vtable;
       where libname='WORK';
    run;

    proc sql;
       create view work.options as
          select * from dictionary.options;
    quit;
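As a rough illustration of the two access routes named on this slide, the sketch below lists the columns of SASHELP.CLASS twice: once from the dictionary table inside PROC SQL, and once from the matching SASHELP view in an ordinary procedure. SASHELP.CLASS is only a convenient sample table and is not part of the original application.

```sas
/* PROC SQL can query the dictionary table directly */
proc sql;
   select name, type, length
      from dictionary.columns
      where libname = 'SASHELP' and memname = 'CLASS';
quit;

/* Other steps read the same information through the SASHELP view */
proc print data=sashelp.vcolumn noobs;
   where libname = 'SASHELP' and memname = 'CLASS';
   var name type length;
run;
```

Library and member names are stored in upper case in these tables, which is why the WHERE clauses compare against upper-case literals.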
13
So how does this help me?
Step 1: Get a list of all the variables (and their attributes) required for the “empty dataset”.
Step 2: Move all that information into macro variables.
Step 3: Create a dataset template.
Step 4: Create the empty dataset.
14
Step 1: Variables and their attributes

    proc sql noprint;
       create table &table._vars as
          select name, type, format, length, label
             from sashelp.vcolumn
             where upcase(libname) = upcase("&inset")
               and upcase(memname) = upcase("&table");
    quit;

For this example, our inset will be Work and our table Txns.
15
Step 1: Work.Txns_vars
VCOLUMN output

    Name             Type   Format   Length   Label
    Tran_date        Char            8
    Tran_Post_Date   Char            8
    Tran_Currency    Char   $3.00    3        Currency_code
    Tran_time        Char            4
    Tran_Amount      Num             8
    Teller_id        Char   $15
    …
16
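Step 2: Create macro variables using Txns_vars from Step 1

    data _null_;
       set &table._vars end=eof;
       idx = left(put(_n_,3.));   /* suffix 1, 2, 3, ... */
       /* call symputx trims blanks from both the name and the value */
       call symputx('var'||idx, name);
       /* use the stored format if there is one; otherwise build $w. for
          character variables (cats avoids embedded blanks such as "$  8.")
          and fall back to best12. so that &&fmt&i always resolves */
       if format ne ' ' then call symputx('fmt'||idx, format);
       else if upcase(type) = 'CHAR' then call symputx('fmt'||idx, cats('$', length, '.'));
       else call symputx('fmt'||idx, 'best12.');
       /* always create the label variable, even when the label is blank */
       call symputx('label'||idx, label);
       if eof then call symputx('var_cnt', _n_);
    run;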
17
Step 2: Macro output from the SAS log
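The log output itself did not survive in this transcript. One way to inspect macro variables like those built in Step 2 is the `%put` statement; the sketch below creates two stand-in variables (the values are invented for illustration) and writes them to the log.

```sas
/* Stand-in values; in the real program these come from Step 2 */
data _null_;
   call symputx('var1', 'tran_date');
   call symputx('fmt1', '$8.');
run;

%put _user_;                      /* lists every user-defined macro variable */
%put var1=&var1 fmt1=&fmt1;       /* or print selected variables by name */
```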
18
Step 3: Create the dataset template
Work.Txn_tpl using the Step 2 macro variables

    /* this fragment runs inside a macro definition: %do is not valid in open code */
    %let i = 1;
    data &table._tpl;
       %do i=1 %to &var_cnt;
          attrib &&var&i format=&&fmt&i label="&&label&i";
       %end;
    run;
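Because `%do` loops are only valid inside a macro definition, the loop above has to be wrapped in a macro before it can run. The following is a minimal sketch of such a wrapper, not the author's actual macro: the name `make_tpl` and the `%let` values standing in for the Step 2 output are assumptions made for illustration.

```sas
/* Hypothetical stand-ins for the macro variables created in Step 2 */
%let var_cnt = 2;
%let var1 = transaction_key;  %let fmt1 = 20.;  %let label1 = Transaction key;
%let var2 = tran_currency;    %let fmt2 = $3.;  %let label2 = Currency code;

%macro make_tpl(table);        /* make_tpl is an illustrative name */
   %local i;
   data &table._tpl;
      /* one ATTRIB statement is generated per variable */
      %do i = 1 %to &var_cnt;
         attrib &&var&i format=&&fmt&i label="&&label&i";
      %end;
   run;
%mend make_tpl;

options mprint;                /* write the generated statements to the log */
%make_tpl(txn)
```

With `mprint` on, the log shows the generated ATTRIB statements, which is a convenient way to check them against the expected code.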
19
Step 3: Create the dataset template
Generated code to create Work.Txn_tpl

    Value of i   Generated code
    1            attrib transaction_key format=20. label="transaction_key";
    2            attrib tran_currency format=$3. label="Currency Code";
    …
20
Step 3: Create a dataset template
Get a list of the required variables

    %global &table._var_list;
    proc sql noprint;
       select distinct name
          into :&table._var_list separated by ' '
          from &table._vars;
    quit;

This results in a macro variable called txn_var_list with a value of TRANSACTION_KEY TRAN_CURRENCY …
21
So where are we?
- We have a report with a known sequence number, but no data.
- We know what variables are required: &txn_var_list
- We know the variables’ attributes: &&var&i format=&&fmt&i label="&&label&i";
22
Step 4: Create the empty dataset
23
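Step 4: Code to generate the dataset

    data &table._miss_data;
       /* list the variables first so they keep the required order */
       retain &&&table._var_list;
       /* one observation per sequence number */
       set result (keep=seq_num);
       /* read the all-missing template row once; its values are retained */
       if _n_ = 1 then set &table._tpl(drop=seq_num);
    run;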
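The pattern in Step 4 can be tried on a toy example. In this sketch the dataset names, variables, labels, and sequence numbers are all invented; the template is written by hand instead of being generated from macro variables, and it has no seq_num column, so no `drop=` option is needed.

```sas
/* Hand-written template: a DATA step with no input writes one row,
   so txns_tpl holds a single observation with every value missing */
data txns_tpl;
   attrib tran_date     format=$8.     label='Transaction date'
          tran_currency format=$3.     label='Currency code'
          tran_amount   format=best12. label='Amount';
run;

/* Made-up sequence numbers for reports that arrived without data */
data result;
   do seq_num = 101 to 103;
      output;
   end;
run;

/* One output row per sequence number, padded with the template columns */
data txns_miss_data;
   set result(keep=seq_num);
   if _n_ = 1 then set txns_tpl;
run;

proc print data=txns_miss_data;
run;
```

Variables read with SET are retained automatically, so the missing values loaded on the first iteration carry through to every later observation.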
24
Thoughts…
- Writing the macros took longer than hard-coding the attribute statements.
- But if there are any future changes, I won’t have to do very much (if anything).
- The macros can be used in other applications…
25
Suggested readings
- Melinda Thielbar, “The SASHELP Library: It Really Does Help You Manage Data”, http://support.sas.com/sassamples/bitsandbytes/sashelp.html
- Michael Davis, “You Could Look It Up: An Introduction to SASHELP Dictionary Tables”, http://www2.sas.com/proceedings/sugi26/p017-26.pdf