Basic SAS Functions in Version 8.2

Kim Michalski, Office of the Actuary
Rick Andrews, Office of Research, Development, and Information
Where to begin?

  There are over 300 SAS functions in Version 8.2.
  This presentation describes approximately two dozen of them in the following areas:
    Truncation
    Descriptive Statistics
    Character
    Conversion
    Date and Time
  For a complete list, see: Base SAS Software - SAS Language Reference: Dictionary - Functions and CALL Routines
Truncation

  1.A - Round up to the next largest integer - CEIL
  1.B - Round down to the next smallest integer - FLOOR
  1.C - Remove the fractional part of a number - INT
  1.D - Round a numeric value to a specified unit - ROUND
Descriptive Statistics

  2.A - Compute the sum of the values in a list - SUM
  2.B - Compute the average of the values in a list - MEAN
  2.C - Determine the smallest value in a list - MIN
  2.D - Determine the largest value in a list - MAX
Character

  3.A - Split text containing a delimiter (1) - SCAN
  3.B - Split text containing a delimiter (2) - SCAN
  3.C - Justify character values - LEFT
  3.D - Return part of a character expression - SUBSTR
  3.E - Return the location of a character string - INDEX
Date and Time

  5.A - Various date functions - Multiple
  5.B - Intervals of a time span - INTCK
  5.C - Advance a date or time interval - INTNX
  5.D - Compute age as of last birthday - YRDIF
1.A - Round up to the next largest integer ( CEIL )

  DATA temp1a;
    rate1    = CEIL(3.2);
    rate2    = CEIL(3.8);
    rate3    = CEIL(-3.2);
    rate4    = CEIL(-3.8);
    payment1 = CEIL(999.22);
    payment2 = CEIL(999.88);
  RUN;

  ********************************************************
  rate1   rate2   rate3   rate4   payment1   payment2
      4       4      -3      -3       1000       1000
  ********************************************************
1.B - Round down to the next smallest integer ( FLOOR )

  DATA temp1b;
    rate1    = FLOOR(3.2);
    rate2    = FLOOR(3.8);
    rate3    = FLOOR(-3.2);
    rate4    = FLOOR(-3.8);
    payment1 = FLOOR(999.22);
    payment2 = FLOOR(999.88);
  RUN;

  ********************************************************
  rate1   rate2   rate3   rate4   payment1   payment2
      3       3      -4      -4        999        999
  ********************************************************
1.C - Remove the fractional part of a number ( INT )

  DATA temp1c;
    rate1    = INT(3.2);     * Same as FLOOR for positive numbers;
    rate2    = INT(3.8);
    rate3    = INT(-3.2);    * Same as CEIL for negative numbers;
    rate4    = INT(-3.8);
    payment1 = INT(999.22);
    payment2 = INT(999.88);
  RUN;

  **********************************************************
  rate1   rate2   rate3   rate4   payment1   payment2
      3       3      -3      -3        999        999
  **********************************************************
1.D - Round a numeric value to a specified unit ( ROUND )

  DATA temp1d;
    rate1 = ROUND(3.2);
    rate2 = ROUND(3.8);
    rate3 = ROUND(-3.2);
    marg1 = ROUND(12.49);
    marg2 = ROUND(12.49,.1);
    dec2  = ROUND( ,.01);
    five  = ROUND(73.3,5);
    ten   = ROUND(23.99,10);
  RUN;

  ***********************************************************
  rate1   rate2   rate3   marg1   marg2   dec2   five   ten
      3       4      -3      12    12.5            75    20
  ***********************************************************
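The four truncation functions are easiest to compare when they are applied to the same values. The step below is a minimal sketch and is not part of the original slides; the data set name trunc_compare and the variable names are made up for illustration.

```sas
DATA trunc_compare;
  * Apply all four functions to the same positive and negative values *;
  DO x = -3.8, -3.2, 3.2, 3.8;
    up      = CEIL(x);    * next integer at or above x *;
    down    = FLOOR(x);   * next integer at or below x *;
    whole   = INT(x);     * fractional part dropped *;
    nearest = ROUND(x);   * nearest integer *;
    OUTPUT;
  END;
RUN;

PROC PRINT DATA=trunc_compare NOOBS;
RUN;
```

For x = -3.8 the row reads up = -3, down = -4, whole = -3, nearest = -4, which shows that INT moves toward zero while FLOOR always moves down.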
2.A - Compute sum of values in a list ( SUM )

  DATA temp2a1;
    x=100; y=200; z=.;
    tot1=SUM(x,y,z);
    tot2=x+y+z;        *<-- BEWARE: tot2 will be missing since z is missing *;
  RUN;

  *********************************************************
    x     y     z   tot1   tot2
  100   200     .    300      .
  *********************************************************
2.A - Compute sum of values in a list ( SUM )

  DATA temp2a2;
    a1=300; a2=.; a3=400; a4=500;
    tot3=SUM(a1,a2,a3,a4);
    tot4=SUM(OF a1-a4);
    tot5=SUM(a1-a4);     * Without OF, a1-a4 is arithmetic: a4 is subtracted from a1 *;
  RUN;

  *********************************************************
   a1    a2    a3    a4   tot3   tot4   tot5
  300     .   400   500   1200   1200   -200
  *********************************************************
2.A - Compute sum of values in a list ( SUM )

  DATA temp2a3;
    x=100; y=200; z=.;
    a1=300; a2=.; a3=400; a4=500;
    tot6=SUM(OF a1-a4,x,y,z);
    tot7=SUM(100,200,.,300,.,400,500);
    tot8=SUM(z,a2);      * Missing, because every argument is missing *;
    tot9=SUM(0,z,a2);    * Add a zero to the list if a missing result is *;
                         * not desired when all values are missing *;
  RUN;

  *********************************************************
  tot6   tot7   tot8   tot9
  1500   1500      .      0
  *********************************************************
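The outline also lists MEAN, MIN, and MAX (2.B through 2.D). Like SUM, they skip missing arguments. The step below is a small sketch using the same a1-a4 values as above; the data set name stats_demo, the result variable names, and the use of the N function are additions for illustration.

```sas
DATA stats_demo;
  a1=300; a2=.; a3=400; a4=500;
  avg  = MEAN(OF a1-a4);   * 400, the three nonmissing values averaged *;
  low  = MIN(OF a1-a4);    * 300 *;
  high = MAX(OF a1-a4);    * 500 *;
  cnt  = N(OF a1-a4);      * 3, the number of nonmissing values *;
RUN;
```

Note that MEAN divides by the count of nonmissing values (3 here), not by the number of arguments (4).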
Conversion

  4.A - Convert character date to SAS date - INPUT
  4.B - Use of the YEARCUTOFF option - INPUT
  4.C - Convert character to numeric - INPUT
  4.D - Convert numeric to character - PUT
3.A - Split text containing a delimiter ( SCAN )

  DATA temp3a;
    dir    = 'G:\IMG\Data';
    level1 = SCAN(dir, 1,'\');
    level2 = SCAN(dir, 2,'\');
    level3 = SCAN(dir, 3,'\');
  RUN;

  *****************************************
  dir           level1   level2   level3
  G:\IMG\Data   G:       IMG      Data
  *****************************************
3.B - Split text containing a delimiter ( SCAN )

  DATA temp3b;
    state_cnty = 'Montgomery, MD';
    county     = SCAN(state_cnty,1,',');
    state      = SCAN(state_cnty,2,',');
  RUN;

  *****************************************
  state_cnty       county       state
  Montgomery, MD   Montgomery    MD
  *****************************************

  Note the space! The value of state begins with the blank that follows the comma.
3.C - Justify character values ( LEFT / RIGHT )

  DATA temp3c;
    SET temp3b;
    state = LEFT(state);  OUTPUT;
    state = RIGHT(state); OUTPUT;
  RUN;

  *****************************************
  state_cnty       county       state
  Montgomery, MD   Montgomery   MD
  *****************************************
3.D - Return part of character expression ( SUBSTR )

  DATA temp3d;
    state_cnty = 'Montgomery, MD';
    county     = SUBSTR(state_cnty, 1,10);
    state      = SUBSTR(state_cnty, 13,2);
  RUN;

  *****************************************
  state_cnty       county       state
  Montgomery, MD   Montgomery   MD
  *****************************************
3.E - Return location of character string ( INDEX )

  DATA temp3e;
    state_cnty = 'Montgomery, MD';
    idx        = INDEX(state_cnty,',');
    county     = SUBSTR(state_cnty,1, idx-1);
    state      = SUBSTR(state_cnty, idx+2,2);
  RUN;

  *****************************************
  state_cnty       idx   county       state
  Montgomery, MD    11   Montgomery   MD
  *****************************************

  Determine the location of the comma, then use that location within SUBSTR.
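The character functions are often nested. As a rough sketch (not from the original slides), the step below removes the leading blank that SCAN leaves in front of the state and then joins the two pieces with TRIM; the data set name clean_state and the variable both are hypothetical.

```sas
DATA clean_state;
  state_cnty = 'Montgomery, MD';
  county = SCAN(state_cnty,1,',');
  state  = LEFT(SCAN(state_cnty,2,','));        * leading blank moved to the end *;
  both   = TRIM(county) || '/' || TRIM(state);  * trailing blanks dropped before joining *;
RUN;
```

Here both contains Montgomery/MD. Without TRIM, the trailing blanks that pad county would sit between the county name and the slash.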
4.A - Convert CHAR date into SAS date ( INPUT )

  OPTIONS YEARCUTOFF=1900;    * Can cause an ERROR for 2-digit years *;

  DATA temp4a;
    * Month / Day / Year *;
    char_date = '02/29/2004'; sas_date = INPUT(char_date,MMDDYY10.);   OUTPUT;
    char_date = ' ';          sas_date = INPUT(char_date,MMDDYY10.);   OUTPUT;
    char_date = ' ';          sas_date = INPUT(char_date,MMDDYY8.);    OUTPUT;
    char_date = '022903';     sas_date = INPUT(char_date,?? MMDDYY6.); OUTPUT;
  RUN;

  Note that the ?? modifier suppresses the invalid-data messages in the log and keeps _ERROR_ from being set.
4.A - Convert CHAR date into SAS date ( Continued )

  *********************************************************
  char_date    sas_date     Format
  02/29/2004   02/29/2004   MMDDYY10.
               02/29/2004   MMDDYY10.
               02/29/1904   MMDDYY8.
  022903       .            ?? MMDDYY6.
  *********************************************************

  Note the invalid date in the last row: with YEARCUTOFF=1900, '022903' is read as February 29, 1903, which does not exist.
4.B - Use of the YEARCUTOFF Option ( INPUT )

  OPTIONS YEARCUTOFF=1920;    *<-- First year of the 100-year window for 2-digit years: *;
                              *    20-99 are read as 1920-1999, 00-19 as 2000-2019 *;

  DATA temp4b;
    LENGTH char_date $10;     * Long enough for the 10-character dates below *;
    * Day / Month / Year *;
    char_date = '29/02/04';   sas_date = INPUT(char_date,DDMMYY8.);  OUTPUT;
    char_date = '29/02/2004'; sas_date = INPUT(char_date,DDMMYY10.); OUTPUT;
    * Year / Month / Day *;
    char_date = ' ';          sas_date = INPUT(char_date,YYMMDD8.);  OUTPUT;
    char_date = '2004/02/29'; sas_date = INPUT(char_date,YYMMDD10.); OUTPUT;
4.B - Convert CHAR date into SAS date ( Continued )

    * DATA temp4b, continued *;
    * Day / Char Month / Year *;
    char_date = '29feb04';   sas_date = INPUT(char_date,DATE7.); OUTPUT;
    char_date = '29feb2004'; sas_date = INPUT(char_date,DATE9.); OUTPUT;
  RUN;

  *********************************************************
  char_date    sas_date     Format
  /03/04       03/01/2003   DDMMYY8.
  02/03/       /01/2003     DDMMYY10.
               /31/2003     YYMMDD8.
  /02/29       01/31/2003   YYMMDD10.
  29feb04      01/01/2003   DATE7.
  29feb        /01/2003     DATE9.
  *********************************************************
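To see what YEARCUTOFF does to a 2-digit year, it can help to read two dates that fall on either side of the window. The following is only a sketch under the YEARCUTOFF=1920 setting used above; the data set name cutoff_demo and the variables d1 and d2 are made up for this example.

```sas
OPTIONS YEARCUTOFF=1920;   * 2-digit years map to 1920 through 2019 *;

DATA cutoff_demo;
  FORMAT d1 d2 MMDDYY10.;
  d1 = INPUT('02/29/04', MMDDYY8.);   * 04 is read as 2004, giving 02/29/2004 *;
  d2 = INPUT('02/29/96', MMDDYY8.);   * 96 is read as 1996, giving 02/29/1996 *;
RUN;
```

Both results are valid leap days. Under YEARCUTOFF=1900 the first value would be read as 02/29/1904 instead.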
4.C - Convert Character to Numeric ( INPUT )

  DATA temp4c;
    LENGTH char $25;
    FORMAT num 15.2;
    char = '123';             num = char + 0;             OUTPUT;  * Note ADDITION *;
    char = '123.45';          num = INPUT(char,6.2);      OUTPUT;
    char = '12345';           num = INPUT(char,5.);       OUTPUT;
    char = ' ';               num = INPUT(char,15.);      OUTPUT;
    char = '123,456,789,012'; num = INPUT(char,COMMA15.); OUTPUT;
  RUN;
4.C - Convert Character to Numeric ( Continued )

  **********************************************************
  char                          num   Statement
  123                        123.00   char + 0
  123.45                     123.45   INPUT(char,6.2)
  12345                    12345.00   INPUT(char,5.)
                                      INPUT(char,15.)
  123,456,789,012   123456789012.00   INPUT(char,COMMA15.)
  **********************************************************
4.D - Convert Numeric to Character ( PUT )

  DATA temp4d;
    LENGTH char $25;
    FORMAT num 15.2;
    num = 123;   char = num || '';          OUTPUT;  * Note CONCATENATION *;
    num = ;      char = PUT(num,6.2);       OUTPUT;
    num = 12345; char = PUT(num,5.);        OUTPUT;
    num = ;      char = PUT(num,15.);       OUTPUT;
    num = ;      char = PUT(num,COMMA15.);  OUTPUT;
  RUN;
4.D - Convert Numeric to Character ( Continued )

  **********************************************************
  num        char            Statement
    123.00            123    num || ''
                             PUT(num,6.2)
  12345.00   12345           PUT(num,5.)
                             PUT(num,15.)
             ,456,789,012    PUT(num,COMMA15.)
                             PUT(num,Z5.)
  **********************************************************

  Note the leading spaces when using the concatenation operator.
  Note the leading zero. The Z5. format can be used to convert numeric procedure codes imported from Excel.
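The Z5. note above can be illustrated with a short step. This is a sketch only; the data set name zero_pad and the variables code_num and code_chr are hypothetical stand-ins for a procedure code that lost its leading zeros when it was stored as a number.

```sas
DATA zero_pad;
  code_num = 123;                   * numeric code with its leading zeros lost *;
  code_chr = PUT(code_num, Z5.);    * character value 00123 *;
RUN;
```

Zw. pads on the left with zeros up to the stated width, so a width of 5 fits codes that are five digits long.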
5.A - Date Functions

  DATA temp5a;
    var1 = TODAY();          OUTPUT;
    var1 = DATETIME();       OUTPUT;
    var1 = DATEPART(var1);   OUTPUT;
    var1 = MONTH( TODAY() ); OUTPUT;
    var1 = YEAR( TODAY() );  OUTPUT;
    var1 = DAY( TODAY() );   OUTPUT;
    var1 = QTR( TODAY() );   OUTPUT;
  RUN;
5.A - Date Functions ( Continued )

  ******************************************
  var1   Statement
         TODAY()
         DATETIME()
         DATEPART( var1 )
     1   MONTH( TODAY() )
  2007   YEAR( TODAY() )
    10   DAY( TODAY() )
     1   QTR( TODAY() )
  ******************************************
5.B - Intervals of a Time Span ( INTCK )

  DATA temp5b;
    FORMAT beg_date end_date MMDDYY10.;
    beg_date = '01JAN2001'D;
    end_date = '31DEC2002'D;
    example = INTCK('DAY',  beg_date,end_date);     OUTPUT;
    example = INTCK('MONTH',beg_date,end_date);     OUTPUT;
    example = INTCK('MONTH',beg_date,end_date + 1); OUTPUT;
    example = INTCK('YEAR', beg_date,end_date);     OUTPUT;
    example = INTCK('YEAR', beg_date,end_date + 1); OUTPUT;
    example = INTCK('QTR',  beg_date,end_date);     OUTPUT;
  RUN;
5.B - Intervals of a Time Span ( Continued )

  **********************************************************
  beg_date (b)   end_date (e)   example   Statement
  01/01/2001     12/31/2002         729   INTCK('DAY',  b,e)
  01/01/2001     12/31/2002          23   INTCK('MONTH',b,e)
  01/01/2001     12/31/2002          24   INTCK('MONTH',b,e + 1)
  01/01/2001     12/31/2002           1   INTCK('YEAR', b,e)
  01/01/2001     12/31/2002           2   INTCK('YEAR', b,e + 1)
  01/01/2001     12/31/2002           7   INTCK('QTR',  b,e)
  **********************************************************
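The "+ 1" rows above change the result because INTCK counts interval boundaries crossed, not elapsed time. A minimal sketch of that behavior follows; the data set name boundary_demo and the variable names are invented for this example.

```sas
DATA boundary_demo;
  * One day apart, but a January 1 boundary lies between the two dates *;
  one_day = INTCK('YEAR', '31DEC2002'D, '01JAN2003'D);   * 1 *;
  * Almost a full year apart, but no January 1 boundary is crossed *;
  almost  = INTCK('YEAR', '01JAN2002'D, '31DEC2002'D);   * 0 *;
RUN;
```

This is why INTCK with 'YEAR' is a poor way to compute age, and why section 5.D uses YRDIF instead.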
5.C - Advance a Date or Time Interval ( INTNX )

  DATA temp5c;
    FORMAT orig_date new_date MMDDYY10.;
    orig_date = '31DEC2002'D;
    new_date = INTNX('DAY',  orig_date,0);       OUTPUT;
    new_date = INTNX('DAY',  orig_date,1);       OUTPUT;
    new_date = INTNX('MONTH',orig_date,1);       OUTPUT;
    new_date = INTNX('MONTH',orig_date,3);       OUTPUT;
    new_date = INTNX('YEAR', orig_date,1);       OUTPUT;
    new_date = INTNX('YEAR', orig_date,2,'END'); OUTPUT;
  RUN;
5.C - Advance a Date or Time Interval ( Continued )

  *****************************************************
  orig_date (d)   new_date     Statement
  12/31/2002      12/31/2002   INTNX('DAY',d,0)
  12/31/2002      01/01/2003   INTNX('DAY',d,1)
  12/31/2002      01/01/2003   INTNX('MONTH',d,1)
  12/31/2002      03/01/2003   INTNX('MONTH',d,3)
  12/31/2002      01/01/2003   INTNX('YEAR',d,1)
  12/31/2002      12/31/2004   INTNX('YEAR',d,2,'END')
  *****************************************************
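By default INTNX returns the first day of the resulting interval, which is why most rows above land on the 1st. With an increment of 0 and the 'END' alignment used in the last row, the same function can find the first and last day of the current month. This is a sketch only; the data set name month_bounds and its variable names are hypothetical.

```sas
DATA month_bounds;
  FORMAT d month_beg month_end MMDDYY10.;
  d         = '15FEB2004'D;
  month_beg = INTNX('MONTH', d, 0);          * 02/01/2004 *;
  month_end = INTNX('MONTH', d, 0, 'END');   * 02/29/2004 *;
RUN;
```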
5.D - Compute Age as of Last Birthday ( YRDIF )

  DATA temp5d;
    FORMAT dob MMDDYY10.;
    dob = '23DEC1970'D;
    tmp = YRDIF(dob, TODAY(), 'ACTUAL');
    age = FLOOR(tmp);
  RUN;

  ******************************
  dob          tmp   age
  12/23/1970
  ******************************

  Note - The age in this example corresponds to the value of TODAY() when the step runs.
About the Speakers

  Kim Michalski (410)
  Rick Andrews (410)

  Centers for Medicare and Medicaid Services
  7500 Security Boulevard
  Baltimore, MD 21244