Lesson 12 Topics Macro example Exporting data Character Functions


Lesson 12 Topics
- Macro example
- Exporting data
- Character Functions
- Background to final assignment

Welcome to lesson 12, the last lesson for the class. In this lesson we will briefly touch on several topics, some of which you may find useful in your SAS programming future. We will look at more SGPLOT examples for line graphs so you have examples all in one place; we will then see how to export SAS datasets to other applications such as Excel and to other computers. We will cover some basic examples of using a table-generating procedure called PROC TABULATE and finally give a brief introduction to macros and macro variables.

proc tabulate data=tomhs noseps;
  class group;
  var dbp12;
  table (group all),
        (dbp12)*(n*f=7.0 mean*f=7.1 std*f=7.1 min*f=7.1 max*f=7.1)/rts=30;
run;

----------------------------------------------------------------------
|                            |       Diastolic BP at 12-Months       |
|                            |---------------------------------------|
|                            |   N   | Mean  |  Std  |  Min  |  Max  |
|----------------------------+-------+-------+-------+-------+-------|
|Study Group (1-6)           |       |       |       |       |       |
|1                           |     15|   77.8|    9.3|   68.0|   94.0|
|2                           |     17|   81.7|    7.1|   72.0|  100.0|
|3                           |     14|   78.1|    7.6|   67.0|   90.0|
|4                           |     14|   77.7|    6.0|   66.0|   89.0|
|5                           |     13|   79.6|    8.5|   66.0|   99.0|
|6                           |     19|   79.6|    7.3|   64.0|   95.0|
|All                         |     92|   79.2|    7.6|   64.0|  100.0|
----------------------------------------------------------------------

MACRO BRKSPSS: Creates tabulate table for each var in dlist by group

%macro brkspss(grp, dlist, data=_last_, dec=3, all=all);
  %do i = 1 %to 100;
    %* Pick off the next variable name from the list;
    %let depvar = %scan(&dlist, &i);
    %* Stop when the list is exhausted;
    %if %length(&depvar) = 0 %then %goto done;
    proc tabulate data=&data noseps;
      class &grp;
      var &depvar;
      table (&grp &all),
            (&depvar)*(n*f=7.0 mean*f=7.&dec std*f=7.&dec min*f=7.&dec max*f=7.&dec)/rts=30;
    run;
  %end;
  %done:
%mend brkspss;

%brkspss(group, dbp12 sbp12 chol12);

MACRO BRKSPSS: Creates tabulate table for each var by group

LIBNAME t '~/PH6420/2017/Data/';

DATA stat;
  set t.tomhs;
RUN;

* Example calls;
%brkspss(group, dbp12 sbp12 chol12);
%brkspss(group, dbp12 sbp12 chol12, dec=1);  * Just 1-decimal;
%brkspss(group, dbp12 sbp12 chol12, all=);   * No totals;
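When a macro call does not produce the table you expect, it can help to see the PROC TABULATE code the macro actually generates. The following is a minimal sketch using the MPRINT system option; it assumes the %brkspss macro has already been defined in the session and that the stat dataset created above is the most recently created dataset (the macro's data= parameter defaults to _last_).

```sas
* Sketch: write the SAS code generated by the macro to the log;
OPTIONS MPRINT;

%brkspss(group, dbp12 sbp12, dec=1);

* Turn the option back off when finished debugging;
OPTIONS NOMPRINT;
```

With dec=1 and the default all=all, the first pass through the loop should resolve to a step equivalent to: proc tabulate data=_last_ noseps; class group; var dbp12; table (group all), (dbp12)*(n*f=7.0 mean*f=7.1 std*f=7.1 min*f=7.1 max*f=7.1)/rts=30; run;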

Output from last call: First 2 variables.

----------------------------------------------------------------------
|                            |       Diastolic BP at 12-Months       |
|                            |---------------------------------------|
|                            |   N   | Mean  |  Std  |  Min  |  Max  |
|----------------------------+-------+-------+-------+-------+-------|
|Study Group (1-6)           |       |       |       |       |       |
|1                           |     15| 77.800|  9.314| 68.000| 94.000|
|2                           |     17| 81.706|  7.078| 72.000|100.000|
|3                           |     14| 78.071|  7.580| 67.000| 90.000|
|4                           |     14| 77.714|  5.954| 66.000| 89.000|
|5                           |     13| 79.615|  8.540| 66.000| 99.000|
|6                           |     19| 79.579|  7.313| 64.000| 95.000|

|                            |       Systolic BP at 12-Months        |
|1                           |     15|120.200| 12.537| 93.000|141.000|
|2                           |     17|124.118| 11.280|108.000|142.000|
|3                           |     14|117.429|  9.436|104.000|135.000|
|4                           |     14|127.571| 11.876|112.000|149.000|
|5                           |     13|123.154| 18.348| 94.000|158.000|
|6                           |     19|129.895| 12.987|105.000|154.000|

Where to put macro?

- At beginning of program before you call it

  %macro brkspss(parameters);
    … macro code
  %mend brkspss;

  data tomhs;
    set t.tomhs;
  run;

  %brkspss(group, dbp12 sbp12, data=tomhs);

- Save as separate sas file and %include file on top of program.

  %include '/folderpath/brkspss.sas';
  %brkspss(group, dbp12 sbp12, data=tomhs);

%condesf macro: Very Useful

Condescriptive of: tomhscv (Obs=902, Nvar=15 Created: 27NOV17 15:52)

Seq  Name      T Format  Variable Label                       N      Mean   Std Dev  Minimum  Maximum
-----------------------------------------------------------------------------------------------------
  1  ptid      C6        Patient ID                         902                       A00001   D02136
  2  group     N8        Study Group (1-6)                  902  3.788248  1.787413        1        6
  3  age       N8        Age (y) at Randomization           902  54.77273   6.40394       44       69
  4  sex       N8        1=Male 2=Female                    902  1.382483  0.486263        1        2
  5  eversmk   N8        Ever Smoke Cigarettes (1=Y, 2=N)   899  1.532814    0.4992        1        2
  6  nowsmk    N8        Now Smoke Cigarettes (1=Y, 2=N)    424  1.768868  0.422055        1        2
  7  sbpbl     N8        Systolic BP at Baseline            902  140.3636   12.4446    113.5      190
  8  sbp12     N8        Systolic BP at 12-Months           848  124.1002  15.18918       87      187
  9  cholbl    N8        Total Cholesterol at Baseline      900  228.2511  38.41697      113      357
 10  chol12    N8        Total Cholesterol at 12-Months     849  220.8386  38.86243      111      456
 11  hdlbl     N8        HDL Cholesterol at Baseline        900  43.61222  11.61247       17       97
 12  hdl12     N8        HDL Cholesterol at 12 Months       849  45.49234  12.10597       18      102
 13  glucosbl  N8        Blood Glucose at Baseline          902  100.9246  15.61577       67      206
 14  glucos12  N8        Blood Glucose at 12 Months         845  98.67219  16.85603       68      294
 15  cvd       N8        Cardiovascular Event (1=yes,2=no)  902  1.875831  0.329957        1        2

Steps to Using %condesf
1. Download the macro from class website
2. Place macro in folder
3. Use %include to make the macro available
4. Call the macro

Using %condesf

libname t '/folders/myfolders/';
%include '/folders/myfolders/condesf.sas';  * if using SAS University edition;

%condesf(t.tomhscv, folder=/folders/myfolders/);
* Generates two files: tomhscv.condes and tomhs.pdf;

* Exporting Data;
LIBNAME t 'C:\SAS_Files';

DATA tomhs;
  SET t.tomhs;
  KEEP ptid clinic randdate group educ wt12 sbp12;
RUN;

* Export data to a comma delimited file;
PROC EXPORT DATA=tomhs
  OUTFILE = 'C:\SAS_Files\tomhs.csv'
  DBMS = csv REPLACE;
RUN;

There are times when you might want to export a SAS dataset to another format so that other software can access the data. An example would be sending your dataset to Excel so you can use Excel to create some graphics (for users not familiar with SAS SGPLOT!). You can do this using PROC EXPORT, the complement of PROC IMPORT. Here we export several variables from the tomhs dataset. We use a short DATA step to create a working dataset called tomhs, reading t.tomhs and keeping the variables listed in the KEEP statement. In the PROC EXPORT we output the dataset tomhs to a CSV file, naming the file tomhs.csv. The DBMS option is set to csv; it can be omitted if we use the csv extension on the output file. The REPLACE option tells SAS to overwrite the export file if it already exists.

Contents of file 'tomhs.csv'

ptid,clinic,randdate,group,educ,wt12,sbp12
A00083,A,02/05/1987,2,7,125,113
A00301,A,02/17/1987,6,9,,
A00312,A,04/08/1987,3,4,131,113

This file can be read by other software programs, e.g. Excel or R.

If you view the exported file it should look like what is displayed here. The first row contains the variable names. The data is comma delimited, and a missing value appears as consecutive commas with nothing between them. This file can be opened in Excel by clicking on the file. You can then save the file as an Excel worksheet. In PC SAS, you can also export the SAS dataset directly to an Excel file by using an xls extension on the output file. The DBMS option is then set to excel, or can be omitted if the xls extension is used.
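The direct Excel export described above, plus a quick round-trip check of the CSV file, might look like the following. This is a minimal sketch, assuming PC SAS and the tomhs dataset from the earlier DATA step; the output path and the dataset name tomhs_check are illustrative, and DBMS=excel typically requires the SAS/ACCESS Interface to PC Files to be licensed.

```sas
* Sketch: export the tomhs dataset directly to an Excel file (PC SAS);
PROC EXPORT DATA=tomhs
  OUTFILE = 'C:\SAS_Files\tomhs.xls'
  DBMS = excel REPLACE;
RUN;

* Sketch: read the CSV file back in to check what was written;
* tomhs_check is a hypothetical name for the re-imported dataset;
PROC IMPORT DATAFILE = 'C:\SAS_Files\tomhs.csv'
  OUT = tomhs_check
  DBMS = csv REPLACE;
  GETNAMES = yes;
RUN;

PROC CONTENTS DATA=tomhs_check VARNUM;
RUN;
```

Comparing the PROC CONTENTS listing with the original dataset is one way to confirm that variable names and types survived the export.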

Moving a SAS Dataset to another computer

Transfer SAS dataset directly
- Easy and works on most systems
- Can send as e-mail attachment

Use PROC CPORT and PROC CIMPORT
- Works on all systems but requires you to create an xport (.xpt) file.
- Can transfer multiple datasets in one file.

You may want to send a SAS dataset to another computer for you or another person to use. First of all, SAS must be installed on the other computer; otherwise the dataset will do them no good. In most cases you can transfer the dataset by e-mail (as an attachment) or ftp. This will work for PC and UNIX systems. The person on the other end saves the attached file and is ready to go, using an appropriate LIBNAME statement in their SAS program to point to the file. In some cases you will first need to create what is called a SAS export file. One advantage of doing this is that the export file can hold multiple SAS datasets. You create an export file using PROC CPORT. Then, after the export file is sent to the other computer, the SAS datasets are extracted using PROC CIMPORT.

Creating a SAS Export File

* Run this on your computer;
LIBNAME mylib 'C:\SAS_Files';
FILENAME tranfile 'C:\SAS_Files\classdata.xpt';

PROC CPORT LIB=mylib FILE=tranfile;
  SELECT sescore tomhs;
RUN;

* Run this on the other computer;
FILENAME tranfile 'C:\My SAS Datasets\classdata.xpt';

PROC CIMPORT LIB=work FILE=tranfile;
RUN;

PROC CONTENTS VARNUM DATA=sescore;
PROC CONTENTS VARNUM DATA=tomhs;
RUN;

Here we illustrate the call to PROC CPORT. The example shown here creates an export file of two datasets, sescore and tomhs. PROC CPORT takes these datasets located in the mylib library (pointing to 'C:\SAS_Files') and writes them to the export file called classdata.xpt. XPT is the file extension for SAS export files. This file can then be sent to the other computer or user. The person on the other end saves the file and uses PROC CIMPORT to extract the SAS datasets. The example here extracts the datasets to the work library. To save them permanently, a different libname would need to be given. You will then usually run PROC CONTENTS on the new datasets to see what you "got".

* Character functions;

Start with 3 names all in caps:
  fname = GREGORY
  lname = GRANDITS
  mi    = A

Create a new variable:
  fullname = Gregory A. Grandits

In some cases you may be working with character data such as names, addresses, etc. Often the data will be entered into the computer as separate variables for, say, the first name, middle initial, and last name. However, for some reports or listings you may want to have the entire name in one variable, as shown here for my name. SAS has several character functions that can help you do this task.

Functions/Operators
- SUBSTR: Takes a subset of characters from a character variable
- LOWCASE: Changes characters to lower case (also UPCASE and PROPCASE)
- ||: Concatenates variables or strings
    var1 = 'abc'; var2 = 'def'; var3 = var1||var2;
    var3 has value 'abcdef'
- CAT (CATX): Functions that concatenate vars/strings
- SCAN: Picks off "words" from a char variable

Section 3.3 LSB

Cody and Smith have an entire chapter covering character functions. If you are doing a lot of work with character data you may want to read through the entire chapter. I will illustrate a few character functions that help with common tasks. The first is the SUBSTR function, which takes a subset of characters from a variable and places them into a new variable. The LOWCASE function changes all upper case letters in a variable to lower case. There is also an upper case function (UPCASE) and a proper case function (PROPCASE). The latter capitalizes the first character of each word and makes all other letters lower case. The || is the concatenation operator. It is used to join character variables or strings together as illustrated here. The CAT function is newly added to SAS; it can replace the concatenation operator. There are several other CAT-type functions; one of them, CATX, inserts a character between each variable or string. We will use that in program 10. The SCAN function picks off words based on word boundaries, where boundaries are blanks or other special characters.
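The functions listed above can be tried out on a single name in a short DATA step. This is a minimal sketch for experimentation, not part of the course programs; the variable names first3, lower, proper, joined, cat_op, and word2 are made up for illustration.

```sas
* Sketch: try each character function on a small example;
DATA _NULL_;
  LENGTH fname $20 lname $20;
  fname = 'GREGORY';
  lname = 'GRANDITS';

  first3 = SUBSTR(fname, 1, 3);             * first 3 characters, GRE;
  lower  = LOWCASE(fname);                  * gregory;
  proper = PROPCASE(fname);                 * Gregory;
  joined = CATX(' ', fname, lname);         * GREGORY GRANDITS;
  cat_op = TRIM(fname) || ' ' || lname;     * same result using the operator;
  word2  = SCAN('Gregory A. Grandits', 2);  * second word, A;

  * Write the results to the log;
  PUT first3= lower= proper= joined= cat_op= word2=;
RUN;
```

TRIM is used with || because fname is stored with a length of 20, so without it the trailing blanks of fname would be carried into the result. CATX removes leading and trailing blanks on its own.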

DATA names;
  INFILE DATALINES DSD;
  INFORMAT fname $20. lname $20. mi $1.;
  INPUT lname fname mi;
  LENGTH fnamemix $20. lnamemix $20. fullname $44.;
  fnamemix = PROPCASE(fname);
  lnamemix = PROPCASE(lname);
  miperiod = CAT(mi,'.');
  fullname = CATX(' ',fnamemix,miperiod,lnamemix);
DATALINES;
GRANDITS, GREGORY, A
SIU, YI, W
;

Obs    fnamemix    lnamemix    miperiod    fullname
  1    Gregory     Grandits    A.          Gregory A. Grandits
  2    Yi          Siu         W.          Yi W. Siu

Let's see how some of these functions are used. In program 10 we read in the last name, first name, and middle initial (there are just 2 rows). Our goal is to create a single variable containing the full name, changing upper case characters that are not the first character of a name to lower case. We also want to add a period to the middle initial. We start by computing a new variable called fnamemix. This will contain the first name in mixed case. This is done by using the PROPCASE function. The same function is used to create the last name in mixed case. We use the CAT function to add a period to the middle initial. The result is stored in the variable miperiod. Finally, we use the CATX function to concatenate the three names, inserting a blank between each name. All other blanks are removed. The LENGTH statement defines the maximum number of characters each variable can contain. SAS will sometimes make a new concatenated variable much longer than it needs to be. It is usually a good idea to set the length of a character variable up front to the largest number of characters the variable could hold. Take a little time to study this example. Working with character variables like this can be a little intimidating, although less so with some of the new SAS functions.

* Start with one variable for full name but wish to create
  separate variables for each. Use the SCAN function;
DATA names;
  INFILE DATALINES DSD;
  INFORMAT fullname $44.;
  INPUT fullname;
  LENGTH fname $20. lname $20. mi $2.;
  fname = SCAN(fullname,1);      *Take 1st word;
  mi    = SCAN(fullname,2,' ');  *Take 2nd word;
  lname = SCAN(fullname,3);      *Take 3rd word;
DATALINES;
Gregory A. Grandits
Yi W. Siu
;

Sometimes you want to go in the other direction, i.e. the full name is contained in one variable and you would like to make each part of the name a separate variable. This is fairly easy to do using the SCAN function. The SCAN function picks off "words", where words are defined by delimiters. Here fname is the first word of fullname, mi is the second word, and lname is the third word. Since by default a period is a delimiter, we need to specify for the middle initial that just a blank should be used as a delimiter. Otherwise you would not get the period as part of the middle initial.
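The effect of the third SCAN argument is easiest to see side by side. The following is a small sketch for experimentation; mi_default and mi_blank are illustrative variable names, not part of the course program.

```sas
* Sketch: compare the default delimiters with a blank-only delimiter;
DATA _NULL_;
  fullname = 'Gregory A. Grandits';

  * Default delimiters include the period, so the period is dropped;
  mi_default = SCAN(fullname, 2);

  * Only a blank is a delimiter, so the period stays with the initial;
  mi_blank = SCAN(fullname, 2, ' ');

  * Write both results to the log;
  PUT mi_default= mi_blank=;
RUN;
```

The log should show mi_default=A and mi_blank=A. (the second value keeps the period).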

PROC PRINT DATA=names;
  VAR fullname fname mi lname;
  TITLE 'Original Variable and new variables';
RUN;

Obs    fullname               fname      mi    lname
  1    Gregory A. Grandits    Gregory    A.    Grandits
  2    Yi W. Siu              Yi         W.    Siu

Here is a display of each of the variables. You see that we have correctly separated the name into three variables. As mentioned, Cody and Smith have an entire chapter on character functions. Much of that you may never need, but if you find yourself with a task of combining and separating character variables, you may want to refer to that chapter and the program we just covered.

Background to Final Assignment
- Assessing individual risk of CVD is important to determining treatment options
- Age, blood pressure, cholesterol, smoking, and diabetes are risk factors (RF) for CVD
- The Framingham study has baseline RF and long-term follow-up for CVD that allows estimating long-term risk based on baseline RF

Framingham Equation
- Uses baseline smoking, systolic BP, total cholesterol, HDL-C, and diabetes status to quantify risk of CVD
- Uses Cox regression (similar to logistic regression) with above RF in model to estimate CVD risk.
- Plug in above RF values into formula to estimate probability of CVD in 10 years.

Framingham Equation (Formula for Women)

    2.33 * log of age
  + 1.21 * log of total cholesterol
  - 0.71 * log of HDL cholesterol
  + 2.76 * log of systolic BP
  + 0.53 * smoking (0 or 1)
  + 0.69 * diabetes (0 or 1)

Compute total score for individual
Average total score = 26.19

Framingham Equation (Formula for Women)

difference = (total score for person - average score)

P = 1 - 0.95^exp(difference)

Simple example: If total score = average score
P = 1 - 0.95^exp(0) = 1 - 0.95^1 = 1 - 0.95 = 0.05

Framingham Equation (Example for Women)

age = 61
total cholesterol = 180
HDL-C = 47
Systolic BP = 124
Current smoker (1)
Not diabetic (0)

Score = 2.33 * log(61) + 1.21 * log(180) - 0.71 * log(47) + 2.76 * log(124) + 0.53 * 1 + 0.69 * 0 = 26.97

p = 1 - 0.95^exp(26.97 - 26.19) = 0.1048

SAS Functions Needed
- log = natural log
- exp = exponentiation
- ** = raised to a power

Score = 2.33*log(age) + … ;
Risk = 1 - 0.95**exp(score - average score);
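Putting these functions together for the worked example of the 61-year-old woman might look like the following. This is a minimal sketch using the rounded coefficients shown on the slides; the variable names (chol, hdl, sbp, smoke, diab, avgscore) are placeholders chosen for illustration and are not the names in the class dataset.

```sas
* Sketch: 10-year CVD risk for the example woman from the slides;
* Coefficients are the rounded values shown above;
DATA risk;
  age = 61; chol = 180; hdl = 47; sbp = 124; smoke = 1; diab = 0;
  avgscore = 26.19;

  * LOG is the natural log;
  score = 2.33*LOG(age) + 1.21*LOG(chol) - 0.71*LOG(hdl)
        + 2.76*LOG(sbp) + 0.53*smoke + 0.69*diab;

  * Parentheses make the order explicit, 0.95 is raised to exp(difference);
  risk = 1 - 0.95**(EXP(score - avgscore));
RUN;

PROC PRINT DATA=risk;
  VAR score risk;
  FORMAT score 6.2 risk 6.4;
RUN;
```

With these rounded coefficients the score comes out near 26.96 and the risk near 0.105, close to the 26.97 and 0.1048 shown in the example; the small gap presumably reflects unrounded coefficients behind the slide's numbers.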