Published by August Torgersen. Modified over 6 years ago.
1
SAS Programming I
Matthew A. Lanham, Doctoral Student
Virginia Polytechnic Institute and State University
Pamplin College of Business, Department of Business Information Technology
September 20, 2018
3
Short Course Outline
- A brief history of SAS
- How is SAS different from other software packages?
- SAS Environment
- SAS Data Libraries
- Getting data into SAS using the Import Wizard
- Data Example: National Longitudinal Mortality Study
- SAS Programs: DATA and PROC
- Examples
  - Creating data from scratch
  - Merging data sets
  - Saving data sets
- Basic Data Analysis PROCs
- References
4
History of SAS
- SAS Institute, Cary, NC
- Jim Goodnight and Anthony Barr, co-founders
- Began as a research project at NC State in 1966, with funding from NIH
- 1976: 100+ customers; left NC State to form SAS Institute Inc.
- 1980s: one of Inc. Magazine's fastest-growing companies; 1,500 employees
- 1990s: 7,000 employees; #1 on Fortune's best places to work
- 2011: 11,000 employees, more than 50,000 customer sites, and 200 products
- Today:
  - Holds the largest market share for advanced analytics
  - Used by 79% of Fortune 500 companies
5
How is SAS different from other software packages like R?
SAS versus R:
- SAS requires a paid license; R is open source.
- Any statistical analysis you can do in SAS, you could probably do in R.
- SAS is commercial software, so additions take longer. R has new libraries added frequently.
- SAS help and documentation are thorough (like MATLAB). R documentation is lean.
- SAS handles large data sets with ease. R holds everything in RAM, so very large data sets can exhaust memory.
- SAS and R are both fairly easy to learn, but R feels more like typical programming.
- SAS can be used like a database (PROC SQL) because of its design.
Popularity:
- TIOBE Software, which ranks software popularity, currently has SAS ranked #22 and R #24.
- KDnuggets 2013 software poll for data science or big data: R (37%), SAS (11%), MATLAB (10%)
- R-bloggers.com, number of jobs on Indeed.com requesting skills in analytics: #1 SAS (12,272), #2 SPSS (3,289), #3 R (1,693)
- SAS and R comparison code
Suggestions:
- Learn the basics of both and use them as needed. You can work with both together.
- I'll use R packages if I don't want to write methods myself and SAS doesn't have a PROC for me to use.
6
SAS Environment
When you start SAS, you should see three windows by default:
- Log window: displays messages about your SAS session and submitted programs
- Output window: shows "LISTING" output from submitted programs; most results will show up in the Results Viewer
- Enhanced Editor window: used to write and edit your programs
[Screenshot labels: Log window, Output window, Editor window]
7
SAS Environment
On the left-hand side you will find:
- Explorer window: allows you to view and manage your SAS files
- Results window: helps you navigate and manage output from submitted programs, using a tree structure to list the various types of output
8
SAS Statements
Just like in other languages, you need to write code to get your program to do what you want it to do. In SAS, we write SAS statements, which:
- Usually begin with a SAS keyword (DATA or PROC)
- Always end with a semicolon (;)
- Are free-format: a statement can begin and end anywhere on a line, can continue over multiple lines, and several statements can share one line
- Use blanks or special characters to separate words
A step (a group of statements) usually ends with a RUN; or QUIT; statement.
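The free-format rule is easiest to see side by side. The sketch below is illustrative only; the data set name work.demo and the variables x and y are made up for the example.

```sas
/* The same DATA step written two ways; SAS ignores line breaks
   and extra blanks between words. */
data work.demo;      /* the DATA keyword starts the step */
    x = 1;           /* each statement ends with a semicolon */
    y = x + 1;
run;                 /* RUN ends the step */

data work.demo; x = 1; y = x + 1; run;   /* identical step on one line */

/* A PROC step that prints the result */
proc print data=work.demo;
run;
```

Both versions create the same one-row data set; the multi-line layout is simply easier to read and debug.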
9
SAS Data Libraries
SAS Data Library: a group of SAS files that are stored in the same directory.
Note: Only files with a SAS file extension (for data sets, .sas7bdat) are recognized, but you can store other files in the same directory.
Temporary vs. Permanent SAS data libraries
Temporary:
- Data files last only for the current SAS session
- If you don't specify a library name, your file is stored in the "work" data library
Permanent:
- Files remain there after SAS closes, until you delete them
- If you specify a library other than work, you are creating a permanent data set
10
SAS Data Libraries - Examples
General Form: Temporary
    DATA filename_new;
        SET libref.filename_current;
    RUN;
You can qualify the filename with work, but it's not necessary.
General Form: Permanent
    LIBNAME libref 'SAS-data-library-location';
    DATA libref.filename_new;
        SET libref.filename_current;
    RUN;
In the example we defined a library called "thesis". If we closed our SAS session now, what do you think would happen?
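As a rough illustration of the temporary versus permanent distinction, the sketch below assumes a Windows folder C:\thesis\data exists; that path and the data set name scores are placeholders, not taken from the slides.

```sas
/* Assumed path: replace with a folder that exists on your machine. */
libname thesis 'C:\thesis\data';

/* Temporary: a one-level name goes to the WORK library
   and disappears when the SAS session ends. */
data scores;
    x = 1;
run;

/* Permanent: the two-level name writes scores.sas7bdat
   into the folder assigned to THESIS. */
data thesis.scores;
    set work.scores;
run;
```

After restarting SAS, work.scores is gone, while thesis.scores is available again as soon as the LIBNAME statement is resubmitted.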
11
NLMS Data: National Longitudinal Mortality Study
- Survey given by the Census Bureau
- Public access data; this is the 3rd release (June 1, 2008)
- 38 variables; 988,346 observations
12
Creating a SAS Data Set Using the Import Wizard
- Choose File > Import Data…
- Follow the wizard screens (steps 1 to 7 in the slide screenshots), then click Finish
Analysts' steps:
- Review the log for errors
- Review the data
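The wizard's work can also be written directly as a PROC IMPORT step. This is a minimal sketch, not the code from the slides: the file path and the output name work.nlms are assumptions made for illustration.

```sas
/* Hypothetical CSV file; adjust the path to your own data. */
proc import datafile='C:\thesis\data\nlms.csv'
            out=work.nlms      /* name of the SAS data set to create */
            dbms=csv           /* file type */
            replace;           /* overwrite work.nlms if it already exists */
    getnames=yes;              /* take variable names from the first row */
run;
```

As with the wizard, check the log for errors afterwards and look at the data before analyzing it.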
13
Typical formats for the Import Wizard
- Microsoft Excel Workbook (*.xls *.xlsb *.xlsm *.xlsx)
- Microsoft Access Database (*.mdb *.accdb)
- Comma Separated File (*.csv)
- Tab Delimited File (*.txt)
- JMP File (*.jmp)
- SPSS File (*.sav)
- Stata File (*.dta)
- Others
14
Creating a SAS Data Set Using the DATA Step
1. Write the DATA step in the Enhanced Editor.
2. Check the SAS Log window for messages about the submitted step.
3. Review the data set using the contents procedure (PROC CONTENTS).
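The slide's code was shown as a screenshot, so the following is only a sketch of what such a DATA step can look like. The file path and the three column names are hypothetical and would need to match your file.

```sas
data work.nlms_raw;
    /* DSD reads comma-separated values and treats two consecutive
       commas as a missing value; FIRSTOBS=2 skips the header row. */
    infile 'C:\thesis\data\nlms.csv' dsd firstobs=2;   /* hypothetical file */
    input age sex ms;                                  /* hypothetical numeric columns */
run;

/* Describe the new data set: variable names, types, lengths */
proc contents data=work.nlms_raw;
run;
```

Unlike the Import Wizard, the DATA step makes you state the variable names and types yourself, which gives more control over how the file is read.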
15
What was different between the Import Wizard and the DATA step?
17
DATA steps versus PROC steps
- DATA steps: typically create or modify data sets; also used for custom reports.
- PROC (procedure) steps: invoke pre-written routines and typically present the data in the form of a report.
Example PROCs: PROC SORT, PROC PRINT, PROC CONTENTS, PROC COPY, PROC UNIVARIATE, PROC MEANS, PROC FREQ, PROC SUMMARY, PROC CORR, PROC REPORT
Example OPTIONS: number/nonumber, date/nodate, pageno = 1, pagesize = 60, yearcutoff = 1920, firstobs = 2
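To make the three kinds of code concrete, here is a small sketch with an OPTIONS statement, a DATA step, and a PROC step. The data set work.demo and its values are invented for the example.

```sas
/* Global OPTIONS statement: applies to the steps that follow */
options nodate nonumber pageno=1 pagesize=60 yearcutoff=1920;

/* DATA step: creates a data set */
data work.demo;
    input id score;
    datalines;
1 90
2 85
;

/* PROC step: calls a pre-written routine that reports on the data */
proc print data=work.demo;
run;
```

FIRSTOBS=2 is left out here on purpose: set as a system option, it would make every later step skip the first observation of each data set it reads.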
18
Creating a new data set with assignment statements
Use the DATA step with a SET statement:
1. Specify the new data set (libref.filename). You can use the KEEP= option if you want only certain variables.
2. After SET, specify the SAS data set to read.
3. Everything else is optional. In the example we stated that we only want data where marital status, death, and age are known. We then created two dummy variables: "mar_male" for married males and "unmar_male" for unmarried males.
General Form:
    DATA libref.filename_new;
        SET libref.filename_old;
        <optional SAS statements>;
    RUN;
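A small self-contained sketch of this pattern follows. The variable names (age, sex, ms, dead) and the codings (sex = 1 for male, ms = 1 for married) are assumptions for illustration and are not the actual NLMS variable names or codes.

```sas
/* Tiny made-up input data; a period is a missing numeric value */
data work.people;
    input age sex ms dead;
    datalines;
45 1 1 0
30 1 5 0
. 2 1 1
62 2 . 0
;

data work.people_new;
    set work.people;
    /* keep only rows where marital status, death, and age are known */
    where ms ne . and dead ne . and age ne .;
    /* a comparison in parentheses evaluates to 1 (true) or 0 (false) */
    mar_male   = (sex = 1 and ms = 1);
    unmar_male = (sex = 1 and ms ne 1);
run;
```

Only the first two rows survive the WHERE filter, and each gets the two new dummy variables.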
19
DROP and KEEP statements
You may use DROP= or KEEP= options in either:
- the DATA statement, or
- the SET statement
KEEP= after SET tells SAS to read only those variables.
Previous code: we had to "keep" all 3 newly created variables here.
Alternative code: we don't need to here, but we must include indea because it is used in the WHERE statement.
General Form:
    DATA libref.filename_new;
        SET libref.filename_old;
        <optional SAS statements>;
    RUN;
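The difference between the two placements can be sketched as follows. The data set and variable names are invented for the example.

```sas
data work.people;
    input age sex ms dead;
    datalines;
45 1 1 0
30 1 5 0
;

/* KEEP= on SET: only these variables are read in.
   Anything used later in the step (here sex and ms) must be listed. */
data work.slim_in;
    set work.people (keep=age sex ms);
    mar_male = (sex = 1 and ms = 1);
run;

/* KEEP= on DATA: every variable is read, but only the listed
   ones are written to the output data set. */
data work.slim_out (keep=age mar_male);
    set work.people;
    mar_male = (sex = 1 and ms = 1);
run;

/* DROP statement: the mirror image, naming what to leave out */
data work.no_dead;
    set work.people;
    drop dead;
run;
```

Filtering on input is the cheaper choice for wide data sets, because unneeded variables are never loaded.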
20
IF-THEN-ELSE Conditioning
SAS IF statements are similar to those in other languages.
General Form:
    IF <expression1> THEN <statement1>;
    ELSE IF <expression2> THEN <statement2>;
    ELSE <statement3>;
Alternative code: previously we created two new variables, but we could have created one.
Alternative code 3: here we used "mar_male" as our lone dummy variable.
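One way the general form might be applied is shown below. The codings (sex = 1 for male, ms = 1 for married) and the variable group are assumptions made for this sketch.

```sas
data work.people;
    input sex ms;
    datalines;
1 1
1 5
2 1
;

data work.flags;
    set work.people;
    length group $ 14;    /* long enough for the longest label */
    if sex = 1 and ms = 1 then group = 'married male';
    else if sex = 1 then group = 'unmarried male';
    else group = 'female';
run;
```

The LENGTH statement matters: without it, the variable would take its length from the first assignment ('married male', 12 characters) and truncate the longer label.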
21
Conditional syntax
Examples of conditional SAS syntax:
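The slide's own examples were images and did not survive, so the following is a stand-in sketch of common comparison and logical operators. All variable names and codes are made up.

```sas
data work.ops;
    input age ms;
    adult      = (age ge 18);              /* GE means >= */
    not_five   = (ms ne 5);                /* NE means not equal */
    mid_age    = (30 <= age <= 60);        /* range comparison */
    in_list    = (ms in (1, 2, 3));        /* IN tests membership in a list */
    older_code1 = (age gt 40 and ms eq 1); /* AND, OR, NOT combine conditions */
    datalines;
45 1
17 5
;
```

Mnemonic operators (EQ, NE, GT, LT, GE, LE) and their symbol forms (=, ^=, >, <, >=, <=) are interchangeable.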
22
PROC SORT
PROC SORT: sorts data according to specified variables. Data must be sorted before any step that processes it with a BY statement (for example, PROC PRINT with BY).
General Form:
    PROC SORT data=libref.filename <options>;
        BY <descending> variables;
    RUN;
By default, data is sorted in ascending order, but you may specify descending.
<options>:
- OUT= creates a data set of the sorted data. If not used, SORT overwrites the input data set.
- NODUPKEY removes duplicate observations that have the same values for the BY variables.
Example code: we sorted the data set by mar_male first, age second, etc.
- Is our sorted data a temporary or permanent data set?
- What would have happened if we didn't use the out=nlms_ex4 option?
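A minimal sketch of these options on made-up data (the data set names and values are illustrative only):

```sas
data work.people;
    input mar_male age;
    datalines;
1 45
0 30
1 45
0 62
;

/* OUT= leaves work.people untouched and writes the sorted copy elsewhere.
   NODUPKEY keeps one row per distinct (mar_male, age) combination,
   so the second "1 45" row is dropped. */
proc sort data=work.people out=work.people_sorted nodupkey;
    by descending mar_male age;   /* DESCENDING applies to mar_male only */
run;
```

Because the output name starts with work, the sorted data set here is temporary.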
23
PROC CONTENTS
PROC CONTENTS: use to create SAS output that describes either
- the contents of a library, or
- the descriptor portion of an individual data set
General Form:
    PROC CONTENTS data=libref.filename <options>;
    RUN;
Examples: What do you think happens when the nods option is removed?
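Both uses can be sketched as below; work.demo is a throwaway data set created just for the example.

```sas
data work.demo;
    id = 1;
    score = 90;
run;

/* Library-level listing: _ALL_ means every data set in WORK.
   NODS suppresses the per-data-set details, leaving just the directory. */
proc contents data=work._all_ nods;
run;

/* Descriptor portion of one data set: variables, types, lengths, etc. */
proc contents data=work.demo;
run;
```

Removing NODS from the first step prints the full descriptor for every data set in the library, after the directory listing.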
24
Creating Data Sets From Scratch (DATALINES, LINES, CARDS)
You can create your own data set from scratch in a DATA step by using the DATALINES statement.
General Form:
    DATA libref.filename;
        INPUT <variables>;
        DATALINES;
    <data goes here>
    ;
In this case, a RUN; is not needed.
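A concrete instance of the general form, with invented names and values:

```sas
data work.class;
    input name $ age height;   /* $ marks name as a character variable */
    datalines;
Ann 34 165
Bob 41 180
Cy 29 172
;

proc print data=work.class;
run;
```

The lone semicolon after the data lines ends both the data block and the DATA step, which is why no RUN; is required.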
25
Creating Data Sets From Scratch (DATALINES, LINES, CARDS)
More Examples:
26
Combining Data Sets
Concatenating/Appending: stacks one data set on top of the other. Variables that are not in both data sets will have missing values in the observations that come from the data set lacking them.
General Form:
    DATA libref.filename;
        SET filename1 filename2;
    RUN;
Note: If you prefer, you can use PROC APPEND instead of the DATA step.
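A small sketch of both approaches, using made-up data sets a, a2, and b:

```sas
data work.a;
    input id x;
    datalines;
1 10
2 20
;

data work.b;
    input id y;
    datalines;
3 30
;

/* Stack a on top of b: rows from a have missing y,
   and the row from b has missing x. */
data work.stacked;
    set work.a work.b;
run;

/* PROC APPEND adds rows to an existing data set in place.
   Here both data sets have the same variables. */
data work.a2;
    input id x;
    datalines;
4 40
;

proc append base=work.a data=work.a2;
run;
```

PROC APPEND avoids rereading the base data set, which can matter when the base is large.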
27
Combining Data Sets
Merging: allows you to match observations from one data set to another. Data must be sorted first (PROC SORT) for a match merge to work.
Two types of merges:
- One-to-One Match Merge: a single observation in one data set corresponds to a single observation in the other data sets.
- One-to-Many Match Merge: a single observation in one data set corresponds to multiple observations in the other data sets.
General Form:
    DATA libref.filename;
        MERGE filename1 filename2;
        BY variable1 variable2;
    RUN;
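A one-to-one match merge might look like the sketch below; the data sets demog and status and the key id are invented for the example.

```sas
data work.demog;
    input id age;
    datalines;
2 30
1 45
;

data work.status;
    input id ms;
    datalines;
1 1
2 5
;

/* Both inputs must be sorted by the BY variable before merging */
proc sort data=work.demog;
    by id;
run;

proc sort data=work.status;
    by id;
run;

/* Each id appears once in each data set, so rows pair up one to one */
data work.merged;
    merge work.demog work.status;
    by id;
run;
```

The result has one row per id, carrying age from the first data set and ms from the second.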
28
One-to-Many Match Merge
One-to-Many Match Merge: a single observation in one data set corresponds to multiple observations in the other data sets.
Example:
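As an illustrative sketch of the one-to-many case, assume a household-level data set and a person-level data set that share the key hhid; all names and values here are hypothetical.

```sas
data work.households;
    input hhid region $;
    datalines;
1 South
2 West
;

data work.persons;
    input hhid person age;
    datalines;
1 1 45
1 2 43
2 1 30
;

proc sort data=work.households;
    by hhid;
run;

proc sort data=work.persons;
    by hhid;
run;

/* Each household row is repeated for every person in that household */
data work.person_region;
    merge work.households work.persons;
    by hhid;
run;
```

The merged data set has three rows, one per person, each carrying its household's region.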
29
Saving Data Sets
You can use (1) a DATA step, (2) PROC EXPORT, or (3) the drop-down menu.
General Form (DATA step):
    LIBNAME libref 'SAS-data-library-location';
    DATA libref.filename_new;
        SET libref.filename_current;
    RUN;
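The PROC EXPORT route was shown only as a screenshot, so this is a sketch under assumptions: the output path is hypothetical and work.demo is a throwaway data set.

```sas
data work.demo;
    input id score;
    datalines;
1 90
2 85
;

/* Write the data set to a CSV file outside SAS */
proc export data=work.demo
            outfile='C:\thesis\data\demo.csv'   /* hypothetical path */
            dbms=csv
            replace;                            /* overwrite an existing file */
run;
```

Use the DATA step with a libref when the result should stay a SAS data set, and PROC EXPORT when other software needs to read it.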
30
Data Analysis
PROC UNIVARIATE and PROC MEANS are two procedures you might use to get an idea of the distribution of your attributes.
General Forms:
    PROC UNIVARIATE DATA=libref.filename <options>;
        VAR <variable list>;
    RUN;
    PROC MEANS DATA=libref.filename <options>;
        VAR <variable list>;
    RUN;
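A short sketch of both procedures on a made-up data set:

```sas
data work.people;
    input age income;
    datalines;
45 52000
30 41000
62 38000
51 67000
;

/* PROC MEANS: compact summary; the keywords pick the statistics */
proc means data=work.people n mean std min max;
    var age income;
run;

/* PROC UNIVARIATE: fuller picture of one variable's distribution
   (moments, quantiles, extreme values) */
proc univariate data=work.people;
    var age;
run;
```

PROC MEANS is usually the quicker first look; PROC UNIVARIATE is worth the longer output when the shape of a distribution matters.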
31
Data Analysis
PROC CORR will also provide some simple statistics like the mean, standard deviation, etc.
General Form:
    PROC CORR DATA=libref.filename <options>;
        VAR <variable list>;
        WITH <variable list>;
    RUN;
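For illustration, with invented variables height, weight, and age:

```sas
data work.body;
    input height weight age;
    datalines;
165 60 34
180 82 41
172 70 29
158 55 50
;

/* Correlate each VAR variable with each WITH variable.
   Without a WITH statement, PROC CORR correlates all VAR variables
   with one another. */
proc corr data=work.body;
    var height age;
    with weight;
run;
```

The output opens with simple statistics (N, mean, standard deviation, and so on) for every variable listed, followed by the correlation table.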