SAS Programming I Matthew A. Lanham Doctoral Student


SAS Programming I
Matthew A. Lanham, Doctoral Student
Virginia Polytechnic Institute and State University
Pamplin College of Business
Department of Business Information Technology
September 20, 2018


Short Course Outline
- A brief history of SAS
- How is SAS different from other software packages?
- SAS Environment
- SAS Data Libraries
- Getting data into SAS using the IMPORT Wizard
- Data Example: National Longitudinal Mortality Survey
- SAS Programs: DATA and PROC examples
- Creating data from scratch
- Merging data sets
- Saving data sets
- Basic Data Analysis PROCs
- References

History of SAS
SAS Institute, Cary, NC
- Jim Goodnight and Anthony Barr, co-founders
- Began as a research project at NC State in 1966
- Funding from NIH from 1966-1972
- 1976: 100+ customers; left NC State to form the SAS Institute, Inc.
- 1980s: one of Inc. Magazine's fast growing companies, 1500 employees
- 1990s: 7000 employees
- 2010-2011: #1 on Fortune's best places to work
- 2011: 11,000 employees, more than 50,000 customer sites and 200 products
Today:
- Holds largest market share for advanced analytics
- Used by 79% of Fortune 500 companies

How is SAS different from other software packages like R?
SAS versus R:
- SAS requires a paid license; R is open source.
- Any statistical analysis you can do in SAS, you could probably do in R.
- SAS is commercial software, so additions take longer. R has new libraries added frequently.
- SAS help and documentation is nice (like MATLAB). R documentation is lean.
- SAS handles large data sets with ease. R stores everything in RAM, which makes it vulnerable when the data outgrow memory.
- SAS and R are both fairly easy to learn, but R feels more like typical programming.
- SAS can be used like a database (PROC SQL) because of its design.
Popularity:
- TIOBE Software, which ranks software popularity, currently has SAS ranked #22 and R #24.
- KDnuggets 2013 software poll for data science or big data: R (37%), SAS (11%), MATLAB (10%)
- Rblogger.com, number of jobs on Indeed.com requesting skills in analytics: #1 SAS (12,272), #2 SPSS (3,289), #3 R (1,693)
SAS and R comparison code: http://sas-and-r.blogspot.com/p/statistics-examples.html
Suggestions: Learn the basics of both and use as needed. You can work with both together. I'll use R packages if I don't want to write methods myself and SAS doesn't have a PROC for me to use.

SAS Environment
When you begin SAS, you should see 3 windows by default:
- Log window: displays messages about your SAS session and submitted programs
- Output window: "LISTING" output from submitted programs. Most results will show up in the "Results Viewer".
- Enhanced Editor window: used to write and edit your programs

SAS Environment
On the left-hand side is where you will find:
- Explorer window: allows you to view and manage your SAS files
- Results window: helps you navigate and manage output from submitted programs. It uses a tree structure to list the various types of output.

SAS Statements
Just like other languages, you need to write code to get your program to do what you want it to do. In SAS, we write SAS statements, which:
- Usually begin with a SAS keyword (DATA or PROC)
- Always end with a semicolon (;)
- Are free-format (a statement can begin or end anywhere on a line, can continue over multiple lines, and can share a line with other statements)
- Use blanks or special characters to separate words
A DATA or PROC step usually ends with a RUN; or QUIT; statement.
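A minimal sketch of the free-format rule, assuming the sample table sashelp.class that ships with SAS as a stand-in for your own data. The data set name work.class_copy is made up for illustration.

```sas
/* Several statements on one line: each still ends with its own semicolon */
data work.class_copy; set sashelp.class; run;

/* One statement continued over two lines: the semicolon, not the line break, ends it */
proc print
    data=work.class_copy;
run;
```

Both layouts are read the same way by SAS, so the choice is about readability. One statement per line with indentation inside each step is the common convention.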

SAS Data Libraries
SAS data library: a group of SAS files that are stored in the same directory. Note: only files with a SAS file extension (.sas7bdat) are recognized, but you can store other files here.
Temporary vs. permanent SAS data libraries:
- Temporary: data files only last for the current SAS session. If you don't specify a library name, your file is stored in the "work" data library.
- Permanent: files remain there even after SAS closes, until you delete them. If you specify a library other than work, you are creating a permanent data set.

SAS Data Libraries - Examples
General form, temporary (you can qualify the filename with work, but it's not necessary):
    DATA filename_new;
        SET libref.filename_current;
    RUN;
General form, permanent:
    LIBNAME libref 'SAS-data-library-location';
    DATA libref.filename_new;
        SET libref.filename_current;
    RUN;
We defined a library called "thesis". If we closed our SAS session now, what do you think would happen?
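As a sketch of the two general forms, the code below assigns a libref named thesis and writes one temporary and one permanent data set. The folder path and the data set names are assumptions for illustration, and sashelp.class (a sample table shipped with SAS) stands in for the course data.

```sas
/* Hypothetical folder: replace with a directory that exists on your machine */
libname thesis 'C:\thesis\data';

/* Temporary: no libref given, so this is WORK.CLASS_TMP and it is deleted when SAS closes */
data class_tmp;
    set sashelp.class;
run;

/* Permanent: written to the thesis folder as class_perm.sas7bdat */
data thesis.class_perm;
    set sashelp.class;
run;
```

After the session closes, the permanent file stays on disk, but the libref itself does not persist. The LIBNAME statement has to be submitted again in the next session before thesis.class_perm can be referenced.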

National Longitudinal Mortality Survey (NLMS) Data
- Survey given by the Census Bureau
- Public access data: http://www.census.gov/did/www/nlms/publications/public.html
- This is the 3rd release (June 1, 2008)
- 38 variables; 988,346 observations

Creating a SAS Data Set Using the Import Wizard
Wizard steps: choose File > Import Data..., work through the wizard screens, and click Finish.
Analysts' steps:
- Review the log for errors
- Review the data

Typical formats for Import Wizard
- Microsoft Excel Workbook (*.xls *.xlsb *.xlsm *.xlsx)
- Microsoft Access Database (*.mdb *.accdb)
- Comma Separated File (*.csv)
- Tab Delimited File (*.txt)
- JMP File (*.jmp)
- SPSS File (*.sav)
- Stata File (*.dta)
- Others
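The Import Wizard's work can also be written as a PROC IMPORT step, which is handy when the import needs to be repeatable. The sketch below assumes a comma separated file; the path and the output name work.nlms are hypothetical.

```sas
proc import datafile='C:\thesis\data\nlms.csv'   /* hypothetical path */
    out=work.nlms          /* data set to create */
    dbms=csv               /* file type: comma separated */
    replace;               /* overwrite work.nlms if it already exists */
    getnames=yes;          /* first row holds the variable names */
    guessingrows=1000;     /* rows scanned to guess each column's type and length */
run;
```

As with the wizard, review the log afterwards: PROC IMPORT guesses types from the scanned rows, so a column can end up as character when numeric was expected.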

Creating a SAS Data Set Using the DATA step
1. In the Enhanced Editor, write and submit the DATA step.
2. Check the SAS Log window for the notes about the new data set.
3. Review the data set using the contents procedure (PROC CONTENTS).
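A minimal sketch of reading a raw file with a DATA step, followed by the PROC CONTENTS review. The file path, the column order, and the four variable names are assumptions for illustration; the real NLMS file has 38 variables.

```sas
data work.nlms;
    infile 'C:\thesis\data\nlms.csv'   /* hypothetical path */
        dsd                /* comma-delimited; two commas in a row mean a missing value */
        firstobs=2         /* skip the header row */
        truncover;         /* a short line gives missing values instead of reading the next line */
    input age sex ms indea;            /* hypothetical numeric columns, in file order */
run;

/* Descriptor portion: variable names, types, lengths, and observation count */
proc contents data=work.nlms;
run;
```

Unlike the wizard, the DATA step leaves names, types, and order entirely in your hands, which is the main difference to look for when comparing the two approaches.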

What was different between the Import Wizard and the DATA step?


DATA steps versus PROC steps
- DATA steps typically create or modify data sets; they are also used for custom reports.
- PROC (procedure) steps invoke pre-written routines and typically present the data in the form of a report.
Examples of PROC steps: PROC SORT, PROC PRINT, PROC CONTENTS, PROC COPY, PROC UNIVARIATE, PROC MEANS, PROC FREQ, PROC SUMMARY, PROC CORR, PROC REPORT
Examples of OPTIONS settings: NUMBER/NONUMBER, DATE/NODATE, PAGENO=1, PAGESIZE=60, YEARCUTOFF=1920, FIRSTOBS=2
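The OPTIONS settings listed above are system options set with a global OPTIONS statement. A short sketch, using sashelp.class (a sample table shipped with SAS) so that it runs without any course files:

```sas
/* Global statement: takes effect immediately and stays in force until changed */
options nodate nonumber pageno=1 pagesize=60 yearcutoff=1920;

/* A PROC step: calls a pre-written routine to list the observations */
proc print data=sashelp.class;
run;
```

FIRSTOBS=2 is left out here on purpose. As a system option it would make every later step skip the first observation of every data set it reads, so it is usually safer as a data set option on one input.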

Creating a new data set with assignment statements
Use the DATA step with a SET statement:
1. Specify the new data set (libref.filename). You can use the KEEP= option if you want only certain variables.
2. After SET, specify the SAS data set to use.
3. Everything else is optional. Here we stated that we only want data where marital status, death, and age are known. We then created two dummy variables: one for married male, "mar_male", and one for unmarried male, "unmar_male".
General form:
    DATA libref.filename_new;
        SET libref.filename_old;
        <optional SAS statements>;
    RUN;
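A sketch of the kind of step described above. The small input table is made up so the code runs on its own, and the codings (sex = 1 for male, ms = 1 for married) and the data set names are assumptions, not taken from the NLMS documentation.

```sas
/* Tiny made-up stand-in for the NLMS file */
data work.nlms_demo;
    input age sex ms indea;
    datalines;
45 1 1 0
60 1 5 1
33 2 1 0
. 1 1 0
;

data work.nlms_ex1;
    set work.nlms_demo;
    /* Keep only rows where age, marital status, and death indicator are known */
    where age ne . and ms ne . and indea ne .;
    /* A comparison evaluates to 1 (true) or 0 (false), which gives a dummy variable */
    mar_male   = (sex = 1 and ms = 1);
    unmar_male = (sex = 1 and ms ne 1);
run;
```

The row with a missing age is dropped by the WHERE statement, so work.nlms_ex1 holds three observations.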

DROP and KEEP statements
You may use the DROP= or KEEP= data set options in either the DATA statement or the SET statement. Using KEEP= after SET tells SAS to read only those variables.
Previous code: we had to keep all 3 newly created variables here.
Alternative code: we don't need to here, but we must include indea because it is used in the WHERE statement.
General form:
    DATA libref.filename_new;
        SET libref.filename_old;
        <optional SAS statements>;
    RUN;
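The two placements behave differently, which the sketch below tries to show with sashelp.class (a sample table shipped with SAS). The output names and the teen_male variable are made up for illustration.

```sas
/* KEEP= on SET: only the listed variables are read into the step,
   so any variable used in WHERE (age here) must be in the list */
data work.class_a;
    set sashelp.class (keep=name sex age);
    where age >= 13;
    teen_male = (sex = 'M');
run;

/* KEEP= on DATA: every input variable is available during the step,
   but only the listed ones, including new variables, are written out */
data work.class_b (keep=name teen_male);
    set sashelp.class;
    where age >= 13;
    teen_male = (sex = 'M');
run;
```

work.class_a ends up with name, sex, age, and teen_male; work.class_b ends up with only name and teen_male.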

IF-THEN-ELSE Conditioning
SAS IF statements are similar to those in other languages.
General form:
    IF <expression1> THEN <statement1>;
    ELSE IF <expression2> THEN <statement2>;
    ELSE <statement3>;
Alternative code: previously we created two new variables, but we could have created one.
Alternative code 3: here we used "mar_male" as our lone dummy variable.
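A minimal sketch of the general form, using sashelp.class (a sample table shipped with SAS). The variables older_male and age_group and their cutoffs are made up for illustration.

```sas
data work.class_groups;
    set sashelp.class;

    /* One dummy variable built with IF-THEN-ELSE */
    if sex = 'M' and age >= 14 then older_male = 1;
    else older_male = 0;

    /* Three-way split; set the character length before the first assignment */
    length age_group $ 8;
    if age < 13 then age_group = 'child';
    else if age < 16 then age_group = 'teen';
    else age_group = 'older';
run;
```

One caution: SAS treats a missing numeric value as smaller than any number, so a missing age would fall into the first branch here. Filtering or testing for missing values first avoids that.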

Conditional syntax
Examples of conditional SAS syntax:
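The original examples for this slide are not in the transcript, so the following is only a sketch of common comparison and logical operators, again on sashelp.class. The flag variables are hypothetical.

```sas
data work.class_flags;
    set sashelp.class;

    /* GE means >=, EQ means =; AND requires both conditions */
    if age ge 14 and sex eq 'F' then flag_a = 1;
    else flag_a = 0;

    /* IN tests membership in a list; OR requires either condition; GT means > */
    if name in ('Alfred', 'Alice') or height gt 65 then flag_b = 1;
    else flag_b = 0;

    /* NOT negates a condition; LE means <= */
    if not (weight le 100) then flag_c = 1;
    else flag_c = 0;
run;
```

The mnemonic forms (EQ, NE, GT, GE, LT, LE) and the symbol forms (=, ne or ^=, >, >=, <, <=) can be used interchangeably.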

PROC SORT
PROC SORT sorts data according to specified variables. It must be run before any step that processes the data in BY groups (for example, a PROC PRINT with a BY statement).
General form:
    PROC SORT data=libref.filename <options>;
        BY <descending> variables;
    RUN;
- By default, data are sorted in ascending order, but you may specify DESCENDING.
- <options>: OUT= creates a data set of the sorted data; if it is not used, SORT overwrites the data set. NODUPKEY removes duplicate observations that have the same values for the BY variables.
Example code: we sorted the data set by mar_male first, age second, etc.
- Is our sorted data a temporary or permanent data set?
- What would have happened if we didn't use the out=nlms_ex4 option?
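A short sketch of the options above on sashelp.class (a sample table shipped with SAS); the output names are made up.

```sas
/* Sort by sex ascending, then age descending; OUT= leaves the input untouched */
proc sort data=sashelp.class out=work.class_sorted;
    by sex descending age;
run;

/* NODUPKEY keeps the first observation for each value of the BY variable */
proc sort data=sashelp.class out=work.class_first nodupkey;
    by sex;
run;
```

Both outputs have no libref other than work, so they are temporary. Without OUT=, PROC SORT would try to replace the input data set in place.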

PROC CONTENTS
PROC CONTENTS creates SAS output that describes either:
- The contents of a library
- The descriptor portion of an individual data set
General form:
    PROC CONTENTS data=libref.filename <options>;
    RUN;
What do you think happens when the NODS option is removed?
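Both uses can be sketched as follows, with sashelp.class and the work library standing in for the course data.

```sas
/* Descriptor portion of one data set: variables, types, lengths, observation count */
proc contents data=sashelp.class;
run;

/* Contents of a whole library: _ALL_ lists every member,
   NODS suppresses the per-data-set descriptor details */
proc contents data=work._all_ nods;
run;
```

NODS is only valid together with _ALL_. Dropping it makes SAS print the full descriptor for every data set in the library after the member list.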

Creating Data Sets From Scratch (DATALINES, LINES, CARDS)
You can create your own data set from scratch in a DATA step by using the DATALINES statement. In this case, a RUN; is not needed.
General form:
    DATA libref.filename;
        INPUT <variables>;
        DATALINES;
    <data goes here>
    ;
Examples:
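A minimal sketch of the general form. The data set name and all values are made up for illustration.

```sas
data work.scores;
    /* $ marks name as character; age and score are numeric */
    input name $ age score;
    datalines;
Ann 34 88.5
Bob 41 92
Cy . 79
;

proc print data=work.scores;
run;
```

A single period stands for a missing numeric value, and the lone semicolon on its own line marks the end of the data.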

Creating Data Sets From Scratch (DATALINES, LINES, CARDS)
More examples:
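When values contain blanks, one option is comma-delimited data read with INFILE DATALINES. This is a sketch only; the original examples are not in the transcript, and the table and its values are invented.

```sas
data work.towns;
    /* Read the in-stream lines as comma-delimited; quoted values may contain blanks */
    infile datalines dsd truncover;
    length town $ 20 state $ 2;
    input town $ state $ residents;
    datalines;
"Maple Grove",VA,1200
Riverton,NC,850
"Oak Hill",VA,
;
```

With DSD, the quotation marks are removed from the stored values, and the empty field after the last comma is read as a missing value for residents.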

Combining Data Sets
Concatenating/appending stacks each data set on the other. Variables that are not in both data sets will have missing values in the observations that come from the data set lacking them.
General form:
    DATA libref.filename;
        SET filename1 filename2;
    RUN;
Example:
Note: if you prefer, you can use PROC APPEND instead of the DATA step.
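A small sketch of concatenation with two made-up tables that share only the id variable.

```sas
data work.a;
    input id x;
    datalines;
1 10
2 20
;

data work.b;
    input id y;
    datalines;
3 300
4 400
;

/* Stack the rows of a on top of the rows of b */
data work.stacked;
    set work.a work.b;
run;
```

work.stacked has four observations and three variables (id, x, y). The rows from work.a have y missing, and the rows from work.b have x missing.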

Combining Data Sets
Merging allows you to match observations from one data set to another. The data sets must first be sorted by the BY variables (PROC SORT).
Two types of merges:
- One-to-one match merge: a single observation in one data set corresponds to a single observation in the other data sets.
- One-to-many match merge: a single observation in one data set corresponds to multiple observations in the other data sets.
General form:
    DATA libref.filename;
        MERGE filename1 filename2;
        BY variable1 variable2;
    RUN;
Example:
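A sketch of a one-to-one match merge. The two tables and their values are made up, and they are entered out of order on purpose to show why the sort step is needed.

```sas
data work.people;
    input id name $;
    datalines;
2 Bob
1 Ann
3 Cy
;

data work.ages;
    input id age;
    datalines;
1 34
3 29
2 41
;

/* Both inputs must be sorted by the BY variable before MERGE */
proc sort data=work.people; by id; run;
proc sort data=work.ages;   by id; run;

/* Match rows on id: each id appears once in each table */
data work.people_ages;
    merge work.people work.ages;
    by id;
run;
```

Skipping the PROC SORT steps here would stop the merge with an error in the log saying the BY variables are not properly sorted.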

One-to-Many Match Merge
A single observation in one data set corresponds to multiple observations in other data sets.
Example:
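A sketch of the one-to-many case with made-up customer and order tables. Both are already entered in cust_id order, so no sort is needed.

```sas
/* "One" side: a single row per customer */
data work.customers;
    input cust_id region $;
    datalines;
1 East
2 West
;

/* "Many" side: several rows per customer */
data work.orders;
    input cust_id amount;
    datalines;
1 50
1 75
2 20
;

data work.orders_region;
    merge work.customers work.orders;
    by cust_id;
run;
```

work.orders_region has three observations, one per order, and the region value from the customer table is carried onto every order for that customer.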

Saving Data Sets
You can use (1) a DATA step, (2) PROC EXPORT, or (3) the drop-down menu.
General form (DATA step):
    LIBNAME libref 'SAS-data-library-location';
    DATA libref.filename_new;
        SET libref.filename_current;
    RUN;
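For the PROC EXPORT route, a minimal sketch that writes sashelp.class (a sample table shipped with SAS) to a comma separated file. The output path is hypothetical.

```sas
proc export data=sashelp.class
    outfile='C:\thesis\data\class.csv'   /* hypothetical path */
    dbms=csv                             /* write a comma separated file */
    replace;                             /* overwrite the file if it exists */
run;
```

The DATA step form saves a SAS data set into a permanent library, while PROC EXPORT writes a file that other software can open.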

Data Analysis
PROC UNIVARIATE and PROC MEANS are two procedures you might use to get an idea of the distribution of your attributes.
General forms:
    PROC UNIVARIATE data=libref.filename <options>;
        VAR <variable list>;
    RUN;
    PROC MEANS data=libref.filename <options>;
        VAR <variable list>;
    RUN;
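A sketch of both procedures on sashelp.class (a sample table shipped with SAS), standing in for the course data.

```sas
/* Selected summary statistics for two variables, split by sex */
proc means data=sashelp.class n mean std min max;
    class sex;
    var height weight;
run;

/* Fuller distributional detail: moments, quantiles, extreme observations */
proc univariate data=sashelp.class;
    var height;
run;
```

PROC MEANS gives a compact table that is easy to scan across many variables, while PROC UNIVARIATE produces much longer output per variable, so it is usually run on a short variable list.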

Data Analysis
PROC CORR will also provide some simple statistics like mean, standard deviation, etc.
General form:
    PROC CORR data=libref.filename <options>;
        VAR <variable list>;
        WITH <variable list>;
    RUN;
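A minimal sketch on sashelp.class (a sample table shipped with SAS).

```sas
/* Correlate age (WITH) against height and weight (VAR) */
proc corr data=sashelp.class;
    var height weight;
    with age;
run;
```

With a WITH statement, the output is a rectangular table of the WITH variables against the VAR variables. Leaving WITH out gives the full square correlation matrix of the VAR variables instead.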

References