Chapter 1: Overview of the SAS System
Basic Concepts of the SAS System
The SAS Programming Process
Define the business need, create a SAS program, enter the SAS program code, process the SAS program code, review the results, and debug or modify as needed.
What Is SAS?
SAS is a collection of components that enable you to manage, manipulate, and examine your data. The components are Base SAS, Reporting and Graphics, Analytical, Visualization and Discovery, Data Access and Management, Business Solutions, User Interfaces, Application Development, and Web Enablement.
Basic Functionality
SAS lets you access, manage, analyze, and present data.
Types of Files Used with SAS
SAS program files, SAS data sets, and raw data files.
A SAS data set can be opened only by the SAS System.
A SAS program is written by the user to solve a problem. It can be written directly in the SAS Program Editor, or written in any text editor and then copied and pasted into the SAS Program Editor to be executed.
A raw data file is a text file, for example one in .dat format. For a SAS program to read the file, the program needs a statement that gives the physical path where the data are stored. The INFILE statement does this.
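The role of the INFILE statement can be sketched as follows. This is a minimal illustration, not a prescribed method: the fileref rawfile and the data set work.people are hypothetical names, and the first step writes a small temporary text file only so that the example does not depend on a file already existing on disk. The two records are borrowed from the INPUT statement example later in this chapter.

```sas
/* Hypothetical fileref pointing to a temporary external text file */
filename rawfile temp;

/* Write two raw data records so the example is self-contained */
data _null_;
   file rawfile;
   put 'Peterson  21';
   put 'Morgan    17';
run;

/* INFILE links the DATA step to the raw file; INPUT describes its columns */
data work.people;
   infile rawfile;
   input name $ 1-8 age 11-12;
run;

proc print data=work.people;
run;
```

In practice the INFILE statement would usually name the physical path of an existing raw data file in quotation marks instead of a temporary fileref.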
SAS Program Files
SAS program files contain SAS program code, do not contain data values, and can be saved and reused.
How a SAS Program Works
A SAS program is a sequence of steps that the user submits for execution. DATA steps are typically used to create SAS data sets. PROC steps are typically used to process SAS data sets (that is, to generate reports and graphs, edit data, and sort data).
Flow: raw data -> DATA step -> SAS data set -> PROC step -> report.
Components of a SAS Program
A SAS program is a sequence of steps. There are only two kinds of steps: DATA steps and PROC steps. A program can contain one or more of each.
DATA Step(s)
Typically, DATA steps read data, such as raw data files, SAS data sets, and Excel worksheets, and create SAS data sets.
Input file types: .dat, .txt, .sas7bdat, .xls, etc.
Output file type from a DATA step: .sas7bdat
SAS Data Sets
Diagram: data entry, external files, and other software files pass through a conversion process to become a SAS data set, which has a descriptor portion and a data portion.
DATA Step(s)
In addition, DATA steps can modify existing variables or create new variables as necessary.
PROC Step(s)
PROC steps typically read SAS data sets to create reports and to analyze data.
PROC Step(s)
There are many different types of PROC steps, for example MEANS, PRINT, and FREQ.
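As a brief sketch of the three procedures named above, the steps below run each one against SASHELP.CLASS, a sample data set that ships with SAS. The choice of data set and of the variables Height, Weight, and Sex is only for illustration.

```sas
/* PROC PRINT lists the observations in a data set */
proc print data=sashelp.class;
run;

/* PROC MEANS computes summary statistics for numeric variables */
proc means data=sashelp.class;
   var height weight;
run;

/* PROC FREQ counts how often each value of a variable occurs */
proc freq data=sashelp.class;
   tables sex;
run;
```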
Components of a Step
A SAS program is a sequence of DATA steps and PROC steps. A step is a sequence of one or more statements.
Components of a Step
A statement usually starts with a keyword and always ends in a semicolon (;).
   KEYWORD ... ;
For example:
   input name $ 1-8 age 11-12;
This INPUT statement can read the following data records:
   Peterson  21
   Morgan    17
Because NAME is a character variable, a $ appears between the variable name and the column numbers.
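To see the INPUT statement above inside a complete step, the sketch below reads the same two records as in-stream data with a DATALINES statement. The data set name work.people is a hypothetical choice; the records start in column 1 so that NAME falls in columns 1-8 and AGE in columns 11-12.

```sas
data work.people;
   /* Column input: NAME is character ($) in columns 1-8, AGE is numeric in columns 11-12 */
   input name $ 1-8 age 11-12;
   datalines;
Peterson  21
Morgan    17
;
run;

proc print data=work.people;
run;
```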
Components of a DATA Step
A DATA step starts with a DATA statement and ends with a RUN statement.
   data _______________ ;      (start)
      _______________ ;
      ...
      _______________ ;
   run;                        (end)
Components of a PROC Step
A PROC step starts with a PROC statement and ends with a RUN statement.
   proc _______________ ;      (start)
      _______________ ;
      ...
      _______________ ;
   run;                        (end)
Raw Data Files
Raw data files:
   are nonsoftware-specific files that contain records and fields
   can be created by a variety of software products
   can be read by a variety of software products
   have no special attributes such as field headings, page breaks, or titles
   are not reports.
SAS Data Sets
SAS data sets:
   are files specific to SAS that contain variables and observations
   can be created only by SAS
   can be read only by SAS
   consist of a descriptor portion and a data portion.
How Is a SAS Data Set Created?
Data entry, external files, and other software files go through a conversion process that produces a SAS data set with a descriptor portion and a data portion. The conversion is accomplished in the DATA step.
SAS Data Sets
The descriptor portion contains attribute information about the data in a SAS data set:
   for the data set: SAS data set name, date/time created, number of variables, number of observations
   for each variable: name, type, length, position, label.
The data portion contains the data values in the form of a rectangular table made up of observations (rows) and variables (columns).
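One common way to look at the two portions is sketched below, using the sample data set SASHELP.CLASS that ships with SAS (chosen here only for illustration). PROC CONTENTS reports the descriptor portion, and PROC PRINT displays the data portion.

```sas
/* Descriptor portion: data set name, creation date/time, number of
   variables and observations, and the attributes of each variable */
proc contents data=sashelp.class;
run;

/* Data portion: the first five observations of the rectangular table */
proc print data=sashelp.class (obs=5);
run;
```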
Rules for a Valid SAS Data Set Name and a Valid Variable Name
A name:
   can be 1 to 32 characters long
   must begin with a letter (A-Z, either uppercase or lowercase) or an underscore (_)
   can continue with any combination of numbers, letters, or underscores.
Examples: Policy, pOLIcY, total_bud2010_, and _N_ are valid. Total-budget, 2010_budget, and #num_stud are NOT valid.
Missing Data in a SAS Data Set
For a numeric variable, a missing value is represented by a period (.).
For a character variable, a missing value is represented by a blank space.
Variable Length
The length of a variable is the number of bytes used to store it. A character variable can be up to 32,767 bytes long. Numeric variables are stored as floating-point numbers and have a default length of 8 bytes unless a different length is specified.
Variable Format, Informat, and Label
A variable format controls how the values of the variable are displayed when they are written out from the SAS data set.
A variable informat tells SAS how to read input values and convert them into standard SAS values.
A variable label describes the variable in a more descriptive way than its name. It can be up to 256 characters long.
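The sketch below shows how length, informat, format, label, and missing values might appear together in one DATA step. Everything in it is hypothetical and made up for illustration: the data set work.visits, the variable names, and the two data records. The informat MMDDYY10. and the formats DATE9. and DOLLAR8.2 are standard SAS informats and formats.

```sas
data work.visits;
   length name $ 20;                        /* character variable stored in 20 bytes */
   informat visit_date mmddyy10.;           /* how the raw date text is read in */
   format visit_date date9. fee dollar8.2;  /* how the stored values are displayed */
   label visit_date = 'Date of visit'
         fee        = 'Clinic admission fee';
   input name $ visit_date fee;             /* list input: values separated by blanks */
   datalines;
Murray 01/15/2010 85.20
Almers 01/17/2010 .
;
run;

/* The LABEL option prints variable labels instead of variable names */
proc print data=work.visits label;
run;

/* PROC CONTENTS lists the length, format, informat, and label of each variable */
proc contents data=work.visits;
run;
```

The period in the second record is read as a missing numeric value for fee, and PROC PRINT displays it as a period.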
SAS Libraries
Every SAS file is stored in a SAS library. A SAS data set is one type of SAS file.
In some operating environments, a library is a physical collection of files. In others, such as Windows and UNIX environments, a library is a logical name for a group of files that are stored in one physical location, such as a folder or directory.
A library can be temporary or permanent.
A SAS library must be assigned before a SAS program can reach the directory to read or write a SAS data set. After that, the SAS program only needs to know the library reference name (libref), which stands for the path to the physical location on the hard drive.
Referencing a SAS File in a SAS Library
A SAS file name has two levels: libref.filename
libref is the SAS library name that is connected to a physical directory in a storage location on your computer.
filename is the name of a file stored in the directory that the libref refers to.
Two Types of SAS Library
Temporary SAS data set: the libref is always WORK, which is already available in the Libraries folder of the Explorer window in the SAS working environment. Example: WORK.admit is a temporary SAS data set.
NOTE: You can omit WORK and refer to the data set simply as admit if it is stored in the temporary WORK library.
Permanent SAS data set: the libref is defined by the user. Example: Mylib.admit refers to the SAS data set admit, which is stored in the library named Mylib.
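The note about omitting WORK can be illustrated with the short sketch below. The data set name class_copy is hypothetical, and SASHELP.CLASS is a sample data set that ships with SAS.

```sas
/* Two-level name: create a temporary data set in the WORK library */
data work.class_copy;
   set sashelp.class;
run;

/* One-level name: SAS assumes WORK, so this prints work.class_copy */
proc print data=class_copy;
run;
```

Data sets in the WORK library are deleted when the SAS session ends, which is why a user-defined libref is needed for anything that must be kept.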
Rules for a Valid SAS Library Name (Libref)
A libref:
   is limited to 8 characters
   must start with a letter or an underscore
   can contain only letters, numbers, or underscores.
Examples: s575, _s575, s575_, and _s575_ are valid librefs. S-575 and sta575_online are NOT valid.
An Example of Reading a SAS Data Set
The Admit data set contains admission information for patients in a wellness clinic.

Variable   Type   Length   Description
ID         num    8        patient ID number
Name       char   20       patient name
Sex        char   1        sex (F or M)
Age        num    8        age in years
Date       num    8        day of admission
Height     num    8        height in inches
Weight     num    8        weight in pounds
ActLevel   char   4        activity level (LOW, MOD, HIGH)
Fee        num    8        clinic admission fee
Some observations of the data set

ID     Name             Sex   Age   Date   Height   Weight   ActLevel   Fee
2458   Murray, W        M                                    HIGH
       Almers, C        F                                    HIGH
       Bonaventure, T   F                                    LOW
       Johnson, R       F                                    MOD
       LaMance, K       M                                    LOW        124.80
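The following SAS program performs these tasks: it creates a SAS library named Mylib, reads the SAS data set Admit from the library clinic, selects the patients with a HIGH activity level, stores the selected patients in the Mylib library under the data set name Admit_high, and prints the observations in the new data set. The program assumes that the libref clinic has already been assigned.

   /* Assign the libref Mylib to a physical directory */
   libname Mylib 'C:\Math707\SASData';

   data Mylib.admit_high;
      set clinic.admit;       /* read the existing SAS data set */
      if ActLevel='HIGH';     /* subsetting IF: keep only HIGH activity level */
   run;

   proc print data=Mylib.admit_high;
   run;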