Professional Seminar Northwestern Polytechnic University By Dr. Michael M Cheng
Quiz
Choose the best answer. What is SAS?
a. SAS is a highly contagious disease found in Asia in the wintertime.
b. SAS is sardines and salmon.
c. SAS is software that computes statistics only.
d. SAS is a 4th-generation computer language capable of full-featured computer programming.
e. None of the above.
SAS (SAS System)
A computer software system consisting of several products that provide data retrieval, management, and analysis capabilities in addition to programming (SAS Institute, Inc.).
SAS is a problem-solving tool.
Heuristic Problem Solving
[Diagram: Image Mode 1, Linguistic Mode 1, Image Mode 2, Linguistic Mode 2]
The interaction between the image mode and the linguistic mode is called heuristic problem solving.
Psychology of Communication
By George Miller
Coding
Decoding
Channel capacity
Magic number: 7 plus or minus 2
For example:
A SAS program is composed of SAS statements. Some statements are used only in a PROC step, some only in a DATA step, and some in either step.
SAS Syntax and SAS Data Sets
Most SAS statements begin with an identifying keyword, and every statement ends with a semicolon.
SAS statements are free-format.
A SAS data set is a collection of data values arranged in a rectangular table.
The columns in the table are called variables.
The rows in the table are called observations (or records).
There are two kinds of variables:
- character variables
- numeric variables
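As a minimal sketch of what "free-format" means in practice, the two DATA steps below are equivalent apart from the data set name. The names DEMO, DEMO2, X, and Y are made up for this illustration and do not come from the seminar.

```sas
/* Conventional layout: one statement per line */
DATA DEMO;
   X = 1;
   Y = 2;
RUN;

/* Same logic: statements may share a line or span several lines, */
/* and keywords may be written in upper or lower case.            */
data demo2; x = 1;
   y =
      2;
run;

PROC PRINT DATA=DEMO;
RUN;
```

Only the semicolon ends a statement, so line breaks and indentation are purely for the reader.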
VARIABLES
                 NAME     SEX   AGE   HEIGHT   WEIGHT
observation 1    JOHN     M
observation 2    JAMES    M
observation 3    ALFRED   M
...
observation 19   ALICE    F
DATA CLASS;
   INPUT NAME $1-8 SEX $11 AGE HEIGHT WEIGHT 21-25;
   CARDS;
data lines
PROC PRINT DATA=CLASS;
PROC MEANS DATA=CLASS;
   VARIABLES HEIGHT WEIGHT;
Creating SAS data sets
Raw data -> DATA CLASS; INPUT NAME $1-8 SEX $11 AGE HEIGHT WEIGHT 21-25; CARDS; -> SAS data set CLASS
A listing of the raw data
NAME      SEX   AGE   HEIGHT   WEIGHT
JOHN      M
JAMES     M
ALFRED    M
WILLIAM   M
JEFFREY   M
RONALD    M
THOMAS    M
PHILIP    M
ROBERT    M
HENRY     M
JANET     F
JOYCE     F
JUDY      F
CAROL     F
JANE      F
LOUISE    F
BARBARA   F
MARY      F
ALICE     F
CARDS; /* data lines */
JOHN      M
JAMES     M
ALFRED    M
WILLIAM   M
JEFFREY   M
RONALD    M
THOMAS    M
PHILIP    M
ROBERT    M
HENRY     M
JANET     F
JOYCE     F
JUDY      F
CAROL     F
JANE      F
LOUISE    F
BARBARA   F
MARY      F
ALICE     F
PROC PRINT DATA=CLASS;
SAS
OBS   NAME      SEX   AGE   HEIGHT   WEIGHT
  1   JOHN      M
  2   JAMES     M
  3   ALFRED    M
  4   WILLIAM   M
  5   JEFFREY   M
  6   RONALD    M
  7   THOMAS    M
  8   PHILIP    M
  9   ROBERT    M
 10   HENRY     M
 11   JANET     F
 12   JOYCE     F
 13   JUDY      F
 14   CAROL     F
 15   JANE      F
 16   LOUISE    F
 17   BARBARA   F
 18   MARY      F
 19   ALICE     F
PROC MEANS DATA=CLASS;
   VARIABLES HEIGHT WEIGHT;
SAS
VARIABLE   N   MEAN   STANDARD DEVIATION   MINIMUM VALUE   MAXIMUM VALUE   STD ERROR OF MEAN
HEIGHT
WEIGHT
THE PROC STEP
The PROC (or PROCEDURE) statement is used to call a SAS procedure.
SAS procedures are computer programs that:
- read SAS data sets
- compute statistics
- print results
- create SAS data sets
For example:
PROC MEANS SUM MAXDEC=2 DATA=CLASS;
PROC CONTENTS DATA=CLASS;
PROC SORT DATA=CLASS;
   BY SEX DESCENDING WEIGHT;
Data Transformations
Assignment statement
Assignment statements are used to create new variables and to modify the values of existing variables.
SAS evaluates an expression and assigns the result to a variable.
variable = expression;
For example: x=1+2;
Example:
1. Read three variables (YEAR, REVENUE, and EXPENSE) into a SAS data set.
2. Add a variable named INCOME, which is the difference between REVENUE and EXPENSE.
3. Change the values of YEAR from 2 digits to 4 digits.
DATA PROFITS;
   INPUT YEAR REVENUE EXPENSE;
   INCOME=REVENUE-EXPENSE;
   YEAR = YEAR + 1900;   /* the constant is missing in the source; 1900 is assumed here */
   CARDS;
data lines
PROC PRINT;
SAS
OBS   YEAR   REVENUE   EXPENSE   INCOME
SAS functions
Selected functions that compute simple statistics:
SUM    sum
MEAN   arithmetic mean
VAR    variance
MIN    minimum value
MAX    maximum value
STD    standard deviation
Example:
Given: Temperature data at a specific location are recorded every hour on the hour for several days. Each record in a file represents one day and contains the date and the 24 recorded temperatures for that date.
Objective: Create a SAS data set that contains the date, the 24 hourly temperatures, the average temperature, the minimum temperature, and the maximum temperature for each day.
DATA TEMP;
   INPUT DATE (T1-T24) (2.);
   AVGTEMP=MEAN(OF T1-T24);
   MINTEMP=MIN(OF T1-T24);
   MAXTEMP=MAX(OF T1-T24);
   CARDS;
data lines
program data vector: DATE T1 ... AVGTEMP MINTEMP MAXTEMP
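One reason to prefer the statistic functions over hand-written arithmetic is how they treat missing readings. The following is a minimal sketch, assuming only three readings per day and made-up values; the data set name TEMPDEMO and the variable NAIVE are hypothetical and not part of the seminar's example.

```sas
DATA TEMPDEMO;
   INPUT DATE T1-T3;               /* list input; a period marks a missing reading */
   AVGTEMP = MEAN(OF T1-T3);       /* uses only the nonmissing readings */
   MINTEMP = MIN(OF T1-T3);
   MAXTEMP = MAX(OF T1-T3);
   NAIVE   = (T1 + T2 + T3) / 3;   /* missing whenever any reading is missing */
   CARDS;
1 60 62 64
2 58 . 66
;
PROC PRINT DATA=TEMPDEMO;
RUN;
```

For the second record, AVGTEMP is computed from the two available readings, while NAIVE is missing because the + operator propagates missing values.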
The RETAIN statement
SAS normally resets the variables in the program data vector to missing before each execution of the DATA step.
A RETAIN statement can be used to:
- Retain variable values from the last execution of the DATA step
- Give initial values to the variables
Example: Accumulate totals and count observations.
DATA ADD;
   RETAIN COUNT 0 TOTAL 0;
   INPUT SCORE;
   COUNT=COUNT+1;
   TOTAL=TOTAL+SCORE;
   CARDS;
data lines
PROC PRINT;
program data vector: COUNT TOTAL SCORE
The SUM statement
The SUM statement is a special assignment statement that accumulates values from one observation to the next. It retains the value of the accumulator variable, starts it at zero, and treats a missing value as zero.
Example: Accumulate totals and count observations.
DATA ADD;
   INPUT SCORE;
   COUNT + 1;
   TOTAL + SCORE;
   CARDS;
data lines
PROC PRINT;
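The two accumulation styles (RETAIN with an ordinary assignment, and the SUM statement) differ when a score is missing. The sketch below runs both side by side on three made-up scores; the variable names TOTAL1 and TOTAL2 are hypothetical and chosen only to keep the two totals apart.

```sas
DATA ADD;
   RETAIN TOTAL1 0;            /* retained across iterations, starts at 0 */
   INPUT SCORE;
   TOTAL1 = TOTAL1 + SCORE;    /* turns missing once a missing SCORE is added */
   TOTAL2 + SCORE;             /* SUM statement: a missing SCORE is skipped */
   COUNT + 1;                  /* SUM statement used as a counter */
   CARDS;
10
.
20
;
PROC PRINT DATA=ADD;
RUN;
```

After the second record TOTAL1 is missing and stays missing, because it is retained and missing plus any number is missing. TOTAL2 keeps accumulating the nonmissing scores.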
CONDITIONAL EXECUTION OF SAS STATEMENTS
IF-THEN/ELSE Statements
Use the IF-THEN statement when you want to execute a SAS statement conditional on some expression.
Numeric comparison
IF CODE=1 THEN RESPONSE='GOOD';
IF CODE=2 THEN RESPONSE='FAIR';
IF CODE=3 THEN RESPONSE='POOR';
For efficiency, use ELSE statements: once a condition is true, the remaining ELSE branches are skipped.
IF CODE=1 THEN RESPONSE='GOOD';
ELSE IF CODE=2 THEN RESPONSE='FAIR';
ELSE IF CODE=3 THEN RESPONSE='POOR';
Character comparison
DATA CLASS;
   INPUT NAME $ SEX $ AGE HEIGHT WEIGHT;
   IF SEX='M' THEN SEX='MALE';
   ELSE SEX='FEMALE';
   CARDS;
Comparison operators
LT   <    less than
GT   >    greater than
EQ   =    equal to
LE   <=   less than or equal to
GE   >=   greater than or equal to
NE   ^=   not equal
NL        not less than
NG        not greater than
Logical operators
OR    |   or, either
AND   &   and
NOT   ^   not, negation
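A short sketch of the operators in use, mixing the mnemonic and symbol forms. The data set FLAGS, the variables GROUP and TEEN, the cutoff values, and the three data lines are all made up for illustration.

```sas
DATA FLAGS;
   INPUT NAME $ AGE HEIGHT;
   /* Mnemonic and symbol forms can be mixed freely */
   IF AGE GE 13 AND HEIGHT > 60 THEN GROUP = 'A';
   ELSE IF AGE LT 13 OR HEIGHT <= 60 THEN GROUP = 'B';
   TEEN = (AGE >= 13);   /* a comparison evaluates to 1 (true) or 0 (false) */
   CARDS;
JOHN 12 59.0
ALICE 13 56.5
MARY 15 66.5
;
PROC PRINT DATA=FLAGS;
RUN;
```

With these values only MARY satisfies both conditions of the first IF, so she is the only observation placed in group A.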
DO and END statements
A DO statement specifies that all statements between the DO and its matching END statement are to be executed as a group.
For example:
DATA EMPLOY;
   INPUT NAME $1-8 DEPTNO COM SALARY 19-23;
   IF DEPTNO=201 THEN DO;
      DEPT='SALES';
      GROSSPAY = COM+SALARY;
   END;
   ELSE DO;
      DEPT='ADMIN';
      GROSSPAY = SALARY;
   END;
   CARDS;
JOHNSON
MOSSER
LARKIN
GARRETT

PROC PRINT output
SAS
OBS   NAME      DEPTNO   COM   SALARY   DEPT    GROSSPAY
 1    JOHNSON                           SALES
 2    MOSSER                            ADMIN
 3    LARKIN                            ADMIN
 4    GARRETT                           SALES   22800
PROC SORT DATA=RATE_A; BY ZIP;
PROC SORT DATA=RATE_B; BY ZIP;
PROC SORT DATA=RATE_C; BY ZIP;

DATA TMTL;
   MERGE RATE_A(IN=A) CTL_TBL(IN=B);
   BY ZIP;
   IF A & B;

DATA TMMR;
   MERGE RATE_B(IN=A) CTL_TBL(IN=B);
   BY ZIP;
   IF A & B;

DATA TMCR;
   MERGE RATE_C(IN=A) CTL_TBL(IN=B);
   BY ZIP;
   IF A & B;
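The pattern above is a match-merge: each IN= option creates a temporary flag that is 1 when its data set contributed to the current BY group, and the subsetting IF keeps only ZIP codes found in both inputs. A self-contained sketch follows. The variables RATE and REGION and every data value are made up, and the sketch assumes CTL_TBL must be sorted by ZIP as well, since a MERGE with a BY statement expects all of its inputs in BY order.

```sas
DATA RATE_A;
   INPUT ZIP RATE;
   CARDS;
95110 1.5
94539 2.0
94087 2.5
;
DATA CTL_TBL;
   INPUT ZIP REGION $;
   CARDS;
94539 EAST
95110 SOUTH
95014 WEST
;
PROC SORT DATA=RATE_A;  BY ZIP;
PROC SORT DATA=CTL_TBL; BY ZIP;

DATA TMTL;
   MERGE RATE_A(IN=A) CTL_TBL(IN=B);
   BY ZIP;
   IF A & B;   /* keep a ZIP only when both data sets contributed */
PROC PRINT DATA=TMTL;
RUN;
```

Here only ZIP codes 94539 and 95110 appear in both inputs, so TMTL holds two observations. The flags A and B are not written to the output data set.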
Conclusion
1. SAS is a 4th-generation computer language.
2. SAS is a problem-solving tool.
3. It makes your life easier (less stressful).
THE END