Introduction to SAS ISYS 650. What Is SAS? SAS is a collection of modules that are used to process and analyze data. It began in the late ’60s and early.


1 Introduction to SAS ISYS 650

2 What Is SAS?
SAS is a collection of modules that are used to process and analyze data. It began in the late ’60s and early ’70s as a statistical package (Statistical Analysis System). SAS is also an extremely powerful, general-purpose programming language. In recent years, it has been enhanced to provide state-of-the-art data mining tools and programs for Web development and analysis.

3 Data-Driven Tasks
The functionality of the SAS System is built around the four data-driven tasks common to virtually any application:
1. data access: addresses the data required by the application
2. data management: shapes data into a form required by the application
3. data analysis: summarizes, reduces, or otherwise transforms raw data into meaningful and useful information
4. data presentation: communicates information in ways that clearly demonstrate its significance

4 An Overview of SAS Data Processing
– DATA steps are used to create SAS data sets.
– PROC steps are used to process SAS data sets.

5 Explore the SAS workspace
When you first start SAS, the five main SAS windows open:
– Explorer
– Results
– Program Editor (or Editor)
– Log
– Output
Menu:
– Tools: New Library

6 Demo
Creating a new library:
– Tools/New Library: enter the Name and Folder, and check Enable at startup
Importing a table from an MS Access database into the new library to create a SAS data set:
– File/Import Data
Opening a SAS data set:
– A SAS data set (also called a table) is a file containing descriptor information and related data values. The file is organized as a table of observations (rows) and variables (columns) that SAS can process.
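The menu steps above also have a programmatic counterpart. The following is a minimal sketch, assuming the library folder is C:\Mydata (the path that appears on later slides) and that it already contains the “student” data set; adjust both to your own setup.

```sas
/* Assign the libref Mydata to a folder. The path is an assumption:
   point it at a folder that exists on your machine. */
LIBNAME Mydata "C:\Mydata";

/* List the data sets stored in the library (NODS suppresses
   the per-data-set detail). */
PROC CONTENTS DATA=Mydata._ALL_ NODS;
RUN;

/* Show the descriptor information (variable names, types, lengths)
   for one data set; "student" is the data set used on later slides. */
PROC CONTENTS DATA=Mydata.student;
RUN;
```

Unlike a library created with Enable at startup checked, a libref assigned with LIBNAME lasts only for the current SAS session, so the statement is usually placed at the top of the program.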

7 Components of SAS programs
DATA steps typically create or modify SAS data sets. For example, you can use DATA steps to
– put your data into a SAS data set
– compute the values for new variables
– check for and correct errors in your data
– produce new SAS data sets by subsetting, merging, and updating existing data sets.
PROC (procedure) steps typically analyze and process data in the form of a SAS data set, and they sometimes create SAS data sets that contain the results of the procedure.

8 A program accessing the SAS data set named “student” in the Mydata library

    DATA myStudent;
      set Mydata.student;
    run;

    PROC print data=myStudent;
    run;

Note: The DATA step creates a temporary data set, myStudent, by reading the “student” data set from the Mydata library. Temporary data sets are stored in the Work library and are deleted when the SAS session ends.

9 SAS Data Access
1. Import Wizard: File/Import Data
Demo: Access, Excel

10 Process SAS Data Set
1. Reference the library name:

    PROC print data=mydata.emp;
    run;

2. Reference the Windows file path directly:

    PROC print data="c:\mydata\emp";
    run;

3. Create a temporary SAS data set from an existing SAS data set:

    DATA myStudent;
      *USE Mydata.student;
      *USE "C:\Mydata\student";
      *SET Mydata.student;
      SET "C:\Mydata\student";
    run;

    PROC print data=myStudent;
    run;

Note: A statement that begins with “*” and ends with “;” is a line comment. To add a block comment, use /* … */

11 Creating a Permanent SAS Data Set by Using a Windows File Name or Library.FileName in the DATA Statement
Using a Windows file name:

    DATA "c:\MyData\myStudent";
      *USE Mydata.student;
      *USE "C:\Mydata\student";
      *SET Mydata.student;
      SET "C:\Mydata\student";
    run;

Note: This example creates a new permanent data set from the “student” data set in the MyData library.
Using Library.FileName:

    DATA MyData.myStudent;
      SET "C:\Mydata\student";
    run;

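12 Creating a Data Set Using the INPUT Statement
Permanent data set (two-level name, stored in the MyData library):

    DATA MyData.StGPA2;
      INPUT SID $ Sname $ GPA;
      DATALINES;
    S1 Peter 3.2
    S2 Paul 2.8
    S3 Mary 3.0
    ;
    run;

Temporary data set (one-level name, stored in the Work library):

    DATA StGPA;
      INPUT SID $ Sname $ GPA;
      DATALINES;
    S1 Peter 3.2
    S2 Paul 2.8
    S3 Mary 3.0
    ;
    run;

Note: The data lines end at the first line that contains a semicolon; by convention this is a semicolon on a line of its own.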

13 SAS Data Access
2. Using ODBC with PROC SQL

    PROC SQL;
      CONNECT TO ODBC(DSN='MySalesDB2007');
      CREATE TABLE temp_sas AS
        SELECT * FROM CONNECTION TO ODBC(SELECT * FROM Customer);
      DISCONNECT FROM ODBC;
    QUIT;

    Data Customer;
      set Work.temp_sas;
    run;

    PROC Print DATA=Customer;
    run;

Note: The CREATE TABLE statement creates a SAS data set (Work.temp_sas) from the Customer table in the ODBC data source. PROC SQL keeps running until it reaches QUIT or the start of another step.

14 Create a SAS Data Set as the Result of a SQL Join Statement

    PROC SQL;
      CONNECT TO ODBC(DSN='MySalesDB2007');
      CREATE TABLE temp_sas AS
        SELECT * FROM CONNECTION TO ODBC(
          SELECT Customer.CID, Cname, OID, Odate
          FROM Customer, Orders
          where Customer.cid=orders.cid);
      DISCONNECT FROM ODBC;
    QUIT;

    Data CustomerOrder;
      set Work.temp_sas;
    run;

    PROC Print DATA=CustomerOrder;
    run;

15 SAS Data Management
– Creating calculated field
– Use DROP and KEEP to select fields
– Create a subset of a data set
– Append two data sets
– Merge data sets: equivalent to SQL outer join

16 Creating Calculated Field
Arithmetic operators:
– +, -, *, /, **
Using SAS functions:
– ABS, INT, SQRT, ROUND
– Date functions:
  TODAY(): returns the current date
  INTCK('interval', from, to): returns the number of intervals between two dates
  – The 'interval' can be: DAY, WEEK, MONTH, QTR, YEAR
  – Example: Age = intck('year',dob,today());
  YEAR, MONTH, QTR: return the year, month, or quarter of a date
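One detail of INTCK is worth seeing in a small example: by default it counts interval boundaries crossed (for 'year', each January 1), not completed years. The sketch below uses two made-up dates instead of TODAY() so the result does not depend on when it is run; the variable names are hypothetical. It also shows the optional fourth argument 'CONTINUOUS', which measures whole intervals starting from the first date.

```sas
DATA _NULL_;
  dob  = '15DEC2000'd;   /* hypothetical date of birth */
  asof = '10JAN2021'd;   /* fixed reference date used instead of TODAY() */

  /* Default (discrete) method: counts the January 1 boundaries
     crossed between dob and asof. */
  age_boundary = INTCK('year', dob, asof);

  /* CONTINUOUS method: counts complete years measured from dob. */
  age_complete = INTCK('year', dob, asof, 'CONTINUOUS');

  /* Write both values to the SAS log. */
  PUT age_boundary= age_complete=;
RUN;
```

For these dates the boundary count is 21 while only 20 full years have elapsed, so the 'CONTINUOUS' form is the safer choice when Age must mean completed years.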

17 Examples

    DATA GPAGroup;
      set work.Mystudent;
      IF GPA <2.0 then scholarship=1000;
      Else scholarship=3000;
      IF GPA <2.0 then GPAGrp='Poor';
      Else GPAGrp='Good';
    run;

    proc print data=GPAGroup;
    run;

    DATA AgeGroup;
      set Mydata.student2;
      Age = year(today())-year(DOB);
    run;

    proc print data=AgeGroup;
    run;

18 DROP/KEEP

    DATA Student;
      set work.Mystudent;
      DROP Gender DOB;
    run;

    proc print data=Student;
    run;
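The slide shows only DROP; KEEP is the mirror image. Here is a minimal sketch, assuming a small stand-in for work.Mystudent whose rows and Gender values are made up for illustration.

```sas
/* Stand-in for work.Mystudent; the rows are illustrative only. */
DATA Mystudent;
  INPUT SID $ Sname $ Gender $ GPA;
  DATALINES;
S1 Peter M 3.2
S2 Paul M 2.8
S3 Mary F 3.0
;
RUN;

/* KEEP lists the variables to retain; every other variable
   (here Gender) is left out of the output data set. */
DATA StudentGPA;
  SET Mystudent;
  KEEP SID Sname GPA;
RUN;

PROC PRINT DATA=StudentGPA;
RUN;
```

Use whichever statement gives the shorter list: DROP when removing a few variables, KEEP when retaining a few.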

19 Subset a Data Set with IF

    DATA highIncome;
      set Mydata.Emp;
      IF Salary>60000;
    run;

    proc print data=highIncome;
    run;
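An IF statement with no THEN clause is a subsetting IF: observations that fail the condition are not written to the output data set. The sketch below, which uses a made-up Emp table in place of Mydata.Emp, puts it next to the equivalent explicit form with DELETE.

```sas
/* Stand-in for Mydata.Emp; names and salaries are illustrative only. */
DATA Emp;
  INPUT Name $ Salary;
  DATALINES;
Ann 72000
Bob 48000
Cho 65000
;
RUN;

/* Subsetting IF: only rows where the condition is true are written. */
DATA highIncome;
  SET Emp;
  IF Salary > 60000;
RUN;

/* Equivalent explicit form: delete the rows that fail the condition. */
DATA highIncome2;
  SET Emp;
  IF NOT (Salary > 60000) THEN DELETE;
RUN;

PROC PRINT DATA=highIncome;
RUN;

PROC PRINT DATA=highIncome2;
RUN;
```

Both output data sets contain the same two rows (Ann and Cho).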

20 Vertically Merging Two Data Sets (Append)

    DATA StDOB;
      set Mydata.Student;
      Name=Sname;
      KEEP Name DOB;
    run;

    DATA EmpDOB;
      set Mydata.Emp;
      DOB=Birthdate;
      KEEP Name DOB;
    run;

    DATA AllDOB;
      SET STDOB EmpDOB;
    Run;

    proc print data=ALLDOB;
    run;
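The example above copies values into commonly named variables because SET stacks data sets by variable name: a variable that exists in only one input is missing for the rows from the other. As a sketch of an alternative, the RENAME= data set option can align the names while the data is read. The two small input tables below are made up for illustration.

```sas
/* Illustrative stand-ins for the student and employee tables. */
DATA StDOB;
  INPUT Name $ DOB :DATE9.;
  FORMAT DOB DATE9.;
  DATALINES;
Peter 12MAR2001
Mary 30JUL2000
;
RUN;

DATA EmpDOB;
  INPUT Name $ Birthdate :DATE9.;
  FORMAT Birthdate DATE9.;
  DATALINES;
Ann 05JAN1985
;
RUN;

/* RENAME= renames Birthdate to DOB as EmpDOB is read, so the
   stacked result has a single DOB column for all three rows. */
DATA AllDOB;
  SET StDOB EmpDOB(RENAME=(Birthdate=DOB));
RUN;

PROC PRINT DATA=AllDOB;
RUN;
```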

21 Horizontally Merging Two Data Sets
1. Both data sets must be sorted by the same field (the BY variable).
2. This operation is comparable to a SQL full outer join.

    PROC SQL;
      CONNECT TO ODBC(DSN='MySalesDB2007');
      CREATE TABLE temp_sas AS
        SELECT * FROM CONNECTION TO ODBC(SELECT * FROM Customer);
      CREATE TABLE temp_sas2 AS
        SELECT * FROM CONNECTION TO ODBC(SELECT * FROM Orders);
      DISCONNECT FROM ODBC;
    QUIT;

    PROC SORT Data=Work.temp_sas;
      BY CID;
    run;

    PROC SORT Data=Work.temp_sas2;
      BY CID;
    run;

    Data CustomerOrders;
      MERGE temp_sas temp_sas2;
      BY CID;
      KEEP CID CNAME OID ODATE SALESPERSON;
    run;

    PROC Print DATA=CustomerOrders;
    run;
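To see the outer-join behavior without an ODBC connection, the sketch below merges two small made-up tables. It also uses the IN= data set option, which is not on the slide, to keep only the CIDs present in both inputs (closer to an inner join). The flag names inCust and inOrd are hypothetical.

```sas
/* Illustrative stand-ins for the Customer and Orders tables. */
DATA Customer;
  INPUT CID $ Cname $;
  DATALINES;
C1 Adams
C2 Baker
C3 Clark
;
RUN;

DATA Orders;
  INPUT OID $ CID $;
  DATALINES;
O1 C1
O2 C1
O3 C4
;
RUN;

/* MERGE with BY requires both inputs to be sorted by the BY variable. */
PROC SORT DATA=Customer;
  BY CID;
RUN;

PROC SORT DATA=Orders;
  BY CID;
RUN;

/* Plain match-merge: every CID from either data set is kept,
   with missing values where one side has no match. */
DATA AllRows;
  MERGE Customer Orders;
  BY CID;
RUN;

/* IN= sets a temporary 0/1 flag for each input; the subsetting IF
   keeps only the CIDs that appear in both data sets. */
DATA Matched;
  MERGE Customer(IN=inCust) Orders(IN=inOrd);
  BY CID;
  IF inCust AND inOrd;
RUN;

PROC PRINT DATA=AllRows;
RUN;

PROC PRINT DATA=Matched;
RUN;
```

With this data, AllRows has five rows (C2 and C3 without orders, C4 without a customer name) and Matched has the two C1 rows.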

22 A few SAS PROCs
– PROC PRINT
– PROC SORT
– PROC MEANS
– PROC SQL
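PROC MEANS is the one procedure in this list that the earlier slides do not demonstrate. The sketch below runs it, together with PROC SORT and PROC SQL, on the three-row StGPA data from the INPUT slide; the output data set name and the column aliases are hypothetical.

```sas
/* Same three rows as the StGPA example on the INPUT slide. */
DATA StGPA;
  INPUT SID $ Sname $ GPA;
  DATALINES;
S1 Peter 3.2
S2 Paul 2.8
S3 Mary 3.0
;
RUN;

/* PROC SORT: order by GPA, highest first, into a new data set. */
PROC SORT DATA=StGPA OUT=StGPA_sorted;
  BY DESCENDING GPA;
RUN;

/* PROC PRINT: list the sorted rows. */
PROC PRINT DATA=StGPA_sorted;
RUN;

/* PROC MEANS: summary statistics for a numeric variable. */
PROC MEANS DATA=StGPA N MEAN MIN MAX;
  VAR GPA;
RUN;

/* PROC SQL: the same kind of summary written as a query. */
PROC SQL;
  SELECT COUNT(*) AS n_students, AVG(GPA) AS avg_gpa
  FROM StGPA;
QUIT;
```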

