SAS ® 101 Based on Learning SAS by Example: A Programmer’s Guide Chapter 26 By Tasha Chapman, Oregon Health Authority
Topics covered…
- Basic syntax of PROC SQL
  - Select
  - From
  - Where
  - Order By
- Creating tables
- Joining tables
- Group functions
- CASE / WHEN logic
What is SQL?
- SQL stands for Structured Query Language
- Used for modifying and querying databases
- Found in many different platforms: Oracle, MS Access, SPSS, SQL Server, SAS
What can you do with SQL?
PROC SQL can do the work of:
- Data step
- Proc Print
- Data step merge
- Proc Freq, Proc Means, Proc Report, Proc Tabulate, etc.
- Proc Sort
Basic SQL Syntax
Basic SQL Query
Proc SQL;
  Select Title, Author, ISBN
  From Books;
Quit;
Title                        Author     ISBN
The Little SAS Book          Delwiche   …
SAS Survival Handbook        Wiseman    …
SAS for Dummies              McDaniel   …
Learning SAS by Example      Cody       …
Output Delivery System       Haworth    …X
SAS Functions by Example     Cody       …
Annotate: Simply the Basics  Carpenter  …
SAS Programming Shortcuts    Aster      …
Survival Analysis Using SAS  Allison    …X
Longitudinal Data and SAS    Cody       …
SAS Macro Programming        Burlew     …
Where Clause
Proc SQL;
  Select Title, Author, ISBN
  From Books
  Where Author = 'Cody';
Quit;
Title                      Author  ISBN
Learning SAS by Example    Cody    …
SAS Functions by Example   Cody    …
Longitudinal Data and SAS  Cody    …
Basic SQL Query
Proc SQL;
  Select Title, Author, ISBN
  From Books
  Where Author = 'Cody';
Quit;
- SELECT the variables (columns) you want to keep
- FROM identifies the table(s) the data comes from
- WHERE indicates which observations (rows) you want to select
Basic SQL Query
Proc SQL;
  Select Title, Author, ISBN
  From Books
  Where Author = 'Cody';
Quit;
- Separate variables with commas
- Only one semicolon at the end of the query
- End with QUIT; not RUN;
Select *
Use an asterisk to select all the available columns in a table.
Proc SQL;
  Select *
  From Books;
Quit;
Renaming variables
Use AS to rename a variable or name a newly created variable.
Proc SQL;
  Select ISBN as Book_ID
  From Books;
Quit;
Renaming variables
Use AS to rename a variable or name a newly created variable.
Proc SQL;
  Select ISBN as Book_ID,
         Price*0.8 as Sale_Price
  From Books;
Quit;
Labels and Formats
Labels and formats can be applied to variables in the SELECT clause.
Proc SQL;
  Select Customer label='Ordered by:',
         Order_Date format=mmddyy10.
  From Books;
Quit;
Functions
SAS functions and other similar manipulations can be used in PROC SQL.
Proc SQL;
  Select Customer label='Ordered by:',
         year(Order_Date) as Order_Yr
  From Books;
Quit;
Order By Clause
Proc SQL;
  Select Title, Author, ISBN
  From Books
  Where Author = 'Cody'
  Order by Title;
Quit;
Without Order By:
Title                      Author  ISBN
Learning SAS by Example    Cody    …
SAS Functions by Example   Cody    …
Longitudinal Data and SAS  Cody    …

With Order by Title:
Title                      Author  ISBN
Learning SAS by Example    Cody    …
Longitudinal Data and SAS  Cody    …
SAS Functions by Example   Cody    …
Creating Tables
Creating Datasets
Proc SQL;
  Create Table Newdata as
  Select Title, Author, ISBN
  From Books
  Where Author = 'Cody';
Quit;
Referencing Libraries
Libname in 'C:\SAS\Chapmantl\';
Proc SQL;
  Create Table Newdata as
  Select Title, Author, ISBN
  From in.Books
  Where Author = 'Cody';
Quit;
Referencing Libraries
Libname in 'C:\SAS\Chapmantl\';
Libname oralib oracle user = sas_user password = my_password
        path = "dev.cbs.or.us" connection = unique;
Joining Tables
Books
Title                        Author     ISBN
The Little SAS Book          Delwiche   …
SAS Survival Handbook        Wiseman    …
SAS for Dummies              McDaniel   …
Learning SAS by Example      Cody       …
Output Delivery System       Haworth    …X
SAS Functions by Example     Cody       …
Annotate: Simply the Basics  Carpenter  …
SAS Programming Shortcuts    Aster      …
Survival Analysis Using SAS  Allison    …X
Longitudinal Data and SAS    Cody       …
SAS Macro Programming        Burlew     …

Orders
ISBN  Order Date
…     …/23/2009
…X    07/07/2007
…     …/09/2008
…     …/24/2008
…X    11/05/2007
…     …/30/2008
…     …/25/2009
Joining Tables
Proc SQL;
  Create Table Newdata as
  Select books.Title, books.Author, books.ISBN, orders.order_date
  From books join orders
    on books.ISBN = orders.ISBN;
Quit;
Title                        Author    ISBN  Order Date
SAS for Dummies              McDaniel  …     …/25/2009
Learning SAS by Example      Cody      …     …/23/2009
Output Delivery System       Haworth   …X    07/07/2007
SAS Functions by Example     Cody      …     …/09/2008
SAS Programming Shortcuts    Aster     …     …/24/2008
Survival Analysis Using SAS  Allison   …X    11/05/2007
Longitudinal Data and SAS    Cody      …     …/30/2008
Joining Tables - Inner Joins
[Venn diagram: Orders and Books circles. The overlap is books that have been ordered; an inner join keeps only this overlap.]
Joining Tables
Proc SQL;
  Create Table Newdata as
  Select b.Title, b.Author, b.ISBN, o.order_date
  From books as b join orders as o
    on b.ISBN = o.ISBN;
Quit;
Books
Title                        Author     ISBN
The Little SAS Book          Delwiche   …
SAS Survival Handbook        Wiseman    …
SAS for Dummies              McDaniel   …
Learning SAS by Example      Cody       …
Output Delivery System       Haworth    …X
SAS Functions by Example     Cody       …
Annotate: Simply the Basics  Carpenter  …
SAS Programming Shortcuts    Aster      …
Survival Analysis Using SAS  Allison    …X
Longitudinal Data and SAS    Cody       …
SAS Macro Programming        Burlew     …

Orders (key column named BookNum instead of ISBN)
BookNum  Order Date
…        …/23/2009
…X       07/07/2007
…        …/09/2008
…        …/24/2008
…X       11/05/2007
…        …/30/2008
…        …/25/2009
Joining Tables
Proc Datasets library = work;
  modify Orders;
  rename BookNum = ISBN;
run; quit;
Proc Sort data = Orders; by ISBN; run;
Proc Sort data = Books; by ISBN; run;
Data NewDat;
  Merge Books Orders;
  by ISBN;
run;
Joining Tables
Proc SQL;
  Create Table Newdata as
  Select b.Title, b.Author, b.ISBN, o.order_date
  From books as b join orders as o
    on b.ISBN = o.BookNum;
Quit;
Books
Title                        Author     ISBN
The Little SAS Book          Delwiche   …
SAS Survival Handbook        Wiseman    …
SAS for Dummies              McDaniel   …
Learning SAS by Example      Cody       …
Output Delivery System       Haworth    …X
SAS Functions by Example     Cody       …
Annotate: Simply the Basics  Carpenter  …
SAS Programming Shortcuts    Aster      …
Survival Analysis Using SAS  Allison    …X
Longitudinal Data and SAS    Cody       …
SAS Macro Programming        Burlew     …

Orders (one book ordered twice)
ISBN  Order Date
…     …/23/2009
…X    07/07/2007
…     …/09/2008
…     …/24/2008
…X    11/05/2007
…     …/30/2008
…     …/25/2009
…     …/05/2009
Title                        Author    ISBN  Order Date
SAS for Dummies              McDaniel  …     …/25/2009
SAS for Dummies              McDaniel  …     …/05/2009
Learning SAS by Example      Cody      …     …/23/2009
Output Delivery System       Haworth   …X    07/07/2007
SAS Functions by Example     Cody      …     …/09/2008
SAS Programming Shortcuts    Aster     …     …/24/2008
Survival Analysis Using SAS  Allison   …X    11/05/2007
Longitudinal Data and SAS    Cody      …     …/30/2008
Joining Tables
Proc SQL;
  Create Table Newdata as
  Select b.Title, b.Author, b.ISBN, o.order_date
  From Books as b, Orders as o
  Where b.ISBN = o.ISBN;
Quit;
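Joining Tables – Left Joins
[Venn diagram: Orders and Books circles. The overlap is books that have been ordered; a left join keeps every row of the left table plus the matching rows of the right table.]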
Joining Tables – Left Joins
Proc SQL;
  Create Table Newdata as
  Select b.Title, b.Author, b.ISBN, o.order_date
  From books as b left join orders as o
    on b.ISBN = o.ISBN;
Quit;
Title                        Author     ISBN  Order Date
The Little SAS Book          Delwiche   …
SAS Survival Handbook        Wiseman    …
SAS for Dummies              McDaniel   …     …/25/2009
Learning SAS by Example      Cody       …     …/23/2009
Output Delivery System       Haworth    …X    07/07/2007
SAS Functions by Example     Cody       …     …/09/2008
Annotate: Simply the Basics  Carpenter  …
SAS Programming Shortcuts    Aster      …     …/24/2008
Survival Analysis Using SAS  Allison    …X    11/05/2007
Longitudinal Data and SAS    Cody       …     …/30/2008
SAS Macro Programming        Burlew     …
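Joining Tables – Right Joins
[Venn diagram: Orders and Books circles. The overlap is books that have been ordered; a right join keeps every row of the right table plus the matching rows of the left table.]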
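The deck shows code for the left join only. A right join can be sketched the same way; this is a minimal illustration that assumes the same Books and Orders tables used in the earlier join slides, with ISBN as the key in both. It selects o.ISBN rather than b.ISBN so that an order with no matching book still shows its ISBN.

```sas
/* Sketch only: assumes work.Books and work.Orders as in the earlier slides. */
Proc SQL;
  Create Table Newdata as
  Select b.Title, b.Author, o.ISBN, o.order_date
  From books as b right join orders as o   /* keep every row of Orders */
    on b.ISBN = o.ISBN;
Quit;
```

Title and Author are missing for any order whose ISBN is not found in Books.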
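Joining Tables – Full Joins
[Venn diagram: Orders and Books circles. The overlap is books that have been ordered; a full join keeps every row of both tables, matched where possible.]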
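A full join could be written along the same lines. The sketch below again assumes the Books and Orders tables from the earlier slides, and it assumes ISBN has the same type in both tables. COALESCE is used so the ISBN column is filled in no matter which table supplied the row; that choice is an illustration, not something taken from the original slides.

```sas
/* Sketch only: assumes work.Books and work.Orders as in the earlier slides. */
Proc SQL;
  Create Table Newdata as
  Select coalesce(b.ISBN, o.ISBN) as ISBN,  /* take the key from whichever side has it */
         b.Title, b.Author, o.order_date
  From books as b full join orders as o     /* keep every row of both tables */
    on b.ISBN = o.ISBN;
Quit;
```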
Group Functions
- One of the most basic ways to summarize data
- Group functions return one result per group of rows processed
- Examples include basic statistics like SUM, MEAN, COUNT, MIN/MAX
- Use the GROUP BY clause to indicate how to group the records
Tests Table
Date      Student_ID  Subject  Session  Score
01/25/10  A12         Math     A        96
01/25/10  B34         Math     A        92
01/25/10  C56         Math     A        68
01/25/10  D75         Math     A        79
03/26/10  B34         Science  A        96
03/26/10  C56         Science  A        82
04/23/10  A12         Reading  A        84
04/23/10  B34         Reading  A        94
04/23/10  C56         Reading  A        78
04/23/10  D75         Reading  A        81
05/22/10  A12         Math     B        92
05/22/10  B34         Math     B        94
05/22/10  C56         Math     B        72
05/22/10  D75         Math     B        81
04/15/10  B34         Science  B        94
04/15/10  C56         Science  B        84
06/01/10  A12         Reading  B        88
06/01/10  B34         Reading  B        96
06/01/10  C56         Reading  B        82
06/01/10  D75         Reading  B        79
Group Functions
Proc SQL;
  Select Student_ID, mean(Score) as Avg_Score
  From Tests
  Group By Student_ID;
Quit;
Group Functions
Student_ID  Avg_Score
A12         90
B34         94.33
C56         77.67
D75         80
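The other group functions mentioned earlier (SUM, COUNT, MIN/MAX) follow the same pattern as MEAN. As a minimal sketch, assuming the Tests table shown above, the query below summarizes scores by Subject instead of by student; the output column names (N_Tests, Low_Score, High_Score, Total_Score) are made up for illustration.

```sas
/* Sketch only: assumes the Tests table from the earlier slide. */
Proc SQL;
  Select Subject,
         count(Score) as N_Tests,      /* number of non-missing scores */
         min(Score)   as Low_Score,
         max(Score)   as High_Score,
         sum(Score)   as Total_Score
  From Tests
  Group By Subject;                    /* one output row per subject */
Quit;
```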
- Can group by more than one variable
- Every field in the SELECT statement that is not being summarized must also be listed in the GROUP BY clause, even if the field does not result in additional groupings
Group Functions
Proc SQL;
  Select Student_ID, Session, mean(Score) as Avg_Score
  From Tests
  Group By Student_ID, Session;
Quit;
Group Functions
Student_ID  Session  Avg_Score
A12         A        90
A12         B        90
B34         A        94
B34         B        94.67
C56         A        76
C56         B        79.33
D75         A        80
D75         B        80
Having Clause
Proc SQL;
  Select Student_ID, Session, mean(Score) as Avg_Score
  From Tests
  Group By Student_ID, Session
  Having Avg_Score lt 80;
Quit;
Having Clause
Student_ID  Session  Avg_Score
C56         A        76
C56         B        79.33
CASE / WHEN Logic
Tests Table (same data as in the Group Functions section)
Goal: convert "Scores" to letter grades
CASE/WHEN
data NewGrade;
  set Tests;
  length Test_Grade $ 4;
  if 70 le Score le 79 then Test_Grade = 'C';
  else if 80 le Score le 89 then Test_Grade = 'B';
  else if 90 le Score le 100 then Test_Grade = 'A';
  else Test_Grade = 'Fail';
run;
CASE/WHEN
Case
  when 70 le Score le 79 then 'C'
  when 80 le Score le 89 then 'B'
  when 90 le Score le 100 then 'A'
  else 'Fail'
end as Test_Grade
- WHEN works like "ELSE IF"
- Conclude with an "END" expression
CASE/WHEN
Proc SQL;
  Select Date, Student_ID, Subject, Session, Score,
         Case
           when 70 le Score le 79 then 'C'
           when 80 le Score le 89 then 'B'
           when 90 le Score le 100 then 'A'
           else 'Fail'
         end as Test_Grade
  From Tests;
Quit;
Results
Date      Student_ID  Subject  Session  Score  Test_Grade
01/25/10  A12         Math     A        96     A
01/25/10  B34         Math     A        92     A
01/25/10  C56         Math     A        68     Fail
01/25/10  D75         Math     A        79     C
03/26/10  B34         Science  A        96     A
03/26/10  C56         Science  A        82     B
04/23/10  A12         Reading  A        84     B
04/23/10  B34         Reading  A        94     A
04/23/10  C56         Reading  A        78     C
04/23/10  D75         Reading  A        81     B
05/22/10  A12         Math     B        92     A
05/22/10  B34         Math     B        94     A
05/22/10  C56         Math     B        72     C
05/22/10  D75         Math     B        81     B
04/15/10  B34         Science  B        94     A
04/15/10  C56         Science  B        84     B
06/01/10  A12         Reading  B        88     B
06/01/10  B34         Reading  B        96     A
06/01/10  C56         Reading  B        82     B
06/01/10  D75         Reading  B        79     C
Additional Reading
Paper: Summary
- Introduction to Proc SQL (Chapman): Introduction to the basic syntax of Proc SQL
- An Introduction to SQL in SAS (Lund): Another introductory paper on Proc SQL
- A Hands-on Tour Inside the World of Proc SQL (Lafler): Yet another introductory paper on Proc SQL
- Kirk’s Ten Best Proc SQL Tips and Techniques: Neat features in SQL
- Outer Joins and WHERE Clauses: Discusses the proper way to use a WHERE clause with an outer join (left, right, or full) to prevent unintentionally excluding desired rows
- The Many Uses of SQL Subqueries: Two-part series that introduces subqueries (a nested query within another query) in SQL
- SAS with Oracle; Writing Efficient and Accurate SQL: Discusses how to write efficient and accurate SQL that will properly communicate with a non-SAS database management system (DBMS)
- What’s New in SAS/ACCESS: Performance boosters to speed up your queries
For next week…
Read chapters 8, 13, & 24