Published by Brenda Porter. Modified over 9 years ago.
1
PROC SQL – Select Codes To Master For Power Programming
Codes and Examples from SAS.com
Nethra Sambamoorthi, PhD
Northwestern University, Master of Science in Predictive Analytics Program
2
Data Processing Terminologies Across Data Sciences…
3
Why PROC SQL or What Can It Do For Analysts?
Generate reports
Generate summary statistics
Retrieve data from tables or views
Combine data from tables or views
Create tables, views, and indexes
Update the data values in PROC SQL tables
Update and retrieve data from database management system (DBMS) tables
Modify a PROC SQL table by adding, modifying, or dropping columns
PROC SQL can be used in an interactive SAS session or within batch programs, and it can include global statements, such as TITLE and OPTIONS.
4
An Example of Extracting, Summarizing, and Printing Using Base SAS Procedures (Without PROC SQL)
title 'Large Countries Grouped by Continent';
/* Extracting and summarizing */
proc summary data=sql.countries;
   where Population > 1000000;
   class Continent;
   var Population;
   output out=SumPop sum=TotPop;
run;
/* Sorting to arrange the output */
proc sort data=SumPop;
   by TotPop;
run;
/* Printing */
proc print data=SumPop noobs;
   var Continent TotPop;
   format TotPop comma15.;
   where _type_=1;
run;
5
Creating The Same Using PROC SQL
proc sql;
   title 'Population of Large Countries Grouped by Continent';
   select Continent, sum(Population) as TotPop format=comma15.
      from sql.countries
      where Population gt 1000000
      group by Continent
      order by TotPop;
quit;
6
Countries Table
7
WorldCityCoords Table
8
USCityCoords Table
9
UnitedStates Table
10
PostalCodes Table
11
Worldtemps Table
12
Oilprod Table
13
OILRSRVS Table
14
CONTINENTS Table
15
FEATURES Table
16
SELECT statement
17
Three Important Aspects – Describe, Print, Quit
/* Helps understand the structure of the table */
proc sql;
   describe table sql.unitedstates;
quit;
18
SELECT Means Printing Is Included Unless Output Is Suppressed (for example, with the NOPRINT option)
SELECT *                   /* all columns */
SELECT city, state         /* specific columns */
SELECT distinct continent  /* specific columns, without duplicate rows */
So it is possible to run this
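The slide's own query was not captured in this transcript. As a minimal sketch, the three SELECT forms could be run as below. It assumes the `sql` libref is already assigned and that `sql.uscitycoords` has `City` and `State` columns and `sql.countries` has a `Continent` column; these column names are assumptions based on the table names used elsewhere in the deck.

```sas
/* Assumes: libref sql is assigned; column names are assumptions. */
proc sql outobs=10;               /* OUTOBS= limits the rows each query returns */
   title 'All Columns';
   select *
      from sql.uscitycoords;

   title 'Specific Columns';
   select City, State
      from sql.uscitycoords;

   title 'Distinct Values of One Column';
   select distinct Continent
      from sql.countries;
quit;
title;                            /* clear the title */
```

Each SELECT prints its result as soon as it runs; no separate PROC PRINT step is needed.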
19
The output is…
20
Suppress column headings…
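The code for this slide is missing from the transcript. One way to suppress headings, as I recall it from the SAS SQL documentation, is to give a column a label of `'#'`; treat that detail as an assumption to verify. The column names `Name` and `Area` in `sql.unitedstates` are also assumed.

```sas
/* Assumes: libref sql is assigned; Name and Area are assumed column names. */
proc sql outobs=12;
   title 'Areas of U.S. States in Square Miles';
   select Name label='#',          /* label='#' is intended to blank the heading */
          'area is',               /* constant text added to every row */
          Area label='#' format=comma10.,
          'square miles'
      from sql.unitedstates;
quit;
title;
```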
21
Calculated columns and alias name…
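A sketch of calculated columns and aliases, since the slide's code did not survive. It assumes `sql.worldtemps` has `City`, `AvgHigh`, and `AvgLow` columns holding Fahrenheit temperatures (assumed names). The `CALCULATED` keyword lets a later expression reuse an alias defined earlier in the same SELECT.

```sas
/* Assumes: libref sql is assigned; column names are assumptions. */
proc sql outobs=12;
   title 'Range of High and Low Temperatures in Celsius';
   select City,
          (AvgHigh - 32) * 5/9 as HighC format=5.1,   /* alias HighC */
          (AvgLow  - 32) * 5/9 as LowC  format=5.1,   /* alias LowC  */
          (calculated HighC - calculated LowC) as Range format=4.1
      from sql.worldtemps;
quit;
title;
```

Without `CALCULATED`, PROC SQL would look for a column named `HighC` in the source table and fail to find it.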
22
Retrieving Data From Multiple Tables
Retrieving data from multiple tables means we are joining tables.
If there is no JOIN keyword, the query is either (1) a Cartesian product of the rows [no subset condition] or (2) an inner join [a subset condition in the WHERE clause].
Alias names can be used for tables too; they simplify references to specific columns of a table.
23
SELECT … FROM table1, table2; A Cartesian Product
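A self-contained sketch of a Cartesian product. The tables `one` and `two` and their values are made up for illustration; they are not the slide's data.

```sas
proc sql;
   /* Hypothetical demo tables in WORK */
   create table one (X num, Y num);
   insert into one values(1, 2) values(2, 3);

   create table two (X num, Z num);
   insert into two values(2, 5) values(3, 6) values(4, 9);

   title 'Cartesian Product of ONE and TWO';
   select *
      from one, two;    /* no WHERE clause: 2 rows x 3 rows = 6 rows */
quit;
title;
```

Every row of `one` is paired with every row of `two`, which is why an unintended Cartesian product on large tables can be very expensive.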
25
Order the output from INNER JOIN
INNER JOIN can be used explicitly
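A sketch of both ideas: the same inner join written implicitly (tables listed in FROM, match in WHERE) and explicitly (INNER JOIN ... ON), each with ORDER BY. It assumes `sql.oilprod` has `Country` and `BarrelsPerDay` and `sql.oilrsrvs` has `Country` and `Barrels`; those column names are assumptions.

```sas
/* Assumes: libref sql is assigned; column names are assumptions. */
proc sql outobs=6;
   title 'Oil Production and Reserves (implicit inner join)';
   select p.Country, p.BarrelsPerDay, r.Barrels
      from sql.oilprod as p, sql.oilrsrvs as r
      where p.Country = r.Country
      order by BarrelsPerDay desc;

   title 'Oil Production and Reserves (explicit INNER JOIN)';
   select p.Country, p.BarrelsPerDay, r.Barrels
      from sql.oilprod as p inner join sql.oilrsrvs as r
         on p.Country = r.Country
      order by BarrelsPerDay desc;
quit;
title;
```

The table aliases `p` and `r` keep the column references short and unambiguous.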
26
INNER JOIN with comparison values on another column…
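A join condition does not have to be an equality. The sketch below, which stands in for the slide's missing code, lists U.S. cities that lie south of Cairo by comparing latitudes. It assumes both coordinate tables have `City` and `Latitude` columns and that `sql.uscitycoords` also has `State`.

```sas
/* Assumes: libref sql is assigned; column names are assumptions. */
proc sql;
   title 'US Cities South of Cairo, Egypt';
   select us.City, us.State, us.Latitude,
          world.City as WorldCity, world.Latitude as WorldLatitude
      from sql.worldcitycoords as world, sql.uscitycoords as us
      where world.City = 'Cairo' and           /* pick the reference row    */
            us.Latitude lt world.Latitude      /* comparison, not equality  */
      order by us.Latitude;
quit;
title;
```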
27
Effect of Null Values on JOINS
28
NOT MISSING option
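A self-contained sketch covering both the null-value effect and the NOT MISSING fix. In PROC SQL a missing value compares equal to another missing value, so rows with missing keys join to each other unless they are excluded. The tables and values are made up for illustration.

```sas
proc sql;
   /* Hypothetical demo tables with a missing key in each */
   create table one (A num, B char(8));
   insert into one values(1, 'one') values(., 'missing1') values(2, 'two');

   create table two (A num, C char(8));
   insert into two values(1, 'uno') values(., 'missing2') values(3, 'tres');

   title 'Missing Keys Match Each Other';
   select one.A, B, C
      from one, two
      where one.A = two.A;           /* returns A=1 and the A=. pair */

   title 'Missing Keys Excluded';
   select one.A, B, C
      from one, two
      where one.A = two.A and
            one.A is not missing;    /* returns only A=1 */
quit;
title;
```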
29
Multicolumn JOINS
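When one column is not enough to identify a match, the join condition can combine several columns. This sketch matches capitals to cities on both city and country; the column names (`Capital`, `Name`, `City`, `Country`, `Latitude`, `Longitude`) are assumptions about the sample tables.

```sas
/* Assumes: libref sql is assigned; column names are assumptions. */
proc sql outobs=10;
   title 'Coordinates of Capital Cities';
   select c.Capital, c.Name as CountryName, w.Latitude, w.Longitude
      from sql.countries as c, sql.worldcitycoords as w
      where c.Capital = w.City and      /* city names can repeat ...      */
            c.Name = w.Country;         /* ... so also match the country  */
quit;
title;
```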
30
Columns are directly comparable between two tables…
Capitals FROM sql.unitedstates
City FROM sql.uscitycoords
Postalcodes FROM sql.postalcodes
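A sketch of a three-table join along these lines: each pair of tables shares a directly comparable column, so the three can be chained in one WHERE clause. It assumes `sql.unitedstates` has `Name` and `Capital`, `sql.postalcodes` has `Name` and `Code`, and `sql.uscitycoords` has `City`, `State` (a postal code), `Latitude`, and `Longitude`.

```sas
/* Assumes: libref sql is assigned; column names are assumptions. */
proc sql outobs=10;
   title 'Coordinates of State Capitals';
   select us.Capital, us.Name as StateName, pc.Code,
          c.Latitude, c.Longitude
      from sql.unitedstates as us,
           sql.postalcodes  as pc,
           sql.uscitycoords as c
      where us.Capital = c.City and     /* capital name to city name       */
            us.Name    = pc.Name and    /* state name to postal-code table */
            pc.Code    = c.State;       /* postal code to city's state     */
quit;
title;
```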
31
Is it possible to do a SELF JOIN?
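Yes: a table can be joined to itself by listing it twice under two different aliases. The sketch below finds pairs of cities where one city's average high equals another's average low. It assumes `sql.worldtemps` has `City`, `Country`, `AvgHigh`, and `AvgLow` columns.

```sas
/* Assumes: libref sql is assigned; column names are assumptions. */
proc sql outobs=10;
   title "Cities' High Temps = Cities' Low Temps";
   select High.City as HighCity, High.Country as HighCountry, High.AvgHigh,
          Low.City  as LowCity,  Low.Country  as LowCountry,  Low.AvgLow
      from sql.worldtemps as High,      /* first copy of the table  */
           sql.worldtemps as Low        /* second copy of the table */
      where High.AvgHigh = Low.AvgLow and
            High.City ne Low.City and
            High.Country ne Low.Country;
quit;
title;
```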
32
Two Types of OUTER JOIN – LEFT JOIN and RIGHT JOIN
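A sketch of both directions, standing in for the slide's missing code. A left join keeps every row of the table named first; a right join keeps every row of the table named second. Column names are assumptions about the sample tables.

```sas
/* Assumes: libref sql is assigned; column names are assumptions. */
proc sql outobs=10;
   title 'All Capitals, With Coordinates Where Available (LEFT JOIN)';
   select a.Capital, a.Name as CountryName, b.Latitude, b.Longitude
      from sql.countries as a left join sql.worldcitycoords as b
         on a.Capital = b.City and a.Name = b.Country
      order by Capital;

   title 'All Listed Cities, With Population Where Available (RIGHT JOIN)';
   select b.City, b.Country, a.Population
      from sql.countries as a right join sql.worldcitycoords as b
         on a.Capital = b.City and a.Name = b.Country
      order by City;
quit;
title;
```

Rows with no match on the other side still appear, with missing values in the columns that come from the unmatched table.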
33
FULL JOIN …
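A self-contained sketch of a full outer join, which keeps unmatched rows from both tables. The tables and values are made up for illustration.

```sas
proc sql;
   /* Hypothetical demo tables in WORK */
   create table one (X num, Y num);
   insert into one values(1, 2) values(2, 3);

   create table two (X num, Z num);
   insert into two values(2, 5) values(3, 6) values(4, 9);

   title 'Full Outer Join of ONE and TWO';
   select one.X as OneX, Y, two.X as TwoX, Z
      from one full join two
         on one.X = two.X;    /* X=2 matches; X=1, 3, 4 appear unmatched */
quit;
title;
```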
34
SPECIALTY JOINS
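The slide body is missing, so the following is only a guess at what "specialty joins" covered: PROC SQL accepts the keywords CROSS JOIN and UNION JOIN in addition to NATURAL JOIN. A self-contained sketch with made-up tables:

```sas
proc sql;
   /* Hypothetical demo tables in WORK */
   create table one (X num, Y num);
   insert into one values(1, 2) values(2, 3);

   create table two (W num, Z num);
   insert into two values(2, 5) values(3, 6) values(4, 9);

   title 'CROSS JOIN: Cartesian product, stated explicitly';
   select *
      from one cross join two;    /* 2 x 3 = 6 rows */

   title 'UNION JOIN: rows of both tables, no matching';
   select *
      from one union join two;    /* 2 + 3 = 5 rows, other side missing */
quit;
title;
```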
35
NATURAL is applicable for both LEFT and RIGHT JOIN. The purpose is to reduce verbosity when matching on multiple common columns… It gives the same output as the equivalent ON clause; nonmatching rows have missing values.
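A self-contained sketch comparing a natural left join with the equivalent explicit left join. The tables `a` and `b`, which share the columns `ID` and `Yr`, are made up for illustration.

```sas
proc sql;
   /* Hypothetical demo tables sharing two columns: ID and Yr */
   create table a (ID num, Yr num, Sales num);
   insert into a values(1, 2020, 100) values(1, 2021, 120) values(2, 2020, 80);

   create table b (ID num, Yr num, Cost num);
   insert into b values(1, 2020, 60) values(2, 2020, 50) values(3, 2020, 40);

   title 'NATURAL LEFT JOIN: matches on all common columns';
   select *
      from a natural left join b;

   title 'Equivalent LEFT JOIN with an explicit ON clause';
   select a.ID, a.Yr, Sales, Cost
      from a left join b
         on a.ID = b.ID and a.Yr = b.Yr;
quit;
title;
```

The row for `ID=1, Yr=2021` has no partner in `b`, so its `Cost` is missing in both results.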
36
Use COALESCE to combine multiple columns to create new matching variables
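A self-contained sketch of COALESCE in a full join: it returns the first nonmissing argument, so the two key columns collapse into one. The tables and values are made up for illustration.

```sas
proc sql;
   /* Hypothetical demo tables in WORK */
   create table one (X num, Y num);
   insert into one values(1, 2) values(2, 3);

   create table two (X num, Z num);
   insert into two values(2, 5) values(3, 6) values(4, 9);

   title 'Full Join With a Single Combined Key';
   select coalesce(one.X, two.X) as X,   /* first nonmissing of the two keys */
          Y, Z
      from one full join two
         on one.X = two.X
      order by X;
quit;
title;
```

This gives a result close to what a DATA step `MERGE ... BY X` would produce.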
37
Using SUB QUERY or NESTED QUERY – SINGLE VALUE =
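A sketch of a single-value subquery: the inner query returns exactly one value, which the outer query then uses with a comparison operator. It assumes `sql.unitedstates` and `sql.countries` both have `Name` and `Population` columns, and that `sql.countries` contains one row for Belgium.

```sas
/* Assumes: libref sql is assigned; column names and the Belgium row are assumptions. */
proc sql;
   title 'U.S. States With Population Greater Than Belgium';
   select Name, Population format=comma12.
      from sql.unitedstates
      where Population gt
            (select Population            /* must return a single value */
                from sql.countries
                where Name = 'Belgium');
quit;
title;
```

If the inner query returned more than one row, a plain comparison operator such as `gt` or `=` would raise an error.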
38
Correlated SUBQUERY = NESTED QUERY
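A sketch of a correlated subquery: the inner query refers to a column of the outer query (`o.Country`), so it is re-evaluated for each outer row. Column names are assumptions about the sample tables.

```sas
/* Assumes: libref sql is assigned; column names are assumptions. */
proc sql;
   title 'Oil Reserves of Countries in Africa';
   select *
      from sql.oilrsrvs as o
      where 'Africa' =
            (select Continent
                from sql.countries as c
                where c.Name = o.Country);   /* o.Country comes from the outer query */
quit;
title;
```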
39
Where “EXISTS” option
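A sketch of the EXISTS condition, which is true when the subquery returns at least one row. It is written here as an alternative to the correlated comparison; column names are assumptions about the sample tables.

```sas
/* Assumes: libref sql is assigned; column names are assumptions. */
proc sql;
   title 'Oil Reserves of Countries in Africa (EXISTS)';
   select *
      from sql.oilrsrvs as o
      where exists
            (select *
                from sql.countries as c
                where o.Country = c.Name and
                      c.Continent = 'Africa');
quit;
title;
```

Switching to `not exists` would return the reserves of countries that are not matched to an African country.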
40
Multiple NESTED QUERY
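Subqueries can be nested more than one level deep; PROC SQL evaluates the innermost query first. A sketch under the same assumed column names:

```sas
/* Assumes: libref sql is assigned; column names are assumptions. */
proc sql;
   title 'Coordinates of African Cities in Oil-Producing Countries';
   select *
      from sql.worldcitycoords
      where Country in
            (select Country                       /* level 2: producers ...  */
                from sql.oilprod as o
                where o.Country in
                      (select Name                /* level 1: ... in Africa  */
                          from sql.countries as c
                          where c.Continent = 'Africa'));
quit;
title;
```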
41
Combine a JOIN with a SUBQUERY
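The slide's query is not in the transcript, and it may well have been more elaborate than this. As a minimal sketch, a join and a subquery can appear in the same WHERE clause: the join pairs production with reserves, and the subquery restricts the countries. Column names and the 'Asia' value are assumptions.

```sas
/* Assumes: libref sql is assigned; column names and 'Asia' are assumptions. */
proc sql;
   title 'Production and Reserves for Asian Countries';
   select p.Country, p.BarrelsPerDay, r.Barrels
      from sql.oilprod as p, sql.oilrsrvs as r
      where p.Country = r.Country and              /* join condition     */
            p.Country in
            (select Name                           /* subquery condition */
                from sql.countries
                where Continent = 'Asia');
quit;
title;
```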
42
QUERY strategies…
43
UNION is ROWWISE (like PROC APPEND), while JOIN is COLUMNWISE (like MERGE BY)
UNION removes duplicate rows; keep the dups with the keyword ALL
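A self-contained sketch contrasting UNION with UNION ALL. The tables `a` and `b` are made up for illustration. Set operators line columns up by position, so `Y` and `Z` are stacked in the same output column.

```sas
proc sql;
   /* Hypothetical demo tables; A contains a duplicate row */
   create table a (X num, Y char(5));
   insert into a values(1, 'one') values(2, 'two')
                 values(2, 'two') values(3, 'three');

   create table b (X num, Z char(5));
   insert into b values(1, 'one') values(2, 'two') values(4, 'four');

   title 'UNION: duplicates removed (4 rows)';
   select * from a
   union
   select * from b;

   title 'UNION ALL: duplicates kept (7 rows)';
   select * from a
   union all
   select * from b;
quit;
title;
```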
44
OUTER UNION; KEEP ROWS ONLY FROM THE FIRST TABLE – Keyword EXCEPT
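A self-contained sketch of the two operators named on this slide. EXCEPT keeps rows of the first query that do not appear in the second; OUTER UNION concatenates the results without overlaying columns. The tables are made up for illustration.

```sas
proc sql;
   /* Hypothetical demo tables */
   create table a (X num, Y char(5));
   insert into a values(1, 'one') values(2, 'two')
                 values(2, 'two') values(3, 'three');

   create table b (X num, Z char(5));
   insert into b values(1, 'one') values(2, 'two') values(4, 'four');

   title 'EXCEPT: rows in A that are not in B';
   select * from a
   except
   select * from b;          /* only the row (3, 'three') remains */

   title 'OUTER UNION: rows stacked, columns side by side';
   select * from a
   outer union
   select * from b;          /* four columns: X, Y, X, Z */
quit;
title;
```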
47
To overlay data better: keyword CORRESPONDING
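A self-contained sketch of CORRESPONDING (abbreviated CORR), which overlays columns that have the same name in both tables. The tables are made up for illustration.

```sas
proc sql;
   /* Hypothetical demo tables sharing only the column X */
   create table a (X num, Y char(5));
   insert into a values(1, 'one') values(2, 'two')
                 values(2, 'two') values(3, 'three');

   create table b (X num, Z char(5));
   insert into b values(1, 'one') values(2, 'two') values(4, 'four');

   title 'OUTER UNION CORR: same-named columns overlaid';
   select * from a
   outer union corr
   select * from b;          /* three columns: X, Y, Z; 7 rows */
quit;
title;
```

This is the set operation closest to concatenating data sets with a DATA step `SET a b;` statement.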