PROC SQL: Tips and Translations for Data Step Users By: Gail Jorgensen Susan Marcella.

Presentation transcript:

AGENDA
- SQL Syntax Review
- Joins Translated
- SQL Strengths & Uses

Syntax

    proc sql;
      create table/view newdsname as
        select var1, var2, … varN
        from dsname
        where condition;
    quit;
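As a concrete illustration of the template above, the following sketch fills it in using SASHELP.CLASS, a sample table that ships with SAS. The output names work.teens and work.teens_ds and the age cutoff are hypothetical, chosen only for this example. A DATA step version is included for comparison.

```sas
proc sql;
  /* CREATE TABLE stores the query result as a dataset (here in WORK) */
  create table work.teens as
    select name, sex, age
    from sashelp.class
    where age >= 13
    order by age, name;
quit;

/* Roughly equivalent DATA step: subset first, then sort separately */
data work.teens_ds;
  set sashelp.class(keep=name sex age);
  where age >= 13;
run;

proc sort data=work.teens_ds;
  by age name;
run;
```

PROC SQL handles the subsetting, column selection, and ordering in one step, whereas the DATA step approach needs a separate PROC SORT.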

JOIN vs MERGE

Types of JOINs
- Inner Join: selects only matching records (same as: if ina and inb)
- Outer Join: also selects some non-matching records
  - Left Join: selects all records from first table, only matching records from second (same as: if ina)
  - Right Join: selects all records from second table, only matching records from first (same as: if inb)
  - Full Join: selects all records from both tables (same as having no IF statement)

Inner Join

PROC SQL:

    proc sql;
      create table ds_c as
        select ds_a1.*, ds_b.*
        from ds_a1, ds_b
        where ds_a1.idfld = ds_b.idno;
    quit;

DATA step:

    data c;
      merge ds_a1(in=ina)
            ds_b(in=inb rename=(idno=idfld));
      by idfld;
      if ina and inb;
    run;

Dataset DS_A1

    idfld  col2
    1  M
    1  N
    2  O
    3  P
    4  Q
    5  R

Dataset DS_B

    idno  col3  col4
    1  X  C
    2  X  D
    2  Y  F
    4  Z
    5  Z
    7  Z

Result of PROC SQL (ds_c)

    idfld  col2  idno  col3  col4
    1  M  1  X  C
    1  N  1  X  C
    2  O  2  X  D
    2  O  2  Y  F
    4  Q  4  Z
    5  R  5  Z

Result of DATA step (c)

    idfld  col2  col3  col4
    1  M  X  C
    1  N  X  C
    2  O  X  D
    2  O  Y  F
    4  Q  Z
    5  R  Z

Left Join

PROC SQL:

    proc sql;
      create table sql_left as
        select a.*, b.*
        from inf_a as a
             left join inf_b as b
             on a.idfld = b.idfld;
    quit;

DATA step:

    data ds_left;
      merge inf_a(in=ina)
            inf_b(in=inb);
      by idfld;
      if ina;
    run;

Dataset inf_a

    idfld  col2  col5
    1  M  A
    1  N  D
    2  O
    3  P  J
    4  Q  K
    5  R  N

Dataset inf_b

    idfld  col3  col4
    1  X  C
    2  X  D
    2  Y  F
    4  Z
    5  Z
    7  Z

Dataset sql_left

    idfld  col2  col5  col3  col4
    1  N  D  X  C
    1  M  A  X  C
    2  O  Y  F
    2  O  X  D
    3  P  J
    4  Q  K  Z
    5  R  N  Z

Dataset ds_left

    idfld  col2  col5  col3  col4
    1  N  D  X  C
    1  M  A  X  C
    2  O  Y  F
    2  O  X  D
    3  P  J
    4  Q  K  Z
    5  R  N  Z

Right Join

PROC SQL:

    proc sql;
      create table sql_right as
        select a.*, b.*
        from inf_a as a
             right join inf_b as b
             on a.idfld = b.idfld;
    quit;

DATA step:

    data ds_right;
      merge inf_a(in=ina)
            inf_b(in=inb);
      by idfld;
      if inb;
    run;

Dataset sql_right

    idfld  col2  col5  col3  col4
    1  N  D  X  C
    1  M  A  X  C
    2  O  Y  F
    2  O  X  D
    4  Q  K  Z
    5  R  N  Z
    .  Z

Dataset ds_right

    idfld  col2  col5  col3  col4
    1  M  A  X  C
    1  N  D  X  C
    2  O  X  D
    2  O  Y  F
    4  Q  K  Z
    5  R  N  Z
    7  Z

Full Join

CList07

    Obs  name  recd  sent
    1  Amanda  yes  no
    2  Gabi  yes
    3  Jan  yes
    4  Jim  no  yes
    5  Pam  no

CList08

    Obs  name  recd  sent
    1  Alison  yes
    2  Jan  yes
    3  Pam  no
    4  Tom  yes

PROC SQL:

    proc sql;
      create table sql_clist as
        select c7.name,
               c7.recd as recd07,
               c8.recd as recd08,
               c7.sent as sent07,
               c8.sent as sent08
        from clist07 as c7
             full join clist08 as c8
             on c7.name = c8.name;
    quit;

DATA step:

    proc sort data=clist07; by name; run;
    proc sort data=clist08; by name; run;

    data data_clist;
      merge clist07
            clist08(rename=(recd=recd08 sent=sent08));
      by name;
    run;

Full Join (Cont'd)

Sql_CList

    Obs  Name  recd07  recd08  sent07  sent08
    1  yes
    2  Amanda  yes  no
    3  Gabi  yes
    4  Jan  yes
    5  Jim  no  yes
    6  Pam  no
    7  yes

Data_CList

    Obs  name  recd  sent  recd08  sent08
    1  Alison  yes
    2  Amanda  yes  no
    3  Gabi  yes
    4  Jan  yes
    5  Jim  no  yes
    6  Pam  no
    7  Tom  yes

Handling Duplicate Variable Names

- To always select the variable from one dataset:
  - Drop unwanted version of variable (PROC SQL permits all SAS dataset options)
  - Select variable from specific table
- To keep variable from both tables:
  - Rename the variable in one dataset
- To select variable based on value:
  - Use CASE statement
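The first two options can be sketched with dataset options on the table references, reusing the clist07 and clist08 tables from the Full Join slide. This is only an illustration: the output names drop_dup and keep_both are hypothetical, and a left join is used purely to keep the example short.

```sas
proc sql;
  /* Option 1: drop the unwanted copy of SENT from clist08
     with a DROP= dataset option on the table reference */
  create table drop_dup as
    select c7.*, c8.recd as recd08
    from clist07 as c7
         left join clist08(drop=sent) as c8
         on c7.name = c8.name;

  /* Option 2: keep both copies by renaming one side
     with a RENAME= dataset option before the join */
  create table keep_both as
    select c7.*, c8.recd08, c8.sent08
    from clist07 as c7
         left join clist08(rename=(recd=recd08 sent=sent08)) as c8
         on c7.name = c8.name;
quit;
```

Because the RENAME= option is applied as the table is read, the new names (recd08, sent08) are the ones to reference in the SELECT clause.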

CASE Statement

PROC SQL:

    proc sql;
      create table NewCList as
        select case
                 when missing(c7.name) then c8.name
                 else c7.name
               end as name,
               c7.recd as recd07,
               c8.recd as recd08,
               c7.sent as sent07,
               c8.sent as sent08
        from clist07 as c7
             full join clist08 as c8
             on c7.name = c8.name;
    quit;

DATA step:

    proc sort data=clist07; by name; run;
    proc sort data=clist08; by name; run;

    data data_clist;
      merge clist07
            clist08(rename=(recd=recd08 sent=sent08));
      by name;
    run;

CASE Statement - Results

    Obs  name  recd07  recd08  sent07  sent08
    1  Alison  yes
    2  Amanda  yes  no
    3  Gabi  yes
    4  Jan  yes
    5  Jim  no  yes
    6  Pam  no
    7  Tom  yes

Down Calculations (DATA step)

    PROC SORT data=shs.exposure;
      by subject_id;
    run;

    DATA counters(KEEP=TableName MAXOBS TOTOBS);
      SET shs.exposure END=LAST;
      BY subject_id;
      length TableName $ 50;
      RETAIN MAXOBS OBSCNTR TOTOBS 0;
      TableName="exposure";
      TOTOBS+1;
      OBSCNTR+1;
      IF LAST.subject_id THEN DO;
        IF MAXOBS < OBSCNTR THEN MAXOBS=OBSCNTR;
        OBSCNTR=0;
      END;
      IF LAST THEN OUTPUT counters;
      label maxobs='Maximum number of obs per person'
            totobs='Total Number obs in table';
    run;

    proc print data=counters;
    run;

Output

    Obs  TableName  MAXOBS  TOTOBS
    1  exposure  142124

Down Calculations (PROC SQL)

    proc sql;
      create table sqlcounter as
        select distinct subject_id, count(*) as subjcnt
        from fshs.exposure
        group by subject_id;

      select "exposure" as TableName,
             max(subjcnt) as MaxObs,
             sum(subjcnt) as TotObs
        from sqlcounter;
    quit;

Query output columns: TableName, MaxObs, TotObs (one row, TableName = exposure)
Dataset sqlcounter columns: Obs, subject_id, subjcnt

Counts and Nesting Queries

    proc sql;
      select distinct genre, count(*)
        from itunes
        group by genre;
    quit;

    proc sql outobs=1;
      select (select count(*) from itunes) as TotalSongs,
             (select count(distinct genre) from itunes) as GenreCnt,
             (select count(distinct artist) from itunes) as ArtistCnt,
             (select count(distinct album) from itunes) as AlbumCnt
        from itunes;
    quit;

Output of first query

    Genre
    Alternative  9
    Bluegrass  43
    Blues  14
    Children's Music  62
    Christian & Gospel  88
    Classical  74
    Country  77
    Easy Listening  31
    Electronic  1
    Folk  16
    General Folk  18
    Gospel & Religious  40
    Hip Hop/Rap  2
    Holiday  13
    Inspirational  70

Output columns of second query: TotalSongs, GenreCnt, ArtistCnt, AlbumCnt

Dictionaries

    proc sql;
      create view detail as
        select * from dictionary.columns;

      create view extern as
        select * from dictionary.members;

      create view tbl as
        select * from dictionary.tables;

      create view gotem as
        select trim(libname) as LibName,
               trim(memname) as TableName,
               trim(name) as ColName,
               label as ColLabel
        from sashelp.vcolumn;
    quit;
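Dictionary tables can also be queried directly, without creating a view first. The sketch below lists the datasets in one library with their row and column counts. The choice of the SASHELP library is an assumption made for illustration; libname, memname, memtype, nobs, and nvar are columns of dictionary.tables.

```sas
proc sql;
  /* LIBNAME and MEMNAME values are stored in upper case,
     so the comparison literals must be upper case too */
  select memname, nobs, nvar
    from dictionary.tables
    where libname = 'SASHELP'
      and memtype = 'DATA'
    order by memname;
quit;
```

Filtering on libname keeps the query fast, since SAS then only has to inspect the members of that one library.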

Dictionaries: Getting variable names

    proc sql;
      /* get names of all variables you want */
      select name into :drinkvars separated by ', '
        from dictionary.columns
        where libname='AUG'
          and memname='DEMOG'
          and lowcase(name) contains 'ndrk';

      /* use your newly created macro variable in your select statement */
      create table drinks as
        select &drinkvars
        from aug.demog;
    quit;

Dictionaries: Getting variable names (with a table alias)

    proc sql;
      /* add the table alias to the front of each variable name
         as you create your macro variable */
      select 'd.'||name into :aliasvars separated by ', '
        from dictionary.columns
        where libname='AUG'
          and memname='DEMOG'
          and lowcase(name) contains ('ndrk');

      /* do your merge or whatever using the macro variable you just created */
      create table newtable as
        select &aliasvars, c.expcategory
        from aug.demog as d
             left join aug.exposure as c
             on d.jcml_id = c.jcml_id;
    quit;

Views

- Views are 'virtual tables'
- Created with CREATE VIEW statement
- Can be used as if they are normal physical tables
- Enhance security: can construct a view of only fields and rows that user is allowed to view
- Enhance ease-of-use: can combine rows and columns from multiple tables into a single view
- Facilitate data integrity:
  - Can have several views on the same table, but only have to update the base table
  - Users always see up-to-date data

    proc sql;
      create view aug.testview as
        select d.subject_id, d.case_id, d.age,
               e.job_num, e.exposure_element
        from aug.demog as d, aug.exposure as e
        where d.subject_id = e.subject_id;
    quit;
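To illustrate "can be used as if they are normal physical tables", the following sketch reads the aug.testview view defined on the slide above. The summary query and its n_subjects column are hypothetical examples, and the AUG libref is assumed to be assigned when the view is used.

```sas
proc sql;
  /* The view's stored query runs at this point, so the counts
     reflect whatever is currently in the base tables */
  select exposure_element,
         count(distinct subject_id) as n_subjects
    from aug.testview
    group by exposure_element;
quit;

/* Views can also be read by DATA steps and other procedures */
proc print data=aug.testview(obs=10);
run;
```

No data is copied when the view is created; each reference re-executes the underlying join, which is why users always see up-to-date data.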

Creating Data Source Indicators

Dataset dads: columns Obs, famid, name, inc (rows for Art, Bill, Paul, Karl)
Dataset faminc: columns Obs, famid, faminc96, faminc97, faminc98

PROC SQL:

    proc sql;
      create table sql_fj as
        select *,
               (dads.famid = faminc.famid) as indic,
               (dads.famid ~= .) as dadind,
               (faminc.famid ~= .) as famind,
               coalesce(dads.famid, faminc.famid) as fid
        from dads
             full join faminc
             on dads.famid = faminc.famid;
    quit;

DATA step:

    proc sort data=dads out=sorted_dads; by famid; run;
    proc sort data=faminc out=sorted_faminc; by famid; run;

    data ds_fj;
      merge sorted_dads(in=in1)
            sorted_faminc(in=in2);
      by famid;
      if in1 and in2 then indic=1;
      else indic=0;
      dadind=in1;
      famind=in2;
      fid=famid;
    run;

Full Join (Cont'd)

Sql_fj and Ds_fj both contain the columns Obs, famid, name, inc, faminc96, faminc97, faminc98, indic, dadind, famind, fid (rows for Bill, Art, Paul, Karl).

Additional Uses

Scenario: For a case/control study, verify that all controls have age within 5 to 10 years of the related case age.

    proc sql;
      title 'Bad Control Matches';
      select c.subject_id, c.casenum, c.gender,
             age as CntlAge label='CntlAge',
             (select age from cases
                where subject_id = c.casenum) as CaseAge,
             abs(calculated cntlage - calculated caseage) as AgeDiff
        from controls as c
             left join demog as d
             on c.subject_id = d.subject_id
        where (not within5 and not within10);
    quit;

Merging Multiple Tables

    proc sql;
      create table sql_c3 as
        select a.name, b.class,
               case
                 when missing(c.grade1) then "missing 1"
                 when missing(c.grade2) then "missing 2"
                 when missing(c.grade3) then "missing 3"
                 when missing(c.grade4) then "missing 4"
                 else "none missing"
               end as miss_grade
        from indat_a as a, indat_b as b, indat_c as c
        where c.classid = b.classid
          and c.perid = a.perid;
    quit;

Dataset sql_c3

    Obs  name  class  miss_grade
    1  Mary  Art  none missing
    2  Olive  Art  none missing
    3  Quincy  Art  none missing
    4  Nat  Art  none missing
    5  Pat  Art  none missing
    6  Quincy  Music  missing 4
    7  Richard  Music  none missing
    8  Mary  Math  none missing
    9  Nat  Math  none missing
    10  Olive  Math  none missing
    11  Pat  Math  none missing
    12  Quincy  Math  missing 2
    13  Richard  Math  none missing
    14  Mary  English  none missing
    15  Nat  English  none missing
    16  Olive  English  none missing
    17  Pat  English  none missing
    18  Quincy  English  none missing
    19  Richard  English  none missing

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.