Denormalizing Data with PROC SQL

Demoralizing Data with PROC SQL
Is that a real word?? Spell Check…

Grocery Store

What You Have

customers:
CUSTOMER_ID  NAME
1            Ansel
2            Fiona
3            James
4            Kathy
5            Ying
6            Otto
7            Costas
8            Abdul
9            Enrico
10           Mitzu

groceries:
ITEM_ID  ITEM_NAME
1        eggs
2        milk
3        bread
4        chicken
5        beef
6        broccoli
7        carrots
8        apples
9        peaches
10       dog food

purchases:
CUSTOMER_ID  ITEM_ID
…            …

Relational tables in normal form
Purchase events in many-to-many relation
Good for relational storage, bad for computing stats
Data step: join tables using complex merge
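To try the queries in this presentation, the three tables can be built with DATA steps. The following is a minimal sketch, not the presenter's code. The customers and groceries rows are transcribed from the slide. The purchases rows are an assumption-laden subset: only the customer/item pairs that can be recovered from the slides (eggs, milk, bread and dog food for every customer, plus the chicken and broccoli bought by customer #1). The full 58-row purchases table is not shown in the slides, so per-customer purchase totals computed from this subset will not match the totals reported later.

```sas
/* customers: transcribed from the slide */
data customers;
  input customer_id name $;
  datalines;
1 Ansel
2 Fiona
3 James
4 Kathy
5 Ying
6 Otto
7 Costas
8 Abdul
9 Enrico
10 Mitzu
;
run;

/* groceries: comma-delimited so that "dog food" is read as one value */
data groceries;
  infile datalines dlm=',';
  input item_id item_name :$20.;
  datalines;
1,eggs
2,milk
3,bread
4,chicken
5,beef
6,broccoli
7,carrots
8,apples
9,peaches
10,dog food
;
run;

/* purchases: PARTIAL, only the (customer_id, item_id) pairs recoverable  */
/* from the slides; the trailing @@ reads several pairs from each line    */
data purchases;
  input customer_id item_id @@;
  datalines;
1 1   1 2   1 4   1 6
2 1   2 2   2 3   2 10
3 2
4 1   4 3
5 1   5 2   5 3   5 10
6 3   6 10
7 3   7 10
8 2
9 1   9 3   9 10
10 2  10 10
;
run;
```

With this subset, the eggs, milk and bread columns produced by the final query should agree with the matrix on the "What you want" slide.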

What you want

CUSTOMER_ID  NAME    EGGS  MILK  BREAD  …  DOG_FOOD
1            Ansel   1     1     0      …  0
2            Fiona   1     1     1      …  1
3            James   0     1     0      …  0
4            Kathy   1     0     1      …  0
5            Ying    1     1     1      …  1
6            Otto    0     0     1      …  1
7            Costas  0     0     1      …  1
8            Abdul   0     1     0      …  0
9            Enrico  1     0     1      …  1
10           Mitzu   0     1     0      …  1

Matrix shows who bought what
One row per customer
One column per item
Easy to compute stats

How do you get this? A few SQL examples to build up to a solution…

SQL Examples
What items did customer #1 buy?

    select item_id
    from purchases
    where customer_id = 1;

ITEM_ID
1
2
4
6

What items did customer #1 buy?
Join with grocery table to get item name

    select P.item_id, G.item_name
    from purchases P, groceries G
    where G.item_id = P.item_id
      and P.customer_id = 1;

ITEM_ID  ITEM_NAME
1        eggs
2        milk
4        chicken
6        broccoli

How many customers bought eggs?
Use SQL aggregate function count().

    select count(*)
    from purchases P, groceries G
    where P.item_id = G.item_id
      and G.item_name = 'eggs';

COUNT(*)
5

Did customer #1 buy eggs?
Restrict by customer; count() then returns 0 or 1, i.e., no or yes

    select count(*)
    from groceries G, purchases P
    where P.item_id = G.item_id
      and G.item_name = 'eggs'
      and P.customer_id = 1;

COUNT(*)
1

Did customer #10 buy eggs?

    select count(*)
    from groceries G, purchases P
    where P.item_id = G.item_id
      and G.item_name = 'eggs'
      and P.customer_id = 10;

COUNT(*)
0
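The count() queries above return 0 or 1 only when each customer/item pair appears at most once in purchases. If repeat purchases were stored as separate rows, the count could exceed 1. A minimal sketch of one way to force a yes/no flag, assuming the same purchases and groceries tables (the column name bought_eggs is made up for illustration):

```sas
proc sql;
  /* In PROC SQL a comparison evaluates to 1 (true) or 0 (false),   */
  /* so (count(*) > 0) is 1 when at least one matching row exists.  */
  select (count(*) > 0) as bought_eggs
  from groceries G, purchases P
  where P.item_id = G.item_id
    and G.item_name = 'eggs'
    and P.customer_id = 1;
quit;
```

If the purchase matrix is meant to hold counts rather than flags, the plain count(*) is the better choice.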

Subqueries
In SQL, the select clause can include a query that returns a scalar value

    select name,
           (select count(*) from purchases) num_purchases
    from customers;

NAME    NUM_PURCHASES
Ansel   58
Fiona   58
James   58
Kathy   58
Ying    58
Otto    58
Costas  58
Abdul   58
Enrico  58
Mitzu   58

Correlated Subqueries
Relate inner and outer queries via alias

    select name,
           (select count(*)
            from purchases
            where customer_id = C.customer_id) num_purchases
    from customers C;

NAME    NUM_PURCHASES
Ansel   4
Fiona   9
James   6
Kathy   3
Ying    8
Otto    7
Costas  7
Abdul   2
Enrico  7
Mitzu   5

Putting the pieces together
Joins to get data from multiple tables
Count() to get 0/1, yes/no
Correlated subqueries rotate rows to columns
Aliases to name columns

Final query

    select customer_id, name,
           (select count(*)
            from purchases P, groceries G
            where G.item_id = P.item_id
              and G.item_name = 'eggs'
              and P.customer_id = C.customer_id) eggs,
           (select count(*)
            from purchases P, groceries G
            where G.item_id = P.item_id
              and G.item_name = 'milk'
              and P.customer_id = C.customer_id) milk,
           (select count(*)
            from purchases P, groceries G
            where G.item_id = P.item_id
              and G.item_name = 'bread'
              and P.customer_id = C.customer_id) bread
    from customers C;

SAS Code

    proc sql;
      create table Work.Purchase_Matrix as
      select customer_id, name,
             (select count(*)
              from purchases P, groceries G
              where G.item_id = P.item_id
                and G.item_name = 'eggs'
                and P.customer_id = C.customer_id) eggs,
             (select count(*)
              from purchases P, groceries G
              where G.item_id = P.item_id
                and G.item_name = 'milk'
                and P.customer_id = C.customer_id) milk,
             (select count(*)
              from purchases P, groceries G
              where G.item_id = P.item_id
                and G.item_name = 'bread'
                and P.customer_id = C.customer_id) bread
      from customers C;
    quit;
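The correlated-subquery form repeats the purchases/groceries join once per output column. One possible alternative, not taken from the slides, is to join once and aggregate per customer. The sketch below assumes the same customers, purchases and groceries tables; the output table name Work.Purchase_Matrix_Agg is made up for illustration.

```sas
proc sql;
  create table Work.Purchase_Matrix_Agg as
  select C.customer_id,
         C.name,
         /* each comparison is 1 or 0, so the sum counts matching purchases */
         sum(G.item_name = 'eggs')  as eggs,
         sum(G.item_name = 'milk')  as milk,
         sum(G.item_name = 'bread') as bread
  from customers C
       /* left joins keep customers who bought nothing; their item_name  */
       /* is missing, every comparison is 0, and each sum comes out as 0 */
       left join purchases P on P.customer_id = C.customer_id
       left join groceries G on G.item_id = P.item_id
  group by C.customer_id, C.name
  order by C.customer_id;
quit;
```

Like count(*) in the subquery version, each sum is the number of matching purchase rows, so the values are 0 or 1 only when a customer/item pair appears at most once in purchases.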

Final Dataset

CUSTOMER_ID  NAME    EGGS  MILK  BREAD
1            Ansel   1     1     0
2            Fiona   1     1     1
3            James   0     1     0
4            Kathy   1     0     1
5            Ying    1     1     1
6            Otto    0     0     1
7            Costas  0     0     1
8            Abdul   0     1     0
9            Enrico  1     0     1
10           Mitzu   0     1     0

Resources
SQL may not always be the most appropriate choice for a given problem. This technique starts to get untenable as the number of columns needed in the output increases.
DATA Step vs. PROC SQL: What’s a neophyte to do?
Proc SQL versus The Data Step
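When the number of output columns grows, writing one expression per item by hand becomes the bottleneck. One way to ease that, offered here as a hedged sketch rather than as the presenter's recommendation, is to let PROC SQL generate the column list into a macro variable and reuse it. The sketch assumes the customers, purchases and groceries tables from the slides, and that every item_name becomes a valid SAS column name once blanks are replaced with underscores. The macro variable item_cols and the table Work.Purchase_Matrix_All are made-up names.

```sas
/* Step 1: build one "sum(G.item_name = "...") as col" expression per item */
proc sql noprint;
  select 'sum(G.item_name = ' || quote(strip(item_name)) || ') as '
         || translate(strip(item_name), '_', ' ')   /* dog food -> dog_food */
    into :item_cols separated by ', '
  from groceries
  order by item_id;
quit;

/* Optional: inspect the generated column list in the log */
%put &item_cols;

/* Step 2: join once, aggregate per customer, one column per grocery item */
proc sql;
  create table Work.Purchase_Matrix_All as
  select C.customer_id,
         C.name,
         &item_cols
  from customers C
       left join purchases P on P.customer_id = C.customer_id
       left join groceries G on G.item_id = P.item_id
  group by C.customer_id, C.name
  order by C.customer_id;
quit;
```

Item names containing characters other than letters, digits, underscores and blanks would need extra cleaning before they can serve as column names. For very wide outputs, PROC TRANSPOSE or a DATA step may still be the more practical tool, as the resources above discuss.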

Questions?