Joins (Part II) Tarik Booker California State University, Los Angeles
What we will cover…
Join Review: Primary Keys, Foreign Keys, Equi-Join, Inner Join, Multiple Joins
New Joins: Outer Joins (Left Joins, Right Joins), Full Join, Self Join
Joining Multiple Tables (More Examples)
Join Review Review Primary Key – What is it? Why? Foreign Key – What is it? Why? Equi-Join, Inner Join Differences? Join Conditions?
Join Review (2)
Join Conditions:
USING(columnId): for what type of join? (Named Column)
ON(join_condition): for what type of join? (Inner Join)
Join Review (3)
Multiple Joins: Give me a list of the MP3s that each studio has produced.
Tables:
Track titles and MP3 status are in: Tracks
Studio names are in: Studios
Table that links both: Titles
Join Review (4)
Equi-join written with WHERE conditions:
SELECT tr.TrackTitle, tr.mp3, s.studioname
FROM Tracks tr, Titles AS ti, Studios s
WHERE tr.titleid = ti.titleid AND ti.studioid = s.studioid AND tr.mp3 = 1;
The same query written with INNER JOIN:
SELECT tr.TrackTitle, tr.mp3, s.studioname
FROM Tracks tr
INNER JOIN Titles AS ti ON (tr.titleid = ti.titleid)
INNER JOIN Studios s ON (ti.studioid = s.studioid)
WHERE tr.mp3 = 1;
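To see that the two forms return the same rows, here is a minimal sketch. It assumes SQLite through Python's built-in sqlite3 module rather than the course's MySQL server, and the sample rows are invented; only the table and column names come from the slide.

```python
import sqlite3

# Hypothetical miniature of Tracks/Titles/Studios; all rows are invented.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Studios (studioid INTEGER PRIMARY KEY, studioname TEXT);
CREATE TABLE Titles  (titleid INTEGER PRIMARY KEY, studioid INTEGER);
CREATE TABLE Tracks  (titleid INTEGER, tracknum INTEGER, TrackTitle TEXT, mp3 INTEGER);
INSERT INTO Studios VALUES (1, 'Studio A'), (2, 'Studio B');
INSERT INTO Titles  VALUES (10, 1), (20, 2);
INSERT INTO Tracks  VALUES (10, 1, 'Song One', 1), (10, 2, 'Song Two', 0),
                           (20, 1, 'Song Three', 1);
""")

# Join conditions listed in the WHERE clause.
where_form = """
SELECT tr.TrackTitle, tr.mp3, s.studioname
FROM Tracks tr, Titles AS ti, Studios s
WHERE tr.titleid = ti.titleid AND ti.studioid = s.studioid AND tr.mp3 = 1
ORDER BY tr.TrackTitle"""

# Join conditions attached to each INNER JOIN.
join_form = """
SELECT tr.TrackTitle, tr.mp3, s.studioname
FROM Tracks tr
INNER JOIN Titles AS ti ON (tr.titleid = ti.titleid)
INNER JOIN Studios s ON (ti.studioid = s.studioid)
WHERE tr.mp3 = 1
ORDER BY tr.TrackTitle"""

rows_where = con.execute(where_form).fetchall()
rows_join = con.execute(join_form).fetchall()
print(rows_where == rows_join, rows_join)
```

The ORDER BY is added only so the two result lists can be compared directly.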
New Joins Outer Join Left Join Right Join Full Join Self Join
Outer Join
INNER JOIN: matches rows on the column values you select and throws out unmatched rows
OUTER JOIN: same as INNER JOIN, but keeps unmatched rows
Written with the keywords LEFT JOIN and RIGHT JOIN (the word OUTER is optional, as in LEFT OUTER JOIN; a bare “OUTER JOIN” is not valid SQL)
LEFT JOIN: keeps unmatched rows from the left (first) table
RIGHT JOIN: keeps unmatched rows from the right (second) table
Outer Join (2)
Give me a list of all artistnames with matching titles
Tables artists and titles: join artists with titles
SELECT artistname, title FROM artists a INNER JOIN titles t ON (a.artistid = t.artistid);
Returns only matched rows
However,
SELECT artistname, title FROM artists a LEFT JOIN titles t ON (a.artistid = t.artistid);
Returns what? The matched rows plus the unmatched rows from artists (row order is not guaranteed without ORDER BY)
Outer Join (3) Why? Some artists have no titles, so those artist rows have nothing to match
Outer Join (4)
Unmatched rows are returned with NULL in the other table’s columns
LEFT JOIN returns unmatched rows from the left (first) table
RIGHT JOIN returns unmatched rows from the right (second) table
What if I use a RIGHT JOIN? It returns the matched rows plus any unmatched rows from the right (titles) table
What rows were unmatched?
Outer Join (5) All were matched on titles table RIGHT JOIN produces NO unmatched rows! Different results than LEFT JOIN!
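The difference between INNER JOIN and LEFT JOIN can be checked on a tiny data set. This is a sketch under assumptions: it uses SQLite via Python's sqlite3 instead of MySQL, and the three artists and two titles are invented.

```python
import sqlite3

# Invented sample rows; only the table and column names come from the slides.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE artists (artistid INTEGER PRIMARY KEY, artistname TEXT);
CREATE TABLE titles  (titleid INTEGER PRIMARY KEY, artistid INTEGER, title TEXT);
INSERT INTO artists VALUES (1, 'Artist A'), (2, 'Artist B'), (3, 'Artist C');
INSERT INTO titles  VALUES (10, 1, 'First Album'), (11, 2, 'Second Album');
""")

# Same query twice; only the join keyword changes.
query = """
SELECT artistname, title
FROM artists a {join} titles t ON (a.artistid = t.artistid)
ORDER BY artistname"""

inner_rows = con.execute(query.format(join="INNER JOIN")).fetchall()
left_rows = con.execute(query.format(join="LEFT JOIN")).fetchall()

print(inner_rows)  # Artist C is dropped: it has no matching title
print(left_rows)   # Artist C is kept, with None (SQL NULL) as its title
```

A RIGHT JOIN on this data would add nothing to the inner join, because every title here has an artist.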
Outer Join (6)
Why? What’s the point?
Can make SQL show rows that don’t have matches: can now return values that don’t match anything!
Can use the ON and USING keywords
SELECT attribute_list FROM a LEFT JOIN b ON join_condition;
SELECT attribute_list FROM a RIGHT JOIN b USING(attribute);
Outer Join (7)
Join Order for Outer Joins Makes a Difference!
SELECT artistname, title FROM artists LEFT JOIN titles USING(artistid);
SELECT artistname, title FROM titles LEFT JOIN artists USING(artistid);
What’s the difference?
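A small experiment makes the effect of join order visible. The sketch below assumes SQLite via Python's sqlite3, and the rows are invented, including one title with no artist so that each ordering has something unmatched to keep.

```python
import sqlite3

# Invented rows: Artist C has no title, and 'Orphan Album' has no artist.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE artists (artistid INTEGER PRIMARY KEY, artistname TEXT);
CREATE TABLE titles  (titleid INTEGER PRIMARY KEY, artistid INTEGER, title TEXT);
INSERT INTO artists VALUES (1, 'Artist A'), (2, 'Artist B'), (3, 'Artist C');
INSERT INTO titles  VALUES (10, 1, 'First Album'), (11, 2, 'Second Album'),
                           (12, NULL, 'Orphan Album');
""")

# artists on the left: every artist is kept.
artists_first = con.execute("""
SELECT artistname, title FROM artists LEFT JOIN titles USING(artistid)
ORDER BY artistname""").fetchall()

# titles on the left: every title is kept.
titles_first = con.execute("""
SELECT artistname, title FROM titles LEFT JOIN artists USING(artistid)
ORDER BY title""").fetchall()

print(artists_first)
print(titles_first)
```

The left table decides which unmatched rows survive: the first query keeps Artist C, the second keeps the title without an artist.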
Returning Unmatched Data Using Outer Joins Sometimes you want rows that don’t match anything Classes with no students/teachers Items with no category Anything else? Zero-matching results Best way to do this is to test the primary key (from the opposite side of the outer join) for NULL How do we do this?
Returning Unmatched Data Using Outer Joins (2)
For a LEFT JOIN b, returning unmatched values in a: check the primary key in b for NULL
Example: Give me the artistnames of artists that don’t have any titles yet.
Tables? Artists, Titles
Keys? ArtistID
Join? LEFT JOIN
What are we checking for? NULL
In which table? Titles
What’s the primary key of Titles? TitleID
Returning Unmatched Data Using Outer Joins (3)
SELECT artistname, title FROM artists LEFT JOIN titles USING(artistid) WHERE titleid IS NULL;
TitleIDs are NULL for unmatched rows (they don’t exist)
For RIGHT JOIN, reverse the procedure
SELECT artistname, title FROM titles RIGHT JOIN artists USING(artistid) WHERE titleid IS NULL;
Left join is more intuitive (easier) for most people
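The NULL test on the opposite table's primary key can be tried directly. This is a minimal sketch with invented rows, run on SQLite through Python's sqlite3 (an assumption; the slides target MySQL). Only the LEFT JOIN form is shown, since RIGHT JOIN is not available in older SQLite versions.

```python
import sqlite3

# Invented sample rows: Artist C has recorded nothing yet.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE artists (artistid INTEGER PRIMARY KEY, artistname TEXT);
CREATE TABLE titles  (titleid INTEGER PRIMARY KEY, artistid INTEGER, title TEXT);
INSERT INTO artists VALUES (1, 'Artist A'), (2, 'Artist B'), (3, 'Artist C');
INSERT INTO titles  VALUES (10, 1, 'First Album'), (11, 2, 'Second Album');
""")

# titleid is the primary key of titles, so it can only be NULL
# in rows that the outer join filled in for an unmatched artist.
no_titles = con.execute("""
SELECT artistname
FROM artists LEFT JOIN titles USING(artistid)
WHERE titleid IS NULL""").fetchall()

print(no_titles)
```

Testing the primary key rather than an ordinary column avoids confusing a genuinely NULL value in a matched row with an unmatched row.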
Returning Unmatched Data Using Outer Joins (4)
To check multiple conditions for unmatched data in an Outer Join, you MUST:
Put the additional conditions on the outer-joined table in the ON clause of the join (not in the WHERE clause), then
Check for NULL in the WHERE clause
THIS IS VERY EASY TO SCREW UP, SO PAY CLOSE ATTENTION!!!
A condition on the outer-joined table placed in WHERE is applied after the join and discards the NULL-filled unmatched rows
Ex: Find titles that DO NOT CONTAIN any tracks longer than 400 secs
Returning Unmatched Data Using Outer Joins (5)
Find titles that DO NOT CONTAIN any tracks longer than 400 secs
Steps:
1. Get my tables: Titles, Tracks
2. Get my join conditions: the key to join on (titleid) and the additional condition (tracks longer than 400s: lengthseconds > 400)
3. Return only the rows with NULL values
Returning Unmatched Data Using Outer Joins (6)
SELECT t.title
FROM titles t LEFT JOIN Tracks tr ON (t.titleid = tr.titleid AND Lengthseconds > 400)
WHERE Lengthseconds IS NULL;
Or: WHERE tr.titleid IS NULL
Or: WHERE tracknum IS NULL
Why?
Returning Unmatched Data Using Outer Joins (7) Why did (lengthseconds IS NULL) work? Unmatched values in the Titles table return NULL for Tracks columns in the JOIN
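To make the ON-versus-WHERE point concrete, the sketch below runs the query with the extra condition in ON and then with the same condition moved to WHERE. It assumes SQLite via Python's sqlite3, and the titles and track lengths are invented.

```python
import sqlite3

# Invented data: one title with only short tracks, one with a long track,
# and one with no tracks at all.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE titles (titleid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE tracks (titleid INTEGER, tracknum INTEGER, lengthseconds INTEGER);
INSERT INTO titles VALUES (1, 'Short Songs'), (2, 'Long Songs'), (3, 'No Tracks Yet');
INSERT INTO tracks VALUES (1, 1, 200), (1, 2, 300), (2, 1, 450), (2, 2, 100);
""")

# Extra condition in ON: titles with no track over 400s stay unmatched.
cond_in_on = con.execute("""
SELECT t.title
FROM titles t LEFT JOIN tracks tr
     ON (t.titleid = tr.titleid AND tr.lengthseconds > 400)
WHERE tr.lengthseconds IS NULL
ORDER BY t.title""").fetchall()

# Extra condition in WHERE: it filters after the join, and no row can be
# both longer than 400s and NULL, so nothing is returned.
cond_in_where = con.execute("""
SELECT t.title
FROM titles t LEFT JOIN tracks tr ON (t.titleid = tr.titleid)
WHERE tr.lengthseconds > 400 AND tr.titleid IS NULL
ORDER BY t.title""").fetchall()

print(cond_in_on)
print(cond_in_where)
```

Note that the title with no tracks at all is also returned by the first query, which is consistent with the question as asked: it contains no track longer than 400 seconds.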
Returning Unmatched Data Using Outer Joins (8)
Keep in mind, unmatched data are the rows that DO NOT MATCH
Ex: Find titles that do not contain any tracks longer than 400 seconds. Is this unmatched data?
Ex: Find all titles with tracks that are not longer than 400 seconds. Is this unmatched data?
Returning Unmatched Data Using Outer Joins (9)
The second question asks for matched data, so a plain JOIN answers it:
SELECT DISTINCT T.title FROM titles T JOIN tracks tr ON (T.titleid = tr.titleid) WHERE LengthSeconds <= 400;
You will have a lot of practice with this on Thursday
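The two questions sound alike but return different titles. A minimal sketch on invented data (SQLite via Python's sqlite3, which is an assumption, not the course's MySQL setup) shows where they diverge.

```python
import sqlite3

# Invented data: 'Long Songs' has one long track (450s) and one short (100s).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE titles (titleid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE tracks (titleid INTEGER, tracknum INTEGER, lengthseconds INTEGER);
INSERT INTO titles VALUES (1, 'Short Songs'), (2, 'Long Songs'), (3, 'No Tracks Yet');
INSERT INTO tracks VALUES (1, 1, 200), (1, 2, 300), (2, 1, 450), (2, 2, 100);
""")

# Unmatched data: titles that contain NO track longer than 400 seconds.
no_long_track = con.execute("""
SELECT t.title
FROM titles t LEFT JOIN tracks tr
     ON (t.titleid = tr.titleid AND tr.lengthseconds > 400)
WHERE tr.titleid IS NULL
ORDER BY t.title""").fetchall()

# Matched data: titles that have AT LEAST ONE track of 400 seconds or less.
some_short_track = con.execute("""
SELECT DISTINCT T.title
FROM titles T JOIN tracks tr ON (T.titleid = tr.titleid)
WHERE lengthseconds <= 400
ORDER BY T.title""").fetchall()

print(no_long_track)
print(some_short_track)
```

'Long Songs' appears only in the second result because it has a short track as well as a long one, and the title with no tracks appears only in the first.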
Full Joins
A kind of outer join that returns unmatched rows from BOTH tables
Like a LEFT JOIN and RIGHT JOIN at the same time
Not supported in MySQL, so we won’t use it, but it is good to know
Returns NULLs on both sides (in result)
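Where FULL JOIN is unavailable, a common workaround (not something the slides prescribe) is to UNION two LEFT JOINs with the tables swapped. The sketch below assumes SQLite via Python's sqlite3 and uses invented rows with one unmatched row on each side.

```python
import sqlite3

# Invented rows: Artist C has no title, and 'Orphan Album' has no artist.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE artists (artistid INTEGER PRIMARY KEY, artistname TEXT);
CREATE TABLE titles  (titleid INTEGER PRIMARY KEY, artistid INTEGER, title TEXT);
INSERT INTO artists VALUES (1, 'Artist A'), (3, 'Artist C');
INSERT INTO titles  VALUES (10, 1, 'First Album'), (12, NULL, 'Orphan Album');
""")

# First half keeps every artist, second half keeps every title;
# UNION merges them and removes the duplicated matched rows.
full_rows = set(con.execute("""
SELECT a.artistname, t.title
FROM artists a LEFT JOIN titles t ON (a.artistid = t.artistid)
UNION
SELECT a.artistname, t.title
FROM titles t LEFT JOIN artists a ON (a.artistid = t.artistid)
""").fetchall())

print(full_rows)
```

One caveat of this approach: UNION also collapses rows that are legitimately identical, so it is not an exact substitute for FULL JOIN when the selected columns can repeat.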
Self Join A join between two or more instances of the same table! Typically used when a foreign key references a primary key of the same table Employee Table (later) Also used to find pairs of things in the same table Good for modeling hierarchical relationships within a single table Folders (Windows): Parent-Child Hierarchy Employees: Employee Hierarchy Examples?
Self Join (2) Look at the SalesPeople Table: Who’s the boss (top level hierarchy)?
Self Join (3) Hierarchy: Scott Bull (top level) Lisa Williams Bob Bentley Clint Sanchez Why does Scott have a NULL supervisor?
Self Join (4) Let’s Look at this Database (ecst_employee): Uses the Employee Table
Self Join (5) Employee Table: EmployeeID (Primary Key) FirstName LastName JobTitle Salary SupervisorID (Foreign Key, references EmployeeID)
Ways to Use Self Join List the Names (First, and Last) who have no supervisors (who are supervised by no one) Who has no supervisor? How can you tell? Do you need a join yet?
Ways to Use Self Join (2) SELECT Firstname, LastName FROM Employee E WHERE E.SupervisorID IS NULL;
Ways to Use Self Join (3) Who is my immediate boss?
Ways to Use Self Join (4)
SELECT e.firstname, e.lastname
FROM Employee e JOIN Employee s ON (e.employeeid = s.supervisorid)
WHERE s.firstname = 'Tarik';
Here s is the row for Tarik and e is the row for his supervisor
Ways to Use Self Join (5) Table Aliases Think of Clones! Use paper if too hard! E S
Ways to Use Self Join (6) List all employees whose supervisors are supervised by no one Supervisor (Immediate Supervisor) IS WHAT? How do we do this?
Ways to Use Self Join (7)
SELECT e.Firstname, e.Lastname
FROM Employee e JOIN Employee s ON (e.SupervisorID = s.EmployeeID)
WHERE s.SupervisorID IS NULL;
Here s is each employee’s supervisor; keep the rows where that supervisor has no supervisor
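The three employee questions can be tried on a four-person hierarchy. This is a sketch under assumptions: SQLite via Python's sqlite3, a cut-down Employee table, and invented people (Ana supervises Ben and Cal; Ben supervises Dee). The last query matches each employee's SupervisorID to the supervisor's EmployeeID.

```python
import sqlite3

# Hypothetical hierarchy: Ana (top) -> Ben, Cal; Ben -> Dee.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Employee (
    EmployeeID   INTEGER PRIMARY KEY,
    FirstName    TEXT,
    LastName     TEXT,
    SupervisorID INTEGER REFERENCES Employee(EmployeeID)
);
INSERT INTO Employee VALUES (1, 'Ana', 'Top', NULL), (2, 'Ben', 'Mid', 1),
                            (3, 'Cal', 'Mid', 1),    (4, 'Dee', 'Low', 2);
""")

# Who has no supervisor? No join needed.
top_level = con.execute(
    "SELECT FirstName FROM Employee E WHERE E.SupervisorID IS NULL").fetchall()

# Who is Dee's immediate boss? s is Dee's row, e is her supervisor's row.
boss_of_dee = con.execute("""
SELECT e.FirstName
FROM Employee e JOIN Employee s ON (e.EmployeeID = s.SupervisorID)
WHERE s.FirstName = 'Dee'""").fetchall()

# Whose supervisor is supervised by no one? s is the supervisor's row.
report_to_top = con.execute("""
SELECT e.FirstName
FROM Employee e JOIN Employee s ON (e.SupervisorID = s.EmployeeID)
WHERE s.SupervisorID IS NULL
ORDER BY e.FirstName""").fetchall()

print(top_level, boss_of_dee, report_to_top)
```

Sketching the two aliases as two copies of the table on paper, as the slides suggest, helps keep track of which copy plays the supervisor.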
Other Self Joins
Consider the Presidents DB (presidents.sql):
create table presidents ( ID int primary key, name varChar(25), predecessorID int );
insert into presidents values(44, "Obama", 43);
insert into presidents values(43, "Bush, GW", 42);
insert into presidents values(42, "Clinton", 41);
insert into presidents values(41, "Bush, GHW", null);
Other Self Joins (2) Questions: How can we list the name of each president with that of his predecessor? How can we include the first president in the output? How can we list each with the name of his successor? How can we get presidents and successors while showing the last entry in the table? (with a null successor ID)
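One possible set of answers to these questions, offered as a sketch rather than the intended solution: it loads the same four rows into SQLite through Python's sqlite3 (an assumption; the slides use MySQL) and takes "first president" to mean the earliest row in this table.

```python
import sqlite3

# Same rows as presidents.sql, with single-quoted string literals.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE presidents (ID INT PRIMARY KEY, name VARCHAR(25), predecessorID INT);
INSERT INTO presidents VALUES (44, 'Obama', 43);
INSERT INTO presidents VALUES (43, 'Bush, GW', 42);
INSERT INTO presidents VALUES (42, 'Clinton', 41);
INSERT INTO presidents VALUES (41, 'Bush, GHW', NULL);
""")

# Each president with his predecessor (inner join drops ID 41).
with_pred = con.execute("""
SELECT p.name, pred.name
FROM presidents p JOIN presidents pred ON (p.predecessorID = pred.ID)
ORDER BY p.ID""").fetchall()

# LEFT JOIN keeps the row whose predecessorID is NULL.
with_pred_all = con.execute("""
SELECT p.name, pred.name
FROM presidents p LEFT JOIN presidents pred ON (p.predecessorID = pred.ID)
ORDER BY p.ID""").fetchall()

# Successors: swap the roles; LEFT JOIN keeps the last entry (no successor).
with_succ_all = con.execute("""
SELECT p.name, succ.name
FROM presidents p LEFT JOIN presidents succ ON (succ.predecessorID = p.ID)
ORDER BY p.ID""").fetchall()

print(with_pred, with_pred_all, with_succ_all, sep="\n")
```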
Using Self Joins for Pairs of Records
From the Lyric Database: List all pairs of titles for which both titles were recorded at the same studio.
What’s the approach?
Basic (naïve) solution:
SELECT T1.Title, T2.Title FROM Titles T1 JOIN Titles T2 ON (T1.StudioID = T2.StudioID);
What happens? Next…
Using Self Joins for Pairs of Records (2)
Better solution:
SELECT T1.Title, T2.Title FROM Titles T1 JOIN Titles T2 ON (T1.StudioID = T2.StudioID) WHERE T1.Title != T2.Title;
Mixed pairs! Each pair still appears twice, as (A, B) and as (B, A). How to fix (tricky)?
Using Self Joins for Pairs of Records (3)
Correct Solution:
SELECT T1.Title, T2.Title FROM Titles T1 JOIN Titles T2 ON (T1.StudioID = T2.StudioID) WHERE T1.Title < T2.Title;
What does the “<” mean here (characters)? It compares the two titles as strings, so only the copy of each pair whose first title sorts before the second is kept
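Counting the rows each version returns shows what the extra conditions remove. The sketch assumes SQLite via Python's sqlite3 and four invented titles, three of them recorded at the same studio.

```python
import sqlite3

# Invented titles: Alpha, Beta and Delta share studio 1; Gamma is alone at studio 2.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Titles (TitleID INTEGER PRIMARY KEY, Title TEXT, StudioID INTEGER);
INSERT INTO Titles VALUES (1, 'Alpha', 1), (2, 'Beta', 1), (3, 'Gamma', 2), (4, 'Delta', 1);
""")

base = """
SELECT T1.Title, T2.Title
FROM Titles T1 JOIN Titles T2 ON (T1.StudioID = T2.StudioID)
"""

# Naive: every title also pairs with itself.
naive = con.execute(base).fetchall()
# != removes self-pairs but keeps both (A, B) and (B, A).
no_self = con.execute(base + "WHERE T1.Title != T2.Title").fetchall()
# < keeps exactly one ordering of each pair.
unique_pairs = con.execute(
    base + "WHERE T1.Title < T2.Title ORDER BY T1.Title, T2.Title").fetchall()

print(len(naive), len(no_self), unique_pairs)
```

With three titles at one studio and one at another, the naive join yields 3×3 + 1×1 = 10 rows, the != version 6, and the < version the 3 distinct pairs.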
Be Careful!!!
Ex: Find all artists that have not recorded jazz albums
WRONG!:
SELECT DISTINCT Artistname FROM Artists A JOIN Titles T ON (A.ArtistID = T.ArtistID) WHERE T.Genre != 'jazz';
Why Wrong? What type of information is this?
Be Careful!!!
Ex: Find all artists that have not recorded jazz albums
CORRECT:
SELECT Artistname FROM Artists A LEFT JOIN Titles T ON (A.artistID = T.artistID AND T.Genre = 'jazz') WHERE TitleID IS NULL;
Why? There are legitimate NULL results: artists that haven’t recorded anything!
Be Careful!!!
SELECT DISTINCT Artistname FROM Artists A JOIN Titles T ON (A.ArtistID = T.ArtistID) WHERE T.Genre != 'jazz';
Returns artists that have recorded non-jazz titles
SELECT Artistname FROM Artists A LEFT JOIN Titles T ON (A.artistID = T.artistID AND T.Genre = 'jazz') WHERE TitleID IS NULL;
Returns artists that have not recorded jazz titles. Not the same as above!!
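Running both queries on a small data set shows exactly which artists each one gets wrong. The sketch assumes SQLite via Python's sqlite3, and the four artists are invented to cover each case: jazz only, jazz plus rock, rock only, and no titles at all.

```python
import sqlite3

# Invented artists chosen to cover every case.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Artists (ArtistID INTEGER PRIMARY KEY, Artistname TEXT);
CREATE TABLE Titles  (TitleID INTEGER PRIMARY KEY, ArtistID INTEGER, Genre TEXT);
INSERT INTO Artists VALUES (1, 'Jazz Only'), (2, 'Mixed'), (3, 'Rock Only'), (4, 'No Titles');
INSERT INTO Titles  VALUES (10, 1, 'jazz'), (11, 2, 'jazz'), (12, 2, 'rock'), (13, 3, 'rock');
""")

# Artists with at least one non-jazz title (matched data).
has_non_jazz = con.execute("""
SELECT DISTINCT Artistname
FROM Artists A JOIN Titles T ON (A.ArtistID = T.ArtistID)
WHERE T.Genre != 'jazz'
ORDER BY Artistname""").fetchall()

# Artists with no jazz title at all (unmatched data).
never_jazz = con.execute("""
SELECT Artistname
FROM Artists A LEFT JOIN Titles T
     ON (A.ArtistID = T.ArtistID AND T.Genre = 'jazz')
WHERE TitleID IS NULL
ORDER BY Artistname""").fetchall()

print(has_non_jazz)
print(never_jazz)
```

The first query wrongly includes 'Mixed', who has recorded jazz, and misses 'No Titles', who has not.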
Be Careful!!!
Think carefully! You can’t always check output to see if results are correct (large DBs)
Ex: List all members who do not belong to The Bullets. What type of info is this?
*WRONG*
SELECT M.MemberID, M.lastname, A.ArtistName FROM Members M JOIN XrefArtistsMembers X USING (MemberID) JOIN Artists A USING (ArtistID) WHERE A.Artistname != "The Bullets";
*CORRECT*
SELECT M.MemberID, M.lastname FROM Members M LEFT JOIN XrefArtistsMembers X ON (M.MemberID = X.MemberID) LEFT JOIN Artists A ON (X.ArtistID = A.ArtistID AND A.Artistname = "The Bullets") WHERE A.Artistid IS NULL;
Note: this relies on each member belonging to at most one artist; a member of The Bullets who is also in another group would still be returned because of the other group’s row
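The members example can be checked the same way, including the case of a member who belongs to two groups. This is a sketch under assumptions: SQLite via Python's sqlite3, invented members, and a derived-table variant that is one hypothetical way to handle multi-group members, not a query from the slides.

```python
import sqlite3

# Invented data: Able is in The Bullets, Baker in Other Band,
# Chen in no group, Diaz in both groups.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, lastname TEXT);
CREATE TABLE Artists (ArtistID INTEGER PRIMARY KEY, Artistname TEXT);
CREATE TABLE XrefArtistsMembers (MemberID INTEGER, ArtistID INTEGER);
INSERT INTO Members VALUES (1, 'Able'), (2, 'Baker'), (3, 'Chen'), (4, 'Diaz');
INSERT INTO Artists VALUES (1, 'The Bullets'), (2, 'Other Band');
INSERT INTO XrefArtistsMembers VALUES (1, 1), (2, 2), (4, 1), (4, 2);
""")

# Inner join with !=: members who belong to some other group.
inner_version = con.execute("""
SELECT DISTINCT M.MemberID, M.lastname
FROM Members M
JOIN XrefArtistsMembers X ON (M.MemberID = X.MemberID)
JOIN Artists A ON (X.ArtistID = A.ArtistID)
WHERE A.Artistname != 'The Bullets'
ORDER BY M.MemberID""").fetchall()

# Two chained LEFT JOINs, as on the slide.
chained_version = con.execute("""
SELECT M.MemberID, M.lastname
FROM Members M
LEFT JOIN XrefArtistsMembers X ON (M.MemberID = X.MemberID)
LEFT JOIN Artists A ON (X.ArtistID = A.ArtistID AND A.Artistname = 'The Bullets')
WHERE A.ArtistID IS NULL
ORDER BY M.MemberID""").fetchall()

# Hypothetical variant: outer-join against the set of Bullets members only.
derived_version = con.execute("""
SELECT M.MemberID, M.lastname
FROM Members M
LEFT JOIN (SELECT X.MemberID
           FROM XrefArtistsMembers X
           JOIN Artists A ON (X.ArtistID = A.ArtistID)
           WHERE A.Artistname = 'The Bullets') B ON (M.MemberID = B.MemberID)
WHERE B.MemberID IS NULL
ORDER BY M.MemberID""").fetchall()

print(inner_version, chained_version, derived_version, sep="\n")
```

On this data the inner join misses Chen, who is in no group. The chained outer joins recover Chen but still list Diaz because of the second membership row. The derived-table form returns only Baker and Chen. If every member belongs to at most one artist, the chained form and the derived-table form agree.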