More queries Outer joins and summary queries
Inner and outer joins An Inner join only returns matching rows from two tables –E.g. if I join the customer and order table, I will only get customers who have orders. –If I join staff to orders, I will only get back the staff who have been in charge of an order. An Outer join can either be a left outer join or a right outer join. –A left outer join returns all rows from the left table and only the matching rows from the right table. –A right outer join returns all rows from the right table and only the matching rows from the left table.
Example ‘Find me all customers that are recorded along with any orders they have placed’. Select all Customers and any orders that are joined to a customer. –To make it easier to read, only select the company name from the customer, and the order id from the order Browse through the results to see if you can find any nulls in the order fields.
Example of a left join Select orders.orderid, cast(CompanyName as char(30))as Company from Customers left join Orders on Customers.CustomerId = Orders.CustomerId
A section of the output 10808Old World Delicatessen 10347Familia Arquibaldo 10386Familia Arquibaldo 10414Familia Arquibaldo 10512Familia Arquibaldo 10581Familia Arquibaldo 10650Familia Arquibaldo 10725Familia Arquibaldo NULLFISSA Fabrica Inter. Salchicha 10408Folies gourmandes 10480Folies gourmandes 10634Folies gourmandes 10763Folies gourmandes 10789Folies gourmandes Note – the customer whose company name is FISSA Fabrica Inter. Salchicha is registered, –but there are no orders registered against it
Uses of Outer Joins These can be used to show the difference between two tables. –Difference using an outer join Get all the rows from one side of the join and only the matching rows from the other side This means that some of the resultant rows will have NULLs in columns originating from one side of the join. Display only those with nulls on the fully populated side.
Using our example for Intersection ‘Find me all customers that are recorded that have never placed an order’. 1. Select all Customers and any orders that are joined to a customer. 2. List only those rows that DO NOT have a NULL value on the order side of the join.
Query Select distinct Customers.CustomerId, cast(CompanyName as char(30))as Company from Customers left join Orders on Customers.CustomerId = Orders.CustomerId where orders.customerId is not null
Results for Intersection CustomerId Company ALFKI Alfreds Futterkiste ANATR Ana Trujillo Emparedados y hel ANTON Antonio Moreno Taquería AROUT Around the Horn BERGS Berglunds snabbköp BLAUS Blauer See Delikatessen BLONP Blondesddsl père et fils BOLID Bólido Comidas preparadas BONAP Bon app' ………..89 rows.
Note about intersection This should be the same as an inner join: –Select distinct Customers.CustomerId, cast (CompanyName as Char(30)) as Company from Customers join Orders on Customers.CustomerId = Orders.CustomerId –Run the two and see that the result is the same.
Using our example for Difference ‘Find me all customers that are recorded that have never placed an order’. 1. Select all Customers and any orders that are joined to a customer. 2. List only those rows that have a NULL value on the order side of the join.
Query Select Customers.CustomerId, cast(CompanyName as char(30))as Company, Orders.CustomerId from Customers left join Orders on Customers.CustomerId = Orders.CustomerId where orders.customerId is null
Results CustomerId Company CustomerId FISSA FISSA Fabrica Inter. Salchicha NULL PARIS Paris spécialités NULL (2 row(s) affected) Note: In reality, the output would make more sense if we left out the Orders.CustomerId, because we know that it is null anyway.
More on outer joins Which side do we use? –If we have a 1:0 or many relationship, then the 1 side is essential –E.g. we can ask ‘what products do we have that have not been ordered?’, but it makes no sense to ask ‘what order details do we have that don’t match a product?’ –Why? Because we have used the product code as a foreign key in the [order details], thereby ensuring that an [order details] row cannot be added unless there is a corresponding product to order! –In the Northwind, all the products have been ordered at least once, so we’ll use a different pair of tables.
Look at two tables Each employee (employeeterritories table) covers one or more territory (territories table) An employee cannot cover a territory that does not exist: select EmployeeId, TerritoryDescription from employeeterritories left join territories on territories.territoryId = employeeterritories.territoryId –This shows no nulls but there may be no employee covering a territory: select EmployeeId, TerritoryDescription from employeeterritories right join territories on territories.territoryId = employeeterritories.territoryId Shows that Columbia, Austin, Bentonville and Dallas have no employees covering them.
And another thing The order of the join is important: –Employees left join Employeeterritories –Is the same as –Employeeterritories right join Employees And similarly: –Employeeterritories left join Employees –Is the same as –Employees right join Employeeterritories
Tables joined to themselves. This generally takes place in a hierarchy. Two commonplace examples are A part is part of a part –E.g. a bulb is part of a light. –A light is part of a front grille. An employee reports to an employee
Example create table part ( PartNo numeric (4,0) unique not null, PartDesc varchar(20), PartOf numeric (4,0), primary key (PartNo), foreign key (PartOf) references Part(PartNo) ) insert into Part (PartNo, PartDesc) values (1, 'Biggest part') insert into Part (PartNo, PartDesc) values (2, 'Quite big part') insert into Part values (3, 'Component part',1) insert into Part values (4,'small part',3) select * from part select * from part as component join part as container on component.partof = container.partno
Tables joined to themselves In the Northwind database, the employees table is joined to itself, by the ‘reportsto’ field. See tutorial 6 and try to list all employees and their bosses.
NorthWind ERD
Summary Queries Summary queries involve multiple rows in a table. The rows are GROUPED, using the GROUP BY clause. A function (MIN, MAX, AVG, SUM, COUNT) operates on one of the columns of the table. –E.g. select count(*) from products. –This will return the number of product ROWS in the table. –Select min(unitprice) from products Include or exclude grouped rows by using the HAVING clause
Summary Summarise the income by product, still ordered in the same way, giving totals for each product and totals for each category. NOTE: –There are two summaries here – one per product and one per category. –Use the ‘WITH CUBE’ clause.
Query Select categoryName, cast(ProductName as char(25)) as Product, cast (sum(([order details].unitPrice*Quantity)-Discount) as decimal(10,2))as income from categories join products on categories.categoryid = products.categoryid join [order details] on products.productid = [order details].productid group by categoryname, productname with cube
categoryName Product income Beverages Chai Beverages Chang Beverages Chartreuse verte Beverages Côte de Blaye Beverages Guaraná Fantástica Beverages Ipoh Coffee Beverages Lakkalikööri Beverages Laughing Lumberjack Lager Beverages Outback Lager Beverages Rhönbräu Klosterbier Beverages Sasquatch Ale Beverages Steeleye Stout Beverages NULL Condiments Aniseed Syrup Condiments Chef Anton's Cajun Season Summary row
With Rollup ROLLUP specifies that, in addition to the usual rows provided by GROUP BY, summary rows are introduced into the result set. Groups are summarized in a hierarchical order, from the lowest level in the group to the highest. The group hierarchy is determined by the order in which the grouping columns are specified. Changing the order of the grouping columns can affect the number of rows produced in the result set.
Example Select categoryname, count(*)as ‘Number of products’ from categories join products on products.categoryId = categories.categoryId group by categoryname with rollup;
Result categoryname Number of products Beverages 12 Condiments 12 Confections 13 Dairy Products 10 Grains/Cereals 7 Meat/Poultry 6 Produce 5 Seafood 12 NULL 77 (9 row(s) affected)
Where and Having The ‘WHERE’ allows filtering on rows. –e.g. Show all order details where the value of the order detail is > €5,000 The ‘HAVING’ allows filtering on summary rows. –e.g. Show all orders where the value of the order is > €10,000
Where Return all orderlines that cost more than €8,000 Use Northwind /* Switches the database*/ Select OrderId, cast(((UnitPrice*Quantity)- Discount) /* price per order detail */ as decimal(10,2)) /* format for cast */ as OrderLineValue /* name the derived column */ from [Order details] where(UnitPrice*Quantity)- Discount >8000 /* where the cost of the row is > 8,000
output OrderId OrderLineValue (8 row(s) affected)
Having Return all Orders that are worth more than €10,000 Use Northwind Select Orders.OrderId, cast(sum((UnitPrice*Quantity)- Discount)as decimal(10,2))as OrderValue from Orders join [Order details] on Orders.OrderId = [Order details].OrderId group by Orders.OrderId having sum((UnitPrice*Quantity)- Discount) >10000
Having – with comments Select Orders.OrderId, /* the order number */ cast/* format the output */ (sum/* sum over the order */ ((UnitPrice*Quantity)- Discount) /* price per order detail */ as decimal(10,2))/* format for cast */ as OrderValue /* name the summary column */ from Orders join [Order details] on Orders.OrderId = [Order details].OrderId group by Orders.OrderId /* group the order details by order */ having sum((UnitPrice*Quantity)- Discount) >10000 /* sale value of order > */
output OrderId OrderValue (14 row(s) affected)
Tables joined to themselves. This generally takes place in a hierarchy. Two commonplace examples are A part is part of a part –E.g. a bulb is part of a light. –A light is part of a front grille. An employee reports to an employee
Example create table part ( PartNo numeric (4,0) unique not null, PartDesc varchar(20), PartOf numeric (4,0), primary key (PartNo), foreign key (PartOf) references Part(PartNo) ) insert into Part (PartNo, PartDesc) values (1, 'Biggest part') insert into Part (PartNo, PartDesc) values (2, 'Quite big part') insert into Part values (3, 'Component part',1) insert into Part values (4,'small part',3) select * from part select * from part as component join part as container on component.partof = container.partno
NorthWind ERD
Summary Queries Summary queries involve multiple rows in a table. The rows are GROUPED, using the GROUP BY clause. A function (MIN, MAX, AVG, SUM, COUNT) operates on one of the columns of the table. –E.g. select count(*) from products. –This will return the number of product ROWS in the table. –Select min(unitprice) from products Include or exclude grouped rows by using the HAVING clause
Summary Summarise the income by product, still ordered in the same way, giving totals for each product and totals for each category. NOTE: –There are two summaries here – one per product and one per category. –Use the ‘WITH CUBE’ clause.
Query Select categoryName, cast(ProductName as char(25)) as Product, cast (sum(([order details].unitPrice*Quantity)-Discount) as decimal(10,2))as income from categories join products on categories.categoryid = products.categoryid join [order details] on products.productid = [order details].productid group by categoryname, productname with cube
categoryName Product income Beverages Chai Beverages Chang Beverages Chartreuse verte Beverages Côte de Blaye Beverages Guaraná Fantástica Beverages Ipoh Coffee Beverages Lakkalikööri Beverages Laughing Lumberjack Lager Beverages Outback Lager Beverages Rhönbräu Klosterbier Beverages Sasquatch Ale Beverages Steeleye Stout Beverages NULL Condiments Aniseed Syrup Condiments Chef Anton's Cajun Season Summary row
With Rollup ROLLUP specifies that, in addition to the usual rows provided by GROUP BY, summary rows are introduced into the result set. Groups are summarized in a hierarchical order, from the lowest level in the group to the highest. The group hierarchy is determined by the order in which the grouping columns are specified. Changing the order of the grouping columns can affect the number of rows produced in the result set.
Example Select categoryname, count(*)as ‘Number of products’ from categories join products on products.categoryId = categories.categoryId group by categoryname with rollup;
Result categoryname Number of products Beverages 12 Condiments 12 Confections 13 Dairy Products 10 Grains/Cereals 7 Meat/Poultry 6 Produce 5 Seafood 12 NULL 77 (9 row(s) affected)
Where and Having The ‘WHERE’ allows filtering on rows. –e.g. Show all order details where the value of the order detail is > €5,000 The ‘HAVING’ allows filtering on summary rows. –e.g. Show all orders where the value of the order is > €10,000
Where Return all orderlines that cost more than €8,000 Use Northwind /* Switches the database*/ Select OrderId, cast(((UnitPrice*Quantity)- Discount) /* price per order detail */ as decimal(10,2)) /* format for cast */ as OrderLineValue /* name the derived column */ from [Order details] where(UnitPrice*Quantity)- Discount >8000 /* where the cost of the row is > 8,000
output OrderId OrderLineValue (8 row(s) affected)
Having Return all Orders that are worth more than €10,000 Use Northwind Select Orders.OrderId, cast(sum((UnitPrice*Quantity)- Discount)as decimal(10,2))as OrderValue from Orders join [Order details] on Orders.OrderId = [Order details].OrderId group by Orders.OrderId having sum((UnitPrice*Quantity)- Discount) >10000
Having – with comments Select Orders.OrderId, /* the order number */ cast/* format the output */ (sum/* sum over the order */ ((UnitPrice*Quantity)- Discount) /* price per order detail */ as decimal(10,2))/* format for cast */ as OrderValue /* name the summary column */ from Orders join [Order details] on Orders.OrderId = [Order details].OrderId group by Orders.OrderId /* group the order details by order */ having sum((UnitPrice*Quantity)- Discount) >10000 /* sale value of order > */
output OrderId OrderValue (14 row(s) affected)