Digital recordkeeping and preservation I
Databases: SQL
ARK2100 Digital recordkeeping and preservation I, 2016
Thomas Sødring
thomas.sodring@hioa.no
P48-R407
67238287

Structured Query Language (SQL)
SQL is a database language designed for managing data in relational databases.
Strictly speaking, SQL includes DML, DDL and DCL commands.

SELECT * FROM Car;

Car
registrationNr  chassisNr  colour  manufacturer  model
LH12984         10946534   Red     Volkswagen    Golf
DK23491         9648573    Blue    Toyota        Yaris
BP12349         5523840    Green   Skoda         Fabia
ZT97495         2643923    White   Seat          Leon
SELECT Attribute(s) FROM Relation;
SQL Examples
SELECT * FROM Car;
SELECT registrationNr FROM Car;
SELECT registrationNr, colour FROM Car;
SELECT * FROM Airport;
SELECT IntCode FROM Airport;
SELECT * FROM Building;

The previous examples gave us all the rows in the table, but you often only want a specific row or field in the database.
To filter we use WHERE.
SELECT Attribute(s) FROM Relation WHERE Attribute = 'value';
SELECT with WHERE
SELECT * FROM Car WHERE colour = 'red';
SELECT * FROM Car WHERE colour = 'red' AND model = 'Golf';
SELECT registrationNr, colour FROM Car WHERE registrationNr = 'ZH10045';
SELECT name FROM Airport WHERE country = 'Norway';
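The WHERE examples above can be tried out with a small sketch. It assumes Python's built-in sqlite3 module with an in-memory database as a stand-in for the MySQL server used in the course, loaded with the four Car rows from the first slide. SQLite compares strings with = case-sensitively, whereas MySQL's default collations are typically case-insensitive, so the sketch writes 'Red' exactly as it is stored.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL Car table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Car (registrationNr TEXT PRIMARY KEY, chassisNr INTEGER,"
    " colour TEXT, manufacturer TEXT, model TEXT)"
)
conn.executemany(
    "INSERT INTO Car VALUES (?, ?, ?, ?, ?)",
    [
        ("LH12984", 10946534, "Red", "Volkswagen", "Golf"),
        ("DK23491", 9648573, "Blue", "Toyota", "Yaris"),
        ("BP12349", 5523840, "Green", "Skoda", "Fabia"),
        ("ZT97495", 2643923, "White", "Seat", "Leon"),
    ],
)

# One condition: only rows whose colour is 'Red' are returned.
red_cars = conn.execute(
    "SELECT registrationNr, colour FROM Car WHERE colour = 'Red'"
).fetchall()

# Two conditions joined with AND: both must hold for a row to be returned.
red_golfs = conn.execute(
    "SELECT registrationNr FROM Car WHERE colour = 'Red' AND model = 'Golf'"
).fetchall()

# The value can also be passed as a parameter instead of being typed into the SQL.
model_of_dk = conn.execute(
    "SELECT model FROM Car WHERE registrationNr = ?", ("DK23491",)
).fetchall()

print(red_cars)
print(red_golfs)
print(model_of_dk)
```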
SQL wildcards
SQL supports wildcards for pattern matching, which widens a search so that more rows can match.
They make the search 'fuzzy'.
SQL uses % and _ as wildcards, together with LIKE.
% matches any sequence of zero or more characters at that position.
%land - Ireland, England, Finland, Iceland
_ matches exactly one character at that position.
_an - van, ban, man
This can be useful if you only remember part of a string.
The equivalents of '%' and '_' in operating system terminals are '*' and '?'.

Wildcard examples
Search for any registration number that begins with LH:
LIKE 'LH%'
Search for any registration number that contains the number '12':
LIKE '%12%'
Search for any registration number that begins with 'L' and where the number section is '12345':
LIKE 'L_12345'

SELECT with wildcards
SELECT * FROM Car WHERE registrationNr LIKE 'LH%';
SELECT * FROM Car WHERE registrationNr LIKE '%12%';
SELECT * FROM Car WHERE registrationNr LIKE 'L_12345';
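As a rough illustration of the wildcard queries, the sketch below runs LIKE patterns against the four registration numbers from the Car table. It assumes Python's sqlite3 with an in-memory database in place of MySQL; the last pattern is adapted to 'L_12984' so that it matches a row that actually exists in the sample data.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Car (registrationNr TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO Car VALUES (?)",
    [("LH12984",), ("DK23491",), ("BP12349",), ("ZT97495",)],
)


def matching(pattern):
    """Return the registration numbers that match a LIKE pattern, sorted."""
    rows = conn.execute(
        "SELECT registrationNr FROM Car WHERE registrationNr LIKE ?"
        " ORDER BY registrationNr",
        (pattern,),
    )
    return [row[0] for row in rows]


begins_with_lh = matching("LH%")      # % = any sequence of characters
contains_12 = matching("%12%")        # '12' anywhere in the string
l_then_any_char = matching("L_12984") # _ = exactly one character

print(begins_with_lh, contains_12, l_then_any_char)
```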
SQL DISTINCT
From the data in your tables, you may want a list of unique or distinct values in an attribute / table.
The Car table will have many cars from many different manufacturers: Volkswagen, Skoda, Renault.
Create a list of all the distinct (unique) car manufacturers that are in the database.
Imagine trying to do this manually.
This is a very good example of why we digitise.

SELECT with DISTINCT
Create a list of the different car manufacturers:
SELECT DISTINCT manufacturer FROM Car;
Create a list of the different car models:
SELECT DISTINCT model FROM Car;
List of manufacturers and models?
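One possible answer to the question about manufacturers and models is to put both attributes after DISTINCT, so that each manufacturer / model combination is listed once. The sketch below assumes Python's sqlite3 as a stand-in for MySQL; the last two Car rows are invented for the example so that there are duplicates to remove.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Car (registrationNr TEXT PRIMARY KEY, manufacturer TEXT, model TEXT)"
)
conn.executemany(
    "INSERT INTO Car VALUES (?, ?, ?)",
    [
        ("LH12984", "Volkswagen", "Golf"),
        ("DK23491", "Toyota", "Yaris"),
        ("BP12349", "Skoda", "Fabia"),
        ("ZT97495", "Seat", "Leon"),
        ("LH55555", "Volkswagen", "Golf"),  # invented row: a second Golf
        ("DK77777", "Volkswagen", "Polo"),  # invented row: another Volkswagen
    ],
)

# Each manufacturer appears once, however many cars it has.
manufacturers = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT manufacturer FROM Car ORDER BY manufacturer"
    )
]

# DISTINCT applies to the whole row, so each manufacturer/model pair appears once.
pairs = conn.execute(
    "SELECT DISTINCT manufacturer, model FROM Car ORDER BY manufacturer, model"
).fetchall()

print(manufacturers)
print(pairs)
```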
SELECT and LIMIT
You can limit the result set and specify the maximum number of rows you want back.
You can also limit the result set by ranges, e.g. ignore the first two rows and fetch the next 5.
SELECT * FROM Car LIMIT 3;
SELECT * FROM Car LIMIT 2,5;
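The two forms of LIMIT can be checked with a short sketch. It assumes Python's sqlite3 in place of MySQL (SQLite accepts the same LIMIT offset,count form) and uses eight invented registration numbers. An ORDER BY is added only so that "the first rows" is well defined in the example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Car (registrationNr TEXT PRIMARY KEY)")
# Eight invented registration numbers: AA00001 ... AA00008.
conn.executemany(
    "INSERT INTO Car VALUES (?)", [(f"AA{i:05d}",) for i in range(1, 9)]
)


def registrations(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# At most three rows.
first_three = registrations(
    "SELECT registrationNr FROM Car ORDER BY registrationNr LIMIT 3"
)
# Skip the first two rows, then return the next five.
skip_two_take_five = registrations(
    "SELECT registrationNr FROM Car ORDER BY registrationNr LIMIT 2,5"
)
# The same range written with the OFFSET keyword.
with_offset = registrations(
    "SELECT registrationNr FROM Car ORDER BY registrationNr LIMIT 5 OFFSET 2"
)

print(first_three)
print(skip_two_take_five)
```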
SELECT with ORDER BY
Remember that a database will retrieve data a row at a time.
The DBMS returns results in the same order as it retrieves data from the table, and without ORDER BY that order is not guaranteed.
Sometimes it is desirable to sort these results.

SELECT Attribute(s) FROM Relation WHERE Attribute = 'value' ORDER BY Attribute [ ASC | DESC ] ;

SELECT with ORDER BY
SELECT * FROM Car WHERE colour = 'red' LIMIT 20;
SELECT * FROM Car ORDER BY chassisNr ASC LIMIT 20;
SELECT * FROM Car ORDER BY model DESC LIMIT 20;
SELECT * FROM Car WHERE colour = 'red' ORDER BY manufacturer LIMIT 20;
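To make the effect of ASC and DESC concrete, the sketch below sorts the four Car rows from the first slide. It assumes Python's sqlite3 as a stand-in for MySQL and stores chassisNr as an integer so that it sorts numerically.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Car (registrationNr TEXT PRIMARY KEY, chassisNr INTEGER, model TEXT)"
)
conn.executemany(
    "INSERT INTO Car VALUES (?, ?, ?)",
    [
        ("LH12984", 10946534, "Golf"),
        ("DK23491", 9648573, "Yaris"),
        ("BP12349", 5523840, "Fabia"),
        ("ZT97495", 2643923, "Leon"),
    ],
)

# Ascending: smallest chassis number first.
by_chassis = [
    row[0]
    for row in conn.execute(
        "SELECT registrationNr FROM Car ORDER BY chassisNr ASC"
    )
]

# Descending on a text column, cut off after two rows.
# ORDER BY is applied before LIMIT, so these are the two last models alphabetically.
last_two_models = [
    row[0]
    for row in conn.execute("SELECT model FROM Car ORDER BY model DESC LIMIT 2")
]

print(by_chassis)
print(last_two_models)
```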
SQL GROUP BY
A DBMS supports a number of built-in calculation functions.
These can also be used in conjunction with the GROUP BY clause:
AVG ()
MAX ()
MIN ()
COUNT ()
MySQL supports far more than these statistics.

SQL GROUP BY
GROUP BY can answer questions like:
How many Cars are there of each colour?
How many 'Volkswagen Golf' are there of each colour?
SELECT colour, COUNT( * ) AS 'Total' FROM Car GROUP BY colour;
SELECT colour, COUNT( * ) AS 'Golf colours' FROM Car WHERE model = 'Golf' GROUP BY colour;

Crime Scene Investigators
A red car with LH in the registration number was observed driving away from the scene of an accident.
Witnesses claim the car was either a Volkswagen Golf or Renault Megane.
With your advanced database skills, can you help the police narrow down the search and provide a list of potential cars?
The solution can be several SQL commands.
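A small worked example may help show what GROUP BY returns: one row per group, with COUNT(*) giving the size of each group. The sketch assumes Python's sqlite3 in place of MySQL; the last three Car rows are invented so that some colours occur more than once.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Car (registrationNr TEXT PRIMARY KEY, colour TEXT, model TEXT)"
)
conn.executemany(
    "INSERT INTO Car VALUES (?, ?, ?)",
    [
        ("LH12984", "Red", "Golf"),
        ("DK23491", "Blue", "Yaris"),
        ("BP12349", "Green", "Fabia"),
        ("ZT97495", "White", "Leon"),
        ("LH55555", "Red", "Golf"),   # invented row
        ("DK77777", "Blue", "Golf"),  # invented row
        ("BP88888", "Red", "Polo"),   # invented row
    ],
)

# How many cars are there of each colour?
per_colour = conn.execute(
    "SELECT colour, COUNT(*) AS total FROM Car GROUP BY colour ORDER BY colour"
).fetchall()

# How many Golfs are there of each colour? WHERE filters rows before grouping.
golf_per_colour = conn.execute(
    "SELECT colour, COUNT(*) AS golf_colours FROM Car"
    " WHERE model = 'Golf' GROUP BY colour ORDER BY colour"
).fetchall()

print(per_colour)
print(golf_per_colour)
```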
CSI solution
SELECT * FROM Car WHERE registrationNr LIKE 'LH%' AND colour = 'red' AND model = 'Golf';
SELECT * FROM Car WHERE registrationNr LIKE 'LH%' AND colour = 'red' AND model = 'Megane';
or
SELECT * FROM Car WHERE registrationNr LIKE 'LH%' AND colour = 'red' AND (model = 'Megane' OR model = 'Golf');
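The parentheses in the single-query solution matter, because AND binds more tightly than OR. The sketch below compares the query with and without them. It assumes Python's sqlite3 in place of MySQL, and all rows except LH12984 are invented suspects and non-suspects for the example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Car (registrationNr TEXT PRIMARY KEY, colour TEXT, model TEXT)"
)
conn.executemany(
    "INSERT INTO Car VALUES (?, ?, ?)",
    [
        ("LH12984", "Red", "Golf"),    # matches the witness description
        ("LH20001", "Red", "Megane"),  # invented: matches
        ("LH30002", "Blue", "Golf"),   # invented: wrong colour
        ("DK40003", "Red", "Golf"),    # invented: wrong registration
        ("LH50004", "Red", "Yaris"),   # invented: wrong model
    ],
)


def registrations(where_clause):
    """Return sorted registration numbers for a given WHERE clause."""
    sql = f"SELECT registrationNr FROM Car WHERE {where_clause} ORDER BY registrationNr"
    return [row[0] for row in conn.execute(sql)]


# With parentheses: LH... AND Red AND (Megane OR Golf).
suspects = registrations(
    "registrationNr LIKE 'LH%' AND colour = 'Red'"
    " AND (model = 'Megane' OR model = 'Golf')"
)

# Without parentheses this reads as (LH... AND Red AND Megane) OR Golf,
# so every Golf is returned regardless of colour or registration number.
too_many = registrations(
    "registrationNr LIKE 'LH%' AND colour = 'Red'"
    " AND model = 'Megane' OR model = 'Golf'"
)

print(suspects)
print(too_many)
```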
null
Null has a special meaning in a database.
It means no value has been assigned.
In phpMyAdmin, a blank field that has null will typically be denoted with the word NULL in italics.
You can also explicitly state that a field is blank in an insert statement.
But null and 'null' are NOT the same thing.
Using 'null' results in an assignment of the textual string 'null' to a field and as such is not null.

null
So we use null to say that we do not have a value.
The following is correct:
INSERT INTO Car VALUES ( "RQ10000", "9815932215", null, null, "A3" );
The following is incorrect:
INSERT INTO Car VALUES ( "RQ10000", "9815932215", "null", "null", "A3" );
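The difference between null and the text 'null' shows up as soon as you query the data. The sketch below assumes Python's sqlite3 in place of MySQL and inserts both variants. The second registration number, RQ10001, is invented so that the two rows can coexist. Note that a null is found with IS NULL, and that a comparison with = NULL never matches.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Car (registrationNr TEXT PRIMARY KEY, chassisNr TEXT,"
    " colour TEXT, manufacturer TEXT, model TEXT)"
)
# A real null: no colour or manufacturer has been assigned.
conn.execute("INSERT INTO Car VALUES ('RQ10000', '9815932215', NULL, NULL, 'A3')")
# The four-letter text 'null': a value HAS been assigned (invented second row).
conn.execute("INSERT INTO Car VALUES ('RQ10001', '9815932216', 'null', 'null', 'A3')")

# IS NULL finds the row where no value was assigned.
really_null = conn.execute(
    "SELECT registrationNr FROM Car WHERE colour IS NULL"
).fetchall()

# Comparing with the string finds the row that stores the text 'null'.
text_null = conn.execute(
    "SELECT registrationNr FROM Car WHERE colour = 'null'"
).fetchall()

# "= NULL" is never true, so this returns no rows at all.
equals_null = conn.execute(
    "SELECT registrationNr FROM Car WHERE colour = NULL"
).fetchall()

print(really_null, text_null, equals_null)
```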
SQL and JOINs
So far we have explored SQL where the data is contained within a single table.
This really only corresponds to flat files and does not reflect the relational model with data across tables.
Data in tables can be joined in different ways:
Inner / Natural join
Right outer join
Left outer join
Full outer join
Anti join

SQL and JOINs
We introduce two new tables, Employee and Department.

Employee
empNr  firstname  surname  department
1      John       Smith    Sales
2      Jim        Hansen   Finance
3      Frank      Burton   Sales
4      Mary       Abbot    Finance

Department
department  manager
Sales       Celine
Admin       Oliver
Finance     George

Employee.department is a foreign key that refers to Department.department.

SQL INNER JOIN
A SQL INNER JOIN between two tables should return all rows that match on a common attribute.
SELECT * FROM Employee INNER JOIN Department ON Employee.department = Department.department;
SQL INNER JOIN
[Diagram: Employee and Department]

⋈ SQL INNER JOIN
In the join examples, Mary's department is Shipping, which has no matching row in Department.

Employee
empNr  firstname  surname  department
1      John       Smith    Sales
2      Jim        Hansen   Finance
3      Frank      Burton   Sales
4      Mary       Abbot    Shipping

Department
department  manager
Sales       Celine
Admin       Oliver
Finance     George

Employee ⋈ Department
empNr  firstname  surname  department  manager
1      John       Smith    Sales       Celine
2      Jim        Hansen   Finance     George
3      Frank      Burton   Sales       Celine
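The inner join result in the diagram can be reproduced with a short sketch. It assumes Python's sqlite3 as a stand-in for MySQL, and uses the Employee and Department rows from the diagram, with Mary in Shipping.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Department (department TEXT PRIMARY KEY, manager TEXT);
    CREATE TABLE Employee (empNr INTEGER PRIMARY KEY, firstname TEXT,
                           surname TEXT, department TEXT);
    INSERT INTO Department VALUES
        ('Sales', 'Celine'), ('Admin', 'Oliver'), ('Finance', 'George');
    INSERT INTO Employee VALUES
        (1, 'John', 'Smith', 'Sales'), (2, 'Jim', 'Hansen', 'Finance'),
        (3, 'Frank', 'Burton', 'Sales'), (4, 'Mary', 'Abbot', 'Shipping');
    """
)

# Only rows with a match on department in BOTH tables are returned:
# Mary (Shipping) and the Admin department both drop out.
inner = conn.execute(
    "SELECT empNr, firstname, Employee.department, manager"
    " FROM Employee INNER JOIN Department"
    " ON Employee.department = Department.department"
    " ORDER BY empNr"
).fetchall()

for row in inner:
    print(row)
```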
SQL NATURAL JOIN
A SQL NATURAL JOIN is like an INNER JOIN but the common column is not repeated.
It joins on columns with the same name and same data type.
SELECT * FROM Employee NATURAL JOIN Department;

SQL LEFT OUTER JOIN
A SQL LEFT OUTER JOIN between two tables, A and B, should return all rows in A and all rows in B that match rows in A on a common attribute.
SELECT * FROM Employee LEFT OUTER JOIN Department ON Employee.department = Department.department;
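One way to see that the common column is not repeated is to count the columns each join returns. The sketch below assumes Python's sqlite3 in place of MySQL and the Employee and Department rows from the join diagrams; it only compares the number of columns and rows, since column naming can differ between database systems.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Department (department TEXT PRIMARY KEY, manager TEXT);
    CREATE TABLE Employee (empNr INTEGER PRIMARY KEY, firstname TEXT,
                           surname TEXT, department TEXT);
    INSERT INTO Department VALUES
        ('Sales', 'Celine'), ('Admin', 'Oliver'), ('Finance', 'George');
    INSERT INTO Employee VALUES
        (1, 'John', 'Smith', 'Sales'), (2, 'Jim', 'Hansen', 'Finance'),
        (3, 'Frank', 'Burton', 'Sales'), (4, 'Mary', 'Abbot', 'Shipping');
    """
)

# INNER JOIN with SELECT *: department appears once per table.
inner_cursor = conn.execute(
    "SELECT * FROM Employee INNER JOIN Department"
    " ON Employee.department = Department.department"
)
inner_columns = [column[0] for column in inner_cursor.description]
inner_rows = inner_cursor.fetchall()

# NATURAL JOIN with SELECT *: the shared department column appears only once.
natural_cursor = conn.execute("SELECT * FROM Employee NATURAL JOIN Department")
natural_columns = [column[0] for column in natural_cursor.description]
natural_rows = natural_cursor.fetchall()

print(len(inner_columns), inner_columns)
print(len(natural_columns), natural_columns)
```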
SQL LEFT OUTER JOIN
[Diagram: Employee and Department]

⟕ SQL LEFT OUTER JOIN

Employee
empNr  firstname  surname  department
1      John       Smith    Sales
2      Jim        Hansen   Finance
3      Frank      Burton   Sales
4      Mary       Abbot    Shipping

Department
department  manager
Sales       Celine
Admin       Oliver
Finance     George

Employee ⟕ Department (the department column shown is Department.department)
empNr  firstname  surname  department  manager
1      John       Smith    Sales       Celine
2      Jim        Hansen   Finance     George
3      Frank      Burton   Sales       Celine
4      Mary       Abbot    null        null
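The left outer join result can be reproduced as follows. The sketch assumes Python's sqlite3 in place of MySQL and the rows from the diagram. Python shows a database null as None.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Department (department TEXT PRIMARY KEY, manager TEXT);
    CREATE TABLE Employee (empNr INTEGER PRIMARY KEY, firstname TEXT,
                           surname TEXT, department TEXT);
    INSERT INTO Department VALUES
        ('Sales', 'Celine'), ('Admin', 'Oliver'), ('Finance', 'George');
    INSERT INTO Employee VALUES
        (1, 'John', 'Smith', 'Sales'), (2, 'Jim', 'Hansen', 'Finance'),
        (3, 'Frank', 'Burton', 'Sales'), (4, 'Mary', 'Abbot', 'Shipping');
    """
)

# Every Employee row is kept. Where no Department row matches (Mary, Shipping),
# the Department columns are filled with null.
left = conn.execute(
    "SELECT empNr, firstname, Department.department, manager"
    " FROM Employee LEFT OUTER JOIN Department"
    " ON Employee.department = Department.department"
    " ORDER BY empNr"
).fetchall()

for row in left:
    print(row)
```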
SQL RIGHT OUTER JOIN
A SQL RIGHT OUTER JOIN between two tables, A and B, should return all rows in B and all rows in A that match rows in B on a common attribute.
SELECT * FROM Employee RIGHT OUTER JOIN Department ON Employee.department = Department.department;
SQL RIGHT OUTER JOIN
[Diagram: Employee and Department]

⟖ SQL RIGHT OUTER JOIN

Employee
empNr  firstname  surname  department
1      John       Smith    Sales
2      Jim        Hansen   Finance
3      Frank      Burton   Sales
4      Mary       Abbot    Shipping

Department
department  manager
Sales       Celine
Admin       Oliver
Finance     George

Employee ⟖ Department (the department column shown is Department.department)
empNr  firstname  surname  department  manager
1      John       Smith    Sales       Celine
2      Jim        Hansen   Finance     George
3      Frank      Burton   Sales       Celine
null   null       null     Admin       Oliver
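A RIGHT OUTER JOIN of Employee with Department returns the same rows as a LEFT OUTER JOIN with the two tables swapped. The sketch below uses the swapped form because it assumes Python's sqlite3 as a stand-in for MySQL, and older SQLite versions have no RIGHT JOIN keyword. In MySQL the RIGHT OUTER JOIN from the slide can be used directly.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Department (department TEXT PRIMARY KEY, manager TEXT);
    CREATE TABLE Employee (empNr INTEGER PRIMARY KEY, firstname TEXT,
                           surname TEXT, department TEXT);
    INSERT INTO Department VALUES
        ('Sales', 'Celine'), ('Admin', 'Oliver'), ('Finance', 'George');
    INSERT INTO Employee VALUES
        (1, 'John', 'Smith', 'Sales'), (2, 'Jim', 'Hansen', 'Finance'),
        (3, 'Frank', 'Burton', 'Sales'), (4, 'Mary', 'Abbot', 'Shipping');
    """
)

# "Employee RIGHT OUTER JOIN Department" written as
# "Department LEFT OUTER JOIN Employee": every Department row is kept,
# and Admin, which has no employees, gets nulls in the Employee columns.
right = conn.execute(
    "SELECT empNr, firstname, Department.department, manager"
    " FROM Department LEFT OUTER JOIN Employee"
    " ON Employee.department = Department.department"
    " ORDER BY Department.department, empNr"
).fetchall()

for row in right:
    print(row)
```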
SQL FULL OUTER JOIN
[Diagram: Employee and Department]
A SQL FULL OUTER JOIN between two tables, A and B, should return all rows in A and all rows in B. Rows that match on a common attribute are combined, and rows without a match are filled with null.

SQL FULL OUTER JOIN
MySQL does not support a FULL OUTER JOIN.
It can be emulated by combining a left and right outer join:
SELECT * FROM Employee LEFT OUTER JOIN Department ON Employee.department = Department.department
UNION
SELECT * FROM Employee RIGHT OUTER JOIN Department ON Employee.department = Department.department;
⟗ SQL FULL OUTER JOIN

Employee
empNr  firstname  surname  department
1      John       Smith    Sales
2      Jim        Hansen   Finance
3      Frank      Burton   Sales
4      Mary       Abbot    Shipping

Department
department  manager
Sales       Celine
Admin       Oliver
Finance     George

Employee ⟗ Department (the department column shown is Department.department)
empNr  firstname  surname  department  manager
1      John       Smith    Sales       Celine
2      Jim        Hansen   Finance     George
3      Frank      Burton   Sales       Celine
4      Mary       Abbot    null        null
null   null       null     Admin       Oliver
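The UNION emulation of a full outer join can be sketched as follows. It assumes Python's sqlite3 in place of MySQL, and writes the right outer join half as a left outer join with the tables swapped so that it also runs on older SQLite versions. UNION removes the duplicate rows that both halves have in common.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Department (department TEXT PRIMARY KEY, manager TEXT);
    CREATE TABLE Employee (empNr INTEGER PRIMARY KEY, firstname TEXT,
                           surname TEXT, department TEXT);
    INSERT INTO Department VALUES
        ('Sales', 'Celine'), ('Admin', 'Oliver'), ('Finance', 'George');
    INSERT INTO Employee VALUES
        (1, 'John', 'Smith', 'Sales'), (2, 'Jim', 'Hansen', 'Finance'),
        (3, 'Frank', 'Burton', 'Sales'), (4, 'Mary', 'Abbot', 'Shipping');
    """
)

# First half keeps all employees, second half keeps all departments.
# Matching rows occur in both halves and are merged by UNION.
full = conn.execute(
    "SELECT empNr, firstname, Department.department, manager"
    " FROM Employee LEFT OUTER JOIN Department"
    " ON Employee.department = Department.department"
    " UNION "
    "SELECT empNr, firstname, Department.department, manager"
    " FROM Department LEFT OUTER JOIN Employee"
    " ON Employee.department = Department.department"
).fetchall()

for row in full:
    print(row)
```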
SQL ANTI JOIN
A SQL anti join between two tables, A and B, is the set of all rows in A that do not have a matching row in B on a common attribute.
It can answer the following question: which Employee has no Manager / Department?
SELECT * FROM Employee WHERE ( department ) NOT IN ( SELECT department FROM Department );
▷ SQL ANTI JOIN

Employee
empNr  firstname  surname  department
1      John       Smith    Sales
2      Jim        Hansen   Finance
3      Frank      Burton   Sales
4      Mary       Abbot    Shipping

Department
department  manager
Sales       Celine
Admin       Oliver
Finance     George

Employee ▷ Department
firstname
Mary
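The anti join can be written in more than one way. The sketch below runs the NOT IN query from the slide and, as an alternative not taken from the slides, a LEFT OUTER JOIN that keeps only the rows where no Department matched. It assumes Python's sqlite3 in place of MySQL and the rows from the diagram.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Department (department TEXT PRIMARY KEY, manager TEXT);
    CREATE TABLE Employee (empNr INTEGER PRIMARY KEY, firstname TEXT,
                           surname TEXT, department TEXT);
    INSERT INTO Department VALUES
        ('Sales', 'Celine'), ('Admin', 'Oliver'), ('Finance', 'George');
    INSERT INTO Employee VALUES
        (1, 'John', 'Smith', 'Sales'), (2, 'Jim', 'Hansen', 'Finance'),
        (3, 'Frank', 'Burton', 'Sales'), (4, 'Mary', 'Abbot', 'Shipping');
    """
)

# Variant 1: the subquery form from the slide.
not_in = conn.execute(
    "SELECT * FROM Employee"
    " WHERE department NOT IN (SELECT department FROM Department)"
).fetchall()

# Variant 2: left outer join, then keep the rows whose Department side is null.
left_is_null = conn.execute(
    "SELECT Employee.* FROM Employee LEFT OUTER JOIN Department"
    " ON Employee.department = Department.department"
    " WHERE Department.department IS NULL"
).fetchall()

print(not_in)
print(left_is_null)
```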
Views
A view is a stored SQL query that looks like a table and becomes a sort of virtual table.
A view can be used to:
define a subset of tables / data
link data from multiple tables (JOIN)
You can limit a user's access to a view.
Views can be specified with read or write access to data.

View with read access
A view that combines Employee and Department into one table / view:
CREATE VIEW AllData AS
SELECT empNr, firstname, surname, Employee.department, manager
FROM Employee, Department
WHERE Employee.department = Department.department;
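Once the view exists, it can be queried like any other table. The sketch below creates AllData with the statement from the slide and then selects from it. It assumes Python's sqlite3 in place of MySQL and the Employee and Department rows from the join diagrams.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Department (department TEXT PRIMARY KEY, manager TEXT);
    CREATE TABLE Employee (empNr INTEGER PRIMARY KEY, firstname TEXT,
                           surname TEXT, department TEXT);
    INSERT INTO Department VALUES
        ('Sales', 'Celine'), ('Admin', 'Oliver'), ('Finance', 'George');
    INSERT INTO Employee VALUES
        (1, 'John', 'Smith', 'Sales'), (2, 'Jim', 'Hansen', 'Finance'),
        (3, 'Frank', 'Burton', 'Sales'), (4, 'Mary', 'Abbot', 'Shipping');

    -- The view stores the query, not a copy of the data.
    CREATE VIEW AllData AS
        SELECT empNr, firstname, surname, Employee.department, manager
        FROM Employee, Department
        WHERE Employee.department = Department.department;
    """
)

# The view is used in FROM exactly like a table; WHERE and ORDER BY still work.
celines_staff = conn.execute(
    "SELECT firstname, manager FROM AllData"
    " WHERE manager = 'Celine' ORDER BY empNr"
).fetchall()

# New rows in the underlying tables show up in the view automatically.
conn.execute("INSERT INTO Employee VALUES (5, 'Anna', 'Berg', 'Admin')")  # invented row
row_count = conn.execute("SELECT COUNT(*) FROM AllData").fetchone()[0]

print(celines_staff)
print(row_count)
```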
Index
If you search for data in a particular attribute (column) and the table holds large amounts of data, the search can take considerable time (seconds or more).
There are 2.9 million cars in Norway, and if you search for chassisNr then you potentially have to search through all 2.9 million rows.
For example, the car you are looking for might be in row number 2,899,999.
A DBMS supports indexes that make it significantly faster to find a specific row.
Depending on database size, the difference can be from minutes or seconds down to under one second.
The primary key is indexed automatically.
Each index increases the size of the database on disk.
CREATE INDEX (ALTER TABLE)
ALTER TABLE Car ADD INDEX chassis_index (chassisNr); # or ADD UNIQUE
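Whether a query actually uses an index can be checked by asking the DBMS for its query plan. The sketch below assumes Python's sqlite3 as a stand-in for MySQL, so it uses SQLite's EXPLAIN QUERY PLAN and the CREATE INDEX statement instead of MySQL's ALTER TABLE ... ADD INDEX. The exact wording of the plan text differs between SQLite versions, so the sketch only looks for the index name in it.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Car (registrationNr TEXT PRIMARY KEY, chassisNr INTEGER,"
    " colour TEXT, manufacturer TEXT, model TEXT)"
)

QUERY = "SELECT * FROM Car WHERE chassisNr = 10946534"


def query_plan(sql):
    """Return the descriptive text of SQLite's plan for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " ".join(row[-1] for row in rows)


# Without an index on chassisNr the whole table has to be scanned.
plan_before = query_plan(QUERY)

# SQLite equivalent of: ALTER TABLE Car ADD INDEX chassis_index (chassisNr);
conn.execute("CREATE INDEX chassis_index ON Car (chassisNr)")

# With the index, the plan searches the index instead of scanning every row.
plan_after = query_plan(QUERY)

print(plan_before)
print(plan_after)
```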