SQL
SQL Introduction Standard language for querying and manipulating data Structured Query Language Many standards out there: ANSI SQL, SQL92 (a.k.a. SQL2), SQL99 (a.k.a. SQL3), …. Vendors support various subsets(e.g. PLSQL from Oracle) The original full name was SEQUEL, which stood for "Structured English Query Language". It later had to be renamed to SQL due to trademark issues. So basically, it was yet another attempt to sell a programming language as "just like English, except with a formal syntax" (hence "structured").
SQL Data Definition Language (DDL) Create/alter/delete tables and their attributes Data Manipulation Language (DML) Query one or more tables – discussed next ! Insert/delete/modify tuples in tables Connect to mysql on dbsrv2 mysql -h dbsrv2.cs.fsu.edu -u username -p
Tables in SQL Table name Attribute names Product PName Price Category Manufacturer Gizmo $19.99 Gadgets GizmoWorks Powergizmo $29.99 SingleTouch $149.99 Photography Canon MultiTouch $203.99 Household Hitachi Tuples or rows
Create a Tables CREATE TABLE table_name ( column_name1 data_type(size), column_name2 data_type(size), column_name3 data_type(size), .... );
Tables Explained The schema of a table is the table name and its attributes: Product(PName, Price, Category, Manfacturer) A key is an attribute whose values are unique; we underline a key Alternate key - An alternate key is any candidate key which is not selected to be the primary key Candidate key - A candidate key is a field or combination of fields that can act as a primary key field for that table to uniquely identify each record in that table. Compound key - compound key (also called a composite key or concatenated key) is a key that consists of 2 or more attributes. Primary key - a primary key is a value that can be used to identify a unique row in a table. Attributes are associated with it. Examples of primary keys are Social Security numbers (associated to a specific person) or ISBNs (associated to a specific book). In the relational model of data, a primary key is a candidate key chosen as the main method of uniquely identifying a tuple in a relation. Superkey - A superkey is defined in the relational model as a set of attributes of a relation variable (relvar) for which it holds that in all relations assigned to that variable there are no two distinct tuples (rows) that have the same values for the attributes in this set. Equivalently a superkey can also be defined as a set of attributes of a relvar upon which all attributes of the relvar are functionally dependent. Foreign key - a foreign key (FK) is a field or group of fields in a database record that points to a key field or group of fields forming a key of another database record in some (usually different) table. Usually a foreign key in one table refers to the primary key (PK) of another table. This way references can be made to link information together and it is an essential part of database normalization
Data Types in SQL Atomic types: Characters: CHAR(20), VARCHAR(50) Numbers: INTEGER(INT), FLOAT, DOUBLE, BIGINT Others: DATE, DATETIME, BLOB, … Data types might have different names in different database! A Binary Large OBject (BLOB) is a collection of binary data stored as a single entity
Tables Explained A tuple = a record Restriction: all attributes are of atomic type A table = a set of tuples Like a list… …but it is unorderd: no first(), no next(), no last().
SQL QUERY Basic form: (plus many many more bells and whistles) SELECT <attributes> FROM <one or more relations> WHERE <conditions>
SELECT * FROM Product WHERE category=‘Gadgets’ Simple SQL QUERY Product PName Price Category Manufacturer Gizmo $19.99 Gadgets GizmoWorks Powergizmo $29.99 SingleTouch $149.99 Photography Canon MultiTouch $203.99 Household Hitachi SELECT * FROM Product WHERE category=‘Gadgets’ PName Price Category Manufacturer Gizmo $19.99 Gadgets GizmoWorks Powergizmo $29.99 “selection”
Simple SQL QUERY Product PName Price Category Manufacturer Gizmo $19.99 Gadgets GizmoWorks Powergizmo $29.99 SingleTouch $149.99 Photography Canon MultiTouch $203.99 Household Hitachi SELECT PName, Price, Manufacturer FROM Product WHERE Price > 100 PName Price Manufacturer SingleTouch $149.99 Canon MultiTouch $203.99 Hitachi “selection” and “projection”
SQL Details Case insensitive: Same: SELECT Select select Same: Product product Different: ‘Seattle’ ‘seattle’
SELECT * FROM Products WHERE PName LIKE ‘%gizmo%’ The LIKE Operator SELECT * FROM Products WHERE PName LIKE ‘%gizmo%’ LIKE p: pattern matching on strings p may contain two special symbols: % = any sequence of characters _ = any single character
Eliminating Duplicates Category Gadgets Photography Household SELECT DISTINCT category FROM Product Compare to: Category Gadgets Photography Household SELECT category FROM Product
Ordering the Results SELECT pname, price, manufacturer FROM Product WHERE category=‘gizmo’ AND price > 50 ORDER BY price, pname Ties are broken by the second attribute on the ORDER BY list, etc. Ordering is ascending, unless you specify the DESC keyword.
Keys and Foreign Keys Company Key Product Foreign key CName StockPrice Country GizmoWorks 25 USA Canon 65 Japan Hitachi 15 Key Product Foreign key PName Price Category Manufacturer Gizmo $19.99 Gadgets GizmoWorks Powergizmo $29.99 SingleTouch $149.99 Photography Canon MultiTouch $203.99 Household Hitachi A FOREIGN KEY is a key used to link two tables together. A FOREIGN KEY in a table points to a PRIMARY KEY in another table.
Aggregation SELECT avg(price) FROM Product WHERE maker=“Toyota” SELECT count(*) FROM Product WHERE year > 1995 SQL supports several aggregation operations: sum, count, min, max, avg Except count, all aggregations apply to a single attribute
Aggregation : Count COUNT applies to duplicates, unless otherwise stated: same as Count(*) SELECT Count(category) FROM Product WHERE year > 1995 We probably want: SELECT Count(DISTINCT category) FROM Product WHERE year > 1995
More Examples Purchase(product, date, price, quantity) SELECT Sum(price * quantity) FROM Purchase What do they mean ? SELECT Sum(price * quantity) FROM Purchase WHERE product = ‘bagel’
Purchase Simple Aggregation Product Date Price Quantity Bagel 10/21 1 20 Banana 10/3 0.5 10 10/10 10/25 1.50 SELECT Sum(price * quantity) FROM Purchase WHERE product = ‘bagel’ 50 (= 20+30)
Modifying the Database Three kinds of modifications Insertions Deletions Updates Sometimes they are all called “updates”
Insertions General form: INSERT INTO R(A1,…., An) VALUES (v1,…., vn) Example: Insert a new purchase to the database: INSERT INTO Purchase(buyer, seller, product, store) VALUES (‘Joe’, ‘Fred’, ‘wakeup-clock-espresso-machine’, ‘The Sharper Image’) Missing attribute NULL. May drop attribute names if give them in order.
Insertions INSERT INTO PRODUCT(name) SELECT DISTINCT Purchase.product FROM Purchase WHERE Purchase.date > “10/26/01” The query replaces the VALUES keyword. Here we insert many tuples into PRODUCT
Product(name, listPrice, category) Insertion: an Example Product(name, listPrice, category) Purchase(prodName, buyerName, price) prodName is foreign key in Product.name Suppose database got corrupted and we need to fix it: Purchase Product prodName buyerName price camera John 200 gizmo Smith 80 225 name listPrice category gizmo 100 gadgets Task: insert in Product all prodNames from Purchase
INSERT INTO Product(name) SELECT DISTINCT prodName FROM Purchase Insertion: an Example INSERT INTO Product(name) SELECT DISTINCT prodName FROM Purchase WHERE prodName NOT IN (SELECT name FROM Product) name listPrice category gizmo 100 Gadgets camera -
INSERT INTO Product(name, listPrice) SELECT DISTINCT prodName, price Insertion: an Example INSERT INTO Product(name, listPrice) SELECT DISTINCT prodName, price FROM Purchase WHERE prodName NOT IN (SELECT name FROM Product) name listPrice category gizmo 100 Gadgets camera 200 - camera ?? 225 ?? Depends on the implementation
Deletions Example: DELETE FROM PURCHASE WHERE seller = ‘Joe’ AND product = ‘Brooklyn Bridge’ Factoid about SQL: there is no way to delete only a single occurrence of a tuple that appears twice in a relation.
Updates Example: UPDATE PRODUCT SET price = price/2 WHERE Product.name IN (SELECT product FROM Purchase WHERE Date =‘Oct, 25, 1999’);
Alter-Add ALTER TABLE table_name ADD column_name datatype
Alter- Drop ALTER TABLE table_name DROP column_name