Relational Theory for Computer Professionals
a technical tutorial by C. J. Date

Copyright © C. J. Date 2013. All rights reserved. No part of this material may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photographic, or otherwise, without the explicit written permission of the copyright owner.

INTENDED AUDIENCE :
I assume you know something about computers and programming in general
I do NOT assume you know anything about databases (relational or otherwise) …
  though you very likely do know that essentially all modern DB systems are supposed to be “relational” /* whatever that means! */

Objectives:
To explain relational database (RDB) basics
More specifically, to describe the relational model /* theoretical underpinnings for RDB technology */

INTENDED AUDIENCE bis :
To repeat: I do NOT assume you know anything about databases (relational or otherwise)
In fact, if you do already know something about databases, please pay extra careful attention! …
  Because you might have to do some unlearning …

ANOTHER PRELIMINARY :
There’s inevitably a certain amount of overlap between this seminar and its companion seminar SQL and Relational Theory
But unlike this seminar, SQL and Relational Theory is explicitly aimed at audiences who do know something about databases; in fact, audiences that have several years of direct SQL experience
To repeat, the present seminar is for (database) beginners

OVERALL AGENDA :
Part I : Foundations
Entr’acte
Part II : SQL

PART I : Foundations

WHAT’S A DATABASE ???
“Electronic filing cabinet”: Digitized information (“data”), kept in persistent storage, typically magnetic disk
User /* online user or application programmer */ can insert new information & delete, change, or retrieve existing information
How? By issuing requests or commands to the DB system
Requests can be formulated in numerous different ways (e.g., by pointing and clicking) … but for our purposes I’ll assume text format …
  E.g., EMP WHERE JOB = 'Programmer'

THE SUPPLIERS-AND-PARTS DATABASE :

S
SNO  SNAME  STATUS  CITY
S1   Smith  20      London
S2   Jones  10      Paris
S3   Blake  30      Paris
S4   Clark  20      London
S5   Adams  30      Athens

SP
SNO  PNO  QTY
S1   P1   300
S1   P2   200
S1   P3   400
S1   P4   200
S1   P5   100
S1   P6   100
S2   P1   300
S2   P2   400
S3   P2   200
S4   P2   200
S4   P4   300
S4   P5   400

P
PNO  PNAME  COLOR  WEIGHT  CITY
P1   Nut    Red    12.0    London
P2   Bolt   Green  17.0    Paris
P3   Screw  Blue   17.0    Oslo
P4   Screw  Red    14.0    London
P5   Cam    Blue   12.0    Paris
P6   Cog    Red    19.0    London

DB SYSTEM ARCHITECTURE :
suppliers, parts, and shipments (“logical DB”)
        |
DB management system (DBMS)
        |
data as physically stored (“physical DB”)

IN OTHER WORDS :
“Logical DB” is the DB as perceived by the user (an abstraction) ... hence data independence
“Physical DB” is the DB as perceived by the DBMS (still an abstraction!)

SO WHAT’S A DBMS ???
Intermediary between logical and physical DBs …
Supports user interface
Interprets and responds to (“executes”) user requests ... both queries and updates
Protects users from data (low level details)
Protects data from users! …
  Provides security,* concurrency,* integrity, and recovery controls
* Such controls are necessary because DB is typically shared

Security: Users can perform only those operations they’re allowed to perform
Concurrency: Operations performed by one user aren’t allowed to interfere with those performed “at the same time” by some other user
Integrity: Operations that are known to be “incorrect” aren’t allowed
Recovery: The database never forgets anything it’s been told

BY THE WAY :
Please note: I do not use the term database to mean a DBMS !!!

SO WHAT’S AN RDBMS ???
Well, it’s a DBMS …
  So it provides all of the usual DBMS functionality (data storage, query / update, security, concurrency, integrity, recovery, etc.)
But it’s also relational …
  So the user interface is based on the relational model /* better: is a faithful implementation of the relational model */
I.e., the relational model is a recipe for what the user interface is supposed to look like

SIMPLIFY , SIMPLIFY :
To repeat: The relational model is a recipe for what the user interface is supposed to look like ...
And that recipe is very simple !!!
The rules of the relational model are not a straitjacket, but rather a discipline that makes life much easier for the user
“Our life is frittered away by detail ... Simplify, simplify.” (Henry David Thoreau)

THE RELATIONAL MODEL* :
As far as the user is concerned:
Data looks relational
Relational operators are available (for operating on data in relational form and hence for formulating user requests):
  e.g., S WHERE CITY = 'London' /* restrict suppliers to just the ones in London */
* E. F. Codd : 1969, 1970, …

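For readers who think more easily in a mainstream programming language, the restriction S WHERE CITY = 'London' can be approximated in Python as below. This is only a rough sketch, not part of the relational model or of the tutorial's own notation: the names Supplier and restrict, the lowercase field names, and the choice to model a relation as a frozenset of rows are all assumptions made for illustration. The rows are those of the suppliers table S in the suppliers-and-parts database.

```python
from typing import Callable, FrozenSet, NamedTuple


class Supplier(NamedTuple):
    """One row of the suppliers table S (field names are illustrative)."""
    sno: str
    sname: str
    status: int
    city: str


# S modeled as a set of rows: no duplicate rows and no inherent ordering.
S: FrozenSet[Supplier] = frozenset({
    Supplier("S1", "Smith", 20, "London"),
    Supplier("S2", "Jones", 10, "Paris"),
    Supplier("S3", "Blake", 30, "Paris"),
    Supplier("S4", "Clark", 20, "London"),
    Supplier("S5", "Adams", 30, "Athens"),
})


def restrict(relation: FrozenSet[Supplier],
             condition: Callable[[Supplier], bool]) -> FrozenSet[Supplier]:
    """Keep only the rows for which the condition holds (hypothetical helper)."""
    return frozenset(row for row in relation if condition(row))


# Rough counterpart of: S WHERE CITY = 'London'
london = restrict(S, lambda s: s.city == "London")
print(sorted(row.sno for row in london))
```

Note that the result is again a set of rows of the same shape as the input, so it could be fed into a further restrict call.
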
SO HERE’S THE PLAN :
Data looks relational
  So we’ll take a closer look at exactly what RELATIONS are
Relational operators are available
  So we’ll take a closer look at various RELATIONAL OPERATORS and see how they can be used
Please note: The treatment is not meant to be exhaustive !!!

UNFORTUNATELY ...
There’s a problem!
The relational model as such doesn’t prescribe a concrete syntax for how its concepts are supposed to be realized in practice /* i.e., it’s somewhat abstract */
There is a standard concrete language, viz. SQL, which is supported (more or less) by most current DB products
But that standard is very bad …
  It’s complex, incomplete, hard to learn, and actively misleading in numerous ways

SO HERE’S THE PLAN bis :
First, teach the relational model without SQL …
  using a language called Tutorial D that has been expressly designed for the purpose
Second, show how ideas from the relational model map into SQL syntax …
  NOT attempting to teach the whole of the SQL language, only as much as we need
Be aware, therefore, that what’s coming up does NOT look much like most “RDB intro” books or presentations !!!

PROGRAMMING 101 :

    VAR i , n , sum INTEGER ;
    VAR A ARRAY[1..n] OF INTEGER ;
    sum := 0 ;
    i := 0 ;
    DO WHILE i < n ;
       i := i + 1 ;
       sum := sum + A[i] ;
    END DO ;
    PRINT ( 'The sum is ', sum ) ;

/* nine STATEMENTS */

POINTS TO NOTE :
Types (sets of legal values): INTEGER, array of INTEGERs, CHAR
Variables: i, n, sum, A
Assignment: “:=” ... for updating variables
Literals: 0, 1, 'The sum is '
Values: Denoted by expressions /* possibly literals or variable refs */
“Read-only” operators: “+” ... for deriving “new” values from “old”
Comparison operators: “<” ... /* special case of previous */
All of these concepts are directly relevant to databases!

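The PROGRAMMING 101 fragment can be rendered in Python roughly as follows, with each concept from the list above marked in a comment. This is a sketch under stated assumptions: the sample contents of A are invented (the original never populates the array), the variable sum is renamed total to avoid hiding Python's built-in function of that name, and the loop indexes from 0 because Python lists are 0-based whereas the original array runs from 1 to n.

```python
from typing import List

# Types: int and List[int] stand in for INTEGER and ARRAY OF INTEGER.
A: List[int] = [3, 1, 4, 1, 5]   # assumed sample data, not from the original
n: int = len(A)                  # variable n, given a value here

total: int = 0                   # assignment; 0 is a literal
i: int = 0

while i < n:                     # "<" is a comparison operator
    total = total + A[i]         # "+" is a read-only operator: it derives a new value
    i = i + 1                    # assignment updates the variable i

print("The sum is", total)       # 'The sum is' is a character string literal
```
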
MORE ON TYPES :
Types: INTEGER, array of INTEGERs, CHAR, BOOLEAN
Example didn’t illustrate the point, but types can be either system defined or user defined /* of arbitrary complexity */ …
  E.g., in a geometric application, we might have user defined types POINT, RECTANGLE, ELLIPSE, etc.
But to the user who merely uses them (as opposed to the user who defines them), user defined types look just like system defined types anyway
All of these concepts are directly relevant to databases!

FURTHERMORE :
Every value is of some type …
Hence: Every variable, every parameter to every operator, every read-only operator, and every expression (in particular, every literal and every variable reference) is declared to be of some type …
  because these constructs all denote values
All of these concepts are directly relevant to databases!

FINALLY :
Associated with type T is a set of operators for operating on values and variables of type T …
E.g., system defined type INTEGER:
  System defines “:=”, “=”, “<”, etc., for assigning and comparing integers
  And “+”, “*”, etc., for arithmetic on integers
  Perhaps CAST to convert integers to character strings
  But not “||”, SUBSTR, etc.

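The same idea shows up in most programming languages. The Python sketch below is an analogy only: it uses Python's built-in int in place of INTEGER and str() in place of CAST, and the variable names are invented for the example. One caveat about the analogy: Python spells string concatenation "+" rather than "||", so the last step shows that the operator is rejected when one operand is an int and the other a string.

```python
status = 20  # a value of the built-in type int

# Operators defined for int: assignment, comparison, arithmetic.
raised = status + 5
doubled = status * 2
is_small = status < 30

# Explicit conversion to a character string, in the spirit of CAST.
as_text = str(status)

# Concatenation is a string operator; mixing an int and a str is rejected.
try:
    mixed = status + " points"
    concat_allowed = True
except TypeError:
    concat_allowed = False

print(raised, doubled, is_small, as_text, concat_allowed)
```
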
NOTE THAT :
Operators defined for type T must include “:=” and “=” !!!
Regarding “=” in particular:
  v1 = v2 is TRUE if and only if v1 and v2 are the very same value ...
  Hence if Op(v1) ≠ Op(v2) for some operator Op, v1 = v2 must be FALSE

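The rule about “=” can be illustrated with a small user defined type. The Python sketch below is hypothetical: the type Point (standing in for a user defined type such as POINT), the operator x_of, and the sample coordinates are all assumptions made for this example. A frozen dataclass is used because its generated equality compares values component by component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Hypothetical user defined type; equality compares x and y."""
    x: float
    y: float


def x_of(p: Point) -> float:
    """A read-only operator on Point: returns the x coordinate."""
    return p.x


v1 = Point(1.0, 2.0)
v2 = Point(1.0, 2.0)   # the very same value as v1
v3 = Point(1.5, 2.0)   # a different value

# v1 = v2 is TRUE, so every operator must give the same result for both.
same_value = (v1 == v2)
same_result = (x_of(v1) == x_of(v2))

# x_of distinguishes v1 from v3, so v1 = v3 must be FALSE.
op_differs = (x_of(v1) != x_of(v3))
values_equal = (v1 == v3)

print(same_value, same_result, op_differs, values_equal)
```
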
EXERCISES : /* and/or review questions ... */
What’s a database?
What’s a DBMS?
What do you understand by (a) security controls, (b) integrity controls, (c) concurrency controls, (d) recovery controls?
What’s SQL?
What’s Tutorial D?

EXERCISES (cont.) :
What do you think the following Tutorial D expressions represent?
    P WHERE WEIGHT < 12.5
    P { PNO , COLOR , CITY }
What do you think the following Tutorial D statements do?
    DELETE S WHERE STATUS = 10 ;
    UPDATE S WHERE STATUS > 10 : { STATUS := STATUS + 5 } ;

EXERCISES (cont.) :
Explain the following:
  Type
  Value
  Variable
  Literal
  Assignment
  Comparison
  Read-only operator
  Update operator