Fundamentals/ICY: Databases 2013/14 WEEK 9 – Monday John Barnden Professor of Artificial Intelligence School of Computer Science University of Birmingham, UK
Class Test: ANSWERS – see site Marking – by end of term
Reminder
Two Other Properties
• Union distributes over intersection: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
• Intersection distributes over union: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
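The two distributive laws can be spot-checked by brute force. The sketch below is only an illustration, not a proof: it tries every triple of subsets of a small universe {1, 2, 3}, which is an arbitrary choice, and the helper name powerset is made up for this example.

```python
from itertools import combinations


def powerset(universe):
    """Return every subset of `universe` as a frozenset (hypothetical helper)."""
    items = list(universe)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


subsets = powerset({1, 2, 3})  # 8 subsets, so 8 * 8 * 8 = 512 triples below

# Union distributes over intersection: A | (B & C) == (A | B) & (A | C)
union_over_intersection = all(
    a | (b & c) == (a | b) & (a | c)
    for a in subsets for b in subsets for c in subsets
)

# Intersection distributes over union: A & (B | C) == (A & B) | (A & C)
intersection_over_union = all(
    a & (b | c) == (a & b) | (a & c)
    for a in subsets for b in subsets for c in subsets
)

print(union_over_intersection, intersection_over_union)
```

Passing on a three-element universe is evidence only; the laws themselves hold for all sets.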
New
Same Difference?
• Exercises for bath-time: Is the difference operation commutative or associative? And does it take part in any distributivity with the other operations?
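One way to explore the exercise is to try the candidate laws on concrete sets. The sets A, B and C below are arbitrary choices made for this sketch. A single failing case is enough to refute a law, but a single passing case does not prove one.

```python
A = {1, 2, 3}
B = {2, 3, 4}
C = {3}

# Commutativity would require A - B == B - A.
difference_commutes_here = (A - B == B - A)

# Associativity would require (A - B) - C == A - (B - C).
left = (A - B) - C    # {1} - {3}
right = A - (B - C)   # {1, 2, 3} - {2, 4}
difference_associates_here = (left == right)

# One candidate distributive law: intersection over difference.
intersection_case = (A & (B - C) == (A & B) - (A & C))

# Another candidate: union over difference.
union_case = (A | (B - C) == (A | B) - (A | C))

print(difference_commutes_here, difference_associates_here)
print(intersection_case, union_case)
```

Any candidate that survives here still needs an argument that it holds for all sets.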
“Tuples” in a Table
The tuples are just lists representing the rows:
< ‘A’, ‘Chopples’, 37 >
< ‘Z’, ‘Blurp’, NULL >
< ‘F’, ‘Rumpel’, 88 >

People
PERS-ID   NAME       AGE
A         Chopples   37
Z         Blurp      NULL
F         Rumpel     88
Table Rows are “Tuples”
• In a table, each attribute has a “value domain”: the set of values that the attribute can have. E.g., the set of integers, the set of all character strings of any length, or the set of character strings of a specific format and length.
• If the attribute allows NULL values, we include NULL in the value domain as well.
• The values in a row form a tuple of values from the respective value domains. Just a list of the values, one for each attribute.
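A rough sketch of the idea, under stated assumptions: each value domain is modelled as a membership test, Python's None stands in for NULL, and the function names (is_string, is_nullable_int, row_fits_domains) are invented for this illustration.

```python
def is_string(value):
    """Domain: character strings of any length."""
    return isinstance(value, str)


def is_nullable_int(value):
    """Domain: integers, with NULL (modelled here as None) included."""
    if value is None:
        return True
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


# One domain per attribute, in attribute order: PERS-ID, NAME, AGE.
domains = (is_string, is_string, is_nullable_int)


def row_fits_domains(row, domains):
    """True if `row` has one value per attribute, each from its domain."""
    return len(row) == len(domains) and all(
        in_domain(value) for in_domain, value in zip(domains, row)
    )


print(row_fits_domains(("A", "Chopples", 37), domains))
print(row_fits_domains(("Z", "Blurp", None), domains))
print(row_fits_domains(("F", "Rumpel", "eighty-eight"), domains))
```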
Tuples in General
• A “tuple” in general is an ordered sequence of items of any sort. We will only deal with finite tuples. Items CAN be duplicated.
  ◦ Can also be called a “vector” in other CS terminology.
• Notation by angle brackets and commas: <6, JAB, 5, “JAB”, 5, , 9>
  Singleton and empty tuples: <x>, <>
• The concatenation (∘) of two tuples is just the result of putting them end to end to get one tuple.
  ◦ <a, b> ∘ <c, d> = <a, b, c, d>
  ◦ <a, b> ∘ <> = <a, b>
• Ex: is concatenation commutative? associative?
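Python's built-in tuples behave like the finite tuples described here, with + playing the role of concatenation. The values below are illustrative choices for this sketch, and one example can only suggest an answer to the exercise rather than settle it.

```python
s = (6, "JAB", 5)
t = ("JAB", 5)       # items may repeat within and across tuples
u = (9,)             # singleton tuple (note the trailing comma)
empty = ()           # empty tuple

concatenated = s + t                  # put the two tuples end to end
unchanged = s + empty                 # concatenating <> changes nothing

commutes_here = (s + t == t + s)                   # order of operands
associates_here = ((s + t) + u == s + (t + u))     # grouping of operands

print(concatenated)
print(commutes_here, associates_here)
```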
Cartesian Products
• The set of all possible tuples formed by taking one item from each set in some list of sets, in order, is called the Cartesian product of the sets.
  Notation, e.g.: D × E × F × G × H if D, E, F, G, H are the sets (not necessarily different).
  The tuples are all possible tuples of the form <d, e, f, g, h> where d ∈ D, e ∈ E, …, h ∈ H.
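Examples
• Let A = {3, 8, 2} and B = {‘jjj’, ‘bb’}. Then
  A × B = {<3, ‘jjj’>, <3, ‘bb’>, <8, ‘jjj’>, <8, ‘bb’>, <2, ‘jjj’>, <2, ‘bb’>}.
  B × B = {<‘jjj’, ‘jjj’>, <‘jjj’, ‘bb’>, <‘bb’, ‘jjj’>, <‘bb’, ‘bb’>}.
  A × ∅ = ∅ = ∅ × A
  A × {TRUE} = {<3, TRUE>, <8, TRUE>, <2, TRUE>}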
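For finite sets these products can be computed directly. The sketch below uses the standard-library itertools.product; the wrapper name cartesian is an assumption made for this example.

```python
from itertools import product


def cartesian(*sets):
    """Return the Cartesian product of finite sets as a set of tuples."""
    return set(product(*sets))


A = {3, 8, 2}
B = {"jjj", "bb"}

a_times_b = cartesian(A, B)          # 3 * 2 = 6 tuples
b_times_b = cartesian(B, B)          # the same set may appear twice
a_times_empty = cartesian(A, set())  # nothing to pick from the empty set
a_times_true = cartesian(A, {True})

print(len(a_times_b), len(b_times_b), len(a_times_empty), len(a_times_true))
```

The size of a finite product is the product of the sizes of its sets, which is why a single empty set empties the whole product.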
“Relations”
• Any subset at all of a Cartesian product is called a relation on the sets in question (D, E, … above)
  ◦ even the whole of the product (even if infinite)
  ◦ and even the empty set.
• I.e., a relation on D, E, …, H is just some set of tuples that are each of the form <d, e, …, h> where d ∈ D, e ∈ E, …, h ∈ H.
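Examples
• Let A = {3, 8, 2} and B = {‘jjj’, ‘bb’}. The Cartesian product A × B = {<3, ‘jjj’>, <3, ‘bb’>, <8, ‘jjj’>, <8, ‘bb’>, <2, ‘jjj’>, <2, ‘bb’>}.
• Some relations on A and B:
  ◦ {<3, ‘jjj’>, <8, ‘bb’>, <2, ‘bb’>}
  ◦ {<8, ‘jjj’>}
  ◦ A × B
  ◦ ∅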
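Since a relation is just a subset of the Cartesian product, membership can be tested with a subset check. This is a sketch for finite sets only; the function name is_relation_on and the candidate sets are made up for illustration.

```python
from itertools import product


def is_relation_on(candidate, *sets):
    """True if every tuple in `candidate` lies in the product of `sets`."""
    return set(candidate) <= set(product(*sets))


A = {3, 8, 2}
B = {"jjj", "bb"}

some_pairs = {(3, "jjj"), (8, "bb")}
whole_product = set(product(A, B))
empty_relation = set()
not_a_relation = {(3, "zzz")}        # 'zzz' is not in B

print(is_relation_on(some_pairs, A, B))
print(is_relation_on(whole_product, A, B))
print(is_relation_on(empty_relation, A, B))
print(is_relation_on(not_a_relation, A, B))
```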
Rows as forming a Relation
• So, for a given table, the tuples corresponding to all possible rows that you could create using whatever values you like from the value domains form the Cartesian product of the value domains of the table.
• And, provided the table does not have repeated rows: AT ANY MOMENT the actual set of rows, considered as tuples, is a relation on the table’s value domains.
  ◦ NB: it is crucial here that no row is exactly repeated, because a mathematical set cannot have repeated elements.
Relation from a Table
The relation at the moment is
{ < ‘A’, ‘Chopples’, 37 >,
  < ‘Z’, ‘Blurp’, NULL >,
  < ‘F’, ‘Rumpel’, 88 > }

People
PERS-ID   NAME       AGE
A         Chopples   37
Z         Blurp      NULL
F         Rumpel     88
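The step from rows to a relation can be sketched as follows, assuming rows are held as Python tuples and None stands in for NULL. The function name rows_as_relation is invented for this example.

```python
def rows_as_relation(rows):
    """Return the rows as a set of tuples, refusing tables with repeated rows."""
    relation = set(rows)
    if len(relation) != len(rows):
        # A set silently drops duplicates, so the rows would not form a set.
        raise ValueError("table has repeated rows")
    return relation


people_rows = [
    ("A", "Chopples", 37),
    ("Z", "Blurp", None),   # None models NULL
    ("F", "Rumpel", 88),
]

relation_now = rows_as_relation(people_rows)
print(len(relation_now))

# Adding a row gives a different set, i.e. a different relation.
relation_later = rows_as_relation(people_rows + [("Q", "Snork", 12)])
print(relation_now == relation_later)
```

The extra row ("Q", "Snork", 12) is a made-up value used only to show that the relation changes when the table does.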
A Table as a Relation?
• People loosely talk about tables being relations. This is mathematically inaccurate for several reasons:
  1) The table properly speaking includes not just the rows but also the attribute names themselves, their domains, specification of primary and foreign keys, etc.
  2) It’s only the rows at any given moment that form a relation. When a value in the table changes or a row is added or deleted, the mathematical relation is replaced by a different one.
  3) Relations do not cater for tables with repeated rows. ((ASIDE: But see next slide for a way out.))
• But OK if you know what you (and those people) mean.
((ASIDE: “Bags” in Maths))
• A variant of sets called “bags” (or “multisets”) is used in maths (and CS) and allows repeated members. There are union, etc. operations that respect the repetitions.
• So bags and their operations are a better fit to DB tables and notably their repetition-respecting operations (e.g. UNION ALL) than sets and their operations are.
• But bags are non-standard and they’re not normally covered at an introductory level.
• See the databases textbook by Garcia-Molina et al. (2009) for bags and their use in the DB area.
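Python's collections.Counter is one readily available bag implementation, so the contrast with sets can be sketched with it. The single-column rows below are hypothetical, and treating Counter addition as the analogue of UNION ALL is this sketch's reading of "repetition-respecting union".

```python
from collections import Counter

# Two bags of rows; 'Chopples' occurs twice in the first one.
bag_one = Counter({("Chopples",): 2, ("Blurp",): 1})
bag_two = Counter({("Chopples",): 1, ("Rumpel",): 1})

# Counter addition adds multiplicities, much as UNION ALL keeps every row.
bag_union_all = bag_one + bag_two

# Converting to sets first discards the repetitions, as a set union does.
set_union = set(bag_one) | set(bag_two)

print(bag_union_all[("Chopples",)])    # multiplicity in the combined bag
print(sum(bag_union_all.values()))     # total number of rows kept
print(len(set_union))                  # number of distinct rows
```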