LIS 557 Database Design and Management William Voon Michael Cole Spring '04.


1 LIS 557 Database Design and Management William Voon Michael Cole Spring '04

2 Table Operations to the ER 12 February 2004

3 Going deeper And deeper still Green Mountains - Santoka Taneda (1882 - 1940)

4 A New World ● A database is rather like a world: it contains entities with properties and relationships to other entities ● Last week, we saw that this world consists of individual sets of entities (=tables) and links between them ● Tonight, we will construct a language to talk to this world...

5 What do we need to talk about? ● The database is conceived as related tables, so our language is about tables – It must have words for the elements of tables (row=entity, column=attribute) and of course tables themselves – It must have verbs to express actions on tables ● What are all of the actions we can perform on tables?

6 Operations on Tables ● Ted Codd was a mathematician and his conception of the relational database was in mathematical terms ● Tables break the database 'thing' into logical units (table=entity set=relation) ● We can process tables mathematically

7 The Value of Operations ● Remember that Codd's insight was that the system could present a logical view of the data that hid the complexities of how the data was actually stored and managed ● Remember those teams of programmers needed to construct queries before the RDB? ● The operations on tables are the basis for a language that can be used to construct efficient queries. – It is based on simple set operations (union and intersection), but allows one to build a complete language (SQL) to talk to the database.

8 The Operations ● UNION – combines rows (entities) from two tables ● INTERSECT – produces rows (entities) that are common to two tables ● DIFFERENCE – produces the rows (entities) that appear in the first table but not in the second ● PRODUCT – produces all possible pairs of rows from the two tables ● SELECT – produces the rows (=entities) of a given table that satisfy a condition, with values for all columns (=attributes) ● PROJECT – produces all values for selected columns (=attributes) ● JOIN – combines two or more tables (=entity sets) ● DIVIDE – finds the values of one attribute that are associated with every value of a different specified attribute (wait for the example)

9 UNION Combines the rows of two tables. The tables must have the same attribute names, and matching attributes must have the same data type

10 INTERSECTION Produces only the rows common to both tables. The tables must have the same attribute names, and matching attributes must have the same data type

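11 DIFFERENCE Produces the rows that appear in the first table but not in the second. Unlike UNION and INTERSECTION, the order of the tables matters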
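
The three set operations on slides 9 to 11 can be tried out with Python's built-in sets. This is only a sketch: the tables, names, and cities are made up for illustration, and each table is modelled as a set of (name, city) rows with identical columns.

```python
# Each table is a set of rows; each row is a (name, city) tuple.
# Both tables have the same columns, as the set operations require.
# All data here is hypothetical.
table_a = {("Ann", "Seattle"), ("Bob", "Tacoma"), ("Cho", "Olympia")}
table_b = {("Bob", "Tacoma"), ("Dee", "Spokane")}

union = table_a | table_b           # rows in either table, duplicates collapsed
intersection = table_a & table_b    # rows present in both tables
difference_ab = table_a - table_b   # rows in A that are not in B
difference_ba = table_b - table_a   # rows in B that are not in A

print(sorted(union))
print(sorted(intersection))
print(sorted(difference_ab))
print(sorted(difference_ba))
```

Note that `difference_ab` and `difference_ba` are not the same: DIFFERENCE depends on which table comes first.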

12 PRODUCT The resulting table pairs every row (=entity) of one table with every row of the other. Here that is 3 x 2 = 6 rows, but tables of 20 and 30 rows would give 20 x 30 = 600 rows!
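
A small sketch of PRODUCT, using the standard-library `itertools.product`. The two tables are hypothetical; the point is only that the row count of the result is the product of the two row counts.

```python
from itertools import product

# Hypothetical tables: 3 customers and 2 sales reps.
customers = [("C1", "Ann"), ("C2", "Bob"), ("C3", "Cho")]
reps = [("R1", "Lee"), ("R2", "Kim")]

# PRODUCT: pair every customer row with every rep row and
# concatenate the two rows into one wider row.
pairs = [customer + rep for customer, rep in product(customers, reps)]

for row in pairs:
    print(row)
print(len(pairs), "rows")
```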

13 SELECT Selects entities (=rows) by attribute value. Examples: SELECT SALES greater than 500000; SELECT SALES less than 1000000

14 PROJECT Keeps only the chosen attributes (=columns). Examples: PROJECT NAME; PROJECT NAME and PHONE
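
SELECT and PROJECT can be sketched as two small Python functions over a list of dictionaries. The helper names `select` and `project` and the sample rows are assumptions made for this illustration; only the column names NAME, PHONE, and SALES come from the slides.

```python
# Hypothetical table: one dict per row (=entity), one key per column (=attribute).
rows = [
    {"NAME": "Ann", "PHONE": "555-0101", "SALES": 250000},
    {"NAME": "Bob", "PHONE": "555-0102", "SALES": 750000},
    {"NAME": "Cho", "PHONE": "555-0103", "SALES": 1200000},
]

def select(table, predicate):
    """SELECT: keep whole rows that satisfy the condition."""
    return [row for row in table if predicate(row)]

def project(table, columns):
    """PROJECT: keep only the named columns, dropping duplicate rows."""
    result = []
    for row in table:
        reduced = {col: row[col] for col in columns}
        if reduced not in result:
            result.append(reduced)
    return result

big = select(rows, lambda r: r["SALES"] > 500000)      # SELECT SALES greater than 500000
small = select(rows, lambda r: r["SALES"] < 1000000)   # SELECT SALES less than 1000000
names = project(rows, ["NAME"])                        # PROJECT NAME
contacts = project(rows, ["NAME", "PHONE"])            # PROJECT NAME and PHONE

print(big)
print(small)
print(names)
print(contacts)
```

SELECT cuts the table horizontally (fewer rows, all columns), while PROJECT cuts it vertically (all rows, fewer columns).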

15 JOIN ● The JOIN operator is the real power in database language because it allows us to create new tables ● Two or more tables can be JOINed. Joining more than two tables is really a sequence of joining two tables at a time.

16 JOIN flavors ● Natural JOIN – new table includes only the entities (=rows) whose values match in the attributes common to both tables, and eliminates duplicate attributes (=columns) ● EquiJOIN – new table includes only the entities for which specified attributes in the two tables have equal values; duplicate attributes are not eliminated – If a comparison other than = is used, this is called a thetaJOIN

17 JOIN Flavors (cont) ● Outer JOIN – New table has all entities (=rows) from each of the tables, including those with no match in the other table; blank or null values are inserted for the attributes (=columns) that would have come from the missing match

18 Natural JOIN The result of PRODUCT, then SELECT on CUSNUM (keeps only the rows where the CUSNUM values match), then PROJECT (eliminates the duplicate attribute)
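
The three-step recipe for a natural join (PRODUCT, then SELECT, then PROJECT) can be written out directly. This is a minimal sketch: the function name `natural_join`, the `invoices` table, and all the data are hypothetical; only the shared column name CUSNUM is taken from the slide.

```python
from itertools import product

# Hypothetical tables that share the CUSNUM column.
customers = [
    {"CUSNUM": 1, "NAME": "Ann"},
    {"CUSNUM": 2, "NAME": "Bob"},
    {"CUSNUM": 3, "NAME": "Cho"},
]
invoices = [
    {"CUSNUM": 1, "AMOUNT": 120},
    {"CUSNUM": 1, "AMOUNT": 80},
    {"CUSNUM": 3, "AMOUNT": 45},
]

def natural_join(left, right, key):
    """Natural join on one shared column, built from the three basic steps."""
    result = []
    for l_row, r_row in product(left, right):   # PRODUCT: every pair of rows
        if l_row[key] == r_row[key]:            # SELECT: keep pairs with equal keys
            merged = dict(l_row)
            merged.update(r_row)                # PROJECT: the shared key appears once
            result.append(merged)
    return result

joined = natural_join(customers, invoices, "CUSNUM")
for row in joined:
    print(row)
```

Bob has no invoice, so he does not appear in the result; that is the gap an outer join fills.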

19 Outer JOIN All entities are preserved, and placeholders (nulls) are created for missing attribute values (here Blakey's phone and REPNUM)

20 Left, Right Outer JOIN ● Left outer JOIN – all the entities (=rows) from the first table are included ● Right outer JOIN – all the entities (=rows) from the second table are included ● Full outer JOIN (illustrated on previous slide) – all entities from both tables are included. Exercise: Look at the previous slide and construct the left and right outer joins
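
The three outer joins can be sketched in a few lines of Python, with `None` standing in for null. The function names, the `phones` table, and the data are assumptions for illustration and deliberately differ from the slide, so the exercise above is left open.

```python
# Hypothetical tables that share the CUSNUM column.
customers = [{"CUSNUM": 1, "NAME": "Ann"}, {"CUSNUM": 2, "NAME": "Bob"}]
phones = [{"CUSNUM": 1, "PHONE": "555-0101"}, {"CUSNUM": 3, "PHONE": "555-0103"}]

def left_outer_join(left, right, key, right_columns):
    """Keep every left row; fill the right-hand columns with None when unmatched."""
    result = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[key] == l_row[key]]
        if matches:
            for r_row in matches:
                result.append({**l_row, **r_row})
        else:
            result.append({**l_row, **{col: None for col in right_columns}})
    return result

def right_outer_join(left, right, key, left_columns):
    """A right outer join is a left outer join with the tables swapped."""
    return left_outer_join(right, left, key, left_columns)

def full_outer_join(left, right, key, left_columns, right_columns):
    """Left outer join plus the right rows that found no partner."""
    result = left_outer_join(left, right, key, right_columns)
    left_keys = {row[key] for row in left}
    for r_row in right:
        if r_row[key] not in left_keys:
            result.append({**{col: None for col in left_columns}, **r_row})
    return result

left = left_outer_join(customers, phones, "CUSNUM", ["PHONE"])
right = right_outer_join(customers, phones, "CUSNUM", ["NAME"])
full = full_outer_join(customers, phones, "CUSNUM", ["NAME"], ["PHONE"])
print(left)
print(right)
print(full)
```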

21 DIVIDE As presented here, DIVIDE divides a table of two attributes by a table of one attribute. The result is the value(s) of the other attribute that are paired with every value in the one-attribute table
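
DIVIDE is the least intuitive of the operations, so a worked sketch may help. The function name `divide` and the enrollment data are hypothetical. The two-attribute table is a set of (student, course) pairs, the one-attribute table is a set of courses, and the result is the students paired with every one of those courses.

```python
def divide(pairs, divisor):
    """Return the first-column values that are paired with EVERY divisor value."""
    divisor = set(divisor)
    seen = {}
    for a, b in pairs:
        seen.setdefault(a, set()).add(b)   # collect the b values seen with each a
    return {a for a, b_values in seen.items() if divisor <= b_values}

# Hypothetical two-attribute table: (STUDENT, COURSE).
enrollments = {
    ("Ann", "Databases"), ("Ann", "Cataloging"),
    ("Bob", "Databases"),
    ("Cho", "Databases"), ("Cho", "Cataloging"), ("Cho", "Reference"),
}
# Hypothetical one-attribute table: the courses we are asking about.
required = {"Databases", "Cataloging"}

print(sorted(divide(enrollments, required)))
```

Bob is left out because he is paired with only one of the two required courses; Cho qualifies even though she also takes a third course.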

22 The Data Dictionary ● Documents all of the tables (=entity sets) ● Consists of metadata about the entity types and their attributes

23 Data dictionary example
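
Many database systems can report this kind of metadata themselves. As a rough sketch, assuming SQLite through Python's standard `sqlite3` module, the following builds a tiny data dictionary from the `sqlite_master` catalog and `PRAGMA table_info`. The `painter` table and its columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, created only so there is something to document.
conn.execute(
    "CREATE TABLE painter (painter_num INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# List the tables, then ask SQLite to describe each table's columns.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
dictionary = {}
for table in tables:
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each result row is (cid, name, type, notnull, dflt_value, pk).
    dictionary[table] = [(name, col_type, bool(pk))
                         for _cid, name, col_type, _notnull, _default, pk in columns]

for table, columns in dictionary.items():
    for name, col_type, is_key in columns:
        print(table, name, col_type, "PRIMARY KEY" if is_key else "")
```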

24 Relationships within Relational Databases ● Tables are entity sets ● Relationships between entities tell us: – how many tables (=entity sets) are needed, and – what links (keys) are required ● The types of relationships of interest – One-to-many (1:M) – Many-to-many (M:N) – One-to-one (1:1)

25 Relationship Types and Tables ● One-to-many (1:M) – The goal for the database design; it shows us that a specific entity type (in one table) naturally links to another table of a particular entity type ● Many-to-many (M:N) – Tells us that these entity types need to be examined further to find 1:M relationships ● One-to-One (1:1) – Usually tells us some tables should be combined, but sometimes it is efficient to have tables reflecting 1:1 relationships

26 Entity Relationship Diagram ● Chen convention for ERDs – Rectangles represent entities – Entity names are nouns, written in all capitals and never plural, because these diagrams are about entity types (e.g. PAINTER, PAINTING) – Diamonds represent relationships between entities – Relationships are active or passive verbs, written in lowercase – Connectivity is marked 1, M (many), or N (many; used in many-to-many relationships)

27 One-to-Many A painter can have many paintings, but a painting can have but one painter. What do the table relationships look like?
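
One possible answer, sketched in Python: the "many" side (PAINTING) carries the key of the "one" side (PAINTER) as a foreign key. The column names, titles, and painter names below are made up for illustration.

```python
# The "one" side: each painter appears exactly once.
painters = [
    {"PAINTER_NUM": 1, "NAME": "Painter A"},
    {"PAINTER_NUM": 2, "NAME": "Painter B"},
]
# The "many" side: each painting stores the key of its single painter.
paintings = [
    {"PAINTING_NUM": 101, "TITLE": "Harbor at Dawn", "PAINTER_NUM": 1},
    {"PAINTING_NUM": 102, "TITLE": "Green Hills", "PAINTER_NUM": 1},
    {"PAINTING_NUM": 103, "TITLE": "Still Life", "PAINTER_NUM": 2},
]

def titles_by(painter_num):
    """Follow the link from one painter to that painter's many paintings."""
    return [p["TITLE"] for p in paintings if p["PAINTER_NUM"] == painter_num]

def painter_of(painting_num):
    """Follow the link from a painting back to its one painter."""
    painting = next(p for p in paintings if p["PAINTING_NUM"] == painting_num)
    return next(a["NAME"] for a in painters if a["PAINTER_NUM"] == painting["PAINTER_NUM"])

print(titles_by(1))
print(painter_of(103))
```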

28 Many-to-many Each CLASS has many students, and each STUDENT has many classes. This can be reflected directly in a table relationship (How?), but it is usually more efficient to break it into two 1:M relationships

29 Many-to-many Tables Why is this inefficient?

30 Bridging from M:N to 1:M To break down an M:N relationship, create a new entity that bridges between the two entities. So, from STUDENT to CLASS, build a new table with a link (=key) to STUDENT and a link to CLASS. This is efficient because only the key information is redundant. These tables are called composite or bridge entities
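
A minimal sketch of a bridge table, using SQLite through Python's standard `sqlite3` module. The table name `enroll`, the column names, and the data are assumptions for this example; the slides name only the STUDENT and CLASS entities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (stu_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE class (class_id INTEGER PRIMARY KEY, title TEXT);
-- The bridge table holds nothing but the two keys.
CREATE TABLE enroll (
    stu_id INTEGER REFERENCES student(stu_id),
    class_id INTEGER REFERENCES class(class_id),
    PRIMARY KEY (stu_id, class_id)
);
INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO class VALUES (10, 'Database Design'), (20, 'Cataloging');
INSERT INTO enroll VALUES (1, 10), (1, 20), (2, 10);
""")

# Two joins walk from the bridge out to each side of the M:N relationship.
rows = conn.execute("""
SELECT student.name, class.title
FROM enroll
JOIN student ON student.stu_id = enroll.stu_id
JOIN class ON class.class_id = enroll.class_id
ORDER BY student.name, class.title
""").fetchall()

for name, title in rows:
    print(name, "takes", title)
```

STUDENT to ENROLL and CLASS to ENROLL are each 1:M, and student names and class titles are stored once each; only the keys repeat.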

31 Bridging Tables

32 The Bridging Process

33 Data Redundancy Revisited ● Last meeting we saw how to use linked tables to reduce data redundancy ● The links (foreign keys) are attributes that appear in both tables. ● While the foreign key values are often repeated in a table, they save us from having to repeat the data for several attributes.

34 Indexes ● To speed access to data, RDBs will typically include special index tables ● Each index entry has a key and one or more pointers: it matches a key value (for example, the primary key that identifies a particular entity) with the locations of the related rows, which may be in another table

35 An Index Example The primary key for each painter has pointers to quickly access the paintings in the linked table
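
The idea can be sketched with a Python dictionary acting as the index: each painter key maps to the positions of that painter's rows in the paintings table. The data and the helper name `paintings_for` are hypothetical, and a real RDB would use a more elaborate on-disk structure than a dictionary.

```python
from collections import defaultdict

# Hypothetical linked table; PAINTER_NUM is the foreign key.
paintings = [
    {"TITLE": "Harbor at Dawn", "PAINTER_NUM": 1},
    {"TITLE": "Still Life", "PAINTER_NUM": 2},
    {"TITLE": "Green Hills", "PAINTER_NUM": 1},
]

# Build the index once: key -> list of pointers (here, row positions).
index = defaultdict(list)
for position, row in enumerate(paintings):
    index[row["PAINTER_NUM"]].append(position)

def paintings_for(painter_num):
    """Jump straight to the matching rows instead of scanning the whole table."""
    return [paintings[position] for position in index.get(painter_num, [])]

print(dict(index))
print([row["TITLE"] for row in paintings_for(1)])
```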

36 Assignment ● Design a collection of tables for a database ● Briefly, what is the database about? Who are the users? What purposes will the database serve? – At least three related tables – Explain why this is the most efficient design – Is this also the most logical design?

