LIS 557 Database Design and Management William Voon Michael Cole Spring '04.

Table Operations to the ER 12 February 2004

Going deeper
And deeper still
Green Mountains
- Santoka Taneda (1882 - 1940)

A New World
● A database is rather like a world: it contains entities with properties and relationships to other entities
● Last week, we saw that this world consists of individual sets of entities (=tables) and links between them
● Tonight, we will construct a language to talk to this world...

What do we need to talk about?
● The database is conceived as related tables, so our language is about tables
  – It must have words for the elements of tables (row=entity, column=attribute) and, of course, for tables themselves
  – It must have verbs to express actions on tables
● What are all of the actions we can perform on tables?

Operations on Tables
● Ted Codd was a mathematician, and his conception of the relational database was in mathematical terms
● Tables break the database 'thing' into logical units (table=entity set=relation)
● We can process tables mathematically

The Value of Operations
● Remember that Codd's insight was that the system could present a logical view of the data that hid the complexities of how the data was actually stored and managed
● Remember those teams of programmers needed to construct queries before the RDB?
● The operations on tables are the basis for a language that can be used to construct efficient queries
  – It is based on simple set operations (such as union and intersection), but allows one to build a complete language (SQL) to talk to the database

The Operations
● UNION – combines rows (entities) from two tables
● INTERSECT – produces rows (entities) that are common to two tables
● DIFFERENCE – produces the rows (entities) of the first table that do not appear in the second
● PRODUCT – produces all possible pairs of rows from the two tables
● SELECT – produces the rows (entities) of a given table that satisfy a condition, with all of their columns (=attributes)
● PROJECT – produces all values for selected columns (=attributes)
● JOIN – combines two or more tables (=entity sets)
● DIVIDE – finds the values of one attribute that are associated with every one of a specified set of values in a different attribute (wait for the example)

UNION
Combines all rows from two tables that have the same attributes; corresponding attributes must have the same data type

INTERSECTION
Yields only the rows that appear in both tables; the tables must have the same attributes, with the same data types

DIFFERENCE
Yields the rows of the first table that do not appear in the second, so the order of the tables matters
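
UNION, INTERSECTION and DIFFERENCE map directly onto set operations, so they can be sketched with Python sets. This is only an illustration: the two tables and their (NAME, PHONE) rows are made up, and each table is modeled as a set of tuples with the same attributes in the same order.

```python
# Hypothetical tables: each row is a (NAME, PHONE) tuple, each table a set of rows.
table_a = {("Adams", "555-0101"), ("Baker", "555-0102"), ("Chen", "555-0103")}
table_b = {("Chen", "555-0103"), ("Diaz", "555-0104")}

union = table_a | table_b          # every row from either table, duplicates collapsed
intersection = table_a & table_b   # rows present in both tables
difference = table_a - table_b     # rows of table_a that are not in table_b

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Swapping the operands of the difference (`table_b - table_a`) gives a different result, which is why the order of the tables matters for DIFFERENCE but not for UNION or INTERSECTION.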

PRODUCT
Resulting table has every possible combination of rows (= entities). So here 3x2=6 rows, but tables of 20 and 30 rows give 20x30=600 rows!
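
The row count of a PRODUCT can be checked with a short sketch. The two tables below are hypothetical (three customers, two sales reps); `itertools.product` from the Python standard library enumerates every pairing.

```python
from itertools import product

# Hypothetical tables as lists of rows (tuples).
customers = [("Adams",), ("Baker",), ("Chen",)]
reps = [("R1",), ("R2",)]

# PRODUCT: concatenate every row of the first table with every row of the second.
pairs = [c_row + r_row for c_row, r_row in product(customers, reps)]

print(len(pairs))  # number of rows = len(customers) * len(reps)
for row in pairs:
    print(row)
```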

SELECT
This selects entities (=rows) by the value of an attribute
SELECT SALES greater than 500000
SELECT SALES less than 1000000

PROJECT
PROJECT NAME
PROJECT NAME and PHONE
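
SELECT and PROJECT can be sketched as two small Python functions: SELECT filters rows, PROJECT filters columns. The table contents and the helper names `select` and `project` are assumptions for illustration; only the attribute names NAME, PHONE and SALES come from the slides.

```python
# Hypothetical table; each row is a dict keyed by attribute name.
reps = [
    {"NAME": "Adams", "PHONE": "555-0101", "SALES": 250000},
    {"NAME": "Baker", "PHONE": "555-0102", "SALES": 750000},
    {"NAME": "Chen", "PHONE": "555-0103", "SALES": 1200000},
]

def select(table, condition):
    """SELECT: keep whole rows (all columns) that satisfy the condition."""
    return [row for row in table if condition(row)]

def project(table, columns):
    """PROJECT: keep only the named columns, dropping duplicate rows."""
    result = []
    for row in table:
        reduced = {col: row[col] for col in columns}
        if reduced not in result:
            result.append(reduced)
    return result

over_500k = select(reps, lambda row: row["SALES"] > 500000)   # SELECT SALES greater than 500000
under_1m = select(reps, lambda row: row["SALES"] < 1000000)   # SELECT SALES less than 1000000
names = project(reps, ["NAME"])                               # PROJECT NAME
names_phones = project(reps, ["NAME", "PHONE"])               # PROJECT NAME and PHONE
```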

JOIN
● The JOIN operator is the real power in a database language because it allows us to combine information from linked tables into new tables
● Two or more tables can be JOINed. Joining more than two tables is really a sequence of joins of two tables at a time.

JOIN flavors
● Natural JOIN – new table includes only the entities (=rows) whose values match in the attribute(s) common to both tables, and eliminates duplicate attributes (=columns)
● EquiJOIN – new table includes only the entities for which specified attributes in the two tables have equal values; duplicate attributes are not eliminated
  – If a comparison other than = is used, this is called a thetaJOIN

JOIN Flavors (cont)
● Outer JOIN – new table has all entities (=rows) from each of the tables; blank or null values are inserted where an entity in one table has no matching entity in the other table to supply the remaining attributes (=columns)

Natural JOIN
Result of PRODUCT, then SELECT on matching CUSNUM values (gets common entities), then PROJECT (eliminates duplicate attributes)
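
The three steps of the natural join (PRODUCT, then SELECT, then PROJECT) can be sketched as follows. The tables, their rows and the function name `natural_join` are hypothetical; the sketch also assumes CUSNUM is the only attribute the two tables share, so it takes the join attribute as a parameter instead of discovering common attributes itself.

```python
from itertools import product

# Hypothetical tables that share the CUSNUM attribute.
customers = [
    {"CUSNUM": 1, "NAME": "Adams"},
    {"CUSNUM": 2, "NAME": "Blakey"},
]
orders = [
    {"CUSNUM": 1, "ITEM": "Desk"},
    {"CUSNUM": 1, "ITEM": "Lamp"},
    {"CUSNUM": 3, "ITEM": "Chair"},
]

def natural_join(left, right, key):
    joined = []
    for l_row, r_row in product(left, right):   # step 1: PRODUCT of the two tables
        if l_row[key] == r_row[key]:            # step 2: SELECT rows with matching keys
            # step 3: PROJECT; merging the dicts keeps the shared key column once
            joined.append({**l_row, **r_row})
    return joined

result = natural_join(customers, orders, "CUSNUM")
for row in result:
    print(row)
```

Blakey (no orders) and the order for customer 3 (no such customer) both drop out, because only entities with a match in both tables survive the SELECT step.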

Outer JOIN
All entities are preserved and placeholders created for missing entity attributes (here Blakey's phone and REPNUM)

Left, Right Outer JOIN
● Left outer JOIN – all the entities (=rows) from the first table are included
● Right outer JOIN – all the entities (=rows) from the second table are included
● Full outer JOIN (illustrated on previous slide) – all entities from both the tables are included
Exercise: Look at the previous slide and construct the left and right outer joins
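
A minimal sketch of the three outer joins, using made-up data rather than the tables from the slide. The helper `left_outer_join` and its `right_columns` parameter are assumptions for illustration; `None` plays the role of the null placeholder.

```python
# Hypothetical tables linked by REPNUM.
customers = [
    {"NAME": "Adams", "REPNUM": 10},
    {"NAME": "Blakey", "REPNUM": 30},   # no rep with REPNUM 30 exists
]
reps = [
    {"REPNUM": 10, "PHONE": "555-0110"},
    {"REPNUM": 20, "PHONE": "555-0120"},  # no customer uses rep 20
]

def left_outer_join(left, right, key, right_columns):
    """Keep every row of `left`; pad with None where `right` has no match."""
    result = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[key] == l_row[key]]
        if matches:
            result.extend({**l_row, **r_row} for r_row in matches)
        else:
            result.append({**l_row, **{col: None for col in right_columns}})
    return result

left_join = left_outer_join(customers, reps, "REPNUM", ["PHONE"])
# A right outer join is a left outer join with the tables swapped.
right_join = left_outer_join(reps, customers, "REPNUM", ["NAME"])
# A full outer join keeps the left join plus the unmatched rows of the right join.
full_join = left_join + [row for row in right_join if row not in left_join]
```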

DIVIDE
In its basic form, divides a table of two attributes by a table of one attribute. The result is the value(s) of the other attribute that are paired with every value in the one-attribute table
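
DIVIDE is the least intuitive operation, so a small sketch may help. The data and the function name `divide` are hypothetical: a two-attribute table of (STUDENT, CLASS) pairs is divided by a one-attribute table of classes, and the result is the students who are paired with every class in the divisor.

```python
# Hypothetical two-attribute table (the dividend).
enrollments = [
    {"STUDENT": "Ann", "CLASS": "C1"},
    {"STUDENT": "Ann", "CLASS": "C2"},
    {"STUDENT": "Bob", "CLASS": "C1"},
]
# Hypothetical one-attribute table (the divisor).
required_classes = [{"CLASS": "C1"}, {"CLASS": "C2"}]

def divide(dividend, divisor, keep, match):
    """Return the `keep` values that are paired with every `match` value in the divisor."""
    required = {row[match] for row in divisor}
    seen = {}
    for row in dividend:
        seen.setdefault(row[keep], set()).add(row[match])
    return sorted(value for value, matched in seen.items() if required <= matched)

result = divide(enrollments, required_classes, keep="STUDENT", match="CLASS")
print(result)
```

Bob is excluded because he is paired with C1 only; Ann is paired with both C1 and C2, so she is the common value.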

The Data Dictionary
● Documents all of the tables (=entity sets)
● Consists of metadata about the entity types and their attributes
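
Many database systems can report this kind of metadata themselves. As a rough sketch, assuming SQLite (via Python's standard `sqlite3` module) and a hypothetical PAINTER table, the `PRAGMA table_info` statement lists each attribute with its declared type and whether it is part of the primary key. This is one system's catalog view, not a full data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only to have something to describe.
conn.execute(
    "CREATE TABLE PAINTER (PAINTER_NUM INTEGER PRIMARY KEY, PAINTER_NAME TEXT NOT NULL)"
)

# Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
dictionary = [
    {"attribute": name, "type": col_type, "primary_key": bool(pk)}
    for _cid, name, col_type, _notnull, _default, pk
    in conn.execute("PRAGMA table_info(PAINTER)")
]
for entry in dictionary:
    print(entry)
```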

Data dictionary example

Relationships within Relational Databases
● Tables are entity sets
● Relationships between entities tell us:
  – how many tables (=entity sets) are needed, and
  – what links (keys) are required
● The types of relationships of interest
  – One-to-many (1:M)
  – Many-to-many (M:N)
  – One-to-one (1:1)

Relationship Types and Tables
● One-to-many (1:M)
  – The goal for the database design: it shows us that a specific entity type (in one table) naturally links to another table of a particular entity type
● Many-to-many (M:N)
  – Tells us that these entity types need to be further examined to find 1:M relationships
● One-to-one (1:1)
  – Usually tells us some tables should be combined, but sometimes it is efficient to have tables reflecting 1:1 relationships

Entity Relationship Diagram
● Chen convention for ERDs
  – Rectangles represent entities
  – Entity names are nouns, fully capitalized and never plural, because these diagrams are about entity types (e.g. PAINTER, PAINTING)
  – Diamonds represent relationships between entities
  – Relationship names are active or passive verbs, written in lowercase
  – Connectivities are labeled 1, M (=many) and N (=many, used in many-to-many relationships)

One-to-Many
A painter can have many paintings, but a painting can have but one painter. What do the table relationships look like?

Many-to-many
Each CLASS has many students, and each STUDENT has many classes. This can be reflected in a table relationship (How?), but it is usually more efficient to have two 1:M relationships

Many-to-many Tables
Why is this inefficient?

Bridging from M:N to 1:M
To break down an M:N relationship, create a new entity that bridges between the two entities. So, from STUDENT to CLASS, build a new table with a link (=key) to STUDENT and a link to CLASS. This is efficient because only the key information is redundant. Such tables are called composite or bridge entities
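
The bridging step can be sketched with Python's standard `sqlite3` module. The bridge table name ENROLL, all column names and the sample rows are assumptions for illustration; only STUDENT and CLASS come from the slides. ENROLL holds nothing but the two keys, so each M:N pairing costs one short row instead of repeated student or class data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_NAME TEXT);
CREATE TABLE CLASS (CLASS_CODE TEXT PRIMARY KEY, CLASS_TITLE TEXT);
-- Bridge (composite) entity: one row per student/class pairing.
CREATE TABLE ENROLL (
    STU_NUM INTEGER REFERENCES STUDENT (STU_NUM),
    CLASS_CODE TEXT REFERENCES CLASS (CLASS_CODE),
    PRIMARY KEY (STU_NUM, CLASS_CODE)
);
INSERT INTO STUDENT VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO CLASS VALUES ('C1', 'Databases'), ('C2', 'Cataloging');
INSERT INTO ENROLL VALUES (1, 'C1'), (1, 'C2'), (2, 'C1');
""")

# Follow the two 1:M links (STUDENT to ENROLL, CLASS to ENROLL) to list who takes what.
rows = conn.execute("""
SELECT STU_NAME, CLASS_TITLE
FROM STUDENT
JOIN ENROLL ON ENROLL.STU_NUM = STUDENT.STU_NUM
JOIN CLASS ON CLASS.CLASS_CODE = ENROLL.CLASS_CODE
ORDER BY STU_NAME, CLASS_TITLE
""").fetchall()
print(rows)
```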

Bridging Tables

The Bridging Process

Data Redundancy Revisited
● Last meeting we saw how to use linked tables to reduce data redundancy
● The links (foreign keys) are attributes that appear in both tables.
● While the foreign key values are often repeated in a table, they save us from having to repeat the data for several attributes.

Indexes
● To speed access to data, RDBs will typically include special index tables
● An index entry has a key and pointers: it matches a key value (for example, the primary key that identifies a particular entity) with the locations of the related rows in another table

An Index Example
The primary key for each painter has pointers to quickly access the paintings in the linked table
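
The idea of a key plus pointers can be sketched with a Python dictionary. Everything here is hypothetical (table contents, attribute names, the `paintings_by` helper), and row positions in a list stand in for the pointers a real database would keep; actual systems use more elaborate structures.

```python
from collections import defaultdict

# Hypothetical PAINTING table; PAINTER_NUM is the foreign key to PAINTER.
paintings = [
    {"PAINTING_NUM": 1, "TITLE": "Dawn", "PAINTER_NUM": 10},
    {"PAINTING_NUM": 2, "TITLE": "Harbor", "PAINTER_NUM": 20},
    {"PAINTING_NUM": 3, "TITLE": "Dusk", "PAINTER_NUM": 10},
]

# Build the index once: painter key -> positions ("pointers") of that painter's rows.
index = defaultdict(list)
for position, row in enumerate(paintings):
    index[row["PAINTER_NUM"]].append(position)

def paintings_by(painter_num):
    """Jump straight to a painter's rows instead of scanning the whole table."""
    return [paintings[pos]["TITLE"] for pos in index.get(painter_num, [])]

print(paintings_by(10))
```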

Assignment
● Design a collection of tables for a database
● Briefly, what is the database about? Who are the users? What purposes will the database serve?
  – At least three related tables
  – Explain why this is the most efficient design
  – Is this also the most logical design?
