Relational Model CMSC 461 Michael Wilson
What is a relation? A relation is a set of tuples (d_1, d_2, d_3, …, d_n), where each element d_i is a member of D_i, the data domain for d_i. The order of the tuples in the set is irrelevant. A data domain describes all the values that a particular data element may contain; it corresponds to a data type. Each element is called an attribute value.
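A minimal sketch of this definition in Python, assuming three small hand-written domains. The names `DOMAINS` and `in_domains`, and the ages used in the sample tuples, are made up for illustration and are not from the course material.

```python
# Each position i has a data domain D_i, modelled here as a membership test.
# These three domains are assumptions chosen for illustration.
DOMAINS = (
    lambda v: isinstance(v, str),                          # D_1: strings (a name)
    lambda v: isinstance(v, int) and v >= 0,               # D_2: non-negative integers (an age)
    lambda v: isinstance(v, float) and 0.0 <= v <= 4.0,    # D_3: GPA values
)

def in_domains(tup):
    """Return True if every attribute value d_i is a member of its domain D_i."""
    return len(tup) == len(DOMAINS) and all(
        check(value) for check, value in zip(DOMAINS, tup)
    )

# A relation is a set of tuples, so order is irrelevant and duplicates collapse.
relation_a = {("Samus", 30, 4.0), ("Luigi", 26, 3.9)}
relation_b = {("Luigi", 26, 3.9), ("Samus", 30, 4.0), ("Samus", 30, 4.0)}

same_relation = relation_a == relation_b
all_valid = all(in_domains(t) for t in relation_a)
```

Using a Python `set` gives the two properties from the definition for free: the two literals above describe the same relation even though they list the tuples in a different order and one repeats a tuple.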
A picture jacked shamelessly from Wikipedia
An example
UMBC ID   Age   FName   LName      GPA
SM              Mario              4.0
SM              Luigi   Mario      3.9
MT              Samus   Aran       4.0
FF              Snow    Villiers   0.2
SW              Snow    White      3.5
Relation schemas Relation schemas are groups of named attributes, along with any constraints on the data each attribute holds. For the previous example, the relation schema would be: UMBC ID (String), Age (Integer), FName (String), LName (String), GPA (Double).
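One way to picture a relation schema in code is as an ordered list of attribute names paired with types. The sketch below assumes Python's `str`, `int`, and `float` stand in for String, Integer, and Double. The names `STUDENT_SCHEMA`, `conforms`, and `as_named` are hypothetical, and the ID and age in the sample row are invented placeholders.

```python
# Relation schema from the slide: attribute names paired with their types.
# str/int/float stand in for String/Integer/Double (an assumption).
STUDENT_SCHEMA = (
    ("UMBC ID", str),
    ("Age", int),
    ("FName", str),
    ("LName", str),
    ("GPA", float),
)

def conforms(tup, schema=STUDENT_SCHEMA):
    """Return True if the tuple has one value of the right type per attribute."""
    return len(tup) == len(schema) and all(
        isinstance(value, typ) for value, (_, typ) in zip(tup, schema)
    )

def as_named(tup, schema=STUDENT_SCHEMA):
    """Pair each attribute value with its attribute name."""
    return {name: value for (name, _), value in zip(schema, tup)}

# The ID and age here are invented placeholders, not values from the slides.
row = ("XX00000", 20, "Samus", "Aran", 4.0)
named_row = as_named(row)
```

The type check is the simplest kind of constraint a schema can carry; richer constraints (ranges, uniqueness) would need extra checks on top of it.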
Translating to database terminology A relation looks a lot like a table. An attribute looks a lot like a column. A tuple looks a lot like a row. Finally, a database contains many tables, so a database can be thought of as a set of relations.
Differences between relations and tables The differences are minor. A table can have duplicate rows, whereas a relation cannot have duplicate tuples. Tables also have some metadata associated with them for programmatic reasons.
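The duplicate-row difference can be sketched in Python by treating a table as a list of rows and a relation as a set of tuples. This is only an analogy: the variable names are illustrative, and the rows reuse names and GPAs from the earlier example.

```python
# A table behaves like a list of rows: duplicates and row order are kept.
table = [
    ("Snow", "White", 3.5),
    ("Snow", "Villiers", 0.2),
    ("Snow", "White", 3.5),   # duplicate row, allowed in a table
]

# A relation is a set of tuples: the duplicate disappears.
relation = set(table)

duplicate_rows = len(table) - len(relation)
```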
Identification of tuples To identify a tuple, we typically take a subset of the values in the tuple and refer to the tuple by that subset. Any subset of values that can uniquely identify the tuple is called a superkey. Uniquely identifying a tuple requires that the superkey be unique: no two tuples may agree on every value in it. As long as the subset as a whole is unique, it works, even if one or more individual values in it are shared between tuples.
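The uniqueness test can be sketched as a small Python function. It assumes rows are dictionaries keyed by attribute name; `is_superkey` and `students` are illustrative names, and the rows reuse names and GPAs from the earlier example.

```python
def is_superkey(rows, attrs):
    """Return True if the values under `attrs` are unique across all rows."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False      # two rows agree on every chosen attribute
        seen.add(key)
    return True

students = [
    {"FName": "Luigi", "LName": "Mario", "GPA": 3.9},
    {"FName": "Samus", "LName": "Aran", "GPA": 4.0},
    {"FName": "Snow", "LName": "Villiers", "GPA": 0.2},
    {"FName": "Snow", "LName": "White", "GPA": 3.5},
]

fname_alone = is_superkey(students, ["FName"])            # "Snow" appears twice
full_name = is_superkey(students, ["FName", "LName"])     # the pair is unique
```

`FName` alone fails because two rows share "Snow", while the pair (`FName`, `LName`) works even though one of its values is shared. A check like this only shows uniqueness for the rows at hand; whether a set of attributes is a superkey in general depends on what the data is allowed to contain.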
Superkeys!
UMBC ID   Age   FName   LName      GPA
SM              Mario              4.0
SM              Luigi   Mario      3.9
MT              Samus   Aran       4.0
FF              Snow    Villiers   0.2
SW              Snow    White      3.5
More superkeys!
Record ID   PatientName     Date of Visit   Diagnosis
14233       Mario           04/28/1991      Leg injury
            Snow Villiers   12/05/2012      Brain damage
            Lara Croft      12/05/2012      Powder burns
What use is a superkey? In the previous examples, we’ve got multiple possible superkeys. What use are they? Mainly, they allow us to introduce a second concept.
Candidate keys A candidate key is a minimal superkey: a superkey from which no value can be removed without losing the ability to uniquely identify a tuple. A relation can have multiple candidate keys.
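A brute-force sketch of finding candidate keys in Python, assuming rows are dictionaries and the relation is small enough to try every attribute combination. The helper names `is_superkey` and `candidate_keys` are hypothetical. The rows use the patient name, date, and diagnosis values from the visit example.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """Return True if the values under `attrs` are unique across all rows."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)

def candidate_keys(rows, attributes):
    """Return every minimal superkey: one with no smaller superkey inside it."""
    found = []
    # Try small attribute sets first, so any superkey that contains an
    # already-found key is known not to be minimal and can be skipped.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(key) <= set(attrs) for key in found):
                continue
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found

visits = [
    {"PatientName": "Mario", "Date of Visit": "04/28/1991", "Diagnosis": "Leg injury"},
    {"PatientName": "Snow Villiers", "Date of Visit": "12/05/2012", "Diagnosis": "Brain damage"},
    {"PatientName": "Lara Croft", "Date of Visit": "12/05/2012", "Diagnosis": "Powder burns"},
]

keys = candidate_keys(visits, ["PatientName", "Date of Visit", "Diagnosis"])
```

For these three rows the search finds two single-attribute candidate keys, `PatientName` and `Diagnosis`, which illustrates that a relation can have more than one. Whether either would stay unique as more visits are recorded is a separate question that the sample data cannot answer.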
Candidate keys?
Record ID   PatientName     Date of Visit   Diagnosis
14233       Mario           04/28/1991      Leg injury
            Snow Villiers   12/05/2012      Brain damage
            Lara Croft      12/05/2012      Powder burns
Choosing a candidate key A candidate key is a minimal superkey: a superkey from which no value can be removed without losing the ability to uniquely identify a tuple. A relation can have multiple candidate keys. How do you know which one to choose?
Choosing a candidate key This is actually somewhat of an art. The answer, really, is to choose the one that best suits your data. Looking at ER diagrams can help you decide.
Primary keys When you’ve chosen a candidate key to use, you’ve chosen a primary key. The primary key is the way you reference items in your relation. There will be much more on this later.
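To illustrate referencing items through a primary key, here is a small Python sketch that indexes rows by the chosen key. It assumes (`FName`, `LName`) has been picked as the primary key purely for the sake of the example. `index_by_primary_key` is a hypothetical helper, not something from the course.

```python
def index_by_primary_key(rows, key_attrs):
    """Build a lookup from primary-key value to row, rejecting duplicate keys."""
    index = {}
    for row in rows:
        key = tuple(row[a] for a in key_attrs)
        if key in index:
            raise ValueError(f"duplicate primary key value: {key!r}")
        index[key] = row
    return index

students = [
    {"FName": "Luigi", "LName": "Mario", "GPA": 3.9},
    {"FName": "Samus", "LName": "Aran", "GPA": 4.0},
    {"FName": "Snow", "LName": "Villiers", "GPA": 0.2},
    {"FName": "Snow", "LName": "White", "GPA": 3.5},
]

# Assumption for illustration: (FName, LName) is the chosen primary key.
by_name = index_by_primary_key(students, ("FName", "LName"))
snow_white = by_name[("Snow", "White")]
```

Once the index exists, each row is reached through its key value alone, and an attempt to index by something that is not unique (such as `FName` by itself) raises an error instead of silently losing a row.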
Some examples phpBB - phpBB uses 63 tables