Overview of Database Systems CSCE 315, Fall 2017 Project 1, Lecture 1
Project Your first project (next week) will involve putting together a very basic database system There will be a few lectures to give you an overview of database systems This is nowhere close to what you would get in a full database course Slides adapted from Jennifer Welch (some of hers were from Jeffrey Ullman)
Why Database Systems? Systems designed to manage very large amounts of data, and to query them Most of these data do not fit in the RAM Often, key considerations include: Efficiency, Safe concurrency Reliability Ease of access (querying, distributed) If an application developer was to implement all these features He would reinvent the wheel He may make mistakes Or he has to relearn couple decades worth of research in a few days In stead, the developer can just use a database software where all these are already available
Creating a Database A database schema determines what will be represented in the database This really about which Tables and the structure/schema of those Tables Specified through a data definition language (DDL) Once the database is created, then data can be inserted Like creating a class first and then creating objects of it Finally, the database can be queried for useful information using Data Manipulation Language (DML) SQL is used both as DDL and DML
Database Design vs Database Development When we write an application for some other course, most likely we will learn how to organize data in databases The database software is already there (e.g., MySQL, Microsoft Excel, Oracle, MongoDB) This is called database design, which means organizing data inside a available database software However, Team Project #1 is about: Developing a database engine, since we won’t use the available ones Then, organizing data inside it for the application part Therefore, we will learn both database development and data design
Database Design Means setting up the structure inside the database such that: There is minimal repetition of data Data access in convenient Will use Entity-Relationship (ER) model/diagram Similar to class diagram for Object Oriented Design
Entity-Relationship Model Way of expressing: Kinds/Types of data and how they are related Easy first way to think about databases Later, relational model described Relational model is the foundation of most databases
Entities and Attributes Entities are things Similar to objects in OOP Correspond to Records Example: Tanzir is a Person entity Attributes are properties of entity sets Correspond to fields in the Records Example: Tanzir’s birthday, ethnicity Entity sets are collections of those things Correspond to Tables Collection of all A&M employees/personnels
Entity Sets and Attributes Name Party Senator Bill State Text Name Years
Relationships Connect two or more entity sets Name Contributed Party Organization Senator Lobbyist State Sponsored Name Years Wrote Bill Name Text
Values of Entities and Relationships The “value” of an entity set is the entities it contains The “value” of a relationship is a list of currently related entities (one from each entity set) Senator (Entity Set) Sponsor (Relationship) Name Party State Years Smith Rep TX 13 Jones Dem OK 5 Dave CA 12 Senator Bill Smith Tax Bill Defense Bill Jones
Multi-Way Relationships E.g. Lobbyist lobbied Senator about Bill Name Party Organization Senator Lobbied Lobbyist State Name Years Bill Name Text
Relationship Types Consider binary relationships (two entity groups in a relationship) One-to-one Each entity can have at most one in the other category e.g. entity groups: Baseball player, Team relationship: Team MVP A team can only have one MVP, and a player can only be MVP for one team.
Relationship Types Consider binary relationships (two entity groups in a relationship) One-to-one Many-to-one Each entity of first set can go to at most one of the second set e.g. entity groups: Person, Town relationship: BornIn A person can is born in only one town, but a town can have many people born there
Relationship Types Consider binary relationships (two entity groups in a relationship) One-to-one Many-to-one Many-to-many Any number from one set to the other e.g. Senators can sponsor many bills, and each bill can be sponsored by many Senators
Diagrams of Relationships Arrow shows “to one” Lived In Born In Person Town MVP Baseball Player Team
Attributes on Relationships Born In Person Town Hospital
Keys A key is a set of attributes for an entity set such that no two entities agree on all the attributes. We must have a key for every entity set Name Party Senator State Years
Key for multiple attributes Must choose one set of attributes First Name Last Name Number Baseball Player Position Birthdate Nationality Salary Team
Key for multiple attributes Must choose one set of attributes First Name Last Name Number Baseball Player Position Birthdate Nationality Salary Team
Key for multiple attributes Must choose one set of attributes First Name Last Name Number Baseball Player Position Birthdate Nationality Salary Team
Weak entity sets Need “help” to determine key First Name Last Name Number Note arrrow: indicates many to one. Baseball Player Position Birthdate Plays On Team Nationality Salary Name City
Design Techniques Avoid redundancy Say the same thing two ways First Name Last Name Number Baseball Player Position Birthdate Plays On Team Team Name Salary Name City
Design Techniques Avoid redundancy Say the same thing two ways First Name Last Name Number Baseball Player Position Birthdate Team Name Team Town
Design Techniques Don’t use entity set if attribute will do Entity lists should either Have some non-key attribute, OR Be the “many” in a many-one/many-many relationship Plays On Name Baseball Player Team Name City
Design Techniques Don’t use entity set if attribute will do Entity lists should either Have some non-key attribute Be the “many” in a many-one/many-many relationship Name Baseball Player Team
Design Techniques Don’t overuse weak entity sets Usually use unique key for each entity set (e.g. UIN, SSN, VIN) Not always possible, though