376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Prof. Billibon Yoshimi 9/17/2018
Introduction What are databases? What are the different types of databases. Data modeling Access Security Prof. Billibon Yoshimi 9/17/2018
Who am I Bill Yoshimi Office: 106 Old Laundry Building Phone: 437-5986 Email: yoshimi@cs.vassar.edu Office Hours: 4:00pm-5:00pm MW Prof. Billibon Yoshimi 9/17/2018
Required Texts “Fundamentals of Database Systems,” Elmasri/Navathe, Benjamin Cummings. Prof. Billibon Yoshimi 9/17/2018
How course will be graded 40% - Homework/programming assignments 20% - Midterm 30% - Final Exam Prof. Billibon Yoshimi 9/17/2018
Programming assignments One homework/programming assignment per week (consisting of 3 or 4 exercises). Absolutely essential for learning the material. Try to hand assignments in on time, future assignments depend on knowledge learned from previous assignments. Prof. Billibon Yoshimi 9/17/2018
Late policy All assignments are due at the start of class on the date specified. Assignments arriving after the start of class but before the start of the next class will be accepted with a 10% penalty. Late assignments will not be accepted after this point. Prof. Billibon Yoshimi 9/17/2018
Tests Closed book and closed notes. Each test is cumulative. Tests will be announced later in semester. Prof. Billibon Yoshimi 9/17/2018
Original authoring of materials Talking to classmates about general ideas is OK. Each student must do his or her programming assignment entirely by himself or herself. Do not discuss or share programs with other students. Vassar regulations require the professor to report suspected violations of academic integrity to the Dean of Studies. Read the “Originality and Attribution” pamphlet. Prof. Billibon Yoshimi 9/17/2018
Now, we’re finished with the administrative tasks, on with the course… Prof. Billibon Yoshimi 9/17/2018
Define database A collection of related data, logically connected and organized. A database management System (DBMS) enables users to create and use a database Prof. Billibon Yoshimi 9/17/2018
Simplified Database System Application Programs/Queries Software to process queries/programs Software to access stored data Stored Database Definition (META-DATA) Stored Database Prof. Billibon Yoshimi 9/17/2018
Example STUDENT file stores data on each student. COURSE file stores data on eahc course SECTION file stores data on each section of a course GRADE_REPORT file stores grades that students receive in the various sections they complete PREREQUISITE file stores prerequisites for each course. Prof. Billibon Yoshimi 9/17/2018
STUDENT file Each student has Define the types of each of these Name StudentID Class Major Define the types of each of these Prof. Billibon Yoshimi 9/17/2018
Look at relationships between these files PREREQUISITE COURSE STUDENT SECTION GRADE_REPORT Prof. Billibon Yoshimi 9/17/2018
Basic operations Queries (list the names of all students enrolled in Databases) Updates (change Smith’s section from 1 to 2) These are informal queries (must be fully specified before real query can be made against database) Prof. Billibon Yoshimi 9/17/2018
Old way to access data Flat file system. Easiest to implement quickly. Each system has its own copy of the data. Every programmer must keep track of their own interfaces, must maintain consistency themselves. Prof. Billibon Yoshimi 9/17/2018
Fundamental characteristic of a database Should be self describing. System catalog contains info about all data, fields, constraints and relationships between data. (Also called meta-data) Database program should be application agnostic. (as long as relationships can be described by meta-data.) Information is looked up according to meta-data definitions. In ordinary files, this information is encoded in accessor routines. Prof. Billibon Yoshimi 9/17/2018
Databases provide Program-data independence. Change in physical file structure should not change access routines. Data abstraction – all details about how data is stored is hidden (access is via data model.) Multiple views – subsets of data and “virutal” data. Prof. Billibon Yoshimi 9/17/2018
Who accesses the database? Database administrators (DBA) – authorizing access, coordinating and monitoring use. Database designers – identify data and how it will be stored in DB, understand requirements, devise views. End users Casual end users – use query language to access database. Naïve or parametric end users – use canned queries (little variation.) Sophisticated end users – thoroughly familiar with intricacy of database, make sophisticated queries. System Analyst – determine needs of Naïve and parameteric users Application programs – implement SA reqs. Prof. Billibon Yoshimi 9/17/2018
Behind the scenes DBMS designers and implementers – maintain modules inside the database (query processors, data access, security, etc.) Tool developers – Facilitate design and improve performance. (performance monitoring tools, natural language interfaces, etc.) Operators and maintenance personnel Prof. Billibon Yoshimi 9/17/2018
Why use a DBMS? Control redundancy Shared data (concurrent access) Restrict unauthorized access Provide multiple interfaces to data Represent complex relationships between data. Enforce integrity constraints Backup and recovery Prof. Billibon Yoshimi 9/17/2018
Lesser advantages Ability to enforce standards across organization. Flexibility Reduced development time for new applications. Work with up to date data Economies of scale Prof. Billibon Yoshimi 9/17/2018
When not to use a DBMS Database and application are simple, well defined, not expected to change often. System has stringent real-time constraints that DBMS my not meet. Multiple access to data is not required. DBMS are high overhead systems (hardware, planning, training and maintenance). Prof. Billibon Yoshimi 9/17/2018
DBMS Concepts and Architecture Prof. Billibon Yoshimi 9/17/2018
Data model Describes structure of the database (types, relationships and constraints.) High level (conceptual) vs. low level (physical) vs. implementation data models High level (object-based models) Entity – an object Attribute – some quality or quantity associated with an object Relationship – how objects are related Implementation level Relational, network, hierarchical. Prof. Billibon Yoshimi 9/17/2018
Database Schema Description of the database. Specified during design time and not expected to change. Schema Diagram – data model specific convention for displaying schemas. Schema Construct – single object description in a schema. Prof. Billibon Yoshimi 9/17/2018
Schema Diagram example STUDENT Name StudentNumber Class Major COURSE CourseName Number CreditHour Dept SECTION SectionID CourseNumber Semester Year Prof. Billibon Yoshimi 9/17/2018
Understand Data changes frequently, data scheme, if well designed, should not change. The data in the database at any time is called the database instance, occurance or state. Any insert, delete or change converts DB from one instance to another. DBMS is responsible for verifying that each instance of the DB adheres to schema. Scheme also called intension, instance – extension of the schema. Prof. Billibon Yoshimi 9/17/2018
Three Schema Architecture Supports program data independence, multi-user views, and catalog to store DB schema. Internal level has internal scheme – physical structure of DB and access paths. Conceptual level has conceptual schema – describes entities, datatype, relationships, and constraints. External or view level has multiple external schemas or user views. – views for a particular audience. Prof. Billibon Yoshimi 9/17/2018
Schemas are only descriptions of data Data only exists at the physical level. Mappings are used to convert from schema to schema (walk through an example). Prof. Billibon Yoshimi 9/17/2018
How schema preserve data independence. Logical data independence – changes to conceptual scheme do not affect external schemas. Physical data independence – changes to internal schema (physical layer) do not affect conceptual schema. In reality, difficult to implement (also multiple level mappings add overhead to queries.) Prof. Billibon Yoshimi 9/17/2018
Interfaces to Databases Languages: Data Definition Language (DDL) – used to define internal and conceptual schemas Storage Definition Language (SDL) – mapping between conceptual and internal schema. View Definition Language (VDL) – defines user views and mapping to conceptual Data Manipulation Language (DML) – methods for retrieving, inserting, deleting and modifying data. High-level (non procedural) set-at-a-time Low-level (procedural) must be embedded. Record-at-a-time. Prof. Billibon Yoshimi 9/17/2018
How languages are used Data Manipulation calls are made either directly (query language) or embedded in a host language (data sublanguage) Naïve and parametric users have user-friendly interfaces Prof. Billibon Yoshimi 9/17/2018
“User Friendly” DBMS Interfaces Menus, graphical interfaces, forms, natural language, command languages for parametric users, DBA interfaces Are these really user friendly? Prof. Billibon Yoshimi 9/17/2018
Classification of DBMS Relational, network, hierarchical, other. Single vs. multi-user Number of sites (centralized or distributed). Homo vs. heterogeneous (federated) Cost Types of access path General vs. special purpose Prof. Billibon Yoshimi 9/17/2018
Main criteria – the data model Relational – data organized in tables, high level query language, limited form of user views. Conceptual and internal views are not distinguishable. Network – set of records and implements limited 1:N relationships. Must be embedded in host programming language. Hierarchical – include parent-child relationships. Prof. Billibon Yoshimi 9/17/2018
Database design process Collect requirements and analyze results. Create conceptual schema (conceptual database design) (datatypes, relationships, and constraints.) Implement database (data model mapping) Design physical database (specify internal storage structures and file organizations) Prof. Billibon Yoshimi 9/17/2018
Example: a company database Company is organized in departments. Each department has a name, a number and an employee who manages the department. Track when the employee started managing the department. Department may have several locations. Departments have projects, each has a name, a number and a location. Employees have SSN, address, salary, sex and birthdate. Employee is assigned to one department but may work on multiple projects. Track # of hours worked per week and on which projects. Track employee’s direct supervisor. Prof. Billibon Yoshimi 9/17/2018
Entity A “thing” in the real world. Can be real or conceptual. Entity has particular properties: attributes E.g. E1 Name = John Addr = 1 J St, Apt 2, NY, NY 10002 Age = 55 Phone = 222 222 2222 Prof. Billibon Yoshimi 9/17/2018
Attributes Atomic vs. composite. Atomic or simple are not decomposable. Composite attributes can form a hierarchy (street address – number, street, apartment) Composite attributes useful when parts are referenced as well as whole. Prof. Billibon Yoshimi 9/17/2018
Attributes cont. Single valued vs. multivalued. Derivable attributes (age from BD, # of employees by sum) NULL- not applicable or unknown Prof. Billibon Yoshimi 9/17/2018
Entity Types Similar entities (having similar attributes but with different values) Described using Entity Type Schema (intension)– common structure shared by entities. E.g. EMPLOYEE has Name, Age, Salary Sets of instances valid at a given point in time is called an extension of the entity type. E.g. (John Smith, 45, 80k) (Fred Brown, 32, 20K) Entity type schema should not change; extensions change often. Prof. Billibon Yoshimi 9/17/2018
Key Attribute Uniqueness constraint on attributes. An entity type usually an attribute whose values are distinct for each individual entry. Key attribute is used to uniquely identify an entity. E.g. PERSON has SocialSecurityNumber, COMPANY has Name. No two instances can have the same Key Attribute. Some entities can have more than one Key Attribute. Prof. Billibon Yoshimi 9/17/2018
Value Sets of Attributes A: Attribute E: Entity Type V: Value Set A:E -> P(V) P(V) is the Power Set of V the set of all subsets of V. Prof. Billibon Yoshimi 9/17/2018
Value Sets of Attributes cont. A(e) : value of attribute A for entity e A(e) is a singleton for SV attributes (has only one element) A(e) may be empty set, single element or multiple element for multivalued attribute. A(e) for composite attribute is cartesian product (all pairs mapping) of P(V1)xP(V2)xP(V3)… Use (…) and comma separted list to denote composite attribute. Use {…} and comma separated list to denote multi-valued attribute Prof. Billibon Yoshimi 9/17/2018
Value Set of Entity cont. E.g. if a person can have more than one residence and each residence can have more than one phone then the attribute AddressPhone {AddressPhone({Phone(AreaCode,PhoneNumber)}, Address(StreetAddress(Number, Street, ApartmentNumber),City,State,Zip))} Prof. Billibon Yoshimi 9/17/2018
Relationships Relationship type R among N entities E1…EN is a set of associations among these types. R is a set of ri where each ri is an n tuple of (e1, e2, … en) and each ej in ri is a member of entity type Ej 1<j<N Prof. Billibon Yoshimi 9/17/2018
Example EMPLOYEE DEPARTMENT WORKS_FOR e1 e2 e3 e4 e5 e6 e7 r1 r2 r3 r4 Prof. Billibon Yoshimi 9/17/2018
Degree of a Relationship Model Degree is number of participating Entity Types. In previous example, degree is 2 or binary. (most used form) Ternary relationship type has three Entity Types. (holds more information than 3 binary relationships). Connection trap can occur when 3 binary relations used instead of ternary relation. Prof. Billibon Yoshimi 9/17/2018
Relations as Attributes Prof. Billibon Yoshimi 9/17/2018
Example Prof. Billibon Yoshimi 9/17/2018
Prof. Billibon Yoshimi 9/17/2018
Entity-Relationship Model Prof. Billibon Yoshimi 9/17/2018
Prof. Billibon Yoshimi 9/17/2018
Example Student Name StudentID Class Major Smith 17 1 COSC Brown 8 2 Prof. Billibon Yoshimi 9/17/2018
Quoting Quoting by using (quote ) or ‘ protects the symbol that follows like (quote helloworld) or ‘helloworld Prof. Billibon Yoshimi 9/17/2018
Summary State what has been learned Define ways to apply training Request feedback of training session Prof. Billibon Yoshimi 9/17/2018
Where to Get More Information Other training sessions List books, articles, electronic sources Consulting services, other sources Prof. Billibon Yoshimi 9/17/2018