EECS 647: Introduction to Database Systems

1 EECS 647: Introduction to Database Systems
Instructor: Luke Huan Spring 2007

2 Administrative
Take-home background survey will be posted on the class website today.
Due Feb 5th (next Monday) at the class meeting time.
4/12/2019 Luke Huan Univ. of Kansas

3 Administrative (cont.)
Your PostgreSQL account has been created.
Check your EECS account and get your password.
Do not change your PostgreSQL password!
Go to (linked through the class website) and read through the tutorial.

4 Review
A data model is a group of concepts for describing data.
  What are the two terms used by the ER model to describe a miniworld?
    Entity
    Relationship
A database schema is a description of a database, using a given data model (relational model by default).
  What are the three types of database schema that we usually see in database design?
  Student (sid: string, name: string, login: string, age: integer, gpa: real)

5 SUMMARY OF ER-DIAGRAM NOTATION FOR ER SCHEMAS
Symbol / Meaning (symbol drawings not reproduced in this transcript)
  ENTITY TYPE
  WEAK ENTITY TYPE
  RELATIONSHIP TYPE
  IDENTIFYING RELATIONSHIP TYPE
  ATTRIBUTE
  KEY ATTRIBUTE
  MULTIVALUED ATTRIBUTE
  COMPOSITE ATTRIBUTE
  DERIVED ATTRIBUTE
  TOTAL PARTICIPATION OF E2 IN R
  CARDINALITY RATIO 1:N FOR E1:E2 IN R
  STRUCTURAL CONSTRAINT (min, max) ON PARTICIPATION OF E IN R

6 ER Case Study I
Works_In does not allow an employee to work in a department for two or more periods.
We want to record several values of the descriptive attributes for each instance of this relationship.
[Figure: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Works_In with attributes from and to]
[Figure: Works_In as a relationship among Employees (ssn, name, lot), Departments (did, dname, budget), and Duration (from, to)]

7 ER Case Study II
Design a database representing cities, counties, and states
  For states, record name and capital (city)
  For counties, record name, area, and location (state)
  For cities, record name, population, and location (county and state)
Assume the following:
  Names of states are unique
  Names of counties are only unique within a state
  Names of cities are only unique within a county
  A city is always located in a single county
  A county is always located in a single state

8 Case study : first design
[Figure: Cities (name, population, county_name, county_area) In States (name, capital)]
County area information is repeated for every city in the county
  Redundancy is bad (why?)
State capital should really be a city
  Should “reference” entities through explicit relationships

9 Case study : second design
[Figure: Cities (name, population) In Counties (name, area) In States (name); Cities IsCapitalOf States]
Technically, nothing in this design could prevent a city in state X from being the capital of another state Y, but oh well…

10 Today’s Outline
Relational Model
Relational Model Constraints and Relational Database Schemas
Update Operations
  Insertion
  Deletion
  Update
Transaction
Dealing with Constraint Violations

11 Why Study the Relational Model?
Most widely used model.
“Legacy systems” in older models, e.g., IBM’s IMS
Object-oriented concepts merged in: the “Object-Relational” model
  Early work done in POSTGRES research project at Berkeley
XML features in most relational systems
  Can export XML interfaces
  Can embed XML inside relational fields

12 Database Design

13 Relational Model Concepts
Relational database: a set of relations.
Relation: made up of 2 parts:
  Schema: specifies name of relation, plus name and type of each column.
    E.g. Students(sid: string, name: string, login: string, age: integer, gpa: real)
  Instance: a table, with rows and columns.
    #rows = cardinality
    #fields = degree / arity
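
The schema/instance split above can be sketched in a few lines of Python. This is only an illustration: the three sample rows are made up, and `conforms` is a hypothetical helper, not part of any DBMS.

```python
# Schema of the Students relation from the slide: (column name, column type).
schema = [("sid", str), ("name", str), ("login", str), ("age", int), ("gpa", float)]

# Made-up sample rows (an assumption). A set is used because a relation
# instance has no duplicate rows and no row order.
instance = {
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
}

def conforms(row, schema):
    """True if the row has exactly one value of the declared type per column."""
    return len(row) == len(schema) and all(
        isinstance(value, col_type) for value, (_, col_type) in zip(row, schema)
    )

degree = len(schema)         # number of fields (arity)
cardinality = len(instance)  # number of rows
```

Here `degree` is fixed by the schema, while `cardinality` changes whenever rows are added to or removed from the instance.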

14 Informally
RELATION: A table of values
A relation may be thought of as a set of rows (table view).
  Each row represents a fact that corresponds to a real-world entity or relationship.
  Each row has a value of an item or set of items that uniquely identifies that row in the table.
  Sometimes row-ids or sequential numbers are assigned to identify the rows in the table.
A relation may alternately be thought of as a set of columns (schema view).
  Each column typically is called by its column name or column header or attribute name.

15 Example

16 Historically
The model was first proposed by Dr. E.F. Codd of IBM in 1970 in the following paper: "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970.
The above paper caused a major revolution in the field of database management and earned Ted Codd the coveted ACM Turing Award.
The picture is from Wikipedia.

17 A (Slightly) Formal Definition
A database is a collection of relations (or tables)
Each relation is identified by a name and a list of attributes (or columns)
Each attribute has a name and a domain (or type)
  Set-valued attributes not allowed
Simplicity is a virtue!

18 Schema versus instance
Schema (metadata)
  Specification of how data is to be structured logically
  Defined at set-up
  Rarely changes
Instance
  Content
  Changes rapidly, but always conforms to the schema
Compare to type and objects of type in a programming language

19 Example
Schema
  Student (SID integer, name string, age integer, GPA float)
  Course (CID string, title string)
  Enroll (SID integer, CID string)
Instance
  { <142, Bart, 10, 2.3>, <123, Milhouse, 10, 3.1>, ... }
  { <CPS116, Intro. to Database Systems>, ... }
  { <142, CPS116>, <142, CPS114>, ... }

20 Formal Definition (Set Theory)
Formally, given sets D1, D2, …, Dn, a relation r is a subset of D1 × D2 × … × Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai ∈ Di

21 Example
If
  customer_name = {Jones, Smith, Curry, Lindsay, …}
  customer_street = {Main, North, Park, …}
  customer_city = {Harrison, Rye, Pittsfield, …}
Then
  r = { (Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield) }
is a relation over customer_name × customer_street × customer_city
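
The set-theoretic definition can be checked directly in Python. In this sketch the three domains are truncated to the values listed on the slide (the slide's "…" indicates more values, so the truncation is an assumption made only to keep the Cartesian product small).

```python
from itertools import product

# Domains, cut down to the values shown on the slide (assumption).
customer_name = {"Jones", "Smith", "Curry", "Lindsay"}
customer_street = {"Main", "North", "Park"}
customer_city = {"Harrison", "Rye", "Pittsfield"}

# The relation r from the slide: a set of 3-tuples.
r = {
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Curry", "North", "Rye"),
    ("Lindsay", "Park", "Pittsfield"),
}

# D1 x D2 x D3: every possible combination of one value per domain.
cartesian = set(product(customer_name, customer_street, customer_city))

# r is a relation over the three domains exactly when it is a subset.
is_relation = r <= cartesian
```

With these truncated domains the product has 4 × 3 × 3 = 36 tuples, of which r selects 4.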

22 Attribute Types
Each attribute of a relation has a name, designating the role of the attribute
The set of allowed values for each attribute is called the domain of the attribute
Attribute values are (normally) required to be atomic; that is, indivisible
  E.g. the value of an attribute can be an account number, but cannot be a set of account numbers
A domain is said to be atomic if all its members are atomic
The special value null is a member of every domain

23 Relation Schema
A1, A2, …, An are attributes
R = (A1, A2, …, An) is a relation schema
  Example: Customer_schema = (customer_name, customer_street, customer_city)
r(R) denotes a relation r on the relation schema R
  Example: customer (Customer_schema)

24 Relation Instance
The current values (relation instance) of a relation are specified by a table
An element t of r is a tuple, represented by a row in a table

customer (columns are attributes, rows are tuples)
customer_name    customer_street    customer_city
Jones            Main               Harrison
Smith            North              Rye
Curry            North              Rye
Lindsay          Park               Pittsfield

25 Definition Summary
Informal Terms        Formal Terms
Table                 Relation
Column                Attribute/Domain
Row                   Tuple
Values in a column    Domain
Table Definition      Schema of a Relation
Populated Table       Extension

26 Characteristics of Relation
The tuples in a relation r(R) are not considered to be ordered, even though they appear to be in the tabular form.
We consider the attributes in R(A1, A2, ..., An) and the values in t=<v1, v2, ..., vn> to be ordered. (However, a more general alternative definition of relation does not require this ordering.)
All values are considered atomic (indivisible).
A special null value is used to represent values that are unknown or inapplicable to certain tuples.

27 Characteristics of Relation
Notation: we refer to component values of a tuple t by t[Ai] = vi (the value of attribute Ai for tuple t).
Similarly, t[Au, Av, ..., Aw] refers to the subtuple of t containing the values of attributes Au, Av, ..., Aw, respectively.
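
The t[Ai] and t[Au, Av, ..., Aw] notation maps onto simple tuple indexing. A minimal sketch, assuming the ordered-attribute view of a tuple; `component` and `subtuple` are hypothetical helper names, and the sample tuple is taken from the customer relation used earlier in the deck.

```python
# Ordered attribute list R(A1, A2, A3) and one tuple t = <v1, v2, v3>.
attributes = ("customer_name", "customer_street", "customer_city")
t = ("Jones", "Main", "Harrison")

def component(t, attributes, name):
    """t[Ai]: the value of attribute `name` in tuple t."""
    return t[attributes.index(name)]

def subtuple(t, attributes, names):
    """t[Au, Av, ..., Aw]: the values of the listed attributes, in the order listed."""
    return tuple(component(t, attributes, name) for name in names)
```

For instance, `subtuple(t, attributes, ["customer_city", "customer_name"])` picks out the city and the name, in that order.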

28 Relational Integrity Constraints
Constraints are conditions that must hold on all valid relation instances.
There are four main types of constraints:
  Domain constraints
    The value of an attribute must come from its domain
  Key constraints
  Entity integrity constraints
  Referential integrity constraints

29 Primary Key Constraints
A set of fields is a candidate key for a relation if:
  1. No two distinct tuples can have the same values in all key fields, and
  2. This is not true for any subset of the key.
  If part 2 is false, the set is only a superkey.
If there is more than one key for a relation, one of the keys is chosen (by the DBA) to be the primary key.
E.g., given a schema Student(sid: string, name: string, gpa: float) we have:
  sid is a key for Students. (What about name?)
  The set {sid, gpa} is a superkey.
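
The two conditions can be tested mechanically against one instance. A caveat: a key is a property of the schema, that is, of every valid instance, so a check on sample data can only refute a proposed key, never prove it. The sketch below uses made-up Student rows; `is_superkey` and `is_candidate_key` are hypothetical helper names.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """Condition 1: no two distinct rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key(rows, attrs):
    """Conditions 1 and 2: a superkey with no proper subset that is a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))          # sizes 0 .. len(attrs) - 1
        for subset in combinations(attrs, size)
    )

# Made-up sample instance of Student(sid, name, gpa).
students = [
    {"sid": "53666", "name": "Jones", "gpa": 3.4},
    {"sid": "53688", "name": "Smith", "gpa": 3.2},
    {"sid": "53650", "name": "Smith", "gpa": 3.8},
]
```

On this sample, ("sid",) passes both conditions, ("sid", "gpa") passes only the first, and ("name",) fails the first because two students are named Smith.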

30 Key Example
CAR (licence_num: string, Engine_serial_num: string, make: string, model: string, year: integer)
What are the candidate keys?
Which one might you use as the primary key?
What are the superkeys?

31 Entity Integrity
Entity Integrity: The primary key attributes PK of each relation schema R in S cannot have null values in any tuple of r(R).
Other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key.
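
Entity integrity can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for the course's PostgreSQL, the table layout follows the Student schema used elsewhere in the deck, and `try_insert` is a hypothetical helper. NOT NULL is written explicitly on the key because SQLite, unlike the SQL standard, otherwise accepts NULL in a non-INTEGER PRIMARY KEY column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    " sid TEXT PRIMARY KEY NOT NULL,"  # entity integrity: key may not be null
    " name TEXT NOT NULL,"             # a non-key attribute that also disallows null
    " gpa REAL)"                       # null allowed here
)

def try_insert(sid, name, gpa):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?)", (sid, name, gpa))
        return True
    except sqlite3.IntegrityError:
        return False
```

A row with a null `sid` or null `name` is rejected, a repeated `sid` is rejected by the key constraint, and a null `gpa` is accepted.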

32 Foreign Keys, Referential Integrity
Foreign key: a set of fields in one relation that is used to 'refer' to a tuple in another relation. (Must correspond to the primary key of the second relation.) Like a 'logical pointer'.
E.g. sid is a foreign key referring to Students:
  Student(sid: string, name: string, gpa: float)
  Enrolled(sid: string, cid: string, grade: string)
If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references.
Can you name a data model without referential integrity?
  Links in HTML!

33 Foreign Keys
Only students listed in the Students relation should be allowed to enroll for courses.
[Figure: Enrolled and Students tables]
Or, use NULL as the value for the foreign key in the referencing tuple when the referenced tuple does not exist.
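
A foreign key from Enrolled to Students can be exercised with Python's built-in sqlite3 module. This is a sketch, not the course's PostgreSQL setup: the sample student and course ids are made up, and `try_enroll` is a hypothetical helper. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute(
    "CREATE TABLE Students (sid TEXT PRIMARY KEY NOT NULL, name TEXT, gpa REAL)"
)
conn.execute(
    "CREATE TABLE Enrolled ("
    " sid TEXT REFERENCES Students(sid),"  # the 'logical pointer'
    " cid TEXT,"
    " grade TEXT)"
)
conn.execute("INSERT INTO Students VALUES ('53666', 'Jones', 3.4)")

def try_enroll(sid, cid):
    """Return True if the enrollment is accepted, False if it would dangle."""
    try:
        conn.execute("INSERT INTO Enrolled VALUES (?, ?, NULL)", (sid, cid))
        return True
    except sqlite3.IntegrityError:
        return False
```

Enrolling the existing student succeeds, enrolling an unknown sid is rejected, and a NULL sid is accepted because a null foreign key refers to nothing.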

34 In-Class Exercise
(Taken from Exercise 5.16) Consider the following relations for a database that keeps track of student enrollment in courses and the books adopted for each course:
  STUDENT(SSN, Name, Major, Bdate)
  COURSE(Course#, Cname, Dept)
  ENROLL(SSN, Course#, Quarter, Grade)
  BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
  TEXT(Book_ISBN, Book_Title, Publisher, Author)
Draw a relational schema diagram specifying the foreign keys for this schema.

35 Other Types of Constraints
Semantic Integrity Constraints:
  based on application semantics and cannot be expressed by the model per se
  e.g., “the max. no. of hours per employee for all projects he or she works on is 56 hrs per week”
  A constraint specification language may have to be used to express these
  SQL-99 allows triggers and ASSERTIONS to allow for some of these

36 Update Operations on Relations
INSERT a tuple.
DELETE a tuple.
MODIFY a tuple.
Constraints should not be violated in updates

37 Update Operations on Relations
In case of integrity violation, several actions can be taken:
  Cancel the operation that causes the violation (REJECT option)
  Perform the operation but inform the user of the violation
  Trigger additional updates so the violation is corrected (CASCADE option, SET NULL option)
  Execute a user-specified error-correction routine
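
Three of these options correspond to SQL foreign-key actions and can be compared side by side with Python's built-in sqlite3 module. A sketch under assumptions: the three Enrolled_* tables, the ids s1 to s3, and the helpers `delete_student` and `rows` are made up for illustration, and SQLite's RESTRICT plays the role of the REJECT option.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE Students (sid TEXT PRIMARY KEY NOT NULL)")
conn.executemany("INSERT INTO Students VALUES (?)", [("s1",), ("s2",), ("s3",)])

# One referencing table per option, each pointing at a different student.
for table, action, sid in [
    ("Enrolled_reject", "RESTRICT", "s1"),
    ("Enrolled_cascade", "CASCADE", "s2"),
    ("Enrolled_setnull", "SET NULL", "s3"),
]:
    conn.execute(
        f"CREATE TABLE {table} ("
        f" sid TEXT REFERENCES Students(sid) ON DELETE {action},"
        f" cid TEXT)"
    )
    conn.execute(f"INSERT INTO {table} VALUES (?, 'c101')", (sid,))

def delete_student(sid):
    """Return True if the delete went through, False if it was rejected."""
    try:
        conn.execute("DELETE FROM Students WHERE sid = ?", (sid,))
        return True
    except sqlite3.IntegrityError:
        return False

def rows(table):
    """All (sid, cid) rows currently in the given referencing table."""
    return conn.execute(f"SELECT sid, cid FROM {table}").fetchall()
```

Deleting s1 is refused, deleting s2 also removes its enrollment row, and deleting s3 keeps the enrollment row but sets its sid to NULL.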

38 Key concept: Transaction
an atomic sequence of database actions (reads/writes)
takes DB from one consistent state to another
[Figure: consistent state 1 → transaction → consistent state 2]

39 Example
Transaction: “transfer $100 from savings to checking”
  Before: checking: $200, savings: $1000
  After: checking: $300, savings: $900
Here, consistency is based on our knowledge of banking “semantics”
In general, it is up to the writer of the transaction to ensure that the transaction preserves consistency
The DBMS provides (limited) automatic enforcement, via integrity constraints
  e.g., balances must be >= 0
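
The transfer can be written as a transaction with Python's built-in sqlite3 module. This is a sketch under assumptions: the Account table and the `transfer` and `balances` helpers are made up, the starting balances are taken to be checking $200 and savings $1000, and the "balances must be >= 0" rule is expressed as a CHECK constraint. The credit is applied before the debit on purpose, so that a failing debit leaves a half-done update for the rollback to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account ("
    " name TEXT PRIMARY KEY NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # integrity constraint
)
conn.execute("INSERT INTO Account VALUES ('checking', 200), ('savings', 1000)")
conn.commit()

def transfer(src, dst, amount):
    """Move `amount` from src to dst atomically; return False if rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE Account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM Account"))
```

A $100 transfer moves the database from one consistent state to another. A $5000 transfer violates the CHECK constraint on the debit, so the already-applied credit is rolled back and both balances stay as they were.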

