Presentation is loading. Please wait.

Presentation is loading. Please wait.

CMSC424: Database Design Instructor: Amol Deshpande

Similar presentations


Presentation on theme: "CMSC424: Database Design Instructor: Amol Deshpande"— Presentation transcript:

1 CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu

2 Today DBMS Overview Data Modeling Going from conceptual requirements of a application to a concrete data model E/R Model

3 Example Simple Banking Application Need to store information about: Accounts Customers Need to support: ATM transactions Queries about the data Why we can’t use a file-system based solution ?

4 A file-system based solution Data stored in files in ASCII format #-seperated files in /usr/db directory /usr/db/accounts Account Number # Balance 101 # 900 102 # 700 … /usr/db/customers Customer Name # Customer Address # Account Number Johnson # 101 University Blvd # 101 Smith # 1300 K St # 102 Johnson # 101 University Blvd # 103 …

5 A file-system based solution Write application programs to support the operations In your favorite programming language To support withdrawals by a customer for amount $X from account Y Scan /usr/db/accounts, and look for Y in the 1 st field Subtract $X from the 2 nd field, and rewrite the file To support finding names of all customers on street Z Scan /usr/db/customers, and look for (partial) matches for Z in the addess field …

6 What’s wrong with this solution ? Hard to control redundancy Hard to evolve the structure Data retrieval requires writing application programs Semantic constraints all over the place Not fast enough ! Data consistency issues Disk crashes etc Security Database management provide an end-to-end solution to all of these problems

7 How ? The key insight is whats called data abstraction

8 Data Abstraction Probably the most important purpose of a DBMS Goal: Hiding low-level details from the users of the system Through use of logical abstractions

9 Data Abstraction Logical Level Physical Level View Level View 1View 2View n … How data is actually stored ? e.g. are we using disks ? Which file system ? What data is stored ? describe data properties such as data semantics, data relationships What data users and application programs see ?

10 Data Abstraction: Banking Example Logical level: Provide an abstraction of tables Two tables can be accessed: accounts  Columns: account number, balance customers  Columns: name, address, account number View level: A teller (non-manager) can only see a part of the accounts table Not containing high balance accounts

11 Data Abstraction: Banking Example Physical Level: Each table is stored in a separate ASCII file # separated fields Identical to what we had before ? BUT the users are not aware of this They only see the tables The application programs are written over the tables abstraction Can change the physical level without affecting users In fact, can even change the logical level without affecting the teller

12 DBMS at a Glance 1. Data Modeling 2. Data Retrieval 3. Data Storage 4. Data Integrity

13 Data Modeling A data model is a collection of concepts for describing data properties and domain knowledge: Data relationships Data semantics Data constraints We will discuss two models extensively in this class Entity-relationship Model Relational Model Probably discuss XML as well

14 Data Retrieval Query = Declarative data retrieval program describes what data to acquire, not how to acquire it Non-declarative: scan the accounts file look for number 55 in the 2 nd field subtract $50 from the 3 rd field Declarative (posed against the tables abstraction): Subtract $50 from the column named balance for the row corresponding to account number 55 in the accounts table How to do it is not specified. Why ? Easier to write More efficient to execute (why ?)

15 Data Storage Where and how to store data ? Main memory ? What if the database larger than memory size ? Disks ? How to move data between memory and disk ? Etc etc…

16 Data Integrity Manage concurrency and crashes Transaction: A sequence of database actions enclosed within special tags Properties: Atomicity: Entire transaction or nothing Consistency: Transaction, executed completely, take database from one consistent state to another Isolation: Concurrent transactions appear to run in isolation Durability: Effects of committed transactions are not lost Consistency: Transaction programmer needs to guarantee that DBMS can do a few things, e.g., enforce constraints on the data Rest: DBMS guarantees

17 Data Integrity Semantic constraints Typically specified at the logical level E.g. balance > 0

18 DBMS at a glance Data Models Conceptual representation of the data Data Retrieval How to ask questions of the database How to answer those questions Data Storage How/where to store data, how to access it Data Integrity Manage crashes, concurrency Manage semantic inconsistencies Not fully disjoint categorization !!

19 Recap Data Models Conceptual representation of the data Data Retrieval How to ask questions of the database How to answer those questions Data Storage How/where to store data, how to access it Data Integrity Manage crashes, concurrency Manage semantic inconsistencies

20 Data Modeling Goals: Conceptual representation of the data “Reality” meets “bits and bytes” Must make sense, and be usable by other people We will study: Entity-relationship Model Relational Model Note the difference !! May study XML-based models or object-oriented models Why so many models ??

21 Motivation You’ve just been hired by Bank of America as their DBA for their online banking web site. You are asked to create a database that monitors: customers accounts loans branches transactions, … Now what??!!!

22 22 Database Design Steps Three Levels of Modeling info Conceptual Data Model Logical Data Model Physical Data Model Conceptual DB design Logical DB design Physical DB design Entity-relationship Model Typically used for conceptual database design Relational Model Typically used for logical database design

23 Entity-Relationship Model Two key concepts Entities: An object that exists and is distinguishable from other objects  Examples: Bob Smith, BofA, CMSC424 Have attributes (people have names and addresses) Form entity sets with other entities of the same type that share the same properties  Set of all people, set of all classes Entity sets may overlap  Customers and Employees

24 Entity-Relationship Model Two key concepts Relationships: Relate 2 or more entities  E.g. Bob Smith has account at College Park Branch Form relationship sets with other relationships of the same type that share the same properties  Customers have accounts at Branches Can have attributes:  has account at may have an attribute start-date Can involve more than 2 entities  Employee works at Branch at Job

25 25 ER Diagram: Starting Example Rectangles: entity sets Diamonds: relationship sets Ellipses: attributes customer has cust-street cust-id cust-name cust-city account balance number access-date

26 Rest of the class Details of the ER Model How to represent various types of constraints/semantic information etc. Design issues A detailed example

27 Next: Relationship Cardinalities We may know: One customer can only open one account OR One customer can open multiple accounts Representing this is important Why ? Better manipulation of data If former, can store the account info in the customer table Can enforce such a constraint Application logic will have to do it; NOT GOOD Remember: If not represented in conceptual model, the domain knowledge may be lost

28 Mapping Cardinalities Express the number of entities to which another entity can be associated via a relationship set Most useful in describing binary relationship sets

29 Mapping Cardinalities One-to-One One-to-Many Many-to-One Many-to-Many customer has account customer has account customer has account customer has account

30 Mapping Cardinalities Express the number of entities to which another entity can be associated via a relationship set Most useful in describing binary relationship sets N-ary relationships ? More complicated Details in the book

31 Next: Types of Attributes Simple vs Composite Single value per attribute ? Single-valued vs Multi-valued E.g. Phone numbers are multi-valued Derived If date-of-birth is present, age can be derived Can help in avoiding redundancy, enforcing constraints etc…

32 Types of Attributes customer has cust-street cust-id cust-name cust-city account balance number access-date

33 Types of Attributes customer cust-street cust-id cust-name cust-city has account balance number access-date phone no. date-of-birth age month dayyear Composite Attribute

34 Next: Keys Key = set of attributes that uniquely identifies an entity or a relationship

35 customer cust-street cust-id cust-name cust-city phone no. age date-of-birth Possible Keys: {cust-id} {cust-name, cust-city, cust-street} {cust-id, age} cust-name ?? Probably not. Domain knowledge dependent !! Entity Keys

36 Superkey any attribute set that can distinguish entities Candidate key a minimal superkey Can’t remove any attribute and preserve key-ness  {cust-id, age} not a candidate key  {cust-name, cust-city, cust-street} is  assuming cust-name is not unique Primary key Candidate key chosen as the key by DBA Underlined in the ER Diagram

37 Entity Keys {cust-id} is a natural primary key Typically, SSN forms a good primary key Try to use a candidate key that rarely changes e.g. something involving address not a great idea customer cust-street cust-id cust-name cust-city phone no. age date-of-birth

38 Relationship Set Keys What attributes are needed to represent a relationship completely and uniquely ? Union of primary keys of the entities involved, and relationship attributes {cust-id, access-date, account number} describes a relationship completely customer has cust-id account number access-date

39 Relationship Set Keys Is {cust-id, access-date, account number} a candidate key ? No. Attribute access-date can be removed from this set without losing key-ness In fact, union of primary keys of associated entities is always a superkey customer has cust-id account number access-date

40 Relationship Set Keys Is {cust-id, account-number} a candidate key ? Depends customer has cust-id account number access-date

41 Relationship Set Keys Is {cust-id, account-number} a candidate key ? Depends customer has cust-id account number access-date If one-to-one relationship, either {cust-id} or {account-number} sufficient Since a given customer can only have one account, she can only participate in one relationship Ditto account

42 Relationship Set Keys Is {cust-id, account-number} a candidate key ? Depends customer has cust-id account number access-date If one-to-many relationship (as shown), {account-number} is a candidate key A given customer can have many accounts, but at most one account holder per account allowed

43 Relationship Set Keys General rule for binary relationships one-to-one: primary key of either entity set one-to-many: primary key of the entity set on the many side many-to-many: union of primary keys of the associate entity sets n-ary relationships More complicated rules

44 … What have we been doing Why ? Understanding this is important Rest are details !! That’s what books/manuals are for.


Download ppt "CMSC424: Database Design Instructor: Amol Deshpande"

Similar presentations


Ads by Google