Introduction to Database Design Donghui Zhang CCIS, Northeastern University.


1 Introduction to Database Design Donghui Zhang CCIS, Northeastern University

2 Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming

3 Database, DBMS
- A Database is a very large, integrated collection of data.
- A Database Management System (DBMS) is software designed to store and manage databases.
- A Database Application is software that enables users to access the database.

4 Why DBMS?
- We currently live in a world experiencing an information explosion.
- To manage the huge amount of data: DBMS.
- The total RDBMS market in 2003 was $7 billion in license revenues.
- Much more money was spent to develop database applications.

5 Total revenue: $7.1 billion in 2003.

6
- The worldwide database management software market saw double-digit growth in 2004.
- The five-year forecast calls for a compound annual growth rate of nearly 6 percent, bringing the market to $12.7 billion in new license revenue by 2009.
- Title: Forecast: Database Management Systems Software, Worldwide, 2003-2009
- Author: Colleen Graham, Gartner
- Time: April 21, 2005

7

8 DBMS can Provide …
- Data independence and efficient access.
- Reduced application development time.
- Data integrity and security.
- Uniform data administration.
- Concurrent access, recovery from crashes.

9 DBMS Historic Points
- The first DBMS was developed by Turing Award winner Charles Bachman in the early 1960s.
- In 1970, Turing Award winner Edgar Codd proposed the relational data model.
- In the 1970s, IBM developed SQL (originally called SEQUEL); it became an ANSI standard in 1986.

10 Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming

11 Components of Data-Intensive Systems
Three separate types of functionality:
- Data management
- Application logic
- Presentation

12 Example: Course Enrollment
Build a system with which students can enroll in courses:
- Data Management
  - Student info, course info, instructor info, course availability, pre-requisites, etc.
- Application Logic
  - Logic to add a course, drop a course, create a new course, etc.
- Presentation
  - Log in different users (students, staff, faculty), display forms and human-readable output

13 The Three-Tier Architecture
- Presentation tier: Client Program (Web Browser)
- Middle tier: Application Server
- Data management tier: Database System
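The separation of tiers can be sketched in a few lines of Python using the course-enrollment example. This is only an illustration under assumptions: the function names (`open_database`, `add_course`, `drop_course`, `render_schedule`) are hypothetical, an in-memory SQLite database stands in for the data management tier, and all three tiers run in one process, whereas the slides place them on separate machines.

```python
import sqlite3

# Data management tier: an in-memory SQLite database stands in for the DBMS.
def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Enroll (ssn INTEGER, cid TEXT, PRIMARY KEY (ssn, cid))"
    )
    return conn

# Middle tier: application logic (add a course, drop a course).
def add_course(conn, ssn, cid):
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO Enroll VALUES (?, ?)", (ssn, cid))
        return True
    except sqlite3.IntegrityError:
        return False  # the student is already enrolled in this course

def drop_course(conn, ssn, cid):
    with conn:
        cur = conn.execute(
            "DELETE FROM Enroll WHERE ssn = ? AND cid = ?", (ssn, cid)
        )
    return cur.rowcount == 1

# Presentation tier: turn data into human-readable output.
def render_schedule(conn, ssn):
    cids = [row[0] for row in conn.execute(
        "SELECT cid FROM Enroll WHERE ssn = ? ORDER BY cid", (ssn,)
    )]
    return f"Student {ssn}: " + (", ".join(cids) if cids else "no courses")
```

Because the presentation function only calls into the middle tier and the database, each piece could in principle be replaced without touching the others.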

14 E.g. What we use
- Presentation tier: Client Program (Web Browser)
- Middle tier: Application Server (Apache, running JSP)
- Data management tier: Database System (MySQL)

15 HTML: An Example
<HTML>
<BODY>
<H1>Barns and Nobble Internet Bookstore</H1>
Our inventory:
<H3>Science</H3>
<B>The Character of Physical Law</B>
<UL>
<LI>Author: Richard Feynman</LI>
<LI>Published 1980</LI>
<LI>Hardcover</LI>
</UL>
<H3>Fiction</H3>
<B>Waiting for the Mahatma</B>
<UL>
<LI>Author: R.K. Narayan</LI>
<LI>Published 1981</LI>
</UL>
<B>The English Teacher</B>
<UL>
<LI>Author: R.K. Narayan</LI>
<LI>Published 1980</LI>
<LI>Paperback</LI>
</UL>
</BODY>
</HTML>

16 HTML: static vs dynamic
- Static: you create an HTML file which is sent to the client’s web browser upon request. E.g.:
  - your CCIS login is ‘donghui’,
  - your HTML file is /home/donghui/.www/index.html
  - the URL is http://www.ccs.neu.edu/home/donghui
- Dynamic: the HTML is generated on each request by your server-side code (e.g. JSP or ASP.NET).
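To make the static/dynamic distinction concrete, here is a minimal Python sketch of dynamic generation. It is an illustration, not the course's JSP code: `BOOKS` and `render_inventory` are hypothetical names, and in a real deployment the rows would come from the database tier rather than a hard-coded list.

```python
from html import escape

# Hypothetical inventory; a real page would fetch these rows from the database.
BOOKS = [
    {"title": "The Character of Physical Law", "author": "Richard Feynman", "year": 1980},
    {"title": "Waiting for the Mahatma", "author": "R.K. Narayan", "year": 1981},
]

def render_inventory(books):
    """Build the HTML page at request time from whatever data is current."""
    items = "".join(
        f"<LI>{escape(b['title'])}, {escape(b['author'])}, {b['year']}</LI>"
        for b in books
    )
    return f"<HTML><BODY><H1>Our inventory</H1><UL>{items}</UL></BODY></HTML>"

page = render_inventory(BOOKS)
```

A static file would have to be edited by hand whenever the inventory changes; here the page changes as soon as the data does.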

17 Another View
- Machine 1: MySQL, holding your database
- Machine 2: Apache, running your JSP code
- Client machines: client browser 1, client browser 2, client browser 3

18 Client-Server Architecture
- Data Management: DBMS @ Server.
- Presentation: Client program.
- Application Logic: can go either way.
  - If combined with server: thin-client architecture
  - If combined with client: thick-client architecture

19 Thin-Client Architecture
- Database server and web server are too closely coupled.
  - E.g., the application logic cannot access multiple databases on different servers.

20 Thick-Client Architecture
- No central place to update the business logic
- Security issues: the server needs to trust clients
- Does not scale to more than several hundred clients

21 Advantages of the Three-Tier Architecture
- Heterogeneous systems
  - Tiers can be independently maintained, modified, and replaced
- Thin clients
  - Only presentation layer at clients (web browsers)
- Integrated data access
  - Several database systems can be handled transparently at the middle tier
  - Central management of connections
- Scalability
  - Replication at middle tier permits scalability of business logic
- Software development
  - Code for business logic is centralized
  - Interaction between tiers through well-defined APIs: can reuse standard components at each tier

22 Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming

23 ER-Model
- Entity: a real-world object distinguishable from other objects. E.g. Students, Courses.
- An entity has multiple attributes. E.g. Students have ssn, name, phone.
- Entities have relationships with each other. E.g. Students enroll in Courses.

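24 Example of ER Diagram
- Entity Students: attributes ssn, name, phone
- Entity Courses: attributes cid, title, unit
- Relationship Enroll (between Students and Courses): attribute time
To implement the above design, store three tables in the database.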

25
Students
ssn   name    phone
1111  John    617-373-5120
2222  Alice   781-322-6084
3333  Victor  617-442-7798

Courses
cid     title                   unit
CSU430  Database Design         4
CSG131  Transaction Processing  4
CSG339  Data Mining             4

Enroll
ssn   cid     time
1111  CSU430  Fall’03
1111  CSG339  Spring’04
2222  CSG131  Winter’03
2222  CSG339  Spring’04
3333  CSU430  Winter’01
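The three tables can be created and queried with a few lines of Python and SQLite. This is a sketch under assumptions: the column types and the primary-key choices (in particular `(ssn, cid)` for Enroll) are not stated on the slides, and SQLite stands in for the MySQL server used in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (ssn INTEGER PRIMARY KEY, name TEXT, phone TEXT);
CREATE TABLE Courses  (cid TEXT PRIMARY KEY, title TEXT, unit INTEGER);
-- The many-to-many relationship Enroll becomes its own table.
-- Assumption: a student enrolls in a given course at most once.
CREATE TABLE Enroll (
    ssn  INTEGER REFERENCES Students(ssn),
    cid  TEXT    REFERENCES Courses(cid),
    time TEXT,
    PRIMARY KEY (ssn, cid)
);
""")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)", [
    (1111, "John", "617-373-5120"),
    (2222, "Alice", "781-322-6084"),
    (3333, "Victor", "617-442-7798"),
])
conn.executemany("INSERT INTO Courses VALUES (?, ?, ?)", [
    ("CSU430", "Database Design", 4),
    ("CSG131", "Transaction Processing", 4),
    ("CSG339", "Data Mining", 4),
])
conn.executemany("INSERT INTO Enroll VALUES (?, ?, ?)", [
    (1111, "CSU430", "Fall'03"), (1111, "CSG339", "Spring'04"),
    (2222, "CSG131", "Winter'03"), (2222, "CSG339", "Spring'04"),
    (3333, "CSU430", "Winter'01"),
])

# Which courses has John enrolled in? Join through the Enroll table.
rows = conn.execute(
    "SELECT C.title FROM Students S "
    "JOIN Enroll E ON S.ssn = E.ssn "
    "JOIN Courses C ON C.cid = E.cid "
    "WHERE S.name = ? ORDER BY C.title",
    ("John",),
).fetchall()
titles = [r[0] for r in rows]
```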

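26 Key Constraint in ER Diagram
- Entity Students: attributes ssn, name, phone
- Entity Departments: attributes did, dname, address
- Relationship BelongsTo (between Students and Departments)
Many-to-one relationship: it does not need its own table! The department key (did) can be stored directly in Students.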

27
Students
ssn   name    phone         did
1111  John    617-373-5120  1
2222  Alice   781-322-6084  1
3333  Victor  617-442-7798  3

Departments
did  dname                   address
1    Computer Science        #161 Cullinane
2    Electrical Engineering  #300 Egan
3    Physics                 #112 Richard
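A short sketch of the many-to-one design in Python and SQLite: the relationship BelongsTo is represented by the `did` column in Students rather than by a separate table. The column types, the foreign-key declaration, and the helper `try_insert` are assumptions made for illustration; the extra students in the usage note are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname TEXT, address TEXT);
-- Each student row carries at most one did, which is exactly the key constraint.
CREATE TABLE Students (
    ssn   INTEGER PRIMARY KEY,
    name  TEXT,
    phone TEXT,
    did   INTEGER REFERENCES Departments(did)
);
INSERT INTO Departments VALUES
    (1, 'Computer Science', '#161 Cullinane'),
    (2, 'Electrical Engineering', '#300 Egan'),
    (3, 'Physics', '#112 Richard');
INSERT INTO Students VALUES
    (1111, 'John', '617-373-5120', 1),
    (2222, 'Alice', '781-322-6084', 1),
    (3333, 'Victor', '617-442-7798', 3);
""")

def try_insert(ssn, name, did):
    """Hypothetical helper: returns False if the department does not exist."""
    try:
        conn.execute(
            "INSERT INTO Students (ssn, name, did) VALUES (?, ?, ?)",
            (ssn, name, did),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

For example, `try_insert(4444, "Dana", 2)` succeeds, while `try_insert(5555, "Eve", 9)` is rejected because no department has did 9.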

28 Some Other Design Concepts
- Primary key
- Participation constraint
- Normal forms (BCNF, 3-NF, etc.)
- IS-A hierarchy
- Ternary relationships

29 Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming

30 SQL Query
Find the students in the Computer Science Department.
If we know the did is 1:
SELECT S.name
FROM Students S
WHERE S.did=1
Otherwise:
SELECT S.name
FROM Students S, Departments D
WHERE D.did=S.did AND D.dname='Computer Science'
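Both queries can be tried against the sample Students and Departments data. The sketch below uses Python with an in-memory SQLite database as a stand-in for MySQL; the `ORDER BY` clause is an addition so that the result order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname TEXT, address TEXT);
CREATE TABLE Students (ssn INTEGER PRIMARY KEY, name TEXT, phone TEXT, did INTEGER);
INSERT INTO Departments VALUES
    (1, 'Computer Science', '#161 Cullinane'),
    (2, 'Electrical Engineering', '#300 Egan'),
    (3, 'Physics', '#112 Richard');
INSERT INTO Students VALUES
    (1111, 'John', '617-373-5120', 1),
    (2222, 'Alice', '781-322-6084', 1),
    (3333, 'Victor', '617-442-7798', 3);
""")

# Case 1: the department id is already known.
by_id = conn.execute(
    "SELECT S.name FROM Students S WHERE S.did = 1 ORDER BY S.name"
).fetchall()

# Case 2: only the department name is known, so join with Departments.
by_name = conn.execute(
    "SELECT S.name FROM Students S, Departments D "
    "WHERE D.did = S.did AND D.dname = 'Computer Science' "
    "ORDER BY S.name"
).fetchall()
```

Both `by_id` and `by_name` contain Alice and John, since those two students have did 1.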

31 SQL in Application Code
- SQL commands can be called from within a host language (e.g., C++, Java) program.
- Two main integration approaches:
  - Embed SQL in the host language (Embedded SQL, SQLJ)
  - Create special API to call SQL commands (JDBC)
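The API approach can be illustrated in Python, whose standard `sqlite3` module plays a role similar to JDBC in Java: the program passes SQL text and parameters to library calls. This is an analogy rather than JDBC itself, and `students_in_department` is a hypothetical function name.

```python
import sqlite3

def students_in_department(conn, dname):
    """Call SQL through the API; the ? placeholder binds dname as data, not as SQL text."""
    cursor = conn.execute(
        "SELECT S.name FROM Students S, Departments D "
        "WHERE D.did = S.did AND D.dname = ? ORDER BY S.name",
        (dname,),
    )
    return [name for (name,) in cursor]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE Students (ssn INTEGER PRIMARY KEY, name TEXT, did INTEGER);
INSERT INTO Departments VALUES (1, 'Computer Science'), (3, 'Physics');
INSERT INTO Students VALUES (1111, 'John', 1), (2222, 'Alice', 1), (3333, 'Victor', 3);
""")

cs_students = students_in_department(conn, "Computer Science")
```

Binding the department name as a parameter, instead of pasting it into the SQL string, means that an input such as `x' OR '1'='1` is treated as an ordinary (non-matching) name.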

32 Implementation of Database System: Introduction
Donghui Zhang
Partially using Prof. Hector Garcia-Molina’s slides (Notes01)
http://www-db.stanford.edu/~ullman/dscb.html

33 Isn’t Implementing a Database System Simple?
Diagram labels: Relations, Statements, Results

34 Introducing the Database Management System
- The latest from Megatron Labs
- Incorporates latest relational technology
- UNIX compatible

35 Megatron 3000 Implementation Details
- Relations stored in files (ASCII)
  e.g., relation R is in /usr/db/R
    Smith # 123 # CS
    Jones # 522 # EE
    ...

36 Megatron 3000 Implementation Details
- Directory file (ASCII) in /usr/db/directory
    R1 # A # INT # B # STR …
    R2 # C # STR # A # INT …
    ...

37 Megatron 3000 Sample Sessions
% MEGATRON3000
Welcome to MEGATRON 3000!
&
...
& quit
%

38 Megatron 3000 Sample Sessions
& select * from R #
Relation R
A      B    C
SMITH  123  CS
&

39 Megatron 3000 Sample Sessions
& select A,B from R,S where R.A = S.A and S.C > 100 #
A    B
123  CAR
522  CAT
&

40 Megatron 3000
To execute “select * from R where condition”:
(1) Read directory file to get R attributes
(2) Read R file, for each line:
    (a) Check condition
    (b) If OK, display
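The two steps above can be sketched in Python. Everything here is an assumption made for illustration: in-memory strings stand in for the files under /usr/db, the directory line for R is invented to match the three-column sample data, and the names `read_schema` and `select_all` are hypothetical.

```python
import io

# Stand-ins for the directory file and the file holding relation R.
DIRECTORY = "R # A # STR # B # INT # C # STR\n"
R_FILE = "Smith # 123 # CS\nJones # 522 # EE\n"

def read_schema(directory_file, relation):
    """Step 1: read the directory file to get the relation's attributes."""
    for line in directory_file:
        parts = [p.strip() for p in line.split("#")]
        if parts[0] == relation:
            # After the relation name, fields alternate: name, type, name, type, ...
            return list(zip(parts[1::2], parts[2::2]))
    raise KeyError(relation)

def select_all(directory_file, relation_file, relation, condition):
    """Step 2: scan every line of the relation file and keep rows passing the condition."""
    schema = read_schema(directory_file, relation)
    result = []
    for line in relation_file:
        if not line.strip():
            continue
        raw = [v.strip() for v in line.split("#")]
        row = {name: int(v) if typ == "INT" else v
               for (name, typ), v in zip(schema, raw)}
        if condition(row):
            result.append(row)
    return result

rows = select_all(io.StringIO(DIRECTORY), io.StringIO(R_FILE), "R",
                  lambda r: r["B"] > 200)
```

Note that every query reads the whole file, no matter how few rows match.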

41 Megatron 3000
To execute “select A,B from R,S where condition”:
(1) Read directory file to get R,S attributes
(2) Read R file, for each line:
    (a) Read S file, for each line:
        (i) Create join tuple
        (ii) Check condition
        (iii) Display if OK
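This procedure is a nested-loop join. A minimal Python sketch follows; the contents of R and S are hypothetical (chosen so that the query from the sample session returns the rows 123 CAR and 522 CAT), strings stand in for the files, and the schema lookup is omitted to keep the example short.

```python
import io

R_FILE = "123 # CAR\n522 # CAT\n777 # DOG\n"   # hypothetical R(A, B)
S_FILE = "123 # 150\n522 # 900\n777 # 50\n"    # hypothetical S(A, C)

def parse(line):
    return [field.strip() for field in line.split("#")]

def nested_loop_join(r_text, s_text, condition):
    output = []
    s_lines_read = 0
    for r_line in io.StringIO(r_text):
        r_a, r_b = parse(r_line)
        # The S "file" is re-read from the start for every line of R.
        for s_line in io.StringIO(s_text):
            s_lines_read += 1
            s_a, s_c = parse(s_line)
            joined = {"R.A": int(r_a), "R.B": r_b,
                      "S.A": int(s_a), "S.C": int(s_c)}  # (i) create join tuple
            if condition(joined):                        # (ii) check condition
                output.append((joined["R.A"], joined["R.B"]))  # (iii) keep if OK
    return output, s_lines_read

result, s_lines_read = nested_loop_join(
    R_FILE, S_FILE,
    lambda t: t["R.A"] == t["S.A"] and t["S.C"] > 100,
)
```

With 3 lines in each file, the inner loop reads 9 lines of S in total; the cost grows with the product of the two file sizes.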

42 What’s wrong with the Megatron 3000 DBMS?
- Expensive update and search, e.g.:
  - To locate an employee with a given SSN: file scan.
  - To change “Cat” to “Cats”: complete file write.
- Solution: Indexing!
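The idea behind indexing can be sketched in Python: scan the file once to record where each key lives, then jump straight to a record instead of scanning. This is a simplification under assumptions: the data and the names `build_index` and `lookup` are hypothetical, and the index is an in-memory dictionary, whereas a real DBMS typically keeps disk-based structures such as B+-trees.

```python
import io

# Stand-in for a relation file; the middle field plays the role of the search key.
DATA = b"Smith # 123 # CS\nJones # 522 # EE\nLee # 901 # ME\n"

def build_index(f, key_field):
    """One full scan: map each key to the byte offset of its line."""
    index = {}
    f.seek(0)
    while True:
        offset = f.tell()
        line = f.readline()
        if not line:
            break
        key = line.decode().split("#")[key_field].strip()
        index[key] = offset
    return index

def lookup(f, index, key):
    """Jump directly to the record; no file scan needed."""
    offset = index.get(key)
    if offset is None:
        return None
    f.seek(offset)
    return [v.strip() for v in f.readline().decode().split("#")]

f = io.BytesIO(DATA)
idx = build_index(f, key_field=1)
record = lookup(f, idx, "522")
```

The scan cost is paid once when the index is built; each later lookup reads a single line.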

43 What’s wrong with the Megatron 3000 DBMS?
- Brute force query processing, e.g.:
  select * from R,S where R.A = S.A and S.B > 1000
  - Do select first?
  - More efficient join?
- Solution: Query optimization!
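The two questions on this slide can be explored with a small Python experiment: apply the selection on S first and replace the nested loop with a hash join, then compare the amount of work. The relations are synthetic and the function names are hypothetical; this shows one possible plan, not how any particular optimizer chooses plans.

```python
# Synthetic data: R(A, X) and S(A, B), 100 tuples each.
R = [(a, f"r{a}") for a in range(100)]
S = [(a, a * 20) for a in range(100)]

def brute_force(R, S):
    """Megatron style: test the full condition on every pair of tuples."""
    pairs, out = 0, []
    for r in R:
        for s in S:
            pairs += 1
            if r[0] == s[0] and s[1] > 1000:
                out.append(r + s)
    return out, pairs

def select_then_hash_join(R, S):
    """Do the select first, then join through a hash table on A."""
    work = 0
    s_by_a = {}
    for s in S:                      # one pass over S, applying S.B > 1000 early
        work += 1
        if s[1] > 1000:
            s_by_a.setdefault(s[0], []).append(s)
    out = []
    for r in R:                      # one pass over R, probing the hash table
        work += 1
        for s in s_by_a.get(r[0], []):
            out.append(r + s)
    return out, work

slow_result, pairs_examined = brute_force(R, S)
fast_result, tuples_touched = select_then_hash_join(R, S)
```

Both plans return the same 49 tuples, but the brute-force plan examines 10,000 pairs while the second plan touches 200 tuples.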

44 What’s wrong with the Megatron 3000 DBMS?
- No concurrency control or reliability, e.g.:
  - If two client programs read your bank balance ($5000) and add $1000 to it…
  - Crash.
- Solution: Transaction management!
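The bank-balance example is the classic lost update. The Python sketch below reproduces it deterministically and then shows one remedy. It rests on assumptions: a single SQLite connection simulates the interleaving of two clients, and the table name `Account` and the helper functions are hypothetical.

```python
import sqlite3

def fresh_account():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO Account VALUES (1, 5000)")
    conn.commit()
    return conn

def balance(conn):
    return conn.execute("SELECT balance FROM Account WHERE id = 1").fetchone()[0]

# Unsafe interleaving: both clients read the balance before either one writes.
conn = fresh_account()
seen_by_a = balance(conn)            # client A reads 5000
seen_by_b = balance(conn)            # client B also reads 5000
conn.execute("UPDATE Account SET balance = ? WHERE id = 1", (seen_by_a + 1000,))
conn.execute("UPDATE Account SET balance = ? WHERE id = 1", (seen_by_b + 1000,))
conn.commit()
lost_update_balance = balance(conn)  # one of the two deposits has vanished

# Remedy: each deposit is a single atomic read-modify-write in its own transaction.
conn = fresh_account()
for _ in range(2):
    with conn:  # commits on success, rolls back if an exception occurs
        conn.execute("UPDATE Account SET balance = balance + 1000 WHERE id = 1")
safe_balance = balance(conn)
```

The unsafe interleaving ends with 6000 instead of 7000. The `with conn:` block also hints at the reliability side: if an error occurs inside it, the partial work is rolled back.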

