Published by India Rimmer. Modified over 9 years ago.
1
Introduction to Database Design
Donghui Zhang
CCIS, Northeastern University
2
Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming
3
Database, DBMS
- A Database is a very large, integrated collection of data.
- A Database Management System (DBMS) is software designed to store and manage databases.
- A Database Application is software that enables users to access the database.
4
Why DBMS?
- We currently live in a world experiencing an information explosion.
- To manage the huge amount of data: DBMS
- The total RDBMS market in 2003 was $7 billion in license revenues.
- Much more money was spent to develop database applications.
5
Total revenue: $7.1 billion in 2003.
6
- The worldwide database management software market saw double-digit growth in 2004.
- The five-year forecast calls for a compound annual growth rate of nearly 6 percent, bringing the market to $12.7 billion in new license revenue by 2009.
Title: Forecast: Database Management Systems Software, Worldwide, 2003-2009
Author: Colleen Graham, Gartner
Time: April 21, 2005
8
DBMS can Provide …
- Data independence and efficient access.
- Reduced application development time.
- Data integrity and security.
- Uniform data administration.
- Concurrent access, recovery from crashes.
9
DBMS Historic Points
- The first DBMS was developed by Turing Award winner Charles Bachman in the early 1960s.
- In 1970, Turing Award winner Edgar Codd proposed the relational data model.
- In the 1970s, IBM developed SQL; ANSI first standardized it in 1986.
10
Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming
11
Components of Data-Intensive Systems
Three separate types of functionality:
- Data management
- Application logic
- Presentation
12
Example: Course Enrollment
Build a system that students can use to enroll in courses:
- Data Management
  - Student info, course info, instructor info, course availability, pre-requisites, etc.
- Application Logic
  - Logic to add a course, drop a course, create a new course, etc.
- Presentation
  - Log in different users (students, staff, faculty), display forms and human-readable output
13
The Three-Tier Architecture
- Presentation tier: Client Program (Web Browser)
- Middle tier: Application Server
- Data management tier: Database System
14
E.g. What we use
- Presentation tier: Client Program (Web Browser)
- Middle tier: Application Server (Apache, JSP)
- Data management tier: Database System (MySQL)
15
HTML: An Example
<HTML>
Barns and Nobble Internet Bookstore
Our inventory:
Science
The Character of Physical Law
Author: Richard Feynman
Published 1980
<LI>Hardcover</LI>
Fiction
Waiting for the Mahatma
Author: R.K. Narayan
Published 1981
The English Teacher
Author: R.K. Narayan
Published 1980
<LI>Paperback</LI>
</HTML>
16
HTML: static vs dynamic
- Static: you create an HTML file which is sent to the client’s web browser upon request. E.g.:
  - your CCIS login is ‘donghui’,
  - your HTML file is /home/donghui/.www/index.html
  - The URL is http://www.ccs.neu.edu/home/donghui
- Dynamic: the HTML file is generated dynamically via your ASP.NET code.
17
Another View
- Machine 1: MySQL (your database)
- Machine 2: Apache (your JSP code)
- Client machines: client browser 1, client browser 2, client browser 3
18
Client-Server Architecture
- Data management: DBMS at the server.
- Presentation: client program.
- Application logic: can go either way.
  - If combined with the server: thin-client architecture
  - If combined with the client: thick-client architecture
19
Thin-Client Architecture
- Database server and web server are too closely coupled.
  - E.g., the application logic cannot access multiple databases on different servers.
20
Thick-Client Architecture
- No central place to update the business logic
- Security issues: the server needs to trust clients
- Does not scale to more than a few hundred clients
21
Advantages of the Three-Tier Architecture
- Heterogeneous systems
  - Tiers can be independently maintained, modified, and replaced
- Thin clients
  - Only presentation layer at clients (web browsers)
- Integrated data access
  - Several database systems can be handled transparently at the middle tier
  - Central management of connections
- Scalability
  - Replication at middle tier permits scalability of business logic
- Software development
  - Code for business logic is centralized
  - Interaction between tiers through well-defined APIs: can reuse standard components at each tier
22
Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming
23
ER-Model
- Entity: a real-world object distinguishable from other objects. E.g., Students, Courses.
- An entity has multiple attributes. E.g., Students have ssn, name, phone.
- Entities have relationships with each other. E.g., Students enroll in Courses.
24
Example of ER Diagram
- Entity Students: ssn, name, phone
- Entity Courses: cid, title, unit
- Relationship Enroll (between Students and Courses): time
To implement the above design, store three tables in the database.
25
Students
ssn   name    phone
1111  John    617-373-5120
2222  Alice   781-322-6084
3333  Victor  617-442-7798

Courses
cid     title                   unit
CSU430  Database Design         4
CSG131  Transaction Processing  4
CSG339  Data Mining             4

Enroll
ssn   cid     time
1111  CSU430  Fall’03
1111  CSG339  Spring’04
2222  CSG131  Winter’03
2222  CSG339  Spring’04
3333  CSU430  Winter’01
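The three tables above can be written down as a schema and loaded with the same rows. This is only a minimal sketch: it assumes Python's built-in sqlite3 module with an in-memory database as a stand-in for the MySQL server used in these slides, and the column types and key declarations are assumptions, since the slides give only column names.

```python
import sqlite3

# In-memory SQLite database (an assumption; the slides use MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (ssn INTEGER PRIMARY KEY, name TEXT, phone TEXT);
    CREATE TABLE Courses  (cid TEXT PRIMARY KEY, title TEXT, unit INTEGER);
    -- Enroll is the relationship table: one row per (student, course) pair.
    -- The composite primary key is an assumption, not stated in the slides.
    CREATE TABLE Enroll (
        ssn  INTEGER REFERENCES Students(ssn),
        cid  TEXT    REFERENCES Courses(cid),
        time TEXT,
        PRIMARY KEY (ssn, cid)
    );
""")

conn.executemany("INSERT INTO Students VALUES (?, ?, ?)", [
    (1111, "John", "617-373-5120"),
    (2222, "Alice", "781-322-6084"),
    (3333, "Victor", "617-442-7798"),
])
conn.executemany("INSERT INTO Courses VALUES (?, ?, ?)", [
    ("CSU430", "Database Design", 4),
    ("CSG131", "Transaction Processing", 4),
    ("CSG339", "Data Mining", 4),
])
conn.executemany("INSERT INTO Enroll VALUES (?, ?, ?)", [
    (1111, "CSU430", "Fall'03"),
    (1111, "CSG339", "Spring'04"),
    (2222, "CSG131", "Winter'03"),
    (2222, "CSG339", "Spring'04"),
    (3333, "CSU430", "Winter'01"),
])

# Which courses is John enrolled in? Join all three tables through Enroll.
john_courses = [row[0] for row in conn.execute(
    "SELECT C.title FROM Students S, Enroll E, Courses C "
    "WHERE S.ssn = E.ssn AND E.cid = C.cid AND S.name = ? "
    "ORDER BY C.title",
    ("John",),
)]
print(john_courses)
```

Because Enroll is many-to-many, it needs its own table: neither Students nor Courses has a single column that could hold all of the matching rows on the other side.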
26
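Key Constraint in ER Diagram
- Entity Students: ssn, name, phone
- Entity Departments: did, dname, address
- Relationship BelongsTo (between Students and Departments)
Many-to-one relationship: no need to be implemented as a table! Each student belongs to at most one department, so the department's did can be stored as a column of Students.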
27
Students
ssn   name    phone         did
1111  John    617-373-5120  1
2222  Alice   781-322-6084  1
3333  Victor  617-442-7798  3

Departments
did  dname                   address
1    Computer Science        #161 Cullinane
2    Electrical Engineering  #300 Egan
3    Physics                 #112 Richard
28
Some Other Design Concepts
- Primary key
- Participation constraint
- Normal forms (BCNF, 3NF, etc.)
- IS-A hierarchy
- Ternary relationships
29
Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming
30
SQL Query
Find the students in the Computer Science Department.
If we know the did is 1:
SELECT S.name
FROM Students S
WHERE S.did=1
Otherwise:
SELECT S.name
FROM Students S, Departments D
WHERE D.did=S.did AND D.dname='Computer Science'
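Both forms of the query can be tried against the Students and Departments rows shown earlier. The sketch below assumes Python's built-in sqlite3 module and an in-memory database in place of MySQL; the column types are assumptions.

```python
import sqlite3

# In-memory SQLite database (an assumption; the slides use MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname TEXT, address TEXT);
    -- The many-to-one BelongsTo relationship is just the did column here.
    CREATE TABLE Students (
        ssn INTEGER PRIMARY KEY, name TEXT, phone TEXT,
        did INTEGER REFERENCES Departments(did)
    );
    INSERT INTO Departments VALUES
        (1, 'Computer Science', '#161 Cullinane'),
        (2, 'Electrical Engineering', '#300 Egan'),
        (3, 'Physics', '#112 Richard');
    INSERT INTO Students VALUES
        (1111, 'John', '617-373-5120', 1),
        (2222, 'Alice', '781-322-6084', 1),
        (3333, 'Victor', '617-442-7798', 3);
""")

# Case 1: the department id is already known.
by_id = sorted(row[0] for row in conn.execute(
    "SELECT S.name FROM Students S WHERE S.did = 1"))

# Case 2: only the department name is known, so join with Departments.
by_name = sorted(row[0] for row in conn.execute(
    "SELECT S.name FROM Students S, Departments D "
    "WHERE D.did = S.did AND D.dname = 'Computer Science'"))

print(by_id, by_name)
```

The two queries return the same students; the second simply lets the DBMS look up the did from the department name.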
31
SQL in Application Code
- SQL commands can be called from within a host language (e.g., C++, Java) program.
- Two main integration approaches:
  - Embed SQL in the host language (Embedded SQL, SQLJ)
  - Create special API to call SQL commands (JDBC)
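The second approach (an API for issuing SQL) can be illustrated with Python's DB-API, which plays a role comparable to JDBC in Java. This is a sketch under assumptions: sqlite3 stands in for MySQL, and the function names add_course and drop_course are hypothetical, chosen to echo the course enrollment example.

```python
import sqlite3

def add_course(conn, ssn, cid, time):
    """Enroll a student. The '?' placeholders pass values separately from the SQL text."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO Enroll (ssn, cid, time) VALUES (?, ?, ?)",
            (ssn, cid, time),
        )

def drop_course(conn, ssn, cid):
    """Remove an enrollment and report how many rows were deleted."""
    with conn:
        cur = conn.execute(
            "DELETE FROM Enroll WHERE ssn = ? AND cid = ?", (ssn, cid))
    return cur.rowcount

# In-memory database with only the table this sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Enroll (ssn INTEGER, cid TEXT, time TEXT)")

add_course(conn, 1111, "CSU430", "Fall'03")
add_course(conn, 1111, "CSG339", "Spring'04")
dropped = drop_course(conn, 1111, "CSU430")
remaining = conn.execute(
    "SELECT cid FROM Enroll WHERE ssn = ?", (1111,)).fetchall()
print(dropped, remaining)
```

In a three-tier setup, functions like these would live in the middle tier, so the client never sends SQL directly to the database.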
32
Implementation of Database System
Introduction
Donghui Zhang
Partially using Prof. Hector Garcia-Molina’s slides (Notes01)
http://www-db.stanford.edu/~ullman/dscb.html
33
Isn’t Implementing a Database System Simple?
Relations, Statements, Results
34
Introducing the Database Management System
- The latest from Megatron Labs
- Incorporates latest relational technology
- UNIX compatible
35
Megatron 3000 Implementation Details
- Relations stored in files (ASCII)
  e.g., relation R is in /usr/db/R
  Smith # 123 # CS
  Jones # 522 # EE
  ...
36
Megatron 3000 Implementation Details
- Directory file (ASCII) in /usr/db/directory
  R1 # A # INT # B # STR …
  R2 # C # STR # A # INT …
  ...
37
Megatron 3000 Sample Sessions
% MEGATRON3000
Welcome to MEGATRON 3000!
&
...
& quit
%
38
Megatron 3000 Sample Sessions
& select * from R #
Relation R
A      B    C
SMITH  123  CS
&
39
Megatron 3000 Sample Sessions
& select A,B from R,S where R.A = S.A and S.C > 100 #
A    B
123  CAR
522  CAT
&
40
Megatron 3000
To execute “select * from R where condition”:
(1) Read directory file to get R attributes
(2) Read R file, for each line:
    (a) Check condition
    (b) If OK, display
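The two steps above can be sketched in a few lines. This is an illustration, not the Megatron code: it assumes a temporary directory in place of /usr/db, the '#'-separated layout shown on the earlier slides, and a Python function as the condition; the helper names are hypothetical.

```python
import os
import tempfile

def split_fields(line):
    """Split one '#'-separated ASCII line into trimmed fields."""
    return [field.strip() for field in line.split("#")]

def read_schema(db_dir, relation):
    """Step 1: scan the directory file for the relation's attribute names."""
    with open(os.path.join(db_dir, "directory")) as f:
        for line in f:
            fields = split_fields(line)
            if fields[0] == relation:
                return fields[1::2]  # names at odd positions, types at even
    raise KeyError(relation)

def select_all(db_dir, relation, condition):
    """Step 2: scan the whole relation file and keep lines passing the condition."""
    attrs = read_schema(db_dir, relation)
    result = []
    with open(os.path.join(db_dir, relation)) as f:
        for line in f:
            row = dict(zip(attrs, split_fields(line)))
            if condition(row):       # (a) check condition
                result.append(row)   # (b) "display" (collected here)
    return result

# Set up a tiny database directory (a stand-in for /usr/db).
db_dir = tempfile.mkdtemp()
with open(os.path.join(db_dir, "directory"), "w") as f:
    f.write("R # A # STR # B # INT # C # STR\n")
with open(os.path.join(db_dir, "R"), "w") as f:
    f.write("Smith # 123 # CS\nJones # 522 # EE\n")

matches = select_all(db_dir, "R", lambda row: row["C"] == "CS")
print(matches)
```

Every query reads the entire file, however few rows match, which is the cost the later "What’s wrong" slides point at.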
41
Megatron 3000
To execute “select A,B from R,S where condition”:
(1) Read directory file to get R,S attributes
(2) Read R file, for each line:
    (a) Read S file, for each line:
        (i) Create join tuple
        (ii) Check condition
        (iii) Display if OK
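The nested loops above can be sketched as follows. The rows of R and S are hypothetical, chosen so that the result matches the sample session output (123 CAR, 522 CAT); lists of text lines stand in for the ASCII files, and the attribute lists stand in for what the directory file would supply.

```python
def split_fields(line):
    """Split one '#'-separated ASCII line into trimmed fields."""
    return [field.strip() for field in line.split("#")]

def nested_loop_join(r_lines, r_attrs, s_lines, s_attrs, condition):
    """For every line of R, rescan all of S, build the join tuple, then test it."""
    output = []
    pairs_examined = 0
    for r_line in r_lines:
        r_row = {"R." + a: v for a, v in zip(r_attrs, split_fields(r_line))}
        for s_line in s_lines:           # S is read again for each R line
            s_row = {"S." + a: v for a, v in zip(s_attrs, split_fields(s_line))}
            joined = {**r_row, **s_row}  # (i) create join tuple
            pairs_examined += 1
            if condition(joined):        # (ii) check condition
                output.append(joined)    # (iii) display if OK
    return output, pairs_examined

# Hypothetical contents of the R and S files.
r_lines = ["123 # CAR", "522 # CAT", "999 # DOG"]
s_lines = ["123 # 150", "522 # 200", "999 # 50"]

joined_rows, pairs_examined = nested_loop_join(
    r_lines, ["A", "B"], s_lines, ["A", "C"],
    lambda t: t["R.A"] == t["S.A"] and int(t["S.C"]) > 100,
)
# Project onto A and B, as in "select A,B".
answer = [(t["R.A"], t["R.B"]) for t in joined_rows]
print(answer, pairs_examined)
```

With 3 lines in each file, 9 join tuples are built to produce 2 answers; the work grows with the product of the two file sizes.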
42
What’s wrong with the Megatron 3000 DBMS?
- Expensive update and search, e.g.:
  - To locate an employee with a given SSN: file scan.
  - To change “Cat” to “Cats”: complete file rewrite.
Solution: Indexing!
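The difference between a file scan and an indexed lookup can be shown with a small in-memory sketch. The employee rows are made up for illustration, and a Python dict plays the role of the index; real DBMSs typically keep disk-based structures such as B+-trees instead.

```python
def scan_lookup(rows, ssn):
    """File-scan style search: examine rows one by one until the SSN matches."""
    examined = 0
    for row in rows:
        examined += 1
        if row["ssn"] == ssn:
            return row, examined
    return None, examined

def build_index(rows):
    """Map each SSN to the position of its row (a simple hash index)."""
    return {row["ssn"]: position for position, row in enumerate(rows)}

# Hypothetical employee table with 10,000 rows.
rows = [{"ssn": n, "name": f"emp{n}"} for n in range(1, 10001)]
index = build_index(rows)

row_by_scan, rows_examined = scan_lookup(rows, 10000)
row_by_index = rows[index[10000]]  # one index probe, then one row access
print(rows_examined, row_by_index)
```

The index costs extra space and must be maintained on every update, but it turns a search over the whole file into a direct lookup.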
43
What’s wrong with the Megatron 3000 DBMS?
- Brute-force query processing
  e.g., select * from R,S where R.A = S.A and S.B > 1000
  - Do the selection first?
  - Use a more efficient join?
Solution: Query optimization!
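The two questions on this slide can be made concrete by comparing two plans for the same query. The data is hypothetical, and the hash join is one example of a more efficient join, not necessarily what a given optimizer would choose; the work counters simply count tuples or tuple pairs touched.

```python
# Hypothetical relations: R(A) and S(A, B), 100 rows each.
R = [{"A": i} for i in range(100)]
S = [{"A": i, "B": i * 20} for i in range(100)]

def brute_force(R, S):
    """Pair every R row with every S row, then test the whole condition."""
    out, work = [], 0
    for r in R:
        for s in S:
            work += 1
            if r["A"] == s["A"] and s["B"] > 1000:
                out.append((r["A"], s["B"]))
    return out, work

def select_then_hash_join(R, S):
    """Apply S.B > 1000 first, then join through a hash table on S.A."""
    work = 0
    by_a = {}
    for s in S:                      # selection pass over S
        work += 1
        if s["B"] > 1000:
            by_a.setdefault(s["A"], []).append(s)
    out = []
    for r in R:                      # one hash probe per R row
        work += 1
        for s in by_a.get(r["A"], []):
            out.append((r["A"], s["B"]))
    return out, work

slow_result, slow_work = brute_force(R, S)
fast_result, fast_work = select_then_hash_join(R, S)
print(len(slow_result), slow_work, fast_work)
```

Both plans return the same 49 rows, but the second touches 200 tuples instead of 10,000 pairs. Choosing between such equivalent plans is the job of the query optimizer.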
44
What’s wrong with the Megatron 3000 DBMS?
- No concurrency control or reliability, e.g.:
  - Two client programs both read your bank balance ($5000) and each add $1000 to it…
  - Crash.
Solution: Transaction management!
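The bank balance example is the classic lost update. The sketch below first replays the bad interleaving by hand so the outcome is deterministic, then shows the same two deposits protected by a lock. The Account class and function names are hypothetical, and a lock is only a stand-in for the isolation a transaction manager provides; crash recovery is not covered here.

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

def interleaved_deposits(account, amount):
    """Both clients read before either writes, so one deposit is lost."""
    read_by_client_1 = account.balance
    read_by_client_2 = account.balance
    account.balance = read_by_client_1 + amount
    account.balance = read_by_client_2 + amount  # overwrites client 1's update

def deposit_atomically(account, amount):
    """Read and write under one lock, so deposits cannot interleave."""
    with account.lock:
        account.balance = account.balance + amount

# Without concurrency control: $5000 + $1000 + $1000 ends up as $6000.
unsafe = Account(5000)
interleaved_deposits(unsafe, 1000)

# With the read-modify-write made atomic: both deposits are kept.
safe = Account(5000)
clients = [threading.Thread(target=deposit_atomically, args=(safe, 1000))
           for _ in range(2)]
for t in clients:
    t.start()
for t in clients:
    t.join()

print(unsafe.balance, safe.balance)
```

A DBMS gives the same guarantee without the application managing locks itself: each deposit runs as a transaction, and the system makes concurrent transactions behave as if they ran one after another.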