ITEC313 Database Programming


ITEC313 Database Programming Lecture 1: Database Design Methodology : Introduction

Learning Objectives Database Design Terminology Purpose of Database Design Phases of Database Design

Database and Database System A database is a shared collection of logically related data designed to meet the information needs of an organization. Components of a Database System: Database, Hardware, Software (DBMS), Users

Database The data in the database will be expected to be both integrated and shared particularly on multi-user systems Integration - The database may be thought of as a unification of several otherwise distinct files, with any redundancy among these files eliminated Shared - individual pieces of data in the database may be shared among several different users

Hardware The secondary storage devices on which the database physically resides, together with the associated I/O devices, device controllers, etc.

DBMS Examples of DBMS Products: Oracle, Informix, Access, DB2, FoxPro, dBase, SQL Server, MySQL

Typical Functions of DBMS Data storage, retrieval and update; a user-accessible catalog; transaction support; concurrency control services; recovery services; authorization services; support for data communication; integrity services; services to promote data independence; utility services

Users Application Programmer - writes programs that use the database Database Designers - designs conceptual and logical database Database Administrator (DBA) Data Administrator End - user - interacts with the system from an on-line terminal by using Query Languages etc.

Data & Database Administration Data Administrator – a business manager responsible for controlling the overall corporate data resources Database Administrator (DBA) - a technical person responsible for development of the total system

Sample Applications Student records; stock control; banking; insurance; billing systems (e.g. electricity, phone, ISPs); accounting systems; reservation systems (e.g. airline, hotel); medical records; personnel systems; product catalogues; telephone directories; train timetables; airline bookings; credit card details; customer histories; stock market prices; discussion boards; web indexes; library catalogues

Advantages Control of data redundancy; data consistency; multipurpose use of data; sharing of data; enforcement of standards; economy of scale; balancing of conflicting user requirements; improved data accessibility and responsiveness; increased productivity; improved maintenance through data independence; increased concurrency; improved backup and recovery services

Disadvantages Complexity Size Cost of DBMS Additional hardware costs Cost of conversion

Database Schema and Database Instance Database schema: the description of the database, also called the intension. It is specified when the database is created and is not expected to change very often. Database instance: the raw data that populates the database at a particular moment in time, also called the extension of the database.
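The distinction can be illustrated with a minimal sketch using Python's standard-library sqlite3 module. The student table, its columns, and the sample names are assumptions made for illustration only; they do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema (intension): specified when the database is created, changes rarely.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def current_schema():
    """Read the stored table definition back from SQLite's catalog."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'student'"
    ).fetchone()[0]

schema_before = current_schema()

# Instance (extension): the rows present at one particular moment.
instance_t1 = conn.execute("SELECT id, name FROM student").fetchall()

conn.execute("INSERT INTO student (id, name) VALUES (1, 'Ayse')")
conn.execute("INSERT INTO student (id, name) VALUES (2, 'Mehmet')")
instance_t2 = conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()

schema_after = current_schema()

print(schema_before == schema_after)  # the schema is unchanged
print(instance_t1, instance_t2)       # the instance differs between the two moments
```

The same schema describes both instances: the empty one at the first moment and the two-row one at the second.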

Data Independence Software maintenance is a large part (50%) of information system budgets. Reduce the impact of changes by separating the database description from applications. Change the database definition with minimal effect on the applications that use the database. Data independence: a database should have an identity separate from the applications (computer programs, forms, and reports) that use it. The separate identity allows the database definition to be changed without affecting related applications. The close association between a database and related programs led to problems in software maintenance. Software maintenance, encompassing requirement changes, corrections, and enhancements, can consume a large fraction of computer budgets. In early DBMSs, most changes to the database definition caused changes to computer programs. In many cases, changes to computer programs involved detailed inspection of the code, a labor-intensive process. This code inspection work is similar to year 2000 compliance work, in which two-digit years in date formats had to be changed to four digits. Performance tuning of a database was difficult because sometimes hundreds of computer programs had to be recompiled for every change. Because database definition changes are common, a large fraction of software maintenance resources was devoted to database changes. Some studies have estimated the share as high as 50% of software maintenance resources.

Three Schema Architecture

Database Architecture External Level – concerned with the way users perceive the database Conceptual Level – concerned with abstract representation of the database in its entirety Internal Level – concerned with the way data is actually stored

Differences among Levels External: course registration form; instructor load assignments. Conceptual: tables (student, course, takes, …). Internal: files needed to store the tables; extra files to improve performance. To make the three schema levels clearer, we can examine differences among database definitions at the three schema levels using examples. Even in a simplified university database, the differences among the schema levels are clear. With a more complex database, the differences would be even more pronounced, with many more views, a much larger conceptual schema, and a more complex internal schema. The schema mappings describe how a schema at a higher level is derived from a schema at a lower level. For example, the external views are derived from the tables in the conceptual schema. The mapping provides the knowledge to convert a request using an external view into a request using the tables in the conceptual schema. The mapping between the conceptual and internal levels shows how entities are stored in files.
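As a rough sketch of the external/conceptual mapping, the snippet below defines the student, course and takes tables named on the slide and derives a view from them. It uses Python's sqlite3 module. The column names, the view name registration_form, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Conceptual level: the tables that describe the whole database.
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (course_id TEXT PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE takes (
    student_id INTEGER REFERENCES student(student_id),
    course_id  TEXT    REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Ayse');
INSERT INTO student VALUES (2, 'Mehmet');
INSERT INTO course  VALUES ('ITEC313', 'Database Programming');
INSERT INTO takes   VALUES (1, 'ITEC313');
INSERT INTO takes   VALUES (2, 'ITEC313');

-- External level: a view shaped like a course registration form.
CREATE VIEW registration_form AS
SELECT s.name AS student_name, c.course_id, c.title
FROM takes t
JOIN student s ON s.student_id = t.student_id
JOIN course  c ON c.course_id  = t.course_id;
""")

# A request phrased against the external view; the DBMS uses the view
# definition (the mapping) to answer it from the conceptual-level tables.
rows = conn.execute(
    "SELECT student_name, course_id FROM registration_form ORDER BY student_name"
).fetchall()
print(rows)
```

The internal level does not appear in the code at all: how SQLite lays the tables out in pages and files is hidden from both the view and the query.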

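Architecture of Db System [Figure: Applications 1, 2 and 3 at the External Level; the DBMS at the Conceptual Level; the Database at the Internal Level. Logical data independence is shown between the external and conceptual levels, physical data independence between the conceptual and internal levels.]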

Data Independence Logical Data Independence – users and user programs are independent of logical structure of the database Physical Data Independence – the separation of structural information about the data from the programs that manipulate and use the data i.e. the immunity of application programs to changes in the storage structure and access strategy

Data Independence Different applications will need different views of the same data, so that if they are not interested in a part of the database, that part need not be included in their view. This feature is also important for controlling access to parts of the database. The DBA must have the freedom to change the storage structure or access strategy in response to changing requirements, without having to modify the existing applications.
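Physical data independence can be sketched in a few lines, again with Python's sqlite3 module. The table, the city column, the index name, and the function students_in are hypothetical. The point is only that the application query stays the same when the access strategy changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ayse", "Famagusta"), (2, "Mehmet", "Nicosia"), (3, "Zeynep", "Famagusta")],
)

QUERY = "SELECT name FROM student WHERE city = ? ORDER BY name"

def students_in(city):
    """Application code: knows table and column names, not how rows are stored."""
    return [name for (name,) in conn.execute(QUERY, (city,))]

before = students_in("Famagusta")

# The DBA changes the access strategy by adding an index on city.
conn.execute("CREATE INDEX idx_student_city ON student(city)")

# The application runs the same query text and gets the same answer.
after = students_in("Famagusta")
print(before == after)
```

Nothing in students_in had to be modified when the index was added, which is the immunity to storage changes that the slide describes.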

Client-Server Architecture Client-Server Architecture: an arrangement of components (clients and servers) and data among computers connected by a network. The client-server architecture supports efficient processing of messages (requests for service) between clients and servers. To improve performance and availability of data, the client-server architecture supports many ways to distribute software and data in a computer network. The simplest scheme is just to place both software and data on the same computer (Figure 13(a)). To take advantage of a network, both software and data can be distributed. In Figure 13(b), the server software and database are located on a remote computer. In Figure 13(c), the server software and database are located on multiple remote computers.

Database Development In the past, many software development projects were unsuccessful because requirements were not properly collected or specified, or because no development methodology was used. The stages in the DB development cycle have been identified. They are clearly specified; they are not strictly sequential, but involve some repetition; and they contain feedback loops (even back to the requirements stage).

Db Development Life Cycle Database planning System definition Requirement collection and analysis Database design DBMS selection Application design Prototyping Implementation Data conversion and loading Testing Operational maintenance

Database Application Lifecycle [Figure: lifecycle stages: DATABASE PLANNING; SYSTEMS DEFINITION; REQUIREMENTS ANALYSIS; DATABASE DESIGN (CONCEPTUAL DESIGN, LOGICAL DESIGN, DISTRIBUTED DB DESIGN (optional), PHYSICAL DESIGN); DBMS SELECTION; APPLICATION DESIGN; PROTOTYPING; IMPLEMENTATION; DATA LOADING; TESTING; MAINTENANCE.] These are the stages in the development cycle. It is important to note that they are not strictly sequential, but involve some repetition. Going back to a previous stage is sometimes called a feedback loop. For example, a design problem might mean going back to the requirements stage and collecting more data.
PLANNING - Identifying how the stages can be completed in the most effective and efficient way.
SYSTEM DEFINITION - Specifying the scope and boundaries of the system.
REQUIREMENTS ANALYSIS - Collecting the requirements of users.
DB DESIGN - The design of the DB itself.
DBMS SELECTION (optional) - Choosing which DBMS software to use.
APPLICATION DESIGN - Designing the functionality of the DB.
PROTOTYPING (optional) - Building a working model.
IMPLEMENTATION - Creating the DB (tables etc.) and the application programs.
DATA CONVERSION & LOADING - Only if replacing an old system.
TESTING - Executing the application programs with the intention of finding errors.
MAINTENANCE - Monitoring, adding new requirements.

Database Application Lifecycle
Database Planning: management activities that allow the stages of the database application to be realized as efficiently as possible.
System Definition: the scope and boundaries of the application, including its major application areas and user groups.
Requirements Analysis: tasks that determine the needs or conditions to meet for a new or altered product, taking account of the possibly conflicting, vague and incomplete requirements of the various stakeholders.
Requirements analysis is critical to the success of a development project: a DB system with inadequate functionality will fail. Requirements must be documented, actionable, measurable, testable, related to identified business needs or opportunities, and defined to a level of detail sufficient for system design.

Database Application Lifecycle
Application Design: design of the user interface and the application programs that use and process the database.
Prototyping: building a working model of a database application.
Implementation: physical realization of the database and application design.

Database Application Lifecycle
Data Conversion and Loading: transferring any existing data into the new database and converting any existing processes to run on the new database. Used only if there is an existing system. DBMSs normally provide a facility to load the data, e.g. Oracle SQL*Loader, or other tools such as Data Pump may be used.
Testing: process of executing the application programs with the intent of finding errors.
Operational Maintenance: process of monitoring and maintaining the system following installation.

Planning Two stages: planning factors and planning objectives.
Planning factors: the work to be done; the resources to do it; the cost. Planning must be done within the overall planning strategy of the organisation, so several factors have to be taken into account: identify the goals of the organisation (e.g. 10% growth per year, no redundancies); identify critical success factors (e.g. high quality products, on-time deliveries); identify problem areas (e.g. inaccurate stock control, more competition).
Planning objectives: organisational units (consist of various departments); locations (list of operational locations); business functions (identify related business processes); entity types (something for which data is collected).

System Definition Identify boundaries: here you want to know, at a very high level, what the boundaries of the system are. You might need to think about who the current users are, what the current application areas are, etc. Identify interfaces within the organization.

Requirements Analysis Database design should reflect the information within the organisation, so it is based on information about the organisation. There are many ways of gathering such information: interviewing, observing, examining documents, using questionnaires, using experience from the design of other systems, etc.

Requirements Analysis Critical information to gather: the main application areas and user groups, the documentation used, and details of the transactions needed. A prioritised user requirement specification should be drawn up; this is a preliminary stage to logical database design. The amount of data gathered depends on the size of the organisation and the scope of the application to be developed. It is very important to document the requirements, and there are many techniques that can be used for this, such as Data Flow Diagrams (DFDs) and document/use matrices. You will be studying some of these techniques in the Systems Analysis & Design module. Identifying the required functionality for a database system is crucial: systems with inadequate functionality will fail.

Database Design MAIN AIMS To represent data & relationships required by users and applications To provide a data model which supports transactions To specify a design that meets performance requirements

Database Design Approaches BOTTOM-UP begins at the level of attributes and then adds entities as new relationships are seen. Normalization is an example of this. TOP-DOWN starts with the development of a data model that contains a few high-level entities and then refines them in ever increasing detail. Data modeling comes under this. There are other approaches, e.g. the informal "back of an envelope" approach (not very successful), and a mixed approach.

Phases of database Design Remember the main phases: Conceptual Database Design Logical Database design Distributed Database Design (optional) Physical Database Design The next few slides will examine each of these phases.

Conceptual Database Design Create a conceptual data model. The model is independent of any implementation details: the DBMS and physical aspects are immaterial. It is based on the user requirements specification, assists in understanding the data, and facilitates communication. Building a data model requires you to ask a lot of questions about entities, relationships, etc. We undertake data modelling to try to understand each user's perspective of the data, to understand the data itself (independently of its physical representation), and to understand the use of data across applications. We also use data modelling to convey the designer's understanding of the data. Data modelling is based on the Entity-Relationship model. Data modelling is a top-down approach to design, because it starts at the 'top' of the organisation and then works down to a more and more detailed view. It is very good at identifying the data used in an organisation. It is not normally enough on its own, though, because you have no real way of knowing whether all of the entities have been identified.

Logical database design The data model created in the previous phase is refined. At this point you know which type of DBMS you will be implementing in (e.g. relational, object-oriented) but not the actual DBMS. Test the correctness of the data model through normalization and through validation against user transactions. Normalization is a bottom-up approach to design. It is driven by a set of rules which determine how the data is structured. It is time-consuming to perform, it is sometimes difficult to know how many levels of normalization should be used, and it depends on the analyst understanding the rules. The best approach is usually a combination of data modelling (top-down) and normalization (bottom-up): the entity model can be compared to the normalized data model to identify any differences.
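To give a feel for what normalization does, here is a small sketch in plain Python. It is not a full normalization procedure. It assumes, for illustration, that a student id determines the student name and a course id determines the course title, and it uses made-up records.

```python
# Flat records in which student and course facts are repeated (hypothetical data).
flat = [
    (1, "Ayse", "DB101", "Databases"),
    (1, "Ayse", "SE201", "Software Engineering"),
    (2, "Mehmet", "DB101", "Databases"),
]

# Decompose so that each fact is stored exactly once.
student = {sid: name for sid, name, _, _ in flat}        # student_id -> name
course = {cid: title for _, _, cid, title in flat}       # course_id -> title
takes = sorted({(sid, cid) for sid, _, cid, _ in flat})  # who takes what

# Joining the three relations back together reproduces the flat records,
# so the decomposition has not lost any information.
rejoined = [(sid, student[sid], cid, course[cid]) for sid, cid in takes]

print(len(student), len(course), len(takes))
print(sorted(flat) == rejoined)
```

After the decomposition, a change to a student's name is made in one place instead of once per enrolment, which is the kind of redundancy the normalization rules are meant to remove.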

Database selection A crucial stage in the database application lifecycle is choosing the DBMS. The aim is to choose a system that allows expansion, enables speedy retrieval, gives easy application development, etc. Ideally, all data should have been collected and documented before engaging in DBMS selection. In practice, many organisations choose a DBMS purely on the basis of cost.

Database selection A simple way to choose a DBMS is to have a checklist of things to look for.
Define terms of reference - Prior to any software evaluation, the scope of the study should be stated. This should include a potential list of the products to be assessed, the criteria to be used, timescales, etc.
Identify products - The information in the terms of reference is used to draw up the list. Decisions about hardware, compatibility with existing systems, cost, etc. should be used to draw up the list. User support, upgrades, etc. can be taken into account.
Produce shortlist - Shortlist 2 or 3 products, usually by an analysis of the features of each. Get students to write down some of the features that they think could be included. The best way is to weight features so that they can be compared.
Evaluate products - Allow vendors to give presentations; involve users.
Recommend selection and produce report - This is the final stage. Produce a report of findings. Give details of the criteria used and how each product measured up; compare and contrast the alternatives.
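The advice to weight features so that products can be compared can be sketched as a simple weighted score. Everything in the example is hypothetical: the criteria, the weights (percentages that sum to 100), the product names, and the 1-5 ratings.

```python
# Hypothetical criteria weights in percent (they sum to 100).
weights = {"expansion": 30, "retrieval_speed": 30, "ease_of_development": 20, "cost": 20}

# Hypothetical ratings on a 1-5 scale for three shortlisted products.
ratings = {
    "Product A": {"expansion": 4, "retrieval_speed": 5, "ease_of_development": 3, "cost": 2},
    "Product B": {"expansion": 3, "retrieval_speed": 3, "ease_of_development": 4, "cost": 5},
    "Product C": {"expansion": 5, "retrieval_speed": 4, "ease_of_development": 2, "cost": 2},
}

def weighted_score(product_ratings):
    """Weighted average rating: sum of (weight x rating) divided by 100."""
    return sum(weights[c] * product_ratings[c] for c in weights) / 100

scores = {name: weighted_score(r) for name, r in ratings.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(name, scores[name])
```

With these made-up numbers Product A scores 3.7, Product B 3.6 and Product C 3.5. The ranking is only as good as the weights, so the report should state how they were chosen.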

Physical Database Design The aim of physical design is to describe how we intend to physically implement the logical data model. At this point the target DBMS is known. Physical design is about HOW; logical design is about WHAT. The main tasks are: derive tables and constraints; identify storage structures and access methods (e.g. B-tree, hashing) and choose indexes; consider denormalization for performance purposes; design security features (who uses what, authorization rules, etc.). We will look in detail later at the steps to go through to physically implement a design.
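A minimal sketch of two of these tasks, deriving tables with constraints and choosing an index, is shown below with SQLite as a stand-in target DBMS, accessed through Python's sqlite3 module. The table layout, the grade range, and the index name are assumptions made for illustration. Other DBMSs expose storage structures and query plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
-- Tables and constraints derived from the logical model.
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE takes (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  TEXT    NOT NULL,
    grade      INTEGER CHECK (grade BETWEEN 0 AND 100),
    PRIMARY KEY (student_id, course_id)
);
-- Access method: an index to speed up look-ups by course.
CREATE INDEX idx_takes_course ON takes(course_id);
""")

conn.execute("INSERT INTO student VALUES (1, 'Ayse')")
conn.execute("INSERT INTO takes VALUES (1, 'DB101', 85)")

# The constraints reject rows that break the rules.
rejected = []
for bad_row in [(99, "DB101", 70),   # no such student
                (1, "SE201", 140)]:  # grade out of range
    try:
        conn.execute("INSERT INTO takes VALUES (?, ?, ?)", bad_row)
    except sqlite3.IntegrityError:
        rejected.append(bad_row)

# Ask SQLite how it would execute a look-up by course.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT student_id FROM takes WHERE course_id = ?", ("DB101",)
).fetchall()
uses_index = any("idx_takes_course" in row[-1] for row in plan)
print(rejected, uses_index)
```

The query text says WHAT is wanted; the index and the plan reported by the DBMS are part of HOW it is obtained.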

Application Design Application design is the design of the software programs which will process the data. Design transactions: the data to be used by the transactions, the functions of the transactions, and the output of the transactions. There are several techniques for specifying the high-level transactions that need to occur. The design of transactions is based on the information given in the requirements specification. Design the human interface (ITEC333): various guidelines apply, e.g. meaningful titles, good instructions, consistent use of colour.
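One way to picture a designed transaction is as a function with stated input data, a function it performs, and an output. The sketch below uses Python's sqlite3 module. The enrolment scenario, the seats_left column, and the function name enroll are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (
    course_id  TEXT PRIMARY KEY,
    seats_left INTEGER NOT NULL CHECK (seats_left >= 0)
);
CREATE TABLE takes (
    student_id INTEGER,
    course_id  TEXT,
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO course VALUES ('DB101', 1);
""")

def enroll(student_id, course_id):
    """Transaction: record the enrolment and take one seat, or do neither.

    Data used: student id and course id. Output: True if enrolled, else False.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO takes VALUES (?, ?)", (student_id, course_id))
            conn.execute(
                "UPDATE course SET seats_left = seats_left - 1 WHERE course_id = ?",
                (course_id,),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first = enroll(1, "DB101")   # takes the last seat
second = enroll(2, "DB101")  # the CHECK fails, so the INSERT is rolled back too

enrolled = conn.execute("SELECT student_id FROM takes ORDER BY student_id").fetchall()
seats = conn.execute("SELECT seats_left FROM course").fetchone()[0]
print(first, second, enrolled, seats)
```

The second call leaves no trace in the takes table, because both statements belong to one transaction.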

Prototyping Building a working model that is inexpensive and quick to build. It is primarily used to check the developer's understanding of what is required and the interpretation of the requirements. A prototype can be built at various points. There are different kinds of prototyping, e.g. simulation.

Implementation Database created using DDL Implement application programs using selected language Implement security & integrity controls On completion of the design stage we are ready to implement.

Data Loading/Conversion Transfer any existing data and insert any new data. This stage is only used when there is an existing system. Usually there is a facility within the DBMS to load data into a database (e.g. in Oracle this facility is called SQL*Loader).
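Where no bulk-load facility is used, conversion and loading can be scripted. The sketch below is not SQL*Loader. It is a small illustration using Python's csv and sqlite3 modules, and the legacy export layout, the field names, and the date format are all assumptions.

```python
import csv
import io
import sqlite3

# Stand-in for a file exported from the old system (hypothetical layout).
legacy_export = io.StringIO(
    "id;full_name;enrolled\n"
    "1;Ayse Kaya;14/09/2012\n"
    "2;Mehmet Demir;01/02/2013\n"
)

def convert(row):
    """Conversion step: map old fields and DD/MM/YYYY dates to the new layout."""
    day, month, year = row["enrolled"].split("/")
    return (int(row["id"]), row["full_name"], f"{year}-{month}-{day}")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, enrolled_on TEXT)"
)

reader = csv.DictReader(legacy_export, delimiter=";")
with conn:  # load all rows in a single transaction
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)", (convert(r) for r in reader)
    )

loaded = conn.execute("SELECT * FROM student ORDER BY student_id").fetchall()
print(loaded)
```

Loading inside one transaction means a failure part-way through leaves the new table empty instead of half-filled.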

Testing The process of executing the application programs with the intention of finding errors. Use realistic data and involve users in testing. There are various strategies that can be used, e.g. white box and black box testing (covered in the software engineering course).

Maintenance Monitoring performance; maintaining and upgrading. Once the DB is fully operational, close monitoring should take place to ensure that performance is acceptable. Various monitoring tools are usually available to do this.

Overview of Database Design Purpose of data modeling: assist in understanding the semantics of the data; facilitate communication about information requirements. Criteria for optimal data models: structural validity, simplicity, expressability, nonredundancy, shareability, extensibility, integrity, diagrammatic representation.

Database Design Methodology
A structured approach that uses procedures, techniques, tools and documentation aids to support and facilitate the process of design.
Key elements: interaction with users, a structured methodology, a data-driven approach, structural and integrity considerations, diagrams, DBDL, a data dictionary, validation, and willingness to repeat steps.

Broad Goals of Database Development
- Develop a common vocabulary
- Define data meaning
- Ensure data quality
- Provide efficient implementation
See the next slides for more details about each goal.

Develop a Common Vocabulary
- Diverse groups of users
- Difficult to obtain acceptance of a common vocabulary
- Compromise to find the least objectionable solution
- Unify the organization by establishing a common vocabulary
Define data meaning:
- Business rules: how an organization operates
- How restrictive should rules be?
  - Too restrictive: reject valid business interactions
  - Too loose: allow erroneous business interactions
- Role of exceptions: the area between clear-cut correct cases and errors
- Examples:
  - Faculty assignment to courses: timing issue
  - Prerequisite check: allow prerequisites to be violated
Data quality:
- Many measures
- Poor data quality leads to poor decision making
  - Difficult customer communication: lost sales; more time with complaints
  - Poor sales forecasts: inventory problems
- More data quality is better, but at what cost?
- Consider tradeoffs to improve data quality measures
- Some measures may not be apparent until later: consistency across systems
- Long-term vs. short-term considerations
Efficient implementation:
- Trumps other goals: no system if not efficient enough
- Complex subject: focus of advanced database course
- DBMS specific

Define Meaning of Data
- Business rules support organizational policies
- Restrictiveness of business rules
  - Too restrictive: reject valid business interactions
  - Too loose: allow erroneous business interactions
- Exceptions allow flexibility
- Examples:
  - Faculty assignment to courses: timing issue
  - Prerequisite check: allow prerequisites to be violated
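
The prerequisite example can be sketched as a business rule with an explicit exception path. Everything here is hypothetical: the course codes, the `PREREQS` catalogue, the `Student` record and the `can_enroll` function are assumptions made for illustration, not part of the lecture.

```python
from dataclasses import dataclass, field

# Hypothetical catalogue: course -> set of prerequisite courses.
PREREQS = {"DB1": set(), "DB2": {"DB1"}}


@dataclass
class Student:
    completed: set = field(default_factory=set)
    # Courses for which an exception (e.g. instructor approval) was granted.
    overrides: set = field(default_factory=set)


def can_enroll(student: Student, course: str, allow_exceptions: bool = True):
    """Return (allowed, reason) for an enrollment request."""
    missing = PREREQS.get(course, set()) - student.completed
    if not missing:
        return True, "prerequisites met"
    if allow_exceptions and course in student.overrides:
        return True, "exception: missing " + ", ".join(sorted(missing))
    return False, "rejected: missing " + ", ".join(sorted(missing))
```

With `allow_exceptions=False` the rule is strict and rejects a student who holds an approved override, which is the "too restrictive" case. Accepting every request without any check would be the "too loose" case.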

Data Quality
- Poor data quality leads to poor decision making
  - Difficult customer communication
  - Inventory shortages
- Cost-benefit tradeoff: apply resources to achieve the desired level of data quality
- Long-term effects of poor data quality
- Timeliness:
  - Frequency of sampling prices
  - Automated equipment to synchronize prices

Data Quality Measures
- Completeness: the database represents all important parts of an information system
- Lack of ambiguity: each part of a database has only one meaning
- Timeliness: business changes are posted to a database without excessive delays
- Correctness: the database contains values perceived by the user
- Consistency: different parts of a database do not conflict
- Reliability: failures or interference do not corrupt the database
The importance of each measure depends on the database, system, and organization. Each measure can be quantified.
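
As a rough illustration of quantifying a measure, the sketch below computes two simple proxies. Both are assumptions made for this example, not definitions from the lecture. Completeness is approximated at field level as the fraction of required fields that are populated, which is much narrower than the system-level definition above. Consistency is approximated as the fraction of shared keys on which two sources agree.

```python
def completeness(rows, required):
    """Fraction of required fields that are populated (not None or empty string)."""
    total = len(rows) * len(required)
    if total == 0:
        return 1.0
    filled = sum(1 for r in rows for f in required if r.get(f) not in (None, ""))
    return filled / total


def consistency(source_a, source_b, key, field):
    """Fraction of keys present in both sources whose `field` values agree."""
    a_by_key = {r[key]: r[field] for r in source_a}
    b_by_key = {r[key]: r[field] for r in source_b}
    shared = a_by_key.keys() & b_by_key.keys()
    if not shared:
        return 1.0
    return sum(a_by_key[k] == b_by_key[k] for k in shared) / len(shared)


# Hypothetical price records held by two systems.
store = [{"id": 1, "price": 9.5}, {"id": 2, "price": None}]
catalogue = [{"id": 1, "price": 9.5}, {"id": 2, "price": 4.0}]
field_completeness = completeness(store, ["id", "price"])
price_consistency = consistency(store, catalogue, "id", "price")
```

Scores like these only become useful once the organization decides which fields and which systems matter.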

Efficient Implementation
- Supersedes other goals
- Optimization problem
  - Maximize performance
  - Subject to constraints of data quality, data meaning, and resource usage
- Difficult problem
  - Number of choices
  - Relationships among choices
  - DBMS specific

Database Development Phases
Input: data requirements, which come in many formats
- Description of data needs
- Documentation of existing system
- Proposed forms and reports
Phases and their outputs:
- Conceptual data modeling: ERD
- Logical database design: tables
- Distributed database design (optional): distribution schema
- Physical database design: internal schema, populated DB
Logical information content is addressed by conceptual data modeling and logical database design; performance is addressed by distributed and physical database design.

Database Design Conceptual database design - the process of constructing a model of the information used in an organization, independent of all physical considerations Step 1 Build local conceptual data model for each user view

Database Design Logical database design for the relational model - the process of constructing a model of the information used in an organization based on a specific data model, but independent of a particular DBMS and other physical considerations Step 2 Build and validate local logical data model for each user view Step 3 Build and validate global logical data model

Database Design Physical database design for relational databases - the process of producing a description of the implementation of the database on secondary storage. Step 4 Translate global data model for target DBMS Step 5 Design physical representation Step 6 Design security mechanisms Step 7 Monitor and tune the operational system

Phases of Database Design
- Conceptual database design: process of constructing a model of the information used in an enterprise, independent of all physical considerations
- Logical database design: process of constructing a model of the information used in an enterprise based on a specific data model, but independent of a particular DBMS or any other physical considerations
- Distributed database design (optional): process of deciding about the placement of data across the sites of a computer network. Involves designing the network itself, as well as the distribution of DBMS software, database applications and data
- Physical database design: description of the implementation of the database on secondary storage. It describes the storage structures and access methods for efficient access

Overview of Database Design
Conceptual:
- Build local conceptual data model for each user view
Logical:
- Build and validate local logical data model for each user view
- Build and validate global logical data model
Physical:
- Translate global logical data model for target DBMS
- Design physical representation
- Design security mechanisms
- Monitor and tune operational system

Centralized Approach to Managing Multiple User Views We do not use this approach Pearson Education © 2009

View Integration Approach to Managing Multiple User Views

Conceptual Database Design
Step 1 Build local conceptual data model for each user view
1.1 Identify entity types
1.2 Identify relationship types
1.3 Identify and associate attributes with entity or relationship types
1.4 Determine attribute domains
1.5 Determine candidate and primary key attributes
1.6 Specialize/generalize entity types
1.7 Draw Entity-Relationship diagram
1.8 Review local conceptual data model with user

Logical Database Design
Step 2 Build and validate local logical data model
2.1 Map local conceptual data model to local logical data model
2.2 Derive relations from local logical data model
2.3 Validate model using normalization
2.4 Validate model against user transactions
2.5 Draw Entity-Relationship diagram
2.6 Define integrity constraints
2.7 Review local logical data model with user
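
Validation by normalization (step 2.3) rests on functional dependencies. The sketch below checks whether a proposed dependency is contradicted by sample rows. The `fd_holds` helper and the staff/branch data are hypothetical. Sample data can only refute a dependency, never prove it, so the result is a prompt for discussion with users rather than a verdict.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on the lhs attributes but differ on the rhs ones."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs seen for this lhs and returns it.
        if seen.setdefault(left, right) != right:
            return False
    return True


# Hypothetical sample of a single Staff relation that also stores branch data.
ROWS = [
    {"staff_no": "S1", "branch_no": "B1", "branch_city": "London"},
    {"staff_no": "S2", "branch_no": "B1", "branch_city": "London"},
    {"staff_no": "S3", "branch_no": "B2", "branch_city": "Glasgow"},
]

staff_determines_branch = fd_holds(ROWS, ["staff_no"], ["branch_no"])
branch_determines_city = fd_holds(ROWS, ["branch_no"], ["branch_city"])
city_determines_staff = fd_holds(ROWS, ["branch_city"], ["staff_no"])
```

If staff_no determines branch_no and branch_no determines branch_city, then branch_city depends on the key only transitively. In that case normalization would suggest moving the branch attributes into their own relation.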

Logical Database Design
Step 3 Build and validate global logical data model
3.1 Merge local logical data models into global model
3.2 Validate global logical data model
3.3 Check for future growth
3.4 Draw final Entity-Relationship diagram
3.5 Review global logical data model with users

Physical Database Design
Step 4 Translate global logical data model for target DBMS
4.1 Design base relations for target DBMS
4.2 Design enterprise constraints for target DBMS
Step 5 Design physical representation
5.1 Analyze transactions
5.2 Choose file organizations

Physical Database Design (continued)
5.3 Choose secondary indexes
5.4 Consider introduction of controlled redundancy
Step 6 Design security mechanisms
6.1 Design user views
6.2 Design access rules
Step 7 Monitor and tune operational system
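
Steps 5.3 and 6.1 can be sketched in SQLite through Python's sqlite3 module. The `staff` table, the index and view names and the `build` helper are assumptions made for this example. SQLite does not support GRANT/REVOKE, so access rules (step 6.2) are not shown; in a multi-user DBMS they would normally be expressed with those statements.

```python
import sqlite3


def build(conn: sqlite3.Connection) -> None:
    """Create a hypothetical base relation, a secondary index and a user view."""
    conn.executescript(
        """
        CREATE TABLE staff (
            staff_no  INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            branch_no INTEGER NOT NULL,
            salary    REAL NOT NULL
        );
        -- Step 5.3: secondary index on a column assumed to be searched often
        CREATE INDEX idx_staff_branch ON staff (branch_no);
        -- Step 6.1: user view that hides the salary column
        CREATE VIEW staff_directory AS
            SELECT staff_no, name, branch_no FROM staff;
        """
    )


conn = sqlite3.connect(":memory:")
build(conn)
conn.execute("INSERT INTO staff VALUES (1, 'Ann', 10, 30000.0)")
cursor = conn.execute("SELECT * FROM staff_directory WHERE branch_no = 10")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
```

Which columns deserve a secondary index depends on the transaction analysis in step 5.1. The index on branch_no is only a guess at a frequent lookup.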

END OF LECTURE