Fundamental database concepts

Slides:



Advertisements
Similar presentations
CHAPTER OBJECTIVE: NORMALIZATION THE SNOWFLAKE SCHEMA.
Advertisements

Chapter 10: Designing Databases
Entity-Relationship (ER) Modeling
Introduction to Databases
The Relational Model System Development Life Cycle Normalisation
© Copyright 2011 John Wiley & Sons, Inc.
Oct 31, 2000Database Management -- Fall R. Larson Database Management: Introduction to Terms and Concepts University of California, Berkeley School.
Chapter 14 (Web): Object-Oriented Data Modeling
Database Management: Getting Data Together Chapter 14.
Data Management Design
Geographic Information Systems
Object-Oriented Databases
Introduction to Databases Transparencies
Chapter 11 Data Management Layer Design
Modern Systems Analysis and Design Third Edition
Chapter 3: Modeling Data in the Organization
Chapter 4: Database Management. Databases Before the Use of Computers Data kept in books, ledgers, card files, folders, and file cabinets Long response.
“DOK 322 DBMS” Y.T. Database Design Hacettepe University Department of Information Management DOK 322: Database Management Systems.
Object Oriented Databases - Overview
Chapter 4 Entity Relationship (E-R) Modeling
Chapter 3 © 2005 by Prentice Hall 1 Objectives Definition of terms Definition of terms Importance of data modeling Importance of data modeling Write good.
RIZWAN REHMAN, CCS, DU. Advantages of ORDBMSs  The main advantages of extending the relational data model come from reuse and sharing.  Reuse comes.
Chapter 4 Database Management Systems. Chapter 4Slide 2 What is a Database Management System (DBMS)?  Database An organized collection of related data.
Michael F. Price College of Business Chapter 6: Logical database design and the relational model.
Chapter 1 Database Systems. Good decisions require good information derived from raw facts Data is managed most efficiently when stored in a database.
Lecture 2 The Relational Model. Objectives Terminology of relational model. How tables are used to represent data. Connection between mathematical relations.
Chapter 4 The Relational Model.
11 1 Object oriented DB (not in book) Database Systems: Design, Implementation, & Management, 6 th Edition, Rob & Coronel Learning objectives: What.
Shaowen Wang CyberInfrastructure and Geospatial Information Laboratory (CIGI) Department of Geography and National Center for Supercomputing Applications.
Chapter 9 Designing Databases Modern Systems Analysis and Design Sixth Edition Jeffrey A. Hoffer Joey F. George Joseph S. Valacich.
© 2006 ITT Educational Services Inc. SE350 System Analysis for Software Engineers: Unit 8 Slide 1 Chapter 9 Structuring System Data Requirements.
1 ER Modeling BUAD/American University Entity Relationship (ER) Modeling.
Concepts and Terminology Introduction to Database.
Copyright 2002 Prentice-Hall, Inc. Modern Systems Analysis and Design Third Edition Jeffrey A. Hoffer Joey F. George Joseph S. Valacich Chapter 20 Object-Oriented.
11 Chapter 11 Object-Oriented Databases Database Systems: Design, Implementation, and Management 4th Edition Peter Rob & Carlos Coronel.
Lecture 12 Designing Databases 12.1 COSC4406: Software Engineering.
10/3/2012ISC329 Isabelle Bichindaritz1 Logical Design.
I Information Systems Technology Ross Malaga 4 "Part I Understanding Information Systems Technology" Copyright © 2005 Prentice Hall, Inc. 4-1 DATABASE.
Lecture2: Database Environment Prepared by L. Nouf Almujally & Aisha AlArfaj 1 Ref. Chapter2 College of Computer and Information Sciences - Information.
Chapter 18 Object Database Management Systems. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Motivation for object.
Lecture2: Database Environment Prepared by L. Nouf Almujally 1 Ref. Chapter2 Lecture2.
CHAPTER 13 (ONLINE): OBJECT-ORIENTED DATA MODELING © 2013 Pearson Education, Inc. Publishing as Prentice Hall 1 Modern Database Management 11 th Edition.
© 2009 Pearson Education, Inc. Publishing as Prentice Hall 1 Chapter 15: Object-Oriented Data Modeling Modern Database Management 9 h Edition Jeffrey A.
© 2011 Pearson Education, Inc. Publishing as Prentice Hall 1 Chapter 13 (Online): Object-Oriented Data Modeling Modern Database Management 10 th Edition.
CHAPTER 13: OBJECT-ORIENTED DATA MODELING (OVERVIEW) © 2013 Pearson Education, Inc. Publishing as Prentice Hall 1 Modern Database Management 11 th Edition.
Lecture # 3 & 4 Chapter # 2 Database System Concepts and Architecture Muhammad Emran Database Systems 1.
DataBase Management System What is DBMS Purpose of DBMS Data Abstraction Data Definition Language Data Manipulation Language Data Models Data Keys Relationships.
Prepared By Prepared By : VINAY ALEXANDER ( विनय अलेक्सजेंड़र ) PGT(CS),KV JHAGRAKHAND.
1 © Prentice Hall, 2002 Chapter 5: Logical Database Design and the Relational Model Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B.
Concepts of Database Management, Fifth Edition Chapter 6: Database Design 2: Design Methodology.
Object-Oriented Data Modeling
Creating and Maintaining Geographic Databases. Outline Definitions Characteristics of DBMS Types of database Relational model SQL Spatial databases.
Guofeng Cao CyberInfrastructure and Geospatial Information Laboratory Department of Geography National Center for Supercomputing Applications (NCSA) University.
Programming Logic and Design Fourth Edition, Comprehensive Chapter 16 Using Relational Databases.
Guofeng Cao CyberInfrastructure and Geospatial Information Laboratory Department of Geography National Center for Supercomputing Applications (NCSA) University.
Chapter 18 Object Database Management Systems. Outline Motivation for object database management Object-oriented principles Architectures for object database.
1 Chapter 2 Database Environment Pearson Education © 2009.
CSCI 6315 Applied Database Systems Review for Midterm Exam I Xiang Lian The University of Texas Rio Grande Valley Edinburg, TX 78539
Logical Design 12/10/2009GAK1. Learning Objectives How to remove features from a local conceptual model that are not compatible with the relational model.
1 10 Systems Analysis and Design in a Changing World, 2 nd Edition, Satzinger, Jackson, & Burd Chapter 10 Designing Databases.
Department of Mathematics Computer and Information Science1 CS 351: Database Management Systems Christopher I. G. Lanclos Chapter 4.
Introduction: Databases and Database Systems Lecture # 1 June 19,2012 National University of Computer and Emerging Sciences.
Rationale Databases are an integral part of an organization. Aspiring Database Developers should be able to efficiently design and implement databases.
Introduction to Spatial Computing CSE 555
Logical Database Design and the Rational Model
Business System Development
Database Management:.
Modern Systems Analysis and Design Third Edition
Lec 3: Object-Oriented Data Modeling
Chapter 20 Object-Oriented Analysis and Design
Presentation transcript:

Fundamental database concepts Chapter 2 Fundamental database concepts © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

What you will learn What is a database? Why use a database? What is a relational database? Why does spatial data present problems for relational databases? How do you develop a database? What is object-orientation, and how is it relevant to databases? Summary © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Introduction to databases Section 2.1 Introduction to databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

What is a database? A database is a collection of data organized in such a way that a computer can efficiently store and retrieve data A repository of data that is logically related A database is created and maintained using a general-purpose piece of software called a database management system (DBMS) Introduction to databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

The database approach Before databases, computers were primarily used to convert data between different formats “The computer as a giant calculator” Databases treat computers as useful repositories of data “The computer as data repository” Most applications (including GIS) require a balance of processing and storage Introduction to databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Databases in a nutshell In order to be effective, databases must offer the following functions: All these functions are managed by the DBMS Data independence Self-describing Concurrency Distributed capabilities High performance Reliability Integrity Security User views User interface Introduction to databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Nutty Nuggets #1 We might write a program to organize the stock for the “Nutty Nuggets” restaurant As time continues, this program will become more complex, offering more functions Introduction to databases Stage 1 Stage 2 © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Nutty Nuggets #2 Key problems with the previous approach are: Loss of integrity Loss of independence Loss of security Stage 3, the database, solves these problems Introduction to databases Stage 3 © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Common database applications Home/office database Simple applications (e.g., Nutty Nuggets) Commercial database Store the information for businesses (e.g. customers, employees) Engineering database Used to store engineering designs (e.g. CAD) Image and multimedia database Store image, audio, video data Geodatabase Store a combination of spatial and non-spatial data Introduction to databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Elements of a DBMS Query language Query compiler Runtime database processor Constraint enforcer Stored data manager System catalog/data dictionary Introduction to databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Transaction management A transaction is an atomic unit of interaction between user and database Insertion of data Modification of data Deletion of data Retrieval of data Transaction management must support Concurrency (multiple users accessing the same data at the same time) Recovery management (retrieval of a valid database state following system failure) Introduction to databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Concurrency: Lost update Lost update can occur when atomic transactions are incorrectly interleaved Introduction to databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Relational databases Section 8.2 © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Database architectures Most databases today are either: Relational; or Object-oriented (especially useful for spatial data) Early database systems were based on the hierarchical model Efficient storage, but limited expressiveness The network model was used to overcome lack of expressiveness in hierarchical databases But led to highly complex database system The deductive model is an active research area today Stores rules in addition to facts Relational databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

The relational model A relational database is a collection of relations, often just called tables Each relation has a set of attributes The data in the relation is structured as a set of rows, often called tuples Each tuple consists of data items for each attribute Each cell in a tuple contains a single value A relational database management system (RDBMS) is the software that manages a relational database Relational databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Example relation Relation Attribute Tuple Data item Relational databases Tuple Data item © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Relations A relation scheme is the set of attribute names and the domain (data type) for each attribute name A database scheme is a set of relation schemes In a relation: Each tuple contains as many values as there are attributes in the relation scheme Each data item is drawn from the domain for its attribute The order of tuples is not significant Tuples in a relation are all distinct from each other In most relational systems, data items are atomic A relation that contains only atomic items is said to be in first normal form (1NF) The degree of a relation is its number of columns The cardinality of a relation is the number of tuples Relational databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Relation scheme A candidate key is an attribute or minimal set of attributes that will uniquely identify each tuple in a relation One candidate key is usually chose as a primary key Relational databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Operations on relations There are five fundamental relational operators: union, difference, product, project, and restrict Three derived relational operators are also important: intersection, divide, and join Together, these operations and the way they are combined is called relational algebra combined The relational model is said to be closed, because relational operators take one or more relations as input and return a relation Relational databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Project operator The project operator is unary It outputs a new relation that has a subset of attributes Identical tuples in the output relation are coalesced Relational databases project NAME © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Restrict and join operators The restrict operator is unary It outputs a new relation that has a subset of tuples A condition specifies those tuples that are required The join operator is binary It outputs the combined relation where tuples agree on a specified attribute (natural join) Join is the most time-consuming of all relational operators to compute In general, relational operators may not be arbitrarily reordered Query optimization aims to find an efficient way of processing queries, for example reordering to produce equivalent but more efficient queries Relational databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Relational operator example Join relations SHOW and FILM using FILM_NAME and TITLE Relational databases Restrict using CINEMA_ID=1 Project TITLE, DIRECTOR, CINEMA_ID, and SCREEN_NO For full database see book web site: http://worboys.duckham.org © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Relational databases and spatial data Several issues prevent unmodified databases being useful for spatial data Structure of spatial data does not naturally fit with tables Performance is impaired by the need to perform multiple joins with spatial data Indexes are non-spatial in a conventional relational database An extensible RDBMS offers some solutions to these problems with user defined data types user-defined operations user-defined indexes and access methods active database functions (e.g., triggers) Relational databases © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Database development Section 8.3 © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Conceptual data model A conceptual data model provides a model of the proposed system that is independent of implementation details An effective conceptual model will provide a means for communication between analysts, designers and users aid the design of the system provide basic reference material for implemented system Database development © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Entity relationship model #1 The entity relationship model is a conceptual data modeling technique where An entity type represents a collection of similar objects An entity instance is an occurrence of a particular entity An attribute type is a property associated with an entity An attribute type that serves to uniquely identify an entity type is called an identifier Identifiers are usually underlined attribute type entity type Database development identifier © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Entity relationship model #2 Entity types are connected using relationships A relationship type connects one or more entity types A relationship occurrence is a particular instance of a relationship Relationships may have their own attributes independent of entities Entity, attribute, and relationship types are shown in an entity relationship diagram (E-R diagram) relationship type Database development © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Entity relationship model #3 Relationship types may be many-to-many: e.g., a town may have many road, which in turn may pass through many towns many-to-one: e.g., a town may have many cinemas, but a cinema can be located in at most one town one-to-one: e.g., a cinema may have one manager who manages only one cinema These constraints constitute cardinality conditions Database development © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Entity relationship model #4 In addition to cardinality conditions, relationships may also have participatory conditions: optional or mandatory (indicated with a double line) A relationship from an entity to itself is called involutory A relationship connecting three entities is called a ternary relationship Database development © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Extended entity relationship model The extended entity relationship model (EER) adds further features: An entity type E1 is a subtype of E2 if every occurrence of E1 is also an occurrence of E2. In this case, E2 is a supertype of E1 The operation of forming subtypes is called specialization; the inverse operation of forming supertypes is called generalization For specialization (and conversely for generalization) A subtype has the same identifying attribute(s) as the supertype A subtype has all the attributes of the supertype, and possibly some more A subtype enters into all the relationships in which the supertype is involved, and possibly some more. Subtypes and supertypes are organized into an inheritance hierarchy Database development © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Extended entity relationship model Subtypes may be: disjoint: where no occurrence of one subtype is an occurrence of another overlapping: subtypes are not disjoint EER uses an extended diagrammatic notation to represent specialization/generalization constructs Database development supertype disjoint overlapping subtype © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

EER for spatial information #1 E-R or EER can be used to model spatial entities Most vector-based GIS use a similar structure Database development node area directed arc © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

EER for spatial information #2 Database development © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Relational database design An E-R model can be transformed into a relational database scheme Advantageous features for a relational database scheme are: Lack of redundancy (redundant data wastes space and causes integrity problems) Fast access to data There usually exists a balance between space (lack of redundancy) and speed (fast access to data) Many relations leads to lower redundancy, but more joins (slower speed) Fewer relations leads to fewer joins (slower speed), but greater redundancy (and integrity problems) Database development © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Redundancy For example, the following relation and relation scheme will be able achieve fast access but involves considerable redundancy Database development © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Removing redundancy Database development © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Building relational schemes Another guideline is to ensure relations are in first normal form, a process known as normalization A first pass at building a relational scheme from an E-R model is to: Convert each entity into a relation Convert each relationship into a relation However, not all relationships will require a relation For entities in a mandatory many to one relation, we can always opt to define a single joined relation in the relation scheme, known as posting the foreign key Database development © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Object-orientation Section 8.4 © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Object-orientation The stages of the system development process (chapter 1) present a problem Information may be lost at each stage of the development process, termed impedance mismatch Object-orientation aims to minimize impedance mismatch, bringing low-level system constructs closer to high-level conceptual constructs Object orientation © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Foundations of object-orientation The object is at the core of object-orientation Objects have attributes that model the static, data-oriented aspects of a system (similar to tuples in a relation) The totality of attribute values constitutes the state of an object Objects also have operations that model the behavior of a system Behaviors are also called methods Objects with similar behaviors are grouped into classes The set of behaviors for a object form an interface object = state + behavior Object orientation © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Example of object-orientation © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Features of object-orientation The four main features of object-orientation from a modeling perspective are: Reduces complexity: decomposes complex phenomena into simpler objects Combats impedance mismatch: object-orientation can be applied at every level of system development Promotes reuse: System development is more efficient if constructed from collections of well-understood components Metaphorical power: Objects in object-orientation are metaphors for physical objects, making the modeling process easier In addition, four key constructs are closely associated with object-orientation: identity, encapsulation, inheritance, and association Object orientation © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Identity and encapsulation An object has an identity that is independent of its attribute values Even if an object changes all its attribute values, it retains its identity Identity is immutable, created with an object and destroyed only when that object is destroyed Objects hide the internal mechanisms of their behavior from the external access to that behavior, called encapsulation What behaviors an object exhibits are separated from how those behaviors are achieved Encapsulation promotes reuse, because changes to an object’s internal mechanisms will not affect the object’s external interface Object orientation © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Inheritance and polymorphism Classes may be organized into an inheritance hierarchy that allows objects to share common properties A class that provides more specialized behaviors is a subclass A class that provides more generalized behaviors is a superclass Inheritance allows objects to perform different roles within specific contexts, termed polymorphism Inclusion polymorphism is where a subclass is substituted for a superclass Overloading is where subclasses implement their own specialized versions of general behaviors There exists two types of inheritance: Single inheritance: each class may have zero or one superclasses Multiple inheritance: each class may have zero or more superclasses (requires some protocol for resolving behavioral conflicts) Object orientation © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Class diagram superclass behavior (single) inheritance subclass Object orientation overloading (polymorphism) © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Association An association groups objects together to in order to model phenomena with complex internal structure Aggregation is a type of association concerned with part/whole relationships (e.g. a wheel is “part of” a car) Aggregation relationships will form a hierarchy often referred to as a partonomy An association is homogenous if it is formed from objects all of the same class. E.g., a soccer team is a homogenous association (aggregation) An association is ordered where the ordering of component objects is important. E.g., a polyline might be a linear ordering of points Object orientation © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Object-oriented modeling #1 Object-oriented modeling comprises defining the classes, attributes, behaviors, associations, and inheritance for a system Attributes for a class can be defined in a similar way to E-R modeling Behaviors for a class fall into three categories Constructors are behaviors that are activated when an object is created, while destructors are activated when an object is destroyed Accessors are behaviors that may be used to examine the state of an object Transformers are behaviors that change the state of an object Object orientation © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Object-oriented modeling #2 Defining associations and inheritance relationships is an iterative and application-dependent process As a rule of thumb: Inheritance relationships can be detected by using the connection “is a” in a sentence with two classes. E.g., ‘a car “is a” vehicle’ Aggregation relationships can be detected using “part of” in a sentence. E.g., ‘a steering wheel is “part of” a car’ Object orientation © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Class diagrams association transformer aggregation constructor Object orientation accessor attribute © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press

Object-oriented DBMS A DBMS that utilizes an object-oriented data model is called an object-oriented DBMS (OODBMS) In addition to OO constructs, several other features are needed by OODBMS Scheme management (ability to create and change class schemes) Automatic query optimization Storage and access management Transaction management There exists technical problems with achieving these features: System complexity means that there are no longer a few simple operators, like in relational systems Encapsulation means that internal state may be hidden from DBMS As a result, performance for OODBMS is lower that for RDBMS Hybrid object-relational DBMS (ORDBMS) use a combination of relational data management and object-oriented “shell” for mediating user access to the DBMS Object orientation © Worboys and Duckham (2004) GIS: A Computing Perspective, Second Edition, CRC Press