Advanced Information Modeling and Database Systems
  Introduction to Databases
Data, Information, Knowledge, and Wisdom
  Data: facts concerning people, objects, events, or other entities.
    Structured: numbers, text, dates.
    Unstructured: images, video, documents.
  Information: data that has been processed to be useful; answers "who", "what", "where", and "when" questions.
  Knowledge: the appropriate collection of information, such that its intent is to be useful; answers "how" questions.
  Understanding: appreciation of "why".
  Wisdom: evaluated understanding.
File Systems
  Traditionally composed of a collection of file folders kept in a file cabinet.
  Organization within folders was based on the data's expected use (ideally, logically related).
  The system was adequate for small amounts of data with few reporting requirements.
  Finding and using data in growing collections of file folders became time-consuming and cumbersome.
File Systems (cont.)
File Systems (cont.)
  Advantages of file systems:
    No resource overhead
    No cost overhead
    Fast access to data
  Disadvantages of file systems:
    Data redundancy and inconsistency
    Difficulty in accessing and processing data
    Lack of standardization
    Hard to maintain and update data
    Security problems, etc.
Database System and Database Management System (DBMS)
Database System (cont.)
  Database: a shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization.
  The system catalog (metadata, data dictionary) provides the description of the data that enables program-data independence.
  Logically related data comprises the entities, attributes, and relationships of an organization's information.
Database Management System (DBMS)
  A software system that enables users to define, create, maintain, and control access to the database.
  (Database) application program: a computer program that interacts with the database by issuing an appropriate request (an SQL statement) to the DBMS.
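As a minimal sketch of an application program issuing SQL requests to a DBMS, the following uses Python's built-in sqlite3 module, with SQLite standing in for the DBMS. The table, column names, and rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# SQLite stands in for the DBMS; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")

# Define: the schema is described to the DBMS, which records it in its catalog.
conn.execute(
    "CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT NOT NULL, branch_no TEXT)"
)

# Maintain: rows are inserted and updated through SQL requests.
conn.executemany(
    "INSERT INTO staff (staff_no, name, branch_no) VALUES (?, ?, ?)",
    [("S1", "Ann", "B1"), ("S2", "Bob", "B1"), ("S3", "Cara", "B2")],
)
conn.execute("UPDATE staff SET branch_no = ? WHERE staff_no = ?", ("B2", "S2"))

# Query: the program states what it wants, not how the data is stored.
rows = conn.execute(
    "SELECT staff_no, name FROM staff WHERE branch_no = ? ORDER BY staff_no",
    ("B2",),
).fetchall()
print(rows)

# The system catalog (sqlite_master in SQLite) holds the description of the data.
catalog = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(catalog)
```

The program never opens a data file directly; every request goes through the DBMS, which is what allows access control and shared use to be enforced in one place.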
Components of DBMS Environment
Advantages of DBMSs
  Control of data redundancy
  Data consistency
  More information from the same amount of data
  Sharing of data
  Improved data integrity
  Improved security
  Enforcement of standards
  Economy of scale
  Multiple applications on one set of data
Advantages of DBMSs (cont.)
  Balance of conflicting requirements
    The DBA makes decisions about the design and operational use of the database in order to achieve the best overall performance.
  Improved data accessibility and responsiveness
  Increased productivity
  Improved maintenance through data independence
  Increased concurrency
  Improved backup and recovery services
Disadvantages of DBMSs
  Complexity
  Size
  Cost of DBMS
  Additional hardware costs
  Cost of conversion
  Performance
    A DBMS is written to be general rather than specific to one type of application, so it may not run as fast as a file-based system.
  Higher impact of a failure (single point of failure)
ANSI-SPARC Three-Level Architecture
Database Architecture (cont.)
  External level: the users' view of the database. Describes the part of the database that is relevant to a particular user.
  Conceptual level: the community view of the database. Describes what data is stored in the database and the relationships among the data.
  Internal level: the physical representation of the database on the computer. Describes how the data is stored in the database.
Differences between Three Levels of ANSI-SPARC Architecture
Benefit of the Three-Level Architecture: Data Independence
  Logical data independence: the capacity to change the conceptual schema without having to change external schemas or application programs.
  Physical data independence: the capacity to change the internal schema without having to change the conceptual (or external) schemas.
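The two kinds of data independence can be sketched with sqlite3, under the assumption that an SQL view plays the role of an external schema, the base table the conceptual schema, and an index part of the internal schema. The names staff, staff_directory, and idx_staff_name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the base table (hypothetical names and data).
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("S1", "Ann", 30000.0), ("S2", "Bob", 12000.0)],
)

# External level: a user view exposing only part of the database.
conn.execute("CREATE VIEW staff_directory AS SELECT staff_no, name FROM staff")


def read_directory():
    """Application code written against the external view only."""
    return conn.execute(
        "SELECT staff_no, name FROM staff_directory ORDER BY staff_no"
    ).fetchall()


before = read_directory()

# Logical data independence: the conceptual schema gains an attribute,
# but the view and the application code are left untouched.
conn.execute("ALTER TABLE staff ADD COLUMN branch_no TEXT")
after_logical_change = read_directory()

# Physical data independence: an index changes how data is accessed,
# but neither the table definition nor the view changes.
conn.execute("CREATE INDEX idx_staff_name ON staff (name)")
after_physical_change = read_directory()

print(before == after_logical_change == after_physical_change)
```

In both cases `read_directory` returns the same rows without modification, which is the practical meaning of the two definitions above.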
Data Independence and the ANSI-SPARC Three-Level Architecture
Data Model
  An integrated collection of concepts for describing data, relationships between data, and constraints on the data in an organization.
  A data model comprises:
    A structural part: the rules according to which databases can be constructed
    A manipulative part: the types of operation that are allowed on the data
    A set of integrity rules: ensures that the data is accurate
Data Model (cont.)
  Purpose: to represent data in an understandable way.
  Categories of data models include:
    Physical
    Record-based
    Object-based
Data Model
  Physical data models
  Record-based data models
    Hierarchical data model
    Network data model
    Relational data model
  Object-based data models
    Entity-Relationship data model
    Object-oriented data model
    etc.
Hierarchical Data Model
Network Data Model
Relational Data Model
Entity Relationship Data Model
Object Oriented Data Model
Evolution of Data Models
ER Data Modeling
  ER Diagram of Branch User Views of DreamHome
  © Pearson Education Limited 1995, 2005
Concepts of the ER Model
  Entity types
  Relationship types
  Attributes
Entity Type
  Entity type: a group of objects with the same properties, identified by the enterprise as having an independent existence.
  Entity occurrence: a uniquely identifiable object of an entity type.
ER Diagram of Staff and Branch Entity Types
Relationship Types
  Relationship type: a set of meaningful associations among entity types.
  Relationship occurrence: a uniquely identifiable association that includes one occurrence from each participating entity type.
Semantic Net of “Has” Relationship Type
ER Diagram of Branch “Has” Staff Relationship
Relationship Types
  Degree of a relationship: the number of participating entities in the relationship.
  A relationship of degree:
    two is binary
    three is ternary
    four is quaternary
Binary Relationship called “POwns”
Ternary Relationship called “Registers”
Quaternary Relationship called “Arranges”
Relationship Types
  Recursive relationship: a relationship type where the same entity type participates more than once in different roles.
  Relationships may be given role names to indicate the purpose that each participating entity type plays in a relationship.
Recursive Relationship called “Supervises” with Role Names
Entities Associated through Two Distinct Relationships with Role Names
Attributes
  Attribute: a property of an entity or a relationship type.
  Attribute domain: the set of allowable values for one or more attributes.
  Simple attribute: an attribute composed of a single component with an independent existence.
  Composite attribute: an attribute composed of multiple components, each with an independent existence.
Attributes
  Single-valued attribute: an attribute that holds a single value for each occurrence of an entity type.
  Multi-valued attribute: an attribute that holds multiple values for each occurrence of an entity type.
  Derived attribute: an attribute that represents a value that is derivable from the value of a related attribute, or set of attributes, not necessarily in the same entity type.
Keys
  Candidate key: a minimal set of attributes that uniquely identifies each occurrence of an entity type.
  Primary key: the candidate key selected to uniquely identify each occurrence of an entity type.
  Composite key: a candidate key that consists of two or more attributes.
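The "minimal" and "uniquely identifies" parts of the candidate key definition can be sketched as a brute-force check over sample occurrences. The attribute names and rows below are hypothetical. Sample data can only rule keys out: whether an attribute set really is a candidate key depends on the meaning of the data, not on one set of rows.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Minimal attribute sets that are unique within the sample rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A superset of a key already found is unique but not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_unique(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical Staff occurrences.
attributes = ("staff_no", "name", "branch_no", "desk_no")
rows = [
    {"staff_no": "S1", "name": "Ann", "branch_no": "B1", "desk_no": 1},
    {"staff_no": "S2", "name": "Bob", "branch_no": "B1", "desk_no": 2},
    {"staff_no": "S3", "name": "Ann", "branch_no": "B2", "desk_no": 1},
    {"staff_no": "S4", "name": "Cara", "branch_no": "B2", "desk_no": 2},
    {"staff_no": "S5", "name": "Ann", "branch_no": "B1", "desk_no": 3},
]

print(candidate_keys(rows, attributes))
```

Here `("staff_no",)` is a single-attribute candidate key and `("branch_no", "desk_no")` is a composite key; a designer would then select one of them as the primary key.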
ER Diagram of Staff and Branch Entities and Their Attributes
Entity Type
  Strong entity type: an entity type that is not existence-dependent on some other entity type.
  Weak entity type: an entity type that is existence-dependent on some other entity type.
Relationship called “Advertises” with Attributes
Structural Constraints
  The main type of constraint on relationships is called multiplicity.
  Multiplicity: the number (or range) of possible occurrences of an entity type that may relate to a single occurrence of an associated entity type through a particular relationship.
  Multiplicity represents policies (called business rules) established by the user or company.
  The most common degree for relationships is binary.
  Binary relationships are generally referred to as being:
    one-to-one (1:1)
    one-to-many (1:*)
    many-to-many (*:*)
Semantic Net of Staff “Manages” Branch Relationship Type
Multiplicity of Staff “Manages” Branch (1:1) Relationship
Semantic Net of Staff “Oversees” PropertyForRent Relationship Type
Multiplicity of Staff “Oversees” PropertyForRent (1:*) Relationship Type
Semantic Net of Newspaper “Advertises” PropertyForRent Relationship Type
Multiplicity of Newspaper “Advertises” PropertyForRent (*:*) Relationship
Structural Constraints
  Multiplicity for complex relationships: the number (or range) of possible occurrences of an entity type in an n-ary relationship when the other (n-1) values are fixed.
Semantic Net of Ternary “Registers” Relationship with Values for Staff and Branch Entities Fixed
Multiplicity of Ternary “Registers” Relationship
Summary of Multiplicity Constraints
Structural Constraints
  Multiplicity is made up of two types of restrictions on relationships: cardinality and participation.
  Cardinality: describes the maximum number of possible relationship occurrences for an entity participating in a given relationship type.
  Participation: determines whether all or only some entity occurrences participate in a relationship.
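A small sketch of how cardinality and participation relate to a multiplicity range, using hypothetical occurrences of the Staff “Oversees” PropertyForRent relationship. The function only reports the range observed in sample data; the real constraint comes from business rules, so the identifiers and counts below are assumptions for illustration.

```python
from collections import Counter


def multiplicity(pairs, sources, position):
    """Observed (min, max) number of relationship occurrences per entity.

    pairs:    relationship occurrences as (left, right) tuples
    sources:  all occurrences of the entity type being examined
    position: 0 if sources appear on the left of each pair, 1 if on the right
    """
    counts = Counter(pair[position] for pair in pairs)
    per_entity = [counts[e] for e in sources]
    return min(per_entity), max(per_entity)


def describe(low, high):
    """Split a multiplicity range into participation and cardinality."""
    participation = "mandatory" if low >= 1 else "optional"
    return {"participation": participation, "cardinality": high}


# Hypothetical occurrences.
staff = ["SG37", "SG14", "SG5"]
properties = ["PG21", "PG36", "PG4", "PG16"]
oversees = [("SG37", "PG21"), ("SG37", "PG36"), ("SG14", "PG4")]

staff_range = multiplicity(oversees, staff, 0)          # properties per staff member
property_range = multiplicity(oversees, properties, 1)  # staff per property

print(staff_range, describe(*staff_range))
print(property_range, describe(*property_range))
```

The minimum of each range carries the participation constraint (0 means some occurrences do not take part), and the maximum carries the cardinality.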
Multiplicity as Cardinality and Participation Constraints
Problems with ER Models
  Problems called connection traps may arise when designing a conceptual data model.
  They are often due to a misinterpretation of the meaning of certain relationships.
  The two main types of connection traps are called fan traps and chasm traps.
  Fan trap: where a model represents a relationship between entity types, but the pathway between certain entity occurrences is ambiguous.
  Chasm trap: where a model suggests the existence of a relationship between entity types, but the pathway does not exist between certain entity occurrences.
An Example of a Fan Trap
Semantic Net of ER Model with Fan Trap
  At which branch office does staff number SG37 work?
Restructuring ER Model to Remove Fan Trap
Semantic Net of Restructured ER Model with Fan Trap Removed
  SG37 works at branch B003.
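The fan trap question above can be sketched with plain dictionaries. Only SG37 and B003 come from the slides; the Division entity, the other identifiers, and the specific links are assumptions made so that the ambiguity is visible.

```python
# Fan-trap model (assumed): Staff is linked to Division, and Division to Branch,
# so two one-to-many relationships fan out from Division.
staff_division = {"SG37": "D1", "SG14": "D1", "SG5": "D2"}
division_branches = {"D1": ["B003", "B007"], "D2": ["B005"]}


def branches_for_staff_fan_trap(staff_no):
    """All the fan-trap model can say: every branch of the staff member's division."""
    return division_branches[staff_division[staff_no]]


# Restructured model (assumed): Division -> Branch -> Staff.
branch_division = {"B003": "D1", "B007": "D1", "B005": "D2"}
staff_branch = {"SG37": "B003", "SG14": "B007", "SG5": "B005"}


def branch_for_staff(staff_no):
    """The pathway from a staff member to a branch is now unambiguous."""
    return staff_branch[staff_no]


def division_for_staff(staff_no):
    """The division is still reachable, through the branch."""
    return branch_division[staff_branch[staff_no]]


print(branches_for_staff_fan_trap("SG37"))
print(branch_for_staff("SG37"), division_for_staff("SG37"))
```

In the first model the answer for SG37 is a list of two possible branches; after restructuring the same question has a single answer and no information about the division is lost.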
An Example of a Chasm Trap
Semantic Net of ER Model with Chasm Trap
  At which branch office is property PA14 available?
ER Model Restructured to Remove Chasm Trap
Semantic Net of Restructured ER Model with Chasm Trap Removed
Degrees of Relationship, Alternative Representation
Object-Oriented Data Modeling
  What is object-oriented data modeling?
    Centers around objects and classes
    Involves inheritance
    Encapsulates both data and behavior
  Benefits of object-oriented modeling:
    Ability to tackle challenging problems
    Improved communication between users, analysts, designers, and programmers
    Increased consistency in analysis, design, and programming
    Explicit representation of commonality among system components
    System robustness
    Reusability of analysis, design, and programming results
  © 2009 Pearson Education, Inc. Publishing as Prentice Hall
Classes and Objects
  Class: an entity that has a well-defined role in the application domain, as well as state, behavior, and identity.
    Tangible: person, place, or thing
    Concept or event: department, performance, marriage, registration
    Artifact of the design process: user interface, controller, scheduler
  Object: a particular instance of a class.
  Objects exhibit BEHAVIOR as well as attributes.
    This makes them different from entities.
State, Behavior, Identity
  State: attribute types and values.
  Behavior: how an object acts and reacts.
    Behavior is expressed through the operations that can be performed on the object.
  Identity: every object has a unique identity, even if all of its attribute values are the same.
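State, behavior, and identity can be illustrated with a short Python sketch. The Student class and its attributes are hypothetical; the point is only that two objects with identical attribute values remain distinct objects.

```python
from dataclasses import dataclass


@dataclass
class Student:
    # State: attribute types and values.
    name: str
    year: int

    def promote(self):
        """Behavior: an operation that changes the object's state."""
        self.year += 1


a = Student("Ann", 1)
b = Student("Ann", 1)

same_state = a == b      # dataclass equality compares attribute values
same_identity = a is b   # identity compares the objects themselves
print(same_state, same_identity)

# Changing one object's state does not affect the other.
a.promote()
print(a.year, b.year, a == b)
```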
UML Class and Object Diagram
  A class diagram shows the static structure of an object-oriented model: object classes, internal structure, and relationships.
Operation
  A function or service that is provided by all instances of a class.
  Types of operations:
    Constructor: creates a new instance of a class
    Query: accesses the state of an object but does not alter its state
    Update: alters the state of an object
    Scope: an operation applying to the class instead of an instance
  Operations implement the object's behavior.
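The four operation types map onto familiar Python constructs. The sketch below uses a hypothetical Account class; the names and the instance counter are assumptions made for illustration.

```python
class Account:
    opened = 0  # class-level counter shared by all instances

    def __init__(self, owner, balance=0.0):
        """Constructor: creates a new instance of the class."""
        self.owner = owner
        self._balance = balance
        Account.opened += 1

    def balance(self):
        """Query: reads the state without altering it."""
        return self._balance

    def deposit(self, amount):
        """Update: alters the state of the object."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount

    @classmethod
    def opened_count(cls):
        """Class-scope operation: applies to the class, not to one instance."""
        return cls.opened


first = Account("Ann", 100.0)
second = Account("Bob")
first.deposit(50.0)
print(first.balance(), second.balance(), Account.opened_count())
```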
Associations
  Association: a named relationship among object classes.
  Association role: the role of an object in an association; the end of an association where it connects to a class.
  Multiplicity: how many objects participate in an association, written as lower-bound..upper-bound (cardinality).
Examples of Association Relationships of Different Degree
  Degrees shown: unary, binary, ternary.
  Multiplicity is written as lower-bound..upper-bound, represented as 0..1, 0..*, 1..1, or 1..*.
  Similar to the minimum/maximum cardinality rules in EER.
Association Class
  An association that has attributes or operations of its own, or that participates in relationships with other classes.
  Like an associative entity in the E-R model.
Class Diagram Showing Association Classes
  The Registration class implements a many-to-many association between Student and Course.
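One way to sketch the Registration association class in Python is as an object that holds one Student, one Course, and the attributes that belong to the link itself. The attributes term and grade, and the sample data, are assumptions and are not taken from the diagram.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Student:
    student_id: str
    name: str


@dataclass(frozen=True)
class Course:
    code: str
    title: str


@dataclass
class Registration:
    """Association class: one Student-Course link with attributes of its own."""
    student: Student
    course: Course
    term: str                    # hypothetical attribute of the association
    grade: Optional[str] = None  # hypothetical attribute of the association


def courses_of(student, registrations):
    """Course codes a student is registered for."""
    return [r.course.code for r in registrations if r.student == student]


def students_in(course, registrations):
    """Names of the students registered for a course."""
    return [r.student.name for r in registrations if r.course == course]


ann, bob = Student("1", "Ann"), Student("2", "Bob")
db, ai = Course("DB101", "Databases"), Course("AI201", "AI")
registrations = [
    Registration(ann, db, "2024-1"),
    Registration(ann, ai, "2024-1"),
    Registration(bob, db, "2024-1", grade="A"),
]

print(courses_of(ann, registrations))
print(students_in(db, registrations))
```

Because each Registration is an object in its own right, the many-to-many association between Student and Course is navigated through it in both directions.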
Generalization/Specialization
  Subclass and superclass are similar to subtype and supertype in EER.
  The superclass holds common attributes, relationships, and operations.
  Disjoint vs. overlapping.
  Complete (total specialization) vs. incomplete (partial specialization).
  Abstract class: no direct instances possible, but subclasses may have direct instances.
  Concrete class: direct instances possible.
Examples of Generalization, Inheritance, and Constraints: (a) Employee Superclass with Three Subclasses
  Shared attributes and operations (superclass).
  Specialized attributes and operations (subclasses).
  An employee can only be one of these subclasses.
  An employee may be none of them.
Examples of Generalization, Inheritance, and Constraints: (b) Abstract Patient Class with Two Concrete Subclasses
  Abstract is indicated by italics.
  A patient MUST be EXACTLY one of the subtypes.
  Dynamic means a patient can change from one subclass to another over time.
Polymorphism
  Abstract operation: defines the form or protocol of the operation, but not its implementation.
  Method: the implementation of an operation.
  Polymorphism: the same operation may apply to two or more different classes in different ways.
Class-Scope Attribute
  Specifies a value common to an entire class, rather than a specific value for an instance.
  Represented by underlining.
  “=” gives the initial (default) value.
Polymorphism, Abstract Operation, Class-Scope Attribute, and Ordering
  This operation is abstract: it has no method at the Student level.
  Methods are defined at the subclass level.
  Class-scope attributes: only one value common to all instances of these classes (includes default values).
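The ideas on these slides can be sketched with Python's abc module: an abstract operation declared at the Student level, methods supplied by the subclasses, and class-scope attributes with default values. The operation name, the rates, and the fee are hypothetical and are not taken from the diagram.

```python
from abc import ABC, abstractmethod


class Student(ABC):
    """Abstract class: no direct instances are possible."""

    def __init__(self, name, credit_hours):
        self.name = name
        self.credit_hours = credit_hours

    @abstractmethod
    def calc_tuition(self):
        """Abstract operation: the form is fixed here, the method is not."""


class GraduateStudent(Student):
    rate_per_credit = 900  # class-scope attribute with a default value (hypothetical)

    def calc_tuition(self):
        return self.credit_hours * self.rate_per_credit


class UndergradStudent(Student):
    rate_per_credit = 500  # class-scope attribute (hypothetical)
    flat_fee = 200         # class-scope attribute (hypothetical)

    def calc_tuition(self):
        return self.flat_fee + self.credit_hours * self.rate_per_credit


# Polymorphism: the same operation is applied to different classes.
students = [GraduateStudent("Ann", 9), UndergradStudent("Bob", 12)]
tuitions = [s.calc_tuition() for s in students]
print(tuitions)

# The abstract class itself cannot be instantiated.
try:
    Student("Cara", 3)
    abstract_blocked = False
except TypeError:
    abstract_blocked = True
print(abstract_blocked)
```

Changing `GraduateStudent.rate_per_credit` on the class changes the value seen by every graduate student at once, which is what distinguishes a class-scope attribute from an instance attribute.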
Aggregation
  Aggregation: a part-of relationship between a component object and an aggregate object.
  Composition: a stronger form of aggregation in which a part object belongs to only one whole object and exists only as part of the whole object.
  Recursive aggregation: composition where the component object is an instance of the same class as the aggregate object.
Example of Aggregation
  A Personal Computer includes CPU, Hard Disk, Monitor, and Keyboard as parts, but these parts can exist without being installed in a computer.
  The open diamond indicates aggregation, but not composition.
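The difference between aggregation and composition can be approximated in Python by who creates the parts and who holds them. The Personal Computer example follows the slide; the Building and Room classes are a hypothetical composition example added for contrast. Python does not enforce object lifetimes, so this is a convention rather than a guarantee.

```python
class Part:
    def __init__(self, name):
        self.name = name


class PersonalComputer:
    """Aggregation: parts are created elsewhere and can outlive the computer."""

    def __init__(self, parts):
        self.parts = list(parts)

    def remove(self, part):
        self.parts.remove(part)
        return part  # the part still exists on its own


class Room:
    def __init__(self, number, building):
        self.number = number
        self.building = building  # a room belongs to exactly one building


class Building:
    """Composition (hypothetical example): rooms exist only as part of the building."""

    def __init__(self, name, room_numbers):
        self.name = name
        # Rooms are created here and are never shared with another building.
        self._rooms = [Room(number, self) for number in room_numbers]

    def room_numbers(self):
        return [room.number for room in self._rooms]


cpu = Part("CPU")
pc = PersonalComputer([cpu, Part("Hard Disk"), Part("Monitor"), Part("Keyboard")])
removed = pc.remove(cpu)
print(removed.name, len(pc.parts))

hall = Building("Hall", [101, 102])
print(hall.room_numbers())
```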
References
  Hoffer et al. Modern Database Management, 10th Edition. Pearson Education, 2011.
  Kroenke and Auer. Database Concepts, 3rd Edition. Upper Saddle River, N.J.: Pearson Prentice Hall, 2008.
  Elmasri and Navathe. Fundamentals of Database Systems, 5th Edition. Pearson Education, Inc., 2007.