Chapter 5: Organizing Data & Information
Data & Databases
- Data consists of raw facts that, when organized, may be transformed into information
- A Database is a collection of data organized to meet users’ needs
- A Database Management System (DBMS) is a group of programs that manipulate the database & provide an interface between the database & its users or other application programs
Chapter 5 IS for Management
The Hierarchy of Data (Figure 5.1)
Levels, from the top of the figure to the bottom:
- Database Management System
- Database
- File (table)
- Record (entity, row)
- Field (characteristic, column)
- Byte (character)
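The levels in the hierarchy can be mimicked with nested Python values. This is only an illustrative sketch: the variable names are assumptions, and the sample values are borrowed from the employee table in Figure 5.2.

```python
# Hierarchy of data, bottom-up: characters -> field -> record -> file -> database.
# All variable names here are illustrative.

field = "Johns"                       # a field: a string of characters (bytes)
record = {                            # a record: related fields describing one entity
    "employee_number": "005-10-6321",
    "last_name": field,
    "first_name": "Francine",
}
employee_file = [record]              # a file (table): a collection of related records
database = {"employees": employee_file}   # a database: a collection of related files

# Walk back down the hierarchy to a single character of one field.
first_char = database["employees"][0]["last_name"][0]
print(first_char)
```

A DBMS sits above all of this: it is the software that would create, query, and protect a structure like `database` on behalf of users and programs.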
Data Entities, Attributes, & Keys
- Entity: A generalized class of people, places, or things for which data is collected, stored, & maintained. Examples: customers, employees
- Attribute: A characteristic of an entity; something that describes it. Examples: customer name, employee name
- Key: A field or set of fields in a record that uniquely identifies the record. Examples: social insurance number, customer number
Keys & Attributes (Figure 5.2)

Employee Number  Last Name  First Name  Hire Date  Department
005-10-6321      Johns      Francine    10-7-65    257
549-77-1001      Buckley    Bill        2-17-79    650
098-40-1370      Fiske      Steven      1-5-85     598

Each row is an entity (record), each column is an attribute (field), & Employee Number is the key field.
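The role of the key field can be sketched in Python by indexing the records of Figure 5.2 on Employee Number. The helper name `build_index` is hypothetical; the point is that a key must be unique, so a duplicate value is rejected.

```python
# Records from Figure 5.2; each dict is one entity (record).
employees = [
    {"employee_number": "005-10-6321", "last_name": "Johns",
     "first_name": "Francine", "hire_date": "10-7-65", "department": 257},
    {"employee_number": "549-77-1001", "last_name": "Buckley",
     "first_name": "Bill", "hire_date": "2-17-79", "department": 650},
    {"employee_number": "098-40-1370", "last_name": "Fiske",
     "first_name": "Steven", "hire_date": "1-5-85", "department": 598},
]

def build_index(records, key_field):
    """Index records by key_field, rejecting duplicate key values."""
    index = {}
    for rec in records:
        key = rec[key_field]
        if key in index:
            raise ValueError(f"duplicate key: {key}")
        index[key] = rec
    return index

by_number = build_index(employees, "employee_number")
print(by_number["549-77-1001"]["last_name"])
```

An attribute such as Department would not work as a key in general, because two employees can share the same department.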
The Traditional Approach (Figure 5.3)
- Separate files are created & stored for each application program
Drawbacks to the Traditional Approach
- Data redundancy: Duplication of data in separate files
- Lack of data integrity: Data integrity is the degree to which the data in any one file is accurate; duplicate copies that are updated separately can end up disagreeing
- Program-data dependence: A situation in which programs & data organized for one application are incompatible with programs & data organized differently for another application
- Inability to link data
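How redundancy undermines integrity can be shown with a small sketch. The two "files" and the helper `find_conflicts` are hypothetical; they stand in for separate application files that each keep their own copy of the same fact.

```python
# Two application-specific "files" that each store the same employee's department.
payroll_file = {"005-10-6321": {"last_name": "Johns", "department": 257}}
benefits_file = {"005-10-6321": {"last_name": "Johns", "department": 257}}

# The payroll application records a transfer; the benefits file is never updated.
payroll_file["005-10-6321"]["department"] = 650

def find_conflicts(file_a, file_b, field):
    """Return the keys whose value for `field` differs between the two files."""
    shared = file_a.keys() & file_b.keys()
    return sorted(k for k in shared if file_a[k][field] != file_b[k][field])

conflicts = find_conflicts(payroll_file, benefits_file, "department")
print(conflicts)
```

With a single shared copy of the record there would be nothing to reconcile, which is the motivation for the database approach.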
The Database Approach (Figure 5.4)
- A pool of related data is shared by multiple applications
- Rather than having separate data files, each application uses a collection of data that is either joined or related in the database
Advantages to the Database Approach
- Improved strategic use of corporate data
- Reduced data redundancy
- Improved data integrity
- Easier modification & updating
- Data & program independence
- Better access to data & information
- Standardization of data access
- A framework for program development
- Better overall protection of the data
- Shared data & information resources
Disadvantages to the Database Approach
- Relatively high cost of purchasing & operating a DBMS in a mainframe operating environment
- Increased cost of specialized staff
- Increased vulnerability
Database Design
- Logical design
  - Precedes physical design
  - Abstract model of how data should be structured & arranged
  - Users should assist in creating the logical design
- Physical design
  - Starts with the logical design
  - Specifies what hardware/software will be used
  - Fine-tunes the logical design for performance/cost considerations
- Planned data redundancy
  - A way of organizing data in which the logical database design is altered so that certain data entities are combined
  - Summary totals are carried in the data records rather than calculated from elemental data
  - Some data attributes are repeated in more than one data entity to improve database performance
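Planned data redundancy can be sketched as an order record that carries its own summary total. The record layout and the helpers `computed_total` and `add_line` are assumptions made for illustration; amounts are kept in whole cents to avoid rounding issues.

```python
order = {
    "order_id": 1001,
    "lines": [
        {"item": "widget", "qty": 2, "unit_price_cents": 500},
        {"item": "gasket", "qty": 10, "unit_price_cents": 25},
    ],
}

def computed_total(order):
    """Total calculated from the elemental data (the order lines)."""
    return sum(line["qty"] * line["unit_price_cents"] for line in order["lines"])

# Planned redundancy: the summary total is stored in the record itself,
# so reading it no longer requires scanning every order line.
order["total_cents"] = computed_total(order)

def add_line(order, item, qty, unit_price_cents):
    """Add a line and keep the stored total in step with the elemental data."""
    order["lines"].append(
        {"item": item, "qty": qty, "unit_price_cents": unit_price_cents}
    )
    order["total_cents"] += qty * unit_price_cents

add_line(order, "bolt", 4, 50)
print(order["total_cents"], computed_total(order))
```

The trade-off is visible in `add_line`: reads become cheaper, but every update must now maintain the redundant value.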
Data Modeling
- Data model: A map or diagram of entities & their relationships
- Enterprise data modeling: Data modeling done at the level of the entire organization
- Entity-Relationship (ER) diagrams: A data model that uses basic graphical symbols to show the organization of & relationships between data (Figure 5.5)
Database Models
- Hierarchical (Figure 5.6): A data model in which the data is organized in a top-down or inverted tree structure
- Network (Figure 5.7): An expansion of the hierarchical database model with an owner-member relationship in which a member may have many owners
- Relational (Figure 5.8): All data elements are placed in two-dimensional tables, called relations, that are the logical equivalent of files
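The three models differ mainly in how relationships are represented. The sketch below uses made-up projects and departments (none of these names come from the figures) to contrast them in Python.

```python
# Hierarchical: an inverted tree; each child sits under exactly one parent.
hierarchical = {
    "Project 1": {
        "Department A": ["Employee 1", "Employee 2"],
        "Department B": ["Employee 3"],
    }
}

# Network: owner-member relationships; a member may have many owners.
network_owners = {
    "Department A": ["Project 1", "Project 2"],
    "Department B": ["Project 2"],
}

# Relational: flat two-dimensional tables; relationships are carried by
# shared values rather than by the physical structure.
projects = [("P1", "Project 1"), ("P2", "Project 2")]
departments = [("DA", "Department A"), ("DB", "Department B")]
project_department = [("P1", "DA"), ("P2", "DA"), ("P2", "DB")]

# Which projects involve Department A? A simple scan of one table answers it.
owners_of_da = [p for p, d in project_department if d == "DA"]
print(owners_of_da)
```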
A Relational Database (3 tables)
Relational Database Terminology
- Domain: Allowable values for attributes
- Selecting: Data manipulation that eliminates rows (records) according to user-defined criteria
- Projecting: Data manipulation that eliminates columns (attributes) in a table
- Joining: Data manipulation that combines two or more tables
- Linking: Relating tables in a relational database together by a common attribute(s)
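Selecting, projecting, and joining can be sketched over tables held as lists of dictionaries. The function names are illustrative, the employee rows come from Figure 5.2, and the department names are invented for the example. The join shown is an inner join on one common attribute, which is one reasonable reading of "joining" here.

```python
employees = [
    {"employee_number": "005-10-6321", "last_name": "Johns", "department": 257},
    {"employee_number": "549-77-1001", "last_name": "Buckley", "department": 650},
    {"employee_number": "098-40-1370", "last_name": "Fiske", "department": 598},
]
departments = [  # hypothetical department names
    {"department": 257, "dept_name": "Accounting"},
    {"department": 650, "dept_name": "Marketing"},
]

def select(rows, predicate):
    """Selecting: keep only the rows that satisfy the criteria."""
    return [r for r in rows if predicate(r)]

def project(rows, columns):
    """Projecting: keep only the named columns."""
    return [{c: r[c] for c in columns} for r in rows]

def join(left, right, on):
    """Joining: combine rows from two tables that share a value for `on`."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]

selected = select(employees, lambda r: r["department"] > 500)
projected = project(employees, ["last_name"])
joined = join(employees, departments, "department")

print([r["last_name"] for r in selected])
print(projected)
print([(r["last_name"], r["dept_name"]) for r in joined])
```

Fiske does not appear in the joined result because no row in the hypothetical `departments` table has department 598.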
Schemas & Subschemas
- Schema
  - View of the entire database
  - Includes logical & physical structure & relationships among all data
- Subschema
  - User view of a portion of the database
  - Can have many subschemas for one database
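In SQL-based systems a subschema is often approximated with a view. The sketch below uses Python's built-in `sqlite3` module; the table layout follows Figure 5.2, while the view name `directory` and the choice of SQLite are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Part of the schema: the full employee table.
    CREATE TABLE employee (
        employee_number TEXT PRIMARY KEY,
        last_name       TEXT,
        first_name      TEXT,
        hire_date       TEXT,
        department      INTEGER
    );
    INSERT INTO employee VALUES ('005-10-6321', 'Johns', 'Francine', '10-7-65', 257);
    INSERT INTO employee VALUES ('549-77-1001', 'Buckley', 'Bill', '2-17-79', 650);

    -- A subschema: a user view exposing only part of the data.
    CREATE VIEW directory AS
        SELECT last_name, first_name, department FROM employee;
""")

cur = conn.execute("SELECT * FROM directory ORDER BY last_name")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns)
print(rows)
```

Users of `directory` never see employee numbers or hire dates, and several such views can coexist over the same underlying table.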
Data Definition Language & Dictionary
- Data Definition Language (DDL): A collection of instructions & commands used to define & describe data & data relationships in a database
- Data Dictionary: A detailed description of all data used in the database
  - Provides a standard definition of terms & data elements
  - Assists programmers in designing & writing programs
  - Simplifies database modification
  - Reduces data redundancy
  - Increases data reliability
  - Faster program development
  - Easier modification of data & information
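A DDL statement and a very small data dictionary can be sketched with Python's `sqlite3` module. `CREATE TABLE` is standard SQL DDL; `PRAGMA table_info` is specific to SQLite and is used here only as a stand-in for a real data dictionary, which would hold far more detail (definitions, owners, allowable values).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and describe the data.
conn.execute("""
    CREATE TABLE employee (
        employee_number TEXT PRIMARY KEY,
        last_name       TEXT NOT NULL,
        hire_date       TEXT,
        department      INTEGER
    )
""")

# A minimal dictionary built from SQLite's own catalog.
# Each table_info row is (cid, name, type, notnull, default_value, pk).
dictionary = [
    {"name": name, "type": col_type, "required": bool(notnull), "key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(employee)")
]

for entry in dictionary:
    print(entry)
```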
Logical & Physical Access Paths (Figure 5.14)
- Logical access path: An application (management inquiries, other software, application programs) requires information from the DBMS
- Physical access path: The DBMS accesses a storage device to retrieve the data
Manipulating Data
- Concurrency control: A method of dealing with a situation in which two or more people need to access the same record in a database at the same time
- Data Manipulation Language (DML): The commands that are used to manipulate the data in a database
- Structured Query Language (SQL): A standardized data manipulation language for querying a database
  - Most modern databases are SQL compliant
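The sketch below shows SQL DML statements (INSERT, UPDATE, SELECT) issued through Python's `sqlite3` module, together with one possible form of concurrency control. The `version` column and the `transfer` helper are assumptions: they implement an optimistic version check, which is only one of several techniques (record locking is another) and is not prescribed by the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        employee_number TEXT PRIMARY KEY,
        last_name       TEXT,
        department      INTEGER,
        version         INTEGER DEFAULT 0   -- hypothetical change counter
    )
""")

# DML: insert rows.
conn.executemany(
    "INSERT INTO employee (employee_number, last_name, department) VALUES (?, ?, ?)",
    [("005-10-6321", "Johns", 257), ("549-77-1001", "Buckley", 650)],
)

def transfer(conn, employee_number, new_department, seen_version):
    """Update the record only if it is unchanged since the caller read it."""
    cur = conn.execute(
        "UPDATE employee SET department = ?, version = version + 1 "
        "WHERE employee_number = ? AND version = ?",
        (new_department, employee_number, seen_version),
    )
    return cur.rowcount == 1

# Two users both read version 0 of the same record, then both try to update it.
first = transfer(conn, "005-10-6321", 650, seen_version=0)   # applied
second = transfer(conn, "005-10-6321", 598, seen_version=0)  # stale, rejected

# DML: query the result.
dept = conn.execute(
    "SELECT department FROM employee WHERE employee_number = ?",
    ("005-10-6321",),
).fetchone()[0]
print(first, second, dept)
```

The second update is refused instead of silently overwriting the first, so the second user has to re-read the record before trying again.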
DBMS Selection Criteria
- Database size
- Number of concurrent users
- Performance
- Integration
- Features
- Vendor
- Cost
Database Developments (1)
- Distributed database: A database in which the actual data may be spread across several smaller databases connected via telecommunications devices
  - Transparent to the user (the user does not know where the data is)
- Replicated database: A duplicate of the original database (reduces telecommunications time & cost)
Database Developments (2)
- Data warehouse: A relational database management system designed specifically to support management decision making
- Data mart: A subset of a data warehouse for small & medium-size businesses or departments within larger companies
- Data mining: Automated discovery of patterns & relationships in a data warehouse
  - Built-in analysis tools
Database Developments (3)
- On-line Transaction Processing (OLTP): Transaction processing happens at the time of the transaction
- On-line Analytical Processing (OLAP): Supports high-speed analysis of data involving complex relationships
- Multidimensional databases: Data can include graphics, photographs, sound files, etc.
- Open Database Connectivity (ODBC): Software written in compliance with ODBC standards can be used with any ODBC-compliant database
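The kind of analysis OLAP supports can be hinted at with a roll-up over several dimensions. The sales figures, dimension names, and the `roll_up` helper are all invented for illustration; a real OLAP tool would do this at far larger scale and with precomputed aggregates.

```python
from collections import defaultdict

# Each fact is (region, quarter, product, amount); the data is hypothetical.
DIMENSIONS = ("region", "quarter", "product")
sales = [
    ("East", "Q1", "widget", 100),
    ("East", "Q2", "widget", 150),
    ("West", "Q1", "widget", 80),
    ("West", "Q1", "gasket", 40),
]

def roll_up(facts, dims):
    """Sum the amount over every combination of the chosen dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[p] for p in positions)
        totals[key] += fact[-1]
    return dict(totals)

by_region = roll_up(sales, ["region"])
by_region_quarter = roll_up(sales, ["region", "quarter"])
print(by_region)
print(by_region_quarter)
```

Changing the list of dimensions changes the level of detail, which is the "slice and summarize along any dimension" style of question that distinguishes analytical processing from transaction processing.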
Object-Relational Database Management Systems
- Can manipulate audio, video, & graphical data
- Hypertext: Users can search & manipulate alphanumeric data in an unstructured way
- Hypermedia: Users can search & manipulate multimedia forms of data
- Spatial data technology: Use of an object-relational database to store & access data according to the location it describes & to permit spatial queries & analysis
Case: US West