CHAPTER SIX Databases and Information Management By: Kundang K Juman
LEARNING OBJECTIVES After reading this chapter, you will be able to answer the following questions: What are the problems of managing data resources in a traditional file environment, and how are they solved by a database management system? What are the major capabilities of a database management system (DBMS), and why is a relational DBMS so powerful? What are some important principles of database design? continued …
LEARNING OBJECTIVES (continued) What are the principal tools and technologies for accessing information from databases to improve business performance and decision making? Why are information policy, data administration, and data quality assurance essential for managing a firm’s data resources?
Toronto Opens up its Data Problem: The City of Toronto decided to open its data to the public Solutions: Allowed public access to much of the data that had previously been kept private Used appropriate technology to ensure proper privacy and security of data Demonstrates importance of database management in creating open access to data Illustrates need to ensure data safeguards are in place to ensure proper privacy As the text says “An effective information system provides users with accurate, timely, and relevant information.” Ask students to define and explain why these three characteristics (accurate, timely, relevant) are important.
Organizing Data in a Traditional File Environment File organization terms and concepts Bit: Smallest unit of data; binary digit (0,1) Byte: Group of bits that represents a single character Field: Group of characters forming a word, a group of words, or a complete number Record: Group of related fields File: Group of records of same type Continued …
Organizing Data in a Traditional File Environment File Organization Concepts (continued) Database: Group of related files Entity: Person, place, thing, event about which information is maintained Attribute: Description of a particular entity Key field: Identifier field used to retrieve, update, sort a record
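The hierarchy above can be sketched in a few lines of Python. This is only an illustration: the STUDENT entity, its field names, and the sample values are assumptions, not taken from the chapter.

```python
from dataclasses import dataclass

# A record groups related fields; each field holds one attribute of the entity.
@dataclass
class StudentRecord:
    student_id: int   # key field: uniquely identifies the record
    name: str
    course: str

# A file is a group of records of the same type.
student_file = [
    StudentRecord(1001, "A. Rahman", "IS 101"),
    StudentRecord(1002, "B. Chen", "IS 101"),
]

# The key field is what lets us retrieve, update, or sort records.
index = {rec.student_id: rec for rec in student_file}
found = index[1002]
by_name = sorted(student_file, key=lambda rec: rec.name)
```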
Organizing Data in a Traditional File Environment Problems with the traditional file environment Data redundancy and inconsistency Program-data dependence Lack of flexibility Poor security Lack of data sharing and availability
Organizing Data in a Traditional File Environment Problems with the Traditional File Environment Data Redundancy and Inconsistency: Data redundancy: The presence of duplicate data in multiple data files so that the same data are stored in more than one place or location Data inconsistency: The same attribute may have different values. Continued …
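A small sketch of how redundancy leads to inconsistency, assuming two hypothetical departmental files (sales and billing) that each keep their own copy of customer data. The function name and sample values are illustrative only.

```python
# Two departmental files that each store customer data independently
# (hypothetical sample data).
sales_file = {
    "C001": {"name": "Acme Ltd", "city": "Toronto"},
    "C002": {"name": "Birch Co", "city": "Ottawa"},
}
billing_file = {
    "C001": {"name": "Acme Ltd", "city": "Montreal"},
    "C003": {"name": "Cedar Inc", "city": "Calgary"},
}

def find_inconsistencies(file_a, file_b):
    """Return (key, attribute, value_a, value_b) for every conflicting value."""
    conflicts = []
    # Keys present in both files are redundant copies of the same record.
    for key in sorted(file_a.keys() & file_b.keys()):
        for attr in sorted(file_a[key].keys() & file_b[key].keys()):
            if file_a[key][attr] != file_b[key][attr]:
                conflicts.append((key, attr, file_a[key][attr], file_b[key][attr]))
    return conflicts

conflicts = find_inconsistencies(sales_file, billing_file)
```

Customer C001 is stored twice (redundancy), and the two copies disagree on the city (inconsistency).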
Organizing Data in a Traditional File Environment Problems with the Traditional File Environment (continued) Program-Data Dependence: The coupling of data stored in files and the specific programs required to update and maintain those files such that changes in programs require changes to the data Continued …
Organizing Data in a Traditional File Environment Problems with the Traditional File Environment (continued) Lack of Flexibility A traditional file system can deliver routine scheduled reports after extensive programming efforts, but it cannot deliver ad-hoc reports or respond to unanticipated information requirements in a timely fashion Continued …
Organizing Data in a Traditional File Environment Problems with the Traditional File Environment (continued) Poor security Management may have no knowledge of who is accessing or making changes to the organization’s data Lack of data sharing and availability: Information cannot flow freely across different functional areas or different parts of the organization.
The Database Approach to Data Management Database management systems How a DBMS solves the problems of the traditional file environment Relational DBMS Operations of a relational DBMS Hierarchical and network DBMS Object-oriented DBMS
The Database Approach to Data Management Relational DBMS Represents data as two-dimensional tables called relations Relates data across tables based on common data element Examples: Access, DB2, Oracle, MS SQL Server
The Database Approach to Data Management Operations of a Relational DBMS Select: Creates subset of rows that meet specific criteria Join: Combines relational tables to provide users with more information than is available in individual tables Project: Creates subset of columns, enabling users to create new tables containing only relevant information
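The three operations can be tried out with Python's built-in sqlite3 module. The sketch below assumes two hypothetical tables, supplier and part, linked by the common data element supplier_id; table names, columns, and rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, supplier_name TEXT);
CREATE TABLE part (
    part_id     INTEGER PRIMARY KEY,
    part_name   TEXT,
    unit_price  REAL,
    supplier_id INTEGER REFERENCES supplier(supplier_id)
);
INSERT INTO supplier VALUES (1, 'North Supply'), (2, 'East Metals');
INSERT INTO part VALUES (137, 'Door latch', 22.0, 1),
                        (150, 'Door seal', 6.0, 2),
                        (152, 'Compressor', 54.0, 1);
""")

# Select: subset of rows that meet a criterion.
selected = conn.execute(
    "SELECT * FROM part WHERE unit_price > 20 ORDER BY part_id").fetchall()

# Project: subset of columns.
projected = conn.execute(
    "SELECT part_name, unit_price FROM part ORDER BY part_id").fetchall()

# Join: combine both tables on the common data element supplier_id.
joined = conn.execute("""
    SELECT part.part_name, supplier.supplier_name
    FROM part JOIN supplier ON part.supplier_id = supplier.supplier_id
    ORDER BY part.part_id""").fetchall()
conn.close()
```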
The Database Approach to Data Management Object-oriented DBMS Stores data and procedures as objects that can be retrieved and shared automatically Hybrid object-relational DBMS: Provides capabilities of both object-oriented and relational DBMS
The Database Approach to Data Management Capabilities of Database Management Systems Data Definition Language Data Dictionary Querying and Reporting
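A minimal sketch of the three capabilities using sqlite3. The employee table is a hypothetical example, and SQLite's PRAGMA table_info stands in here for a data dictionary; a full DBMS typically offers a richer catalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: specify the structure of the table.
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        last_name   TEXT NOT NULL,
        department  TEXT
    )""")

# Data dictionary: PRAGMA table_info lists each column's definition.
# Each row is (cid, name, type, notnull, dflt_value, pk).
dictionary = [(row[1], row[2])
              for row in conn.execute("PRAGMA table_info(employee)")]

# Querying and reporting: load sample rows, then summarize by department.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Silva", "Sales"), (2, "Okafor", "Finance"), (3, "Tan", "Sales")])
report = conn.execute(
    "SELECT department, COUNT(*) FROM employee "
    "GROUP BY department ORDER BY department").fetchall()
conn.close()
```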
The Database Approach to Data Management Designing Databases Conceptual design: Abstract model of database from a business perspective Physical design: Detailed description of how the database is actually arranged on storage devices Entity-relationship diagram: Methodology for documenting databases illustrating relationships between database entities Normalization: Process of creating small stable data structures from complex groups of data
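Normalization can be illustrated with a small Python sketch. It assumes a hypothetical unnormalized order list in which part and supplier details repeat on every line, and splits it into three smaller structures.

```python
# Unnormalized data: part and supplier details repeat on every order line
# (hypothetical sample data).
orders = [
    {"order_id": 1, "part_id": 137, "part_name": "Door latch",
     "supplier_id": 1, "supplier_name": "North Supply"},
    {"order_id": 2, "part_id": 152, "part_name": "Compressor",
     "supplier_id": 1, "supplier_name": "North Supply"},
    {"order_id": 3, "part_id": 137, "part_name": "Door latch",
     "supplier_id": 1, "supplier_name": "North Supply"},
]

suppliers, parts, order_lines = {}, {}, []
for row in orders:
    # Each supplier and each part is stored exactly once, keyed by its ID.
    suppliers[row["supplier_id"]] = row["supplier_name"]
    parts[row["part_id"]] = {"part_name": row["part_name"],
                             "supplier_id": row["supplier_id"]}
    # The order line keeps only the keys that point at the other structures.
    order_lines.append({"order_id": row["order_id"], "part_id": row["part_id"]})
```

After the split, a supplier's name lives in one place, so changing it requires a single update instead of one per order line.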
The Database Approach to Data Management Distributed database: A database that is stored in more than one physical location Reduce the vulnerability of a single, massive central site Increase service and responsiveness to local users Can often run on smaller, less expensive computers Depend on high-quality telecommunications lines
Using Databases to Improve Business Performance and Decision Making Data warehouse Stores current and historical data from many core operational transaction systems Consolidates and standardizes information for use across enterprise, but data cannot be altered Data warehouse system will provide query, analysis, and reporting tools Data marts Subset of data warehouse Summarized or highly focused portion of firm’s data for use by specific population of users Typically focuses on single subject or line of business
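A rough sketch of the idea, assuming two hypothetical operational systems that record sales with different field names and units. The warehouse step consolidates and standardizes them; the data mart is then a focused subset for one line of business.

```python
# Hypothetical extracts from two operational systems with different conventions.
orders_system = [{"cust": "C001", "amount_usd": 250.0, "line": "appliances"}]
web_system = [{"customer_id": "c002", "total_cents": 4999, "line": "parts"}]

# Consolidate and standardize into one warehouse format.
warehouse = []
for rec in orders_system:
    warehouse.append({"customer_id": rec["cust"].upper(),
                      "amount": rec["amount_usd"],
                      "line": rec["line"],
                      "source": "orders"})
for rec in web_system:
    warehouse.append({"customer_id": rec["customer_id"].upper(),
                      "amount": rec["total_cents"] / 100,  # cents to dollars
                      "line": rec["line"],
                      "source": "web"})

# Data mart: subset of the warehouse focused on a single line of business.
parts_mart = [row for row in warehouse if row["line"] == "parts"]
```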
Using Databases to Improve Business Performance and Decision Making Tools for Business Intelligence Tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions Example of Best Western building customer relationships with CRM Principal tools include: Software for database query and reporting Online analytical processing (OLAP) Data mining
Using Databases to Improve Business Performance and Decision Making Online analytical processing (OLAP) Supports multidimensional data analysis Viewing data using multiple dimensions Each aspect of information (product, pricing, cost, region, time period) is different dimension E.g., how many washers sold in East in June compared with other regions? OLAP enables rapid, online answers to ad hoc queries This slide discusses online analytical processing, one of the three principal tools for gathering business intelligence. Ask students to come up with additional examples of what a multidimensional query might be.
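The washer question above can be sketched as a tiny multidimensional roll-up in plain Python. The sales figures and the rollup helper are hypothetical; real OLAP tools precompute and index such aggregates.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, month, units).
sales = [
    ("washer", "East", "June", 120),
    ("washer", "West", "June", 95),
    ("washer", "East", "July", 80),
    ("dryer", "East", "June", 60),
    ("washer", "North", "June", 70),
]

def rollup(facts, *dims):
    """Sum units grouped by the chosen dimensions (0=product, 1=region, 2=month)."""
    totals = defaultdict(int)
    for fact in facts:
        totals[tuple(fact[d] for d in dims)] += fact[3]
    return dict(totals)

# How many washers were sold in June, by region?
june_washers = [f for f in sales if f[0] == "washer" and f[2] == "June"]
by_region = rollup(june_washers, 1)

# The same facts viewed along two other dimensions: product and month.
by_product_month = rollup(sales, 0, 2)
```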
Using Databases to Improve Business Performance and Decision Making Data Mining Tools for analyzing large pools of data Find hidden patterns and infer rules to predict trends Associations Sequences Classifications Clusters Forecasts
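As one illustration of the "associations" category, the sketch below counts how often pairs of items appear in the same purchase. The baskets are hypothetical, and support and confidence are the standard measures for such rules; production data mining tools use far more scalable algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase transactions (market baskets).
transactions = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in transactions:
    item_counts.update(basket)
    # Sort so each pair is always counted under the same (a, b) ordering.
    pair_counts.update(combinations(sorted(basket), 2))

def confidence(antecedent, consequent):
    """Share of baskets containing the antecedent that also contain the consequent."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]

# Support: share of all baskets that contain both items.
support_chips_cola = pair_counts[("chips", "cola")] / len(transactions)
conf_chips_cola = confidence("chips", "cola")
```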
Using Databases to Improve Business Performance and Decision Making Text Mining Extracts key elements from large unstructured data sets (e.g., stored e-mails)
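A very simplified sketch of extracting key elements from unstructured text: counting the most frequent meaningful words across a few hypothetical customer e-mails. The messages and the stopword list are assumptions; commercial text mining tools go well beyond word counts.

```python
import re
from collections import Counter

# Hypothetical stored customer e-mails.
emails = [
    "The delivery was late and the box was damaged.",
    "Late delivery again. Please refund my order.",
    "Great service, but delivery was late.",
]

# Common words that carry little meaning on their own (illustrative list).
STOPWORDS = {"the", "was", "and", "my", "but", "again", "please"}

def key_terms(texts, top_n=3):
    """Return the top_n most frequent non-stopword terms with their counts."""
    words = []
    for text in texts:
        words += [w for w in re.findall(r"[a-z]+", text.lower())
                  if w not in STOPWORDS]
    return Counter(words).most_common(top_n)

terms = key_terms(emails, top_n=2)
```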
Using Databases to Improve Business Performance and Decision Making Web Mining Discovery and analysis of useful patterns and information from WWW Techniques Web content mining Knowledge extracted from content of Web pages Web structure mining E.g., links to and from Web page Web usage mining User interaction data recorded by Web server
What can Businesses Learn from Text Mining? Read the Window on Technology and answer the following questions: What challenges does the increase in unstructured data present for businesses? How does text mining improve decision making? What are the challenges involved in text mining? What kinds of companies are most likely to benefit from text mining software? Explain your answer. How might text mining lead to the erosion of personal information privacy? (See Chapter 4.) Explain.
Using Databases to Improve Business Performance and Decision Making Databases and the Web Many companies use Web to make some internal databases available to customers or partners Typical configuration includes: Web server Application server/middleware/CGI scripts Database server (hosting DBMS) Advantages of using Web for database access: Ease of use of browser software Web interface requires few or no changes to database Inexpensive to add Web interface to system
Managing Data Resources Establishing an information policy Specifies the organization’s rules for sharing, disseminating, acquiring, standardizing, classifying, and inventorying information Data administration is responsible for specific policies and procedures through which data is managed Data governance Database administration
Managing Data Resources Ensuring Data Quality Data Quality Audit: Structured survey of the accuracy and completeness of data in an information system Data cleansing: Activities for detecting and correcting data that are incorrect, incomplete, improperly formatted, or redundant
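A minimal data cleansing sketch, assuming hypothetical customer records with inconsistent formatting, a missing value, and a duplicate. The cleanse function and its rules are illustrative, not a prescribed procedure.

```python
# Hypothetical customer records with typical quality problems.
records = [
    {"id": 1, "email": "ana@example.com ", "phone": "416-555-0101"},
    {"id": 2, "email": "", "phone": "4165550102"},
    {"id": 3, "email": "ANA@example.com", "phone": "416-555-0101"},
]

def cleanse(rows):
    """Standardize formats; report rows that are incomplete or redundant."""
    cleaned, issues, seen = [], [], set()
    for row in rows:
        email = row["email"].strip().lower()                    # fix formatting
        digits = "".join(ch for ch in row["phone"] if ch.isdigit())
        if not email:
            issues.append((row["id"], "missing email"))         # incomplete
            continue
        if email in seen:
            issues.append((row["id"], "duplicate email"))       # redundant
            continue
        seen.add(email)
        cleaned.append({"id": row["id"], "email": email, "phone": digits})
    return cleaned, issues

cleaned, issues = cleanse(records)
```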
Credit Bureau Errors – Big People Problems Read the Window on Organizations, and then discuss the following questions: Assess the business impact of credit bureaus’ data quality problems for the credit bureaus, for lenders, and for individuals. Are any ethical issues raised by credit bureaus’ data quality problems? Explain your answer. Analyze the management, organization, and technology factors responsible for credit bureaus’ data quality problems. What can be done to solve these problems?