Download presentation
Presentation is loading. Please wait.
Published byMae Pearson Modified over 9 years ago
1
MIS DATABASE SYSTEMS, DATA WAREHOUSES, AND DATA MARTS MBNA ebay
CHAPTER 3 DATABASE SYSTEMS, DATA WAREHOUSES, AND DATA MARTS Hossein BIDGOLI MBNA ebay Hong Kong Airport A not so perfect match proquest
2
Chapter 3 Database Systems, Data Warehouses, and Data Marts
l e a r n i n g o u t c o m e s LO1 Define a database and a database management system. LO2 Explain logical database design and the relational database model. LO3 Define the components of a database management system. LO4 Summarize recent trends in database design and use. LO5 Explain the components and functions of a data warehouse.
3
l e a r n i n g o u t c o m e s (cont’d.)
Chapter 3 Database Systems, Data Warehouses, and Data Marts l e a r n i n g o u t c o m e s (cont’d.) LO6 Describe the functions of a data mart.
4
Managing data and information
MBNA Managing data and information Usually too much data rather than too little in organizations How does an organization organize all this data and information? Database – a collection of integrated and related files Ebay Proquest MBNA
5
What is Database Technology ?
A collection of related data organized in a way that makes it valuable and useful Allows organizations to retrieve, store, and analyze information easily Is vital to an organization’s success in running operations and making decisions Database Collection of related data that can be stored in a central location or in multiple locations Usually a group of files File Group of related records All files are integrated Record Group of related fields Data hierarchy
6
The Hierarchy of Data Bit (a binary digit): a circuit that is either on or off Byte: 8 bits Character: each byte represents a character; the basic building block of information Field --- Individual characteristics about an ENTITY. Fields are also called attributes or columns depending on the type of DBMS Record: A group of fields or attributes to describe a single instance of an ENTITY. These are also called rows depending on the DBMS File: A collection of records or instances for a given ENTITY. These are also called tables, depending on the DBMS Database: A collection of files or entities containing information to support a given system or a particular topic area Hierarchy of data Bits, characters, fields, records, files, and databases
7
Databases Critical component of information systems
Any type of analysis that’s done is based on data available in the database Database management system (DBMS) Creating, storing, maintaining, and accessing database files Advantages over a flat file system
8
The Traditional Approach
Payroll Grades Grades Student Tuition Student Tuition Traditional approach: separate data files are created for each application Results in data redundancy (duplication) Data redundancy conflicts with data integrity For now, take a look at Exhibit 3.2, which shows how users, the DBMS, and the database interact. The user issues a request, and the DBMS searches the data-base and returns the information to the user. In the past, data was stored in a series of fi les called “ fl at fi les,” because they weren’t arranged in a hierarchy and there was no relation among these fi les. The prob-lem with this fl at fi le organization was that the same data could be stored in more than one fi le, creating data redundancy. For example, in a customer database, a customer’s name might be stored in more than one table. This duplication takes up unnecessary storage space and can make retrieving data ineffi cient. In ad-dition, in a fl at fi le system, data might not be updated in all fi les consistently, resulting in confl icting reports generated from these fi les. Updating a fl at fi le system can also be time consuming. Parking Parking The Traditional Approach to Data Management U of L example
9
Database management system
The Database Approach Payroll Grades Payroll Grades Tuition Parking Database management system Tuition Database approach: pool of related data is shared by multiple applications Significant advantages over traditional approach The database approach to data management provides significant advantages over the traditional file-based approach. Define general data management concepts and terms, highlighting the advantages of the database approach to data management. Describe the relational database model and outline its basic features. Parking The Database Approach to Data Management
10
Advantages of the Database Approach
11
Exhibit 3.2 Interaction between the user, DBMC, and Database The major components of a DBMS are discussed in “ Components of a DBMS” later in the chapter. If you’re familiar with Microsoft Offi ce soft-ware, you know that you use Word to create a docu-ment and Excel to create a spreadsheet. You can also use Access to create and modify a database, although it doesn’t have as many features as other DBMSs. For now, take a look at Exhibit 3.2, which shows how users, the DBMS, and the database interact. The user issues a request, and the DBMS searches the data-base and returns the information to the user.
12
Types of Data in a Database
Internal data Collected within organization External data Sources Competitors, customers, and suppliers Distribution networks Economic Government regulations Labor and population statistics Tax records functional information systems Internal data is collected from within an organization and can be transaction records, sales records, personnel records, and so forth. An organization might use internal data on customers’ past purchases to generate BI about future buying patterns, for example. Internal data is usually stored in the organization’s internal databases and can be used b examples of sources for external data: • Competitors, customers, and suppliers • Distribution networks • Economic indicators ( for example, the consumer price index) • Government regulations • Labor and population statistics • Tax records functional information systems.
13
BI in Action: Law Enforcement
Business intelligence (BI) Used in law enforcement as well as in the business world Richmond, Virginia System generates BI reports that help pinpoint crime patterns Allocate manpower to days and locations where crime likely to occur BI in Action: Law Enforcement Business intelligence ( BI) is used in law enforcement as well as in the busi-ness world. In Richmond, Virginia, data entered into the information sys-tem includes crime reports from the past 5 years, records of 911 phone calls, details about weather patterns, and information about special events. The system generates BI reports that help pinpoint crime patterns and are useful for personnel deployment, among other purposes. The sys-tem has increased public safety, reduced 911 calls, and helped manage-ment make better use of Richmond’s 750 offi cers. Recently, the department refi ned its reports by separating violent crimes into robberies, rapes, and homicides to help them discover patterns for certain types of crime. For example, the department discovered that His-panic workers were often robbed on paydays. By entering workers’ pay-days into the system and looking at robbery patterns, law enforcement offi cers were able to identify days and locations where these incidents were likely to occur. Moving additional offi cers into those areas on pay-days has reduced the number of robberies. Calgary
14
Methods for Accessing Files
Sequential file structure Records organized and processed in numerical or sequential order Organized based on a “primary key” Usually used for backup and archive files Because they need updating only rarely Random access file structure Records can be accessed in any order Fast and very effective when a small number of records need to be processed daily or weekly In a sequential access fi le structure, records in fi les are organized and processed in numerical or sequential order, typically the order in which they were entered. Records are or-ganized based on what is known as a “ primary key,” discussed later in this chapter, such as Social Security numbers or account numbers. For example, to access record number 10, records 1 through 9 must be read fi rst. This type of access method is effective when a large number of records are processed less frequently, perhaps on a quarterly or yearly basis. Because access speed usually isn’t critical, these records are typically stored on magnetic tape. A sequential fi le structure is usually used for backup and archive fi les, because they need updating only rarely. In a random access fi le structure, records can be accessed in any order, regardless of their physical loca-tion in storage media. This method of access is fast and very effective when a small number of records needs to be processed daily or weekly. To achieve this speed, these records are often stored on magnetic disks. Disks are random access devices, whereas tapes are sequential access devices ( consider how much quicker it is to skip a song on a CD as opposed to a tape cassette).
15
Methods for Accessing Files (cont’d.)
Indexed sequential access method (ISAM) Records accessed sequentially or randomly Depending on the number being accessed Indexed access Uses an index structure with two parts: Indexed value Pointer to the disk location of the record matching the indexed value With the indexed sequential access method ( ISAM), records can be accessed sequentially or randomly, depending on the number being accessed. For a small number, random access is used, and for a large number, Indexed ACCESS an index is more useful when the number of records is small. Ac-cess speed with this method is fast, so it’s recommended when records must be accessed frequently
16
Logical Database Design
Physical view How data is stored on and retrieved from storage media Logical view How information appears to users How it can be organized and retrieved Can be more than one logical view Beefore designing a database, you need to know the two ways information is viewed in a database. two ways infor-mation is viewed in a database. The physical view involves how data is stored on and retrieved from stor-age media, such as hard disks, magnetic tapes, or CDs. For each data-base, there’s only one physical view of data. The logical view involves how information ap-pears to users and how it can be organized and retrieved. There can be more than one logical view of data, depend-ing on the user.
17
Logical Database Design (cont’d.)
Data model Determines how data is created, represented, organized Includes Data structure Operations Integrity rules Data structure— Describes how data is organized and the relationship between records • Operations— Describe methods, calculations, and so forth that can be performed on data, such as updating and querying data • Integrity rules— Defines the boundaries of a database, such as maximum and minimum values allowed for a field, constraints ( limits on what type of data can be stored in a field), and access methods
18
The Relational Model Relational model Data dictionary
Uses a two-dimensional table of rows and columns of data Data dictionary Field name Field data type Default value Validation rule A relational model uses a two- dimensional table of rows and columns of data. Rows are records ( also called “ tuples”), and columns are fi elds ( also referred to as “ attributes”). To begin designing a relational database, you must defi ne the logical structure by defi ning each table and the fi elds in it. For example, the Students table has fi elds for StudentID, StudentFirstName, StudentLast- Name, and so forth. The collection of these defi nitions is stored in the data dictionary. The data dictionary can also store other defi nitions, such as data types for fi elds, default values for fi elds, and validation rules for data in each fi eld, as described in the following list:
19
The Relational Model - Example
Primary key Unique identifier Foreign key Establishes relationships between tables Normalization Improves database efficiency Eliminates redundant data To improve database effi ciency, a process called normalization is used, which eliminates redundant data ( storing customer names in only one table, for example) and ensures that only related data is stored in a table.
20
The Relational Model Data retrieval Select Project Join Intersection
Union Difference A select operation searches data in a table and retrieves records based on certain criteria Table 3.1 shows data stored in the Stu-dents table. Using the select operation you can generate a list of only the students majoring in MIS, as Table 3.2 shows. A project operation pares down a table by eliminating columns ( fi elds) according to certain criteria. For example, you need a list of students but don’t want to include their ages. Using the project operation you can retrieve the data shown in Table The “( Table 3.1)” in this statement means to use the data in Table 3.1. A join operation combines two tables based on a common field ( for example, the primary key in the first table and the foreign key in the second table). Table 3.4 shows data in the Customers table, and Table 3.5 shows data in the Invoices table. The Customer# is the primary key for the Customers tables and a foreign key in the Invoices table; the Invoice# is the primary key for the Invoices table. Table 3.6 shows the table resulting from joining these two tables.
21
Components of a DBMS Database engine Data definition Data manipulation
Application generation Data administration A database engine, the heart of DBMS software, is responsible for data storage, manipulation, and retrieval. It converts logical requests from users into their physical equivalents.
22
Database Engine Heart of DBMS software
Responsible for data storage, manipulation, and retrieval Converts logical requests from users into their physical equivalents A database engine, the heart of DBMS software, is responsible for data storage, manipulation, and retrieval. It converts logical requests from users into their physical equivalents ( reports, for example) by interacting with other components of the DBMS ( usually the data manipulation component). For example, a marketing manager wants to see a list of the top three salespeople in the Southeast re-gion ( a logical request). The database engine interacts with the data manipulation component to fi nd where these three names are stored and displays them on screen or in a printout ( the physical equivalent). Because more than one logical view of data is possible, the database engine can retrieve and return data to users in many different ways.
23
Data Definition Create and maintain the data dictionary
Define the structure of files in a database Adding fields Deleting fields Changing field size Changing data type The data defi nition component is used to create and maintain the data dictionary and defi ne the structure of fi les in a database. Any changes to a database’s struc-ture, such as adding fi elds, deleting fi elds, changing a fi eld’s size, and changing the data type stored in a fi eld, are made with this component.
24
Data Manipulation Add, delete, modify, and retrieve records from a database Query language Structured Query Language (SQL) Standard fourth-generation query language used by many DBMS packages SELECT statement Query by example (QBE) Construct statement of query forms Graphical interface The data manipulation component is used to add, delete, modify, and retrieve records from a database. Typically, a query language is used for this component. Many query languages are available, but Structured Query Language ( SQL) and Query By Example ( QBE) are two of the most widely used. Structured Query Language ( SQL) is a standard fourth- generation query language used by many DBMS packages, such as Oracle 11g and Microsoft SQL Server. With query by example ( QBE), you request data from a database by constructing a statement made up of query forms. With current graphical databases, you simply click to select query forms instead of having to remember keywords, as you do with SQL.
25
Application Generation
Design elements of an application using a database Data entry screens Interactive menus Interfaces with other programming languages The application generation component is used to de-sign elements of an application using a database, such as data entry screens, interactive menus, and interfaces with other programming languages. These applications might be used to create a form or generate a report, for example. If you’re designing an order entry applica-tion for users, you could use the application generation component to create a menu system that makes the ap-plication easier to use. Typically, IT professionals and database administrators use this component.
26
Data Administration Used for: Create, read, update, and delete (CRUD)
Backup and recovery Security Change management Create, read, update, and delete (CRUD) Database administrator (DBA) Individual or department Responsibilities The data administration component, also used by IT professionals and database administrators, is used for tasks such as backup and recovery, security, and change management. In addition, this component is used to determine who has permission to perform certain func-tions, often summarized as create, read, update, and delete ( CRUD). In large organizations, database design and man-agement is handled by the database administrator ( DBA), although with complex databases, this task is sometimes handled by an entire department.
27
Recent Trends in Database Design and Use
Data-driven Web sites Distributed databases Client/server databases Object-oriented databases Data-driven Web site Interface to a database Retrieves data and allows users to enter data Improves access to information Useful for: E-commerce sites that need frequent updates News sites that need regular updating of content Forums and discussion groups Subscription services, such as newsletters Distributed database Data is stored on multiple servers placed throughout an organization Reasons for choosing Approaches for setup Fragmentation Replication Allocation Security issues
28
Data Warehouses, Data Marts, and Data Mining
Data warehouse: collects business information from many sources in the enterprise Data mart: a subset of a data warehouse Data mining: an information-analysis tool for automated discovery of patterns and relationships in a data warehouse or a data mart Online Analytical Processing -Graphical software tools that provide complex analysis of data stored in a database Data warehouse supports different types of analysis Generates reports for decision making Data-mining analysis Discover patterns and relationships Reports Cross-reference segments of an organization’s operations for comparison purposes Find patterns and trends that can’t be found with databases Analyze large amounts of historical data quickly Online analytical processing (OLAP) Generates business intelligence Uses multiple sources of information and provides multidimensional analysis Hypercube Drill down and drill up Data mart Smaller version of data warehouse Used by single department or function Advantages over data warehouses More limited scope than data warehouses
29
Data Warehouses Data warehouse Multidimensional data
Hong Kong Airport Data warehouse Collection of data used to support decision-making applications and generate business intelligence Multidimensional data List the Different Databases that Hong Kong Airport would utilize? data warehouse is a collection of data from a variety of sources used to support decision- making appli-cations and generate business intelligence. 8 Data warehouses store multidimensional data, so they’re sometimes called “ hypercubes.” Typically, data in a data warehouse is described as having the following characteristics in contrast to data in a database: • Subject oriented— Focused on a specifi c area, such as the home- improvement business or a university, whereas data in a database is transaction/ function oriented. • Integrated— Comes from a variety of sources, whereas data in a database usually doesn’t. • Time variant— Categorized based on time, such as historical information, whereas data in a database only keeps recent activity in memory. • Type of data— Captures aggregated data, whereas data in a database captures raw transaction data. • Purpose— Used for analytical purposes, whereas data in a database is used for capturing and managing transactions.
30
A Data Warehouse Configuration
Exhibit 3.9 A Data Warehouse Configuration INPUT Variety of sources External Databases Transaction files ERP systems CRM systems Extraction, transformation, and loading (ETL) Extraction Collecting data from a variety of sources Converting data into a format that can be used in transformation processing Transformation processing Make sure data meets the data warehouse’s needs Loading Process of transferring data to the data warehouse
31
Exhibit 3.10 Slicing and Dicing Data Data warehouses are not transaction-oriented. Data warehouses support online analytical processing (OLAP).
32
A not so perfect match A not so perfect match
With the increasing power of Data mining techniques, comes ever increasing and reaching uses of this powerful technology. 1. What are the benefits of DNA databases? 2. What problems do DNA databases pose? 3. Who should be included in a national DNA database? Should it be limited to convicted felons? 4. Who should be able to use DNA databases? What are the benefits of DNA databases? • DNA databases have become a very powerful crime-fighting tool. Law enforcement agencies are now able to identify criminals by their own genetics. • DNA identification can also prove the innocence of an individual and thus avoid the potential act of wrongfully convicting them of a crime they did not commit. • In Canada, the National DNA Data Bank allows local law enforcement agencies to share DNA profiles. • DNA analysis is used to solve “cold cases” many years after the crime. 2. What problems do DNA databases pose? • Privacy advocates and defense lawyers believe genetic databases pose risks to the innocent if they contain data on people who are not convicted criminals. • Individuals changed with misdemeanors could also be included in the DNA database. • People who collect and analyze DNA can make mistakes.
33
Summary Databases Data warehouses and data marts Accessing files
Design principles Components Recent trends Data warehouses and data marts 3. Who should be included in a national DNA database? Should it be limited to convicted felons? Explain your answer. Students’ will vary in their answer to the questions. Some will believe that everyone is innocent until proven guilty and that there is little risk involved with having the DNA genes in a database. They will see this as just another “tool” to be used by law enforcement. Others will take a very different approach to this question. They will state that it is an invasion of privacy. All individuals have a right to privacy. If there is no crime committed by an individual, then there is no need to have to submit to giving a DNA sample. As stated in the case, errors have been made and individuals have been wrongfully convicted of a crime based on DNA profiling. 4. Who should be able to use DNA databases? Clearly, the answer to this question is that DNA databases should be very restricted to those involved in law enforcement agencies. This does not imply that just because a person works in law enforcement that they have access. Like all information, it must be highly protected and only made available only to individuals having the proper authority to use it. Even at that, it still must be restricted to fill the “need to know” requirements of their job.
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.