Published by Jayson McLaughlin. Modified over 9 years ago.
2
Chapter 3: DATABASES AND DATA WAREHOUSES
Building Business Intelligence
3
INTRODUCTION
• Businesses need business intelligence (BI)
• Business intelligence – collective information about your customers, competitors, business partners, environment, and internal operations
• Enables effective, important, and strategic decision making
• Such information needs to be gathered and organized in data repositories such as databases, data warehouses, and data marts; IT tools (such as a DBMS and data-mining tools) are then used to define and analyze relationships within the information
McGraw-Hill © 2007 The McGraw-Hill Companies, Inc. All rights reserved.
4
INTRODUCTION
• IT tools help process information to create business intelligence according to…
  – OLTP (online transaction processing)
  – OLAP (online analytical processing)
5
INTRODUCTION
• OLTP – gathering and processing transaction information and updating existing information to reflect the transaction
• Databases and DBMSs support OLTP
• Operational database – database that supports OLTP
• Simple queries that require information from a single database
• OLTP (online transaction processing)
  – Major task of a traditional relational DBMS
  – Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.
  – Aims at reliable and efficient processing of a large number of transactions while ensuring data consistency
• OLAP (online analytical processing)
  – Major task of a data warehouse system
  – Data analysis and decision making
  – Aims at efficient multidimensional processing of large data volumes
  – Fast, interactive answers to large aggregate queries
• Distinct features (OLTP vs. OLAP)
  – User and system orientation: customer vs. market
  – Data contents: current, detailed vs. historical, consolidated
  – Database design: ER + application vs. star + subject
  – View: current, local vs. evolutionary, integrated
  – Access patterns: update vs. read-only but complex queries
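The contrast between the two access patterns can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module; the `sale` table, its columns, and the sample values are assumptions made for the example, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP-style work: many small transactions, each recording one business event.
for region, amount in [("East", 100.0), ("West", 250.0), ("East", 50.0)]:
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO sale (region, amount) VALUES (?, ?)", (region, amount))

# OLAP-style work: one read-only aggregate query over the accumulated history.
totals = dict(conn.execute("SELECT region, SUM(amount) FROM sale GROUP BY region"))
print(totals)
```

In practice the aggregate query would usually run against a data warehouse rather than the operational database; a single in-memory database is used here only to keep the sketch short.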
6
INTRODUCTION
• OLAP – manipulation of information to support decision making
• Databases can support some OLAP
• Data warehouses support only OLAP, not OLTP
• Data warehouses – special forms of databases that support decision making
• OLAP supports more complex manipulation and in-depth queries that require information from multiple databases
7
INTRODUCTION
8
RELATIONAL DATABASE MODEL
• Database – logical collection of information you organize and access according to the logical structure of the information
• Relational database – uses a series of two-dimensional tables or files to store information in the form of a database
9
Databases Are…
• Collections of information
• Created with logical structures
• With logical ties within the information
• With built-in integrity constraints
10
Databases – Collections of Information
• Databases have many tables
• Example: Solomon Enterprises, a concrete provider. Tables include:
  – Order
  – Customer
  – Concrete Type
  – Employee
  – Truck
11
Databases – Collections of Information
12
Databases – Created with Logical Structures
• In databases, row numbers are irrelevant
• In databases, columns have logical names (attributes) such as Order Date and Customer Name
• Data dictionary – contains the logical structure of the information in a database
• The data dictionary holds the logical properties of your information (the characteristics of each field)
13
Databases – Logical Ties within the Information
• Logical ties (relationships) must exist between the tables to show how the tables are related to each other
• Logical ties are created with primary and foreign keys
• Primary key – field (or group of fields in some cases) that uniquely describes each record
• A primary key is unique and cannot be blank
14
Databases – Logical Ties within the Information
• Foreign key – primary key of one file that appears in another file
• Foreign keys help create relationships among tables; without them, you have no way of creating logical ties among the various files
• Table = file = relation (the three terms mean the same thing)
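A small sketch of how primary and foreign keys create a logical tie, using Python's built-in sqlite3 module. The table layouts, column names, and sample values are assumptions for illustration; only the customer name Triple A Homes comes from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute(
    "CREATE TABLE customer (customer_number INTEGER PRIMARY KEY, customer_name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE orders (
        order_number INTEGER PRIMARY KEY,
        order_date TEXT NOT NULL,
        customer_number INTEGER REFERENCES customer (customer_number)
    )
""")
conn.execute("INSERT INTO customer VALUES (1234, 'Triple A Homes')")
conn.execute("INSERT INTO orders VALUES (100000, '2004-09-01', 1234)")

# The foreign key lets a join follow the logical tie between the two tables.
rows = conn.execute("""
    SELECT o.order_number, c.customer_name
    FROM orders AS o JOIN customer AS c ON c.customer_number = o.customer_number
""").fetchall()

# An order that points at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (100001, '2004-09-02', 9999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The `customer_number` column in `orders` is the foreign key: it repeats the primary key of `customer`, which is what makes the join possible.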
15
Databases – Logical Ties within the Information
16
Databases – Built-in Integrity Constraints
• Integrity constraint – rule that helps ensure the quality of information
• Examples
  – Primary keys must be unique and cannot be blank (null)
  – Foreign keys cannot be blank (in general a foreign key can be null, but you can add a constraint that requires a value to be entered)
  – Sales price cannot be negative
  – Phone numbers must have an area code
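The constraints listed above can be declared once and then enforced by the DBMS on every insert. The sketch below uses Python's built-in sqlite3 module; the `product` table and the `violates` helper are hypothetical names introduced for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,            -- must be unique
        name TEXT NOT NULL,                        -- an entry is required
        sales_price REAL CHECK (sales_price >= 0)  -- cannot be negative
    )
""")
conn.execute("INSERT INTO product VALUES (1, 'Concrete mix', 59.5)")

def violates(sql, params):
    """Return True if the DBMS rejects the statement with an integrity error."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True

insert = "INSERT INTO product VALUES (?, ?, ?)"
duplicate_key = violates(insert, (1, "Gravel", 10.0))   # primary key reused
missing_name = violates(insert, (2, None, 10.0))        # required field left blank
negative_price = violates(insert, (3, "Sand", -5.0))    # price below zero
print(duplicate_key, missing_name, negative_price)
```

Each of the three inserts is refused before it reaches the table, which is the point of building constraints into the database rather than relying on every application to check them.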
17
DBMS TOOLS
• Database management system (DBMS) – helps you specify the logical organization for a database and access and use the information within a database
• Word processing software = document
• Spreadsheet software = workbook
• DBMS software = database
18
DBMS TOOLS
• 5 software components
  – DBMS engine
  – Data definition subsystem
  – Data manipulation subsystem
  – Application generation subsystem
  – Data administration subsystem
19
DBMS TOOLS
20
DBMS Engine
• DBMS engine – accepts logical requests from the various other DBMS subsystems, converts them into their physical equivalents, and actually accesses the database and data dictionary as they exist on a storage device
• The DBMS engine separates the logical from the physical
21
DBMS Engine
• Physical view – how information is arranged, stored, and accessed on a storage device
• Logical view – how you, as a knowledge worker, need to arrange and access information to meet your business needs
• With databases, you work only with logical views
• There is one physical view, but there may be numerous knowledge workers who have different logical views of the information in a database
22
Data Definition Subsystem
• Data definition subsystem – helps you create and maintain the data dictionary and define the structure of the files in a database
• You must create the data dictionary for a database and define the structure of each table before entering any information
23
Data Definition Subsystem
• Add a new field to the data dictionary, or delete a given field from all records in a file
• Define the logical structure (properties) of information in the data dictionary
  – Field name: customer number, order date, ..
  – Field type: alphabetic, numeric, date, time
  – Form: is an area code required for a phone number?
  – Default value: if no order date is entered, the default is today’s date
  – Validation rule: can amount exceed 8?
  – Is an entry required? Must you enter a delivery address for an order, or can it be blank?
  – Can there be duplicates? Primary keys can’t be duplicated, but what about amounts?
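The field properties above map fairly directly onto a table definition, and most DBMSs let you read the resulting data dictionary back. The sketch below uses Python's built-in sqlite3 module, where `PRAGMA table_info` plays that role; the `orders` table, its column names, and the rule that amount may not exceed 8 are assumptions drawn loosely from the slide's examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_number INTEGER PRIMARY KEY,
        order_date TEXT NOT NULL DEFAULT CURRENT_DATE,
        delivery_address TEXT,
        amount REAL CHECK (amount <= 8)
    )
""")

# PRAGMA table_info returns one row per field:
# (cid, name, type, notnull, dflt_value, pk)
dictionary = {
    row[1]: {"type": row[2], "required": bool(row[3]), "default": row[4]}
    for row in conn.execute("PRAGMA table_info(orders)")
}

# Leaving out order_date lets the default value fill it in.
conn.execute("INSERT INTO orders (order_number, amount) VALUES (1, 6)")
order_date = conn.execute("SELECT order_date FROM orders").fetchone()[0]
```

Here `delivery_address` is declared without `NOT NULL`, so it may be left blank, while `order_date` is required but supplied automatically when omitted.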
24
Data Manipulation Subsystem
• Data manipulation subsystem – helps you add, change, and delete information
• Primary interface between you and a database
  – Views
  – Report generators
  – QBE tools
  – SQL
25
Views
• View – allows you to see the contents of a database file
• Similar to a spreadsheet view
• Make changes; add and remove records
• Sort
• Query
• Find
26
Views
• Sort
• Find
• Add a new record
27
Report Generators
• Report generator – helps you quickly define formats of reports and what information you want to see in a report
• Save report formats to use later
• Uses a wizard interface
28
Report Generators
• Specify the fields you want in a report
• Specify the layout of the report
29
Report Generators
30
QBE Tools
• Query-by-example (QBE) tool – helps you graphically design the answer to a question
• Example: “What driver most often delivers concrete to Triple A Homes?”
31
QBE Tools
32
SQL
• Structured query language (SQL) – standardized fourth-generation language found in most DBMSs
• Performs the same task as QBE
• Uses sentence structure instead
• Mostly used by IT people
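The QBE question "What driver most often delivers concrete to Triple A Homes?" can also be phrased in SQL. The sketch below runs one possible phrasing through Python's built-in sqlite3 module. The table layouts, column names, driver names, and rows are all hypothetical; the slides name the tables but do not show their contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_number INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE orders (
        order_number INTEGER PRIMARY KEY,
        customer_number INTEGER REFERENCES customer (customer_number),
        driver_id INTEGER REFERENCES employee (employee_id)
    );
    INSERT INTO customer VALUES (1, 'Triple A Homes'), (2, 'Other Builder');
    INSERT INTO employee VALUES (10, 'Driver A'), (11, 'Driver B');
    INSERT INTO orders VALUES (100, 1, 10), (101, 1, 11), (102, 1, 11), (103, 2, 10);
""")

# Count deliveries per driver for one customer and keep the largest count.
top_driver = conn.execute("""
    SELECT e.last_name, COUNT(*) AS deliveries
    FROM orders AS o
    JOIN customer AS c ON c.customer_number = o.customer_number
    JOIN employee AS e ON e.employee_id = o.driver_id
    WHERE c.customer_name = ?
    GROUP BY e.employee_id, e.last_name
    ORDER BY deliveries DESC
    LIMIT 1
""", ("Triple A Homes",)).fetchone()
print(top_driver)
```

A QBE tool would build an equivalent request from a grid of selected tables, fields, and criteria instead of a written statement.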
33
Application Generation Subsystem
• Application generation subsystem – contains facilities to help you develop transaction-intensive applications
  – Data entry screens (called forms in Access)
  – Programming languages
  – Interfaces
• Mostly used by IT people
34
Data Administration Subsystem
• Data administration subsystem – helps you manage the overall database environment
  – Backup and recovery
  – Security management (CRUD)
  – Query optimization
  – Concurrency control
  – Change management
35
Data Administration Subsystem
• Backup and recovery
  – Periodically back up information
  – Recover a database after a failure
  – Backup: a copy of information stored on a computer
  – Recovery: the process of reinstalling the backup information in the event the information was lost
• Security management
  – Who has access to what information
  – What type of access those people have (who can perform CRUD tasks on information)
  – CRUD: Create, Read, Update, and Delete
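Backup and recovery can be illustrated with the `Connection.backup` method of Python's built-in sqlite3 module (Python 3.7 or later). This is only a sketch of the idea: the `truck` table and its row are made up, and in-memory databases stand in for the files a real backup would be written to.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE truck (truck_number INTEGER PRIMARY KEY, truck_type TEXT)")
live.execute("INSERT INTO truck VALUES (111, 'Ford')")
live.commit()

# Backup: copy the whole database into a second database.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulate losing information in the live database.
live.execute("DELETE FROM truck")
live.commit()
lost = live.execute("SELECT COUNT(*) FROM truck").fetchone()[0]

# Recovery: reinstall the backup copy into a fresh database.
restored = sqlite3.connect(":memory:")
backup.backup(restored)
recovered = restored.execute("SELECT COUNT(*) FROM truck").fetchone()[0]
```

Anything written after the backup was taken is not in the copy, which is why the slides stress backing up periodically.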
36
Data Administration Subsystem
• Query optimization
  – Takes queries from users and restructures them to minimize response time (finds the shortest route to the information)
  – Reorganization facilities: examine how the DBMS engine physically accesses information and reorganize how information is physically stored
• Concurrency control
  – Ensures the validity of database updates when multiple users attempt to access and change the same information
  – What happens if two people simultaneously try to change the same information?
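One way to answer the "two people change the same information" question is optimistic concurrency control: each row carries a version number, and an update is applied only if the version is unchanged since the row was read. The slides do not prescribe a technique, and many DBMSs rely on locking instead, so the sketch below (built on Python's sqlite3 module, with a hypothetical `product` table and `read`/`write` helpers) shows just one option.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, quantity INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO product VALUES (1, 10, 1)")

def read(product_id):
    """Return (quantity, version) as currently stored."""
    return conn.execute(
        "SELECT quantity, version FROM product WHERE product_id = ?", (product_id,)
    ).fetchone()

def write(product_id, new_quantity, version_read):
    """Apply the update only if nobody changed the row since it was read."""
    cur = conn.execute(
        "UPDATE product SET quantity = ?, version = version + 1 "
        "WHERE product_id = ? AND version = ?",
        (new_quantity, product_id, version_read),
    )
    return cur.rowcount == 1

# Two users read the same row, then both try to change it.
qty_a, ver_a = read(1)
qty_b, ver_b = read(1)
first_ok = write(1, qty_a - 3, ver_a)    # applied
second_ok = write(1, qty_b - 5, ver_b)   # stale version, so not applied
final_quantity = read(1)[0]
```

The second user's change is refused rather than silently overwriting the first, so that user can reread the row and try again.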
37
Data Administration Subsystem
• Change management
  – What is the effect of structural changes to a database?
  – What if you add a new column?
  – What happens if you delete a column?
  – What happens if you change a column’s attributes?
38
DATA WAREHOUSES & DATA MINING
• Data warehouses support OLAP and decision making
• Data warehouses do not support OLTP
• Data-mining tools are tools for working with data warehouse information
• DBMS software = database
• Data-mining tools = data warehouse
39
What Is a Data Warehouse?
• Data warehouse – logical collection of information, gathered from operational databases, used to create business intelligence that supports business analysis activities and decision-making tasks
40
What Is a Data Warehouse?
41
What Is a Data Warehouse?
• Multidimensional
  – Rows and columns
  – Also layers
• Many times called hypercubes
• What are the dimensions in Figure 3.8 on page 140?
• A data warehouse contains summarized information; it does not contain details about each interaction
42
What Are Data-Mining Tools?
• Data-mining tools – software tools that you use to query information in a data warehouse
  – Query-and-reporting tools
  – Intelligent agents
  – Multidimensional analysis tools
  – Statistical tools
43
What Are Data-Mining Tools?
44
Query-and-Reporting Tools
• Query-and-reporting tools – similar to QBE tools, SQL, and report generators in the typical database environment
• Also similar to pivot tables in Excel
45
Intelligent Agents
• Use various artificial intelligence (AI) tools such as neural networks and fuzzy logic to form the basis for “information discovery” and building BI
• Help you find hidden patterns in information
• Chapter 4 focuses on these
46
Multidimensional Analysis Tools
• Multidimensional analysis (MDA) tools – slice-and-dice techniques that allow you to view multidimensional information from different perspectives
  – Bring new layers to the front
  – Reorganize rows and columns
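Slice-and-dice can be sketched with a small in-memory cube in plain Python. The three dimensions (product, region, quarter), the figures, and the helper names `slice_cube` and `roll_up` are assumptions for illustration; real MDA tools work on far larger cubes with the same basic operations.

```python
from collections import defaultdict

# Each cell is keyed by three dimensions: (product, region, quarter).
sales = {
    ("Concrete", "North", "Q1"): 120,
    ("Concrete", "South", "Q1"): 80,
    ("Concrete", "North", "Q2"): 150,
    ("Gravel", "North", "Q1"): 40,
    ("Gravel", "South", "Q2"): 60,
}
DIMENSIONS = ("product", "region", "quarter")

def slice_cube(cube, **fixed):
    """Keep only the cells whose dimensions match the fixed values."""
    return {
        key: value
        for key, value in cube.items()
        if all(key[DIMENSIONS.index(dim)] == want for dim, want in fixed.items())
    }

def roll_up(cube, dim):
    """Total the measure along one dimension, collapsing the others."""
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[key[DIMENSIONS.index(dim)]] += value
    return dict(totals)

q1_layer = slice_cube(sales, quarter="Q1")   # bring the Q1 layer to the front
by_region = roll_up(q1_layer, "region")      # view that layer by region
by_product = roll_up(sales, "product")       # view the whole cube by product
```

Fixing one dimension corresponds to bringing a layer to the front; choosing which dimension to total over corresponds to reorganizing rows and columns.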
47
Statistical Tools
• Help you apply various mathematical models to the information stored in a data warehouse to discover new information
  – Regression analysis
  – Analysis of variance
  – And so on
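As a small example of regression analysis, the sketch below fits a straight line by ordinary least squares in plain Python and uses it to forecast one new value. The variable names and the numbers are invented for the example; a statistical tool would apply the same idea to information drawn from the data warehouse.

```python
def linear_regression(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical history: advertising spend versus revenue, in the same units.
ad_spend = [1.0, 2.0, 3.0, 4.0]
revenue = [12.0, 14.0, 16.0, 18.0]

slope, intercept = linear_regression(ad_spend, revenue)
forecast = intercept + slope * 5.0   # predicted revenue at a spend of 5.0
```

The fitted slope estimates how much revenue changes for each extra unit of spend, which is the kind of "new information" the slide refers to.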
48
Data Marts
• Data warehouses are organizationwide
• Data marts have subsets of an organizationwide data warehouse
• Data mart – subset of a data warehouse in which only a focused portion of the data warehouse information is kept; it is smaller, more manageable, and serves particular needs
• Data marts use the same data-mining tools as data warehouses (query-and-reporting tools, intelligent agents, multidimensional analysis tools, and statistical tools)
49
Data Marts
50
Data Mining as a Career Opportunity
• Knowledge of data mining can be a substantial career opportunity for you
  – Business Objects
  – SAS
  – Cognos
  – Informatica
  – Many others
51
Considerations in Using a Data Warehouse
• Do you need a data warehouse?
  – A DBMS may offer all you need; data warehouses are expensive and require extensive, and most often expensive, support
• Do all employees need the entire data warehouse?
  – Consider a data mart
• How up-to-date must information be?
  – “Snapshot” concept, comparing database information with data warehouse information
• What data-mining tools do you need?
  – Training can be expensive
52
BI Revisited
• BI improves the timeliness and quality of the input for decision making by helping knowledge workers understand:
  – Capabilities available in the organization
  – State of the art, trends, and future directions of the market
  – Technological, demographic, economic, social, and regulatory environment in which the organization competes
  – Actions of competitors and the implications of these actions
• BI: internal information and external information
• Competitive intelligence – business intelligence focused on the external competitive environment
53
BUSINESS INTELLIGENCE
• BI: internal information and external information
• Competitive intelligence – business intelligence focused on the external competitive environment
54
Strategic and Competitive Opportunities with BI
• Corporate performance management
• Optimizing customer relations
• Traditional decision support
• Management reporting of BI
• Information at the right time, in the right location, and in the right form (personal information dimensions)
55
IT Support for Business Intelligence
• The Web supports many BI systems
• Movement toward specialized BI packages
• Digital dashboard – displays key information gathered from several sources on a computer screen, in a format tailored to the needs and wants of an individual
56
IT Support for Business Intelligence
57
INFORMATION OWNERSHIP
• Strategic management support
• The sharing of information with responsibility
• Information cleanliness
58
Strategic Management Support
• Chief privacy officer (CPO) – responsible for ensuring that information is used in an ethical way
• Chief security officer (CSO) – responsible for ensuring the security of information (e.g., firewalls)
• Chief information officer (CIO) – oversees every aspect of an organization’s information resource
59
Strategic Management Support
• Data administration – plans for, oversees the development of, and monitors the information resource; assures that all information requirements can be and are being met
• Database administration – responsible for the more technical and operational aspects of managing information: defining and organizing database structures and contents, developing security procedures, and monitoring database development
• Both often report to the CIO
60
The Sharing of Information with Responsibility
• Someone has to accept full responsibility for providing a specific piece of information and ensuring the quality of that information
• If you create it, you “own” it
• You will also share it with others
• Because you “own” it, you are responsible for its quality
61
Information Cleanliness
• Database and data warehouse information must be “clean”
  – No errors
  – No duplicates
• A DBMS may provide utilities to help clean your information
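A cleaning utility typically starts by finding the suspect rows. The sketch below, using Python's built-in sqlite3 module, looks for likely duplicates and for phone numbers without an area code. The `customer` table, the sample rows, and the matching rules (case-insensitive trimmed names, a ###-###-#### phone shape) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_number INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
)
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
    (1, "Triple A Homes", "303-555-0100"),
    (2, "triple a homes ", "303-555-0100"),   # same customer, entered twice
    (3, "Solomon Supply", "555-0199"),        # error: area code missing
])

# Duplicates: group on a normalized name plus phone, keep groups larger than one.
duplicates = conn.execute("""
    SELECT LOWER(TRIM(name)), COUNT(*)
    FROM customer
    GROUP BY LOWER(TRIM(name)), phone
    HAVING COUNT(*) > 1
""").fetchall()

# Errors: phone numbers that do not match the ###-###-#### shape.
phone_shape = "[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]"
bad_phones = [
    row[0]
    for row in conn.execute(
        "SELECT customer_number FROM customer WHERE phone NOT GLOB ?", (phone_shape,)
    )
]
```

Deciding which duplicate to keep, or what the missing area code should be, still falls to the person who "owns" the information.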
62
Designing and Building a Relational Database
• Four steps for designing a database
  – Define entity classes and primary keys
  – Define relationships among entity classes
  – Define information fields for each relation, and define the cardinality of each relationship
  – Use a data definition language to create your database
• Before you design a database, you need to capture the business rules, which help you define the correct structure of your database
• Business rules – statements concerning the information you need to work with and the relationships within the information
63
Extended learning module
• E-R diagram – entity-relationship diagram
• Entity class – concept (people, places, or things) about which you wish to store information and that you can identify with a unique key
• Instance – an occurrence of an entity class that can be uniquely described with a primary key
• Intersection relation (composite relation) – a relation you create to eliminate a many-to-many relationship; so called because it represents an intersection of the primary keys of the first two relations
• Composite primary key – consists of the primary key fields from the two intersecting relations
• Cardinality – numerical nature of a relationship (1:1, 1:M, M:1, M:M)
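An intersection relation with a composite primary key might look like the following sketch, written with Python's built-in sqlite3 module. The `orders`, `product`, and `order_line` tables and their rows are hypothetical. An order can contain many products and a product can appear on many orders, so `order_line` sits between them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_number INTEGER PRIMARY KEY);
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
    -- Intersection relation: one row per (order, product) pair.
    CREATE TABLE order_line (
        order_number INTEGER REFERENCES orders (order_number),
        product_id INTEGER REFERENCES product (product_id),
        quantity INTEGER NOT NULL,
        PRIMARY KEY (order_number, product_id)   -- composite primary key
    );
    INSERT INTO orders VALUES (1), (2);
    INSERT INTO product VALUES (10, 'Concrete'), (11, 'Gravel');
    INSERT INTO order_line VALUES (1, 10, 5), (1, 11, 2), (2, 10, 7);
""")

# The same pair cannot appear twice, because the pair is the primary key.
try:
    conn.execute("INSERT INTO order_line VALUES (1, 10, 1)")
    pair_repeated = True
except sqlite3.IntegrityError:
    pair_repeated = False

orders_per_product = dict(
    conn.execute("SELECT product_id, COUNT(*) FROM order_line GROUP BY product_id")
)
```

The M:M relationship between orders and products is replaced by two 1:M relationships, one from each table to `order_line`.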
64
Normalization
• Normalization – a process of assuring that a relational database structure can be implemented as a series of two-dimensional relations
  – Eliminate repeating groups or M:M relationships
  – Assure that each field in a relation depends only on the primary key for that relation
  – Remove all derived fields from the relations
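The three normalization rules can be walked through on a toy record set in plain Python. The order data, prices, and the `order_total` helper below are invented for the example; the point is only to show a repeating group being split out, price moved to the relation whose key it depends on, and a stored total replaced by a calculation.

```python
# Unnormalized: a repeating group of items inside each order, plus a stored total.
flat_orders = [
    {"order": 1, "customer": "Triple A Homes",
     "items": [("Concrete", 5, 20.0), ("Gravel", 2, 10.0)], "total": 120.0},
    {"order": 2, "customer": "Triple A Homes",
     "items": [("Concrete", 7, 20.0)], "total": 140.0},
]

# Normalized: one relation per entity class, one row per fact, no derived field.
orders = [(o["order"], o["customer"]) for o in flat_orders]
order_lines = [(o["order"], name, qty) for o in flat_orders for name, qty, _ in o["items"]]
# Price depends only on the product, so it lives in the product relation.
products = sorted({(name, price) for o in flat_orders for name, _, price in o["items"]})

def order_total(order_number):
    """Derive the total on demand instead of storing it."""
    prices = dict(products)
    return sum(qty * prices[name] for num, name, qty in order_lines if num == order_number)
```

Because the total is recomputed from `order_lines` and `products`, it cannot drift out of step with them the way a stored copy could.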