Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
12.4 Online Analytical Processing
OLAP creates an advanced data analysis environment that supports decision making, business modeling, and operations research
OLAP systems share four main characteristics:
– Use multidimensional data analysis techniques
– Provide advanced database support
– Provide easy-to-use end-user interfaces
– Support client/server architecture
Multidimensional Data Analysis Techniques
A multidimensional view allows end users to consolidate or aggregate data at different levels
A multidimensional view of data allows a business data analyst to easily switch business perspectives
Multidimensional data analysis techniques are augmented by:
– Advanced data presentation functions: 3D graphics, pivot tables, crosstabs, etc.
– Advanced data aggregation, consolidation, and classification functions: multiple aggregation levels, etc.
– Advanced computational functions: business-oriented variables, statistical and forecasting functions, etc.
– Advanced data modeling functions: support for “what-if” scenarios, variable assessment, linear programming, etc.
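Aggregating at different levels and switching perspectives can be sketched in a few lines of Python. This is a minimal illustration, not part of the textbook: the `SALES` rows, the dimension names, and the `aggregate` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, quarter, amount).
SALES = [
    ("East", "Widget", "Q1", 100.0),
    ("East", "Gadget", "Q1", 50.0),
    ("West", "Widget", "Q1", 70.0),
    ("West", "Widget", "Q2", 30.0),
    ("East", "Gadget", "Q2", 25.0),
]

# Position of each dimension inside a row (an assumption of this sketch).
DIMENSIONS = {"region": 0, "product": 1, "quarter": 2}


def aggregate(rows, *dims):
    """Sum the amount for each combination of the requested dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key] += row[3]
    return dict(totals)


# The same facts seen from three business perspectives.
by_region = aggregate(SALES, "region")
by_product_quarter = aggregate(SALES, "product", "quarter")
grand_total = aggregate(SALES)  # no dimensions: one fully consolidated cell
```

Passing fewer dimensions consolidates the data further; passing different ones changes the perspective without touching the underlying facts.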
Advanced Database Support
Access to many different kinds of DBMS, flat files, and internal and external data sources
Access to aggregated data warehouse data as well as to the detailed operational data
Advanced data navigation features such as drill-down and roll-up
Rapid and consistent query response times
The ability to map end-user requests, through metadata, to the appropriate data source and then to the appropriate data access language (usually SQL)
Support for very large databases
Each data analyst must have a powerful computer
Relational OLAP
ROLAP builds on existing relational technologies and represents a natural extension for companies that already use a relational DBMS
ROLAP adds the following extensions:
– Multidimensional data schema support: star schema (discussed in 12.5)
– Data access language and query performance optimized for multidimensional data: differentiates between access to data warehouse data and access to operational data; advanced indexing, such as bitmapped indexes
– Support for very large databases: to import, integrate, and populate the data warehouse with operational data
Of Figure 12.2: used when the number of possible values is small
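The remark about a small number of possible values fits the bitmapped indexes mentioned for ROLAP: one bitmap is kept per distinct value, so few distinct values means few bitmaps. The sketch below is a toy illustration that uses Python integers as bitmaps. The column data and function names are hypothetical, and real DBMS implementations differ.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds that value."""
    index = {}
    for row_number, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row_number)
    return index


def matching_rows(bitmap):
    """Return the row numbers whose bits are set in the bitmap."""
    rows = []
    row_number = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_number)
        bitmap >>= 1
        row_number += 1
    return rows


# Hypothetical table columns, one entry per row.
REGION = ["East", "West", "East", "North", "West", "East"]
INSURED = ["Y", "N", "N", "Y", "Y", "Y"]

region_index = build_bitmap_index(REGION)    # 3 distinct values -> 3 bitmaps
insured_index = build_bitmap_index(INSURED)  # 2 distinct values -> 2 bitmaps

# Rows where REGION = 'East' AND INSURED = 'Y': a single bitwise AND.
east_and_insured = matching_rows(region_index["East"] & insured_index["Y"])
```

A column with many distinct values would need one bitmap per value, which is why this kind of index is usually reserved for low-cardinality attributes.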
Multidimensional OLAP
An MDBMS stores data in matrix-like n-dimensional arrays
MDBMS end users visualize the stored data as a 3D cube known as a data cube
Data cubes are static: you can query only precreated cubes with defined axes
To speed data access, data cubes are normally held in memory, in what is called the cube cache
The recreation of data cubes is time-consuming
Scalability is limited to avoid lengthy data access times
An MDBMS must handle sparsity effectively to reduce processing overhead and resource requirements
MOLAP is a good solution for shops where small- to medium-sized databases are the norm and application software speed is critical
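One common way to cope with sparsity is to store only the non-empty cells of the cube. The class below is a simplified sketch of that idea, not how any particular MDBMS works. The `SparseCube` name, its methods, and the sample data are hypothetical.

```python
class SparseCube:
    """A data cube with fixed axes that stores only the non-empty cells."""

    def __init__(self, axes):
        self.axes = tuple(axes)  # axes are defined once, when the cube is created
        self.cells = {}          # coordinate tuple -> measure

    def add(self, coords, value):
        if len(coords) != len(self.axes):
            raise ValueError("one coordinate per axis is required")
        self.cells[coords] = self.cells.get(coords, 0.0) + value

    def get(self, coords):
        return self.cells.get(coords, 0.0)  # empty cells read as zero

    def slice_by(self, axis, member):
        """Return the stored cells where the given axis equals member."""
        position = self.axes.index(axis)
        return {c: v for c, v in self.cells.items() if c[position] == member}

    def density(self, members_per_axis):
        """Fraction of all possible cells that actually hold data."""
        possible = 1
        for count in members_per_axis:
            possible *= count
        return len(self.cells) / possible


cube = SparseCube(["product", "location", "time"])
cube.add(("Widget", "Boston", "Q1"), 100.0)
cube.add(("Widget", "Denver", "Q2"), 40.0)
cube.add(("Gadget", "Boston", "Q3"), 60.0)
cube.add(("Widget", "Boston", "Q1"), 20.0)  # accumulates into an existing cell

boston = cube.slice_by("location", "Boston")
fill = cube.density([2, 2, 3])  # 2 products x 2 locations x 3 quarters
```

A dense 2 x 2 x 3 array would allocate all 12 cells, while this structure holds only the 3 that contain data.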
ROLAP and MOLAP vendors are integrating their solutions within a unified decision support framework
12.5 Star Schemas
The star schema is a data modeling technique used to map multidimensional decision support data into a relational database
Creates the near equivalent of a multidimensional database schema from the existing relational database
Yields an easily implemented model for multidimensional data analysis, while still preserving the relational structures on which the operational database is built
Has four components: facts, dimensions, attributes, and attribute hierarchies
Four Components of Star Schema
Facts
– Numeric measurements that represent a specific business aspect or activity
– The fact table contains facts that are linked through their dimensions
Dimensions
– Qualifying characteristics that provide additional perspectives to a given fact
– The magnifying glass through which we study the facts
– Stored in dimension tables
Attributes: used to search, filter, or classify facts
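Facts, dimensions, and attributes can be made concrete with a very small star schema. The sketch below uses Python's built-in `sqlite3` module. The table names, column names, and data are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: qualifying characteristics and their attributes.
CREATE TABLE product  (prod_id INTEGER PRIMARY KEY, prod_name TEXT, prod_category TEXT);
CREATE TABLE location (loc_id  INTEGER PRIMARY KEY, city TEXT, region TEXT);

-- Fact table: numeric measurements, keyed by the dimensions they belong to.
CREATE TABLE sales (
    prod_id INTEGER REFERENCES product(prod_id),
    loc_id  INTEGER REFERENCES location(loc_id),
    units   INTEGER,
    amount  REAL,
    PRIMARY KEY (prod_id, loc_id)
);

INSERT INTO product  VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware'), (3, 'Manual', 'Books');
INSERT INTO location VALUES (10, 'Boston', 'East'), (20, 'Denver', 'West');
INSERT INTO sales    VALUES (1, 10, 5, 100.0), (2, 10, 2, 50.0), (1, 20, 3, 60.0), (3, 20, 4, 40.0);
""")

# Dimension attributes (region, prod_category) classify the facts (amount).
rows = conn.execute("""
    SELECT l.region, p.prod_category, SUM(s.amount)
    FROM sales s
    JOIN product  p ON p.prod_id = s.prod_id
    JOIN location l ON l.loc_id  = s.loc_id
    GROUP BY l.region, p.prod_category
    ORDER BY l.region, p.prod_category
""").fetchall()
```

The query never filters on the fact table directly: every restriction or grouping goes through a dimension attribute, which is the role the slide assigns to attributes.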
The ability to focus on slices of the cube to perform a more detailed analysis
Figure callout: important dimension
The attribute hierarchy provides a top-down data organization for aggregation and drill-down/roll-up data analysis
It is not necessary for all attributes to be part of an attribute hierarchy
For example: product group vs. product brand
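Drill-down and roll-up along an attribute hierarchy can be sketched as totals computed at successive levels. The region, state, city hierarchy, the `FACTS` rows, and the `totals_at` helper below are hypothetical.

```python
from collections import defaultdict

# Hypothetical location hierarchy, listed top-down.
HIERARCHY = ("region", "state", "city")

FACTS = [
    {"region": "East", "state": "MA", "city": "Boston", "amount": 100.0},
    {"region": "East", "state": "MA", "city": "Salem", "amount": 20.0},
    {"region": "East", "state": "NY", "city": "Albany", "amount": 30.0},
    {"region": "West", "state": "CO", "city": "Denver", "amount": 60.0},
]


def totals_at(level, facts, **fixed):
    """Total the amount at one hierarchy level, restricted to the fixed ancestors."""
    depth = HIERARCHY.index(level) + 1
    totals = defaultdict(float)
    for fact in facts:
        if all(fact[name] == value for name, value in fixed.items()):
            key = tuple(fact[name] for name in HIERARCHY[:depth])
            totals[key] += fact["amount"]
    return dict(totals)


region_totals = totals_at("region", FACTS)                        # rolled up
east_state_totals = totals_at("state", FACTS, region="East")      # drill down into East
ma_city_totals = totals_at("city", FACTS, region="East", state="MA")
```

Each step down the hierarchy fixes one more ancestor and reports the next level of detail; reading the calls in reverse order is a roll-up.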
1. Composite primary key
2. The largest table in the star schema
Star Schema Representation
A DSS-optimized data warehouse DBMS first searches the smaller dimension tables before accessing the larger fact tables
Data warehouses usually have many fact tables
– If the orders department uses the same time periods as the sales department, time can be represented by the same time table. Otherwise, different time tables are needed
Star Schema Performance-Improving Techniques
Price: multi-table joins
Star Schema Performance-Improving Techniques
De-normalizing the fact tables
– Improves data access performance and saves data storage space
– Uses a single record to store data that normally takes many records
– Design criteria, such as frequency of use and performance requirements, are evaluated against the possible overload placed on the DBMS to manage these de-normalized relations
Table partitioning and replication
– It is common to have one fact table for each level of aggregation in the time dimension. These fact tables must have an implicit or explicit periodicity defined
12.6 Implementing a Data Warehouse
Numerous constraints:
– Available funding
– Management’s view of the role played by an IS department and of the extent and depth of the information requirements
– Corporate culture
No single formula can describe perfect data warehouse development
Factors Common to Data Warehousing
The data warehouse as an active decision support framework
– A data warehouse is not a static database. It is a dynamic framework for decision support that is always a work in progress
A company-wide effort that requires user involvement
– Data warehouse data cross departmental lines and geographical boundaries
– It requires managerial skills to deal with conflict resolution, mediation, and arbitration
– Designers must involve users in the process, secure end users’ commitment from the beginning, create continuous end-user feedback, manage end-user expectations, and establish procedures for conflict resolution
Factors Common to Data Warehousing
Satisfy the trilogy: data, analysis, and users
– Must satisfy: data integration and loading criteria; data analysis capabilities with acceptable query performance; end-user data analysis needs
Apply database design procedures
– Database design procedures must be adapted to fit the data warehouse requirements
12.7 Data Mining
With typical data analysis tools, if the end user fails to detect a problem, no action is taken
In contrast, data mining is proactive: the tools automatically search the data for anomalies and possible relationships, thereby identifying problems that have not yet been identified by the end user
Data mining tools
– analyze data,
– uncover problems or opportunities hidden in data relationships,
– form computer models based on their findings, and then
– use the models to predict business behavior
Data mining tools require minimal end-user intervention
Data mining tools initiate analyses to create knowledge
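The idea of automatically searching data for anomalies can be illustrated with a deliberately simple stand-in technique: flagging values that lie far from the mean. Real data mining tools use far richer models. The `find_anomalies` function, its threshold, and the sample figures below are hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs more than `threshold` sample standard deviations from the mean."""
    if len(values) < 2:
        return []
    center = mean(values)
    spread = stdev(values)
    if spread == 0:
        return []
    return [(i, v) for i, v in enumerate(values) if abs(v - center) / spread > threshold]


# Hypothetical daily sales totals; the entry at index 6 is unusually low.
DAILY_SALES = [102.0, 98.0, 101.0, 99.0, 100.0, 103.0, 20.0, 97.0, 100.0, 102.0]
anomalies = find_anomalies(DAILY_SALES)
```

No end user had to ask about day 6: the scan runs over all the data and reports what looks unusual, which is the proactive behavior the slide describes.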
Research focus: inductive or intelligent databases that could learn and extract knowledge from the stored data
Summary
Data analysis is used to derive and interpret information from data
Decision support is a methodology designed to extract information from data and to use such information as a basis for decision making
A decision support system is an arrangement of computerized tools used to assist managerial decision making within a business
A data warehouse is an integrated, subject-oriented, time-variant, nonvolatile database that provides support for decision making
Summary (continued)
Online analytical processing is an advanced data analysis environment that supports decision making, business modeling, and operations research
Star schema is a data-modeling technique used to map multidimensional decision support data into a relational database
The implementation of any company-wide information system is subject to conflicting organizational and behavioral factors
Summary (continued)
Data mining automates the analysis of operational data with the intention of finding previously unknown data characteristics, relationships, dependencies, and/or trends
A data warehouse is a storage location for decision support data