Published by William Rose. Modified over 8 years ago.
Data Warehouses, Online Analytical Processing, and Metadata
11th Meeting
Course Name: Business Intelligence
Year: 2009
Bina Nusantara University
Source of this Material
(2). Loshin, David (2003). Business Intelligence: The Savvy Manager’s Guide. Chapter 6
The Business Case

There is a significant difference between the traditional use of databases for business purposes and the use of databases for analytical purposes. The traditional use revolves around transaction processing as the means by which a business’s operation is modeled. The representation of information in this framework, however, is not suitable for analytical purposes. The BI community has developed a different kind of data model, called a dimensional model, that more efficiently represents data intended to drive analytic applications and decision support. By creating a centralized data repository using this kind of data model and aggregating data sets from all areas of the corporate enterprise in this repository, a data warehouse can be created that can supply data to the individual analytic applications.
Data Models

A data model is a discrete structured data representation of a real-world set of entities related to one another. There is a significant difference between how we use data in an operational/tactical manner (i.e., to “run the business”) and the ways we use data in a strategic manner (i.e., to “improve the business”). The traditional modeling technique for operational systems revolves around the entity-relationship model.

Entity-Relationship Models

In a relational database, information is modeled by representing entities within separate tables and relating those entities, within a business process context, through some form of cross-table linkage. One essential goal of the entity-relationship model is to ease the development of transaction processing by providing a reasonable scheme for mapping a business process to a grouped sequence of table operations to be executed as a single unit of work.
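As an illustration of a business process mapped to "a grouped sequence of table operations executed as a single unit of work", the following is a minimal sketch using Python's built-in sqlite3 module. The schema (customer, product, orders), the sample rows, and the function names are hypothetical and are not taken from the source material; they only assume that placing an order must both record the order and reduce stock, or do neither.

```python
import sqlite3


def make_order_db():
    """Create a tiny in-memory operational database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            stock      INTEGER NOT NULL CHECK (stock >= 0)
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            product_id  INTEGER NOT NULL REFERENCES product(product_id),
            quantity    INTEGER NOT NULL
        );
        INSERT INTO customer VALUES (1, 'Acme');
        INSERT INTO product VALUES (10, 'Widget', 5);
        """
    )
    return conn


def place_order(conn, customer_id, product_id, quantity):
    """Record an order and reduce stock as one unit of work.

    Returns True if both table operations were committed, False if the
    stock constraint was violated and both were rolled back.
    """
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception is raised inside it.
        with conn:
            conn.execute(
                "INSERT INTO orders (customer_id, product_id, quantity) "
                "VALUES (?, ?, ?)",
                (customer_id, product_id, quantity),
            )
            conn.execute(
                "UPDATE product SET stock = stock - ? WHERE product_id = ?",
                (quantity, product_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

In this sketch, an order that would drive stock below zero violates the CHECK constraint, so the order row inserted a moment earlier is rolled back together with the failed update.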
Data Models (cont…)

Another essential goal of the relational model is the identification and elimination of redundancy within a database. This process, called normalization, analyzes tables to find instances of replicated data within one table that can be extracted into a separate table and linked relationally through a foreign key (see Figures 11-1 and 11-2).

Dimensional Models

Dimensional modeling captures the basic unit of representation as a single multikeyed entry in a slender fact table, with each key exploiting the relational model to refer to the different dimensions associated with those facts. A maintained table of facts, each of which is related to a set of dimensions, is a much more efficient representation for data in a data warehouse.

Fact Tables and Star Schemas

The representation of a dimensional model is straightforward. A fact table contains records that refer to observable objects, usually within a business context (Figure 11-3). The fact table is related to dimensions in a star schema. Each entry in a dimension represents a description of the individual entities within that dimension.
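The fact table and star schema described above can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: the table names (fact_sales, dim_date, dim_product, dim_store), their columns, and the sample rows are assumptions and do not come from the source figures.

```python
import sqlite3


def build_star_schema():
    """Create a small star schema: one fact table keyed to three dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT, category TEXT
        );
        CREATE TABLE dim_store (
            store_key INTEGER PRIMARY KEY, city TEXT, region TEXT
        );
        -- Each fact is a single multikeyed entry: one key per dimension,
        -- plus the numeric measures being recorded.
        CREATE TABLE fact_sales (
            date_key    INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            store_key   INTEGER REFERENCES dim_store(store_key),
            units_sold  INTEGER,
            revenue     REAL,
            PRIMARY KEY (date_key, product_key, store_key)
        );
        INSERT INTO dim_date VALUES (1, 2009, 'Q1'), (2, 2009, 'Q2');
        INSERT INTO dim_product VALUES
            (1, 'Laptop', 'Hardware'), (2, 'Antivirus', 'Software');
        INSERT INTO dim_store VALUES
            (1, 'Jakarta', 'West'), (2, 'Surabaya', 'East');
        INSERT INTO fact_sales VALUES
            (1, 1, 1, 2, 2000.0),
            (1, 2, 1, 5,  250.0),
            (2, 1, 2, 1, 1000.0),
            (2, 2, 2, 4,  200.0);
        """
    )
    return conn


def revenue_by_category(conn):
    """Summarize the facts along the product dimension."""
    rows = conn.execute(
        """
        SELECT p.category, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        """
    )
    return dict(rows.fetchall())


def revenue_by_region(conn):
    """Summarize the same facts along the store dimension."""
    rows = conn.execute(
        """
        SELECT s.region, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_key = f.store_key
        GROUP BY s.region
        """
    )
    return dict(rows.fetchall())
```

The two query functions differ only in which dimension table they join to, which reflects the point that the fact table itself favors no particular dimension.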
Data Models (cont…)
[Figure 11-1]
[Figure 11-2]
Data Models (cont…)
[Figure 11-3]

Benefits of the Dimensional Model for BI

Using a dimensional model for managing data in a data warehouse has a number of benefits.
The framework is simple and predictable.
No matter what the dimensional breakdown, there is no inherent bias lent to any individual dimension.
The dimensional model is easily extensible.
The Data Warehouse

A data warehouse is the primary source of information that feeds the analytical processing within an organization.
A data warehouse is a centralized repository of information.
A data warehouse is arranged around the relevant subject areas important to the corporation as a whole.
A data warehouse is a queryable source of data for the enterprise.
A data warehouse is used for analysis and not for transaction processing.
The data in a data warehouse is nonvolatile.
A data warehouse is the target location for integrating data from multiple sources, both internal and external to an enterprise.
The Data Mart

A data mart is a subject-oriented data repository, similar in structure to the enterprise data warehouse, but it holds only the data necessary for the decision support and BI needs of a specific department or group within the organization. A data mart could be constructed solely for the analytical purposes of the specific group or could be derived from an existing data warehouse. Data marts are also built using the star join structure.
Online Analytical Processing

Online analytical processing (OLAP) tools provide a means for presenting data sourced from a data warehouse or data mart in a way that allows the data consumer to view comparative metrics across multiple dimensions. The dimensions of data to be analyzed in an OLAP environment are arranged in a cube structure (actually, a hypercube), where summaries of any dimension can be seen in the context of other dimensions. Because of the cube structure, the view of the data can be rotated to provide different perspectives using alternate base dimensions. The value of an OLAP tool is derived from the ability to quickly analyze the data from multiple points of view, and so OLAP tools are designed to precalculate the aggregations and store them directly in the OLAP databases.
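The idea of precalculating aggregations across dimensions can be sketched in plain Python. This is a simplified illustration, not how any particular OLAP product works: the dimension names, the sample facts, and the functions build_cube and lookup are all hypothetical. The sketch sums one measure for every combination of dimensions up front, so that a later query from any point of view is a dictionary lookup.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical dimensions and facts, used only for illustration.
DIMENSIONS = ("year", "region", "product")

SAMPLE_FACTS = [
    {"year": 2008, "region": "East", "product": "Laptop", "revenue": 100.0},
    {"year": 2008, "region": "West", "product": "Laptop", "revenue": 150.0},
    {"year": 2009, "region": "East", "product": "Phone", "revenue": 200.0},
    {"year": 2009, "region": "West", "product": "Laptop", "revenue": 50.0},
]


def build_cube(facts, dimensions=DIMENSIONS, measure="revenue"):
    """Precompute the summed measure for every subset of the dimensions.

    The result maps each dimension subset (a tuple of names) to a dict
    from dimension-value tuples to the aggregated total.
    """
    cube = {}
    for size in range(len(dimensions) + 1):
        for group in combinations(dimensions, size):
            totals = defaultdict(float)
            for row in facts:
                key = tuple(row[d] for d in group)
                totals[key] += row[measure]
            cube[group] = dict(totals)
    return cube


def lookup(cube, dimensions=DIMENSIONS, **fixed):
    """Read a precomputed total for the given dimension values.

    Dimensions not mentioned in `fixed` are summarized over.
    """
    group = tuple(d for d in dimensions if d in fixed)
    key = tuple(fixed[d] for d in group)
    return cube[group].get(key, 0.0)
```

For example, after cube = build_cube(SAMPLE_FACTS), the calls lookup(cube, year=2009) and lookup(cube, region="West", product="Laptop") read different views of the same facts without rescanning them. The trade-off in this sketch is storage: with n dimensions there are 2^n precomputed groupings.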
Metadata

The standard definition of metadata is “data about the data”. Essentially, metadata is a shareable master key to all the information that is feeding the business analytics, from the extraction and population of the central repository to the provisioning of data out of the warehouse and onto the screens of the business clients.

The Importance of Metadata

The management of metadata is probably one of the most critical tasks associated with a successful BI program, for a number of reasons.
Metadata encapsulates both the logical and physical business knowledge.
Metadata captures the structure and meaning of the data that is being fed into the warehouse.
The recording of operational metadata provides a road map.
One can capture differences associated with how data is manipulated.
Metadata provides the means for tracing the evolution of information.
Metadata (cont…)

Technical Metadata

Technical metadata characterizes the structure of data, the way that data moves, and how it is transformed as it moves from one location to another. This may incorporate some or all of the following.
Connectivity Metadata
Table Information
Record Structure Information
Record Manipulation Metadata
Index Metadata
Data Practitioners
Security and Access Metadata
Data Model Metadata
Physical Features Metadata
Reference Metadata
Management Metadata
Metadata (cont…)

Transformation Metadata
Process Metadata
Supplied Data Metadata

Business Metadata

Business metadata incorporates much of the same information as technical metadata, as well as:
Metadata that describes the structure of data as perceived by business clients
Descriptions of the methods for accessing data for client analytical applications
Business meaning for tables and their attributes
Data ownership characteristics and responsibilities
Data domains and mapping between those domains, for validation
Aggregation and summarization directives
Reporting directives
Metadata (cont…)

Security and access policies
Business rules that describe constraints or directives associated with data within a record or between records as joined through a join condition

The Metadata Repository

As the primary source of knowledge about the inner workings of the BI environment, a metadata repository should be built, maintained, and made available to all knowledge workers involved in the BI program. Whether the metadata repository is physically centralized or distributed across multiple systems, and however it is accessed, it is important to provide a mechanism for publishing metadata.
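A metadata repository with a publishing mechanism might look roughly like the following sketch. Everything here is hypothetical: the class and field names, the choice of JSON as the publishing format, and the example table are assumptions made for illustration, and a real repository would cover many more of the metadata types listed above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ColumnMetadata:
    name: str
    data_type: str          # technical metadata: record structure
    business_meaning: str   # business metadata: meaning of the attribute


@dataclass
class TableMetadata:
    name: str
    source_system: str      # technical metadata: where the data comes from
    owner: str              # business metadata: data ownership
    columns: list = field(default_factory=list)


class MetadataRepository:
    """A minimal in-memory registry of table metadata."""

    def __init__(self):
        self._tables = {}

    def register(self, table):
        """Add or replace the metadata for one table."""
        self._tables[table.name] = table

    def describe(self, table_name, column_name):
        """Return the business meaning recorded for a column."""
        for column in self._tables[table_name].columns:
            if column.name == column_name:
                return column.business_meaning
        raise KeyError(f"{table_name}.{column_name} is not registered")

    def publish(self):
        """Serialize all registered metadata as JSON for consumers."""
        payload = {name: asdict(t) for name, t in self._tables.items()}
        return json.dumps(payload, indent=2, sort_keys=True)


def build_example_repository():
    """Register one hypothetical warehouse table."""
    repo = MetadataRepository()
    repo.register(
        TableMetadata(
            name="fact_sales",
            source_system="order_entry",
            owner="Finance",
            columns=[
                ColumnMetadata("revenue", "REAL", "Invoiced amount per sale"),
                ColumnMetadata("units_sold", "INTEGER", "Items sold per sale"),
            ],
        )
    )
    return repo
```

Keeping technical and business fields on the same record is one possible design; it lets a single published document answer both "what type is this column" and "what does it mean to the business".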
Management Issues

The significant management issues associated with the topics in this chapter are the following.

Dueling Opinions

There are basically two different schools of thought about how to build a data warehouse and a BI program, and for some reason there seems to be an almost religious adherence by practitioners of these different schools.

The Technology Trap

There are many interesting technologies associated with data warehousing, but too often technologists drive these projects. It is important to keep in mind that the coolest way to do something is not necessarily the best way to do it.

The Vendor Trap

Be aware that there are many vendors producing canned solutions and products under the guise of data warehouses, data marts, metadata repositories, and OLAP environments. There are many examples of high-cost software products that are too complicated for the customer to use without additional investment in training and consulting, and they ultimately end up as “shelfware.”
End of Slide