InfoCubes and Aggregates

Presentation transcript:

InfoCubes and Aggregates UNWBW1 – Business Information Warehouse NetWeaver Support Consultant Training

Content
- Introduction
- Reporting
- Business content
- Data loading
- InfoCube Design
- Aggregates
- BW-BPS Business Planning & Simulation
- Monitoring & Technical Risks

Star Schema (Logical)

[Diagram: a fact table holding Quantities, Revenues, Costs, and Rev./Group, surrounded by the Product, Customer, Sales, Time, and Competition dimensions.]

The terms InfoCubes, dimensional analysis, star schemas, and DataMarts, depending on their context, essentially refer to the same thing: how data is structured within BW tables. The term 'star schema' is used when discussing the table structure from a conceptual, or data modeling, perspective, whereas 'InfoCube' is typically used when referring to the actual set of tables where data is stored. Within BW, a user creates a query against an InfoCube.

Dimensions

- Dimension tables are groupings of related characteristics.
- A dimension table contains a generated primary key and characteristics.
- The keys of the dimension tables are foreign keys in the fact table.

Customer dimension
C      Customer #    Region    ...
...    13970522      West      ...

Product dimension
P      Product #     Product group    ...
...    2101004       Displays         ...

Time dimension
T      Period    Fiscal year    ...
...    10        1999           ...

From a technical perspective, the characteristics of the dimension tables form the 'edges' of the InfoCube. The dimensions are connected with the fact table through the DIM IDs (dimension IDs). To access data in the fact table, the system selects characteristics and/or characteristic values from the dimension tables and generates a corresponding SQL statement that reads the fact table.

Example: Sales InfoCube Dimensions

Customer: Customer number, Customer name, Cust Category, Cust Subcategory, Division, Industry, Revenue Class, Transportation zone, Currency, VAT #, Legal Status, Regional market, Cust Statistics group, Incoterms, Billing schedule, Price group, Delivering plan, ABC Classification, Account assignment group, Address, State, Country, Region

Product: Material number, Material text, Material type, Category, Subcategory, Market key, MRP Type, Material group 1, Planner, Forecast model, Valuation class, Standard cost, Weight, Volume, Storage conditions, Creation Date

Sales: Salesperson, Rep group, Sales territory, Sales region, Sales district, Sales planning group, Distribution key

Competition: Nielsen indicator, SEC Code, Primary competitor, Secondary Competitor

Time: Date, Week, Month, Fiscal Year

The dimension tables contain the characteristics by which the data is analyzed. These characteristics are often master data elements, organizational elements, or values that describe one of these. Three dimension tables are required by the system: the time, unit, and info package dimensions. Up to 13 other dimension tables may be added to a BW star schema.

Fact Table

- A record of the fact table is uniquely defined by the keys of the dimension tables.
- A relatively small number of columns (key figures) and a large number of rows is typical for fact tables.
- A fact table is maintained during transaction data load.

Fact table (key columns P, C, T, followed by the key figures; the key values are not preserved in the transcript)
Quantity    Revenue      Discount    Sales overhead
250         500,000 $    50,000 $    280,000 $
50          100,000 $    7,500 $     60,000 $
...         ...          ...         ...

Strong entities are the main characteristics which occur in the application being analyzed. The fact table contains the data (key figures) for a certain combination of characteristic values of the dimension tables. The fact table is referenced through the artificial dimension key (DIM ID). Because artificial keys connect the dimension tables and the fact table, changes to the master data table can take place relatively problem-free, without having to rebuild the (natural) key every time. During evaluation, the selections in the dimension tables first produce a result set of DIM IDs. The matching rows are then read directly from the fact table using the artificial key.
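The two-step access described above (select in the dimension tables first, then read the fact table by DIM ID) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries stand in for database tables, the second customer and the region "East" are invented, and `select_dim_ids` and `read_facts` are hypothetical helper names, not BW functions.

```python
# Hypothetical miniature star schema. Customer 13970523 and region "East"
# are invented for illustration; the other values appear on the slides.
customer_dim = {
    1: {"customer": "13970522", "region": "West"},
    2: {"customer": "13970523", "region": "East"},
}

# Fact rows: DIM IDs for product (P), customer (C), time (T) plus key figures.
fact_table = [
    {"P": 1, "C": 1, "T": 1, "quantity": 250, "revenue": 500_000},
    {"P": 1, "C": 2, "T": 1, "quantity": 50, "revenue": 100_000},
]


def select_dim_ids(dim_table, **conditions):
    """Step 1: resolve characteristic values to the matching DIM IDs."""
    return {
        dim_id
        for dim_id, row in dim_table.items()
        if all(row[name] == value for name, value in conditions.items())
    }


def read_facts(fact_rows, key_column, dim_ids):
    """Step 2: read the fact table directly by the artificial key."""
    return [row for row in fact_rows if row[key_column] in dim_ids]


west_ids = select_dim_ids(customer_dim, region="West")
west_rows = read_facts(fact_table, "C", west_ids)
west_revenue = sum(row["revenue"] for row in west_rows)
print(west_revenue)
```

Because the fact table only stores the artificial key, renaming a region in the dimension table would not require touching any fact row.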

Facts - Sales Example

Sales facts: Quantity sold, List price, Discounts, Invoice price, Fixed mfg. cost, Variable cost, Moving average price, Standard cost, Contribution margin, Expected ship date, Actual ship date

The fact table contains key figures, in other words, values that help business people evaluate their company and make the appropriate decisions. These key figures can be calculated or brought over directly from the source system.

Example: Sales Star Schema

[Diagram: the Facts table in the center, with key fields Customer, Material, Competition, Sales, and Time, linked to the five dimension tables below.]

Facts: Qty sold, List price, Discounts, Invoice price, Fixed mfg cost, Variable cost, Moving average price, Standard cost, Contribution margin, Expected ship date, Actual ship date

Customer: Customer number, Customer name, Cust. Category, Cust. Subcategory, Division, Industry, Revenue Class, Transportation zone, Currency, VAT #, Legal Status, Regional market, Cust. Statistics group, IncoTerms, Billing schedule, Price group, Delivering plan, ABC Classification, Account assignment group, Address, State, Country, Region

Material: Material number, Material text, Material type, Category, Subcategory, Market key, MRP Type, Material group 1, Planner, Forecast model, Valuation class, Standard cost, Weight, Volume, Storage conditions, Creation Date

Sales: Salesperson, Rep group, Sales territory, Sales region, Sales district, Sales planning group, Distribution key

Competition: Nielsen indicator, SEC Code, Primary competitor, Secondary Competitor

Time: Date, Week, Month, Fiscal Year

This slide pulls the concept together for a sales-related star schema. The fact table contains the key figures as well as the keys, or links, to the dimension tables. The data can then be sliced into many different combinations of these values within a query. This is the essence of creating a query against an SAP BW InfoCube.

Extending the Star Schema

In a basic Star Schema we are limited:
- Only characteristics of the dimension tables can be used to access facts.
- No structured drill-downs can be created.
- Support for many languages is difficult.

In BW, the Extended Star Schema adds access to:
- Master data tables and their associated fields (attributes).
- Text tables with extensive multilingual descriptions.
- External hierarchy tables for structured access to the data.

With the extended Star Schema, master data characteristics (and their attributes, texts, and hierarchies) can be referred to from multiple InfoCubes.

SAP BW: Extended Star Schema

[Diagram: the InfoCube (fact table plus dimension tables) linked through SID tables to the master data tables. Tables and fields shown:
- Fact Table: DIM_ID_PACKAGE, DIM_ID_TIME, DIM_ID_UNIT, DIM_ID_MATERIAL, DIM_ID_CUSTOMER, Amount, Sales
- Customer Dimension Table: DIM_ID_CUSTOMER, SID_CUSTOMER
- Material Dimension Table: DIM_ID_MATERIAL, SID_MATERIAL
- Datapackage Dimension Table: DIM_ID_PACKAGE, SID_REQUEST
- Unit Dimension Table: DIM_ID_UNIT, SID_AMOUNT, SID_CURRENCY
- Time Dimension Table: DIM_ID_TIME, SID_MONTH, SID_YEAR
- Customer SID-Table: CUSTOMER_ID, SID_CUSTOMER
- Material SID-Table: MATERIAL_ID, SID_MATERIAL
- Amount SID-Table: AMOUNT_ID, SID_AMOUNT
- Currency SID-Table: CURRENCY_ID, SID_CURRENCY
- Request SID-Table: REQUEST_ID, SID_REQUEST
- Calendar Month SID-Table: MONTH_ID, SID_MONTH
- Calendar Year SID-Table: YEAR_ID, SID_YEAR
- Customer Text Table: CUSTOMER_ID, Customer Name
- Customer Attributes Table: CUSTOMER_ID, City, Region
- Material Text Table: MATERIAL_ID, Material Name
- Material Attributes Table: MATERIAL_ID, Material Group
- external Material Hierarchy]

The BW extended star schema differs from the basic star schema. It is subdivided into a solution-dependent part (the InfoCube) and a solution-independent part (attribute tables, text tables, and hierarchy tables) that is shared with other InfoCubes. The dimension attributes of the dimension tables are called characteristics. The attributes located in the master data table of a characteristic are called the attributes of the characteristic. The great challenge when designing a solution is to decide whether to store an attribute in a dimension table (and therefore in the InfoCube) or in a master data table. Data is loaded separately into the master data tables (attribute tables), text tables, and hierarchy tables. The SID table is the link between the master data and the dimension tables.
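To make the role of the SID table concrete, the following Python sketch walks the chain from a fact row to a master data attribute: fact table, dimension table, SID table, attribute table. It is a simplified model, not BW code. The field names follow the diagram labels, while all IDs, regions, amounts, and the helper `region_of` are invented for illustration.

```python
# Master data (solution-independent part); all contents are hypothetical.
customer_sid = {"C100": 1, "C200": 2}  # CUSTOMER_ID -> SID_CUSTOMER
customer_attributes = {                # CUSTOMER_ID -> attributes
    "C100": {"city": "Berlin", "region": "North"},
    "C200": {"city": "Munich", "region": "South"},
}

# InfoCube (solution-dependent part).
customer_dimension = {10: 1, 20: 2}    # DIM_ID_CUSTOMER -> SID_CUSTOMER
fact_table = [
    {"DIM_ID_CUSTOMER": 10, "amount": 100},
    {"DIM_ID_CUSTOMER": 20, "amount": 50},
    {"DIM_ID_CUSTOMER": 10, "amount": 25},
]

# Reverse lookup so a SID can be resolved back to the natural key.
sid_to_customer = {sid: customer_id for customer_id, sid in customer_sid.items()}


def region_of(dim_id):
    """Follow DIM ID -> SID -> CUSTOMER_ID -> attribute 'region'."""
    sid = customer_dimension[dim_id]
    customer_id = sid_to_customer[sid]
    return customer_attributes[customer_id]["region"]


amount_by_region = {}
for row in fact_table:
    region = region_of(row["DIM_ID_CUSTOMER"])
    amount_by_region[region] = amount_by_region.get(region, 0) + row["amount"]

print(amount_by_region)
```

In this model the region lives only in the attribute table, so a second InfoCube could reuse the same `customer_sid` and `customer_attributes` tables without copying them.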

Dimensions

- An InfoCube can have up to 16 dimensions.
- 3 dimensions exist with each InfoCube (whether they are used and thus visible or not):
  - Time dimension
  - Unit dimension
  - Packet dimension
- The remaining 13 dimensions are for individual schema design.
- Each dimension table may contain up to 248 characteristics.

[Diagram: external hierarchy with the sales organization (Vertriebsorganisation) at the top, Regions 1 to 3 below it, then Districts (Bezirk) 1 to 5, then Territories (Gebiet) 1 to 8, including Territory 3a. Further labels: Material Group Hierarchy Table; Material Text Table (Number, Language Code, Material Name); Material Dimension (Material_Dimension_ID); Attribute Table (Material Type).]

Summary

- The fact tables form the center of a multidimensional schema in BW. They are surrounded by dimensions.
- Dimension Table: In BW, the attributes of the dimension tables are called characteristics (e.g. material).
- Master Data Tables:
  - Attribute Tables: Dependent attributes of a characteristic can be stored in an attribute table for the characteristic.
  - Text Tables: Textual descriptions of a characteristic are stored in a separate text table.
  - External Hierarchy Tables: Hierarchies of characteristics or attributes may be stored in separate hierarchy tables.

Compressing the InfoCube

- Records added to InfoCube fact tables have several "keys" which uniquely identify the record. Request ID is just one of several fields in a record that help identify the data.
- The Request ID can be removed, and each record can still be uniquely identified.
- Compression finds records which are identical except for Request ID, then aggregates these into one single record.
- If compression is not performed, the GROUP BY clause of every query's SQL statement has to combine these duplicates at run time. This results in decreased query performance.
- InfoCube compression is important in maintaining query performance. It helps manage fact table size.
- Drawback: Requests can no longer be deleted from the InfoCube selectively.
- Only the compressed table (E table) can be partitioned by a date range.

When you load data into the InfoCube, entire requests can be inserted at the same time. Each of these requests has its own request ID, which is included in the fact table in the packet dimension. This makes it possible to handle individual requests separately. One advantage of the request ID concept is that you can delete complete requests from the InfoCube. However, the request ID concept can also cause the same data record (all characteristics except the request ID match) to appear more than once in the fact table. This unnecessarily increases the data volume and reduces reporting performance, because the system has to aggregate across request IDs every time a query runs.

By compressing, you can eliminate these disadvantages and bring the data from the different requests together into one single request (request ID 0). If you are using an Oracle database as your BW database, you can also run a report on the relevant InfoCube while the compression is running. With other manufacturers' databases, you will see a warning if you try to run a report on an InfoCube while the compression is running. In this case you can only report on the relevant InfoCube when the compression has finished running.

Compressing the InfoCube

[Figure: COMPRESSION moves rows from the F fact table to the E fact table. Both tables have the columns Request, Date, Record, and Cost; the example rows are dated 01.01.2002, 02.06.2002, and 04.10.2002. After compression the request IDs are lost.]

Compressing an InfoCube means that the request ID is deleted and all rows with the same key are summarized (for non-cumulative key figures). Use this function with care, because afterwards you can no longer use the request IDs to delete data from the InfoCube. Before you proceed, make sure that the data in the InfoCube is correct. Compress the InfoCube at regular intervals; this saves space.
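Conceptually, compression is a group-by over every key except the request ID. The sketch below shows that idea in Python. The rows are illustrative values loosely modeled on the figure (the exact row contents are not recoverable from the transcript), and `compress` is a hypothetical function, not the BW implementation.

```python
from collections import defaultdict

# Illustrative F fact table: two requests, one record loaded twice.
f_fact_table = [
    {"request": 1, "date": "01.01.2002", "record": 1, "cost": 100},
    {"request": 1, "date": "02.06.2002", "record": 2, "cost": 200},
    {"request": 2, "date": "01.01.2002", "record": 1, "cost": 200},
    {"request": 2, "date": "04.10.2002", "record": 3, "cost": 300},
]


def compress(rows):
    """Drop the request ID and sum rows that share all remaining keys."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["date"], row["record"])] += row["cost"]
    # All compressed rows end up under request ID 0.
    return [
        {"request": 0, "date": date, "record": record, "cost": cost}
        for (date, record), cost in totals.items()
    ]


e_fact_table = compress(f_fact_table)
for row in e_fact_table:
    print(row)
```

After this step there is no way to tell which part of the 01.01.2002 total came from request 1 and which from request 2, which is why selective deletion by request is no longer possible.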


Aggregates ...

- ... are like InfoCubes,
- ... are always based on InfoCubes,
- ... summarize ("aggregate") data of the originating InfoCube,
- ... contain redundant information, but
- ... accelerate the access to that information,
- ... are performance-enhancing features.

Like a query, an aggregate constitutes a subset of the star schema of the related InfoCube. However, it uses its own private fact table and possibly its own dimension tables. For example, an aggregate can discard certain levels of detail, such as "day" and "city" or the sales organization, and keep data on a summarized level. This means that an aggregate's fact table can be much smaller, which translates into an immediate performance benefit when a query is processed on the aggregate rather than on the original InfoCube. Obviously, an aggregate does not contain all the detail information of the original InfoCube and as such cannot replace that InfoCube. However, a handful of well-defined aggregates can substantially improve the performance of the standard queries that users will be executing.

Aggregates are only available for InfoCubes, not for other InfoProviders such as ODS objects or MultiProviders.

Aggregates - Example

Data for queries like 'sales for all countries', 'sales in Germany', or 'overall sales' can be read out of the aggregate (country *).

[Table: Fact Table (Sales Data) with the columns Country, Customer, Sales. The rows are flattened in the transcript: countries USA, Germany, Austria; customers Buggy Soft Inc., Ocean Networks, Funny Duds Inc., Thor Industries; sales values 10, 15, 5, 20, 25.]

Aggregate Table: Sales Data, Country *
Country    Sales
USA        40
Germany    35
Austria    20

The aggregate (country *) may also be useful for queries like 'sales for a navigational attribute of country' or 'sales for a node of a country hierarchy'.
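A small Python sketch of the (country *) aggregate follows. The fact rows are an assumption: the transcript does not preserve the row-by-row assignment, so the rows below were chosen only so that their totals match the aggregate table on the slide (USA 40, Germany 35, Austria 20). `build_country_aggregate` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical fact rows (country, customer, sales), consistent with the
# aggregate totals shown on the slide but not taken from it row by row.
fact_table = [
    ("USA", "Buggy Soft Inc.", 10),
    ("Germany", "Ocean Networks", 15),
    ("Austria", "Ocean Networks", 10),
    ("USA", "Funny Duds Inc.", 5),
    ("Austria", "Thor Industries", 10),
    ("USA", "Ocean Networks", 25),
    ("Germany", "Funny Duds Inc.", 20),
]


def build_country_aggregate(rows):
    """Summarize the fact table by country, dropping the customer level."""
    totals = defaultdict(int)
    for country, _customer, sales in rows:
        totals[country] += sales
    return dict(totals)


country_aggregate = build_country_aggregate(fact_table)

# All three example queries read the 3-row aggregate, not the 7-row fact table.
sales_for_all_countries = country_aggregate
sales_in_germany = country_aggregate["Germany"]
overall_sales = sum(country_aggregate.values())
print(sales_for_all_countries, sales_in_germany, overall_sales)
```

A query that needs sales per customer cannot use this aggregate, because the customer level was discarded when it was built.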

Aggregates - Example using filters

Data for queries like 'sales for all customers in Germany' can be read out of the aggregate (country F, Germany; customer *).

[Table: Fact Table (Sales Data), the same as in the first aggregate example.]

Aggregate Table: Sales Data, Country = Germany, Customer *
Country    Customer           Sales
Germany    Ocean Networks     15
Germany    Funny Duds Inc.    20

Aggregates with filters are only useful for queries with the same filter.
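The rule that a filtered aggregate only helps queries with the same filter can be expressed as a simple check. The sketch below is a simplified model of that rule, not the BW optimizer's actual logic. The two German rows come from the slide; the other fact rows and both function names are assumptions made for illustration.

```python
# Fact rows (country, customer, sales). The two German rows appear on the
# slide; the USA and Austria rows are hypothetical filler.
fact_table = [
    ("USA", "Buggy Soft Inc.", 10),
    ("Germany", "Ocean Networks", 15),
    ("Germany", "Funny Duds Inc.", 20),
    ("Austria", "Thor Industries", 10),
]

# Fixed values of the aggregate (country F, Germany; customer *).
aggregate_fixed_values = {"country": "Germany"}


def build_filtered_aggregate(rows, country):
    """Keep only rows for one country and summarize them by customer."""
    totals = {}
    for row_country, customer, sales in rows:
        if row_country == country:
            totals[customer] = totals.get(customer, 0) + sales
    return totals


def can_use_aggregate(fixed_values, query_filter):
    """The query must restrict every fixed characteristic to the same value."""
    return all(
        query_filter.get(name) == value for name, value in fixed_values.items()
    )


germany_aggregate = build_filtered_aggregate(fact_table, "Germany")
print(germany_aggregate)
print(can_use_aggregate(aggregate_fixed_values, {"country": "Germany"}))
print(can_use_aggregate(aggregate_fixed_values, {"country": "USA"}))
print(can_use_aggregate(aggregate_fixed_values, {}))
```

The last check shows why an unfiltered query such as 'overall sales' cannot use this aggregate: the rows for the other countries are simply not in it.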

Aggregates - Example using master data

[Table: Attribute Table (Customer) with the columns Customer and Industry. Customers: Buggy Soft Inc., Funny Duds Inc., Ocean Networks, Thor Industries. Industries: Technology, Consumer Products, Chemical.]

[Table: Fact Table (Sales Data), the same as in the first aggregate example.]

Aggregate Table: Sales Data, Industry *
Industry             Sales
Technology           60
Consumer Products    25
Chemical             10

Data for queries like 'sales grouped by industries' can be read out of the aggregate (industry *). In this example, the navigational attribute 'industry' is read from the master data table.
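The following sketch builds the (industry *) aggregate by looking up each customer's industry in the attribute table while summarizing. Both the customer-to-industry mapping and the fact rows are assumptions: the transcript flattens these tables, so the values below were chosen to reproduce the aggregate totals on the slide (Technology 60, Consumer Products 25, Chemical 10).

```python
# Assumed attribute table: customer -> industry (navigational attribute).
customer_industry = {
    "Buggy Soft Inc.": "Technology",
    "Funny Duds Inc.": "Consumer Products",
    "Ocean Networks": "Technology",
    "Thor Industries": "Chemical",
}

# Hypothetical fact rows (country, customer, sales), consistent with the
# slide's aggregate totals but not recoverable row by row.
fact_table = [
    ("USA", "Buggy Soft Inc.", 10),
    ("Germany", "Ocean Networks", 15),
    ("Austria", "Ocean Networks", 10),
    ("USA", "Funny Duds Inc.", 5),
    ("Austria", "Thor Industries", 10),
    ("USA", "Ocean Networks", 25),
    ("Germany", "Funny Duds Inc.", 20),
]


def build_industry_aggregate(rows, industry_of):
    """Summarize sales by the industry attribute of each customer."""
    totals = {}
    for _country, customer, sales in rows:
        industry = industry_of[customer]
        totals[industry] = totals.get(industry, 0) + sales
    return totals


industry_aggregate = build_industry_aggregate(fact_table, customer_industry)
print(industry_aggregate)
```

Because the industry is resolved from master data at build time, the aggregate becomes stale as soon as a customer's industry changes, which is what the change run later corrects.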

Aggregates - Example using hierarchies

Hierarchy for Country
All
  Europe
    Germany
    Austria
  America
    USA

[Table: Fact Table (Sales Data), the same as in the first aggregate example.]

Aggregate Table: Sales Data, Country Hierarchy, Level 2
Country    Sales
America    40
Europe     55

Queries like 'sales for Europe', 'sales for ALL', 'overall sales', or 'sales for all countries ordered by the country hierarchy up to level 1 or 2' may use the aggregate (country H Level 2). Aggregates with a hierarchy are useful for queries which use nodes of the hierarchy as a filter or which use the hierarchy as a presentation hierarchy. The level of the desired nodes must be less than or equal to the level in the aggregate.
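A hierarchy aggregate can be pictured as repeated roll-ups along a parent relation. The sketch below starts from the per-country totals shown in the first aggregate example (USA 40, Germany 35, Austria 20) and uses the country hierarchy from this slide. The `roll_up` helper is hypothetical and only models the idea.

```python
# Country hierarchy from the slide: level 1 = All, level 2 = Europe/America,
# level 3 = the individual countries.
parent_of = {
    "Germany": "Europe",
    "Austria": "Europe",
    "USA": "America",
    "Europe": "All",
    "America": "All",
}

# Per-country totals as shown in the (country *) aggregate example.
country_sales = {"USA": 40, "Germany": 35, "Austria": 20}


def roll_up(totals, parents):
    """Move every total one level up the hierarchy and sum per parent node."""
    result = {}
    for node, sales in totals.items():
        parent = parents[node]
        result[parent] = result.get(parent, 0) + sales
    return result


level_2_aggregate = roll_up(country_sales, parent_of)
level_1_totals = roll_up(level_2_aggregate, parent_of)
print(level_2_aggregate)
print(level_1_totals)
```

The level-2 aggregate can answer level-1 queries by one more roll-up, but it cannot answer a level-3 query such as 'sales in Germany', since the country detail is gone. This mirrors the rule that the requested level must be less than or equal to the aggregate's level.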

Aggregates - Maintenance

[Screenshot: aggregate maintenance screen. Labeled functions: Show aggregate hierarchy, Transport, Activate & Fill, Switch on/off, BDS, unsaved changes.]

Menu path: Admin-Workbench > InfoCubes > right mouse click on InfoCube > Maintain aggregates...

The screen is split into two subscreens:
- Left-hand side: InfoCube with its characteristics and time-independent attributes.
- Right-hand side: Aggregates with header information and components. The components of the aggregate are sorted by dimensions.
- Technical names and text are visible.

Definition of aggregates:
- Drag & drop.
- 'Wizard'-like when defining fixed values and hierarchy levels, started with a right mouse click on the InfoObject.
- Activate and fill is one single step.
- Transport of the aggregate definition is possible.
- Documents can be stored for an aggregate (BDS).
- An aggregate can be set inactive for queries, but it is still filled and will be included in a rollup and change run.

Aggregate Maintenance

After new data is loaded, existing aggregates have to be adjusted to make the new data available for reporting:
- Aggregate rollup: The newly uploaded transactional data is added to the aggregates.
- Change run (master data activation): The newly uploaded master data is applied to the aggregates and activated. During the change run, all aggregates containing navigational attributes and/or hierarchies are realigned.
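The difference between the two maintenance steps can be illustrated with a toy industry aggregate: a rollup adds new transaction rows to the existing totals, while a change run realigns the totals after master data changed. This is a conceptual sketch only. All data and the functions `build_aggregate`, `rollup`, and `change_run` are hypothetical, and the change run is modeled here as a full rebuild, which is a simplifying assumption.

```python
def build_aggregate(rows, industry_of):
    """Summarize (customer, sales) rows by the customer's industry."""
    totals = {}
    for customer, sales in rows:
        industry = industry_of[customer]
        totals[industry] = totals.get(industry, 0) + sales
    return totals


def rollup(aggregate, new_rows, industry_of):
    """Rollup: add newly loaded transaction rows to an existing aggregate."""
    updated = dict(aggregate)
    for industry, sales in build_aggregate(new_rows, industry_of).items():
        updated[industry] = updated.get(industry, 0) + sales
    return updated


def change_run(all_rows, new_industry_of):
    """Change run: realign the aggregate to changed master data.

    Modeled as a full rebuild for simplicity (an assumption of this sketch).
    """
    return build_aggregate(all_rows, new_industry_of)


# Hypothetical master data and transaction data.
customer_industry = {"Buggy Soft Inc.": "Technology", "Thor Industries": "Chemical"}
loaded_rows = [("Buggy Soft Inc.", 10), ("Thor Industries", 10)]
aggregate = build_aggregate(loaded_rows, customer_industry)

# A new request arrives: only the aggregate totals need to be increased.
new_request = [("Buggy Soft Inc.", 5)]
aggregate_after_rollup = rollup(aggregate, new_request, customer_industry)

# Master data changes: Thor Industries is reassigned to Technology.
changed_industry = {**customer_industry, "Thor Industries": "Technology"}
aggregate_after_change_run = change_run(loaded_rows + new_request, changed_industry)

print(aggregate_after_rollup)
print(aggregate_after_change_run)
```

Until the change run has finished, the aggregate would still report 10 under Chemical, which is why new master data only becomes visible in reporting after activation.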