InfoCubes and Aggregates


1 InfoCubes and Aggregates
UNWBW1 – Business Information Warehouse NetWeaver Support Consultant Training

2 Content
Introduction
Reporting
Business content
Data loading
InfoCube Design
Aggregates
BW-BPS Business Planning & Simulation
Monitoring & Technical Risks

3 Star Schema
[Diagram: a central fact table (Quantities, Revenues, Costs, Rev./Group) surrounded by the Product, Customer, Sales, Time, and Competition dimensions.]
Star Schema (Logical)
The terms InfoCube, dimensional analysis, star schema, and DataMart, depending on their context, essentially refer to the same thing: how data is structured within BW tables. The term ‘star schema’ is used when discussing the table structure from a conceptual, or data modeling, perspective, whereas ‘InfoCube’ is typically used when referring to the actual set of tables where data is stored. Within BW, a user creates a query against an InfoCube.

4 Dimensions
Dimension tables are groupings of related characteristics.
A dimension table contains a generated primary key and characteristics.
The keys of the dimension tables are foreign keys in the fact table.
Customer dimension: C | Customer # | Region | … (sample Region value: West)
Product dimension: P | Product # | Product group | … (sample Product group value: Displays)
Time dimension: T | Period | Fiscal year | …
From a technical perspective, the characteristics of the dimension tables form the ‘edges’ of the InfoCube. The dimensions are connected with the fact table through the DIM IDs (dimension IDs). Data in the fact table is accessed by selecting characteristics and/or characteristic values from the dimension tables and generating a corresponding SQL statement against the fact table.
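The access path described above can be sketched with an in-memory SQLite database. This is only an illustration: the table names, column names, and sample rows are hypothetical and are not the names or values BW generates. It shows generated dimension keys used as foreign keys in the fact table, and a selection on dimension characteristics turned into one SQL statement against the fact table.

```python
import sqlite3

# Hypothetical miniature star schema; all names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (dim_id INTEGER PRIMARY KEY, customer_no TEXT, region TEXT);
    CREATE TABLE dim_product  (dim_id INTEGER PRIMARY KEY, product_no TEXT, product_group TEXT);
    CREATE TABLE dim_time     (dim_id INTEGER PRIMARY KEY, period TEXT, fiscal_year INTEGER);
    CREATE TABLE fact_sales (
        dim_customer INTEGER REFERENCES dim_customer(dim_id),
        dim_product  INTEGER REFERENCES dim_product(dim_id),
        dim_time     INTEGER REFERENCES dim_time(dim_id),
        quantity     INTEGER,
        revenue      REAL
    );
    INSERT INTO dim_customer VALUES (1, 'C-100', 'West'), (2, 'C-200', 'East');
    INSERT INTO dim_product  VALUES (1, 'P-10', 'Displays'), (2, 'P-20', 'Keyboards');
    INSERT INTO dim_time     VALUES (1, '2004-01', 2004), (2, '2004-02', 2004);
    INSERT INTO fact_sales VALUES
        (1, 1, 1, 5, 500.0),
        (1, 2, 1, 2, 80.0),
        (2, 1, 2, 3, 300.0),
        (1, 1, 2, 1, 100.0);
""")


def revenue_by_group(region):
    """Select on a dimension characteristic, then aggregate the fact table."""
    return conn.execute(
        """
        SELECT p.product_group, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON f.dim_customer = c.dim_id
        JOIN dim_product  AS p ON f.dim_product  = p.dim_id
        WHERE c.region = ?
        GROUP BY p.product_group
        ORDER BY p.product_group
        """,
        (region,),
    ).fetchall()


print(revenue_by_group("West"))  # [('Displays', 600.0), ('Keyboards', 80.0)]
```

The fact table never stores the region or the product group itself; both are reached only through the dimension keys.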

5 Example: Sales InfoCube Dimensions
Customer: Customer number, Customer name, Cust Category, Cust Subcategory, Division, Industry, Revenue Class, Transportation zone, Currency, VAT #, Legal Status, Regional market, Cust Statistics group, Incoterms, Billing schedule, Price group, Delivering plan, ABC Classification, Account assignment group, Address, State, Country, Region
Product: Material number, Material text, Material type, Category, Subcategory, Market key, MRP Type, Material group 1, Planner, Forecast model, Valuation class, Standard cost, Weight, Volume, Storage conditions, Creation Date
Sales: Salesperson, Rep group, Sales territory, Sales region, Sales district, Sales planning group, Distribution key
Competition: Nielsen indicator, SEC Code, Primary competitor, Secondary Competitor
Time: Date, Week, Month, Fiscal Year
The dimension tables contain the values that need to be analyzed. These characteristics are often master data elements, organizational elements, or values that can be used to describe one of these. Three dimension tables are required by the system: the time, unit, and info package dimensions. Up to 13 other dimension tables may be added to a BW star schema.

6 Fact Table
A record of the fact table is uniquely defined by the keys of the dimension tables.
A relatively small number of columns (key figures) and a large number of rows is typical for fact tables.
A fact table is maintained during transaction data load.
Fact table:
  P | C | T | Quantity | Revenue   | Discount | Sales overhead
  … | … | … | …        | …,000 $   | 50,000 $ | 280,000 $
  … | … | … | 50       | 100,000 $ | 7,500 $  | 60,000 $
  … | … | … | …        | …         | …        | …
Strong entities are the main characteristics which occur in the application being analyzed. The fact table contains the data (key figures) for a certain combination of characteristic values of the dimension tables. The fact table is referenced through the artificially generated dimension key (DIM ID). Because artificial keys form the connection between the dimension tables and the fact table, changes to the master data tables can take place relatively problem-free, without having to rebuild the (natural) key every time. During evaluation, a result set is first formed from the selections in the dimension tables. The matching records are then read directly from the fact table using the artificial keys.
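A minimal sketch of the artificial-key idea, assuming a much simplified load step: the `Dimension` class, the `load` function, and the sample records are hypothetical and only illustrate that the fact rows carry generated integers (DIM IDs) instead of natural keys.

```python
class Dimension:
    """Hypothetical dimension table: maps a combination of characteristic
    values to a generated integer key (DIM ID)."""

    def __init__(self):
        self._ids = {}

    def dim_id(self, *values):
        # Allocate the next integer the first time a combination is seen.
        if values not in self._ids:
            self._ids[values] = len(self._ids) + 1
        return self._ids[values]


def load(records, customer_dim, product_dim):
    """Translate natural keys to DIM IDs while writing fact rows."""
    fact_rows = []
    for customer, product, revenue in records:
        fact_rows.append(
            (customer_dim.dim_id(customer), product_dim.dim_id(product), revenue)
        )
    return fact_rows


customers, products = Dimension(), Dimension()
rows = load(
    [("C-100", "P-10", 500.0), ("C-200", "P-10", 300.0), ("C-100", "P-20", 80.0)],
    customers,
    products,
)
print(rows)  # [(1, 1, 500.0), (2, 1, 300.0), (1, 2, 80.0)]
```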

7 Example: Sales Facts
Facts - Sales: Quantity sold, List price, Discounts, Invoice price, Fixed mfg. cost, Variable cost, Moving average price, Standard cost, Contribution margin, Expected ship date, Actual ship date
The fact table contains key figures, or in other words, values that help a business person evaluate their company and make the appropriate decisions. These key figures could be calculated or brought directly over from the source system.

8 Example: Sales Star Schema
Facts (keys): Customer, Material, Competition, Sales, Time
Facts (key figures): Qty sold, List price, Discounts, Invoice price, Fixed mfg cost, Variable cost, Moving average price, Standard cost, Contribution margin, Expected ship date, Actual ship date
Customer: Customer number, Customer name, Cust. Category, Cust. Subcategory, Division, Industry, Revenue Class, Transportation zone, Currency, VAT #, Legal Status, Regional market, Cust. Statistics group, IncoTerms, Billing schedule, Price group, Delivering plan, ABC Classification, Account assignment group, Address, State, Country, Region
Material: Material number, Material text, Material type, Category, Subcategory, Market key, MRP Type, Material group 1, Planner, Forecast model, Valuation class, Standard cost, Weight, Volume, Storage conditions, Creation Date
Sales: Salesperson, Rep group, Sales territory, Sales region, Sales district, Sales planning group, Distribution key
Competition: Nielsen indicator, SEC Code, Primary competitor, Secondary Competitor
Time: Date, Week, Month, Fiscal Year
This slide pulls the concept together for a sales-related star schema. The fact table contains the key figures as well as the keys, or links, to the dimension tables. The data can then be sliced into many different combinations of these values within a query. This is the essence of creating a query against an SAP BW InfoCube.

9 Extending the Star Schema
In a basic Star Schema we are limited:
  Only characteristics of the dimension tables can be used to access facts.
  No structured drill downs can be created.
  Support for many languages is difficult.
In BW, the Extended Star Schema adds access to:
  Master data tables and their associated fields (attributes).
  Text tables with extensive multilingual descriptions.
  External hierarchy tables for structured access to the data.
With the extended Star Schema, master data characteristics (and their attributes, texts, and hierarchies) can be referred to from multiple InfoCubes.

10 SAP BW: Extended Star Schema
[Diagram: the InfoCube (fact table and dimension tables) linked through SID tables to the shared master data tables.]
Master data tables:
  Customer Text Table: CUSTOMER_ID, Customer Name
  Customer Attributes Table: CUSTOMER_ID, City, Region
  Material Text Table: MATERIAL_ID, Material Name
  Material Attributes Table: MATERIAL_ID, Material Group
  external Material Hierarchy
SID tables:
  Customer SID-Table: CUSTOMER_ID, SID_CUSTOMER
  Material SID-Table: MATERIAL_ID, SID_MATERIAL
  Amount SID-Table: AMOUNT_ID, SID_AMOUNT
  Currency SID-Table: CURRENCY_ID, SID_CURRENCY
  Request SID-Table: REQUEST_ID, SID_REQUEST
  Calendar Month SID-Table: MONTH_ID, SID_MONTH
  Calendar Year SID-Table: YEAR_ID, SID_YEAR
InfoCube:
  Fact Table: DIM_ID_PACKAGE, DIM_ID_TIME, DIM_ID_UNIT, DIM_ID_MATERIAL, DIM_ID_CUSTOMER, Amount, Sales
  Customer Dimension Table: DIM_ID_CUSTOMER, SID_CUSTOMER
  Material Dimension Table: DIM_ID_MATERIAL, SID_MATERIAL
  Datapackage Dimension Table: DIM_ID_PACKAGE, SID_REQUEST
  Unit Dimension Table: DIM_ID_UNIT, SID_AMOUNT, SID_CURRENCY
  Time Dimension Table: DIM_ID_TIME, SID_MONTH, SID_YEAR
The BW extended star schema differs from the basic star schema. It is subdivided into a solution-dependent part (the InfoCube) and a solution-independent part (attribute tables, text tables, and hierarchy tables) that is shared among the other InfoCubes. The dimension attributes of the dimension tables are called characteristics. The attributes located in the master data table of a characteristic are called the attributes of the characteristic. The great challenge when designing a solution is to decide whether to store an attribute in a dimension table (and therefore in the InfoCube) or in a master data table. Data is loaded separately into the master data tables (attribute tables), text tables, and hierarchy tables. The SID table is the link between the master data and the dimension tables.
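To make the role of the SID table concrete, here is a small sketch in SQLite. The table layout loosely follows the material branch of the diagram, but the lowercase table names, the `lang` column, and all rows are assumptions made for illustration. The query reaches the attribute and the language-dependent text through the SID table, without either being stored in the InfoCube.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Solution-independent part: shared master data (illustrative names).
    CREATE TABLE sid_material  (material_id TEXT PRIMARY KEY, sid_material INTEGER UNIQUE);
    CREATE TABLE attr_material (material_id TEXT PRIMARY KEY, material_group TEXT);
    CREATE TABLE text_material (material_id TEXT, lang TEXT, material_name TEXT);
    -- Solution-dependent part: one InfoCube.
    CREATE TABLE dim_material (dim_id_material INTEGER PRIMARY KEY, sid_material INTEGER);
    CREATE TABLE fact (dim_id_material INTEGER, amount REAL);

    INSERT INTO sid_material  VALUES ('M1', 101), ('M2', 102);
    INSERT INTO attr_material VALUES ('M1', 'Displays'), ('M2', 'Displays');
    INSERT INTO text_material VALUES
        ('M1', 'EN', 'Monitor'),   ('M1', 'DE', 'Bildschirm'),
        ('M2', 'EN', 'Projector'), ('M2', 'DE', 'Beamer');
    INSERT INTO dim_material VALUES (1, 101), (2, 102);
    INSERT INTO fact VALUES (1, 100.0), (2, 40.0), (1, 60.0);
""")


def amount_by_material(lang):
    """Fact -> dimension (DIM ID) -> SID table -> attribute and text tables."""
    return conn.execute(
        """
        SELECT t.material_name, a.material_group, SUM(f.amount)
        FROM fact AS f
        JOIN dim_material  AS d ON f.dim_id_material = d.dim_id_material
        JOIN sid_material  AS s ON d.sid_material = s.sid_material
        JOIN attr_material AS a ON s.material_id = a.material_id
        JOIN text_material AS t ON s.material_id = t.material_id AND t.lang = ?
        GROUP BY t.material_name, a.material_group
        ORDER BY t.material_name
        """,
        (lang,),
    ).fetchall()


print(amount_by_material("EN"))  # [('Monitor', 'Displays', 160.0), ('Projector', 'Displays', 40.0)]
```

Because the text and attribute tables sit outside the cube, a second InfoCube could join to the same `sid_material` rows without copying any master data.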

11 Dimensions
Up to 16 dimensions
3 dimensions exist with each InfoCube (whether they are used and thus visible or not):
  Time dimension
  Unit dimension
  Packet dimension
The remaining 13 dimensions are for individual schema design.
Each dimension table may contain up to 248 characteristics.
[Diagram: a sales organization hierarchy (Regions 1 to 3, Districts 1 to 5, Territories 1 to 8 including 3a) next to a Material Dimension with its Material Group Hierarchy Table, Material Text Table (Number, Language Code, Material Name), and Material Dimension Attribute Table (Material_Dimension_ID, Material Type).]

12 Summary
The fact tables form the center of a multidimensional schema in BW. The fact tables are surrounded by dimensions.
Dimension Table: In BW the attributes of the dimension tables are called characteristics (e.g. material).
Master Data Tables:
  Attribute Tables: Dependent attributes of a characteristic can be stored in an attribute table for the characteristic.
  Text Tables: Textual descriptions of a characteristic are stored in a separate text table.
  External Hierarchy Tables: Hierarchies of characteristics or attributes may be stored in separate hierarchy tables.

13 Compressing the InfoCube
Records added to InfoCube fact tables have several “keys” which uniquely identify the record.
Request ID is just one of several fields in a record that help identify the data, but the Request ID can be removed and each record can still be uniquely identified.
Compression finds records which are identical except for Request ID, then aggregates these into one single record.
If compression is not performed, the “Group by” clause of every query’s SQL statement has to combine these duplicates at run time. This decreases query performance.
InfoCube compression is important for maintaining query performance. It helps manage fact table size.
Drawback: requests can no longer be deleted from the InfoCube selectively.
Only the compressed table (E table) can be partitioned by a date range.
When you load data into the InfoCube, entire requests can be inserted at the same time. Each of these requests has its own request ID, which is included in the fact table in the packet dimension. This makes it possible to pay particular attention to individual requests. One advantage of the request ID concept is that you can delete complete requests from the InfoCube. However, the request ID concept can also cause the same data record (all characteristics except the request ID match) to appear more than once in the fact table. This unnecessarily increases the data volume and reduces reporting performance, as the system has to aggregate across the request ID every time you run a query. By compressing you can eliminate these disadvantages and bring the data from the different requests together into one single request (request ID 0). If you are using an Oracle database as your BW database, you can also run a report on the relevant InfoCube while the compression is running. With other manufacturers’ databases, you will see a warning if you try to run a report on an InfoCube while the compression is running. In this case you can only report on the relevant InfoCube when the compression has finished running.

14 Compressing the InfoCube
[Diagram: COMPRESSION moves records from the F fact table (columns Request, Date, Record, Cost) into the E fact table; rows that differ only in the request are summed. Request IDs lost!]
Compressing an InfoCube means that the request ID is deleted and all rows with the same key are summarized (for non-cumulative key figures). This step needs care, because afterwards you can no longer use the request IDs to delete data from the InfoCube. Before you proceed, make sure that the data in the InfoCube is correct. You must compress the InfoCube at regular intervals. This saves space.
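The effect of compression can be sketched in a few lines. This is a simplified model, not how BW implements it: the `compress` function is hypothetical, rows are plain tuples with the diagram's columns (Request, Date, Record, Cost), and the sample values are made up.

```python
def compress(f_rows):
    """Drop the request ID and sum rows that share the remaining key.

    Returns E-table rows, all carrying request ID 0.
    """
    totals = {}
    for _request, date, record, cost in f_rows:
        key = (date, record)
        totals[key] = totals.get(key, 0) + cost
    return [(0, date, record, cost) for (date, record), cost in sorted(totals.items())]


# Hypothetical F fact table: two requests loaded the same (date, record) key.
f_table = [
    (1, "2004-01", "A", 100),
    (1, "2004-01", "B", 200),
    (2, "2004-01", "A", 200),
]

e_table = compress(f_table)
print(e_table)  # [(0, '2004-01', 'A', 300), (0, '2004-01', 'B', 200)]
```

After this step there is no way to tell which part of the 300 came from request 1 and which from request 2, which is why selective deletion by request is no longer possible.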

15 Content
Introduction
Reporting
Business content
Data loading
InfoCube Design
Aggregates
BW-BPS Business Planning & Simulation
Monitoring & Technical Risks

16 Aggregates ...
... are like InfoCubes,
... are always based on InfoCubes,
... summarize ("aggregate") data of the originating InfoCube,
... contain redundant information, but
... accelerate the access to that information,
... are performance-enhancing features.
Like a query, an aggregate constitutes a subset of the star schema of the related InfoCube. However, it uses its own private fact table and possibly its own dimension tables. In this example, aggregates can discard certain levels of detail, such as "day" and "city" or the sales organization, and keep data on a summarized level. This means that an aggregate's fact table can be much smaller. In our example this translates into an immediate performance benefit when processing our query on this aggregate rather than on the original InfoCube. Obviously, an aggregate does not contain all the detail information of the original InfoCube and as such cannot replace that InfoCube. However, a handful of well-defined aggregates can substantially improve the performance of the standard queries that users will be executing. Aggregates are only available for InfoCubes, not for other InfoProviders such as ODS objects or MultiProviders.

17 Aggregates - Example
Data for queries like ‘sales for all countries’, ‘sales in Germany’, or ‘overall sales’ can be read out of the aggregate (country *).
Fact Table: Sales Data (columns: Country, Customer, Sales). Values shown: countries USA, Germany, Austria; customers Buggy Soft Inc., Ocean Networks, Funny Duds Inc., Thor Industries; sales 10, 15, 5, 20, 25.
Aggregate Table: Sales Data, aggregate (country *)
  Country | Sales
  USA     | 40
  Germany | 35
  Austria | 20
The aggregate (country *) may also be useful for queries like ‘sales for a navigational attribute of country’ or ‘sales for a node of a country hierarchy’.
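The (country *) aggregate amounts to a group-by over the fact table. The sketch below assumes a set of fact rows: the row-level layout did not survive in this transcript, so `FACT` is one reconstruction chosen to be consistent with the totals on the slides (USA 40, Germany 35, Austria 20), not the original table.

```python
from collections import defaultdict

# ASSUMPTION: reconstructed rows (country, customer, sales); only the
# per-country totals are confirmed by the slide.
FACT = [
    ("USA", "Buggy Soft Inc.", 10),
    ("Germany", "Ocean Networks", 15),
    ("USA", "Funny Duds Inc.", 5),
    ("Austria", "Ocean Networks", 10),
    ("Austria", "Thor Industries", 10),
    ("Germany", "Funny Duds Inc.", 20),
    ("USA", "Buggy Soft Inc.", 25),
]


def build_country_aggregate(fact):
    """Aggregate (country *): keep country, drop customer, sum sales."""
    aggregate = defaultdict(int)
    for country, _customer, sales in fact:
        aggregate[country] += sales
    return dict(aggregate)


country_agg = build_country_aggregate(FACT)
print(country_agg)                 # {'USA': 40, 'Germany': 35, 'Austria': 20}
print(country_agg["Germany"])      # 'sales in Germany' -> 35
print(sum(country_agg.values()))   # 'overall sales' -> 95
```

All three queries named on the slide read three aggregate rows instead of scanning every fact row.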

18 Aggregates - Example using filters
Data for queries like ‘sales for all customers in Germany’ can be read out of the aggregate (country = Germany; customer = *), written in aggregate notation as (country F, Germany; customer *).
Fact Table: Sales Data (columns: Country, Customer, Sales), as in the first aggregate example.
Aggregate Table: Sales Data, aggregate (country F, Germany; customer *)
  Country | Customer        | Sales
  Germany | Ocean Networks  | 15
  Germany | Funny Duds Inc. | 20
Aggregates with filters are only useful for queries with the same filter.
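A filtered aggregate can be sketched as a group-by restricted to one fixed value, plus a check that a query carries the same filter. The function names are hypothetical, and `FACT` is a reconstruction consistent with the slide totals (the original row layout was lost in the transcript); only the two Germany rows are confirmed by the aggregate table above.

```python
# ASSUMPTION: reconstructed rows (country, customer, sales).
FACT = [
    ("USA", "Buggy Soft Inc.", 10),
    ("Germany", "Ocean Networks", 15),
    ("USA", "Funny Duds Inc.", 5),
    ("Austria", "Ocean Networks", 10),
    ("Austria", "Thor Industries", 10),
    ("Germany", "Funny Duds Inc.", 20),
    ("USA", "Buggy Soft Inc.", 25),
]


def build_filtered_aggregate(fact, country):
    """Aggregate (country F, <country>; customer *)."""
    aggregate = {}
    for row_country, customer, sales in fact:
        if row_country == country:
            aggregate[customer] = aggregate.get(customer, 0) + sales
    return aggregate


def answer_from_aggregate(query_country, aggregate_country, aggregate):
    """Return sales per customer, or None when the query's filter differs."""
    if query_country != aggregate_country:
        return None  # the query must fall back to the InfoCube
    return dict(aggregate)


germany_agg = build_filtered_aggregate(FACT, "Germany")
print(answer_from_aggregate("Germany", "Germany", germany_agg))
# {'Ocean Networks': 15, 'Funny Duds Inc.': 20}
print(answer_from_aggregate("USA", "Germany", germany_agg))  # None
```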

19 Aggregates - Example using master data
Attribute Table: Customer (columns: Customer, Industry). Customers: Buggy Soft Inc., Funny Duds Inc., Ocean Networks, Thor Industries; industries: Technology, Consumer Products, Chemical.
Fact Table: Sales Data (columns: Country, Customer, Sales), as in the first aggregate example.
Aggregate Table: Sales Data, aggregate (industry *)
  Industry          | Sales
  Technology        | 60
  Consumer Products | 25
  Chemical          | 10
Data for queries like “sales grouped by industries” can be read out of the aggregate (industry *). In this example, the navigational attribute ‘industry’ is read from the master data table.
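An aggregate on a navigational attribute looks up the attribute in the master data while summing. In the sketch below both the fact rows and the customer-to-industry assignment are assumptions: the transcript lost the row layout of both tables, and these values are one reconstruction that reproduces the slide's totals (Technology 60, Consumer Products 25, Chemical 10).

```python
# ASSUMPTION: reconstructed attribute table (customer -> industry).
INDUSTRY = {
    "Buggy Soft Inc.": "Technology",
    "Funny Duds Inc.": "Consumer Products",
    "Ocean Networks": "Technology",
    "Thor Industries": "Chemical",
}

# ASSUMPTION: reconstructed fact rows (country, customer, sales).
FACT = [
    ("USA", "Buggy Soft Inc.", 10),
    ("Germany", "Ocean Networks", 15),
    ("USA", "Funny Duds Inc.", 5),
    ("Austria", "Ocean Networks", 10),
    ("Austria", "Thor Industries", 10),
    ("Germany", "Funny Duds Inc.", 20),
    ("USA", "Buggy Soft Inc.", 25),
]


def build_industry_aggregate(fact, industry_of):
    """Aggregate (industry *): group by the customer's navigational attribute."""
    aggregate = {}
    for _country, customer, sales in fact:
        industry = industry_of[customer]  # read from master data, not the cube
        aggregate[industry] = aggregate.get(industry, 0) + sales
    return aggregate


industry_agg = build_industry_aggregate(FACT, INDUSTRY)
print(industry_agg)  # {'Technology': 60, 'Consumer Products': 25, 'Chemical': 10}
```

Since the aggregate bakes in the attribute values at build time, it goes stale whenever a customer's industry changes in the master data.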

20 Aggregates - Example using hierarchies
Hierarchy for Country: All > Europe (Germany, Austria), America (USA)
Fact Table: Sales Data (columns: Country, Customer, Sales), as in the first aggregate example.
Aggregate Table: Sales Data, aggregate (country H Level 2)
  Country | Sales
  America | 40
  Europe  | 55
Queries like “sales for Europe”, “sales for ALL”, “overall sales”, or “sales for all countries ordered by the country hierarchy up to level 1 or 2” may use the aggregate (country H Level 2). Aggregates with a hierarchy are useful for queries which use nodes of the hierarchy as a filter or which use the hierarchy as a presentation hierarchy. The level of the desired nodes must be less than or equal to the level in the aggregate.
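The level rule can be illustrated with a small sketch. The hierarchy and the country totals (USA 40, Germany 35, Austria 20) come from the slides; the function names and the convention that the root 'All' is level 1 are assumptions made here so that Europe and America land on level 2.

```python
# Country hierarchy from the slide; 'All' is treated as level 1 (assumption).
PARENT = {
    "All": None,
    "Europe": "All",
    "America": "All",
    "Germany": "Europe",
    "Austria": "Europe",
    "USA": "America",
}

# Country totals as given by the (country *) aggregate.
COUNTRY_SALES = {"USA": 40, "Germany": 35, "Austria": 20}


def level(node):
    """Level of a node, counting the root as level 1."""
    depth = 1
    while PARENT[node] is not None:
        node = PARENT[node]
        depth += 1
    return depth


def ancestor_at(node, target_level):
    """Walk up from node until it sits at target_level."""
    while level(node) > target_level:
        node = PARENT[node]
    return node


def is_under(node, ancestor):
    """True if node equals ancestor or lies below it."""
    while node is not None:
        if node == ancestor:
            return True
        node = PARENT[node]
    return False


def build_hierarchy_aggregate(country_sales, agg_level):
    """Aggregate (country H Level <agg_level>)."""
    aggregate = {}
    for country, sales in country_sales.items():
        node = ancestor_at(country, agg_level)
        aggregate[node] = aggregate.get(node, 0) + sales
    return aggregate


def sales_for_node(aggregate, agg_level, node):
    """Answer a node query; only nodes at or above the aggregate level work."""
    if level(node) > agg_level:
        raise ValueError(f"{node} lies below level {agg_level}")
    return sum(s for n, s in aggregate.items() if is_under(n, node))


agg = build_hierarchy_aggregate(COUNTRY_SALES, agg_level=2)
print(agg)                               # {'America': 40, 'Europe': 55}
print(sales_for_node(agg, 2, "Europe"))  # 55
print(sales_for_node(agg, 2, "All"))     # 95
```

Asking for "Germany" raises an error here because Germany sits on level 3, below what the level 2 aggregate retains.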

21 Aggregates - Maintenance
[Screenshot labels: Show aggregate hierarchy; Transport; Activate & Fill; Switch on/off; BDS; unsaved changes]
Menu path: Admin-Workbench > InfoCubes > right mouse click on InfoCube > Maintain aggregates...
The screen is split into two subscreens:
  Left-hand side: InfoCube with its characteristics and time-independent attributes
  Right-hand side: aggregates with header information and components
The components of the aggregate are sorted by dimension.
Technical names and texts are visible.
Definition of aggregates:
  Drag & drop
  ‘Wizard’-like when defining fixed values and hierarchy levels, started with a right mouse click on the InfoObject
  Activate and fill is one single step
Transport of the aggregate definition is possible.
Documents can be stored for an aggregate (BDS).
An aggregate can be set inactive for queries, but it is still filled and will be included in rollups and change runs.

22 Aggregate Maintenance
After new data is loaded, existing aggregates have to be adjusted to make the new data available for reporting:
Aggregate rollup: the newly loaded transaction data is added to the aggregates.
Change run (master data activation): the newly loaded master data is applied to the aggregates and activated. During the change run, all aggregates containing navigational attributes and/or hierarchies are realigned.
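The two maintenance steps can be contrasted in a short sketch. Everything here is hypothetical: the customers, industries, and amounts are made up, and the change run is modeled as a full rebuild of the aggregate for simplicity, which is not a statement about how BW realigns aggregates internally.

```python
def rollup(aggregate, new_rows, industry_of):
    """Rollup: add newly loaded fact rows to an existing aggregate in place."""
    for customer, sales in new_rows:
        industry = industry_of[customer]
        aggregate[industry] = aggregate.get(industry, 0) + sales


def change_run(all_rows, new_industry_of):
    """Change run (simplified): realign by rebuilding under the new master data."""
    aggregate = {}
    rollup(aggregate, all_rows, new_industry_of)
    return aggregate


# Hypothetical master data and fact rows (customer, sales).
industry_of = {"C1": "Technology", "C2": "Chemical"}
fact = [("C1", 10), ("C2", 20)]

aggregate = {}
rollup(aggregate, fact, industry_of)
print(aggregate)  # {'Technology': 10, 'Chemical': 20}

# A new request arrives: only the delta is rolled up.
request = [("C1", 5)]
fact.extend(request)
rollup(aggregate, request, industry_of)
print(aggregate)  # {'Technology': 15, 'Chemical': 20}

# Master data changes: C2 moves to Technology, so the aggregate is realigned.
industry_of = {"C1": "Technology", "C2": "Technology"}
aggregate = change_run(fact, industry_of)
print(aggregate)  # {'Technology': 35}
```

The rollup touches only the new request, whereas the change run has to revisit data already in the aggregate, which is why it applies only to aggregates that contain navigational attributes or hierarchies.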

