Published by Colleen Townsend. Modified over 8 years ago.
1
INCREMENTAL AGGREGATION
After you create a session that includes an Aggregator transformation, you can enable the Incremental Aggregation session option. The first time you run an incremental aggregation session, the Server processes the entire source. At the end of the session, the Server stores the aggregate data from that run in two files, the index file and the data file. On each subsequent run with incremental aggregation enabled, the session uses only the incremental source changes.
2
INCREMENTAL AGGREGATION
For each input record, the Server checks the historical information in the index file for a corresponding group. If it finds a corresponding group, the Server performs the aggregate operation incrementally, using the aggregate data for that group, and saves the incremental change. If it does not find a corresponding group, the Server creates a new group and saves the record data. When writing to the target, the Server applies the changes to the existing target. It saves the modified aggregate data in the index and data files, to be used as historical data the next time you run the session.
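The per-record logic above can be sketched in Python. This is only an illustration, not how the PowerCenter Server is implemented: the dictionary `cache` stands in for the index and data files, and the function name `apply_increment` and the choice of SUM as the aggregate are assumptions.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical stand-in for the index/data files: group key -> running SUM.
Cache = Dict[Tuple[str, str], float]

def apply_increment(cache: Cache,
                    records: Iterable[Tuple[Tuple[str, str], float]]) -> Cache:
    """Fold new source records into the historical aggregates."""
    for key, value in records:
        if key in cache:
            # Corresponding group found: apply the aggregate incrementally.
            cache[key] += value
        else:
            # No corresponding group: create one from the record data.
            cache[key] = value
    return cache

history: Cache = {("C1", "P1"): 100.0}
apply_increment(history, [(("C1", "P1"), 25.0), (("C2", "P1"), 10.0)])
print(history)
```

After the call, the existing group has been updated in place and one new group has been added; the updated dictionary plays the role of the historical data for the next run.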
3
INCREMENTAL AGGREGATION
Circumstances for using incremental aggregation:
You can capture new source data. Use incremental aggregation when you can capture new source data each time you run the session.
Incremental changes do not significantly change the target. Use incremental aggregation when the changes do not significantly change the target. If processing the incrementally changed source alters more than half the existing target, the session may not benefit from using incremental aggregation. In this case, drop the table and re-create the target with complete source data.
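The "more than half the target" rule of thumb can be written down as a small check. This is a sketch only: the function name and the idea of counting affected target rows are assumptions made for illustration, not a PowerCenter feature.

```python
def may_benefit_from_incremental(changed_target_rows: int, target_rows: int) -> bool:
    """Rule of thumb: incremental aggregation may not pay off when the
    changes alter more than half of the existing target."""
    if target_rows == 0:
        # Assumption: with an empty target there is nothing to update incrementally.
        return False
    return changed_target_rows <= target_rows / 2

print(may_benefit_from_incremental(1_000, 10_000))
print(may_benefit_from_incremental(6_000, 10_000))
```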
4
EXAMPLE
You might have a session using a source that receives new data every day. You can capture those incremental changes because you have added a mapping variable to the mapping that filters pre-existing data out of the data flow. You then enable incremental aggregation. When the session runs with incremental aggregation enabled for the first time on March 1, it uses the entire source. This allows the Server to read and store the necessary aggregate data. On March 2, when you run the session again, you filter out all the records except those time-stamped March 2. The Server then processes only the new data and updates the target accordingly.
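The March 1 / March 2 scenario can be sketched as follows. The row layout, the year 2024, and the helper name `rows_for_run` are assumptions made for illustration; in the real mapping the filtering is done by the mapping variable.

```python
from datetime import date
from typing import List, Optional, Tuple

Row = Tuple[date, str, int]  # (timestamp, hypothetical customer id, quantity)

def rows_for_run(source: List[Row], last_loaded: Optional[date]) -> List[Row]:
    """Return the rows a run should process."""
    if last_loaded is None:
        # First run: use the entire source.
        return list(source)
    # Later runs: keep only rows newer than the last load.
    return [row for row in source if row[0] > last_loaded]

march_1_source = [(date(2024, 3, 1), "C1", 5), (date(2024, 3, 1), "C2", 3)]
march_2_source = march_1_source + [(date(2024, 3, 2), "C1", 2)]

first_run = rows_for_run(march_1_source, None)
second_run = rows_for_run(march_2_source, date(2024, 3, 1))
print(len(first_run), len(second_run))
```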
5
REINITIALIZING THE AGGREGATE CACHE
The session properties include an option for reinitializing the aggregate cache. When you enable this option, the Server overwrites the aggregate cache each time you run the session. When you disable this option, the Server combines the new aggregate data from the source with the existing historical cache data.
Example: you can reinitialize the aggregate cache if the source for a session changes incrementally every day and completely changes once a month. When you receive the new monthly source, you might configure the session to reinitialize the aggregate cache, truncate the existing target, and use the new source table during the session.
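The effect of the reinitialize option can be sketched as a flag on a session run. This is a simplified model under stated assumptions: the cache is a plain dictionary, the aggregate is a SUM, and `run_session` is a hypothetical name.

```python
from typing import Dict, Iterable, Tuple

def run_session(cache: Dict[str, float],
                records: Iterable[Tuple[str, float]],
                reinitialize: bool) -> Dict[str, float]:
    """Run one session against a stand-in aggregate cache."""
    if reinitialize:
        # Option enabled: discard the historical aggregates.
        cache.clear()
    for key, value in records:
        # Option disabled: new data is combined with the historical data.
        cache[key] = cache.get(key, 0.0) + value
    return cache

cache = {"C1": 100.0}
run_session(cache, [("C1", 5.0)], reinitialize=False)               # daily change
daily = dict(cache)
run_session(cache, [("C1", 40.0), ("C2", 7.0)], reinitialize=True)  # new monthly source
monthly = dict(cache)
print(daily, monthly)
```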
6
SALES STAR SCHEMA
[Diagram: Sales Fact with Customer Dimension, Period Dimension, and Product Dimension]
7
Description of sales schema
Customer Dimension: SCD Type 1
Product Dimension: SCD Type 2
Period Dimension
Sales Fact: monthly-level granularity
In the sales fact, the combination of cust_id, prod_id, and month_id is the primary key.
Daily sales transaction details arrive every day, and the warehouse must be populated daily.
8
Mapping (without using incremental aggregation in session properties)
9
Source Qualifier Properties
The $$MAP_VAR mapping variable is used in the Source filter to perform an incremental extract.
10
Expression properties
Converts the sales date to month and year.
Assigns the maximum value of sales_id to the mapping variable using the SETMAXVARIABLE() function.
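How the mapping variable drives the incremental extract can be sketched in Python. The filter `sales_id > map_var` is an assumption (the slides do not show the exact filter text), and `extract_new_sales` is a hypothetical helper; the `max` call mimics what SETMAXVARIABLE() does with the variable.

```python
from typing import Dict, List, Tuple

def extract_new_sales(rows: List[Dict[str, int]],
                      map_var: int) -> Tuple[List[Dict[str, int]], int]:
    """Return the rows newer than map_var and the value to save for the next run."""
    # Assumed source filter: only rows loaded after the previous run.
    new_rows = [row for row in rows if row["sales_id"] > map_var]
    for row in new_rows:
        # Keep the highest sales_id seen so far, as SETMAXVARIABLE() would.
        map_var = max(map_var, row["sales_id"])
    return new_rows, map_var

source = [{"sales_id": 1, "qty": 2}, {"sales_id": 2, "qty": 1}, {"sales_id": 3, "qty": 4}]
new_rows, saved = extract_new_sales(source, map_var=1)
print([row["sales_id"] for row in new_rows], saved)
```

Passing the saved value into the next call returns no rows until new sales arrive, which is the behavior the incremental extract relies on.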
11
Product Dimension Lookup
Looks up the product dimension to get the corresponding surrogate ID.
12
Period Dimension Lookup
Looks up the Period Dimension to get the corresponding month_id.
13
Aggregator properties
Aggregates the quantity and total price.
Group by cust_id, prod_surr_id, month_id, because this combination is the primary key in the fact table.
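The grouping performed by the Aggregator can be sketched as follows. The column names `qty` and `totprice` and the function name `aggregate_sales` are assumptions chosen to match the slides' wording; the real port names in the mapping are not shown.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, int, int]  # (cust_id, prod_surr_id, month_id)

def aggregate_sales(rows: List[dict]) -> Dict[Key, dict]:
    """Sum quantity and total price per (cust_id, prod_surr_id, month_id)."""
    totals: Dict[Key, dict] = defaultdict(lambda: {"qty": 0, "totprice": 0.0})
    for row in rows:
        key = (row["cust_id"], row["prod_surr_id"], row["month_id"])
        totals[key]["qty"] += row["qty"]
        totals[key]["totprice"] += row["totprice"]
    return dict(totals)

daily_rows = [
    {"cust_id": "C1", "prod_surr_id": 10, "month_id": 3, "qty": 2, "totprice": 10.0},
    {"cust_id": "C1", "prod_surr_id": 10, "month_id": 3, "qty": 1, "totprice": 5.0},
    {"cust_id": "C2", "prod_surr_id": 11, "month_id": 3, "qty": 4, "totprice": 8.0},
]
result = aggregate_sales(daily_rows)
print(result)
```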
14
Sales Fact Lookup
Looks up the sales fact to determine whether the combination of cust_id, prod_surr_id, and month_id already exists.
[Screenshot: lookup condition]
15
Expression to flag new rows and return Qty and Totprice
Flag to find new records.
The Qty function returns the sum of the Qty from the source and the Qty already in the fact if the record exists in the fact; otherwise it returns the Qty from the source.
The Totprice function returns the sum of the total price from the source and the total price already in the fact if the record exists in the fact; otherwise it returns the total price from the source.
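The manual calculation described above can be sketched in Python. The dictionary keys and the function name `merge_with_fact` are assumptions; `fact_row` is `None` when the fact lookup finds no matching record.

```python
from typing import Optional

def merge_with_fact(source_row: dict, fact_row: Optional[dict]) -> dict:
    """Flag new rows and combine source values with existing fact values."""
    is_new = fact_row is None
    if is_new:
        # No matching fact record: pass the source values through.
        qty = source_row["qty"]
        totprice = source_row["totprice"]
    else:
        # Matching fact record: add the source values to the stored ones.
        qty = source_row["qty"] + fact_row["qty"]
        totprice = source_row["totprice"] + fact_row["totprice"]
    return {"new_flag": is_new, "qty": qty, "totprice": totprice}

existing = merge_with_fact({"qty": 3, "totprice": 15.0}, {"qty": 7, "totprice": 35.0})
brand_new = merge_with_fact({"qty": 4, "totprice": 8.0}, None)
print(existing, brand_new)
```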
16
Router Transformation
Group filter condition for new records.
Existing records fall into the default group.
17
Update Strategy Transformation
Existing records are updated in the sales fact.
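The routing and update steps can be sketched together. This is a simplified model: the fact table is a dictionary keyed by (cust_id, prod_surr_id, month_id), and the row layout and the name `load_fact` are assumptions rather than anything shown in the mapping.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, int, int]  # (cust_id, prod_surr_id, month_id)

def load_fact(fact: Dict[Key, dict], rows: List[dict]) -> Tuple[int, int]:
    """Insert rows flagged as new and update the rest; return (inserted, updated)."""
    inserted = updated = 0
    for row in rows:
        values = {"qty": row["qty"], "totprice": row["totprice"]}
        if row["new_flag"]:
            # Router group for new records: the row is inserted.
            inserted += 1
        else:
            # Default group: the row goes through the update strategy.
            updated += 1
        fact[row["key"]] = values
    return inserted, updated

fact = {("C1", 10, 3): {"qty": 7, "totprice": 35.0}}
counts = load_fact(fact, [
    {"key": ("C1", 10, 3), "new_flag": False, "qty": 10, "totprice": 50.0},
    {"key": ("C2", 11, 3), "new_flag": True, "qty": 4, "totprice": 8.0},
])
print(counts, fact)
```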
18
Target Fact
Sales fact, in which new rows are inserted and existing rows are updated.
19
With incremental aggregation in session properties
20
Incremental aggregation option in session properties
21
Conclusion
[Screenshots: mapping without the incremental aggregation option in session properties; mapping with the incremental aggregation option in session properties]
The two mappings differ only in their Expression transformations. In the first mapping (without incremental aggregation), we manually perform a calculation in an Expression transformation to do the incremental aggregation. In the second mapping (with the incremental aggregation option), the PowerCenter Server itself does the incremental aggregation.