ISV Innovation. Business Intelligence Fundamentals: Data Loading. Ola Ekdahl, IT Mentors, 9/12/08.

Presentation transcript:

1 Business Intelligence Fundamentals: Data Loading. Ola Ekdahl, IT Mentors, 9/12/08

2 Agenda
1. Dimension Table Load
2. Fact Table Load
3. Working with SCDs
4. SSIS SCD Task
5. Load Data Using Staging Tables
6. Top 10 Best Practices

3 Data Flow Task
- Encapsulates the data flow engine: Extract, Transform, Load

4 Populating Dimension Tables
[Flow diagram] Dimension source, then Transform, then Correlate records, followed by three decisions:
- New record? Yes: insert new record
- Type 1 change? Yes: update changed column(s)
- Type 2 change? Yes: expire existing record
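The decision points in the flow above can be sketched as a small routing function. This is only an illustration: the column names, the split into Type 1 and Type 2 columns, and the order in which the checks run are assumptions, not something the slide specifies. Type 2 is tested first here so that a record with both kinds of change is not routed to a plain update.

```python
from typing import Optional

# Hypothetical split of dimension attributes by change type.
TYPE1_COLUMNS = {"last_name"}            # overwritten in place
TYPE2_COLUMNS = {"sales_territory_key"}  # history is kept


def classify(source: dict, current: Optional[dict]) -> str:
    """Return the branch of the dimension-load flow a source record takes."""
    if current is None:
        return "insert"                  # New record? Yes
    if any(source[c] != current[c] for c in TYPE2_COLUMNS):
        return "expire_and_insert"       # Type 2 change? Yes
    if any(source[c] != current[c] for c in TYPE1_COLUMNS):
        return "update"                  # Type 1 change? Yes
    return "unchanged"


existing = {"last_name": "Valdez", "sales_territory_key": 4}
print(classify({"last_name": "Valdez-Smythe", "sales_territory_key": 4}, existing))
print(classify({"last_name": "Valdez", "sales_territory_key": 10}, existing))
```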

5 Populating Fact Tables
[Flow diagram] Fact source, then Transform, then, repeated for each dimension key:
- Lookup dimension key
- Lookup failed? Yes: insert new dimension record
Finally: insert new record
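A minimal sketch of the lookup loop in this flow, using in-memory dictionaries in place of dimension tables. The function name, the column names, and the way a new surrogate key is chosen are all hypothetical; a real load would insert a placeholder row into the dimension table instead.

```python
from typing import Dict, Hashable

Lookup = Dict[Hashable, int]  # business key -> surrogate key


def resolve_dimension_keys(fact_row: dict, lookups: Dict[str, Lookup]) -> dict:
    """Replace each business key in a fact row with its surrogate key."""
    resolved = {}
    for column, lookup in lookups.items():      # repeat for each dimension key
        business_key = fact_row[column]
        if business_key not in lookup:          # lookup failed
            # Insert a new dimension record with the next free surrogate key.
            lookup[business_key] = max(lookup.values(), default=0) + 1
        resolved[column] = lookup[business_key]
    return resolved


customers = {"C-100": 1, "C-200": 2}
products = {"P-1": 1}
row = {"customer": "C-300", "product": "P-1", "amount": 25.0}

keys = resolve_dimension_keys(row, {"customer": customers, "product": products})
fact = {**keys, "amount": row["amount"]}  # the record inserted into the fact table
print(fact)
```

The customer "C-300" is unknown when the fact arrives, so it receives a new dimension entry before the fact row is written.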

6 Slowly Changing Dimension Transformation
- The wizard-based configuration promotes rapid ETL development
- Supports:
  - Type 0 (Fixed Attribute)
  - Type 1 (Changing Attribute)
  - Type 2 (Historical Attribute)
  - Inferred member management
- Automatically constructs the downstream data flow
- Handles the majority of slowly changing dimension scenarios

7 Populating Dimension Tables
[Flow diagram, repeated from slide 4] Dimension source, then Transform, then Correlate records, followed by three decisions:
- New record? Yes: insert new record
- Type 1 change? Yes: update changed column(s)
- Type 2 change? Yes: expire existing record


9 SCD Type 1
- Existing record is updated
- History is not preserved
Example: LastName update to Valdez-Smythe
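As an illustration of the Type 1 behavior, here is a small sketch using Python's built-in sqlite3 module rather than SSIS. The table layout, the business key value, and the previous last name "Valdez" are assumptions made up for the example; only the new value Valdez-Smythe comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_employee ("
    " employee_key INTEGER PRIMARY KEY,"   # surrogate key
    " employee_id TEXT NOT NULL,"          # business key
    " last_name TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO dim_employee (employee_id, last_name) VALUES ('E-1', 'Valdez')"
)


def apply_type1(conn, employee_id, last_name):
    """Overwrite the attribute in place; the old value is lost."""
    conn.execute(
        "UPDATE dim_employee SET last_name = ? WHERE employee_id = ?",
        (last_name, employee_id),
    )


apply_type1(conn, "E-1", "Valdez-Smythe")
rows = conn.execute("SELECT employee_key, last_name FROM dim_employee").fetchall()
print(rows)
```

The row keeps its surrogate key and no second row appears, which is why history is not preserved.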

10 SCD Type 2
- Existing record is 'expired' and new record inserted
- History is preserved
- Most common form of Slowly Changing Dimension
Example: SalesTerritoryKey update to 10
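The expire-and-insert pattern can be sketched the same way with sqlite3. The table layout, the dates, and the original territory value 4 are hypothetical; the use of a null end date to mark the current row is also an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_employee ("
    " employee_key INTEGER PRIMARY KEY,"       # surrogate key
    " employee_id TEXT NOT NULL,"              # business key
    " sales_territory_key INTEGER NOT NULL,"
    " start_date TEXT NOT NULL,"
    " end_date TEXT)"                          # NULL marks the current row
)
conn.execute(
    "INSERT INTO dim_employee (employee_id, sales_territory_key, start_date)"
    " VALUES ('E-1', 4, '2008-01-01')"
)


def apply_type2(conn, employee_id, sales_territory_key, change_date):
    """Expire the current row, then insert a new row with the new value."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE dim_employee SET end_date = ?"
            " WHERE employee_id = ? AND end_date IS NULL",
            (change_date, employee_id),
        )
        conn.execute(
            "INSERT INTO dim_employee (employee_id, sales_territory_key, start_date)"
            " VALUES (?, ?, ?)",
            (employee_id, sales_territory_key, change_date),
        )


apply_type2(conn, "E-1", 10, "2008-09-12")
history = conn.execute(
    "SELECT employee_key, sales_territory_key, start_date, end_date"
    " FROM dim_employee ORDER BY employee_key"
).fetchall()
print(history)
```

Facts loaded before the change keep pointing at surrogate key 1, while new facts pick up key 2, so both versions of the employee remain queryable.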

11 Configuring the SCD Transformation: Step 1
- Select the target dimension table
- Configure the relationship between the source data and the dimension table
The relationship is established with the business key stored in the dimension table.

12 Configuring the SCD Transformation: Step 2
- Select the columns and their change type:
  - Fixed (Type 0)
  - Changing (Type 1)
  - Historical (Type 2)

13 Configuring the SCD Transformation: Step 3
- Configure the behavior if Fixed attributes change
- Configure whether Changing attributes should update the current record or all matching records

14 Configuring the SCD Transformation: Step 4
- Configure how Historical attributes identify current and expired records:
  - Single Boolean column, or
  - Start and End date columns
It is best practice to store Start and End dates.
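One possible reason for preferring Start and End dates is that they support point-in-time lookups, which a single Boolean column cannot answer. The sketch below illustrates that idea under assumptions: the row layout is hypothetical, the current row is marked by a missing end date, and each validity interval is treated as including its start date and excluding its end date.

```python
from datetime import date
from typing import List, Optional

history = [
    {"employee_key": 1, "sales_territory_key": 4,
     "start_date": date(2008, 1, 1), "end_date": date(2008, 9, 12)},
    {"employee_key": 2, "sales_territory_key": 10,
     "start_date": date(2008, 9, 12), "end_date": None},
]


def current_version(rows: List[dict]) -> dict:
    """The only question a Boolean 'is current' flag could also answer."""
    return next(r for r in rows if r["end_date"] is None)


def version_as_of(rows: List[dict], as_of: date) -> Optional[dict]:
    """Find the row that was valid on a given day (start inclusive, end exclusive)."""
    for r in rows:
        if r["start_date"] <= as_of and (r["end_date"] is None or as_of < r["end_date"]):
            return r
    return None


print(current_version(history)["employee_key"])
print(version_as_of(history, date(2008, 6, 30))["employee_key"])
```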

15 Configuring the SCD Transformation: Step 5
- If inferred members are stored in the dimension table, define how they are identified:
  - When all columns with a change type are null, or
  - By a single Boolean column

16 The Slowly Changing Dimension Transformation
Based on your configuration, the wizard completes the downstream data flow.


18 Staging Tables
1. Create a staging table that matches the logical structure of the target table, and load the bulk data into it. This can be much faster than loading directly into the target table because the staging table has no indexes or constraints. More importantly, the existing data stays fully available for all transactions while the new data is loaded, because the load takes place on a separate table.
2. Create constraints and indexes on the staging table that are equivalent to those on the target table. The existing data again stays fully available, because this work also takes place on the staging table.
3. Execute the ALTER TABLE … SWITCH statement to move the data from the staging table to an empty partition of the target table. This is a metadata-only operation and is very fast (usually under a second) when there are no long transactions on the target table.
This way, all the loading and indexing happens outside the main table, and the data then moves quickly into the main table. Note that ALTER TABLE … SWITCH requires a schema modification (Sch-M) lock on the table. Long-running transactions can block the switch operation from acquiring the Sch-M lock and make it wait.
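The three steps can be sketched as a small helper that assembles the T-SQL statements in the order they would run. Nothing is executed against a server here. The function name, every table, index, and constraint name, the file path, and the partition number are hypothetical. The date-range CHECK constraint is included because a staging table being switched into a partition needs a constraint that restricts its rows to that partition's range.

```python
from typing import List


def build_switch_script(staging: str, target: str, partition_number: int,
                        load_sql: str, index_and_constraint_sql: List[str]) -> List[str]:
    """Return the T-SQL statements for one staged load, in execution order."""
    if partition_number < 1:
        raise ValueError("partition numbers start at 1")
    # Step 3: metadata-only move of the staged rows into the empty partition.
    switch_sql = (
        f"ALTER TABLE {staging} SWITCH TO {target} PARTITION {partition_number};"
    )
    # Step 1 (load), then step 2 (indexes and constraints), then step 3 (switch).
    return [load_sql, *index_and_constraint_sql, switch_sql]


script = build_switch_script(
    staging="dbo.FactSales_Staging",
    target="dbo.FactSales",
    partition_number=5,
    load_sql=(
        "BULK INSERT dbo.FactSales_Staging "
        "FROM 'C:\\loads\\sales_20080912.dat' WITH (TABLOCK);"
    ),
    index_and_constraint_sql=[
        "CREATE CLUSTERED INDEX cix_FactSales_Staging "
        "ON dbo.FactSales_Staging (OrderDateKey);",
        "ALTER TABLE dbo.FactSales_Staging ADD CONSTRAINT ck_FactSales_Staging_Date "
        "CHECK (OrderDateKey >= 20080912 AND OrderDateKey < 20080913);",
    ],
)

for statement in script:
    print(statement)
```

Keeping the switch as the last statement reflects the point of the technique: the slow work finishes on the staging table before the target table is touched.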

19 Partitions
- In data warehouse applications, it is very common to partition data on the date dimension. This helps in periodically archiving or deleting old data as new data comes in, with minimal effect on performance and availability. Depending on data volume, database designers pick yearly, quarterly, monthly, weekly, daily, or even hourly partitions.
- One way to partition a table is to create one partition for each incremental load. For example, if you load data daily, create one partition for each day. With this approach, the staging-table technique (load and index data in a staging table, then use ALTER TABLE … SWITCH) works very efficiently. One partition is added for every incremental load, so the number of partitions increases over time.
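To make the one-partition-per-day idea concrete, the sketch below maps a date to a partition number. It assumes RANGE RIGHT semantics, where each boundary value belongs to the partition on its right and N boundaries produce N + 1 partitions. The function names and dates are hypothetical, and the code only imitates the mapping; it does not create a partition function.

```python
from bisect import bisect_right
from datetime import date, timedelta
from typing import List


def daily_boundaries(first_day: date, days: int) -> List[date]:
    """One boundary value per daily load, in ascending order."""
    return [first_day + timedelta(days=i) for i in range(days)]


def partition_number(boundaries: List[date], value: date) -> int:
    """1-based partition for a value; a boundary value falls to its right."""
    return bisect_right(boundaries, value) + 1


boundaries = daily_boundaries(date(2008, 9, 10), 3)  # 10, 11, and 12 September

print(partition_number(boundaries, date(2008, 9, 9)))   # before the first boundary
print(partition_number(boundaries, date(2008, 9, 12)))  # on the last boundary
```

Each new daily load adds one boundary to the list, which is how the partition count grows over time.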

20 Optimize Performance: Partitioning
- Query processing improvements
  - Partition-aware seeks
  - Parallel query plan strategies
- Partition-aligned indexed views
  - Switched together with the partition
- Easy-to-switch partitions
[Figure: detail data at day level in partitions P1, P2, P3 plus a new partition; an indexed view at month and year level holding Agg(P1), Agg(P2), Agg(P3) plus aggregates for the new partition; the new partition is switched in.]


22 Thank You
SQL Customer Advisory Team: www.sqlcat.com
Working With Partitions: http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/loading_bulk_data_partitioned_table.mspx

23 More recordings available at: www.isvinnovation.com

