www.ocsinfotech.in OCS Infotech Proprietary & Confidential Typical BI Solution Architecture

Presentation transcript:

1 www.ocsinfotech.in OCS Infotech Proprietary & Confidential Typical BI solution Architecture

2 Typical BI Solution Architecture
[Architecture diagram; recoverable label: Admin & Finance Applications]

3 Typical BI Solution Server Topology

4 DW Design Methodology

5 DW Design Methodology
1. Choosing the process
2. Choosing the grain
3. Identifying and conforming the dimensions
4. Choosing the facts
5. Storing pre-calculations in the fact table
6. Rounding out the dimension tables
7. Choosing the duration of the database
8. Tracking slowly changing dimensions
9. Deciding the query priorities and the query modes

6 Nine-Step DW Design Methodology
• Step 1: Choosing the process
  – The chosen process (function) refers to the subject matter of a particular data mart, for example a Bill Payment process.
• Step 2: Choosing the grain
  – Decide what a record of the fact table is to represent, i.e. the grain. For example, the grain is a single payment.
• Step 3: Identifying and conforming the dimensions
  – Dimensions set the context for asking questions about the facts in the fact table, e.g. who made the bill payment.
• Step 4: Choosing the facts
  – Facts should be numeric and additive.

7 Nine-Step DW Design Methodology
• Step 5: Storing pre-calculations in the fact table
  – Once the facts have been selected, each should be re-examined to determine whether there are opportunities to use pre-calculations (denormalization).
• Step 6: Rounding out the dimension tables
  – Decide which properties to include in each dimension table to best describe it. They should be intuitive and understandable.
• Step 7: Choosing the duration of the database
  – Decide how long to keep the data.
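
Step 5 can be illustrated with a minimal sketch that reuses the Bill Payment example. The column names (`amount_billed`, `discount`, `net_amount`) and the helper `build_payment_fact` are hypothetical and not taken from the slides. The idea is that the derived value is computed once at load time and stored in the fact row, so queries only need to sum it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentFact:
    """One fact row at the grain of a single bill payment (hypothetical columns)."""
    customer_key: int
    date_key: int
    amount_billed: float
    discount: float
    net_amount: float  # pre-calculated at load time: amount_billed - discount


def build_payment_fact(customer_key, date_key, amount_billed, discount):
    """Derive the pre-calculated column once, while the fact row is loaded."""
    return PaymentFact(customer_key, date_key, amount_billed, discount,
                       amount_billed - discount)


facts = [
    build_payment_fact(1, 20240101, 100.0, 10.0),
    build_payment_fact(2, 20240101, 250.0, 0.0),
]

# The stored value is numeric and additive, so it can simply be summed.
total_net = sum(f.net_amount for f in facts)
```

The trade-off is the usual one for denormalization: a little extra storage in exchange for simpler and cheaper queries.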

8 Nine-Step DW Design Methodology
• Step 8: Tracking slowly changing dimensions
  – Type 1: a changed dimension attribute is overwritten.
  – Type 2: a changed dimension attribute causes a new dimension record to be created.
  – Type 3: a changed dimension attribute causes an alternate attribute to be created, so that both the old and new values of the attribute are simultaneously accessible in the same dimension record.
• Step 9: Deciding the query priorities and the query modes
  – Consider physical design issues: indexing for performance, indexed views, partitioning, physical sort order, etc.
  – Storage, backup, security
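
The three slowly changing dimension types in Step 8 can be sketched in a few lines of plain Python. This is an illustration only: the row layout (`customer_key`, `customer_id`, `city`, `valid_from`, `valid_to`, `is_current`) and the function names are assumptions, not something the slides define.

```python
from datetime import date


def scd_type1(row, attr, new_value):
    """Type 1: overwrite the attribute; the old value is lost."""
    updated = dict(row)
    updated[attr] = new_value
    return updated


def scd_type2(rows, natural_key, attr, new_value, change_date, next_key):
    """Type 2: expire the current record and add a new one with a new surrogate key."""
    result = []
    for row in rows:
        if row["customer_id"] == natural_key and row["is_current"]:
            # Close the existing version.
            result.append(dict(row, valid_to=change_date, is_current=False))
            # Open a new version carrying the changed attribute.
            new_row = dict(row, customer_key=next_key, valid_from=change_date,
                           valid_to=None, is_current=True)
            new_row[attr] = new_value
            result.append(new_row)
        else:
            result.append(dict(row))
    return result


def scd_type3(row, attr, new_value):
    """Type 3: keep the old value in an alternate attribute on the same record."""
    updated = dict(row)
    updated[f"previous_{attr}"] = row[attr]
    updated[attr] = new_value
    return updated


customer = {
    "customer_key": 1, "customer_id": "C100", "city": "Dallas",
    "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True,
}

type1 = scd_type1(customer, "city", "Denver")
type2 = scd_type2([customer], "C100", "city", "Denver", date(2024, 1, 1), next_key=2)
type3 = scd_type3(customer, "city", "Denver")
```

Type 2 is the only variant here that preserves full history, at the cost of extra rows in the dimension table.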

9 Data Warehouse Architecture
[Diagram labels: Source Systems (Division A, Division B, Division C, External Data); Data Staging Area; Data Warehouse Repository; Extract, Transformation and Load (ETL)]

10 Data Staging Area
• Subject Area Oriented
• Data Structure more closely mirrors Operational System Data Layouts
• Supports Identification of Changed Data
• Acts as a Working Area to Support the Transformation Process

11 Data Warehouse Repository
• Organized around Conformed Dimensions and Facts
• Promotes Usability and Intuitiveness
• Consolidated and Cross-Functional
• Historical and Atomic Representation of Data
• Insulated from Source System Modifications and Additions

12 Typical Star Schema
Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, Required Data (Business Metrics) or (Measures)
Time_Dim: TimeKey, TheDate, ...
Employee_Dim: EmployeeKey, EmployeeID, ...
Product_Dim: ProductKey, ProductID, ...
Customer_Dim: CustomerKey, CustomerID, ...
Shipper_Dim: ShipperKey, ShipperID, ...
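
A reduced version of this star schema can be sketched with Python's built-in `sqlite3` module. Only two of the five dimensions are created to keep the example short, and the `SalesAmount` measure and all sample rows are made up for illustration. The query shows the typical access pattern: join the fact table to a dimension and aggregate a measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Time_Dim (TimeKey INTEGER PRIMARY KEY, TheDate TEXT);
CREATE TABLE Product_Dim (ProductKey INTEGER PRIMARY KEY, ProductID TEXT);
CREATE TABLE Sales_Fact (
    TimeKey INTEGER REFERENCES Time_Dim(TimeKey),
    ProductKey INTEGER REFERENCES Product_Dim(ProductKey),
    SalesAmount REAL  -- hypothetical measure, not named on the slide
);
INSERT INTO Time_Dim VALUES (1, '2024-01-15'), (2, '2024-02-15');
INSERT INTO Product_Dim VALUES (10, 'APPLES'), (20, 'GRAPES');
INSERT INTO Sales_Fact VALUES (1, 10, 100.0), (1, 20, 50.0), (2, 10, 75.0);
""")

# Total sales per product: one join from the fact table to a dimension.
rows = conn.execute("""
SELECT p.ProductID, SUM(f.SalesAmount)
FROM Sales_Fact AS f
JOIN Product_Dim AS p ON p.ProductKey = f.ProductKey
GROUP BY p.ProductID
ORDER BY p.ProductID
""").fetchall()
```

Because every dimension joins directly to the fact table, any query is at most one join away from the measures, which is the main appeal of the star layout.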

13 Typical Snowflake Schema

14 SSIS Design Best Practices

15 SSIS Best Practices
SSIS (SQL Server Integration Services) is the ETL tool in the Microsoft BI stack, and we follow Microsoft's recommended design best practices. A few of them are listed below:
• Making sure the design of the staging schema(s) and the target DW schema is ready and the data mapping document is prepared
• Applying the table design best practices needed to handle high volumes of data
• Using column names instead of ‘*’ to make optimal use of the package buffer size, which improves package performance
• Setting the right values for ‘Rows per batch’ and ‘Maximum insert commit size’ when loading high volumes of data
• Avoiding asynchronous transformations wherever possible to get the maximum benefit from the parallel execution tree mechanism of SSIS

16 SSIS Best Practices
• Enabling logging to identify issues at run time
• Incorporating a proper error-handling mechanism for dimension loads and fact loads separately
• Using ‘Checkpoints’ in the package so that, if a failure happens, execution resumes from the point of failure
• Auditing the packages / Data Flow tasks to track the result of package execution and other important attributes
• Using Send Mail tasks appropriately to communicate the package execution status to primary stakeholders

17 SSIS Best Practices
• Using the ‘SCD (Slowly Changing Dimension)’ transformation when creating dimension packages to handle the constant changes in the master tables and their maintenance
• Using an alternative mechanism instead of the ‘Lookup’ transformation when the data volume is high, to improve performance
• Using an appropriate incremental loading mechanism for incremental updates to fact tables
• Creating a master package that loads all the dimensions in parallel, sequencing them where necessary to handle dependencies, and then triggers the fact load packages
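
The slide does not say which incremental loading mechanism is meant. One common option is a high-watermark load, sketched below in plain Python rather than SSIS. The function `incremental_load`, the `id` column, and the sample rows are all assumptions for illustration.

```python
def incremental_load(source_rows, target, last_loaded_id):
    """Append only rows past the watermark and return the new watermark."""
    new_rows = [row for row in source_rows if row["id"] > last_loaded_id]
    target.extend(new_rows)
    # If nothing new arrived, the watermark stays where it was.
    return max((row["id"] for row in new_rows), default=last_loaded_id)


source = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
fact_table = []

# First run loads everything after watermark 0.
watermark = incremental_load(source, fact_table, 0)

# A new source row arrives; the second run picks up only that row.
source.append({"id": 3, "amount": 30})
watermark = incremental_load(source, fact_table, watermark)
```

In a real package the watermark would be persisted between runs (for example in a control table), and a modified-date column or change data capture could serve the same purpose as the `id` used here.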

18 SSAS Design Best Practices

19 SSAS Best Practices
Design Best Practices for Cube Dimensions
• Consolidate multiple hierarchies into a single dimension
• Avoid ROLAP storage mode
• Use parent-child and many-to-many relationship dimensions prudently
Design Best Practices for Attributes / Hierarchies
• Define all possible attribute relationships
• Remove redundant attribute relationships
• Use natural hierarchies where possible
Design Best Practices for Measures
• Use the smallest numeric data type possible
• Use semi-additive aggregate functions instead of MDX calculations to achieve the same behavior

20 SSAS Best Practices
• Creating partitions if there are more than approximately 20 million rows in the fact table
• Managing storage settings (MOLAP, HOLAP, ROLAP) according to usage patterns
• Designing aggregations appropriately by giving accurate counts for attributes and fact tables
• Writing optimized MDX queries
• Learning the data security that the client expects and implementing it
  – Data level
  – Role based
• Data Mining: create mining structures and add mining models to them. The possible models include:
  – Classification: decision trees, neural networks
  – Forecasting: time series analysis
  – Shopping basket: shopping basket analysis (generally used in retail to identify which commodities are purchased as a group)
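
The idea behind shopping basket analysis can be shown with a simple co-occurrence count. This is not the SSAS mining algorithm, only a minimal sketch of the underlying question (which items are bought together most often). The function `pair_counts` and the sample baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair has one canonical form per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


baskets = [
    ["apples", "grapes"],
    ["apples", "grapes", "melons"],
    ["cherries", "melons"],
]

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
```

A production model would also weigh how often each item appears on its own (support and confidence), which plain pair counts ignore.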

21 Typical OLAP Cube Design
[Cube diagram labels: Product Dimension (Apples, Cherries, Grapes, Melons); Time Dimension (Q1, Q2, Q3, Q4); Markets Dimension (Atlanta, Chicago, Denver, Dallas)]

22 Typical OLAP Cube Design
[Cube diagram labels: Product Dimension (Apples, Cherries, Grapes, Melons); Time Dimension (Q1, Q2, Q3, Q4); Markets Dimension (Atlanta, Chicago, Denver, Dallas); Sales Fact]

23 Storage Methodology
• Relational On-Line Analytical Processing (ROLAP): The information stored in the Data Warehouse is held in a relational structure. Aggregations are performed on the fly, either by the database or in the analysis tool.
• Multidimensional On-Line Analytical Processing (MOLAP): The information is aggregated in a predefined manner based on the characteristics of the measures and the defined hierarchies of the dimensions. Because the data is pre-aggregated, navigating through the hierarchies is nearly instantaneous: the user simply navigates to a point within the multidimensional cube, and no on-the-fly aggregation is performed.

24 Storage Methodology
• Hybrid On-Line Analytical Processing (HOLAP): A combination of MOLAP and ROLAP. A portion of the data, typically the set of information that is accessed most frequently, is aggregated in a predefined manner. Additional detail can be held in a ROLAP structure, allowing a user to drill through the MOLAP structure into the ROLAP structure.
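
The difference between aggregating on the fly (ROLAP) and navigating pre-aggregated cells (MOLAP) can be sketched in plain Python. The fact rows, the functions `rolap_total` and `build_molap_cube`, and the use of `None` to mean "all members" are assumptions made for this illustration, not how any particular OLAP engine stores its data.

```python
from collections import defaultdict
from itertools import product as cartesian

# Detail rows: (product, quarter, market, sales). Sample values are made up.
fact_rows = [
    ("Apples", "Q1", "Dallas", 10),
    ("Apples", "Q2", "Dallas", 15),
    ("Grapes", "Q1", "Denver", 7),
    ("Apples", "Q1", "Denver", 3),
]


def rolap_total(rows, product=None, quarter=None, market=None):
    """ROLAP style: scan the detail rows and aggregate at query time."""
    return sum(
        amount for p, q, m, amount in rows
        if product in (None, p) and quarter in (None, q) and market in (None, m)
    )


def build_molap_cube(rows):
    """MOLAP style: pre-aggregate every combination of member and 'all' (None)."""
    cube = defaultdict(int)
    for p, q, m, amount in rows:
        for key in cartesian((p, None), (q, None), (m, None)):
            cube[key] += amount
    return cube


cube = build_molap_cube(fact_rows)

on_the_fly = rolap_total(fact_rows, product="Apples")  # computed by scanning
precomputed = cube[("Apples", None, None)]              # a single lookup
```

The cube answers each query with one dictionary lookup but stores many more cells than there are detail rows, which mirrors the speed versus storage trade-off between the modes. A HOLAP arrangement would keep only the frequently used summary cells in the cube and fall back to the detail rows for the rest.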

25 Storage Methodology Selection Criteria
Client perspective     MOLAP     HOLAP     ROLAP
Query performance      Fastest   Faster    Fast
Storage consumption    High      Medium    Low

