Analysis Services Analysis Services vs. the Data Warehouse vs. OLTP DB Precalculated Aggregates Hierarchical Awareness Creating the OLAP DB The Data Source View The Cube Fact & Dimension Tables The Time Dimension Measures Using the Cube
Analysis Services—Why we Care Three levels of data sources for analysis All three are separate copies of the data Each has advantages/disadvantages & purposes Source data OLTP relational databases Other source data including external Data Warehouse Source data integrated Still a relational database Analysis Services (OLAP) Database
Analysis Services—Why we Care (cont.) Data Warehouse Strengths Integrated data 'Scrubbed' data Shortened relationship paths→Simpler queries Optimized for queries rather than throughput Data Warehouse Limitations Still a relational database Performance lags when querying and summing millions of records
The OLAP/Analysis Services Approach In their simplest forms OLAP databases have a logical structure similar to the star or snowflake schema we saw in the DW Fact tables Dimension tables Data storage structure is wildly different from relational DBMS Fact/Dimension tables are stored in 'cubes' Multi-dimensional (not just three) relationships between fact and dimension tables Preprocessed aggregates stored in DB
The OLAP/Analysis Services Approach (cont.) Recall that fact tables contain Keys that indicate dimensionality of the data Measures that contain values of interest We will design Cubes based around a single fact table Other approaches acceptable including multi-fact table cubes
The OLAP/Analysis Services Approach (cont.) The OLAP engine is aware of The relationship between the values in the dimensional key columns and the measures in the fact table Every sale is for one Time Key Customer Key etc Hierarchies in dimensional tables Country→State →City Year →Month →Date
Precalculated Aggregates The OLAP Engine precalculates aggregates along dimensions in the fact table If querying total sales value by customer and product this value may already be stored for each combination of customer and product Aggregates are calculated and stored during processing of the cube (later)
Intelligent Hierarchies OLAP intelligently uses precalculated aggregates to total on hierarchies If aggregates are already calculated for sales by product by customer… … sales by product by country use the precalculated aggregate rather than querying the detail data There are special tools for establishing hierarchical relationships among time dimension components Relationships in snowflake schema will be automatically detected Others can be established at design time
Creating the OLAP DB Create the OLAP DB from the DW They can also be created directly from source data Use Business Intelligence Development Studio to design, create, and load the OLAP DB The Visual Studio project contains the definitions needed to design, create, and load The project also creates the Analysis Services DB SQL Server & Analysis Services must both be running
Creating the OLAP DB (cont.) Steps Create Analysis Services Project with connection(s) Create Data Source View to define data to load Generate OLAP DB Load OLAP DB OLAP DB available for use Direct browsing Serving via Analysis Services server Reporting Services Excel
Creating the OLAP DB New Business Intelligence Project Type is Analysis Services Manage project file locations
Create Data Source(s) for the Project Create a data source just as we did for the data warehouse load project Point the data source to the data warehouse DB Create new connection if necessary Select "Default" Impersonation Information if DW DB does not require login
Create the Data Source View The Data Source View (DSV) is a map from the source data (data warehouse in our case) to the OLAP DB May include data transformations Allows fact data and dimensions to be identified Allows hierarchies to be established Special tools for time hierarchies Create new DSV in Solution Explorer Set Data Source
Create Data Source View (cont.) Select fact table to be loaded Select dim tables Use Add Related Tables button Manually select Include hierarchical tables as necessary from snowflake schema Name DSV when all tables are selected
Create Data Source View (cont.) DSV template is created from the selected tables Template may be modified Add calculated columns Ready to add new cube when DSV is complete
Create Cube Create new cube from Solution Explorer Select Build cube from data source Select the DSV that was created to be the basis for the new cube
Build Cube—Confirm Fact & Dimension Tables Confirm suggested fact and dimension tables Wizard frequently misidentifies dimension tables as fact tables Just check Be sure to identify the Time Dimension Table
Build Cube—Map Time Dimension Columns Time has built in hierarchies Map the Time Dim columns to the predefined time hierarchical concepts Not all will be mapped
Build Cube—Identify Measures Uncheck spurious columns that will not be used as measures in the fact table Next step detects hierarchies No operator choices
Build Cube—Review Hierarchies The hierarchies screen allows you to review, delete, & modify hierarchies
Building the Cube—Name the Cube Give the cube a meaningful name Default is the same as the DSV which should probably not be used Click Finish to build the cube design
Building the Cube—Reviewing Cube Design
Building the Cube—Reviewing Cube Design The cube structure tab will show the cube design in a way that looks much like the DSV Any changes will be reflected Calculated columns Renamed columns
Processing the Cube The cube must be processed before it can be used Select Process… from the Cube menu On first run you will be prompted to build and deploy the project first Select Run from the Process Cube dialog This may take some time—this is where the data is being loaded into the OLAP DB and initial aggregations created