Download presentation
Presentation is loading. Please wait.
Published byIra Bridges Modified over 9 years ago
1
Section D
2
OLAP (or Online Analytical Processing) has been growing in popularity due to the increase in data volumes and the recognition of the business value of analytics. Until the mid- nineties, performing OLAP analysis was an extremely costly process mainly restricted to larger organizations. The major OLAP vendor are Hyperion, Cognos, Business Objects, MicroStrategy. The cost per seat were in the range of $1500 to $5000 per annum. The setting up of the environment to perform OLAP analysis would also require substantial investments in time and monetary resources. This has changed as the major database vendor have started to incorporate OLAP modules within their database offering - Microsoft SQL Server 2000 with Analysis Services, Oracle with Express and Darwin, and IBM with DB2.
3
OLAP allows business users to slice and dice data at will. Normally data in an organization is distributed in multiple data sources and are incompatible with each other. ◦ A retail example: Point-of-sales data and sales made via call-center or the Web are stored in different location and formats. It would a time consuming process for an executive to obtain OLAP reports such as - What are the most popular products purchased by customers between the ages 15 to 30? Part of the OLAP implementation process involves extracting data from the various data repositories and making them compatible. Making data compatible involves ensuring that the meaning of the data in one repository matches all other repositories. An example of incompatible data: Customer ages can be stored as birth date for purchases made over the web and stored as age categories (i.e. between 15 and 30) for in store sales. It is not always necessary to create a data warehouse for OLAP analysis. Data stored by operational systems, such as point-of- sales, are in types of databases called OLTPs. OLTP, Online Transaction Process, databases do not have any difference from a structural perspective from any other databases. The main difference, and only, difference is the way in which data is stored.
4
Examples of OLTPs can include ERP, CRM, SCM, Point-of-Sale applications, Call Center. OLTPs are designed for optimal transaction speed. When a consumer makes a purchase online, they expect the transactions to occur instantaneously. With a database design, call data modeling, optimized for transactions the record 'Consumer name, Address, Telephone, Order Number, Order Name, Price, Payment Method' is created quickly on the database and the results can be recalled by managers equally quickly if needed.
5
Data are not typically stored for an extended period on OLTPs for storage cost and transaction speed reasons. OLAPs have a different mandate from OLTPs. OLAPs are designed to give an overview analysis of what happened. Hence the data storage (i.e. data modeling) has to be set up differently. The most common method is called the star design.
6
Data are not typically stored for an extended period on OLTPs for storage cost and transaction speed reasons. OLAPs have a different mandate from OLTPs. OLAPs are designed to give an overview analysis of what happened. Hence the data storage (i.e. data modeling) has to be set up differently. The most common method is called the star design.
7
The central table in an OLAP start data model is called the fact table. The surrounding tables are called the dimensions. Using the above data model, it is possible to build reports that answer questions such as: ◦ The supervisor that gave the most discounts. ◦ The quantity shipped on a particular date, month, year or quarter. ◦ In which zip code did product A sell the most.
8
To obtain answers, such as the ones above, from a data model OLAP cubes are created. OLAP cubes are not strictly cuboids - it is the name given to the process of linking data from the different dimensions. The cubes can be developed along business units such as sales or marketing. Or a giant cube can be formed with all the dimensions.
9
OLAP can be a valuable and rewarding business tool. Aside from producing reports, OLAP analysis can aid an organization evaluate balanced scorecard targets.
10
Fast - 90% of queries back in under 10 secs and no query takes longer than 30 secs. Analysis - Drill down, multiple aggregation techniques, sophisticated graphics, trends all form part of this Shareable - good security at the back end and available to a wide community of users.also multi currency, multi lingual to cope with the global economy. Multi-Dimensional - Excel pivot tables but more so. The ability to have any multiple dimensions of information on each axis of a cross-tab with other dimensions being used to further filter the results returned. Information - Real world KPI's rather than raw numbers.
11
In OLAP the cube is the database structure tha t is queried on and to get a handle on how this works below is a simple 3 dimension al cube
12
The coordinate system in a cube not only has a reference to a point in multidimensional space it also has an understanding of hierarchies. So the cube 'knows' that January 2007 has a parent called 2007 in the example. This forms a key part of the OLAP concept - that the results of calculations can be stored at the parent level rather than using on the fly aggregation of all the children E.g. the sales total for 2007 is stored in the cube for bike, components etc. as is the cost of sale. The profit margin % has to be worked out on the fly for bikes for 2007 but this is quick as the cost of sales and the sales that contribute to this calculation are pre-calculated. This gives OLAP it's speed while allowing for rich calculations to be stored.
13
Basic Concepts
14
A data warehouse is a: subject-oriented integrated Time varying non-volatile collection of data in support of the management's decision-making process. A data warehouse is a centralized repository that stores data from multiple information sources and transforms them into a common, multidimensional data model for efficient querying and analysis.
16
OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). The main emphasis for OLTP systems is put on very fast query processing, maintaining data integrity in multi- access environments and an effectiveness measured by number of transactions per second. In OLTP database there is detailed and current data, and schema used to store transactional databases is the entity model (usually 3NF). - OLAP (On-line Analytical Processing) is characterized by relatively low volume of transactions. Queries are often very complex and involve aggregations. For OLAP systems a response time is an effectiveness measure. OLAP applications are widely used by Data Mining techniques. In OLAP database there is aggregated, historical data, stored in multi- dimensional schemas (usually star schema).
17
We can divide IT systems into transactional (OLTP) and analytical (OLAP). In general we can assume that OLTP systems provide source data to data warehouses, whereas OLAP systems help to analyze it
18
OLTP SYSTEMOLAP SYSTEM Source of data Operational data; OLTPs are the original source of the data. Consolidation data; OLAP data comes from the various OLTP Databases Purpose of data To control and run fundamental business tasks To help with planning, problem solving, and decisionsupport What the data Reveals a snapshot of ongoing business processes Multi-dimensional views of various kinds of business activities Inserts and Updates Short and fast inserts and updates initiated by end users Periodic long-running batch jobs refresh the data Queries Relatively standardized and simple queries Returning relatively few records Often complex queries involving aggregations
19
OLTPOLAP Processing SpeedTypically very fast Depends on the amount of data involved; batch data refreshes and complex queries may take many hours; query speed can be improved by creating indexes Space Requirements Can be relatively small if historical data is archived Larger due to the existence of aggregation structures and history data; requires more indexes than OLTP DatabaseDesign Highly normalized with many tables Typically de-normalized with fewer tables; use of star and/or snowflake schemas Backup and Recovery Backup religiously; operational data is critical to run the business, data loss is likely to entail significant monetary loss and legal liability Instead of regular backups, some environments may consider simply reloading the OLTP data as a recovery method
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.