Published by Marylou Garrison. Modified over 9 years ago.
1
Some OLAP Issues
CMPT 455/826 - Week 9, Day 2
Jan-Apr 2009 – w9d2
2
OLAP Features To Consider In A Data Warehousing System
(based on Gorla)
3
Gaining Acceptance
Hypothesis:
  – New technology won’t be utilized effectively if it isn’t accepted
  – and acceptance is based on:
      perceived usefulness (PU)
        – the degree to which a person believes that using a particular system would enhance his or her job performance
      perceived ease of use (PEU)
        – the degree to which a person believes that using a particular system would be free of effort
4
OLAP Features
Visualization
  – allows users to create summary tables and charts interactively
  – Measures: the presence of
      multidimensional tables
      multidimensional graphics
5
OLAP Features
Summarization
  – the “degree of aggregation” of information
      i.e. supporting directed acyclic graphs within a dimension
  – Measures:
      the number of hierarchies allowed (in a single dimension)
      the level of detail
      the capability to swap between summarized and detailed levels
6
OLAP Features
Navigation
  – the capability to drill down or roll up between levels of detail
  – the capability to get to the information you want
      drill-down is moving from more general to more detailed information in a particular domain (e.g. changing location focus from state to city)
      roll-up is moving from more detailed to more general information in a particular domain (e.g. changing location focus from city to state)
      slicing is selecting certain rows in a table and ignoring the rest
      dicing is selecting certain attributes in a table and ignoring the rest
  – Measures:
      shareability (number of concurrent users allowed)
      data navigability (availability of drill-down, slicing-dicing, and drag-drop facilities)
      the ability to extract detailed and real-time data
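The four navigation operations can be sketched on a tiny in-memory table. This is only an illustration under assumptions: the rows, the attribute names (state, city, product, amount) and the helper functions are hypothetical, and slicing and dicing follow the row/attribute definitions given on this slide.

```python
from collections import defaultdict

# Hypothetical fact rows: one sales amount per (state, city, product).
SALES = [
    {"state": "WA", "city": "Seattle", "product": "tea", "amount": 10},
    {"state": "WA", "city": "Spokane", "product": "tea", "amount": 5},
    {"state": "WA", "city": "Seattle", "product": "coffee", "amount": 7},
    {"state": "OR", "city": "Portland", "product": "tea", "amount": 4},
]


def aggregate(rows, level):
    """Sum 'amount' grouped by one attribute (e.g. 'state' or 'city')."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[level]] += row["amount"]
    return dict(totals)


def slice_rows(rows, **conditions):
    """Slicing as defined on the slide: keep only the rows that match."""
    return [r for r in rows if all(r[k] == v for k, v in conditions.items())]


def dice_columns(rows, attributes):
    """Dicing as defined on the slide: keep only the named attributes."""
    return [{a: r[a] for a in attributes} for r in rows]


# Roll-up: view the data at the more general 'state' level.
by_state = aggregate(SALES, "state")

# Drill-down: focus on one state and view it at the more detailed 'city' level.
by_city_in_wa = aggregate(slice_rows(SALES, state="WA"), "city")

# Slice then dice: tea rows only, keeping just city and amount.
tea_amounts = dice_columns(slice_rows(SALES, product="tea"), ["city", "amount"])
```

Roll-up and drill-down here are the same aggregation run at different levels of the location hierarchy; only the grouping attribute changes.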
7
OLAP Features
Query Function:
  – Query engines extract data from multidimensional databases and generate outputs in 3D graphics
Measures:
  – using pre-constructed query capability
  – simple query building with click-select feature
  – query building with query languages
  – concurrent run of queries
8
OLAP Features
Sophisticated Analysis:
  – Measures: (six most common types of analyses used in decision support)
      statistical profiling
        – (e.g. list customers with highest combined sales)
      moving averages
      cross-dimension comparison
        – (e.g. compare product sales by region over a period of time)
      queries with self-defined formula
      exception condition
      what-if analysis
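Of the six analysis types, the moving average is the easiest to show in a few lines. The sketch below assumes a simple (unweighted) trailing window; the function name and the sample figures are hypothetical.

```python
def moving_average(values, window):
    """Simple trailing moving average: one result per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical monthly sales figures.
monthly_sales = [10, 20, 30, 40, 50]
smoothed = moving_average(monthly_sales, 3)  # [20.0, 30.0, 40.0]
```

The first two months produce no value because a full three-month window is not yet available.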
9
OLAP Features
Dimensionality
  – Measures:
      the number of allowable dimensions
      capability to redefine a dimension
      time for data refresh after redefinition
10
OLAP Features
Performance
  – Measures: (response times for four basic functions)
      standard report generation
      customized report generation
      graphic/chart generation
      data navigation
11
An Analysis of Additivity in OLAP Systems
(based on Horner)
12
Typical operations
Roll-up
  – increases the level of aggregation along one or more classification hierarchies
Drill-down
  – decreases the level of aggregation along one or more classification hierarchies
Slice-Dice
  – selects and projects the data
Pivoting
  – reorients the multi-dimensional data view to allow exchanging facts for dimensions symmetrically
Merging
  – performs a union of separate roll-up operations
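Pivoting is the least obvious of these operations, so here is a minimal sketch of the simplest case: swapping the two axes of a view. The cell values, the dimension names (region, quarter) and the `pivot` helper are assumptions made for illustration.

```python
def pivot(cells, row_axis, col_axis):
    """Arrange cells keyed by (dim0, dim1) into a {row: {col: value}} table."""
    table = {}
    for key, value in cells.items():
        table.setdefault(key[row_axis], {})[key[col_axis]] = value
    return table


# Hypothetical sales cells keyed by (region, quarter).
CELLS = {
    ("East", "Q1"): 100,
    ("East", "Q2"): 120,
    ("West", "Q1"): 80,
    ("West", "Q2"): 90,
}

by_region = pivot(CELLS, row_axis=0, col_axis=1)   # regions as rows
by_quarter = pivot(CELLS, row_axis=1, col_axis=0)  # quarters as rows
```

Both tables hold the same four cells; pivoting changes only the orientation of the view, not the data.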
13
Summarization of measures
The roll-up and merge operations
  – both use aggregate operators to combine finer-grained measures into summary data
But not all fact data
  – is mathematically summarizable
In certain instances
  – using the sum operator to summarize data can result in inaccurate summary outputs
14
Measures
additive
  – along a dimension if the sum operator can be used to meaningfully aggregate values along all hierarchies in that dimension
fully-additive
  – if it is additive across all dimensions
semi-additive
  – if it is only additive across certain dimensions
non-additive
  – if it is not additive across any dimension
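A stock level is a common textbook case of a semi-additive measure. The sketch below is an assumed illustration, not an example taken from the Horner paper: the store names, months and figures are hypothetical.

```python
# Hypothetical end-of-month stock levels keyed by (store, month).
STOCK = {
    ("store_a", "jan"): 100,
    ("store_a", "feb"): 80,
    ("store_b", "jan"): 50,
    ("store_b", "feb"): 70,
}


def total_for_month(month):
    """Sum across the store dimension: a meaningful company-wide stock level."""
    return sum(v for (_store, m), v in STOCK.items() if m == month)


def naive_total_for_store(store):
    """Sum across the time dimension: arithmetically valid, but not a stock level."""
    return sum(v for (s, _month), v in STOCK.items() if s == store)


feb_stock = total_for_month("feb")             # 150 units on hand in February
misleading = naive_total_for_store("store_a")  # 180, yet store_a never held 180 units
```

The measure is additive along the store dimension but not along time, which is what makes it semi-additive.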
15
Hierarchies
A strict hierarchy
  – is one where each object at a lower level belongs to only one value at a higher level
A non-strict hierarchy
  – can be thought of as a many-to-many relationship between a higher level of the hierarchy and the lower level
  – can result in multiple or alternate path hierarchies, whereby the lower object splits into two distinct higher level objects
16
Hierarchies
Alternate and multiple path hierarchies
  – are important when summarizing measures, and can specifically present problems when merging data
An inaccurate summarization can result
  – if summaries from different paths of the same hierarchy are merged
Data cannot be merged
  – among classification attributes that have overlapping data instances
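The merging problem can be made concrete with a small non-strict hierarchy. In this sketch the products, categories and amounts are hypothetical; the point is only that one product belongs to two categories.

```python
# Hypothetical sales per product.
SALES_BY_PRODUCT = {"phone": 100, "laptop": 200, "tablet": 50}

# Non-strict hierarchy: 'tablet' rolls up to two categories.
CATEGORIES = {
    "phone": ["mobile"],
    "laptop": ["computing"],
    "tablet": ["mobile", "computing"],
}


def roll_up(sales, parents):
    """Add each product's sales to every category it belongs to."""
    totals = {}
    for product, amount in sales.items():
        for category in parents[product]:
            totals[category] = totals.get(category, 0) + amount
    return totals


category_totals = roll_up(SALES_BY_PRODUCT, CATEGORIES)  # mobile 150, computing 250
merged_total = sum(category_totals.values())             # 400
true_total = sum(SALES_BY_PRODUCT.values())              # 350
```

Each category total is correct on its own, but merging them counts the tablet sales twice, which is the overlap the slide warns about.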
17
Non-additive measures
Derived data
  – Ratios and Percentages
  – Measures of Intensity
  – Average/Maximum/Minimum
Numbers used for other than quantities
  – Measurements of direction
  – Codes (& arbitrarily assigned numbers)
  – Dates and time of day
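Averages show why derived data cannot simply be combined further. The sketch below uses hypothetical per-region sales; it compares an average of regional averages with the average recomputed from the underlying sums and counts.

```python
# Hypothetical individual sales amounts per region.
REGION_SALES = {"east": [10, 20, 30], "west": [40]}

# Derived measure stored per region: the average sale.
region_averages = {r: sum(v) / len(v) for r, v in REGION_SALES.items()}

# Combining the derived values directly gives every region equal weight.
average_of_averages = sum(region_averages.values()) / len(region_averages)  # 30.0

# Recomputing from additive components (sum and count) gives the correct figure.
total = sum(sum(v) for v in REGION_SALES.values())
count = sum(len(v) for v in REGION_SALES.values())
true_average = total / count  # 25.0
```

A common workaround, assumed here rather than stated on the slide, is to store the additive components (sum and count) and derive the average only at query time.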
18
So, What does this all mean?
It is important for us to be able to consider
  – how these papers on OLAP and Data Warehousing
  – relate to the other material
  – we have covered in this course
19
Going back to the beginning
We discussed a taxonomy of data related concepts:
  – Wisdom
      is de-contextualized truth, that is always true
      is different from raw facts, which are highly context specific
  – Knowledge
      is context-specific truth, that includes decisions
      is the result of applying rules/algorithms to information in a given context
  – Information
      is processed (extracted, summarized, etc.) data that is useful for making some decision
  – Data
      is raw facts, which need not be numbers
20
Going further
Metadata
  – is data that can be used to understand / process data
  – can include rules / algorithms
  – can be implicit (in the data types or structures)
  – can be explicit (in additional data attributes / tables / databases)
21
Going further - Metadata (cont)
It is always important to consider for each data attribute
  – the syntax (generally dealt with in a database)
  – the semantics (often not dealt with explicitly)
the operations in which it can participate and the TASKS they serve
  – maintaining the data
  – analyzing the data (including what queries can use it and how)
the USERS of the data, who
  – own it
  – input / change it
  – have access to read it
other important attributes of the attribute
  – value
  – privacy
  – risks
  – etc.
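One way to make this checklist explicit is to record it as a structure stored alongside each attribute. The sketch below is purely illustrative: the class, its field names and the rule that owners may always read are assumptions, not something the slides prescribe.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeMetadata:
    """Hypothetical explicit metadata record for one data attribute."""
    name: str
    syntax: str                                   # e.g. the database type
    semantics: str                                # what a value means
    tasks: list = field(default_factory=list)     # maintenance / analysis uses
    owners: list = field(default_factory=list)
    editors: list = field(default_factory=list)   # who may input / change it
    readers: list = field(default_factory=list)
    privacy: str = "unspecified"

    def can_read(self, user):
        # Assumption for this sketch: owners can always read their own data.
        return user in self.readers or user in self.owners


salary = AttributeMetadata(
    name="salary",
    syntax="DECIMAL(10,2)",
    semantics="gross annual pay in the employee's local currency",
    tasks=["payroll maintenance", "compensation analysis"],
    owners=["hr"],
    editors=["hr"],
    readers=["finance"],
    privacy="confidential",
)
```

Here `salary.can_read("finance")` is true and `salary.can_read("marketing")` is false, so the access question on the slide becomes a lookup rather than tribal knowledge.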
22
Ontologies can help
Ontologies
  – contain
      names of concepts
      descriptions
      rules
  – structure concepts from high level to individual attributes
  – support sharing of information
      between the database and a user
      across databases / systems / users
23
Dimensions can help, too
Dimensions
  – organize data conceptually
  – in directed acyclic graphs
  – can support exploration
      within a dimension
      between multiple dimensions
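A dimension organized as a directed acyclic graph can be sketched as a mapping from each level to the levels it rolls up to. The location dimension below, with its level names and two alternate paths, is a hypothetical example.

```python
# Hypothetical location dimension: each level lists the levels it rolls up to.
ROLLS_UP_TO = {
    "store": ["city", "sales_district"],
    "city": ["state"],
    "state": ["country"],
    "sales_district": ["country"],
    "country": [],
}


def roll_up_paths(level, graph=ROLLS_UP_TO):
    """List every roll-up path from 'level' to a top level of the dimension."""
    parents = graph[level]
    if not parents:
        return [[level]]
    return [
        [level] + path
        for parent in parents
        for path in roll_up_paths(parent, graph)
    ]


paths = roll_up_paths("store")
# [['store', 'city', 'state', 'country'], ['store', 'sales_district', 'country']]
```

Exploring within the dimension then means choosing one of these paths and moving up or down along it.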
24
Data Warehouses
  – can provide dimensional storage of data to aid in exploration
  – don’t rely on traditional normalization and other technology focused techniques
  – do rely on techniques such as OLAP to help users explore them
  – need a good understanding of the data so that it can be cleaned before being placed in them
25
OLAP
  – refers to a collection of techniques for exploring data in Data Warehouses and other Dimensionally organized systems
  – So far the papers have focused on exploration that involves
      summarizing, extracting, and processing DATA
      producing largely numerical information
  – What more should it do?
  – What’s needed so that it can do this?