1
Compressing Relations And Indexes
Jonathan Goldstein, Raghu Ramakrishnan, Uri Shaft. Department of Computer Sciences, University of Wisconsin-Madison. June 18, 1997
2
Agenda
Introduction; Compressing a Relation; Compression Applied to Rectangle-Based Indexes; Performance Evaluation; Questions and Remarks
3
Introduction
Page-level compression; performance study; application to B-trees and R-trees; multidimensional bulk-loading algorithm
4
Introduction
5
Introduction
6
Compressing a Relation
Frames of reference; non-numeric attributes; file-level compression
7
Frames of Reference
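The slide text gives only the title, so the following is a minimal sketch of frame-of-reference encoding as it is commonly described, not the authors' implementation. It assumes integer attributes, a per-page minimum for each attribute as the frame, and offsets stored in just enough bits to cover the page's value range. The function names (`compress_page`, `pack`, `unpack`, `decompress_tuple`) are hypothetical.

```python
def compress_page(tuples):
    """Encode the integer tuples of one page as offsets from per-attribute minima."""
    columns = list(zip(*tuples))
    mins = [min(c) for c in columns]
    maxs = [max(c) for c in columns]
    # Bits needed to represent any offset in [0, max - min] for each attribute.
    widths = [(hi - lo).bit_length() for lo, hi in zip(mins, maxs)]
    packed = [pack(tuple(v - lo for v, lo in zip(t, mins)), widths) for t in tuples]
    return mins, widths, packed


def pack(offsets, widths):
    """Concatenate fixed-width offsets into a single integer word."""
    word = 0
    for off, w in zip(offsets, widths):
        word = (word << w) | off
    return word


def unpack(word, widths):
    """Inverse of pack: split a word back into its fixed-width offsets."""
    out = []
    for w in reversed(widths):
        out.append(word & ((1 << w) - 1))
        word >>= w
    return tuple(reversed(out))


def decompress_tuple(mins, widths, packed, i):
    """Decode tuple i alone, without touching the other tuples on the page."""
    return tuple(o + lo for o, lo in zip(unpack(packed[i], widths), mins))


page = [(1000, 7), (1003, 7), (1010, 9)]
mins, widths, packed = compress_page(page)
print(widths)                                     # [4, 2]
print(decompress_tuple(mins, widths, packed, 2))  # (1010, 9)
```

In this example each tuple needs 6 bits instead of two full-width integers, and any single tuple can be decoded on its own because every tuple on the page uses the same field widths.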
8
Point approximation in lossy compression
9
Compressing an indexing structure
Compressing a B-tree; compressing a rectangle-based indexing structure; compression-oriented bulk loading
10
Rectangle Based indexing qualities
11
Changing the frame of reference
12
Bulk-Loading Algorithm
Input. A set of points in some n-dimensional space.
Output. A partition of the input into subsets.
Requirements. The partition should group points that are close to each other in the same group as much as possible.
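To make the input/output contract above concrete, here is a small sketch of one generic way to meet it: recursively split the points at the median of the dimension with the largest spread until each group fits a fixed capacity. This is a k-d-tree-style stand-in chosen for illustration, not the GB-Pack algorithm from the talk; the function name `partition` and the `capacity` parameter are assumptions.

```python
def partition(points, capacity):
    """Split points into groups of at most `capacity`, keeping nearby points together."""
    if len(points) <= capacity:
        return [points]
    dims = range(len(points[0]))
    # Split along the dimension in which the points are most spread out.
    d = max(dims, key=lambda k: max(p[k] for p in points) - min(p[k] for p in points))
    ordered = sorted(points, key=lambda p: p[d])
    mid = len(ordered) // 2
    return partition(ordered[:mid], capacity) + partition(ordered[mid:], capacity)


points = [(0, 0), (1, 1), (0, 1), (100, 100), (101, 100), (100, 101)]
groups = partition(points, capacity=3)
print(len(groups))  # 2
```

Here the two clusters end up in separate groups, which is the property the requirement asks for. A fixed capacity is the simplest choice for a sketch.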
13
GB-Pack compression oriented bulk loading
14
GB-Pack compression oriented bulk loading
Qualities: Trades off some tree quality for increased compression. The number of entries per page is data-dependent. Cuts along a dimension are made at value boundaries in the data.
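The point that the number of entries per page is data-dependent can be illustrated with a one-dimensional sketch. It assumes a frame-of-reference style cost model in which a page stores a fixed header plus one offset per entry, with every offset as wide as the page's value range requires. The 64-bit header, the greedy packing loop, and the names `bits_for_page` and `pack_pages` are assumptions made for illustration; the slides do not give GB-Pack's actual procedure.

```python
HEADER_BITS = 64  # assumed per-page overhead (frame minimum plus field width)


def bits_for_page(values):
    """Size of a page holding `values` as fixed-width offsets from their minimum."""
    width = (max(values) - min(values)).bit_length()
    return HEADER_BITS + width * len(values)


def pack_pages(sorted_values, page_bits):
    """Greedily fill pages; start a new page when the compressed size would overflow."""
    pages, current = [], []
    for v in sorted_values:
        if current and bits_for_page(current + [v]) > page_bits:
            pages.append(current)
            current = []
        current.append(v)
    if current:
        pages.append(current)
    return pages


values = list(range(16)) + [1000, 2000, 3000, 4000]
pages = pack_pages(values, page_bits=128)
print([len(p) for p in pages])  # [16, 4]
```

Tightly clustered values need few bits each, so 16 of them fit on one page, while the widely spaced values fit only 4 to a page of the same size.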
15
GB-Pack compression oriented bulk loading
16
GB-Pack compression oriented bulk loading
17
GB-Pack compression oriented bulk loading
18
Performance Evaluation
Relational compression experiments; CPU vs. I/O costs; comparison with techniques in commercial systems; importance of tuple-level decompression; R-tree compression experiments
19
Synthetic Data Sets
Size: the number of tuples in the relation.
Dimensionality: the number of attributes of the relation.
Range: the range of values for the attributes.
Distribution: uniform (worst case) or exponential.
Partition strategy.
Page size.
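As a rough illustration of how the first four parameters could drive a data generator, here is a minimal sketch. The slides do not say how the exponential distribution was parameterized, so the mean of one tenth of the range and the clipping to the range are assumptions, as are the function name `synthetic_relation` and the `seed` argument.

```python
import random


def synthetic_relation(size, dimensionality, value_range, distribution="uniform", seed=0):
    """Generate `size` tuples of `dimensionality` integers in [0, value_range)."""
    rng = random.Random(seed)

    def draw():
        if distribution == "uniform":
            return rng.randrange(value_range)
        if distribution == "exponential":
            # Assumed parameterization: mean value_range / 10, clipped to the range.
            return min(int(rng.expovariate(10.0 / value_range)), value_range - 1)
        raise ValueError(f"unknown distribution: {distribution}")

    return [tuple(draw() for _ in range(dimensionality)) for _ in range(size)]


uniform_rel = synthetic_relation(1000, 4, 10_000, "uniform")
skewed_rel = synthetic_relation(1000, 4, 10_000, "exponential")
```

Uniform values spread across the whole range, so the offsets on a page need the most bits, which is consistent with the slide labeling uniform as the worst case.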
20
Sales Data Set
Compression achieved versus dimensionality
21
CPU vs. I/O Costs
22
R-tree Compression Experiments
Testing the quality of R-trees on Sales Data Set.
23
Questions And Remarks