Download presentation
Presentation is loading. Please wait.
Published byMirella Amarante Batista Modified over 6 years ago
1
Native Multidimensional Indexing in Relational Databases
David Hoksza, Tomáš Skopal Charles University in Prague Department of Software Engineering Czech Republic
2
Presentation Outline Multidimensional querying Indexing in PostgreSQL
attribute number growth indexing methods contemporary DB systems Indexing in PostgreSQL user-defined access methods external indexing framework Experiments COMAD 2008
3
Indexing Single-attribute based indexing
to avoid sequential scan B-tree Multi-attribute based indexing window query straightforward solution – multiple B-trees SELECT * FROM Products WHERE BrandID BETWEEN (3 AND 11) SELECT * FROM Products WHERE BrandId BETWEEN (13 AND 14) AND ProductTypeID BETWEEN (13 AND 24) PeriodID BETWEEN (3 AND 11) COMAD 2008
4
Multi-attribute Based Indexing
Multiple B-trees attribute number growth → exponential growth of partial result-sets → sequential scan dimensionality curse COMAD 2008
5
Multi-dimensional access methods
B+-tree with Compound Keys most often employed solution multiple keys – single chained value key components compared in lexicograhpical order assymetry in the order of the keys R-tree UB-tree transformation of n-dim points into 1-dim Z-address → Z-curve Z-curves divided into Z-regions indexed by a B+-tree COMAD 2008
6
Native Multi-dimensional Indexing
SELECT * FROM Products WHERE BrandId BETWEEN (13 AND 14) AND ProductTypeID BETWEEN (13 AND 24) PeriodID BETWEEN (3 AND 11) Table rows points in n-dimensional space R-tree Queries cubes in n-dimensional space (n-dimensional windows) COMAD 2008
7
Contemporary DB systems
COMAD 2008
8
PostgreSQL Object-relational DBMS Open-source
Since 2005 (v. 8.0) runs on Windows Emphasis on extensibility data types operators procedural languages access methods … COMAD 2008
9
Relation Types in PostgreSQL
Heap relations (HRs) user relations system catalog undefined order Index relations (IRs) <key,value> pairs external fast access to heap relations internal access methods’ structures COMAD 2008
10
User-defined Access Methods (AM) in PostgreSQL
Implement a set of functions communicating with PostgreSQL’s core (AM implementation). Register the functions (located in libraries). Define an AM (index type) by connecting the functions with a newly created AM. Establish a class of operators and types for the AM. Use the index. COMAD 2008
11
User-defined Access Methods in PostgreSQL – cont.
required functions index_build creating a structure index_insert inserts a record index_beginscan starts a new scan index_gettuple gets a record fulfilling search conditions index_getmulti gets a set of records index_endscan finishes a search index_markpos marks actual position in a scan index_restrpos returns to a marked position index_rescan repeats scan with the same structure of search keys index_bulkdelete removes a set of records index_costestimate estimates cost of a search COMAD 2008
12
External Indexing AM in PostgreSQL store data in IRs returning IRs’ TIDs to the core External indexing framework Storing IRs’ TIDs in external index storage PostgreSQL’s interface to access methods still requires high level of mastering of PostgreSQL’s inner mechanisms Framework for external indexing COMAD 2008
13
Framework interface void FW_CreateStructure (Relation index_relation);
void* FW_PrepareInsert (Relation index_relation); void FW_InsertTuple (void *fw_data, Relation index_relation, IndexTuple index_tuple, BlockNumber block_number, OffsetNumber offset); void FW_FinishInsert (void *fw_data); void FW_InitSearch (IndexScanDesc scan, ScanDirection dir); bool FW_GetNextTID (IndexScanDesc scan, ScanDirection dir, BlockNumber *block_number, OffsetNumber *offset); void FW_DeleteTuple (BlockNumber block number, OffsetNumber offset); COMAD 2008
14
Experimental evaluation
The testbed Uniform clusters of uniformly distributed (up to 15-dimensional) objects Gauss (up to 3-dimensional) objects following Gaussian distribution DBLP 435,373 DBLP database records author, type of publication, year of publication, number of pages Studied costs index access count real-time COMAD 2008
15
Experiments – Database Growth
COMAD 2008
16
Experiments – Query Selectivity
COMAD 2008
17
Experiments – Dimension Growth
COMAD 2008
18
Experiments – Real-time
COMAD 2008
19
Conclusion We have proposed and implemented Results show
Native multidimensional indexing by R-tree Indexing framework for PostgreSQL Implementation of native external R-tree index Results show big speed-up on the real-world data according to index-access metric poor physical implementation of access methods in PostgreSQL in comparison to Oracle and MSSQL Server COMAD 2008
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.