Advanced Database Systems: DBS CB, 2nd Edition


1 Advanced Database Systems: DBS CB, 2nd Edition
Advanced Topics of Interest: In-Memory DB (IMDB) and Column-Oriented DB

2 Outline In-Memory Database (IMDB) Column-Oriented Database (C-Store)

3 In-Memory Database (IMDB)

4 Introduction In Memory database system (IMDB)
Data resides permanently in main physical memory. Backup copy (optionally) on disk. Disk Resident database system (DRDB): data resides on disk; data may be cached into memory for access. The main difference is that in an IMDB, the primary copy lives permanently in memory.

5 Questions about IMDB Is it reasonable to assume that the entire database fits in memory? Yes, for some applications! What is the difference between an IMDB and a DRDB with a very large cache? In a DRDB, even if all data fits in memory, the data structures and algorithms are designed for disk access.

6 Differences in properties of main memory and disk
The access time for main memory is orders of magnitude less than for disk storage Main memory is normally volatile, while disk storage is not The layout of data on disk is much more critical than the layout of data in main memory

7 Impact of memory resident data
The differences in properties of main-memory and disk have important implications in: Concurrency control Commit processing Access methods Data representation Query processing Recovery Performance

8 Concurrency control Access to main memory is much faster than disk access, so we can expect that transactions complete more quickly in an IMDB system Lock contention may not be as important as it is when the data is disk resident

9 Commit Processing As protection against media failure, it is necessary to have a backup copy and to keep a log of transaction activity The need for a stable log threatens to undermine the performance advantages that can be achieved with memory resident data

10 Access Methods The costs to be minimized by the access structures (indexes) are different, i.e., B-tree vs. T-tree…

11 Data representation Main memory databases can take advantage of efficient pointer following for data representation

12 A Study of Index Structures for Main Memory Database Management Systems
Tobin J. Lehman, Michael J. Carey, VLDB 1986

13 Disk versus In-Memory Primary goals for a disk-oriented index structure design: Minimize the number of disk accesses Minimize disk space Primary goals of an In Memory index design: Reduce overall computation time Using as little memory as possible

14 Classic index structures
Arrays: A: use minimal space, providing that the size is known in advance D: impractical for anything but a read-only environment AVL Trees: Balanced binary search tree The tree is kept balanced by executing rotation operations when needed A: fast search D: poor storage utilization
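The array trade-off above can be sketched in a few lines. This is a minimal illustration, not code from the paper; the class name `SortedArrayIndex` and the sample keys are hypothetical. Lookups use binary search, while every insert has to shift the elements behind the insertion point.

```python
import bisect


class SortedArrayIndex:
    """Hypothetical read-mostly index: all keys live in one sorted list."""

    def __init__(self, keys):
        self.keys = sorted(keys)

    def contains(self, key):
        # Binary search: O(log n) comparisons and no per-key pointers.
        i = bisect.bisect_left(self.keys, key)
        return i < len(self.keys) and self.keys[i] == key

    def insert(self, key):
        # insort shifts every element after the insertion point (O(n) moves),
        # which is why arrays suit read-only workloads best.
        bisect.insort(self.keys, key)


index = SortedArrayIndex([40, 10, 30, 20])
index.insert(25)
print(index.keys, index.contains(25))
```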

15 Classic index structures (cont)
B trees: Every node contains some ordered data items and pointers Good storage utilization Searching is reasonably fast Updating is also fast

16 Hash-based indexing Chained Bucket Hashing:
Static structure, used both in memory and on disk A: fast, if proper table size is known D: poor behavior in a dynamic environment Extendible (Dynamic) Hashing: Dynamic hash table that grows with data A hash node contains several data items and splits in two when an overflow occurs Directory grows in powers of two when a node overflows and has reached the max depth for a particular directory size
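The extendible hashing behavior described above (nodes split on overflow, directory doubles when the overflowing node is already at maximum depth) can be sketched as follows. This is a simplified in-memory illustration under stated assumptions: the names `Bucket` and `ExtendibleHash` and the default capacity are hypothetical, Python's built-in `hash` stands in for the hash function, and the sketch assumes that no more than `capacity` keys share an identical hash value.

```python
class Bucket:
    def __init__(self, depth, capacity):
        self.depth = depth        # local depth: hash bits this bucket depends on
        self.capacity = capacity
        self.items = {}


class ExtendibleHash:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.global_depth = 1
        self.directory = [Bucket(1, capacity), Bucket(1, capacity)]

    def _slot(self, key):
        # The low global_depth bits of the hash select a directory entry.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        return self.directory[self._slot(key)].items.get(key)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._slot(key)]
            if key in bucket.items or len(bucket.items) < bucket.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)   # overflow: split, then retry the insert

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # Bucket is at max depth for this directory size: double it.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth, self.capacity)
        high_bit = 1 << (bucket.depth - 1)
        # Entries whose newly significant bit is 1 now point to the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = sibling
        # Redistribute the old contents between the two buckets.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            self.directory[self._slot(k)].items[k] = v


table = ExtendibleHash(capacity=2)
for k in range(16):
    table.put(k, k * k)
print(table.global_depth, table.get(7))
```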

17 Hash-based indexing (cont)
Linear Hashing: Uses a dynamic hash table Nodes are split in predefined linear order Buckets can be ordered sequentially, allowing the bucket address to be calculated from a base address The event that triggers a node split can be based on storage utilization Modified Linear Hashing: More oriented towards main memory Uses a directory which grows linearly Chained single-item nodes The splitting criterion is based on the average length of the hash chains
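A rough sketch of the linear splitting order follows. It is an assumption-laden illustration rather than the paper's implementation: the class name `LinearHash`, the parameter `max_avg_chain`, and the use of Python lists as chains are all hypothetical. The split trigger here is the average chain length, in the spirit of the modified variant described above.

```python
class LinearHash:
    def __init__(self, initial_buckets=2, max_avg_chain=2.0):
        self.n0 = initial_buckets      # bucket count before any doubling
        self.level = 0                 # completed doubling rounds
        self.next = 0                  # next bucket to split, in linear order
        self.buckets = [[] for _ in range(initial_buckets)]
        self.count = 0
        self.max_avg_chain = max_avg_chain

    def _address(self, key):
        h = hash(key)
        i = h % (self.n0 << self.level)
        if i < self.next:
            # This bucket was already split in the current round,
            # so use the hash function of the next level.
            i = h % (self.n0 << (self.level + 1))
        return i

    def get(self, key):
        for k, v in self.buckets[self._address(key)]:
            if k == key:
                return v
        return None

    def put(self, key, value):
        chain = self.buckets[self._address(key)]
        for j, (k, _) in enumerate(chain):
            if k == key:
                chain[j] = (key, value)
                return
        chain.append((key, value))
        self.count += 1
        # Split criterion: average chain length over all buckets.
        if self.count / len(self.buckets) > self.max_avg_chain:
            self._split()

    def _split(self):
        old = self.buckets[self.next]
        self.buckets[self.next] = []
        self.buckets.append([])        # image bucket lands at the end
        self.next += 1
        if self.next == self.n0 << self.level:
            self.level += 1            # round finished: table has doubled
            self.next = 0
        for k, v in old:
            self.buckets[self._address(k)].append((k, v))


lh = LinearHash()
for k in range(100):
    lh.put(k, str(k))
print(len(lh.buckets), lh.get(42))
```

Only the bucket at position `next` is split, regardless of which bucket overflowed, which is what keeps the address calculation a simple function of `level` and `next`.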

18 The T-tree A binary tree with many elements kept in order in a node (evolved from AVL tree and B tree) Intrinsic binary search nature Good update and storage characteristics Every tree has an associated minimum and maximum count Internal nodes (nodes with two children) keep their occupancy in the range given by min and max count

19 The T-tree Proceedings of the Twelfth International Conference on Very Large Data Bases, Kyoto, August, 1986

20 Search algorithm for T-tree
Similar to searching in a binary tree.
Algorithm:
  Start at the root of the tree.
  Loop:
    If the search value is less than the minimum value of the node, then search down the left sub-tree.
    If the search value is greater than the maximum value in the node, then search the right sub-tree.
    Else search the current node.
The search fails when a node is searched and the item is not found, or when a node that bounds the search value cannot be found.
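The search loop can be sketched as below. The `TNode` class, the function name `ttree_search`, and the hand-built three-node tree are hypothetical and only illustrate the steps listed on this slide.

```python
import bisect


class TNode:
    def __init__(self, keys, left=None, right=None):
        self.keys = keys          # sorted keys stored in this node
        self.left = left
        self.right = right


def ttree_search(node, key):
    while node is not None:
        if key < node.keys[0]:        # below the node minimum: go left
            node = node.left
        elif key > node.keys[-1]:     # above the node maximum: go right
            node = node.right
        else:
            # Bounding node found: binary search inside the node.
            i = bisect.bisect_left(node.keys, key)
            return node.keys[i] == key
    return False                      # no bounding node exists


root = TNode([40, 45, 50],
             left=TNode([10, 20, 30]),
             right=TNode([60, 70, 80]))
print(ttree_search(root, 45), ttree_search(root, 47), ttree_search(root, 35))
```

The two failure modes from the slide both appear here: 47 is bounded by the root but not stored in it, and 35 has no bounding node at all.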

21 Insert algorithm
Insert (x):
  Search to locate the bounding node.
  If a bounding node is found:
    Let a be this node.
    If the value fits, then insert it into a and STOP.
    Else remove the minimum element amin from the node, insert x, then go to the leaf containing the greatest lower bound for a and insert amin into this leaf.

22 Insert algorithm (cont)
  If a bounding node is not found:
    Let a be the last node on the search path.
    If the insert value fits, then insert it into the node.
    Else create a new leaf with x in it.
  If a new leaf was added:
    For each node in the search path (from leaf to root): if the two sub-trees' heights differ by more than one, then rotate and STOP.
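The insert steps from the two slides above can be sketched as follows. This is a simplified illustration with loudly stated assumptions: `TNode`, `insert`, `inorder`, and `MAX_COUNT = 4` are hypothetical names and values, keys are assumed distinct, and the final rebalancing pass (the rotations) is deliberately omitted, so the tree stays ordered but is not kept balanced.

```python
import bisect

MAX_COUNT = 4   # hypothetical maximum number of keys per node


class TNode:
    def __init__(self, keys):
        self.keys = keys
        self.left = None
        self.right = None


def insert(root, x):
    """Insert x and return the root. Rebalancing rotations are omitted."""
    if root is None:
        return TNode([x])
    node, last = root, None
    while node is not None:           # search for the bounding node
        last = node
        if x < node.keys[0]:
            node = node.left
        elif x > node.keys[-1]:
            node = node.right
        else:
            break
    if node is not None:              # bounding node found
        if len(node.keys) < MAX_COUNT:
            bisect.insort(node.keys, x)
            return root
        a_min = node.keys.pop(0)      # overflow: evict the minimum element
        bisect.insort(node.keys, x)
        if node.left is None:
            node.left = TNode([a_min])
            return root
        glb = node.left               # greatest lower bound lives in the
        while glb.right is not None:  # rightmost node of the left sub-tree
            glb = glb.right
        if len(glb.keys) < MAX_COUNT:
            glb.keys.append(a_min)    # a_min exceeds every key in glb
        else:
            glb.right = TNode([a_min])
        return root
    # No bounding node: 'last' is the last node on the search path.
    if len(last.keys) < MAX_COUNT:
        bisect.insort(last.keys, x)
    elif x < last.keys[0]:
        last.left = TNode([x])
    else:
        last.right = TNode([x])
    return root


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + node.keys + inorder(node.right)


keys = [50, 20, 80, 30, 60, 10, 70, 40, 90, 55, 25, 65, 35, 45, 15]
root = None
for k in keys:
    root = insert(root, k)
print(inorder(root))
```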

23 Delete algorithm
(1) Search for the node that bounds the delete value; search for the delete value within this node, reporting an error and stopping if it is not found.
(2) If the delete will not cause an underflow, then delete the value and STOP. Else, if this is an internal node, then delete the value and 'borrow' the greatest lower bound. Else delete the element.
(3) If the node is a half-leaf and can be merged with a leaf, do it, and go to (5).

24 Delete algorithm (cont)
(4) If the current node (a leaf) is not empty, then STOP. Else free the node and go to (5).
(5) For every node along the path from the leaf up to the root, if the two sub-trees of the node differ in height by more than one, then perform a rotation operation. STOP when all nodes have been examined or a node with even balance has been discovered.

25 LL Rotation Left rotation

26 LR Rotation
[Figure: LR rotation of nodes A, B, C with sub-trees Ar, Bl, Cl, Cr, carried out as a left rotation and a right rotation]
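Since the rotation figures did not survive in this transcript, here is a small sketch of the two cases on plain binary nodes. It assumes the usual AVL naming (LL: the left child's left sub-tree is too tall; LR: the left child's right sub-tree is too tall); the `Node` class and the function names are hypothetical, and the T-tree-specific handling of keys inside nodes is not shown.

```python
class Node:
    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right


def rotate_right(a):
    """Promote a's left child; a becomes that child's right child."""
    b = a.left
    a.left = b.right
    b.right = a
    return b


def rotate_left(a):
    """Promote a's right child; a becomes that child's left child."""
    c = a.right
    a.right = c.left
    c.left = a
    return c


def rotate_ll(a):
    # LL case: a single rotation promotes the left child.
    return rotate_right(a)


def rotate_lr(a):
    # LR case: rotate the left child to the left, then rotate a to the right.
    a.left = rotate_left(a.left)
    return rotate_right(a)


ll_root = rotate_ll(Node("A", left=Node("B", left=Node("C"))))
lr_root = rotate_lr(Node("A", left=Node("B", right=Node("C"))))
print(ll_root.label, lr_root.label)
```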

27 Summary We introduced a new In Memory index structure, the T-tree
For unordered data, Modified Linear Hashing should give excellent performance for exact match queries For ordered data, the T-tree provides excellent overall performance for a mix of searches, inserts and deletes, and it does so at a relatively low cost in storage space

28 But… Even if the T-trees have more keys in each node, only the two end keys are actually used for comparison Since for every key in node we store a pointer to the record, and most of the time the record pointers are not used, the space is ‘wasted’

29 Column-Oriented Database
C-Store: A Column-Oriented RDBMS; Michael Stonebraker

30 Traditional Row-Oriented Database
Store fields in one record contiguously on disk Use B-tree indexing Use small (e.g. 4K) disk blocks Align fields on byte or word boundaries Conventional (row-oriented) query optimizer and executor (technology from 1979) Aries-style transactions

31 Terminology -- “Row Store”
Record 1 Record 2 Record 3 Record 4 E.g. DB2, Oracle, Sybase, SQLServer, …

32 Row-Stores are Write Optimized
Can insert, update, and delete (IUD) a record in one physical write Good for On-Line Transaction Processing (OLTP) But not for read-mostly applications Data warehouses CRM

33 Elephants Have Extended Row Stores
With Bitmap indices Better sequential read Integration of “data cube” products Materialized views But there may be a better idea…….

34 Column Stores

35 At 100K Feet…. Ad-hoc queries read 2 columns out of 20
In a very large warehouse, Fact table is rarely clustered correctly Column store reads 5-10% of what a row store reads
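The 2-out-of-20 arithmetic can be made concrete with a toy model that counts how many stored values each layout touches. The table contents, sizes, and function names below are hypothetical, and real systems read blocks rather than individual values, so this only illustrates the ratio.

```python
NUM_ROWS, NUM_COLS = 1000, 20

# The same toy table stored two ways: record by record, and column by column.
rows = [[r * NUM_COLS + c for c in range(NUM_COLS)] for r in range(NUM_ROWS)]
columns = [[row[c] for row in rows] for c in range(NUM_COLS)]


def scan_row_store(rows, wanted):
    touched, out = 0, []
    for row in rows:
        touched += len(row)           # the whole record is read
        out.append(tuple(row[c] for c in wanted))
    return out, touched


def scan_column_store(columns, wanted):
    touched = sum(len(columns[c]) for c in wanted)   # only requested columns
    out = list(zip(*(columns[c] for c in wanted)))
    return out, touched


row_result, row_touched = scan_row_store(rows, [3, 7])
col_result, col_touched = scan_column_store(columns, [3, 7])
print(row_touched, col_touched, col_touched / row_touched)
```

With 2 of 20 columns requested, the column layout touches one tenth of the values, which is the upper end of the 5-10% range quoted on the slide.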

36 C-Store (Column Store) Project
Brandeis/Brown/MIT/UMass-Boston project Usual suspects participating Enough coded to get performance numbers for some queries Complete status later Pioneering Work Sybase IQ (early ’90s) MonetDB (see CIDR ’05 for the most recent description)

37 C-Store Technical Ideas
Code the columns to save space No alignment Big disk blocks Only materialized views (perhaps many) Focus on Sorting not indexing Automatic physical DBMS design Optimize for grid computing Innovative redundancy Xacts – but no need for Mohan Data ordered on anything, Not just time Compression Column optimizer and executor

38 No Alignment Dense pack columns
E.g. a 5 bit field takes 5 bits Current CPU speed going up faster than disk bandwidth Faster to shift data in CPU than to waste disk bandwidth
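As a sketch of dense packing, the functions below store each value in exactly `bits` bits with no padding between fields. The function names and sample values are hypothetical; the point is only that eight 5-bit fields fit in 5 bytes, at the price of shift-and-mask work on the CPU when reading them back.

```python
def pack(values, bits):
    """Pack non-negative integers into consecutive bit fields, no alignment."""
    acc = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << bits):
            raise ValueError(f"{v} does not fit in {bits} bits")
        acc |= v << (i * bits)
    nbytes = (len(values) * bits + 7) // 8    # round up to whole bytes
    return acc.to_bytes(nbytes, "little")


def unpack(data, bits, count):
    """Recover the fields with shifts and masks."""
    acc = int.from_bytes(data, "little")
    mask = (1 << bits) - 1
    return [(acc >> (i * bits)) & mask for i in range(count)]


values = [3, 31, 0, 17, 9, 22, 5, 1]
packed = pack(values, bits=5)
print(len(packed), unpack(packed, 5, len(values)))
```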

39 Big Disk Blocks Tunable Big (minimum size is 64K)

40 Only Materialized Views
Projection (materialized view) is some number of columns from a fact table Plus columns in a dimension table – with a 1-n join between Fact and Dimension table Stored in order of a storage key(s) Several may be stored! With a permutation, if necessary, to map between them Table (as the user specified it and sees it) is not stored! No secondary indexes (they are a one column sorted MV plus a permutation, if you really want one)

41 Example
User view: EMP (name, age, salary, dept), Dept (dname, floor)
Possible set of MVs: MV-1 (name, dept, floor) in floor order; MV-2 (salary, age) in age order; MV-3 (dname, salary, name) in salary order
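To make the example concrete, the sketch below derives the three MVs from a tiny made-up instance of the user view. The employee and department rows are hypothetical sample data, and plain sorted lists of tuples stand in for C-Store's column-wise storage of each projection.

```python
# Hypothetical sample data for the user view.
emp = [  # (name, age, salary, dept)
    ("Ann", 31, 90, "Sales"),
    ("Bob", 45, 70, "Eng"),
    ("Cy", 28, 80, "Eng"),
]
dept_floor = {"Sales": 2, "Eng": 1}   # Dept (dname, floor)

# MV-1 (name, dept, floor) in floor order: needs the EMP-Dept join.
mv1 = sorted(((name, d, dept_floor[d]) for name, _, _, d in emp),
             key=lambda t: t[2])

# MV-2 (salary, age) in age order.
mv2 = sorted(((salary, age) for _, age, salary, _ in emp),
             key=lambda t: t[1])

# MV-3 (dname, salary, name) in salary order.
mv3 = sorted(((d, salary, name) for name, _, salary, d in emp),
             key=lambda t: t[1])

print(mv1, mv2, mv3, sep="\n")
```

Every column of the user view appears in at least one MV, so the original table can be reconstructed without ever being stored itself.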

42 Automatic Physical DBMS Design
Not enough 4-star wizards to go around Accept a “training set” of queries and a space budget Choose the MVs auto-magically Re-optimize periodically based on a log of the interactions

43 Optimize for Grid Computing
I.e. shared-nothing Dewitt (Gamma) was right Horizontal partitioning and intra-query parallelism as in Gamma

44 Innovative Redundancy
Hardly any warehouse is recovered by a redo from the log Takes too long! Store enough MVs at enough places to ensure K-safety Rebuild dead objects from elsewhere in the network K-safety is a DBMS-design problem!

45 XACTS – No C. Mohan Undo from a log (that does not need to be persistent) Redo by rebuild from elsewhere in the network Snapshot isolation (run queries as of a tunable time in the recent past) To solve read-write conflicts Distributed Xacts Without a prepare message (no 2 phase commit)

46 Storage (sort) Key(s) is not Necessarily Time
That would be too limiting So how to do fast updates to dense pack column storage that is not in entry sequence?

47 Solution – a Hybrid Store (H-Store) http://www. vldb2005
Write-optimized I/U store (much like MonetDB) -> Tuple mover (batch rebuilder) -> Read-optimized column store (what we have been talking about so far)
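The three components can be sketched in miniature as below. This is a toy model under stated assumptions, not C-Store's design: the class `HybridStore` and its method names are hypothetical, the read-optimized side is two parallel sorted lists, and the tuple mover simply rebuilds those lists in one batch.

```python
import bisect


class HybridStore:
    def __init__(self):
        self.write_store = []   # (key, value) pairs in arrival order
        self.read_keys = []     # read-optimized side: sorted key column
        self.read_vals = []     # value column in the same order

    def insert(self, key, value):
        # Writes are cheap appends; the sorted columns are not touched.
        self.write_store.append((key, value))

    def move_tuples(self):
        # Tuple mover: batch-merge pending writes into the sorted columns.
        merged = sorted(list(zip(self.read_keys, self.read_vals)) + self.write_store,
                        key=lambda kv: kv[0])
        self.read_keys = [k for k, _ in merged]
        self.read_vals = [v for _, v in merged]
        self.write_store = []

    def range_query(self, lo, hi):
        # Queries consult both stores so that recent writes stay visible.
        i = bisect.bisect_left(self.read_keys, lo)
        j = bisect.bisect_right(self.read_keys, hi)
        hits = list(zip(self.read_keys[i:j], self.read_vals[i:j]))
        hits += [(k, v) for k, v in self.write_store if lo <= k <= hi]
        return sorted(hits, key=lambda kv: kv[0])


store = HybridStore()
for k, v in [(5, "e"), (1, "a"), (3, "c")]:
    store.insert(k, v)
before_move = store.range_query(1, 3)
store.move_tuples()
store.insert(2, "b")
after_move = store.range_query(1, 3)
print(before_move, after_move)
```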

48 Column Executor and Column Optimizer
Column Executor: Column operations, not row operations. Columns remain coded if possible. Late materialization of columns.
Column Optimizer: Chooses MVs on which to run the query (most important task). Build in snowflake schemas, which are simple to optimize without exhaustive search. Looking at extensions.
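One way to picture "columns remain coded" and late materialization is with run-length encoding on a sorted column. The sketch below is only an illustration with hypothetical function names and sample data; the slides do not say which coding schemes the executor uses for a given column.

```python
def rle_encode(column):
    """Run-length encode a column as (value, run_length) pairs."""
    runs = []
    for v in column:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]


def count_equal(runs, value):
    # Works directly on the coded column: no decompression needed.
    return sum(n for v, n in runs if v == value)


def positions_equal(runs, value):
    # Produces row positions only; other columns are fetched later.
    pos, out = 0, []
    for v, n in runs:
        if v == value:
            out.extend(range(pos, pos + n))
        pos += n
    return out


dept = ["Eng", "Eng", "Eng", "Sales", "Sales", "Ops"]
salary = [70, 80, 75, 90, 60, 50]

runs = rle_encode(dept)
sales_positions = positions_equal(runs, "Sales")
# Late materialization: touch the salary column only at matching positions.
sales_salaries = [salary[p] for p in sales_positions]
print(runs, count_equal(runs, "Eng"), sales_salaries)
```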

49 Performance 100X popular row store in 40% of the space
7X popular row store in 1/6th of the space Code available with BSD license

50 University Research Extension of algorithms to non-snowflake schemas
Study of L2 cache performance Study of coding strategies Study of executor options Study of recovery tactics Non-cursor interface Study of optimizer primitives

51 END

