Main memory DB PDT Ján GENČI
2 Contents
– Motivation
– DRDBMS (disk-resident DBMS)
– MMDBMS (main-memory DBMS)
– DRDBMS versus MMDBMS
– Commit processing
– Support in commercial systems
3 Motivation
For a long time, hard disks were the only technology that could store enough information to hold a database while also offering random access. Conventional database management systems were therefore tuned to get the most out of this technology. In recent years, however, main memory has become cheaper and larger, to the point that in some fields of application the entire database can be kept in main memory, which speeds up operation.
4 Disk-based DB
Disk storage is persistent.
Disk accesses carry a high fixed cost per access because of seek time.
To achieve good performance, accesses should:
– be sequential,
– transfer large amounts of needed data at once.
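
The effect of the fixed per-access cost can be illustrated with a small back-of-the-envelope model. This is only a sketch: the seek time, transfer rate, and page size below are illustrative assumptions, not figures from the slides, and `read_time_ms` is a hypothetical helper.

```python
# Hypothetical cost model: every figure below is an illustrative assumption.
SEEK_MS = 8.0            # assumed fixed positioning cost per disk access
TRANSFER_MB_PER_S = 150  # assumed sustained transfer rate
PAGE_KB = 8              # assumed page size


def read_time_ms(pages: int, accesses: int) -> float:
    """Estimated time to read `pages` pages using `accesses` separate disk accesses."""
    transfer_ms = pages * PAGE_KB / 1024 / TRANSFER_MB_PER_S * 1000
    return accesses * SEEK_MS + transfer_ms


# Same amount of data, fetched two ways.
random_ms = read_time_ms(pages=1000, accesses=1000)   # one seek per page
sequential_ms = read_time_ms(pages=1000, accesses=1)  # one seek, then stream

print(f"random: {random_ms:.1f} ms, sequential: {sequential_ms:.1f} ms")
```

Under these assumed numbers the transfer time is identical in both cases; the whole difference comes from paying the fixed cost once instead of once per page.
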
5 Main-memory-based DB
The access time of main memory is orders of magnitude smaller than that of disk storage.
The access time of main memory depends much less on the location of the data (fragmentation).
6 DRDBMS versus MMDBMS (1)
Both approaches process data in main memory, and both keep a (backup) copy on disk.
The key difference is that in an MMDBMS the primary copy of the database lives permanently in main memory, which has important implications for the algorithms and data structures used.
7 DRDBMS versus MMDBMS (2)
Even if the entire database of a DRDBMS is cached in main memory, it will not deliver the best performance, because a DRDBMS is not tuned for this case (every access still goes through the buffer manager).
Index structures in DRDBMSs are designed for disk access and may trade computing power and storage efficiency for a lower number of disk accesses, since disk access is the dominant factor in processing time in a DRDBMS.
This overhead is incurred even if all data is cached in main memory.
8 DRDBMS versus MMDBMS (3)
In MMDBMSs, data is guaranteed to stay present in main memory.
Index structures and all other parts of the database do not need to consider disk access.
Algorithms can be tuned for low computational cost.
9 DRDBMS
The most important optimization goals:
– to reduce the number of disk accesses,
– to prefer sequential access,
– to keep the processor busy while waiting for I/O.
To reduce the number of disk accesses:
– caching is used,
– special index structures (B+ tree) were developed,
– data is grouped on disk so that it can be fetched with a single sequential read.
A high degree of concurrency is employed to keep the processor busy while other transactions are waiting for I/O; for this reason a fine locking granularity, down to record-level locking, is used.
Each access has to go through the buffer manager, either to bring data into main memory or to make sure that the data is already there.
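
The role of the buffer manager described above can be sketched as follows. This is a toy illustration, not the design of any particular DBMS: the class `BufferManager`, the LRU replacement policy, and the `fake_disk_read` callback are all assumptions made for the example, and pinning and write-back of dirty pages are omitted.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer pool: maps page ids to in-memory pages with LRU eviction."""

    def __init__(self, capacity, read_page_from_disk):
        self.capacity = capacity
        self.read_page_from_disk = read_page_from_disk  # hypothetical I/O callback
        self.frames = OrderedDict()  # page_id -> page contents, oldest first
        self.hits = 0
        self.misses = 0

    def get_page(self, page_id):
        # Every access pays for this lookup, even when the page is cached.
        if page_id in self.frames:
            self.hits += 1
            self.frames.move_to_end(page_id)  # mark as most recently used
            return self.frames[page_id]
        self.misses += 1
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)  # evict the least recently used page
        page = self.read_page_from_disk(page_id)
        self.frames[page_id] = page
        return page


def fake_disk_read(page_id):
    # Stand-in for a real disk read; returns a page holding 4 records.
    return [f"record-{page_id}-{slot}" for slot in range(4)]


pool = BufferManager(capacity=2, read_page_from_disk=fake_disk_read)
for pid in [1, 2, 1, 3, 2]:
    pool.get_page(pid)
print(pool.hits, pool.misses)
```

Even in this simplified form, a cached page is reached only after a table lookup and replacement-policy bookkeeping, which is the kind of per-access overhead an MMDBMS can avoid by addressing data directly.
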
10 MMDBMS
Pointers can be used to address data (using pointers reduces the size of the index and the complexity of dealing with long or variable-length fields).
Because there is now an access-time gap between the on-processor cache and main memory, data layout becomes critical again (e.g., column data stores).
Index structures can be tuned for low consumption of main memory and processing power (no disk accesses; T-trees).
Performance is determined solely by the computational efficiency of the algorithms used (e.g., SSEx SIMD instructions, GPGPU).
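
The difference between a row layout and a column layout can be sketched as below. The sample records and variable names are made up for illustration, and Python itself does not expose cache behavior; the point is only to show how the same data is arranged in each layout.

```python
from array import array

# Row layout: each record keeps all of its fields together.
rows = [(1, "alice", 30.0), (2, "bob", 45.5), (3, "carol", 12.25)]

# Column layout: each attribute is stored contiguously in its own sequence.
ids = array("q", (r[0] for r in rows))      # 64-bit integers
names = [r[1] for r in rows]
amounts = array("d", (r[2] for r in rows))  # 64-bit floats, packed back to back

# Scanning one attribute only touches that column's contiguous memory.
total_from_columns = sum(amounts)

# The row layout has to visit every record to pull out the same field.
total_from_rows = sum(r[2] for r in rows)

print(total_from_columns, total_from_rows)
```

In a column layout a scan over one attribute reads densely packed values, so more useful data fits into each cache line than when whole records are traversed.
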
11 T-tree
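
A T-tree is a balanced binary tree in which each node holds a sorted run of several keys, so a search compares against the smallest and largest key of a node before looking inside it. The following is a simplified sketch of the search step only, under stated assumptions: `TNode` and `t_tree_search` are hypothetical names, insertion and rebalancing are omitted, and keys are stored directly where a real T-tree would typically store pointers to records.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TNode:
    keys: List[int]                  # sorted keys held in this node
    left: Optional["TNode"] = None   # subtree with keys below keys[0]
    right: Optional["TNode"] = None  # subtree with keys above keys[-1]


def t_tree_search(node: Optional[TNode], key: int) -> bool:
    """Return True if `key` is stored somewhere in the tree rooted at `node`."""
    while node is not None:
        if key < node.keys[0]:
            node = node.left           # below this node's range
        elif key > node.keys[-1]:
            node = node.right          # above this node's range
        else:
            # Key falls inside this node's range: binary search within the node.
            i = bisect_left(node.keys, key)
            return i < len(node.keys) and node.keys[i] == key
    return False


# Hand-built example tree (no rebalancing logic is shown here).
root = TNode([40, 45, 50], left=TNode([10, 20, 30]), right=TNode([60, 70, 80]))

print(t_tree_search(root, 45), t_tree_search(root, 47))
```

Because each node covers a range of keys, the tree stays shallow, and only two comparisons per node are needed on the way down.
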
12 Commit Processing – recovery
A commit must be able to guarantee persistence.
Since data in main memory is volatile, some mechanism must exist that protects the data in case of a failure:
– logging to disk (a bottleneck),
– solid-state disks,
– backed-up DRAM,
– logging over a network.
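
Logging to disk can be sketched as an append-only redo log that is forced to stable storage before a commit is acknowledged. The example below is a minimal sketch under stated assumptions: `RedoLog`, `replay`, and the JSON record format are hypothetical. It also batches several commits into one sync (often called group commit), a common way to soften the logging bottleneck that is not taken from the slides.

```python
import json
import os
import tempfile


class RedoLog:
    """Toy append-only redo log for an in-memory key/value table."""

    def __init__(self, path):
        self.path = path
        self.file = open(path, "a", encoding="utf-8")
        self.pending = 0  # commit records written but not yet forced to disk
        self.syncs = 0    # number of (expensive) disk syncs performed

    def log_commit(self, txn_id, changes):
        record = {"txn": txn_id, "changes": changes}
        self.file.write(json.dumps(record) + "\n")
        self.pending += 1

    def force(self):
        """Make all buffered commit records durable with a single sync."""
        if self.pending:
            self.file.flush()
            os.fsync(self.file.fileno())
            self.syncs += 1
            self.pending = 0


def replay(path):
    """Rebuild the in-memory table from the log after a failure."""
    table = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            table.update(json.loads(line)["changes"])
    return table


log_path = os.path.join(tempfile.mkdtemp(), "redo.log")
log = RedoLog(log_path)
log.log_commit(1, {"a": 1})
log.log_commit(2, {"b": 2, "a": 5})
log.force()  # two transactions made durable by one disk sync
recovered = replay(log_path)
log.file.close()
print(recovered, log.syncs)
```

The trade-off in this sketch is latency versus throughput: a transaction waits slightly longer for its acknowledgement, but the cost of each sync is shared among all commits in the batch.
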
13 Commit Processing – backup
Main memory is volatile and its contents are lost in case of a power failure, so a backup copy of the database must exist.
The obvious solution is to back up data to disk(s).
The backup must be kept up to date; the commonly used mechanism is checkpointing.
Some problems do not arise in DRDBMSs:
– How and where should data be stored?
– How should pointers be treated?
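
One possible answer to the pointer question is to replace in-memory pointers with stable record identifiers when a checkpoint is written, and to turn them back into pointers when the checkpoint is loaded. The sketch below illustrates that idea only; the `Record` class, the `checkpoint` and `restore` functions, and the JSON snapshot format are assumptions made for the example, not a method described in the slides.

```python
import json


class Record:
    """In-memory record that refers to another record by direct pointer."""

    def __init__(self, rid, value, next_record=None):
        self.rid = rid                  # stable record id
        self.value = value
        self.next_record = next_record  # in-memory pointer (object reference)


def checkpoint(records):
    """Serialize records, replacing each pointer with the target's record id."""
    return json.dumps([
        {
            "rid": r.rid,
            "value": r.value,
            "next": r.next_record.rid if r.next_record else None,
        }
        for r in records
    ])


def restore(snapshot):
    """Rebuild records, then turn the stored ids back into pointers."""
    raw = json.loads(snapshot)
    by_id = {d["rid"]: Record(d["rid"], d["value"]) for d in raw}
    for d in raw:
        if d["next"] is not None:
            by_id[d["rid"]].next_record = by_id[d["next"]]
    return by_id


b = Record(2, "second")
a = Record(1, "first", next_record=b)
restored = restore(checkpoint([a, b]))
print(restored[1].next_record.value)
```

Memory addresses are not stable across restarts, which is why the snapshot stores identifiers rather than raw pointers; the second pass in `restore` is needed because a record may point to one that appears later in the snapshot.
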
14 ORACLE - TimesTen
15 DB2 – SolidDB
SolidDB is an in-memory relational database that can serve as a high-performance front end to either DB2 or Informix Dynamic Server.
16 End of presentation