Published by Linda Singleton. Modified over 8 years ago.
1
Storage Tuning for Relational Databases Philippe Bonnet – Spring 2015
2
Agenda
Disk abstraction
– Unix file system
Row store
– Extent-based allocation
– Page layout
– Row layout
Column store
– Virtual IDs
– PAX model
3
File Layer How to represent files?
4
Inode Name Layer How to avoid carrying inodes around?
5
File Name Layer How about user-friendly names?
6
File Name Layer Representing directories
7
Path Name Layer Hierarchy of Directories
8
Path Name Layer
How about flexible management of files?
How to avoid cycles in a path?
How to rename a file?
9
Absolute Path Name Layer How to name a file regardless of the current working directory?
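The naming layers introduced so far can be read as a chain of lookups: a file name is mapped to an inode number within one directory, a path name repeats that lookup per component, and an absolute path name starts the walk at the root. A minimal Python sketch of this idea, assuming a toy in-memory inode table (`INODES`, `ROOT_INODE`, `lookup` and `resolve` are hypothetical names chosen for illustration, not taken from the slides):

```python
# Toy inode table: inode number -> directory (dict of name -> inode number)
# or file contents (bytes). The layout is an assumption for illustration.
ROOT_INODE = 1  # assumption: the root directory has a well-known inode number

INODES = {
    1: {"home": 2},         # "/"
    2: {"alice": 3},        # "/home"
    3: {"notes.txt": 4},    # "/home/alice"
    4: b"storage tuning",   # a regular file
}

def lookup(dir_inode, name):
    """File name layer: map a name to an inode number inside one directory."""
    directory = INODES[dir_inode]
    if not isinstance(directory, dict):
        raise NotADirectoryError(name)
    if name not in directory:
        raise FileNotFoundError(name)
    return directory[name]

def resolve(path, cwd_inode=ROOT_INODE):
    """Path name layers: walk the components one by one.

    Absolute paths start at the root; relative paths start at cwd_inode.
    """
    inode = ROOT_INODE if path.startswith("/") else cwd_inode
    for component in path.split("/"):
        if component:  # skip empty components from leading or doubled slashes
            inode = lookup(inode, component)
    return inode
```

With this table, `resolve("/home/alice/notes.txt")` and `resolve("alice/notes.txt", cwd_inode=2)` reach the same inode, which is the point of the absolute path name layer: the result no longer depends on the working directory.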
10
Unix File System Naming Scheme Disk Layout for a file system
11
Symbolic Link Layer
How to attach new disks to a file system?
Figure labels:
(1) represents a file system
(2) device and root inode for the given file system
Inode pinned in memory for floppy
Inode pinned in memory for /dev/fd1
(1) name of parent inode, i.e., floppy
12
Symbolic Link Layer How to create links across file systems (where the inode numbers are not unique)?
13
Naming Layers in Unix File System
14
Putting it all together: inode
15
API Calls: Open f_table fd_table
16
API Calls: Read f_table fd_table
17
What Would a DB Designer Do?
Similarities with FS:
- name mapping (from table, attributes at API level, array of bytes at disk level)
- quantized IOs (block device abstraction of secondary storage)
Differences from FS:
- Structured Data
  – A Table is a multiset of records
  – Indexed access
Using SQL Server v7 as Example
18
Storage Architecture

1. Row Store

rowid  Att1  Att2  Att3    Att4
1      A     098   zerher  P
2      ...
3-8

2. Column Store

id   Att1
1    A
2    ...
3-9

id   Att2
1    098
2-9

id   Att3
1    zerher
2-9

id   Att4
1    P
2-9
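The two architectures hold the same data and differ only in how it is grouped. A small Python sketch of the decomposition from a row store into one (id, value) table per attribute; the second row is made up for illustration, and `to_column_store` is a hypothetical helper name:

```python
# Row store: one tuple per row, all attributes together.
rows = [
    (1, "A", "098", "zerher", "P"),   # first row, as shown on the slide
    (2, "B", "099", "qsdf", "Q"),     # made-up second row for illustration
]
attributes = ["Att1", "Att2", "Att3", "Att4"]

def to_column_store(rows, attributes):
    """Return one list of (id, value) pairs per attribute."""
    columns = {name: [] for name in attributes}
    for row in rows:
        row_id, values = row[0], row[1:]
        for name, value in zip(attributes, values):
            columns[name].append((row_id, value))
    return columns

columns = to_column_store(rows, attributes)
```

A query that touches only `Att3` reads `columns["Att3"]` and nothing else, whereas in the row store it would read every attribute of every row.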
19
Pages
Structure page { block contents[PAGE_SIZE]; }
integer PAGE_SIZE = N; // page size in blocks: N = 16 for an 8KB page and 512B disk sectors
integer PAGE_SIZE_IN_BYTES = 8 * 1024;
Procedure PAGE_ID_TO_PAGE (integer page_id) returns instance of page {
  offset = page_id * PAGE_SIZE; // number of the first block of the page
  instance of page p;
  for i from 0 to PAGE_SIZE - 1 {
    p.contents[i] = BLOCK_NUMBER_TO_BLOCK(offset + i);
  }
  return p;
}
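The pseudocode above maps a page to PAGE_SIZE consecutive blocks. A runnable Python sketch of the same mapping, assuming the block device is simulated by a `bytes` object held in memory (the `device` argument and the lowercase function names are assumptions for illustration):

```python
BLOCK_SIZE_IN_BYTES = 512
PAGE_SIZE = 16  # blocks per page, as on the slide
PAGE_SIZE_IN_BYTES = PAGE_SIZE * BLOCK_SIZE_IN_BYTES  # 8 KB

def block_number_to_block(device, block_number):
    """Return one 512-byte block from the simulated device."""
    start = block_number * BLOCK_SIZE_IN_BYTES
    return device[start:start + BLOCK_SIZE_IN_BYTES]

def page_id_to_page(device, page_id):
    """Assemble a page from PAGE_SIZE consecutive blocks."""
    offset = page_id * PAGE_SIZE  # number of the first block of the page
    blocks = [block_number_to_block(device, offset + i) for i in range(PAGE_SIZE)]
    return b"".join(blocks)

# Simulated device holding 3 pages; page k is filled with the byte value k.
device = b"".join(bytes([k]) * PAGE_SIZE_IN_BYTES for k in range(3))
page = page_id_to_page(device, 2)
```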
20
Database Files
Extent-based allocation
1 Extent = 8 pages
Mixed/Uniform extents
GAM: bitmap over 64000 extents. Is extent allocated?
SGAM: bitmap over 64000 extents. Is extent mixed and has at least 1 unused page?
PFS: page over 8000 pages, 1B per page. How much is page used?
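The two bitmaps answer two allocation questions: where is a free extent, and where is a mixed extent that still has room for a single page. A minimal Python sketch under stated assumptions: the bitmaps are plain lists of booleans whose meaning follows the questions on the slide (the on-disk bit encoding in SQL Server may differ), and `ExtentAllocator` with its methods is a hypothetical name for illustration.

```python
PAGES_PER_EXTENT = 8

def extent_of(page_id):
    """Extent number that contains a given page."""
    return page_id // PAGES_PER_EXTENT

class ExtentAllocator:
    def __init__(self, n_extents):
        self.gam = [False] * n_extents    # True: extent is allocated
        self.sgam = [False] * n_extents   # True: mixed extent with >= 1 unused page
        self.used_pages = [0] * n_extents # bookkeeping for this sketch only

    def allocate_uniform_extent(self):
        """Give a whole free extent to one object; return the extent number."""
        extent = self.gam.index(False)  # raises ValueError if no extent is free
        self.gam[extent] = True
        self.used_pages[extent] = PAGES_PER_EXTENT
        return extent

    def allocate_mixed_page(self):
        """Allocate a single page from a mixed extent; return its page id."""
        if True in self.sgam:
            extent = self.sgam.index(True)   # reuse a mixed extent with room
        else:
            extent = self.gam.index(False)   # otherwise open a new mixed extent
            self.gam[extent] = True
            self.sgam[extent] = True
        page_id = extent * PAGES_PER_EXTENT + self.used_pages[extent]
        self.used_pages[extent] += 1
        if self.used_pages[extent] == PAGES_PER_EXTENT:
            self.sgam[extent] = False        # extent is now full
        return page_id
```

Allocating nine single pages in a row fills extent 0 and then opens extent 1 as a second mixed extent, so small tables share extents instead of each wasting most of one.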
21
Representing Tables How to store this data? Bootstrapping Problem!!
22
Finding Data Pages
23
Row store Page Layout
structure record_id {
  integer page_id;
  integer row_id;
}
procedure RECORD_ID_TO_BYTES(record_id rid) returns bytes {
  pid = rid.page_id;
  p = PAGE_ID_TO_PAGE(pid);
  byte byte_array[PAGE_SIZE_IN_BYTES];
  byte_array = p.contents;
  byte_address = byte_array + PAGE_SIZE_IN_BYTES - 1; // last byte of the page
  row_address = byte_address - rid.row_id * 2; // each address entry is 2B
  return RECORD_ADDRESS_TO_BYTES(row_address);
}
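The layout above keeps an array of 2-byte row addresses at the end of the page, indexed by row_id, while row data grows from the front. A minimal Python sketch of such a slotted page, under assumptions that are not on the slide: a 4-byte header holding the row count and the free-space offset, rows that are never deleted (so a row ends where the next one starts), and hypothetical function names.

```python
import struct

PAGE_SIZE_IN_BYTES = 8 * 1024
HEADER_SIZE = 4  # assumption: 2B row count + 2B free-space offset
SLOT_SIZE = 2    # each row address entry is 2 bytes, as on the slide

def new_page():
    page = bytearray(PAGE_SIZE_IN_BYTES)
    struct.pack_into("<HH", page, 0, 0, HEADER_SIZE)  # 0 rows, data starts after header
    return page

def slot_position(row_id):
    """Byte position of the address entry for row_id, counted from the page end."""
    return PAGE_SIZE_IN_BYTES - SLOT_SIZE * (row_id + 1)

def insert_row(page, row):
    """Append row bytes at the free offset and record the address in a new slot."""
    n_rows, free = struct.unpack_from("<HH", page, 0)
    if free + len(row) > slot_position(n_rows):
        raise ValueError("page full")
    page[free:free + len(row)] = row
    struct.pack_into("<H", page, slot_position(n_rows), free)
    struct.pack_into("<HH", page, 0, n_rows + 1, free + len(row))
    return n_rows  # the new row_id

def read_row(page, row_id):
    """Follow the slot entry to the row; the row ends at the next row or free offset."""
    n_rows, free = struct.unpack_from("<HH", page, 0)
    if not 0 <= row_id < n_rows:
        raise IndexError(row_id)
    (start,) = struct.unpack_from("<H", page, slot_position(row_id))
    if row_id + 1 < n_rows:
        (end,) = struct.unpack_from("<H", page, slot_position(row_id + 1))
    else:
        end = free
    return bytes(page[start:end])
```

The indirection through the slot array is what lets a record_id stay stable when a row moves inside its page: only the 2-byte entry changes.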
24
Record Structure Procedure column_id_to_bytes return bytes
25
Storing Large Attributes
26
Inserting Data

CREATE TABLE Variable (
  Col1 char(3) NOT NULL,
  Col2 varchar(250) NOT NULL,
  Col3 varchar(5) NULL,
  Col4 varchar(20) NOT NULL,
  Col5 smallint NULL)

syscolumns:
name colid xtype length xoffset
---- ----- ----- ------ -------
Col1 1     175   3      4
Col2 2     167   250    -1
Col3 3     167   5      -2
Col4 4     167   20     -3
Col5 5     52    2      7

INSERT Variable VALUES ('AAA', REPLICATE('X',250), NULL, 'ABC', 123)

sysindexes:
id         name     indid first          minlen
---------- -------- ----- -------------- ------
1333579789 Variable 0     0xC90000000100 9
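The xoffset and minlen values in the catalog output follow a simple pattern: fixed-length columns get increasing positive byte offsets, and variable-length columns get negative ordinals. A Python sketch that reproduces the numbers above, assuming a 4-byte row header before the fixed-length data (inferred from Col1's xoffset of 4); `compute_offsets` is a hypothetical helper, not a SQL Server function.

```python
ROW_HEADER_SIZE = 4  # assumption inferred from Col1's xoffset of 4

# (name, length in bytes, is_fixed_length), taken from the CREATE TABLE above
columns = [
    ("Col1", 3, True),     # char(3)
    ("Col2", 250, False),  # varchar(250)
    ("Col3", 5, False),    # varchar(5)
    ("Col4", 20, False),   # varchar(20)
    ("Col5", 2, True),     # smallint
]

def compute_offsets(columns):
    """Return ({column: xoffset}, minlen) for a list of column definitions."""
    offsets = {}
    next_fixed = ROW_HEADER_SIZE  # byte offset of the next fixed-length column
    next_variable = -1            # ordinal of the next variable-length column
    for name, length, is_fixed in columns:
        if is_fixed:
            offsets[name] = next_fixed
            next_fixed += length
        else:
            offsets[name] = next_variable
            next_variable -= 1
    return offsets, next_fixed    # end of the fixed-length part = minlen

offsets, minlen = compute_offsets(columns)
```

Here `minlen` comes out as 4 + 3 + 2 = 9, matching the sysindexes row: it is the size of the header plus the fixed-length columns, independent of what the varchar columns contain.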
27
Operations on pages:
1. new row
2. row delete
3. row update: rtrx id
4. row update: roll pointer
5. row update: field
6. row offset array update
7. page header update
8. page trailer update
Other operations on pages:
9. checkpoint
28
Columnstore Ids
Explicit IDs
– Expand size on disk
– Expand size when transferring data to RAM
Virtual IDs
– Offset as virtual ID
– Trades simple arithmetic for space, i.e., CPU time for IO time
– Assumes fixed width attributes
  Challenge when using compression
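With virtual IDs, the position of a value in the column is its ID, so locating it is one multiplication. A small Python sketch, assuming a column of fixed-width 32-bit integers and a 4-byte explicit ID for the size comparison (both widths, the sample values and the name `value_at` are assumptions for illustration):

```python
import struct

WIDTH = 4  # assumption: fixed-width 32-bit integer attribute
values = [10, 20, 30, 40]  # made-up column values

# Virtual IDs: the column stores only the values, back to back.
column = b"".join(struct.pack("<i", v) for v in values)

def value_at(column, virtual_id):
    """The virtual ID is the position: byte offset = virtual_id * WIDTH."""
    offset = virtual_id * WIDTH
    (value,) = struct.unpack_from("<i", column, offset)
    return value

# Size comparison against explicit IDs stored next to each value.
explicit_size = len(values) * (4 + WIDTH)  # 4-byte explicit ID + value
virtual_size = len(column)
```

In this example the explicit-ID representation is twice as large. The arithmetic only works because every value has the same width, which is why compressed (variable-width) columns make virtual IDs harder.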
29
Page Layout (source: IEEE)
Row store: N-ary Storage Model (NSM)
Decomposed Storage Model (DSM)
PAX Model (Partition Attributes Across)
30
PAX Model
Invented by A. Ailamaki in early 2000s
IO Pattern of NSM
Great for cache utilization – columns packed together in cache lines
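The difference between NSM and PAX is only the arrangement of bytes inside a page: the same records sit in the same page, but PAX groups the values of each attribute into its own region (a minipage). A Python sketch of both layouts for a two-attribute record, with made-up data and hypothetical function names; real PAX pages also carry a header and presence bits, which are omitted here.

```python
import struct

records = [(1, 3.5), (2, 4.0), (3, 1.25)]  # (int id, float price), made-up data

def nsm_page(records):
    """NSM: the attributes of each record are stored contiguously."""
    return b"".join(struct.pack("<id", rid, price) for rid, price in records)

def pax_page(records):
    """PAX: same records in the same page, one minipage per attribute."""
    ids = b"".join(struct.pack("<i", rid) for rid, _ in records)
    prices = b"".join(struct.pack("<d", price) for _, price in records)
    return ids + prices

def pax_read_ids(page, n_records):
    """Scan only the id minipage; the price bytes are never touched."""
    return list(struct.unpack_from(f"<{n_records}i", page, 0))
```

Both pages have the same size and hold the same records, so the IO pattern is that of NSM. A scan over one attribute, however, reads a dense run of values in PAX, which is what packs cache lines with useful data.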