Big Data Yuan Xue CS 292 Special topics on
Part I Relational Database (Physical Database Design) Yuan Xue
Review and Look Forward What we know so far Design: Database schema Optimization objective: minimum redundancy with information preservation Operation: Database access and manipulation via SQL Next step Design: How data is stored in database? Operation: How data is accessed and manipulated; how to configure the data access design via SQL Optimization: How data storage and access method can be designed so that applications (and SQL queries) (average) execution time is minimized Conceptual Design Data model mapping Next step Entity/Relationship model Logic Design Logical Schema] Normalization Normalized Schema Physical Design Physical (Internal) Schema
Primitives Storage Hierarchy in computer systems Processor cache Memory (data cache, volatile) Disk (persistent) Tapes Persistent data storage vs. transient(volatile) data storage Correlation exists currently between storage speed and price E.g., using Solid State Drives (SSDs) for speed by DynamoDB Faster storage is typically volatile Disk basics Slow: rotation delay + seek time Block is the unit of data access and transfer between disk and memory Buffering is a low-level technique to save data access and transfer time
Primitives File System A Layer of OS that transforms Block interface of disks into File (directories, etc) File System component Disk management: organize disk blocks into files Contiguous allocation, linked allocation, indexed allocation E.g. inode in Unix Naming: interface to find files by name, not by blocks Protection: Layers to keep data secure Reliability and durability File Operations Open, close Read (random, sequential), write (modification, insertion, append, delete)
Placing and Accessing File Records on Disk Record Fixed-length records Variable-length records Separator BLOB (binary large object) in mySQL Media file, image, etc Option 1: stored separately from its record, keep a pointer in the record Option 2: using BLOB or TEXT type Use MySQL string handling Better security support Putting Record into Blocks Spanned vs. unspanned CREATE TABLE MiniTwitter.User (IDVARCHAR(20)NOT NULL, NameVARCHAR(20)NOT NULL, VARCHAR(20)NOT NULL, PasswordCHAR(10)NOT NULL, ; IDName Password Aadf1234 com qwer6789 Block 1 Block 2
Database Physical Design Overview Two aspects Storage organization Access method Primary file organization How file records are physically placed on the disk How the records can be accessed Secondary organization (auxiliary access structure) Allows efficient access to file records based on alternative fields than those that have been used for the primary file organization Most of these exist as indexes Database engine (or storage engine) in DBMS component that handles data create, read, update and delete (CRUD) operations on a database. Successful design: Perform as efficient as possible over the frequent operations
Primary File Organization Several types Heap file (unsorted) Sorted file (ordered by a particular field – sort key) Hash file B-tree, B+ tree mainly used for index file
Heap file (unsorted) Operations Insertion: efficient Retrieval/Search: inefficient, linear Deletion May lead to unused space Mark selected records as "deleted" requires periodic reorganization Advantages efficient for bulk loading data efficient for relatively small relations as indexing overheads are avoided efficient when retrievals involve large proportion of stored records Disadvantages not efficient for selective retrieval using key values sorting may be time-consuming
Sorted File Sorted file: ordered by a particular field – ordering key Operations Search based on ordering key is efficient (binary search) Insertion, deletion are expensive Solution: overflow file (idea used in BigTable/Hbase) Create a temporary unordered file New record added to overflow file Periodically, overflow file is sorted and merged with master file Increased complexity in search (have to search in both overflow and master files) Advantages Efficient for range access of ordering key Efficient for (random) write Disadvantages Sequential scan perform the same as Heap file No benefit for accessing based on nonordered fields Sorted files are rarely used in database application without primary index (coming up in slide 15) is defined as an additional access path
Hash File Basic idea Hash function : Hash field of a record location of this record Hash function Even distribution Collision handling: open addressing, chained overflow External hashing for disk files Target address space divided into buckets, each hold multiple records. Bucket : one block, or contiguous disk blocks Hash function maps the key to a bucket number; a table in the file header converts the bucket number into disk block address Static hashing vs. Extensible hashing Pros and cons efficient for exact matches on key field not suitable for range retrieval, which requires sequential storage
Row vs. Column Row-oriented storage all data associated with a given row is stored together. Column-oriented storage (adopted by Dremel for example) store all data from a given column together in order Quickly serve data warehouse-style queries Read only Access a large range of certain attributes together Comparison Column-oriented organizations are more efficient when an aggregate needs to be computed over many rows but only for a notably smaller subset of all columns of data new values of a column are supplied for all rows at once at writing Row-oriented organizations are more efficient when many columns of a single row are required at the same time row-size is relatively small, as the entire row can be retrieved with a single disk seek writing a new row if all of the row data is supplied at the same time
Secondary Organization (Index) Index Persistent data structure, stored in database Ordered file fixed length record A binary search on the index yields a pointer to the file record Also called as access path on the field (index field) The index file usually occupies considerably less disk blocks than the data file because its entries are much smaller Primary mechanism to get improved performance on a database Difference between full table scans and immediate access Indexes can be added or removed without changing database application logic Indexes can also be characterized as dense or sparse Dense index has an index entry for every search key value (and hence every record) in the data file. Sparse (or nondense) index has index entries for only some of the search values
Secondary Organization (Index) Down side Extra space marginal Index creation medium Index maintenance slow data maintenance can offset benefits Indices can be implemented using a variety of data structures. Balanced trees (B trees, B+ trees) Logarithm access, support range queries Hash tables Constant time access
Primary Index Defined on an ordered data file [primary organization method] data file is ordered on a key field Includes one index entry for each block in the data file index entry has the key field value for the first record in the block, which is called the block anchor A primary index is a nondense (sparse) index, since it includes an entry for each disk block of the data file and the keys of its anchor record rather than for every search value.
Secondary Index Secondary means of accessing a file with primary access. Defined on an ordered, unordered, or hashed data file [primary organization method] Defined on a field that can be candidate key with a unique value in every record, non-key with duplicate values Each record in the index has two fields. The first field has the indexing field The second field is either a block pointer or a record pointer. There can be many secondary indexes (and hence, indexing fields) for the same file. When a block pointer is used in secondary index, to access a record, the disk block is transferred to memory, then a search for the record is carried out in the memory
Multi-Level Indexes A single-level index is an ordered file a primary index to the index itself can be created the original index file is called the first-level index the index to the index is called the second-level index Repeat the process creating a third, fourth,..., top level until all entries of the top level fit in one disk block A multi-level index can be created for any type of first-level index (primary, secondary) Such a multi-level index is a form of search tree Insertion and deletion of new index entries is a severe problem because every level of the index is an ordered file Solution: B-tree and B+-tree
B-Trees and B+-Trees Most multi-level indexes use B-tree or B+-tree data structures because of the insertion and deletion problem In B-Tree and B+-Tree data structures, each node corresponds to a disk block Each node is kept between half-full and completely full An insertion into a node that is not full is quite efficient If a node is full the insertion causes a split into two nodes; Splitting may propagate to other tree levels A deletion is quite efficient if a node does not become less than half full If a deletion causes a node to become less than half full, it must be merged with neighboring nodes
B-tree Structures
Difference between B-tree and B+-tree In a B-tree, pointers to data records exist at all levels of the tree In a B+-tree, all pointers to data records exists at the leaf-level nodes A B+-tree can have less levels (or higher capacity of search values) than the corresponding B-tree
The Nodes of a B+-tree FIGURE The nodes of a B+-tree (a) Internal node of a B+-tree with q –1 search values. (b) Leaf node of a B+-tree with q – 1 search values and q – 1 data pointers.
An Example of an Insertion in a B+-tree
An Example of a Deletion in a B+-tree
Create Index How to create using SQL Which attribute should be used? CREATE INDEX index_name ON table_name (column_name) Consider a simple relation V(M,N) with two attributes M, N; both take integer values [1, 100]; For the following SQL queries SELECT * FROM V WHERE M=? SELECT * FROM V WHERE N=? SELECT * FROM V WHERE M=? AND N=? Which index is helpful for each query? 1.Index on V(M) 2.Index on V(N) 3.Index on V(M,N)
Example Value of V(M) Pointers to records (or blocks) … … 100 Index on V(M) List of pointers to (3,1) (3,2), …(3, 100) SELECT * FROM V WHERE M=3 For SQL query
Example Value of V(N) Pointers to records (or blocks) … … 100 Index on V(N) List of pointers to (1,3) (2,3), …(100,3) SELECT * FROM V WHERE N=3 For SQL query
Example Value of V(N) Pointers to records 1 … 5 … … 100 Approach 1: Index on V(M) and Index on V(N) List of pointers to (1,5) …(100,5) SELECT * FROM V WHERE M=3 AND N=5 For SQL query Value of V(M) Pointers to records … … 100 List of pointers to (3,1) …(3, 100) {(3,1), (3,2),..} {(1,5), (2,5),..} Take the set intersection
Example Approach 2: Index on V(M,N) SELECT * FROM V WHERE M=3 AND N=5 For SQL query Value of V(M,N) Pointers to records 1,1 2,1 … 3,5 … 100 pointer to (3,5) Comparison: 1.Overhead in Index maintenance: 2.Efficiency in record access: 3.Flexibility in support different queries
Summary Meta-data Application Program/Queries Users Query Processing Data access (Database Engine) Data DBMS system Primary file organization Heap, Sorted, Hash Files Secondary organization (Index) Primary, Secondary Multiple-level index B-Tree, B+-Tree Create Index in SQL Database engine in DBMS handles data create, read, update and delete (CRUD) operations on a database. Physical Design Optimization: Create Index based on 1.Database statistics 2.Query patterns