Storage & File Structure Meghan Nagpal


Storage Media
- Cache: The smallest and fastest form of storage; managed by the hardware. The database system makes no effort to manage cache storage itself, but cache effects should be kept in mind when designing queries and algorithms.
- Main memory: Holds the data being operated on; too small to hold an entire database; volatile.
- Flash memory: Non-volatile; used in cameras, cell phones, and USB drives.

Storage Media
- Magnetic-disk storage: Long-term online storage that holds the entire database. The system must move data from disk to main memory before operating on it. Non-volatile, but disk devices can fail and destroy data.
- Optical storage: CDs, DVDs, Blu-ray discs. ROM: read-only; WORM: write once, read many; RW: can be written many times. Gives direct access to specified data.
- Tape storage: Used for backup and archival data; slower; sequential access to data.

Hierarchy
- Higher levels are more expensive, but faster.
- Levels, from top to bottom: primary storage; secondary (online) storage; tertiary (offline) storage.
- Primary storage is volatile; secondary and tertiary storage are non-volatile.

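Magnetic Disk & Flash Storage
- Platters: Information is recorded on platter surfaces; 1 to 5 platters per disk.
- Tracks: Each disk surface is divided into tracks; 50K to 100K tracks per platter.
- Sectors: Tracks are divided into sectors, the smallest unit of information read from or written to disk. A sector is typically 512 bytes; there are 500 to 1000 sectors per inner track and 1000 to 2000 per outer track.
- Read-write head: Stores data magnetically by reversing the direction of magnetization of the magnetic material; moves across the platter to access different tracks.
- Spindle: The platters are mounted on it.
- Disk arm: The heads are mounted on it.
- Arm assembly: Holds the arms, so the heads of all platters line up on the same track.
- Cylinder: When the head on one platter is on the ith track, the heads on all platters are on the ith track. The ith tracks of all platters together form the ith cylinder.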

Disk Controllers
- Disk controller: Interfaces between the computer system and the actual disk-drive hardware.
- Checksums: The controller attaches a checksum to each sector that is written, computed from the data written to the sector. When the sector is read back, the checksum is recomputed and compared with the stored checksum to detect corruption.
- Remapping of bad sectors: If a sector is damaged, the controller can map it to a different physical location.
- Disks can be connected over a high-speed network. Remote access allows disks to be shared by multiple computers and accessed in parallel.
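The checksum check described above can be sketched in a few lines. This is only an illustration: real controllers use their own error-detecting and error-correcting codes in firmware, and CRC-32 from Python's `zlib` is used here as a stand-in. The names `write_sector` and `read_sector` are hypothetical.

```python
import zlib

SECTOR_SIZE = 512  # bytes per sector, as on the slides


def write_sector(data: bytes) -> tuple[bytes, int]:
    """Return what would be stored: the sector data plus its checksum."""
    if len(data) != SECTOR_SIZE:
        raise ValueError("a sector must be exactly SECTOR_SIZE bytes")
    return data, zlib.crc32(data)


def read_sector(stored: tuple[bytes, int]) -> bytes:
    """Recompute the checksum on read and compare it with the stored one."""
    data, stored_checksum = stored
    if zlib.crc32(data) != stored_checksum:
        raise IOError("checksum mismatch: sector is corrupted")
    return data
```

A mismatch on read is the signal the controller would use to retry the read or to remap the sector.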

Performance Measures of Disks
- Access time: Time from when a read or write request is issued to when data transfer begins. The arm must first move to the correct track and then wait for the sector to rotate under the head.
- Average seek time: Average time to reposition the arm over the correct track. Typically 4 to 10 ms.
- Rotational latency: Time spent waiting for the sector once the head is on the track. A full rotation typically takes 4 to 11.1 ms; on average the wait is half a rotation.
- Data transfer rate: Rate at which data can be retrieved from or stored to the disk. 25 to 100 MB per second.
- Mean time to failure: Average time we can expect the disk to run continuously without failure. Most disks have a life expectancy of about 5 years.
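As a rough worked example of how these measures combine, the sketch below estimates average access time as average seek time plus half a rotation. The function names are illustrative, and the model ignores controller overhead and queueing.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full rotation, in milliseconds."""
    return 60_000 / rpm


def average_access_time_ms(avg_seek_ms: float, rpm: float) -> float:
    """Average seek time plus average rotational latency (half a rotation)."""
    return avg_seek_ms + rotation_time_ms(rpm) / 2


def transfer_time_ms(num_bytes: int, rate_mb_per_s: float) -> float:
    """Time to transfer num_bytes at the given rate (1 MB = 10**6 bytes here)."""
    return num_bytes / (rate_mb_per_s * 1_000_000) * 1000
```

For instance, a 15,000 rpm disk completes a rotation in 4 ms and a 5,400 rpm disk in about 11.1 ms, which matches the range quoted on the slide. With a 4 ms average seek, the 15,000 rpm disk has an average access time of about 6 ms, far larger than the time needed to transfer a single 4 KB block.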

Optimization of Disk-Block Access
- Block: A fixed number of contiguous sectors. Requests are made to a disk address referred to as a block number.
- Techniques to improve the speed of access to blocks:
  - Buffering: Blocks read from disk are kept temporarily in a memory buffer to satisfy future requests. Done by the operating system or by the database system.
  - Read-ahead: When a disk block is accessed, consecutive blocks on the same track are also read into the memory buffer. Good for sequential access.
  - Scheduling: Algorithms that order pending requests to reduce arm movement. In the elevator algorithm the arm moves in one direction like an elevator, servicing each track on the way that has an access request, then changes direction and sweeps back.
  - File organization: Organize data blocks in the way we expect them to be accessed. If we want to access data sequentially, we keep all data blocks of a file sequentially on disk.
  - Non-volatile write buffers: Database updates must reach stable storage so that they survive a system crash, so write speed is crucial. Non-volatile RAM (NVRAM) speeds up disk writes: when the system requests that a block be written, the controller stores it in the NVRAM buffer, then writes it to disk when there are no other requests or when the NVRAM buffer is full.
  - Log disk: Similar to NVRAM. All access is sequential and several consecutive blocks are written at once. The write to the actual disk location can happen afterwards, so the database system does not have to wait for it to complete, and those later writes can be reordered to minimize arm movement. File systems that do this are called journaling file systems; they can keep the data and the log on the same disk, at the cost of lower performance.
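The elevator algorithm mentioned under scheduling can be sketched as follows. This is a minimal sketch, assuming the arm reverses at the last pending request rather than at the edge of the disk, and assuming all requests are known up front. The function names and the track numbers in the usage note are illustrative only.

```python
def elevator_order(requests, head, moving_up=True):
    """Return the order in which pending track requests are serviced.

    The arm keeps moving in its current direction, servicing every requested
    track it passes, then reverses and services the remaining requests.
    """
    if moving_up:
        first = sorted(t for t in requests if t >= head)
        second = sorted((t for t in requests if t < head), reverse=True)
    else:
        first = sorted((t for t in requests if t <= head), reverse=True)
        second = sorted(t for t in requests if t > head)
    return first + second


def total_travel(order, head):
    """Total number of tracks the arm crosses when servicing `order`."""
    travel = 0
    for track in order:
        travel += abs(track - head)
        head = track
    return travel
```

With the head at track 53 and pending requests for tracks 98, 183, 37, 122, 14, 124, 65, and 67, servicing the requests in arrival order moves the arm across 640 tracks, while the elevator order needs 299.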

Flash Storage
- NOR flash: Allows access to individual words of memory. Read time is comparable to main memory.
- NAND flash: Reads an entire page of data at a time, 512 to 4096 bytes, similar to a sector on a disk. Cheaper and more commonly used.
- Much faster random access than magnetic disk: 1 to 2 μs vs. 5 to 10 ms to retrieve data.
- Lower transfer rate, at about 20 MB/s.
- A page cannot be overwritten in place; it must be erased and then rewritten. Erasing is slow, 1 to 2 ms, and there is a limit on the number of times a page can be erased.

Flash Storage
- When a logical page is updated, it can be remapped to an already erased physical page, and the original location can be erased later. Each physical page stores its logical address, and the original page is marked as deleted when the logical address is remapped. This makes up for the slow erase speed.
- The logical-to-physical page mapping is replicated in a translation table for quick access.
- Distributing erase operations evenly is called wear levelling. Data that is rarely updated is called "cold data"; data that is updated regularly is "hot data".
- Flash translation layer: Carries out the actions above. Above this layer, flash storage looks identical to magnetic disk storage.
- Hybrid disk drives combine magnetic storage with a small amount of flash memory, which is used as a cache for frequently accessed data.
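The remapping idea behind the flash translation layer can be sketched as below. This is a toy model under stated assumptions: pages are erased individually (real flash erases whole blocks of pages), the next free page is simply the first one on a list (so no wear levelling is attempted), and the class and method names are hypothetical.

```python
class FlashTranslationLayer:
    def __init__(self, num_pages):
        self.pages = [None] * num_pages       # physical pages; None means erased
        self.free = list(range(num_pages))    # erased pages ready to be written
        self.deleted = set()                  # stale pages waiting to be erased
        self.table = {}                       # logical page -> physical page
        self.erase_counts = [0] * num_pages   # erases per physical page

    def write(self, logical, data):
        """Write to an erased page and remap, instead of overwriting in place."""
        if not self.free:
            self.erase_deleted_pages()
        if not self.free:
            raise RuntimeError("flash is full")
        physical = self.free.pop(0)
        # The physical page stores its logical address along with the data.
        self.pages[physical] = (logical, data)
        old = self.table.get(logical)
        if old is not None:
            self.deleted.add(old)             # original page is marked as deleted
        self.table[logical] = physical

    def read(self, logical):
        return self.pages[self.table[logical]][1]

    def erase_deleted_pages(self):
        """Deferred (slow) erase of all pages previously marked as deleted."""
        for physical in sorted(self.deleted):
            self.pages[physical] = None
            self.erase_counts[physical] += 1
            self.free.append(physical)
        self.deleted.clear()
```

Because the erase is deferred until free pages run out, an update costs only one page write in the common case. A wear-levelling variant could pick the free page with the lowest entry in `erase_counts` instead of the first one.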

RAID – Improvement of Reliability via Redundancy
- A large number of disks is needed to store large amounts of data. This is an opportunity to improve the rate at which data is read or written, if the disks operate in parallel.
- Redundant arrays of independent disks (RAID) improve both performance and reliability.
- Redundancy: Store extra information that can be used to rebuild data if one disk fails. Mirroring (duplicating every disk) is the simplest example. With mirroring, the mean time to data loss is about 55 to 110 years. However, failures are not necessarily independent (power failures, natural disasters, etc.).

RAID – Improvement in Performance through Parallelism
- Bit-level striping: The bits of each byte are split across multiple disks. The number of disks is generally a multiple or a factor of 8 (for example, 4 or 8). Every disk participates in every access, so the number of accesses per second is the same as on a single disk, but each access can read 4 or 8 times as much data in the same time.
  - Ex. In an array of 8 disks, bit i of each byte is written to disk i.
- Block-level striping: Blocks are striped across multiple disks. Given n disks, logical block i is stored on disk (i mod n) + 1, in the ⌊i/n⌋th physical block of that disk.
  - Ex. In an array of 8 disks, logical block 11 is stored in physical block 1 of disk 4.
  - Gives a high data transfer rate for large reads, as n blocks are fetched at a time from n disks.
- Goals of parallelism:
  1. Load-balance multiple small accesses (blocks)
  2. Parallelize large accesses to reduce response time
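The block-level striping rule is simple enough to check directly. The helper below is a small sketch of the mapping given on the slide, with disks numbered from 1 and physical blocks numbered from 0; the name `locate_block` is illustrative.

```python
def locate_block(i: int, n: int) -> tuple[int, int]:
    """Map logical block i to (disk number, physical block) on n disks."""
    disk = (i % n) + 1          # disks are numbered 1..n
    physical_block = i // n     # floor(i / n)
    return disk, physical_block
```

Calling `locate_block(11, 8)` gives `(4, 1)`, matching the slide's example, and logical blocks 0 to 7 land on disks 1 to 8, which is why a large sequential read keeps all disks busy at once.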

RAID Levels
- Striping alone does not improve reliability, and mirroring is expensive. The RAID levels are alternative schemes that provide redundancy at lower cost.
- Level 0: Disk striping at the block level, no redundancy.
- Level 1: Disk mirroring with block striping. (Figure: an array of four disks, where C indicates a second copy of the data.)
- Level 2: Extra error-correcting bits for each byte are stored on further disks and can be used to reconstruct damaged data. (Figure: an array of four disks, where P indicates error-correcting bits.)

RAID Levels
- Level 3: Similar to level 2, but uses just one parity bit, stored on one extra disk. Reduces storage overhead.
- Level 4: Similar to level 3, except that striping is at the block level and parity is stored as a block on an extra disk.
- Level 5: Parity and data blocks are distributed among all the disks. A parity block is never stored on the same disk as the data blocks it protects, since a failure of that disk would then make the data unrecoverable.
- Level 6: Like level 5, but stores extra redundant information through error-correcting codes: 2 bits of redundant data for every 4 bits of data. Tolerates 2 disk failures.
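The parity used by levels 3 to 5 can be illustrated with bytewise XOR. This is a minimal sketch, assuming equal-sized blocks and a single failed disk; the function names are hypothetical and the example does not model the placement of parity blocks across disks.

```python
from functools import reduce
from operator import xor


def parity_block(blocks):
    """Bytewise XOR of equal-sized blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors and the parity block."""
    return parity_block(list(surviving_blocks) + [parity])
```

XOR-ing the parity block with all surviving data blocks cancels their contribution and leaves exactly the missing block. This works for one lost block per stripe, which is why level 6 needs additional redundancy to survive two failures.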

RAID
- The choice of RAID level depends on cost, performance, performance during failure, and performance during rebuild.
- Consider the hardware as well. Some implementations are done purely in software. Special hardware implementations can use non-volatile RAM to complete incomplete writes after a power failure.
- Scrubbing: During idle periods the disks are scanned for damaged sectors, and damaged data is recovered.
- Hot swapping: Faulty disks can be removed and replaced by new ones while the power is on; an extra disk can be kept available in the array. Good for 24/7 systems.

Optical Disks
- CDs store software, multimedia data, and electronically published information. 640 to 700 MB. Cheap to mass-produce. Data transfer rates of around 3 to 6 MB/s.
- DVDs are larger, up to 17 GB in the DVD-18 format (2 sides, 2 recording layers). Blu-ray discs can store 27 to 54 GB. Data transfer rates of 8 to 20 MB/s.
- Seek times are longer than for magnetic disk drives (about 100 ms) because the head assembly is heavier. Rotation is slower, at about 3000 rotations per minute.

Magnetic Tapes
- Permanent.
- Hold large amounts of data. Some formats can store up to 300 GB.
- Sequential access to data.
- Slow. Moving to the desired spot takes seconds or minutes. Data transfer rates of a few to tens of MB/s.
- Tapes are cheap, but tape drives are more expensive than disk drives, so tapes are not used as often as disks.

File Organization
- A file is a sequence of records, and the records are mapped onto disk blocks.
- Block sizes are typically 4 to 8 KB.
- Assumptions: no record is larger than a disk block, and each record is entirely contained in a single block.

Fixed Length Record
Example:
    type instructor = record
        ID varchar(5);
        name varchar(20);
        dept_name varchar(20);
        salary numeric(8,2);
    end
- Must allocate 53 bytes per record (5 + 20 + 20 + 8, assuming one byte per character and 8 bytes for the numeric). Allocate as many records to a block as fit entirely in the block, and leave the remaining bytes unused.
- A file header containing information about the file is added at the beginning of the file. The header stores the address of the first deleted record; each deleted record stores the address of the next one, forming a linked list known as the free list.
- A new record is inserted into the slot the header points to, and the header is then updated to point to the next available slot. If no slot is available, the new record is added to the end of the file.
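The free-list bookkeeping described above can be sketched in memory as follows. This is an illustration only: slots live in a Python list rather than in disk blocks, and the names `FixedLengthFile`, `FreeSlot`, and `records_per_block` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FreeSlot:
    next_free: Optional[int]  # index of the next deleted slot, or None


def records_per_block(block_size: int, record_size: int) -> int:
    """Number of whole records that fit in a block; leftover bytes stay unused."""
    return block_size // record_size


class FixedLengthFile:
    def __init__(self):
        self.slots = []        # each entry is a record or a FreeSlot
        self.free_head = None  # file header: first deleted slot, if any

    def insert(self, record) -> int:
        if self.free_head is None:
            self.slots.append(record)           # no free slot: append at the end
            return len(self.slots) - 1
        slot = self.free_head
        self.free_head = self.slots[slot].next_free  # header -> next free slot
        self.slots[slot] = record
        return slot

    def delete(self, slot: int) -> None:
        # The deleted slot becomes the new head of the free list.
        self.slots[slot] = FreeSlot(self.free_head)
        self.free_head = slot
```

With 53-byte records and a 4096-byte block, `records_per_block(4096, 53)` is 77, leaving 15 bytes unused. Reusing deleted slots through the free list avoids shifting records on deletion.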

Fixed Length Record Free List

Variable Length Record
- A record has an initial part holding the fixed-length attributes, followed by the variable-length attributes.
- In the initial part, each variable-length attribute is represented by an (offset, length) pair. The offset denotes where the data for the attribute begins within the record, and the length is the length in bytes of the variable-sized attribute.
- A null bitmap indicates which attributes have null values.
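One possible byte layout for such a record is sketched below using Python's `struct` module. The layout is an assumption made for illustration: one (offset, length) pair of two 16-bit integers per variable-length attribute, then a single fixed-length attribute stored as an 8-byte float, then a one-byte null bitmap, then the variable-length data. The function names are hypothetical, and at most 8 variable-length attributes are supported.

```python
import struct


def encode_record(var_attrs, salary):
    """Encode a record with variable-length string attributes and one salary."""
    n = len(var_attrs)
    header_len = 4 * n + 8 + 1   # (offset, length) pairs + salary + null bitmap
    pairs, data, null_bitmap = b"", b"", 0
    for i, value in enumerate(var_attrs):
        if value is None:
            null_bitmap |= 1 << i                 # bit i set: attribute i is null
            pairs += struct.pack("<HH", 0, 0)
        else:
            raw = value.encode("utf-8")
            pairs += struct.pack("<HH", header_len + len(data), len(raw))
            data += raw
    return pairs + struct.pack("<d", salary) + bytes([null_bitmap]) + data


def decode_attr(record, i, n):
    """Return variable-length attribute i of a record with n such attributes."""
    null_bitmap = record[4 * n + 8]
    if (null_bitmap >> i) & 1:
        return None
    offset, length = struct.unpack_from("<HH", record, 4 * i)
    return record[offset:offset + length].decode("utf-8")
```

For a record with ID "10101", name "Srinivasan", and department "Comp. Sci.", the three pairs occupy bytes 0 to 11, the salary bytes 12 to 19, the null bitmap byte 20, and the strings start at offset 21, for a total of 46 bytes.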

Variable Length Records – Slotted Page Structure
- A header at the beginning of the block contains the number of record entries, the end of free space in the block, and an array whose entries contain the location and size of each record.
- Records are stored and inserted contiguously, starting at the end of the block.
- Free space is contiguous, between the final entry in the header array and the first record.
- When a record is deleted, other records are moved to occupy the freed space.
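A slotted page can be sketched as below. This is a simplified model: the header is kept as Python attributes rather than being serialized into the block, so the space the header itself would occupy is ignored, and the class and method names are hypothetical.

```python
class SlottedPage:
    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.free_end = size   # end of free space; records live in data[free_end:]
        self.slots = []        # header array of (offset, length); offset None = deleted

    def insert(self, record: bytes) -> int:
        if len(record) > self.free_end:
            raise ValueError("not enough free space in the page")
        self.free_end -= len(record)       # records grow from the end of the block
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1

    def read(self, slot: int) -> bytes:
        offset, length = self.slots[slot]
        if offset is None:
            raise KeyError("record was deleted")
        return bytes(self.data[offset:offset + length])

    def delete(self, slot: int) -> None:
        offset, length = self.slots[slot]
        if offset is None:
            raise KeyError("record was deleted")
        # Move every record stored before the deleted one towards the end of
        # the block, so that the free space stays contiguous.
        self.data[self.free_end + length:offset + length] = self.data[self.free_end:offset]
        for i, (o, l) in enumerate(self.slots):
            if o is not None and o < offset:
                self.slots[i] = (o + length, l)
        self.slots[slot] = (None, 0)
        self.free_end += length
```

Because other structures refer to a record by its slot number rather than by its byte offset, records can be moved within the page without invalidating those references.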

Sequential File Organization
- Records are sorted in order by some search key (an attribute or set of attributes, not necessarily the primary key or a superkey).
- Records are stored physically in search-key order, as far as possible.
- Deletion is managed with pointer chains. On insertion, the record is stored in a free slot left by a deletion if one is available in the right block; if not, the new record is placed in an overflow block. In either case the pointers are adjusted to keep the chain in search-key order.
- If there are too many overflow records, the file must be reorganized. Reorganization is costly and time-consuming, so it is done when the system load is low.

Multitable Clustering File Organization
- Usually a given block stores records of only one relation, but sometimes it is advantageous to store records of more than one relation in a block.
- A multitable clustering file organization stores records of two or more relations in one block.
- This allows records that satisfy a join condition to be read with a single block read, which makes the join faster.

Data-Dictionary Storage
- The system maintains metadata, or data about the data.
- Metadata is stored in the data dictionary, or system catalog.
- Information about relations: names of relations, attribute names of relations, domains and lengths of attributes, names and definitions of views, integrity constraints.
- Data about users of the system: names, authorization and accounting information, authentication information.
- Statistical and descriptive data about relations: number of tuples, method of storage (clustered, non-clustered).
- Storage organization and location of each relation.

Data Dictionary Storage

Buffer Manager
- Buffer: The part of main memory available for storing copies of disk blocks. The buffer manager allocates buffer space.
- Buffer replacement strategy: When there is no room left in the buffer, the least recently used block is removed from it (and written back to disk if it was modified).
- Pinned blocks: To allow recovery from crashes, the times at which a block may be written back to disk are restricted. A block that is not allowed to be written back is referred to as a pinned block.
- Forced output of blocks: In situations where a block must be written to disk, it is forced out even if its buffer space is not needed.
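The interaction of LRU replacement, pinning, and forced output can be sketched as follows. This is a toy model under stated assumptions: the "disk" is a plain dictionary, every evicted block is written back whether or not it changed, and the class and method names are hypothetical.

```python
from collections import OrderedDict


class BufferManager:
    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk             # dict block_id -> bytes, stands in for the disk
        self.buffer = OrderedDict()  # block_id -> data, least recently used first
        self.pinned = set()

    def get(self, block_id):
        """Return the block, reading it from disk if it is not buffered."""
        if block_id in self.buffer:
            self.buffer.move_to_end(block_id)   # mark as most recently used
            return self.buffer[block_id]
        if len(self.buffer) >= self.capacity:
            self._evict()
        self.buffer[block_id] = self.disk[block_id]
        return self.buffer[block_id]

    def update(self, block_id, data):
        """Modify the buffered copy only; the disk copy changes on write-back."""
        self.get(block_id)
        self.buffer[block_id] = data

    def pin(self, block_id):
        self.pinned.add(block_id)

    def unpin(self, block_id):
        self.pinned.discard(block_id)

    def force_output(self, block_id):
        """Write the block to disk now, even though its space is not needed."""
        self.disk[block_id] = self.buffer[block_id]

    def _evict(self):
        # Least recently used block that is not pinned.
        victim = next((b for b in self.buffer if b not in self.pinned), None)
        if victim is None:
            raise RuntimeError("all buffered blocks are pinned")
        self.disk[victim] = self.buffer.pop(victim)
```

With a capacity of two blocks, reading blocks 1 and 2, pinning block 1, and then reading block 3 evicts block 2 rather than block 1, even though block 1 is the least recently used.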

Final Remarks
- The storage hierarchy runs from cache down to tertiary storage devices.
- Redundancy can be used to improve reliability; parallelism improves performance.
- Consider the storage organization of the system when designing a database.
- Data dictionaries are useful for storing metadata.