The Five-Minute Rule 20 Years Later (And How Flash Memory Changes The Rules). Goetz Graefe. Presented by Abhinav Parate.

Presentation transcript:

1 The Five-Minute Rule 20 Years Later (And How Flash Memory Changes The Rules) Goetz Graefe Presented By Abhinav Parate

2 Storage Hierarchy FLASH

3 Comparing Flash with Disks

4 When should we increase main memory? Metrics to decide: – Cost of infrastructure – Cost of maintenance – Mean time to failure – Performance improvement. Simplest answer: increase RAM size if it is insufficient to hold the frequently accessed data items. What time period counts as frequent?

5 Cost of accessing a data item A disc provides N accesses per second and costs $D. D_A = D/N : cost of one disc access per second; M : cost of 1 byte of main memory; I : expected interval until the same data is accessed again (in seconds); B : size of the data item in bytes.

6 Cost of accessing a data item Number of accesses per second for the data item = 1/I. Cost if the item is accessed from disc = D_A / I. Cost if the item is kept in memory = M * B. Keep the data item in memory if the main-memory cost is less than the disc-access cost: M * B < D_A / I, i.e. I < D_A / (M * B). For a 1 KB data item, I < 400 s ~ 5 minutes at 1987 costs.
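
The break-even condition I < D_A / (M * B) can be sketched in a few lines of Python. The prices below are illustrative placeholders, chosen only so that the result matches the 400 s figure quoted on the slide; they are assumptions and do not appear in the slides. The function name `break_even_interval` is likewise hypothetical.

```python
def break_even_interval(disc_cost_per_access_per_sec, memory_cost_per_byte, item_size_bytes):
    """Largest re-access interval I (in seconds) at which keeping the item
    in memory is still cheaper than re-reading it from disc.

    Implements I = D_A / (M * B) from the slide.
    """
    return disc_cost_per_access_per_sec / (memory_cost_per_byte * item_size_bytes)


# Assumed, illustrative prices (not taken from the slides).
D_A = 2000.0      # dollars per disc access per second
B = 1024          # item size in bytes (1 KB)
M = 5.0 / 1024    # dollars per byte of main memory, i.e. $5 per KB

interval = break_even_interval(D_A, M, B)
print(f"break-even interval: {interval:.0f} s")
```

Halving the memory price or doubling the cost of a disc access doubles the interval, which is why the "five minutes" drift over the decades.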

7 The Five-Minute Rule In 1987: keep a 1 KB data item in main memory if it is accessed again within 5 minutes. In 1967, the frequent period was 0.5 s. For 2007, the authors predicted a five-hour rule. At actual 2007 prices, the period turned out to be a little under 6 hours.

8 Sample Case A database consists of 500,000 records of 1000 bytes each. Peak load is 600 transactions per second. Only 6% of the data receives 96% of the accesses and is accessed again within 5 minutes; this 6% resides in main memory. The remaining data is accessed via two hard disks to support a 1-second access time. The design saved $3.5m at 1987 costs compared with an entirely main-memory design.

9 Back to Present Technology has changed: multiple cores; virtualization; the size of data has increased tremendously; the performance gap between RAM and disks has widened. Flash memory comes into the picture!

10 Flash memory characteristics Purchase cost, access latency, bandwidth, density, power consumption, cooling costs: everything lies in between RAM and rotating hard disks!

11 Comparison: Flash and Disks

12 Desirability of Flash Memory Disk I/O is increasingly becoming a bottleneck, as the number of CPU instructions that can execute during one disk I/O is steadily increasing. A faster intermediate memory in the storage hierarchy is highly desirable.

13 Limitations of Flash Memory Write bandwidth is lower than read bandwidth. Rewriting a block requires erasing the entire block. Reliability: 100,000 to 1M erase-and-write cycles. Requires a wear-levelling mechanism. Requires an agent to erase blocks as soon as they have been written to hard disk.

14 The presentation ahead... Key challenges in using flash memory. Addressing the challenges. Lots of open questions. Implications for greening the computing infrastructure.

15 #1: Which hardware interface to use? Use DIMM? Use Serial-ATA? Use a new hardware interface? Defining and developing a new hardware interface is a time-consuming exercise. Use one of the existing interfaces.

16 #2: Use as Buffer or Persistent Storage? Database systems are concerned with providing consistency. Databases have a large number of small updates and must maintain recovery logs. Logs must be written to persistent storage quickly. Use flash as persistent storage!

17 #2: Use as Buffer or Persistent Storage? File systems manipulate file contents in memory and write each file to disk in its entirety. Consistency is achieved via careful write ordering, quick write-back and expensive file-system checks. Page movement between flash and disks is expensive if flash is treated as persistent storage. Use flash memory as a buffer pool!

18 #3: How to track Frequent Pages? In current systems, the estimation and administration of frequent pages is done through LRU. Maintain two LRU chains in RAM.

19 Least Recently Used Chain [Figure: one LRU chain for RAM and one LRU chain for flash memory; entries labelled T(N), T(N-1), T(1)]
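
The slides only state that two LRU chains are maintained in RAM, one for pages in RAM and one for pages in flash. The following Python sketch shows one way such a pair of chains could behave; the demotion and promotion policy (RAM victims move to flash, flash hits are promoted back to RAM) and the class name `TwoLevelLRU` are assumptions made for illustration, not something the presentation specifies.

```python
from collections import OrderedDict


class TwoLevelLRU:
    """Two LRU chains, both held in RAM as metadata: one tracks pages
    resident in RAM, the other tracks pages resident in flash."""

    def __init__(self, ram_capacity, flash_capacity):
        self.ram_capacity = ram_capacity
        self.flash_capacity = flash_capacity
        self.ram = OrderedDict()    # page_id -> None, most recently used last
        self.flash = OrderedDict()  # same ordering convention

    def access(self, page_id):
        """Record an access and return where the page was found."""
        if page_id in self.ram:
            self.ram.move_to_end(page_id)
            return "ram"
        found = "flash" if page_id in self.flash else "disk"
        self.flash.pop(page_id, None)       # assumed policy: promote to RAM
        self.ram[page_id] = None
        if len(self.ram) > self.ram_capacity:
            victim, _ = self.ram.popitem(last=False)   # least recently used in RAM
            self.flash[victim] = None                  # assumed policy: demote to flash
            if len(self.flash) > self.flash_capacity:
                self.flash.popitem(last=False)         # page now lives on disk only
        return found


cache = TwoLevelLRU(ram_capacity=2, flash_capacity=2)
for page in (1, 2, 3, 1, 3):
    print(page, cache.access(page))
```

In this sketch a page that falls off the RAM chain gets a second chance in flash before it has to be fetched from disk again.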

20 #4: How to decide size of RAM and Flash? Use Five-Minute Rule!

21 #5: How to move pages among layers in the hierarchy? Between RAM and flash: DMA transfer. Between flash and disk: DMA (hardware) or a transfer buffer in RAM (software).

22 #6: How to track Page Locations? File systems – Maintain pointer pages – A pointer points to a data page or to a run of contiguous data pages – Moving an individual page may require breaking up a run and updating pointer pages

23 #6: How to track Page Locations? Database systems – Use B-tree indexes – Other kinds of indexes have been implemented efficiently on top of B-trees – Page movement requires updating pointers in the parent node and in neighbors

24 Benefits to Database Systems Checkpoint processing – provides consistency in databases – writes dirty pages to persistent storage – persistent flash storage is faster – pages need to be written to disk only if the page-replacement policy requires it. Recovery logs – quick writes

25 Benefits to Database Systems Query processing – Index-based selection is faster – Need to consider index-based query plans – Index joins and intersections. Example: a table scan of 100M rows takes 100 s. An index fetches 10K rows in 100 s, so the table scan is more efficient if the result has more than 10K rows. A flash index scan fetches 500K rows!
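
The example on this slide amounts to a break-even row count: the index plan wins as long as it can fetch the result in less time than the full table scan. The sketch below reproduces that arithmetic in Python. It assumes that the 500K-row flash figure refers to the same 100 s window as the disk figure, which the slide implies but does not state; the function name `break_even_rows` is hypothetical.

```python
def break_even_rows(scan_seconds, index_rows, index_seconds):
    """Number of rows an index plan can fetch in the time one full
    table scan takes; larger results favour the table scan."""
    rows_per_second = index_rows / index_seconds
    return scan_seconds * rows_per_second


SCAN_SECONDS = 100  # table scan of 100M rows, from the slide

disk_break_even = break_even_rows(SCAN_SECONDS, index_rows=10_000, index_seconds=100)
# Assumption: the flash figure is also measured over 100 s.
flash_break_even = break_even_rows(SCAN_SECONDS, index_rows=500_000, index_seconds=100)

print(f"disk index plan wins below  {disk_break_even:,.0f} rows")
print(f"flash index plan wins below {flash_break_even:,.0f} rows")
print(f"ratio: {flash_break_even / disk_break_even:.0f}x")
```

The much higher break-even point is why a query optimizer has to consider index-based plans for far larger result sizes once the index pages sit in flash.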

26 Problem of Optimal B-tree Page Size Two different optimal page sizes

27 Implications for Green Computing This work's focus is infrastructure cost. Energy optimization may lead to different optimal page sizes for B-trees. Infrastructure cost optimization can lead to significant reduction in RAM size and hence, lower energy consumption. Introduces large flash memory in the system.

28 Implications for Green Computing Let P_flash be the power consumption with flash memory and P_noflash the power consumption without flash. Let T_flash and T_noflash denote the system throughput with and without flash. The system is green if – P_flash / P_noflash < 1 – T_flash / T_noflash > 1

29 Implications for Green Computing What if P_flash / P_noflash > 1? In this case, system is green if – T_flash / T_noflash > P_flash / P_noflash – Gain in throughput is higher than extra power spent
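
The two green-computing cases from the slides can be combined into a single check, sketched below in Python. The function name `is_green` is hypothetical, and the treatment of a power ratio of exactly 1 (handled like the P_flash / P_noflash > 1 case) is an assumption, since the slides do not cover it.

```python
def is_green(p_flash, p_noflash, t_flash, t_noflash):
    """Apply the green criterion from the slides.

    Case 1: power ratio < 1  -> green if throughput ratio > 1.
    Case 2: power ratio >= 1 -> green if throughput ratio > power ratio.
    (The boundary power ratio == 1 is an assumption of this sketch.)
    """
    power_ratio = p_flash / p_noflash
    throughput_ratio = t_flash / t_noflash
    if power_ratio < 1:
        return throughput_ratio > 1
    return throughput_ratio > power_ratio


# Made-up numbers, for illustration only.
print(is_green(80, 100, 120, 100))   # less power, more throughput
print(is_green(120, 100, 150, 100))  # more power, but throughput grows faster
print(is_green(120, 100, 110, 100))  # more power, throughput gain too small
```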

30 Some calculations Assume a linear relation between the number of frequently accessed pages and the frequent period. If M is the RAM used in the no-flash system, then M/15 is the RAM in the flash-based system and 4M is the flash memory. With p_ram and p_flash the power consumption per unit of RAM and of flash: P_flash = M/15 x p_ram + 4M x p_flash and P_noflash = M x p_ram. P_flash < P_noflash if 4M x p_flash < 14/15 x M x p_ram, i.e. if p_flash < 14/60 p_ram. The relationship holds true.
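
The 14/60 threshold can be checked numerically. The Python sketch below evaluates the two power formulas from this slide with exact rational arithmetic; the value of M and the per-unit power figures are arbitrary assumptions chosen only to exercise the inequality, and the function names are hypothetical.

```python
from fractions import Fraction


def power_with_flash(m, p_ram, p_flash):
    """P_flash = M/15 * p_ram + 4M * p_flash, as on the slide."""
    return Fraction(m) / 15 * p_ram + 4 * Fraction(m) * p_flash


def power_without_flash(m, p_ram):
    """P_noflash = M * p_ram, as on the slide."""
    return Fraction(m) * p_ram


THRESHOLD = Fraction(14, 60)  # p_flash / p_ram must stay below this

# Arbitrary units, for illustration only.
m, p_ram = 60, Fraction(1)

at_threshold = power_with_flash(m, p_ram, THRESHOLD * p_ram)
below_threshold = power_with_flash(m, p_ram, Fraction(1, 5) * p_ram)

print(at_threshold == power_without_flash(m, p_ram))    # break-even exactly at 14/60
print(below_threshold < power_without_flash(m, p_ram))  # flash system draws less power
```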

31 Conclusions A faster intermediate memory in the storage hierarchy is desirable. Database systems are likely to benefit a lot. The picture is less clear for file systems. Flash can improve system throughput and reduce power consumption. Reduction in RAM usage can lead to significant power savings.

32 Thank You!