
S M Faisal* Hash in a Flash: Hash Tables for Solid State Devices Tyler Clemons*Shirish Tatikonda ‡ Charu Aggarwal † Srinivasan Parthasarathy* *The Ohio.


1 Hash in a Flash: Hash Tables for Solid State Devices
S M Faisal*, Tyler Clemons*, Shirish Tatikonda‡, Charu Aggarwal†, Srinivasan Parthasarathy*
* The Ohio State University, Columbus, Ohio
‡ IBM Almaden Research Center, San Jose, California
† IBM T.J. Watson Center, Yorktown Heights, New York

2 Motivation and Introduction
- Data is growing at a fast pace
  - Scientific data, Twitter, Facebook, Wikipedia, WWW
- Traditional data mining (DM) and information retrieval (IR) algorithms require random out-of-core data access
  - Data is often too large to fit in memory, so frequent random disk access is expected
11/30/2007 2

3 Motivation and Introduction (2)
- Traditional Hard Disk Drives can keep pace with storage requirements but NOT with random access workloads
  - Moving parts impose physical limitations
  - They also contribute to rising energy consumption
- Flash Devices have emerged as an alternative
  - No moving parts
  - Faster random access
  - Lower energy usage
- But they have several drawbacks...

4 Flash Devices
- Limited Lifetime
  - Supports a limited number of rewrites, also known as erasures or cleans
  - Erasures impact response time
  - Erasures are incurred at the block level
  - Blocks consist of pages; pages (4 KB to 8 KB) are the smallest I/O unit
- Poor Random Write Performance
  - Incurs many erasures and lowers lifetime
- Efficient Sequential Write Performance
  - Lowers erasures and increases lifetime

5 On Flash Devices, DM, and IR
- Flash Devices provide fast random read access
  - Random reads are common in many IR and DM algorithms and data structures
- Hash Tables are common in both DM and IR
  - Useful for associating keys and values
  - Counting Hash Tables associate keys with a frequency
  - Counting is found in many algorithms that track word frequency
  - We will examine one such algorithm common in both DM and IR (TF-IDF)
- Hash Tables exhibit random access for writes and reads
  - Random writes are an issue for Flash Devices

6 Hash Tables for Flash Devices must:
- Reduce erasures/cleans and reduce random writes to the SSD
  - Batch updates
- Maintain reasonable query times
- Not incur unreasonable disk overhead
- Not require an unreasonable amount of memory

7 Our approach
- Our approach makes two key contributions:
  - Our designs are optimized for a counting hash table
    - This has not been done by previous approaches: (A. Anand ’10), (D. Andersen ’09), (B. Debnath ’10), (D. Zeinalipour-Yazti ’05)
  - The Primary Hash Table resides on the Flash Device
    - Many designs use the SSD as a cache for the HDD: (D. Andersen ’09), (B. Debnath ’10)
- We anticipate data sets with high random access and throughput requirements

8 Hash Tables for Flash Devices must:
- Reduce erasures/cleans and reduce random writes to the SSD
  - Batch updates
  - Create an in-memory structure
  - Target semi-random updates or block-level updates
- Maintain reasonable query times
- Not incur unreasonable disk overhead
  - Carefully index keys on disk
- Not require an unreasonable amount of memory
  - Memory requirement is at most a fixed parameter

9 Memory Bounded (MB) Buffering
- Updates are hashed into a bucket in RAM
- Updates are quickly combined in memory (example entries shown in the figure: (64,2), (12,7))
- When the buffer is full, updates are batched to the corresponding disk buckets
- If the disk buckets are full, the overflow region is invoked

10 Memory Bounded (MB) Buffering
- Two-way hash
- On-disk closed hash table
  - Hash at page level
  - Update via block level
  - Linear probing for collisions
- In-memory open hash table
  - Hash at block level
  - Combine updates
  - Flush with merge() operation
- Overflow segment
  - Closed hash table excess
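
The MB scheme on this slide can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `MBCountingTable`, its parameters, and the use of dicts in place of flash blocks are hypothetical, and linear probing and the overflow segment are left out.

```python
from collections import defaultdict


class MBCountingTable:
    """Toy model of Memory Bounded (MB) buffering for a counting hash table.

    All names are hypothetical; Python dicts stand in for flash blocks.
    """

    def __init__(self, num_blocks=4, buffer_capacity=3):
        self.num_blocks = num_blocks
        self.buffer_capacity = buffer_capacity  # max distinct keys kept in RAM
        # In-memory open hash table: one bucket per on-disk block.
        self.ram = [defaultdict(int) for _ in range(num_blocks)]
        # Stand-in for the on-disk data segment (one dict per block).
        self.disk = [defaultdict(int) for _ in range(num_blocks)]
        self.block_writes = 0                   # block-level writes issued

    def _block_of(self, key):
        # Mod-based hash at block level.
        return hash(key) % self.num_blocks

    def update(self, key, count=1):
        # Updates to the same key are combined in memory.
        self.ram[self._block_of(key)][key] += count
        if sum(len(bucket) for bucket in self.ram) >= self.buffer_capacity:
            self.merge()

    def merge(self):
        # Flush every non-empty RAM bucket to its disk block as one batch.
        for block_id, bucket in enumerate(self.ram):
            if bucket:
                for key, count in bucket.items():
                    self.disk[block_id][key] += count
                self.block_writes += 1
                bucket.clear()

    def query(self, key):
        # A key's count is split between the RAM buffer and the disk block.
        block_id = self._block_of(key)
        return self.ram[block_id].get(key, 0) + self.disk[block_id].get(key, 0)
```

Because each RAM bucket maps to exactly one block, a merge() touches a block at most once no matter how many updates it absorbed, which is the batching effect the slide describes.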

11 Can we improve MB?
- MB reduces the number of write operations to the flash device
  - Updates are batched only when the memory buffer is full
  - Updates are semi-random
  - (Key, Value) changes are maintained in memory
- Query times are reasonable
  - Memory buffer search is fast
  - Relatively fast SSD random access and linear probing (see paper)
  - Prefetch pages
- MB has disadvantages
  - Sequential page-level operations are preferred (fewer block updates)
  - Limited by the amount of available memory
  - Think large disk datasets: updates may be numerous

12 Introduce an On-Disk Buffer
- Batch updates from memory to disk are page level
  - Reduce expensive block-level writes (time and cleans)
  - Increase sequential writes
- Increase buffering capability
  - Reduce expensive non-semi-random block updates
  - May decrease cleans
- Search space increases during queries
  - Incurred only if inserting and reading concurrently
  - However, less erasure time will decrease latency

13 On-Disk Buffering
- Change Segment (CS)
  - Sequential log structure
  - Sequential writes
- stage() operation
  - Flushes memory to the CS
  - Fast page-level operations
- merge() operation
  - Invoked when the CS is full
  - Combines the CS with the Data Segment
  - Less frequent than stage()
- What is the structure of the CS?
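
The interplay of stage() and merge() can be sketched as follows. The class `ChangeSegmentTable`, its capacities, and the use of a Python list as the sequential log are illustrative assumptions; the slides do not fix these details.

```python
from collections import Counter


class ChangeSegmentTable:
    """Toy model of on-disk buffering with a change segment (CS).

    Hypothetical names; lists and Counters stand in for flash pages.
    """

    def __init__(self, ram_capacity=2, cs_capacity_pages=2):
        self.ram = Counter()              # in-memory buffer of combined updates
        self.ram_capacity = ram_capacity  # distinct keys that trigger stage()
        self.cs = []                      # change segment: append-only pages
        self.cs_capacity_pages = cs_capacity_pages
        self.data = Counter()             # stand-in for the data segment
        self.stages = 0
        self.merges = 0

    def update(self, key, count=1):
        self.ram[key] += count
        if len(self.ram) >= self.ram_capacity:
            self.stage()

    def stage(self):
        # Sequential page-level write: append the buffer as one CS page.
        self.cs.append(dict(self.ram))
        self.ram.clear()
        self.stages += 1
        if len(self.cs) >= self.cs_capacity_pages:
            self.merge()

    def merge(self):
        # Block-level work happens only here, when the CS is full.
        for page in self.cs:
            self.data.update(page)        # Counter.update adds the counts
        self.cs.clear()
        self.merges += 1

    def query(self, key):
        # A query consults RAM, the CS, and the data segment.
        staged = sum(page.get(key, 0) for page in self.cs)
        return self.ram[key] + staged + self.data[key]
```

In this sketch merge() runs once per `cs_capacity_pages` calls to stage(), so enlarging the CS trades fewer block-level merges for a larger search space in query().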

14 Change Segment Structure v1
RAM buffer buckets are assigned specific Change Segment buckets. Each Change Segment bucket is shared by multiple RAM buffer buckets.

15 Memory Disk Bounded Buffer (MDB)
- Associate a CS block with k data blocks
  - Semi-random writes
- Only merge() full CS blocks
  - Frequently updated blocks may incur numerous (k-1) merge() operations
- Query times incur an additional block read
  - Packed with unwanted data
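
A partitioned change segment of this kind might look like the sketch below. It assumes that k contiguous data blocks share one CS block (via integer division) and that a CS block holds a fixed number of entries; both choices, and the name `PartitionedChangeSegment`, are hypothetical.

```python
from collections import defaultdict


class PartitionedChangeSegment:
    """Toy MDB-style CS: each CS block serves k data blocks (assumed contiguous)."""

    def __init__(self, k=2, cs_block_capacity=3):
        self.k = k
        self.cs_block_capacity = cs_block_capacity  # entries per CS block
        # CS block id -> list of (data_block, key, count) entries.
        self.cs_blocks = defaultdict(list)
        # Data segment stand-in: data block id -> {key: count}.
        self.data = defaultdict(lambda: defaultdict(int))
        self.merges = 0

    def cs_block_of(self, data_block):
        # Assumed mapping: data blocks 0..k-1 share CS block 0, and so on.
        return data_block // self.k

    def stage(self, data_block, key, count):
        cs_id = self.cs_block_of(data_block)
        self.cs_blocks[cs_id].append((data_block, key, count))
        # Only a full CS block is merged; other CS blocks are left alone.
        if len(self.cs_blocks[cs_id]) >= self.cs_block_capacity:
            self.merge(cs_id)

    def merge(self, cs_id):
        for data_block, key, count in self.cs_blocks[cs_id]:
            self.data[data_block][key] += count
        self.cs_blocks[cs_id] = []
        self.merges += 1
```

A query for a key in one data block has to read the whole shared CS block, including entries staged for its k-1 neighbours, which is the "packed with unwanted data" cost noted on the slide.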

16 Change Segment Structure v2
As buckets are flushed, they are written sequentially to the change segment one page at a time.

17 MDB-L
- No partitions in the CS
  - Allows frequently updated blocks to have maximum space
- merge() all blocks when the CS is full
  - Potentially expensive
  - Very infrequent
- Queries are supported by pointers
  - As blocks are staged onto the CS, their pages are recorded for later retrieval
  - Prefetch
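
The pointer bookkeeping for an unpartitioned, log-structured CS can be sketched like this. The class `LoggedChangeSegment` and its in-memory `page_index` are assumptions made for illustration; the slides only state that the pages of staged blocks are recorded for later retrieval.

```python
from collections import defaultdict


class LoggedChangeSegment:
    """Toy MDB-L-style CS: one sequential log plus per-block page pointers."""

    def __init__(self):
        self.pages = []                      # CS pages in write order
        self.page_index = defaultdict(list)  # block id -> CS page numbers

    def stage(self, block_id, updates):
        """Append one page of {key: count} updates for block_id to the log."""
        self.pages.append(dict(updates))
        self.page_index[block_id].append(len(self.pages) - 1)

    def staged_count(self, block_id, key):
        # Read only the pages recorded for this block; skip the rest of the log.
        page_numbers = self.page_index.get(block_id, [])
        return sum(self.pages[p].get(key, 0) for p in page_numbers)
```

For example, after staging pages for blocks 3, 7, and 3 again, `page_index[3]` holds the page numbers 0 and 2, so a query on block 3 never reads the page written for block 7.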

18 Expectations
- MB will incur more cleans than MDB or MDB-L
  - Frequent merge() operations will incur block erasures
- MDB and MDB-L will incur slightly higher query times
  - Due to the addition of the CS
- MDB and MDB-L will have superior I/O performance
  - Most operations are page level
  - Fewer erasures lead to lower latency

19 Experimental Setup (Application)
- TF-IDF
  - Term Frequency-Inverse Document Frequency
  - Word importance is highest for infrequent words
  - Requires a counting hash table
  - Useful in many data mining and IR applications (document classification and search)
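
To show where the counting hash table enters, here is a small in-memory TF-IDF sketch. It uses the common weighting tf * log(N / df); the slides do not say which variant the authors used, so the formula and the function name `tf_idf` are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    Assumes whitespace tokenization and the weighting tf * log(N / df).
    """
    doc_freq = Counter()               # counting hash table: term -> number of docs
    term_counts = []
    for doc in docs:
        counts = Counter(doc.split())  # counting hash table: term -> frequency
        term_counts.append(counts)
        doc_freq.update(counts.keys())
    n = len(docs)
    return [
        {term: tf * math.log(n / doc_freq[term]) for term, tf in counts.items()}
        for counts in term_counts
    ]
```

A term that occurs in every document scores zero, while a term confined to a few documents scores higher, matching the slide's point that importance is highest for infrequent words. Both `Counter` objects are updated in a random-access pattern, which is the workload the flash-resident designs target.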

20 Experimental Setup (DataSets)
- 100,000 random Wikipedia articles
  - 136M keywords
  - 9.7M entries
- MemeTracker (Aug 2009 dump)
  - 402M total entries
  - 17M unique

21 Experimental Setup (Method)
- 1M random queries were issued during the insertion phase
  - 10 random workloads; queried keys need not be in the table
- Measure query performance, I/O time, and cleans
- Used three SSD configurations
  - One Single Level Cell (SLC) vs. two Multi Level Cell (MLC) configurations
  - MLC is more popular: cheaper per GB but with a shorter lifetime
  - SLC has a lower internal error rate and faster response times (see paper for specific configurations)
- DiskSim and the Microsoft SSD plugin
  - Used for benchmarking and fine-tuning our SSD

22 Results (AVERAGE Query Time)
Varying the in-memory buffer size, as a percentage of the data segment, reduces the average query time by only fractions of a second. This suggests that the majority of the query time is incurred by the disk.

23 Results (AVERAGE Query Time)
Varying the on-disk buffer size, as a percentage of the data segment, decreases the average query time substantially for MDB-L. This reduction is seen in both datasets. MDB requires block reads in the CS.

24 Results (AVERAGE Query Time)
Using the Wiki dataset, we compared SLC with MLC and observed consistent performance.

25 Results (AVERAGE I/O)
In this experiment, we set the in-memory buffer to 5% and the CS to 12.5% of the primary hash table size. Simulation time is highest for MB because of the block erasures (next slide). MDB-L is faster than MDB because of the increased page-level operations.

26 Results (Cleans/Erasures)
Cleans are extremely low for both MDB and MDB-L relative to MB. This is caused by the page-level sequential operations. Queries are affected by cleans because the SSD must allocate resources to cleaning moving

27 Discussion and Conclusion
- Flash Devices are gaining popularity
  - Low latency, high random read performance, low energy
  - Limited lifetime, poor random write performance
- Hash tables are useful data structures in many data mining and IR algorithms
  - They exhibit random write patterns
  - Challenging for Flash Devices
- We have demonstrated that a proper hash table for Flash Devices will have
  - An in-memory buffer for batch memory-to-disk updates
  - An on-disk data buffer with page-level operations

28 Future work
- Our current designs rely on hash functions that use the mod operator
  - Extendible hashing
- Checkpoint methods for crash recovery
- Examine on a real SSD
  - DiskSim is great for fine-tuning and examining statistics

29 Questions?

