FSort: External Sorting on Flash-based Sensor Devices Panayiotis Andreou, Orestis Spanos, Demetrios Zeinalipour-Yazti, George Samaras University of Cyprus,

Slides:



Advertisements
Similar presentations
IEEE SECON-2005, Tue, September 27, 2005 (C) Anirban Banerjee and Abhishek Mitra RISE & Co-S: A high performance Co-proceSsing Sensor architecture for.
Advertisements

CS 400/600 – Data Structures External Sorting.
Equality Join R X R.A=S.B S : : Relation R M PagesN Pages Relation S Pr records per page Ps records per page.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 External Sorting Chapter 11.
1 External Sorting Chapter Why Sort?  A classic problem in computer science!  Data requested in sorted order  e.g., find students in increasing.
External Sorting CS634 Lecture 10, Mar 5, 2014 Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Query Evaluation Chapter 11 External Sorting.
CS 4432lecture #31 CS4432: Database Systems II Lecture #3 Professor Elke A. Rundensteiner.
Query Optimization 3 Cost Estimation R&G, Chapters 12, 13, 14 Lecture 15.
1 External Sorting Chapter Why Sort?  A classic problem in computer science!  Data requested in sorted order  e.g., find students in increasing.
External Sorting 198:541. Why Sort?  A classic problem in computer science!  Data requested in sorted order e.g., find students in increasing gpa order.
CSCI 5708: Query Processing I Pusheng Zhang University of Minnesota Feb 3, 2004.
Perimeter-based Data Acquisition and Replication in Mobile Sensor Networks Panayiotis Andreou (Univ. of Cyprus) Demetrios Zeinalipour-Yazti (Univ. of Cyprus)
SenseSwarm: A Perimeter-based Data Acquisition Framework for Mobile Sensor Networks Demetrios Zeinalipour-Yazti (Open Univ. of Cyprus) Panayiotis Andreou.
External Sorting Chapter 13.. Why Sort? A classic problem in computer science! Data requested in sorted order  e.g., find students in increasing gpa.
External Sorting Problem: Sorting data sets too large to fit into main memory. –Assume data are stored on disk drive. To sort, portions of the data must.
Lecture 11: DMBS Internals
Modularizing B+-trees: Three-Level B+-trees Work Fine Shigero Sasaki* and Takuya Araki NEC Corporation * currently with 1st Nexpire Inc.
Physical Storage and File Organization COMSATS INSTITUTE OF INFORMATION TECHNOLOGY, VEHARI.
Flashing Up the Storage Layer I. Koltsidas, S. D. Viglas (U of Edinburgh), VLDB 2008 Shimin Chen Big Data Reading Group.
Indexing and Searching in Wireless Sensor Networks Demetris Zeinalipour [ ] School of Pure and Applied Sciences Open University of.
/38 Lifetime Management of Flash-Based SSDs Using Recovery-Aware Dynamic Throttling Sungjin Lee, Taejin Kim, Kyungho Kim, and Jihong Kim Seoul.
Speaker: 吳晋賢 (Chin-Hsien Wu) Embedded Computing and Applications Lab Department of Electronic Engineering National Taiwan University of Science and Technology,
©Silberschatz, Korth and Sudarshan13.1Database System Concepts Chapter 13: Query Processing Overview Measures of Query Cost Selection Operation Sorting.
Workload-aware Optimization of Query Routing Trees in Wireless Sensor Networks Panayiotis Andreou (Univ. of Cyprus) Demetris Zeinalipour-Yazti (Open Univ.
MINT Views: Materialized In-Network Top-k Views in Sensor Networks Demetrios Zeinalipour-Yazti (Uni. of Cyprus) Panayiotis Andreou (Uni. of Cyprus) Panos.
MicroHash:An Efficient Index Structure for Flash-Based Sensor Devices Demetris Zeinalipour [ ] School of Pure and Applied Sciences.
ETC: Energy-driven Tree Construction in Wireless Sensor Networks Panayiotis Andreou (Univ. of Cyprus) Andreas Pamboris (Univ. of California – San Diego)
Overview of Physical Storage Media
Sorting.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 External Sorting Chapter 13.
Indexing.
External Storage Primary Storage : Main Memory (RAM). Secondary Storage: Peripheral Devices –Disk Drives –Tape Drives Secondary storage is CHEAP. Secondary.
Database Management Systems, R. Ramakrishnan and J. Gehrke 1 External Sorting Chapter 13.
Design of Flash-Based DBMS: An In-Page Logging Approach Sang-Won Lee and Bongki Moon Presented by Chris Homan.
CPSC 404, Laks V.S. Lakshmanan1 External Sorting Chapter 13 (Sec ): Ramakrishnan & Gehrke and Chapter 11 (Sec ): G-M et al. (R2) OR.
CPSC 404, Laks V.S. Lakshmanan1 External Sorting Chapter 13: Ramakrishnan & Gherke and Chapter 2.3: Garcia-Molina et al.
MicroHash:An Efficient Index Structure for Flash-Based Sensor Devices Demetris Zeinalipour [ ] School of Pure and Applied Sciences.
1 External Sorting. 2 Why Sort?  A classic problem in computer science!  Data requested in sorted order  e.g., find students in increasing gpa order.
1 Database Systems ( 資料庫系統 ) December 7, 2011 Lecture #11.
Chapter 12 Query Processing (1) Yonsei University 2 nd Semester, 2013 Sanghyun Park.
DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.
CS411 Database Systems Kazuhiro Minami 11: Query Execution.
연세대학교 Yonsei University Data Processing Systems for Solid State Drive Yonsei University Mincheol Shin
Katie Hintze.  A non-volatile storage device that stores digitally encoded data  Introduced in 1956  Long-term persistent storage.
Introduction to Database Systems1 External Sorting Query Processing: Topic 0.
Relational Operator Evaluation. overview Projection Two steps –Remove unwanted attributes –Eliminate any duplicate tuples The expensive part is removing.
DMBS Internals I February 24 th, What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the.
DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.
DMBS Architecture May 15 th, Generic Architecture Query compiler/optimizer Execution engine Index/record mgr. Buffer manager Storage manager storage.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 External Sorting Chapter 11.
External Sorting. Why Sort? A classic problem in computer science! Data requested in sorted order –e.g., find students in increasing gpa order Sorting.
External Sorting Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY courtesy of Joe Hellerstein for some slides.
1 Lecture 16: Data Storage Wednesday, November 6, 2006.
Data Storage and Querying in Various Storage Devices.
Demetrios Zeinalipour-Yazti (Univ. of Cyprus)
Lecture 16: Data Storage Wednesday, November 6, 2006.
Chapter 12: Query Processing
Lecture 11: DMBS Internals
Database Management Systems (CS 564)
Computer Application Waseem Gulsher
Lecture 2- Query Processing (continued)
Chapter 12 Query Processing (1)
External Sorting.
Sorting We may build an index on the relation, and then use the index to read the relation in sorted order. May lead to one disk block access for each.
Database Systems (資料庫系統)
External Sorting Dina Said
CSE 190D Database System Implementation
Presentation transcript:

FSort: External Sorting on Flash-based Sensor Devices Panayiotis Andreou, Orestis Spanos, Demetrios Zeinalipour-Yazti, George Samaras University of Cyprus, Cyprus Panos K. Chrysanthis University of Pittsburgh, USA 6 th Intl. Workshop on Data Management for Sensor Networks (DMSN’09), in conjunction with VLDB’09, Lyon, France, 2009 DMSN’2009, Lyon, France © Andreou, Spanos, Zeinalipour-Yazti, Samaras, Chrysanthis

512 MB Motivation Wireless Sensor Devices have evolved from primitive devices into powerful computing devices. More Computation Power and More Storage! Mica2 / Mica2Dot: Imote-2: 32MB Flash Imote-2 Multimedia Board SunSpot: 4MB Flash Applications: Remote Logging / Environmental Monitoring Applications: Digital Image Processing, etc. 128KB Flash Applications: People- Centric Sensing (sensing data by people for people) IPhone: 32GB Flash (GPS, camera, mic ambient light, 3-axis accelerometer, digital compass, …) HTC Magic 512MB Flash RISE: 1GB Flash

3 Motivation As data storage mediums continue to grow, Sensor Device Developers tend to store more data in-situ, e.g., –To increase the Sensing Fidelity Capture data locally, preprocess it, then transmit the useful information over to the sink, e.g., sound detection application –To overcome the problem of a Distant Sink Applicable in Mobile Sensor Networks Storing large quantities of data locally on each sensor calls for database techniques … –Indexing (e.g., “MicroHash”, In USENIX FAST’05) –Architectures (e.g., FlashDB, In IPSN’07) –External Sorting (e.g., FSort, In DMSN’09) –…. MilliBots (CMU)

4 Sorting on Sensors: Why is it Necessary? C) “Find GPS locations where the temperature is in the range A to B” Problem: Temperature Data needs be sorted on the given attribute or some index structure might be necessary (e.g., B+tree) Other Examples: JOIN the readings of two sensors, Sort local Images, Remove duplicates,. “Find top-k rooms with the lowest noise” … X,Y,30 o C X,Y,25 o C X,Y,23 o C..... # # # # # # # # a) Acquire Data b) Store it Locally

5 Background: Flash Memory The most prevalent storage medium used for Sensor Devices is Flash Memory (NAND Flash) Fastest Growing Memory Market (‘05 $8.7B, ’06 $11B) Surface Mount FlashRemovable Flash

Background: Flash Memory Simple Cell Architecture  High capacity in a small surface  Economical Reproduction Fast Random Access (50-80 μs) compared to 10-20ms in Magnetic Disks Shock Resistant & Power Efficient  Appropriate for Sensor and Mobile Devices

7 Background: Characteristics of Flash 1.Delete-Constraint: Erasure of a page can only be performed at a block granularity (i.e. 8KB~64KB) 2.Write-Constraint: Data is written to flash on a page granularity (256B~512B) to an empty page (delete page if necessary). 3.Wear-Constraint: Each page can only be written a limited number of times (typically 10, ,000) 4.Asymmetric Read/Write: Reading Faster (Cheaper) than Writing (23mJ vs. 763mJ on the RISE sensor) X…

8 Presentation Outline 1.Motivation and Background 2.Overview of Sorting on Sensor Devices 3.The FSort External Sorting Algorithm 4.Experimental Evaluation 5.Conclusions and Future Work

Sorting can be performed in two modes: Online Sorting –Keep a target sequence sorted at all times. –i.e., upon reception of a new data record (e.g., after data acquisition), sort sequence using the InsertionSort idea. Offline Sorting –Log data records in an Unsorted File until Sorting function is requested. –Problem: Unsorted File might be too large to fit in main memory, thus In-Memory Sorting (using quicksort, mergesort, etc) is not feasible –Solution: Use an External Sorting Algorithm 9 Sorting on Sensor Devices

10 Inefficiency of Online Sorting … Data page Empty page 5 New page Flash memoryMain Memory … … 5 Block Erase … Reads + 7 Writes + 2 Block Erases Decreased Performance + Decreased Lifetime Online Sorting is Inefficient! For 1 write it took us: ABC

11 Presentation Outline 1.Motivation and Background 2.Overview of Sorting on Sensor Devices 3.FSort External Sorting Algorithm 4.Experimental Evaluation 5.Conclusions and Future Work

12 FSort Algorithm FSort is an external sorting algorithm appropriate for flash-media on sensor devices Characteristics Offline External Sorting Algorithm –It sorts on-demand, arbitrary large sequences with a fixed-sized (main) memory. Appropriate for Sensor Devices –Its performance is boosted by the temporal coherence in sensor readings (i.e., values that don’t change quickly over time). Designed for Flash –It deletes blocks sequentially –It minimizes read/write/block erase operations.

13 FSort: Outline of Operation FSort consists of two phases: Sorting phase: –Perform a Scan of the Flash media using M block iterators. –Use the Replacement-Selection concept to produce Sorted Runs (Sequences) –Append the Sorted Runs to the end of the Flash Media Merging phase –Merge the sorted runs recursively until a single sorted output is produced. Main memory Buffer Flash

14 FSort: Sorting Phase Flash Media Main Memory 7453 Flash Media a) Initialize B-1 Block Iterators. b) Write next smallest elem. >= current elem. to run X Run X Run X+1 If above not possible, output next elem. to run X+1

15 FSort: Merging Phase Example: B Buffer Pages (Input: B-1, Output: 1) Flash Media Number of Passes over Data? 2 times

16 FSort and the Delete Constraint A block is deleted only whenever all pages within the given block are empty –This satisfies the Flash Delete Constrain Block Erase Additionally, due to temporal coherence (i.e., values within a block are similar) blocks are usually freed up quickly.

17 Presentation Outline 1.Motivation and Background 2.Overview of Sorting on Sensor Devices 3.FSort External Sorting Algorithm 4.Experimental Evaluation 5.Conclusions and Future Work

18 Experimental Evaluation Testbed: Trace-driven Simulator –Written in C++ (idea also prototyped in LiteOS/C on IRIS Sensor) –Emulates Flash (Page=512B, Block=16KB) –Measures: # of read (1.17mA), write (37mA), erases (18mA) Energy Consumption Energy = Volts x Amperes x Seconds Dataset: Intel Research Berkeley –Temperature, Humidity, Light, Vol, etc. –2.3 million readings Compared Algorithms –Online vs. Offline Sorting InsertionSort vs. External Merge Sort –Comparison of Offline Sorting External Merge Sort vs. FSort

19 Online vs. Offline Sorting a)Online Sorting is 2 Orders slower (less efficient) than Offline Sorting b)Same number of Read/Write operations, yet writing is overall more Expensive Sorting 1000 records (1 record / page) 96mJ 3052mJ RISE Energy Model Write: 37mA Read: 1.17mA

20 External MergeSort vs. FSort FSort is 25-40% more efficient than External MergeSort. Task: Sort records (on temperature) after adding 1000 measurement to local flash media.

21 External Merge Sort Example: B Buffer Pages (Input: B-1, Output: 1) Flash MediaMain MemoryFlash Media Drawback of External Merge Sort: Small Runs  Here 4 rather than ~8 (Fsort),  In our experiments we found that they are 2-3 times smaller than FSort

22 Presentation Outline 1.Motivation and Background 2.Overview of Sorting on Sensor Devices 3.FSort External Sorting Algorithm 4.FSort Experimental Evaluation 5.Conclusions and Future Work

23 Conclusions and Future Work We have shown that FSort improves response time and energy consumption for sorting by % compared to EMS. In the future we plan to: –Formally analyze the performance of the FSort algorithm. –Empirically assess its performance on real devices and storage mediums (External Flash cards, Smart-phones, SSDs, …) –Study the utilization of complementary structures for sorting (e.g., indexes)

FSort: External Sorting on Flash-based Sensor Devices Thank you! This presentation will be available at: Questions? DMSN’2009, Lyon, France © Andreou, Spanos, Zeinalipour-Yazti, Samaras, Chrysanthis