Parallel Data Laboratory, Carnegie Mellon University

Presentation transcript:

Active Disks. Parallel Data Laboratory, Carnegie Mellon University. www.pdl.cs.cmu.edu/Active Storage Systems CSE 598D - Yogesh Sreenivasan

References: "Active Disks: Programming Model, Algorithms and Evaluation", Anurag Acharya, Mustafa Uysal and Joel Saltz; "Active Disks for Large-Scale Data Processing", Erik Riedel, Christos Faloutsos, Garth A. Gibson and David Nagle.

Motivation: Data storage is growing faster than processor speeds, and if this trend continues, processors may not be able to keep up with growing requirements. CPU speeds double every 18 months (Moore’s Law). Customers double their data every 5 months - Greg. With improvements in data transfer rates, even a state-of-the-art processor can keep only a small number of drives busy, so it cannot exploit the parallelism provided by the disks.

Proposed Solution: Offload the bulk of data processing to the disk-resident processors. The host processor is used mainly for coordination, scheduling and combining the results. Active Disks execute application-level functions directly at the disks and send only the results back to the host. [Figure labels: CPU, System Bus, RAM, Disk, Pgm, Host-Resident Component, Disklets, Active Disks]

Advantages: Leverage the parallelism offered in large storage systems, since there is more aggregate CPU power in the disks than in the server. Reduce interconnect bandwidth, since disk-based filtering discards a fraction of the data that would otherwise move across the interconnect. Appropriate applications: execution time is dominated by a data-intensive core; the core allows a parallel implementation; the processing has reasonable selectivity.

Architecture: The application program is partitioned into a host-resident component and a disk-resident component (“disklets”). Disklets are downloaded from the host and do the major share of the processing. Stream-based programming model: a disklet reads input streams from disk, writes output streams to disk, and may use optional scratch space. A disklet cannot initiate I/O operations or allocate free memory.

Programming Model: Stream-based programming model for disklets and their interaction with host-resident peers. There are three types of stream: disk-resident streams (files), host-resident streams (used to interact with disklets), and pipe streams (used to pipe results among disklets). Communication between a disklet and its environment is restricted to its input and output streams.
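A minimal sketch of the stream model, assuming Python generators stand in for streams. The disklet names (`keep_even`, `square`) and the integer records are hypothetical and do not come from the cited papers. Each disklet sees only its input stream and produces only its output stream, which mirrors the restriction stated above.

```python
from typing import Iterable, Iterator


def disk_resident_stream(records: Iterable[int]) -> Iterator[int]:
    """Stands in for a file on the disk that is read as a stream."""
    yield from records


def keep_even(inp: Iterator[int]) -> Iterator[int]:
    """Hypothetical disklet: passes through even records only."""
    for record in inp:
        if record % 2 == 0:
            yield record


def square(inp: Iterator[int]) -> Iterator[int]:
    """Hypothetical disklet fed by a pipe stream from keep_even."""
    for record in inp:
        yield record * record


# Disk-resident stream -> disklet -> pipe stream -> disklet -> host-resident stream.
host_resident_stream = list(square(keep_even(disk_resident_stream(range(6)))))
print(host_resident_stream)
```

Neither disklet opens a file or requests memory on its own; the surrounding runtime (here, plain function composition) wires the streams together.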

Disk/Host-Level OS: Disk OS: memory management (allocation and management of buffers); stream communication (possible to overlap data movement and communication); disklet scheduling (a disklet is ready to run whenever there is data available). Host-level OS: installation of disklets (analyzing disklet code to ensure safety); management of host-resident buffers.

Algorithms: SQL SELECT filters tuples from a relation based on a user-specified predicate: SELECT id_1, id_2 FROM table_1 WHERE <condition>. Conventional disk algorithm: read all tuples from the disk and keep only those that match the predicate. Active disk algorithm: the disklet reads tuples (on disk) from the input stream (file) and writes matching tuples to the output stream. It sends only the matching results to the host when finished, or partial results whenever the output stream fills up.
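The active disk SELECT described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the three-field tuple layout, the `out_capacity` parameter and the function names are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Row = Tuple[int, int, int]  # hypothetical layout: (id_1, id_2, value)


def select_disklet(rows: Iterable[Row],
                   predicate: Callable[[Row], bool],
                   out_capacity: int) -> Iterator[List[Tuple[int, int]]]:
    """Runs at one disk: emits a batch of matching (id_1, id_2) pairs
    whenever the output buffer fills up, plus a final partial batch."""
    buffer: List[Tuple[int, int]] = []
    for row in rows:
        if predicate(row):
            buffer.append((row[0], row[1]))
            if len(buffer) == out_capacity:
                yield buffer
                buffer = []
    if buffer:
        yield buffer


def host_select(disks: List[List[Row]],
                predicate: Callable[[Row], bool],
                out_capacity: int = 2) -> List[Tuple[int, int]]:
    """Runs at the host: only collects batches that the disks ship back."""
    results: List[Tuple[int, int]] = []
    for rows in disks:
        for batch in select_disklet(rows, predicate, out_capacity):
            results.extend(batch)
    return results


disks = [
    [(1, 10, 5), (2, 20, 50), (3, 30, 70)],
    [(4, 40, 90), (5, 50, 1)],
]
matches = host_select(disks, lambda row: row[2] > 40)
print(matches)
```

Only three of the five tuples cross the interconnect in this toy run; in the conventional algorithm all five would.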

Algorithms: SQL GROUP-BY partitions the relation into disjoint sets of tuples based on the value(s) of the index attribute(s): SELECT MAX(id_1), id_2 FROM table_1 GROUP BY id_2. Conventional disk algorithm: read all tuples from the disk and accumulate group-by results at the host. Active disk algorithm: perform a local group-by at the disk and send the final vector of aggregates, shipping partial results when the disk runs out of space.
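A sketch of the active disk GROUP-BY for the MAX query above, assuming each disk can hold at most `max_groups` aggregates before it has to ship a partial result. The parameter, the function names and the sample data are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def groupby_max_disklet(rows: Iterable[Tuple[int, str]],
                        max_groups: int) -> Iterator[Dict[str, int]]:
    """Runs at one disk: keeps MAX(id_1) per id_2 and ships the table
    as a partial result when a new group would not fit."""
    table: Dict[str, int] = {}
    for id_1, id_2 in rows:
        if id_2 not in table and len(table) == max_groups:
            yield table  # out of space: ship partial aggregates
            table = {}
        table[id_2] = max(id_1, table.get(id_2, id_1))
    if table:
        yield table  # final vector of aggregates


def host_groupby_max(disks: List[List[Tuple[int, str]]],
                     max_groups: int = 2) -> Dict[str, int]:
    """Runs at the host: combines partial aggregates from all disks."""
    merged: Dict[str, int] = {}
    for rows in disks:
        for partial in groupby_max_disklet(rows, max_groups):
            for key, value in partial.items():
                merged[key] = max(value, merged.get(key, value))
    return merged


disks = [
    [(3, "a"), (7, "b"), (5, "a"), (1, "c")],
    [(9, "c"), (2, "b")],
]
result = host_groupby_max(disks)
print(result)
```

Merging works because MAX is associative and commutative, so partial results can be combined in any order at the host.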

Algorithms: External sort: sorts database tuples when main memory is not large enough to hold all the records. Datacube: a general form of aggregation for relational databases (multidimensional aggregation), which performs GROUP-BY on multiple indexes. Image convolution: image processing operations such as edge detection, gradient detection, smoothing and blurring. Generating composite satellite images: generates earth images by compositing satellite-based data acquired over multiple days.

Evaluation: Experiments were conducted for two configurations: systems that could be purchased at the time, and systems that are likely to be available by the end of the decade. Experiments were performed with the number of disks varying from 4 to 32 to explore scalability.

Performance: Utility of Active Disks. Performance improvements of 1.07 to 3.15 times for current configurations, and of 1.18 to 3.2 times for predicted future configurations.

Performance: Variation in interconnect bandwidth. Active Disks outperform conventional disks for all configurations. Algorithms with significant computation per byte transferred cannot take advantage of a faster interconnect.

Performance: Scalability. For conventional disk architectures, increasing the number of disks does not provide any advantage, because the processor can keep only a small number of drives busy. Active disks scale well up to 16 disks.

Performance: Impact of host processor. For large configurations, the active disk architecture outperforms conventional disk architectures. The disk processor is assumed to be 200 MHz. For the sort algorithm, performance improves with increasing host processor speed for active disk architectures.

Performance Model: Host saturation (interconnect or server bottleneck); disk saturation (raw disk limitation); disk-CPU saturation (active disks CPU limitation); transfer saturation (active disks saturate their interconnect).
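One way to read the four saturation regimes is as terms in a bottleneck (minimum) calculation. The sketch below is an assumption made for illustration, not the model from the cited papers: the formulas, the parameter names and every number are hypothetical.

```python
def conventional_throughput(n_disks: int, disk_mb_s: float,
                            host_cpu_mb_s: float,
                            interconnect_mb_s: float) -> float:
    """Assumed model: limited by raw disks (disk saturation) or by the
    host CPU or interconnect (host saturation)."""
    return min(n_disks * disk_mb_s, host_cpu_mb_s, interconnect_mb_s)


def active_throughput(n_disks: int, disk_mb_s: float,
                      disk_cpu_mb_s: float, interconnect_mb_s: float,
                      fraction_returned: float) -> float:
    """Assumed model: limited by raw disks (disk saturation), by the
    per-disk CPUs (disk-CPU saturation), or by the interconnect carrying
    only the returned fraction of the data (transfer saturation)."""
    return min(n_disks * disk_mb_s,
               n_disks * disk_cpu_mb_s,
               interconnect_mb_s / fraction_returned)


# Hypothetical numbers chosen only to show how the bottleneck shifts.
for n in (4, 32):
    conv = conventional_throughput(n, disk_mb_s=10, host_cpu_mb_s=80,
                                   interconnect_mb_s=100)
    act = active_throughput(n, disk_mb_s=10, disk_cpu_mb_s=4,
                            interconnect_mb_s=100, fraction_returned=0.25)
    print(n, conv, act)
```

Under these made-up parameters the conventional system stops improving once the host CPU saturates, while the active system keeps scaling with the number of disks because its CPU capacity grows with them.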

Additional Applications: Nearest neighbor search: determines the k items in a database that are closest to a particular input item. The target record is sent to each of the disks, and each disk returns the k closest records in its portion. The server combines the results to find the overall k closest records. Edge detection: the active disk system performs edge detection for each image directly at the drives and returns only the edges to the server. Image registration: finds the rotation and translation from a reference image; similar to edge detection.
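The nearest neighbor search above can be sketched as a two-level top-k selection. This sketch assumes one-dimensional numeric records and absolute difference as the distance; both are simplifications for illustration, and the function names are hypothetical.

```python
import heapq
from typing import List


def disk_knn(records: List[float], target: float, k: int) -> List[float]:
    """Runs at one disk: returns the k records in its portion that are
    closest to the target."""
    return heapq.nsmallest(k, records, key=lambda r: abs(r - target))


def server_knn(disks: List[List[float]], target: float, k: int) -> List[float]:
    """Runs at the server: combines the per-disk candidates and keeps
    the overall k closest."""
    candidates = [r for records in disks for r in disk_knn(records, target, k)]
    return heapq.nsmallest(k, candidates, key=lambda r: abs(r - target))


disks = [[1, 8, 15], [4, 9, 30], [11, 2, 7]]
nearest = server_knn(disks, target=10, k=2)
print(sorted(nearest))
```

The result is exact because any record among the overall k closest must also be among the k closest on its own disk, so at most k records per disk need to cross the interconnect.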

Performance: With more disks, active disk systems achieve much higher throughput. Even in CPU-bound tasks, active disks show linear or near-linear improvement as the number of disks increases.

Conclusion: Active Disk systems leverage the parallelism available in large storage systems and dramatically reduce interconnect bandwidth requirements. They can improve throughput in data mining and multimedia applications, and in applications that can leverage parallelism and selectivity. Active Disks or PC Clusters?