Dynamic Data Layout Optimization for High Performance Parallel I/O

Presentation transcript:

Dynamic Data Layout Optimization for High Performance Parallel I/O
Everett Rush, Bryan Harris, Nihat Altiparmak (University of Louisville, USA)
Ali Saman Tosun (UT San Antonio, USA)
HiPC 2016, 12/20/2016

Outline
- Background
  - High Performance Parallel I/O
  - Block Correlations
- Dynamic Data Layout Optimization
  - Monitoring & Analysis
  - Placement Planning
  - Data Reorganization
- Evaluation
- References

High Performance Parallel I/O
[Figure: blocks 1 to 25 laid out across Disk 0 to Disk 4, contrasting "Five Parallel Disk Accesses" with "One Parallel Disk Access"]
Static Data Placement:
- Disk Modulo [Du ’82]
- RAID [Patterson ’88]
- Field-wise Exclusive OR [Kim ’88]
- Hilbert [Faloutsos ’93]
- Generalized Fibonacci [Prabhakar ’98]
- AOPT: Almost Optimal [Atallah ’00]
- Periodic [Altiparmak ’12]
One Layout Fits All!
Dynamic Data Layout Optimization is necessary!
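The contrast in the figure can be made concrete with a minimal sketch. It assumes a 5x5 grid of blocks numbered row by row and striped round-robin over five disks (block b goes to disk b mod 5). The grid numbering and the helper names `disk_of` and `parallel_accesses` are illustrative assumptions, not taken from the slides.

```python
from collections import Counter

N_DISKS = 5
GRID = 5  # assumed 5x5 grid, blocks numbered 0..24 row by row

def disk_of(block):
    # Round-robin striping: consecutive blocks go to consecutive disks.
    return block % N_DISKS

def parallel_accesses(blocks):
    # A request completes once the busiest disk has served all of its blocks.
    return max(Counter(disk_of(b) for b in blocks).values())

row_query = [0 * GRID + c for c in range(GRID)]     # blocks 0, 1, 2, 3, 4
column_query = [r * GRID + 0 for r in range(GRID)]  # blocks 0, 5, 10, 15, 20

print(parallel_accesses(row_query))     # 1: every block is on a different disk
print(parallel_accesses(column_query))  # 5: every block is on disk 0
```

The same static layout is ideal for one query shape and worst-case for another, which is the motivation for adapting the layout to the observed workload.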

Block Correlations
- Blocks are correlated if they are requested together [Li ’04]
- Correlations can exist within a single request (intra) or across requests (inter)
- Commonly encountered in storage workloads
- Correlated blocks should be placed on separate disks!

Dynamic Data Layout Optimization Framework
- A generic framework for self-optimizing parallel storage systems
- Can be applied to storage arrays, parallel/distributed file systems, key-value stores, and the internal parallelism of NVM devices for high performance parallel I/O
- Automatically adapts to skewed, changing, and co-existing patterns

Monitoring & Analysis Modules
Disk I/O Monitoring:
- Monitor block-level I/O requests, for example with an I/O tracing tool such as blktrace [Axboe ’07]
- Create sessions of block IDs that are requested together
- [Figure: Monitoring Output]
Data Analysis:
- Analyze sessions and find block correlations
- Use Frequent Itemset Mining (FIM) [Borgelt ’12] algorithms to find correlated pairs and their frequency
- Use support as the minimum frequency threshold
- [Figure: Analysis Output]
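The analysis step can be sketched as plain pair counting, which is frequent itemset mining restricted to itemsets of size two. This is a simplified stand-in, not one of the FIM algorithms from [Borgelt ’12]; the function name `correlated_pairs` and the sample sessions are made up for illustration.

```python
from collections import Counter
from itertools import combinations

def correlated_pairs(sessions, min_support):
    """Count how often each block pair co-occurs in a session; keep frequent ones."""
    counts = Counter()
    for session in sessions:
        # Deduplicate and sort so (a, b) and (b, a) count as the same pair.
        for pair in combinations(sorted(set(session)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical sessions of block IDs produced by the monitoring step.
sessions = [[1, 2, 3], [1, 2], [2, 3, 4], [1, 2, 4]]
pairs = correlated_pairs(sessions, min_support=2)
print(pairs)  # {(1, 2): 3, (2, 3): 2, (2, 4): 2}
```

Raising `min_support` shrinks the set of pairs and therefore the correlation graph handed to placement planning.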

Placement Planning Module
The aim is to place correlated blocks on separate disks (parallel storage units).
Basic Layout Optimization Problem (BLOP):
Definition 1: Given a set C of correlated block pairs (i, j) and N disks, plan a placement strategy so that for every block pair (i, j) ∈ C, blocks i and j are stored on different disks.
Theorem 1: BLOP is NP-complete and equivalent to the proper (vertex) k-coloring [Jensen ’11] problem for k = N.
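Definition 1 reads directly as a proper-coloring check: blocks are vertices, correlated pairs are edges, and disks are colors. The sketch below only verifies a given placement; it does not solve BLOP. The name `is_valid_placement` and the sample data are illustrative assumptions.

```python
def is_valid_placement(correlated, placement):
    """True if every correlated pair (i, j) is stored on two different disks."""
    return all(placement[i] != placement[j] for i, j in correlated)

C = [(1, 2), (2, 3), (1, 3)]       # a triangle of correlated blocks
three_disks = {1: 0, 2: 1, 3: 2}   # each block on its own disk
two_disks = {1: 0, 2: 1, 3: 0}     # blocks 1 and 3 collide on disk 0

print(is_valid_placement(C, three_disks))  # True
print(is_valid_placement(C, two_disks))    # False
```

A triangle cannot be placed conflict-free on two disks, just as it cannot be properly 2-colored.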

Placement Planning Module
BLOP outlines the main purpose, but it needs to be modified before it can be applied in real settings:
- Optimal coloring is generally not feasible (|V| ≫ N)
  - Use soft coloring techniques that minimize the conflicts [Fitzpatrick ’01]
- Each disk has a maximum capacity that must not be exceeded
  - Use traditional bin-packing techniques
Min-Conflict Bin Packing (MCBP) [Khanafer ’12]:
Definition 2: Given a set I of items i of size w_i, N bins of size W, and a conflict graph G = (I, E) where (i, j) ∈ E if items i and j cannot be packed in the same bin, compute the minimum number of conflicts that must occur if the set I is packed in N bins of size W.
Theorem 2: MCBP is NP-complete.
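The two quantities in Definition 2, capacity feasibility and the number of conflicts, can be evaluated for any candidate packing. The sketch below scores one assignment and does not search for the minimum. The name `evaluate_packing` and the sample items are hypothetical.

```python
def evaluate_packing(sizes, conflicts, assignment, n_bins, capacity):
    """Return (feasible, conflict_count) for a given item-to-bin assignment."""
    load = [0] * n_bins
    for item, b in assignment.items():
        load[b] += sizes[item]
    # Feasible only if no bin holds more than its size W (here: capacity).
    feasible = all(used <= capacity for used in load)
    # A conflict occurs when both endpoints of an edge share a bin.
    conflict_count = sum(1 for i, j in conflicts if assignment[i] == assignment[j])
    return feasible, conflict_count

sizes = {"a": 2, "b": 2, "c": 2}
conflicts = [("a", "b"), ("b", "c"), ("a", "c")]
assignment = {"a": 0, "b": 1, "c": 0}
result = evaluate_packing(sizes, conflicts, assignment, n_bins=2, capacity=4)
print(result)  # (True, 1)
```

With three mutually conflicting items and only two bins, one conflict is unavoidable, so this assignment already reaches the minimum.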

Placement Planning Heuristic
- Start with the initial placement
- Calculate the Total Correlation Frequency (TCF) value of each vertex
- Perform local optimizations in TCF order
  - Consider the correlation strengths stored in edge weights when calculating conflicts
  - If there is more than one candidate color, consider disk capacities
- Repeat the local optimizations until delta conflicts < ε
- Worst-case time complexity: O(|V| log |V| + |E|)
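The steps above can be sketched as a simple min-conflict local search. This is an interpretation under stated assumptions, not the authors' implementation: TCF is taken as the sum of a vertex's edge weights, vertices are visited from highest to lowest TCF, capacity is counted in blocks, and ties are broken by staying put and then by the least-loaded disk. It also makes no attempt to match the complexity bound quoted on the slide. The name `plan_placement` and the sample graph are hypothetical.

```python
from collections import defaultdict

def plan_placement(edges, placement, n_disks, capacity, eps=1e-9, max_rounds=100):
    """edges: {(u, v): weight}; placement: {block: disk}. Returns a new placement."""
    placement = dict(placement)
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = w
        adj[v][u] = w
    # Total Correlation Frequency: sum of the weights of a vertex's edges (assumed).
    tcf = {v: sum(nb.values()) for v, nb in adj.items()}
    order = sorted(tcf, key=tcf.get, reverse=True)  # assumed: strongest first
    load = [0] * n_disks
    for d in placement.values():
        load[d] += 1

    def total_conflict():
        return sum(w for (u, v), w in edges.items() if placement[u] == placement[v])

    previous = total_conflict()
    for _ in range(max_rounds):
        for v in order:
            # Weighted conflict v would incur on each disk.
            cost = [0.0] * n_disks
            for nb, w in adj[v].items():
                cost[placement[nb]] += w
            current = placement[v]
            # Candidates: the current disk, or any disk with free space.
            candidates = [d for d in range(n_disks)
                          if d == current or load[d] < capacity]
            best = min(candidates, key=lambda d: (cost[d], d != current, load[d]))
            if best != current:
                load[current] -= 1
                load[best] += 1
                placement[v] = best
        now = total_conflict()
        if previous - now < eps:  # stop once a round no longer helps
            break
        previous = now
    return placement

edges = {(1, 2): 5, (2, 3): 3, (1, 3): 1, (3, 4): 2}
initial = {1: 0, 2: 0, 3: 1, 4: 1}
result = plan_placement(edges, initial, n_disks=3, capacity=2)
remaining = sum(w for (u, v), w in edges.items() if result[u] == result[v])
print(remaining)  # 0
```

Because a block only moves when its weighted conflict strictly drops, the total conflict never increases from one round to the next.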

Data Reorganization Module
Aim: Reconsider the color-to-disk mapping so that:
- Each color is mapped to a separate disk
- The number of block movements is minimized
Construct the problem as a flow network and solve it using min-cost flow techniques [Ford ’62]:
- Capacities are set to 1
- Costs are set to 0 if the edge is not between a color and a disk; otherwise, the cost is the amount of block movement caused by that mapping
- Push C units of flow from s to t
- Worst-case time complexity: O(|E|^{3/2} log(|V| · max(Cost))) [Goldberg ’15]
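The flow network described above can be sketched with a textbook successive-shortest-paths solver. This is a simple stand-in chosen for readability, not the unit-capacity algorithm of [Goldberg ’15], and it is slower than the bound on the slide. It assumes the number of disks equals the number of colors; the input layout `blocks_on` and the function names are hypothetical.

```python
def min_cost_flow(n, edges, s, t, flow_needed):
    """Successive shortest paths with Bellman-Ford. edges: (u, v, capacity, cost)."""
    graph = [[] for _ in range(n)]
    for u, v, cap, cost in edges:
        graph[u].append([v, cap, cost, len(graph[v])])     # forward arc
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])  # residual arc
    total_cost = 0
    for _ in range(flow_needed):
        dist = [float("inf")] * n
        parent = [None] * n  # (node, arc index) used to reach each node
        dist[s] = 0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == float("inf"):
                    continue
                for k, (v, cap, cost, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        parent[v] = (u, k)
                        changed = True
            if not changed:
                break
        if parent[t] is None:
            raise ValueError("not enough capacity")
        v = t
        while v != s:  # push one unit of flow along the cheapest path
            u, k = parent[v]
            graph[u][k][1] -= 1
            graph[v][graph[u][k][3]][1] += 1
            v = u
        total_cost += dist[t]
    return total_cost, graph

def map_colors_to_disks(blocks_on):
    """blocks_on[c][d]: number of blocks of color c currently stored on disk d."""
    C = len(blocks_on)
    s, t = 2 * C, 2 * C + 1
    edges = []
    for c in range(C):
        edges.append((s, c, 1, 0))      # source -> color, cost 0
        edges.append((C + c, t, 1, 0))  # disk -> sink, cost 0
        size = sum(blocks_on[c])
        for d in range(C):
            # Mapping color c to disk d moves every block of c not already on d.
            edges.append((c, C + d, 1, size - blocks_on[c][d]))
    moved, graph = min_cost_flow(2 * C + 2, edges, s, t, C)
    mapping = {}
    for c in range(C):
        for v, cap, cost, _ in graph[c]:
            if C <= v < 2 * C and cap == 0:  # saturated color -> disk arc
                mapping[c] = v - C
    return mapping, moved

mapping, moved = map_colors_to_disks([[1, 4, 0], [3, 0, 2], [0, 1, 5]])
print(mapping, moved)  # {0: 1, 1: 0, 2: 2} 4
```

In the example, each color is sent to the disk that already holds most of its blocks, so only 4 of the 16 blocks have to move.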

Additional Optimizations
Preserving sequentiality is important for HDDs.
Solution: Group the sequential blocks from the same HDD and reorganize the groups together without breaking their sequentiality.
- Create a single vertex in the correlation graph for each group
- Update the edge weights of a group vertex considering group memberships
- Set an upper limit on the maximum group size in bytes based on the transfer rate of the HDD
  - Larger groups will work against parallel I/O
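The grouping step can be sketched as follows. It assumes block IDs are integers whose consecutive values are physically sequential on one HDD, and that the weight of an edge between two groups is the sum of the weights between their member blocks; the slides do not say how group edge weights are combined, so the summation is an assumption. The function names and sample data are hypothetical.

```python
def group_sequential(block_ids, block_size, max_group_bytes):
    """Split block IDs from one HDD into runs of consecutive IDs,
    capping each run at max_group_bytes."""
    max_blocks = max(1, max_group_bytes // block_size)
    groups = []
    for b in sorted(block_ids):
        last = groups[-1] if groups else None
        if last and b == last[-1] + 1 and len(last) < max_blocks:
            last.append(b)
        else:
            groups.append([b])
    return groups

def group_edges(edges, groups):
    """Collapse block-level edges into group-level edges (weights summed, assumed)."""
    owner = {b: g for g, members in enumerate(groups) for b in members}
    merged = {}
    for (u, v), w in edges.items():
        gu, gv = owner[u], owner[v]
        if gu != gv:  # edges inside one group disappear
            key = (min(gu, gv), max(gu, gv))
            merged[key] = merged.get(key, 0) + w
    return merged

groups = group_sequential([7, 3, 4, 5, 6, 20, 21],
                          block_size=4096, max_group_bytes=3 * 4096)
print(groups)  # [[3, 4, 5], [6, 7], [20, 21]]

merged = group_edges({(3, 20): 2, (5, 21): 1, (6, 20): 4, (3, 4): 9}, groups)
print(merged)  # {(0, 2): 3, (1, 2): 4}
```

The size cap splits the run 3 to 7 into two groups, which keeps each group small enough to be spread across disks.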

Evaluation
- Simulations using DiskSim [Bucy ’08] + SSD patch [Agrawal ’08]
  - HDD-based Storage Array (HSA) topology with 100 HDDs
  - SSD-based All-flash Array (AFA) topology with 14 SSDs
- Zipf-like distributions [Breslau ’99] to control the skew in access patterns
- Five publicly available storage workloads from Microsoft [IOTTA], covering the existence/reoccurrence of block correlations, request size, request arrival time/rate, and R/W ratio behavior
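A Zipf-like distribution assigns rank i a probability proportional to 1 / i^α, so α controls the skew (α = 0 is uniform). The sketch below generates a skewed access pattern in that style. The parameter values, the seed, and the function names are illustrative assumptions and do not reproduce the evaluation setup.

```python
import random

def zipf_like_weights(n, alpha):
    """P(i) proportional to 1 / i**alpha for ranks i = 1..n, normalized to sum to 1."""
    raw = [1.0 / (i ** alpha) for i in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def sample_blocks(n_blocks, alpha, n_requests, seed=0):
    """Draw block IDs 0..n_blocks-1, with block 0 the most popular."""
    rng = random.Random(seed)
    weights = zipf_like_weights(n_blocks, alpha)
    return rng.choices(range(n_blocks), weights=weights, k=n_requests)

weights = zipf_like_weights(4, alpha=1.0)
requests = sample_blocks(n_blocks=10, alpha=0.8, n_requests=50)
```

Larger values of `alpha` concentrate more requests on the few most popular blocks.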

Evaluation: I/O Performance, src2 Trace on AFA
[Figure panels: Read Performance, Write Performance]

Evaluation: I/O Performance, wdev Trace on HSA
[Figure panels: Read Performance, Write Performance]

Evaluation: Migration Cost
Migration Amount vs. Overall (R+W) Performance
[Figure panels: AFA, HSA]

References
[Agrawal ’08] N. Agrawal et al., “Design tradeoffs for SSD performance,” in ATC ’08: USENIX Annual Technical Conference, Berkeley, CA, USA, 2008.
[Altiparmak ’12] N. Altiparmak and A. S. Tosun, “Equivalent disk allocations,” IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 3, 2012.
[Atallah ’00] M. J. Atallah and S. Prabhakar, “(Almost) optimal parallel block access for range queries,” in PODS ’00.
[Axboe ’07] J. Axboe, blktrace User Guide, Feb. 2007, available: http://www.cse.unsw.edu.au/~aaronc/iosched/doc/blktrace.html
[Borgelt ’12] C. Borgelt, “Frequent item set mining,” WIREs Data Mining Knowl. Discov., vol. 2, pp. 437–456, Nov. 2012.
[Breslau ’99] L. Breslau et al., “Web caching and Zipf-like distributions: evidence and implications,” in INFOCOM ’99, vol. 1, Mar. 1999, pp. 126–134.
[Bucy ’08] J. S. Bucy et al., “The DiskSim simulation environment version 4.0,” Carnegie Mellon University Parallel Data Lab, Tech. Rep., May 2008.
[Du ’82] H. C. Du and J. S. Sobolewski, “Disk allocation for Cartesian product files on multiple-disk systems,” ACM Trans. on Database Systems, 7(1):82–101, March 1982.
[Faloutsos ’93] C. Faloutsos and P. Bhagwat, “Declustering using fractals,” in PDIS ’93.
[Fitzpatrick ’01] S. Fitzpatrick et al., “An experimental assessment of a stochastic, anytime, decentralized, soft colourer for sparse graphs,” in Stochastic Algorithms: Foundations and Applications, 2001, vol. 2264, pp. 49–64.
[Ford ’62] L. R. Ford et al., Flows in Networks. Princeton University Press, 1962.
[Goldberg ’15] A. V. Goldberg et al., “Minimum cost flows in graphs with unit capacities,” in STACS ’15, 2015, pp. 406–419.
[IOTTA] SNIA IOTTA Trace Repository, Storage Networking Industry Association, available: http://iotta.snia.org.
[Jensen ’11] T. Jensen et al., Graph Coloring Problems. John Wiley & Sons, 2011.
[Khanafer ’12] A. Khanafer et al., “The min-conflict packing problem,” Computers & Operations Research, vol. 39, no. 9, pp. 2122–2132, 2012.
[Kim ’88] M. H. Kim and S. Pramanik, “Optimal file distribution for partial match retrieval,” in SIGMOD ’88.
[Li ’04] Z. Li et al., “C-Miner: Mining block correlations in storage systems,” in FAST ’04, Berkeley, CA, USA, 2004, pp. 173–186.
[Patterson ’88] D. A. Patterson, G. Gibson, and R. H. Katz, “A case for redundant arrays of inexpensive disks (RAID),” in SIGMOD ’88, 1988, pp. 109–116.
[Prabhakar ’98] S. Prabhakar, K. Abdel-Ghaffar, D. Agrawal, and A. El Abbadi, “Cyclic allocation of two-dimensional data,” in ICDE ’98.

Thank You! Any Questions?