In the name of God, the Most Gracious, the Most Merciful. Stork data scheduler: mitigating the data bottleneck in e-Science. Hamdy Fadl, 500512011. Dr. Esma YILDIRIM.



Outline
- Introduction
- The motivation of the paper
- What is Stork
- Previous studies
- Efforts to achieve reliable and efficient data placement
- Why Stork data scheduler
- Initial end-to-end implementation using Stork
- Pipelining
- Estimation of data movement speed
- The effect of connection time in data transfers
- Two unique features of Stork: request aggregation and throughput optimization
- Concurrency vs. parallelism
- Conclusion


Introduction. The inadequacy of traditional distributed computing systems in dealing with complex data-handling problems in our new data-rich world has motivated this new paradigm. In traditional distributed computing, data resources are second-class entities and access to data is a side-effect of computation; data placement either delays the computation or is left to scripts that lack the privileges of a job. The DOE report says: 'It is essential that at all levels data placement tasks be treated in the same way computing tasks are treated', and that research and development is still required to create schedulers and planners for storage space allocation and the transfer of data.

The motivation of the paper. Enhancing the data management infrastructure, since users want to focus their attention on the information itself rather than on how to discover, access and use it.

What is Stork? Stork is considered one of the very first examples of 'data-aware scheduling' and has been very actively used in many e-Science application areas, including:
– coastal hazard prediction;
– storm surge modeling;
– oil flow and reservoir uncertainty analysis;
– digital sky imaging;
– educational video processing and behavioral assessment;
– numerical relativity and black hole collisions;
– multi-scale computational fluid dynamics.

Using Stork, users can transfer very large datasets via a single command.

Previous studies. Several previous studies address data management for large-scale applications. However, scheduling of data storage and networking resources and optimization of data transfer tasks has been an open problem.

Previous studies on the optimal number of streams. Hacker et al.: the model cannot predict congestion. Lu et al.: two throughput measurements at different stream numbers are needed to predict the others. Others: three streams are sufficient to reach 90 per cent utilization.

Previous studies. The studies on the optimal number of streams for data scheduling are limited and based on approximate theoretical models. None of the existing studies is able to accurately predict the optimal number of parallel streams for best data throughput in a congested network.

Efforts to achieve reliable and efficient data placement.
– GridFTP (7app), Grid file transfer protocol: a secure and reliable data transfer protocol developed especially for high-bandwidth WANs.
– RFT, reliable file transfer: byte streams to be transferred in a reliable manner.
– LDR, lightweight data replicator.
– DRS, data replication service.

Why Stork data scheduler? It implements techniques for the scheduling, queuing and optimization of data placement jobs.

Why Stork data scheduler? It provides high reliability in data transfers and creates a level of abstraction between the user applications and the underlying data transfer and storage resources via a modular, uniform interface.
– Reliability
– Abstraction

Why Stork data scheduler? The checkpointing, error recovery and retry mechanisms ensure the completion of the tasks even in the case of unexpected failures. Multi-protocol support makes Stork a very powerful data transfer tool. This feature allows Stork not only to access and manage different data storage systems, but also to be used as a fall-back mechanism when one of the protocols fails in transferring the desired data.
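The retry and fall-back behaviour described on this slide can be sketched roughly as follows. This is a minimal illustration, not Stork's implementation: the function `transfer_with_fallback`, the stand-in transfer functions and the retry count are all hypothetical, and checkpointing is left out.

```python
import time


def transfer_with_fallback(src, dst, protocols, max_retries=3, delay_s=0.0):
    """Try each protocol in order; retry a failing one before falling back.

    `protocols` is a list of (name, callable) pairs. The callables are
    hypothetical stand-ins for real transfer modules.
    """
    errors = []
    for name, transfer in protocols:
        for attempt in range(1, max_retries + 1):
            try:
                transfer(src, dst)
                return name, attempt  # which protocol succeeded, on which try
            except OSError as exc:
                errors.append((name, attempt, str(exc)))
                time.sleep(delay_s)  # back off before the next attempt
    raise RuntimeError(f"all protocols failed: {errors}")


def flaky_gridftp(src, dst):
    # Hypothetical protocol module that always fails.
    raise OSError("connection reset")


def working_http(src, dst):
    # Hypothetical protocol module that always succeeds.
    return None


result = transfer_with_fallback(
    "siteA:/data/f1", "siteB:/data/f1",
    [("gridftp", flaky_gridftp), ("http", working_http)],
    max_retries=2,
)
print(result)
```

Here the first protocol is retried twice, fails both times, and the transfer completes through the second protocol.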

Why Stork data scheduler? Optimizations such as concurrent transfers, parallel streaming, request aggregation and data fusion provide enhanced performance compared with other data transfer tools. Stork can interact with higher level planners and workflow managers for the coordination of compute and data tasks. This allows the users to schedule both CPU resources and storage resources asynchronously as two parallel universes, overlapping computation and I/O.

Initial end-to-end implementation using Stork. Data subsystem: data management and scheduling. Compute subsystem: compute management and scheduling.

Initial end-to-end implementation using Stork. In cases where multiple workflows need to be executed on the same system, users may want to prioritize among multiple workflows or make other scheduling decisions. This is handled by the workflow scheduler at the highest level. An example of such a case would be hurricanes of different urgency levels arriving at the same time.
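One simple way to picture prioritizing among workflows is a priority queue keyed on urgency. The sketch below is only an illustration of that idea: the class `WorkflowQueue`, the urgency scale (lower number means more urgent) and the workflow names are assumptions, not part of Stork or of the workflow scheduler the slide refers to.

```python
import heapq
import itertools


class WorkflowQueue:
    """Hypothetical top-level queue: most urgent workflow first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps FIFO order within one urgency

    def submit(self, name, urgency):
        # Lower urgency value = served earlier (an assumed convention).
        heapq.heappush(self._heap, (urgency, next(self._counter), name))

    def next_workflow(self):
        return heapq.heappop(self._heap)[2]


queue = WorkflowQueue()
queue.submit("hurricane_B", urgency=2)
queue.submit("hurricane_A", urgency=1)
queue.submit("hurricane_C", urgency=2)
order = [queue.next_workflow() for _ in range(3)]
print(order)
```

The most urgent workflow runs first, and workflows of equal urgency keep their arrival order.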

Pipelining. Figure: generic 4-stage pipeline; the colored boxes represent instructions independent of each other. Pipelining takes different types of jobs and executes them concurrently while preserving the task sequence and dependencies.

How to use pipelining here? A distributed workflow system can be viewed as a large pipeline consisting of many tasks divided into sub-stages, where the main bottleneck is remote data access/retrieval owing to network latency and communication overhead.

How to use pipelining here? With pipelining we can order and schedule data movement jobs in distributed systems independently of compute tasks, to exploit parallel execution while preserving data dependencies. The scientific workflows will include the necessary data-placement steps, such as stage-in and stage-out, as well as other important steps that support data movement, such as allocating and de-allocating storage space for the data and reserving and releasing the network links.
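The data-placement steps named above can be pictured as a small dependency graph in which independent steps run side by side. The following is a sketch under assumptions: the step names and their dependencies are invented for illustration, and Python's standard `graphlib` module stands in for a real workflow manager.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on (hypothetical workflow).
workflow = {
    "allocate_space": set(),
    "reserve_link": set(),
    "stage_in": {"allocate_space", "reserve_link"},
    "compute": {"stage_in"},
    "stage_out": {"compute"},
    "release_link": {"stage_out"},
    "deallocate_space": {"stage_out"},
}


def execution_batches(graph):
    """Return batches of steps; steps in the same batch may run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all steps whose inputs are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = execution_batches(workflow)
for batch in batches:
    print(batch)
```

Space allocation and link reservation land in the same batch, as do the two release steps, while stage-in, compute and stage-out keep their order.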

Estimation of data movement speed. In order to estimate the speed at which data movement can take place, it is necessary to estimate the bandwidth capability at the source storage system, at the target storage system and of the network in between.
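A common first approximation, assumed here rather than taken from the slides, is that the slowest of the three components bounds the achievable speed. The function name, units and numbers below are illustrative only.

```python
def estimate_transfer(size_mb, source_mbps, network_mbps, target_mbps):
    """Return (bottleneck bandwidth in Mbps, estimated transfer time in s).

    Assumes the end-to-end rate equals the minimum of the three bandwidths.
    size_mb is in megabytes; bandwidths are in megabits per second.
    """
    bottleneck = min(source_mbps, network_mbps, target_mbps)
    seconds = size_mb * 8 / bottleneck  # megabytes -> megabits
    return bottleneck, seconds


estimate = estimate_transfer(size_mb=1000, source_mbps=800,
                             network_mbps=1000, target_mbps=400)
print(estimate)
```

In this made-up case the target storage system is the bottleneck, so a faster network alone would not shorten the transfer.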

The effect of connection time in data transfers. Figure legend: plus symbols, file size 10 MB (RTT 5.131 ms); crosses, file size 10 MB (RTT ms); asterisks, file size 100 MB (RTT ms); squares, file size 100 MB (RTT ms); circles, file size 1 GB (RTT ms).

The effect of connection time in data transfers. Figure legend: plus symbols, file size 10 MB (RTT ms); crosses, file size 10 MB (RTT ms); asterisks, file size 100 MB (RTT ms); squares, file size 100 MB (RTT ms); circles, file size 1 GB (RTT ms); filled squares, file size 1 GB (RTT ms).

Two unique features of Stork: aggregation of data transfer jobs considering their source and destination addresses, and the application-level throughput estimation and optimization service.
– Aggregation
– Estimation and optimization

Stork feature: request aggregation. In order to minimize the overhead of connection set-up and tear-down for each transfer, data placement jobs are combined and embedded into a single request, which increases the overall performance, especially for transfers of datasets with small file sizes.
– Minimize the overhead
– Increase the overall performance

How to aggregate. According to the file size and source/destination pairs, data placement requests are combined and processed as a single transfer job. Information about aggregated requests is stored in a transparent manner. A main job that includes multiple requests is defined virtually and is used to perform the transfer operation. Therefore, users can query and get status reports individually without knowing that their requests are aggregated and being executed with others.

How to aggregate (diagram). Req1, Req2, Req3 and Req4, each with Size 1 and Source/Dest. 1, are combined into one aggregated request.
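The grouping step can be sketched as follows. This is a minimal sketch under stated assumptions: the `Request` record, the `small_file_mb` threshold and the `run_job` helper are hypothetical and only illustrate combining requests by source/destination pair while each request keeps its own status.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Request:
    req_id: str
    source: str
    dest: str
    size_mb: float
    status: str = "queued"  # each request can still be queried individually


def aggregate(requests, small_file_mb=100.0):
    """Group small requests that share a source/destination pair.

    Returns a list of jobs; each job is a list of Request objects that
    would be sent over one connection. The threshold is an assumption.
    """
    groups = defaultdict(list)
    singles = []
    for req in requests:
        if req.size_mb <= small_file_mb:
            groups[(req.source, req.dest)].append(req)
        else:
            singles.append([req])  # large files stay as their own job
    return list(groups.values()) + singles


def run_job(job):
    # Stand-in for the real transfer: mark every member request as done.
    for req in job:
        req.status = "done"


requests = [
    Request("req1", "siteA", "siteB", 1.0),
    Request("req2", "siteA", "siteB", 1.0),
    Request("req3", "siteA", "siteB", 1.0),
    Request("req4", "siteA", "siteC", 1.0),
]
jobs = aggregate(requests)
run_job(jobs[0])
print([[r.req_id for r in job] for job in jobs])
print(requests[0].status, requests[3].status)
```

The three requests that share siteA to siteB form one job, and the status of each request remains visible on its own record.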

How to aggregate. Connection time is small, but it becomes quite important if there are hundreds of jobs to be scheduled. We minimize the load imposed by t_setup by aggregating multiple transfer jobs. Instead of having separate set-up operations such as
– t_1 = t_setup + t_1,transfer, t_2 = t_setup + t_2,transfer, ..., t_n = t_setup + t_n,transfer,
we aggregate multiple requests to improve the total transfer time:
– t = t_setup + t_1,transfer + t_2,transfer + ... + t_n,transfer.
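The saving follows directly from the two formulas: n separate jobs pay the set-up cost n times, an aggregated job pays it once. A small worked example, with made-up numbers (0.5 s set-up, 100 transfers of 1 s each):

```python
def separate_total(t_setup, transfers):
    """Total time when every transfer pays its own connection set-up."""
    return sum(t_setup + t for t in transfers)


def aggregated_total(t_setup, transfers):
    """Total time when all transfers share a single connection set-up."""
    return t_setup + sum(transfers)


t_setup = 0.5              # seconds, illustrative value
transfers = [1.0] * 100    # one hundred small transfers, illustrative values

separate = separate_total(t_setup, transfers)
aggregated = aggregated_total(t_setup, transfers)
saving = separate - aggregated  # equals (n - 1) * t_setup
print(separate, aggregated, saving)
```

With these numbers the saving is 49.5 s, that is (n - 1) times the set-up cost, which is why aggregation matters most when files are small and jobs are many.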

Benefits of aggregation. Aggregating data placement jobs and combining data transfer requests into a single operation also improves the overall scheduler performance, by reducing the total number of requests that the data scheduler needs to execute separately. Instead of initiating each request one at a time, it is more beneficial to execute them as a single operation if they are requesting data from the same storage site or using the same protocol to access data.

Request aggregation and the overhead of connection time in data transfers. fnum: number of data files sent in a single transfer operation. The test environment includes host machines from the Louisiana Optical Networking Initiative (LONI) network. Transfer operations are performed using the GridFTP protocol. The average round-trip delay times between test hosts are and ms. Results: as latency increases, the effect of overhead in connection time increases. We see better throughput results with aggregated requests. The main performance gain comes from decreasing the amount of protocol usage and reducing the number of independent network connections.

Note about aggregation results. We have successfully applied job aggregation in the Stork scheduler such that the total throughput is increased by reducing the number of transfer operations.

Stork feature: throughput optimization. In data scheduling, the effective use of available network throughput and optimization of data transfer speed is crucial for end-to-end application performance.

How to do throughput optimization. Stork opens parallel streams and sets the optimal parallel stream number specific to each transfer. The selection of this number is based on a mathematical model, developed by the authors, that predicts the peak point of the throughput of parallel streams and the corresponding stream number for that value.

The throughput of n streams (Th_n) is calculated by an equation with three unknown variables a, b and c, which are determined from throughput samplings at three different parallelism data points. The throughput increases as the stream number is increased, and it reaches its peak point either when congestion occurs or when the end-system capacities are reached. Increasing the stream number further does not increase the throughput but causes it to drop owing to losses.
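The fitting idea can be illustrated with a stand-in model. The equation from the slide is not reproduced in this transcript, so the sketch below ASSUMES a model of the form Th_n = n / sqrt(a n^2 + b n + c); this form is an assumption chosen because three samples then determine a, b and c through a linear system, and it should not be read as the authors' exact equation. The sample stream numbers 1, 8 and 16 and the "true" coefficients are synthetic.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_model(samples):
    """Fit (a, b, c) from three (n, throughput) samples.

    Under the ASSUMED model, (n / Th_n)^2 = a*n^2 + b*n + c, which is
    linear in a, b, c and is solved here with Cramer's rule.
    """
    rows = [[n * n, n, 1.0] for n, _ in samples]
    rhs = [(n / th) ** 2 for n, th in samples]
    d = det3(rows)
    coeffs = []
    for col in range(3):
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]  # replace one column by the right-hand side
        coeffs.append(det3(m) / d)
    return tuple(coeffs)


def throughput(n, a, b, c):
    return n / math.sqrt(a * n * n + b * n + c)


def optimal_streams(a, b, c):
    # Setting dTh/dn = 0 for the assumed model gives b*n/2 + c = 0.
    # A peak at positive n requires b < 0 and c > 0.
    return -2.0 * c / b


# Synthetic samples generated from made-up coefficients, then recovered.
true_coeffs = (0.01, -0.1, 1.0)
samples = [(n, throughput(n, *true_coeffs)) for n in (1, 8, 16)]
a, b, c = fit_model(samples)
n_opt = optimal_streams(a, b, c)
print(round(n_opt, 3), round(throughput(n_opt, a, b, c), 3))
```

In practice the predicted stream number would be rounded to an integer, and the three samples would come from short probe transfers rather than from a known curve.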

The characteristics of the throughput curve for local area network (LAN) and WAN transfers. Figure: LAN–LAN, Newton's method model. Legend: red solid line, GridFTP; green dashed line, Newton 1_8_16; blue dashed line, Dinda 1_16.

The characteristics of the throughput curve for local area network (LAN) and WAN transfers. Figure: LAN–WAN, Newton's method model.

Notes about the characteristics of the throughput curve. In both cases, the throughput has a peak point that indicates the optimal parallel stream number, and it starts to drop for larger numbers of streams.

Concurrency. In cases where network optimization is not sufficient and the end-system characteristics play a decisive role in limiting the achievable throughput, making concurrent requests for multiple data transfers can improve the total throughput. The Stork data scheduler also enables the users to set a concurrency level in the server set-up and allows multiple data transfer jobs to be handled concurrently. However, it is up to the user to set this property.

Concurrency vs. parallelism. Parallelism: a single file is divided into multiple streams (for example, each stream is sent over a TCP connection). Concurrency: multiple files are transferred at the same time, possibly to multiple destinations.
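The distinction can be made concrete with a toy sketch. Everything here is illustrative: `split_into_streams` shows parallelism as byte ranges of one file, `transfer_many` shows concurrency as several files in flight at once, and `fake_transfer` is a stand-in that moves no real data.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_streams(file_size, n_streams):
    """Parallelism: byte ranges [start, end) of ONE file, one per stream."""
    base, extra = divmod(file_size, n_streams)
    ranges, start = [], 0
    for i in range(n_streams):
        length = base + (1 if i < extra else 0)  # spread the remainder
        ranges.append((start, start + length))
        start += length
    return ranges


def fake_transfer(name, size, n_streams):
    # Stand-in for a real transfer: report how many bytes the streams cover.
    sent = sum(end - start for start, end in split_into_streams(size, n_streams))
    return name, sent


def transfer_many(files, concurrency, n_streams):
    """Concurrency: up to `concurrency` files are handled at the same time."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(fake_transfer, name, size, n_streams)
                   for name, size in files]
        return [f.result() for f in futures]


ranges = split_into_streams(10, 4)
results = transfer_many([("a.dat", 10), ("b.dat", 7)], concurrency=2, n_streams=4)
print(ranges)
print(results)
```

The two settings are independent: each of the concurrent files is itself split across the chosen number of streams.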

Concurrency versus parallelism in memory-to-memory transfers: Dsl-condor–Eric, 100 Mbps (figure 3a). Legend: red solid line, con-level=1; green dashed line, con-level=2; blue dashed line, con-level=4. Results: in figure 3a, increasing the concurrency level does not provide a more significant improvement than increasing the parallel stream values. Thus, the throughput of four parallel streams is almost as good as the throughput of concurrency levels 2 and 4, because the throughput is bound by the interface and the CPU is fast enough and does not present a bottleneck.

Concurrency versus parallelism in memory-to-memory transfers: Oliver–Poseidon, 10 Gbps (figure 3b). Legend: red solid line, con-level=1; green dashed line, con-level=2; blue dashed line, con-level=4. Results: with parallel streams, they were able to achieve a throughput value of 6 Gbps. However, when they increased the concurrency level, they were able to achieve around 8 Gbps with a combination of a concurrency level of 2 and a parallel stream number of 32. This significant increase is due to the fact that a single CPU reaches its upper limit with a single request but, through concurrency, multiple CPUs are used until the network limit is reached.

Concurrency for disk-to-disk transfers. The effect of concurrency is much more significant for disk-to-disk transfers, where multiple parallel disks are available and managed by parallel file systems.

Concurrency versus parallelism in disk-to-disk transfers. Figure 4a: total throughput. Figure 4b: average throughput. Legend: red solid line, conlevel/datasize=1/12G; green dashed line, 2/6G; blue dashed line, 4/3G; magenta dashed line, 6/2G; cyan dashed-dotted line, 8/1.5G; yellow dashed-dotted line, 10/1.2G. Parallel streams improve the throughput for a single concurrency level by increasing it from 500 to 750 Mbps. However, owing to serial disk access, this improvement is limited. Only by increasing the concurrency level can we improve the total throughput. The throughput in figure 4a is increased to 2 Gbps at a concurrency level of 4. After that point, increasing the concurrency level causes the throughput to be unstable, with sudden ups and downs; however, it is always around 2 Gbps. This value is due to the end-system CPU limitations. If we increase the number of nodes, better throughput results could be seen. In the figure for average throughput, the transfer speed per request falls as we increase the concurrency level (figure 4b).

Concurrency versus parallel stream optimization in the Stork data scheduler. Figure 5a: total throughput. Figure 5b: average throughput. Figure 5c: average stream number. Results: the optimization service provides 750 Mbps throughput for a single concurrency level, and it increases up to 2 Gbps for a concurrency level of 4. The average throughput decreases more steeply after that level (figure 5b). Also, the optimal parallel stream number decreases and adapts to the concurrency level (figure 5c). From the experiments, it can be seen that an appropriately chosen concurrency level may improve the transfer throughput significantly.

Conclusion. Their results show that the optimizations performed by the Stork data scheduler help to achieve much higher end-to-end throughput in data transfers than non-optimized approaches. They believe that the Stork data scheduler and this new 'data-aware distributed computing' paradigm will have an impact on all traditional compute and data intensive e-Science disciplines, as well as new emerging computational areas in the arts, humanities, business and education that need to deal with increasingly large amounts of data.

Thank you