Auburn University http://www.eng.auburn.edu/~xqin COMP7500 Advanced Operating Systems I/O-Aware Load Balancing Techniques (2) Dr. Xiao Qin Auburn University.

Presentation transcript:

COMP7500 Advanced Operating Systems: I/O-Aware Load Balancing Techniques (2)
Dr. Xiao Qin, Auburn University
http://www.eng.auburn.edu/~xqin
xqin@auburn.edu
Spring, 2012

Current Solutions
- Disk I/O systems: caching, prefetching, parallel I/O
- Limitations: low level, not portable

Current Solutions (Cont.)
Scheduling/load balancing:
- Space-sharing (PBS, backfilling)
- Time-sharing
- Centralized control (PBS)
- Distributed control
- Coordinated scheduling (gang)
- Non-I/O-aware (Condor, Mosix, DQS, LSF)
- Disk-I/O-aware
- Network-I/O-aware load balancing
- Support for sequential jobs
- Support for parallel jobs
- Disk-I/O buffer management
- Support for homogeneous clusters
- Support for heterogeneous clusters

System Architecture
[Figure: I/O-intensive jobs enter through client services. Workstation 1, Workstation 2, ..., Workstation n each have a load manager, memory, and a disk, and run tasks (t1 to t7). The workstations are connected by a high-bandwidth network.]

Methodology
- I/O-intensive applications: user-specified access pattern, data storage pattern
- Measure I/O load
- Predict response time
- Estimate overhead
- Make decisions
- Load balancing schemes: dispatch and migration

Outline
- Motivations
- A Disk-I/O-Aware Load Balancing Policy with Remote Execution
- A Disk-I/O-Aware Load Balancing Policy with Preemptive Migration
- Evaluation of the Two Disk-I/O-Aware Policies
- Load Balancing for Heterogeneous Clusters
- Contributions and Conclusions

Load Balancing with Remote Execution
[Figure: A newly arrived job at node i is either executed locally on node i or executed remotely on node j. Both nodes already host running jobs and are connected by a high-bandwidth network.]

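The IOCM-RE Scheme
[Flowchart, applied to each new parallel job:]
- I/O overloaded? If yes, select candidate nodes to balance the I/O load.
- If not, memory overloaded? If yes, select candidate nodes to balance the memory load.
- If not, CPU overloaded? If yes, select candidate nodes to balance the CPU load.
- If candidate remote nodes are found, execute the job remotely.
- Otherwise (no resource overloaded, or no candidate found), execute the job locally.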
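
The decision flow on the IOCM-RE slide can be sketched roughly as follows. This is only an illustration under stated assumptions, not the scheme's actual implementation: the `Node` type, the normalized load values, the fixed `thresholds`, and the "pick the least loaded candidate" rule are all hypothetical, since the slide does not say how overload is detected or how candidates are ranked.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical per-node load snapshot; loads are assumed normalized to [0, 1]."""
    name: str
    io: float
    mem: float
    cpu: float


def place_job(local: Node, remotes: list[Node], thresholds: dict[str, float]) -> str:
    """Return the name of the node that should run a newly arrived job.

    Resources are checked in the order shown on the slide: I/O, memory, CPU.
    The first overloaded resource decides which load is balanced.
    """
    for resource in ("io", "mem", "cpu"):
        limit = thresholds[resource]
        if getattr(local, resource) > limit:  # assumed overload test
            # Assumed candidate rule: remote nodes not overloaded in this resource.
            candidates = [n for n in remotes if getattr(n, resource) <= limit]
            if not candidates:
                return local.name  # no candidate found: execute locally
            best = min(candidates, key=lambda n: getattr(n, resource))
            return best.name  # execute remotely
    return local.name  # nothing overloaded: execute locally


if __name__ == "__main__":
    limits = {"io": 0.8, "mem": 0.8, "cpu": 0.8}  # hypothetical thresholds
    home = Node("n0", io=0.9, mem=0.2, cpu=0.2)
    others = [Node("n1", io=0.5, mem=0.3, cpu=0.3), Node("n2", io=0.3, mem=0.3, cpu=0.3)]
    print(place_job(home, others, limits))
```

Checking I/O before memory and CPU mirrors the order of the flowchart: a node that is overloaded in I/O balances I/O load even if its CPU is also busy.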

Explicit I/O load
Explicit I/O load = I/O access rate × (1 − buffer hit rate)
Here (1 − buffer hit rate) is the probability that the data is NOT in the I/O buffer of node i.
[Figure: Applications access data through the disk I/O buffer. The data is either in the buffer or not in the buffer.]
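
As a minimal sketch of the explicit I/O load formula on this slide, assuming the access rate is a non-negative number and the buffer hit rate is a fraction in [0, 1] (the function and parameter names are illustrative, not taken from the slides):

```python
def explicit_io_load(access_rate: float, buffer_hit_rate: float) -> float:
    """Explicit I/O load = I/O access rate x (1 - buffer hit rate).

    (1 - buffer_hit_rate) is the probability that a request misses the
    disk I/O buffer and therefore reaches the disk.
    """
    if access_rate < 0:
        raise ValueError("access_rate must be non-negative")
    if not 0.0 <= buffer_hit_rate <= 1.0:
        raise ValueError("buffer_hit_rate must be in [0, 1]")
    return access_rate * (1.0 - buffer_hit_rate)


if __name__ == "__main__":
    # 100 requests per unit time with a 75% buffer hit rate.
    print(explicit_io_load(100.0, 0.75))
```

With a buffer hit rate of 1.0 the explicit load drops to zero, because every request is served from the buffer.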

Implicit I/O load
The implicit I/O load is induced by page faults. Given a task s running on node i, the slide defines it piecewise ("if" / "otherwise") in terms of:
- the memory space requested by the running tasks
- the available user memory space
- the page fault rate of task s
[Equation not preserved in the transcript.]

Overall I/O load
The I/O load index of node i combines two components:
- the implicit I/O load of each task s running on node i (induced by page faults)
- the explicit I/O requirement of each task s (resulting from tasks accessing disks)
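
One possible reading of the overall I/O load index is sketched below. Because the slide equations are not preserved, two things here are assumptions rather than the author's definitions: that the index is the sum of implicit and explicit loads over the tasks on a node, and that a task's implicit load is zero when the requested memory fits in the available user memory and equals its page fault rate otherwise. The `Task` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a task running on a node."""
    mem_request: float      # memory space requested by the task
    page_fault_rate: float  # page fault rate of the task when memory is short
    explicit_io: float      # explicit I/O requirement of the task


def implicit_io_load(task: Task, tasks: list[Task], available_mem: float) -> float:
    """Assumed piecewise rule: no paging I/O while all requests fit in memory."""
    requested = sum(t.mem_request for t in tasks)
    if requested <= available_mem:
        return 0.0
    return task.page_fault_rate


def io_load_index(tasks: list[Task], available_mem: float) -> float:
    """Assumed index: sum of implicit and explicit I/O load over the node's tasks."""
    return sum(implicit_io_load(t, tasks, available_mem) + t.explicit_io for t in tasks)


if __name__ == "__main__":
    running = [Task(300.0, 2.0, 5.0), Task(300.0, 1.0, 3.0)]
    print(io_load_index(running, available_mem=512.0))   # memory is oversubscribed
    print(io_load_index(running, available_mem=1024.0))  # everything fits in memory
```

Under these assumptions, a node with ample memory carries only explicit load, so its index reflects disk accesses alone.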