Published by Rosaline Cobb. Modified over 9 years ago.
1
Resource management system for distributed environment
B4. Nguyen Tuan Duc
2
Background
Emerging need for resource management systems for clusters / grids
Several systems exist, but have problems…
  Portable Batch System
  Sun Grid Engine
  …
3
Goal
Flexible resource management system
  Support clusters, grids
  Fair-share scheduling
  Maximize utilization of resources
  Support parallel applications
  Reduce load aggregation
4
Agenda
Background
Goal
Related works
Proposed method
Problems
5
Related works
Portable Batch System (MRJ, 1990s)
  Batch queuing system
  Automatic load-balancing
  Parallel jobs support
  Job accounting
6
Portable Batch System (PBS)
7
Sun Grid Engine
Batch queuing system by Sun Microsystems
Same features as PBS, plus job checkpointing
Several add-ons
8
Problems of batch queuing systems
Resource utilization
Load aggregation
  Server accepts too many requests from clients
Limits of the execution model
  Cannot fork, since a process created with fork() does not go into the queue
…
9
Saito Dai's system (STDS)
Flexible Resource Management System for Widely Distributed Environment (2006)
  No load aggregation
  Job scheduling on each node
  Independent from execution model (fork, … OK)
  Support parallel jobs
10
STDS structure
Two main components
  Node searching system (graph searching)
  Scheduler (on each node)
Scheduler
  Daemon on each node
  CPU fair-sharing by 'nice'
Node searching system
  Create graph from links
  Node search by graph search
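The slides do not say which graph search STDS uses, so the following is only a minimal sketch that assumes a breadth-first search over the link graph. The names `search_nodes`, `links`, and `is_suitable` are hypothetical and chosen for illustration.

```python
from collections import deque


def search_nodes(links, start, is_suitable, limit):
    """Breadth-first search from `start`; return up to `limit` suitable nodes.

    `links` maps a node name to the names of the nodes it links to.
    BFS is an assumption here, not something stated in the slides.
    """
    found = []
    seen = {start}
    queue = deque([start])
    while queue and len(found) < limit:
        node = queue.popleft()
        if is_suitable(node):
            found.append(node)
        for neighbor in links.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return found


# Hypothetical four-node graph in which only "c" and "d" are idle.
links = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
idle = {"c", "d"}
print(search_nodes(links, "a", lambda n: n in idle, limit=2))  # ['c', 'd']
```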
11
STD node searching system
12
Our approach
Similar to STDS
  Node searching system
  Scheduler on each node
But different in …
  Node search: no graph searching
  Scheduler: kernel scheduler with user accounting (budget scheduler)
13
Scheduler: Budget scheduling
Budget scheduling
  Normal queue & budget queue
  Normal queue for interactive processes
    Linux 2.6 default scheduler
  Budget queue for CPU-hogging processes
  Automatic detection of CPU-intensive processes
http://www.logos.ic.i.u-tokyo.ac.jp/~duc/pre/1107.ppt
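How CPU-intensive processes are detected is not described on this slide. As a rough sketch, one could sample each process's CPU time over a window and compare the usage fraction with a threshold. The threshold value, `ProcessSample`, and `choose_queue` are all assumptions made for illustration, and the real budget scheduler lives in the kernel rather than in Python.

```python
from dataclasses import dataclass

# Assumed cut-off: a process using at least 80% of a CPU counts as CPU-hogging.
CPU_HOG_THRESHOLD = 0.8


@dataclass
class ProcessSample:
    pid: int
    cpu_seconds: float      # CPU time consumed during the sampling window
    window_seconds: float   # length of the sampling window


def choose_queue(sample, threshold=CPU_HOG_THRESHOLD):
    """Return "budget" for CPU-hogging processes, "normal" otherwise."""
    if sample.window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    usage = sample.cpu_seconds / sample.window_seconds
    return "budget" if usage >= threshold else "normal"


print(choose_queue(ProcessSample(pid=100, cpu_seconds=0.95, window_seconds=1.0)))
print(choose_queue(ProcessSample(pid=101, cpu_seconds=0.05, window_seconds=1.0)))
```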
14
Node searching system
Client-server model
  Daemon on each node
  Daemon reports CPU state (process count, CPU utilization, …) directly to the user
  Reports maximum price
Where can users submit jobs from?
  From everywhere on the clusters and grids
  From their desktops, via the Internet
Need for a job submission system
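The report a daemon sends could look roughly like the sketch below. The slides name the fields (process count, CPU utilization, maximum price) but not the wire format, so the JSON encoding, the `NodeReport` type, and the field names are assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class NodeReport:
    host: str
    process_count: int
    cpu_utilization: float  # assumed range: 0.0 (idle) to 1.0 (fully busy)
    max_price: float        # assumed: highest price among jobs on the node


def encode_report(report):
    """Serialize a report to bytes, as a daemon might before sending it."""
    return json.dumps(asdict(report)).encode("utf-8")


def decode_report(payload):
    """Rebuild a report on the client side from the received bytes."""
    return NodeReport(**json.loads(payload.decode("utf-8")))


report = NodeReport(host="node01", process_count=42,
                    cpu_utilization=0.25, max_price=3.0)
print(decode_report(encode_report(report)))
```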
15
Node searching system (NSS) User
16
Who will determine nodes?
User!
  Users choose nodes appropriate for their jobs
  Parallel jobs: idle CPUs or CPUs with low-price jobs
  Long-running jobs: idle CPU, set low price
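The two selection rules on this slide can be sketched as follows. The idle and price thresholds, the `NodeInfo` type, and both function names are hypothetical; the slides leave the actual choice to the user.

```python
from typing import NamedTuple


class NodeInfo(NamedTuple):
    host: str
    cpu_utilization: float  # assumed range: 0.0 to 1.0
    max_price: float        # assumed: highest price among jobs on the node


def pick_for_parallel(nodes, count, idle_below=0.1, price_below=5.0):
    """Prefer idle nodes, then busy nodes whose jobs are low-priced."""
    idle = [n for n in nodes if n.cpu_utilization < idle_below]
    cheap = [n for n in nodes
             if n.cpu_utilization >= idle_below and n.max_price < price_below]
    cheap.sort(key=lambda n: n.max_price)
    return [n.host for n in (idle + cheap)[:count]]


def pick_for_long_running(nodes, idle_below=0.1):
    """Return the least-loaded idle node, or None if no node is idle."""
    idle = [n for n in nodes if n.cpu_utilization < idle_below]
    if not idle:
        return None
    return min(idle, key=lambda n: n.cpu_utilization).host


nodes = [
    NodeInfo("n1", 0.05, 0.0),
    NodeInfo("n2", 0.90, 2.0),
    NodeInfo("n3", 0.95, 9.0),
    NodeInfo("n4", 0.70, 1.0),
]
print(pick_for_parallel(nodes, count=3))  # ['n1', 'n4', 'n2']
print(pick_for_long_running(nodes))       # n1
```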
17
Node searching system (NSS)
NSS should report to users:
  CPU utilization
  Maximum price
  Load (process count, …)
  …
Daemon on each node sends information about the node to the client
  Client is on the user's machine
  No heavy load aggregation
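Because every daemon talks directly to the client on the user's machine, the client has to gather many reports itself. The sketch below assumes the client polls the nodes concurrently and skips any node that fails to answer. `collect_reports` and the injected `fetch` callable are hypothetical, and the transport behind `fetch` is deliberately left out.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_reports(hosts, fetch, max_workers=8):
    """Call `fetch(host)` for every host in parallel.

    Returns a dict of host -> report for the hosts that answered;
    hosts whose fetch raised an exception are skipped.
    """
    reports = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {host: pool.submit(fetch, host) for host in hosts}
        for host, future in futures.items():
            try:
                reports[host] = future.result()
            except Exception:
                continue  # unreachable node: leave it out of the result
    return reports


def fake_fetch(host):
    """Stand-in for a real network call; "n2" simulates an unreachable node."""
    if host == "n2":
        raise ConnectionError("unreachable")
    return {"host": host, "cpu_utilization": 0.1}


print(collect_reports(["n1", "n2", "n3"], fake_fetch))
```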
18
Problems!!!
May be heavy load on the user's client
NAT, firewall
  How can the client connect to the server?
What information is needed?
  Only CPU utilization, maximum price, load, average price?