Performance Evaluation of Load Sharing Policies on a Beowulf Cluster James Nichols Marc Lemaire Advisor: Mark Claypool
Outline Introduction Methodology Results Conclusions
Introduction
- What is a Beowulf cluster? A cluster of computers networked together via Ethernet.
- Load distribution: share load among nodes, decreasing response times and increasing overall throughput.
  - Existing mechanisms require expertise in a particular load distribution mechanism.
- Load: historically, CPU has been used as the load metric.
  - What about disk and memory load? Or system events like interrupts and context switches?
- PANTS: the PANTS Application Node Transparency System.
  - Removes the need for expertise required by other load distribution mechanisms.
PANTS
- PANTS Application Node Transparency System
- Developed in a previous MQP: CS-DXF
- Enhanced the following year to use DIPC in: CS-DXF
- Intercepts execve() system calls (see the sketch after this slide)
- Uses the /proc file system to calculate CPU load and classify each node as "busy" or "free"
- Any workload that does not generate CPU load will not be distributed
- Near-linear speedup for computationally intensive applications
- New load metrics and policies!
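To illustrate the interception idea, here is a minimal user-space sketch in C using an LD_PRELOAD shim. This is not how PANTS itself hooks the call (PANTS intercepts execve() below the application, at the system-call level, and consults its load information); the shim, the pants_shim.so name, and the log message are illustrative assumptions.

/* pants_shim.c — illustrative LD_PRELOAD shim, NOT the PANTS mechanism.
 * Build:  gcc -shared -fPIC -o pants_shim.so pants_shim.c -ldl
 * Use:    LD_PRELOAD=./pants_shim.so make
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

typedef int (*execve_fn)(const char *, char *const [], char *const []);

int execve(const char *path, char *const argv[], char *const envp[])
{
    /* Look up the real execve() in libc. */
    execve_fn real_execve = (execve_fn)dlsym(RTLD_NEXT, "execve");

    /* A load-sharing system would decide here whether to run the
     * program locally or forward it to a "free" node; this sketch
     * only logs the call before passing it through. */
    fprintf(stderr, "intercepted execve(%s)\n", path);

    return real_execve(path, argv, envp);
}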
Outline Introduction Methodology Results Conclusions
Methodology
- Identified load parameters
- Implemented ways to measure the parameters
- Improved the PANTS implementation
- Built micro benchmarks that stressed each load metric
- Built a macro benchmark that stressed a more realistic mix of metrics
- Selected a real-world benchmark
Methodology: Load Metrics
All metrics are acquired via /proc/stat (a parsing sketch follows below):
- CPU – Total jiffies (1/100ths of a second) the processor spent on user, nice, and system processes, expressed as a percentage of total jiffies.
- I/O – Blocks read/written to disk per second.
- Memory – Page operations per second. Example: a virtual-memory page fault requiring a page to be loaded into memory from disk.
- Interrupts – System interrupts per second. Example: an incoming Ethernet packet.
- Context switches – How many times the processor switched between processes per second.
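As a concrete example, here is a minimal C sketch of the CPU metric: sample the cpu line of /proc/stat twice and report user + nice + system jiffies as a percentage of the total. The one-second interval and the four-field line layout (the Linux 2.4-era format; later kernels append more fields but keep these first four) are assumptions, not PANTS code.

/* cpu_load.c — sample /proc/stat twice and report the share of
 * jiffies spent in user + nice + system, as described above. */
#include <stdio.h>
#include <unistd.h>

static int read_cpu(unsigned long *busy, unsigned long *total)
{
    unsigned long user, nice, sys, idle;
    FILE *f = fopen("/proc/stat", "r");
    if (!f)
        return -1;
    int n = fscanf(f, "cpu %lu %lu %lu %lu", &user, &nice, &sys, &idle);
    fclose(f);
    if (n != 4)
        return -1;
    *busy  = user + nice + sys;
    *total = user + nice + sys + idle;
    return 0;
}

int main(void)
{
    unsigned long b0, t0, b1, t1;
    if (read_cpu(&b0, &t0) != 0) return 1;
    sleep(1);                              /* sampling interval */
    if (read_cpu(&b1, &t1) != 0) return 1;
    printf("CPU load: %.1f%%\n",
           100.0 * (b1 - b0) / (double)(t1 - t0));
    return 0;
}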
Methodology: Micro & Macro Benchmarks
- Implemented micro and macro benchmarks
- Helped refine our understanding of how the system was performing and tested our load metrics
- (Not enough time to present)
Real-world benchmark: Linux kernel compile
- Distributed compilation of the Linux kernel, executed by the standard Linux program make
- Loads I/O and memory resources
- Details:
  - Kernel version:
  - Files:
  - Mean source file size: 19 KB
  - Needed to expand relative path names to full paths
- Thresholds (a policy sketch follows below):
  - PANTS default: CPU: 95%
  - New policy: CPU: 95%; I/O: 1000 blocks/sec; Memory: 4000 page faults/sec; IRQ: interrupts/sec; Context switches: 6000 switches/sec
- Compare the PANTS default policy to our new load metrics and policy
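As an illustration of the new policy, here is a brief C sketch that classifies a node as busy when any metric exceeds its threshold. The any-metric-exceeds rule is our assumption about how the thresholds combine, and the IRQ threshold value (omitted on the slide) is a hypothetical placeholder.

/* policy.c — sketch of the new threshold policy: a node is "busy"
 * if any metric exceeds its threshold, "free" otherwise. */

struct load_sample {
    double cpu_pct;       /* % of jiffies in user + nice + system */
    double io_blocks;     /* disk blocks read/written per second  */
    double page_faults;   /* page operations per second           */
    double interrupts;    /* interrupts per second                */
    double ctx_switches;  /* context switches per second          */
};

#define CPU_THRESHOLD 95.0
#define IO_THRESHOLD  1000.0
#define MEM_THRESHOLD 4000.0
#define IRQ_THRESHOLD 5000.0   /* hypothetical: value not given on slide */
#define CSW_THRESHOLD 6000.0

int node_is_busy(const struct load_sample *s)
{
    return s->cpu_pct      > CPU_THRESHOLD
        || s->io_blocks    > IO_THRESHOLD
        || s->page_faults  > MEM_THRESHOLD
        || s->interrupts   > IRQ_THRESHOLD
        || s->ctx_switches > CSW_THRESHOLD;
}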
Outline Introduction Methodology Results Conclusions
Results: CPU
Results: Memory
Results: Context Switches
Results: Summary
Conclusions
- We achieved better throughput and more balanced load distribution when the load metrics include I/O, memory, interrupts, and context switches.
- Future work:
  - Use preemptive migration?
  - Include a network usage load metric.
- For more information visit:
- Questions?