Cumulus - dynamic cluster available under Clusterix
Marcin Pawlik, Jan Kwiatkowski, Roman Wyrzykowski, Konrad Karczewski
Research supported by Clusterix (National Cluster of Linux Systems) and the European Framework Programme 6 project DeDiSys
Presentation outline
- Work motivation
- Potential solutions
- Dynamic cluster creation
- Clusterix integration
- Achievements and future work
- Questions
Work motivation
- Build a cluster for research and educational purposes.
- Extend the cluster functionality by joining the Clusterix environment.
Requirements
- Cost-effective: utilizes the existing hardware infrastructure
- Not invasive: no large modifications to the existing infrastructure needed
- Cohabitative: no degradation of the existing functionality
- Useful: meets our scientific and educational requirements
Potential solutions: first approach, CPU cycle harvester
- Easy to deploy: one program is installed and the system is ready
- Cohabitative: operates when the machine is idle
- Invasive: every modification requires changing the software on all the nodes
- Uncontrollable: no guarantees of CPU time, network bandwidth, etc.
Potential solutions: second approach, dynamic cluster
- Fully controllable: the nodes are fully dedicated to the cluster
- Not invasive: modifications only in the "cluster space"
- Cohabitative: operates when the machines are not otherwise in use
- Not fully available: works only part-time
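The part-time availability of the dynamic cluster can be illustrated with a small sketch. It assumes, purely for illustration, that the machines are free for cluster use outside fixed weekday teaching hours; the slides do not state how availability is actually decided, and the names `LAB_HOURS` and `available_for_cluster` are hypothetical.

```python
from datetime import datetime, time

# Hypothetical teaching hours during which the machines keep their normal role.
# Keys are weekday numbers (Monday = 0); weekends have no entry.
LAB_HOURS = {day: (time(8, 0), time(20, 0)) for day in range(5)}


def available_for_cluster(moment: datetime) -> bool:
    """Return True when a machine may be booted as a cluster node."""
    hours = LAB_HOURS.get(moment.weekday())
    if hours is None:
        return True  # no teaching scheduled on this day
    start, end = hours
    # The machine is reserved for its normal role in the half-open
    # interval [start, end) and free for the cluster otherwise.
    return not (start <= moment.time() < end)
```

Under these assumed hours, a machine would be usable as a node on weekday evenings, at night, and throughout the weekend.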
Implementation
Assumptions:
- network boot
- remotely mounted file systems
- optional local swap and scratch space
Features:
- easy to control and modify
- higher server load
Software architecture
[Figure: software architecture diagram. Labels: Server (DHCP, TFTP, NFS, oneSIS), Cluster (Node1, Node2, Node3), addresses 10.0.0.1 and 10.0.0.2.]
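As a rough illustration of how a server might hand out addresses and boot files to network-booted nodes, the sketch below renders host declarations in the style of an ISC dhcpd configuration. The slides do not say which DHCP server or boot loader is used, so the `pxelinux.0` file name, the MAC address, and the assignment of 10.0.0.1 to the server and 10.0.0.2 to a node are all assumptions; `dhcp_host_entry` is a hypothetical helper.

```python
from ipaddress import ip_address


def dhcp_host_entry(name: str, mac: str, address: str, boot_server: str,
                    boot_file: str = "pxelinux.0") -> str:
    """Render one dhcpd-style host declaration for a network-booted node."""
    # Reject malformed addresses early; ip_address raises ValueError on bad input.
    ip_address(address)
    ip_address(boot_server)
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"      # node's network card (made-up value below)
        f"  fixed-address {address};\n"      # address the node always receives
        f"  next-server {boot_server};\n"    # TFTP server holding the boot file
        f'  filename "{boot_file}";\n'       # boot file name (assumed)
        f"}}\n"
    )


if __name__ == "__main__":
    # Hypothetical node; the MAC address is invented for the example.
    print(dhcp_host_entry("node1", "00:16:3e:00:00:01", "10.0.0.2", "10.0.0.1"))
```

Generating the declarations from one node table keeps every per-node change on the server side, which matches the "modifications only in the cluster space" goal.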
Joining Clusterix
Advantages:
- Access to a nationwide Grid environment
- Potential future access to worldwide resources
- Higher computational power
- Higher availability
Clusterix integration
[Figure: integration diagram. Labels: Cumulus, Cluster, JIMS, GT, VUS, /home/vus0, GRMS, Clusterix, VOIS.]
Summary
- Creation of a fully functional cluster at the expense of one dedicated computer
- Extended power and availability through participation in the National Cluster of Linux Systems
Future work
- Evaluation of the cluster-wide suspend mechanism
- Cluster performance evaluation
- Upgrade of the network infrastructure
- Incorporation of more computational nodes
- …
Questions
Requirements
- Globus Toolkit
- MyProxy
- Monitoring system (JIMS)
- Virtual user account system (VUS)
- Host and user certificates
Installed software
- System: Cluster oneSIS, DHCP, NFS, TFTP, …
- Resource manager (TORQUE)
- Parallel processing environments (MPICH, PVM)
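To show how the resource manager and a parallel environment fit together, here is a minimal sketch that renders a TORQUE job script requesting a number of nodes and launching an MPI program. The `#PBS` directives and `$PBS_O_WORKDIR` are standard TORQUE/PBS features, but the launcher line is an assumption (the exact command depends on the MPICH version installed), and `torque_job_script` with its parameters is a hypothetical helper.

```python
def torque_job_script(job_name: str, nodes: int, procs_per_node: int,
                      walltime: str, command: str) -> str:
    """Render a small TORQUE/PBS job script that launches an MPI program."""
    if nodes < 1 or procs_per_node < 1:
        raise ValueError("nodes and procs_per_node must be positive")
    total_procs = nodes * procs_per_node
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",                          # job name shown in the queue
        f"#PBS -l nodes={nodes}:ppn={procs_per_node}",  # requested nodes and processes per node
        f"#PBS -l walltime={walltime}",                 # maximum run time, HH:MM:SS
        "cd $PBS_O_WORKDIR",                            # directory the job was submitted from
        f"mpiexec -n {total_procs} {command}",          # assumed MPI launcher invocation
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical three-node job running an example binary.
    print(torque_job_script("demo", 3, 1, "00:10:00", "./hello"))
```

The generated text would typically be saved to a file and submitted with `qsub`.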