Presentation is loading. Please wait.

Presentation is loading. Please wait.

CluBORun: tool for utilizing idle resources of computing clusters in volunteer computing. Maxim Manzyuk, Oleg Zaikin, Mikhail Posypkin Volgograd state.

Similar presentations


Presentation on theme: "CluBORun: tool for utilizing idle resources of computing clusters in volunteer computing. Maxim Manzyuk, Oleg Zaikin, Mikhail Posypkin Volgograd state."— Presentation transcript:

1 CluBORun: tool for utilizing idle resources of computing clusters in volunteer computing. Maxim Manzyuk, Oleg Zaikin, Mikhail Posypkin Volgograd state technical university, Volgograd, Russia ISDCT SB RAS, Irkutsk, Russia IITP RAS, Moskow, Russia

2 Why volunteer computing on clusters? There are BOINC volunteer computing projects based mainly on desktop PCs that have performance greater than 1 PFLOPs. Despite this fact there are some reasons why resources of computing clusters can be useful in volunteer computing. computing cluster is quite reliable device, so results obtained on it can be taken as a reference when checking the results computing cluster can significantly help to increase performance of a new volunteer project with low amount of participants clusters resources are usually idle 2

3 CluBORun Cluster for BOINC Run (CluBORun) – tool for launching BOINC volunteer computing on clusters. Main features: use only ordinary cluster’s user rights utilize only idle resources of computing clusters (just as BOINC-manager does it for PCs) launch BOINC volunteer computing for any volunteer project with linux client application 3

4 CluBORun Folders with BOINC clients (own folder for each node) Files-flags for BOINC clients. Start flag corresponds to launched BOINC client, stop-flag means that BOINC client must be stopped Text file all_tasks.txt with list of tasks to launch (with stop-flag, start-flag, BOINC client path as parameters) MPI program start_boinc for launching BOINC client on a particular cluster node. Script catch_node.sh for analyzing cluster workload and launching start_boinc on free nodes 4

5 start_boinc start_boinc is a MPI C++ program that can be launched on several nodes of a cluster. On every node MPI processes are provided with roles. 1 control node process. It launch BOINC on a node (by system command), then periodically (once in 30 seconds) check if stop flag appeared or if time exceeded. Besides this checking this process slуep k-1 sleeping processes MPI processes doesn’t do any concrete tasks computing. BOINC “think” that he work on a simple PC. BOINC connects to a project and do all computations. 5

6 catch_node.sh One can set maximum amount of launched BOINC processes (by default I equal to cluster cores). catch_node.sh analyze cluster workload. If there are free nodes (and limit for BOINC jobs is not exceede) then launch BOINC by start_boinc If in queue appears new task of other user with status “waiting for freed resources” then catch_node.sh stop BOINC tasks (but only if it will help new task to be launched) 6

7 MVS-100k cluster workload for 1 month 7

8 Restricted queue info In cluster management system showing jobs of other users can be disabled (for example, in MVS-10P with SLURM system). In this case CluBORun can see own jobs, but no jobs of other users. By command sinfo CluBORun can see how many nodes are free occupied by BOINC occupied by other users 8

9 Restricted queue info Solution decrease maximum amount of launched BOINC processes if there are free resources, then add BOINC job (call it jobA) to queue if added job obtained status “waiting for resources” then in queue exists at least 1 job of other user with the same status hence CluBOrun switch to restricted mode a) stop adding new jobs in queue b) stop one after one BOINC jobs in queue until jobA will launch 9

10 Restricted queue info Example waiting BOINC jobA waiting jobs of other users running BOINC job1 running BOINC job2 running BOINC job3 10

11 Restricted queue info Variant 1. two stopped BOINC jobs helped other users jobs and BOINC jobA to run. running BOINC jobA running BOINC job3 Variant 2. three stopped BOINC jobs didn’t help BOINC jobA to run. waiting BOINC jobA no running BOINC jobs 11

12 Restricted queue info If stopping BOINC jobs can help, it will help. If not, then CluBOrun will wait for launching of BOINC jobA. After running of jobA CluBORun switchs from restricted mode to normal mode. This algorithm was implemented and works right now on MVS-10P cluster. 12

13 CluBORun resources in SAT@home Range of performance of MVS-100k in SAT@home is about 10-40 %. With the help of CluBORun in SAT@home several problems were solved faster: searching for new pairs of orthogonal diagonal Latin squares of order 10 (17 new pairs were found); solving logical cryptanalysis problem for weakened Bivium cipher (with 10 known bits from 177 bits of secret key, 3 problems were solved); solving A5/1 logical cryptanalysis problems (3 problems were solved) 13

14 Comparison with 3G Bridge 3G Bridge (by LPDS at MTA-SZTAKI, Hungary) is an open-source core job bridging component between different grid infrastructures. For example, between service grid (made of clusters) and desktop grid (made of PCs). Similatity: possibility to move jobs from dektop grid to a cluster. Differences in concrete situations. + 3G Bridge can bridge job not only from desktop grid to service grid, but in the opposite direction too + CluBOrun can use ordinary cluster user rights, idle recourses are used 14

15 Thank you for your attention! 15


Download ppt "CluBORun: tool for utilizing idle resources of computing clusters in volunteer computing. Maxim Manzyuk, Oleg Zaikin, Mikhail Posypkin Volgograd state."

Similar presentations


Ads by Google