A Distributed Computing System Based on BOINC
September - CHEP 2004
Pedro Andrade, António Amorim, Jaime Villate

October 30th,
Overview
Introduction
BOINC
ATLAS Use Case
Tests and Results
Conclusions

Introduction
Project participants:
–Faculdade de Ciências da Universidade de Lisboa
–Faculdade de Engenharia da Universidade do Porto
Derived from the Grid-Brick system presented at CHEP2003
Goals:
–Create a distributed computing system
–Explore commodity CPUs and disks and keep them together
–Use public computing
–Evaluate its use for dedicated HEP clusters

Overview
Introduction
BOINC
–Description
–Features
–Behavior
–Related Work
ATLAS Use Case
Tests and Results
Conclusions

Description
Stands for Berkeley Open Infrastructure for Network Computing
Generic software platform for distributed computing
Developed by the team
Based on public computing
Key concepts:
–Project
–Application
–Workunit (Job)
–Result

Features
Generic platform: supports many applications / projects
Projects can be run simultaneously
Applications written in common languages can run as BOINC applications
Fault tolerance
Monitored through a Web interface
Implements security mechanisms

Behavior
Initial communication
Work request
–Hardware characteristics
Server decides
Workunit download
–Application
–Input files
Results upload
Client makes requests, server is passive
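
The request flow on this slide can be sketched as a small simulation. This is a minimal illustration, not BOINC's actual API: the `Server` and `Client` classes, their method names, the `min_memory_mb` field, and the memory-based matching rule are all hypothetical. It only shows the pull model, in which the client initiates every exchange and the server merely answers.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Workunit:
    name: str
    application: str          # application the client must download
    input_files: list[str]    # input files the client must download
    min_memory_mb: int = 0    # hypothetical hardware requirement


@dataclass
class Server:
    """Passive side: it never contacts clients, it only answers requests."""
    pending: list[Workunit] = field(default_factory=list)
    results: dict[str, str] = field(default_factory=dict)

    def request_work(self, hardware: dict) -> Optional[Workunit]:
        # The server decides, from the reported hardware, what to hand out.
        for wu in self.pending:
            if hardware["memory_mb"] >= wu.min_memory_mb:
                self.pending.remove(wu)
                return wu
        return None

    def upload_result(self, wu_name: str, output: str) -> None:
        self.results[wu_name] = output


@dataclass
class Client:
    hardware: dict

    def run_once(self, server: Server) -> bool:
        wu = server.request_work(self.hardware)    # work request
        if wu is None:
            return False
        # The workunit download (application + input files) would happen here.
        output = f"{wu.application} processed {len(wu.input_files)} file(s)"
        server.upload_result(wu.name, output)      # results upload
        return True


server = Server(pending=[
    Workunit("wu1", "app", ["in1.dat"], min_memory_mb=256),
    Workunit("wu2", "app", ["in2.dat"], min_memory_mb=4096),
])
client = Client(hardware={"memory_mb": 512})
while client.run_once(server):
    pass
```

In this sketch the second workunit stays on the server because the client does not report enough memory; another client with more memory could pick it up later.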

Related Work
Project-specific solutions:
–Distributed.net
Commercial solutions
XtremWeb
JXGrid

Overview
Introduction
BOINC
–Background
–Additional Features
–Behavior
ATLAS Use Case
Tests and Results
Conclusions

Background
Grid-Brick project:
–Presented at CHEP2003
–Goal was to merge storage units with computing farms
Conclusions:
–No central resource manager
–Plug and play clients
–Increase robustness
–Fault-tolerant system

Additional Features
Avoid data movement
User-specific applications
Environments
–Scripts
–Libraries
Environment patches
"get input" apps
Job dependencies

Behavior
Initial communication
Work request
–Hardware characteristics
–Available input files
Server decides:
–Input file exists: ok
–No input file: wait, run "get input" app
Workunit download:
–Application
–Environment / Patches
Results upload
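
The server-side decision on this slide can be sketched as a single function. This is only one reading of the slide, not the system's actual code: the `Decision` enum, the `decide` function, and the file names are hypothetical. It assumes the client lists its available input files in the work request and the server matches them against the input a job needs.

```python
from enum import Enum


class Decision(Enum):
    RUN_JOB = "run job"
    RUN_GET_INPUT = 'run "get input" app'


def decide(required_input: str, available_inputs: set[str]) -> Decision:
    """Hypothetical server-side rule for one job and one requesting client."""
    if required_input in available_inputs:
        # Input file exists on the client: the job can run where the data is.
        return Decision.RUN_JOB
    # No input file: the job waits and the client runs the "get input" app.
    return Decision.RUN_GET_INPUT


# A client reports the input files it already holds with its work request.
client_files = {"gen.0.dat", "sim.0.dat"}
print(decide("sim.0.dat", client_files))
print(decide("sim.1.dat", client_files))
```

Sending the job to the machine that already holds the input is what keeps data movement low; the "get input" path is the fallback when no such machine asks for work.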

ATLAS Use Case
How can physicists use the system to run ATLAS jobs?
The actors of this use case can be:
–Physicist doing personal job submission
–Real production
Let us suppose we have:
–Several ATLAS jobs to run
–Knowledge of which files each job will produce and consume, and how to generate or get these files
–Computers connected to the Internet

ATLAS Use Case
Execution steps:
–Select or submit ATLAS application
–Work submission:
  environment files (job options files, scripts, etc.)
  environment patch
  input file template
  "get input" application
  result (output file) template
As a result, the user gets the aggregation of the produced output files as a single output file.
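
The items of a work submission, and the final aggregation step, can be sketched as follows. This is an illustration under stated assumptions: the `WorkSubmission` class, its field names, and all file names are hypothetical, and aggregation is assumed to be plain concatenation, which the slides do not specify.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class WorkSubmission:
    """Hypothetical container mirroring the items listed on the slide."""
    application: str
    environment_files: list[str]    # job options files, scripts, etc.
    environment_patch: str
    input_template: str             # e.g. "input.{n}.dat"
    get_input_application: str
    result_template: str            # e.g. "output.{n}.dat"


def aggregate(parts: list[Path], target: Path) -> Path:
    """Concatenate the per-job output files into one file (an assumption)."""
    with target.open("wb") as out:
        for part in parts:
            out.write(part.read_bytes())
    return target


submission = WorkSubmission(
    application="atlas-app",
    environment_files=["jobOptions.py", "setup.sh"],
    environment_patch="env.patch",
    input_template="input.{n}.dat",
    get_input_application="get-input",
    result_template="output.{n}.dat",
)

with tempfile.TemporaryDirectory() as tmp:
    parts = []
    for n in range(3):
        # Stand-in for the output file each job would upload.
        part = Path(tmp) / submission.result_template.format(n=n)
        part.write_text(f"events from job {n}\n")
        parts.append(part)
    merged = aggregate(parts, Path(tmp) / "output.all.dat")
    merged_text = merged.read_text()
```

The templates let one submission describe many jobs: each job fills in its own index to name the input it consumes and the output it produces.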

Tests
Based on the defined ATLAS Use Case
Typical ATLAS job sequence using muon events:
–Generation: e events (1x)
–Simulation: e/10 events (10x)
–Digitization: e/10 events (10x)
–Reconstruction: e/10 events (10x)
Two groups of tests were defined: e = 100, e = 1000
For each group, 4 tests were made:
–One simple client
–Two BOINC clients
–Four BOINC clients
–Eight BOINC clients
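
The job counts implied by this sequence can be worked out with a short sketch. The `job_plan` function and its `splits` parameter are hypothetical names introduced for illustration; the stage names and the 1x / 10x split come from the slide, and the two values of e are the 100 and 1000 events of the two test groups.

```python
def job_plan(e: int, splits: int = 10) -> list[tuple[str, int, int]]:
    """Return (stage, number_of_jobs, events_per_job) for each stage."""
    if e % splits:
        raise ValueError("e must be divisible by the number of splits")
    # Generation runs once over all e events.
    plan = [("generation", 1, e)]
    # Every later stage runs `splits` jobs of e / splits events each.
    for stage in ("simulation", "digitization", "reconstruction"):
        plan.append((stage, splits, e // splits))
    return plan


for e in (100, 1000):
    plan = job_plan(e)
    total_jobs = sum(jobs for _, jobs, _ in plan)
    print(e, plan, total_jobs)
```

Each group therefore consists of 31 jobs (1 + 3 × 10); only the number of events per job changes between the two groups.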

Results - Execution Times
Group A: 100 events
Group B: 1000 events

Results - Data Movement
1000 events in 8 machines
Seqx: events x00-x99
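
The "Seqx" labelling can be read as a simple mapping between a sequence index and a block of 100 consecutive events. The sketch below assumes that reading; `sequence_range` and `sequence_of` are hypothetical helper names, and events are assumed to be numbered from 0.

```python
def sequence_range(x: int, size: int = 100) -> range:
    """Events covered by sequence x: x00 through x99."""
    return range(x * size, (x + 1) * size)


def sequence_of(event: int, size: int = 100) -> int:
    """Sequence index that a given event number falls into."""
    return event // size


# 1000 events split into ten sequences, Seq0 to Seq9.
for x in range(10):
    events = sequence_range(x)
    print(f"Seq{x}: events {events[0]:03d}-{events[-1]:03d}")
```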

Conclusions
Several BOINC projects are currently running successfully worldwide
From tests:
–Execution of user applications => more flexibility
–Environments and patches => easier work submission
–Heavier computation => better results
–Low data movement => better results
The system can be brought into physicists' daily tasks with little effort