2015-10-17 Wenjing Wu, Andrej Filipčič, David Cameron, Eric Lancon, Claire Adam Bourdarios & others.

1 ATLAS@home
2015-10-17
Wenjing Wu, Andrej Filipčič, David Cameron, Eric Lancon, Claire Adam Bourdarios & others

2 ATLAS: Elementary Particle Physics
One of the biggest experiments at CERN, trying to understand the origin of mass, which completes the Standard Model
2012: ATLAS and CMS discovered the Higgs boson

5 data processing flow in ATLAS

10 Why ATLAS@home
It's free! Well, almost.
Public outreach: volunteers want to know more about the project they participate in
Good for ATLAS visibility
Can add significant computing power to WLCG
A brief history
– Started at the end of 2013 on a test instance at IHEP, Beijing
– Migrated to CERN and officially launched in June 2014
– Running continuously since then

11 ATLAS@home
Goal: run ATLAS simulation jobs on volunteer computers.
Challenges:
– Big ATLAS software base (~10 GB) that is very platform dependent; it runs on Scientific Linux
– Volunteer computing resources should be integrated into the current Grid computing infrastructure. In other words, all the volunteer computers should appear as a WLCG site, and jobs are submitted from PanDA (the ATLAS Grid computing portal).
– Grid computing relies heavily on personal credentials, but these credentials should not be put on volunteer computers

12 Solutions
Use VirtualBox + vmwrapper to virtualize volunteer hosts
Use the network file system CVMFS to distribute ATLAS software; because CVMFS supports on-demand file caching, it helps reduce the image size
To avoid placing credentials on the volunteer hosts, ARC CE is introduced into the architecture together with BOINC
– ARC CE is grid middleware: it interacts with the ATLAS central grid services and manages different LRMSs (Local Resource Management Systems), such as Condor and PBS, through LRMS-specific plugins
– A BOINC plugin was developed to forward "Grid jobs" to the BOINC server and convert the job results into Grid format

13 Architecture
ATLAS Workload Management System

14 BOINC ARC CE plugin (1)
Converts an ARC CE job into a BOINC job
The plugin includes:
– Submit/scan/cancel job
– Information provider (total CPUs, CPU usage, job status)
Submit
– ARC CE job: all input files are packed into one tar.gz file
– Copy the input file from the ARC CE session directory into the BOINC internal directory
– Set up the BOINC environment and call the BOINC command to generate a job based on job templates/input files
– Write the job ID back to the ARC CE job control directory
– Upon job completion, BOINC services put the desired output files back into the ARC CE session directory
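The Submit steps above can be sketched roughly as follows. This is a minimal illustration, not the actual plugin: the helper names, the `_input.tar.gz` suffix and the `job.<id>.boinc_id` marker file are hypothetical, and the plain file copy stands in for whatever staging the real BOINC server requires. The `create_work` flags follow the BOINC server tool as commonly documented and should be checked against the installed version; the command is only built here, not executed.

```python
import shutil
import tarfile
from pathlib import Path


def pack_inputs(session_dir: Path, job_name: str) -> Path:
    """Pack all regular files in the ARC CE session directory into one tar.gz."""
    archive = session_dir / f"{job_name}_input.tar.gz"  # hypothetical naming
    # Collect inputs first so the archive never includes itself.
    inputs = sorted(p for p in session_dir.iterdir() if p.is_file() and p != archive)
    with tarfile.open(archive, "w:gz") as tar:
        for path in inputs:
            tar.add(path, arcname=path.name)
    return archive


def stage_input(archive: Path, boinc_dir: Path) -> Path:
    """Copy the packed input into a BOINC-side directory (simplified staging)."""
    boinc_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(archive, boinc_dir / archive.name))


def build_create_work_cmd(app_name: str, job_name: str, wu_template: str,
                          result_template: str, staged: Path) -> list:
    """Build (but do not run) a create_work command from templates and input."""
    return [
        "bin/create_work",
        "--appname", app_name,
        "--wu_name", job_name,
        "--wu_template", wu_template,
        "--result_template", result_template,
        staged.name,
    ]


def record_job_id(control_dir: Path, arc_job_id: str, boinc_job_name: str) -> Path:
    """Write the BOINC job name back under the ARC CE control directory."""
    control_dir.mkdir(parents=True, exist_ok=True)
    marker = control_dir / f"job.{arc_job_id}.boinc_id"  # hypothetical file name
    marker.write_text(boinc_job_name + "\n")
    return marker
```

In a real plugin the command list would be passed to something like `subprocess.run` from the BOINC project directory; keeping the command construction separate makes it easy to inspect what would be submitted.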

15 BOINC ARC CE plugin (2)
Scan
– Scan the job diag file (in the session directory), get the exit code, upload output files to the designated SE, and update the ARC CE job status
Cancel
– Cancel a BOINC job
Information provider
– Query the BOINC DB for the total number of CPUs, CPU usage, and the status of each job
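The exit-code part of the Scan step might look like the sketch below. It assumes the diag file is a list of `key=value` lines with an `exitcode` entry, which is an assumption about the file layout rather than something stated in the slides; the state names `RUNNING`, `FINISHED` and `FAILED` are illustrative placeholders, not ARC CE's own state strings.

```python
def parse_diag(text: str) -> dict:
    """Parse key=value lines from a job diag file; other lines are ignored."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def job_state_from_diag(text: str) -> str:
    """Map the recorded exit code to a coarse job state (illustrative names)."""
    raw = parse_diag(text).get("exitcode")  # assumed key name
    if raw is None:
        return "RUNNING"  # no exit code written yet
    try:
        code = int(raw)
    except ValueError:
        return "FAILED"  # unreadable exit code is treated as a failure
    return "FINISHED" if code == 0 else "FAILED"
```

Uploading the outputs to the storage element and updating the ARC CE job status would follow once a `FINISHED` state is seen; those steps depend on site configuration and are left out here.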

16 Current Status
Gained CPU hours: 103,355
Daily resource: 3% of grid computing

17 Current Status:

18 the Whole ATLAS Computing

19 ATLAS jobs
Full ATLAS simulation jobs
– 10 events/job initially
– Now 100 events/job
A typical ATLAS simulation job
– 40~80 MB input data
– 10~30 MB output data
– On average, 92 minutes CPU time and 114 minutes elapsed time
CPU efficiency lower than on the grid
– Slow home network → significant initialization time
– CPUs not available all the time
Jobs run in a virtualized SLC5 64-bit environment (since upgraded to SLC6, µCernVM) on Windows, Linux, and Mac
ANY kind of job could run on ATLAS@home
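As a rough worked example of the CPU efficiency mentioned above, the average timings can be turned into a ratio. The definition used here (CPU time divided by elapsed time, for a single-core job) is an assumption; the slides do not state how efficiency is computed.

```python
def cpu_efficiency(cpu_minutes: float, elapsed_minutes: float) -> float:
    """Fraction of elapsed wall-clock time spent on the CPU (single-core assumption)."""
    if elapsed_minutes <= 0:
        raise ValueError("elapsed time must be positive")
    return cpu_minutes / elapsed_minutes


# Average values quoted on the slide: 92 min CPU time, 114 min elapsed time.
efficiency = cpu_efficiency(92, 114)
print(f"{efficiency:.1%}")  # prints 80.7%
```

Under that definition, about a fifth of the elapsed time goes to something other than computation, which is consistent with the initialization and availability effects listed on the slide.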

20 How Grid People see ATLAS@home
Volunteers want to earn credits for their contribution, and they want their PCs to work optimally
– This is true for the grid sites as well, or at least it should be
– But volunteers are better shifters than we are
Different from what we are used to:
– On the grid: jobs are failing, please fix the sites!
– On BOINC: jobs suck, please fix your code!
ATLAS@home is the first BOINC project with massive I/O demands, even for less intensive jobs
– Server infrastructure needs to be carefully planned to cope with a high load
Credentials must not be passed to PCs
Jobs can stay in execution for a long time, depending on the volunteer's computing preferences, so they are not suitable for high-priority tasks

21 ATLAS outreach
Outreach website: https://atlasphysathome.web.cern.ch/
Feedback mailing list: atlas-comp-contact-home@cern.ch

22 Future Effort (1)
Customize the VM image to reduce network traffic and speed up initialization
Optimize file transfers, server load and job efficiency on the PCs
Test and migrate to the LHC@home infrastructure
Test whether BOINC can replace the small Grid sites
Investigate the use of BOINC on local batch clusters to run ATLAS jobs
Investigate running various workflows (longer jobs, multi-core jobs) on virtual machines

23 Future Effort (2)
Provide an event display and possibly a screen saver that would let people see what they are running.

24 Acknowledgements
David and Rom for all their support and suggestions.
CERN IT for providing server and storage resources for ATLAS@home and for working on integrating ATLAS@home with LHC@home

