Experiment Support CERN IT Department CH-1211 Geneva 23 Switzerland t DBES A. Abramyan, S. Bagnasco, L. Betev, D. Goyal, A. Grigoras, C. Grigoras, M. Litmaath, N. Manukyan, M. Martinez, J. Porter, P. Saiz, S. Sankar, S. Schreiner AliEn v2-20 and beyond Workshop dei Tier-2 italiani di ALICE
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 219 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE Content What is AliEn New features on v2.20 –TaskQueue –Catalogue –Service communication What is next? Summary
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 319 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE All components to create a GRID File Catalogue –UNIX-like file system –Mapping to physical files –Metadata information –SE discovery Transfer Model –With different plugins TaskQueue –Job Agent & pull model –Automatic installation of software packages –Simulation, reconstruction, analysis... Developed by ALICE –Used by several communities AliEn
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 419 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE AliEn File Catalogue Global Unique name space –Mapping from LFN to PFN UNIX-like file system interface Powerful metadata catalogue Automatic SE selection Integrated quota system Multiple storage protocols: xrootd, torrent, srm, file Collections of files Physical file archival Roles and users
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 519 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE xrootd Job execution Job Manager JOB TASKQUEUE Job Broker CE MonALISA xrootd Site A JOB MonALISA xrootd Site B MonALISA Site C File catalogue LFN GUID Meta data JOB CE JA CREAMCE
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 619 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE New in v2.20
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 719 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE TaskQueue database layout Single DB Innodb tables –Row locking –Foreign keys –Transactions not used… Lookup tables 2 JDLs per job JDL fields mapped to columns Link to full graph
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 819 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE Brokering Avoid Classad matching –Less fields to parse Match in a single SQL statement. Four attempts at matching: –With packages already installed –With any packages –With remote data and packages already installed –With remote data, any packages
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 919 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE File brokering Site ASite BSite C File 1 File 2 File 3 File 4 File 5 Current schema Submit 4 jobs: File1 File 4 File2File3File 5 Broker per file Submit 3 empty subjobs File1, 2,4,5 When a job starts, analyze as much as possible File 3 If nothing left, just exit
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 1019 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE More TaskQueue MaxWaitingTime: amount of time that job can stay in ‘WAITING’ –If time exceeded, job ends up in error –New state: ERROR_EW (Expired Waiting) Retrial: –Number of times that a single job can be resubmitted –Resubmission done by central services Reusing JobId in resubmission Direct removal of KILLED jobs
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 1119 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE Some results… DB time to insert a job, and 8 change status: Time to process all 250M ALICE jobs: 4.8 days
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 1219 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE Service communication Replacing SOAP with JSON –Less overhead (no XML encoding) –Easier to interact with other clients Backward incompatible change To be deployed in ALICE…
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 1319 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE SOAP vs JSON Apache web server 32 hosts for clients –16 cores –8000 calls per client Without SSL
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 1419 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE Catalogue Innodb tables –Row locking –Transactions –Foreign keys To be deployed in ALICE…
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 1519 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE What is next?
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 1619 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE And for the next versions… Trust model File popularity Interactive jobs Correlate Monitoring data Multi core jobagents Catalogue crawler Error classification Distributed brokering
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 1719 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE File catalogue Removal of GUID –Decrease size of the catalogue –Storage on the sites based on lfn+timestamp Using file system instead of Database –Keep database for metadata, quotas, SE. Improve handling of zip archives –More than 80% of the lfn are inside an archive
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 1819 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE TaskQueue Compression of JDL –And/or storing diffs Brokering alternatives: –2-level brokering JA ask CM, CM asks in bulk the CS –Combining jobs with similar input And dispatch them together Multicore jobagent –One agent per core or per machine?
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 1919 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE Human grid Chile, Trust model USA, JA memory Armenia, XML model File Popularity India, File deletion South Korea, Quota system Germany, ORACLE Italy, CREAMCE Scotland, VO to VO Switzerland, Main dev. China, Trust Model
CERN IT Department CH-1211 Geneva 23 Switzerland t ES 2019 Dec 2012 Pablo Saiz Workshop dei Tier-2 italiani di ALICE Summary Parts of AliEn v2.20 already deployed for ALICE! –Needs another intervention, with 48h downtime –PANDA runs all the latest components TaskQueue speed improved drastically –40 times insertion rate –20 times resubmission time –Improved concurrency Plenty of areas to develop and contribute