

AliEn v2-20 and beyond
A. Abramyan, S. Bagnasco, L. Betev, D. Goyal, A. Grigoras, C. Grigoras, M. Litmaath, N. Manukyan, M. Martinez, J. Porter, P. Saiz, S. Sankar, S. Schreiner
Experiment Support, CERN IT Department, CH-1211 Geneva 23, Switzerland
Workshop dei Tier-2 italiani di ALICE

Pablo Saiz, 19 Dec 2012

Content
What is AliEn
New features in v2.20
  – TaskQueue
  – Catalogue
  – Service communication
What is next?
Summary

AliEn
All components to create a GRID
File Catalogue
  – UNIX-like file system
  – Mapping to physical files
  – Metadata information
  – SE discovery
Transfer Model
  – With different plugins
TaskQueue
  – Job Agent & pull model
  – Automatic installation of software packages
  – Simulation, reconstruction, analysis...
Developed by ALICE
  – Used by several communities

AliEn File Catalogue
Global Unique name space
  – Mapping from LFN to PFN
UNIX-like file system interface
Powerful metadata catalogue
Automatic SE selection
Integrated quota system
Multiple storage protocols: xrootd, torrent, srm, file
Collections of files
Physical file archival
Roles and users
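The LFN to PFN mapping and the UNIX-like view can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the class `Catalogue`, its methods `register`, `whereis` and `ls`, and the example paths are hypothetical and are not the AliEn API. It assumes each logical file name (LFN) points to one GUID, and each GUID to one or more physical file names (PFNs).

```python
import uuid


class Catalogue:
    """Toy file catalogue: LFN -> GUID -> list of PFNs (hypothetical model)."""

    def __init__(self):
        self.lfn_to_guid = {}
        self.guid_to_pfns = {}

    def register(self, lfn, pfn):
        # The first registration of an LFN creates its GUID; later ones add replicas.
        guid = self.lfn_to_guid.setdefault(lfn, str(uuid.uuid4()))
        self.guid_to_pfns.setdefault(guid, []).append(pfn)
        return guid

    def whereis(self, lfn):
        # Resolve a logical name to all known physical replicas.
        return list(self.guid_to_pfns[self.lfn_to_guid[lfn]])

    def ls(self, directory):
        # UNIX-like listing: the next path component of every LFN under `directory`.
        prefix = directory.rstrip("/") + "/"
        return sorted(
            {lfn[len(prefix):].split("/")[0]
             for lfn in self.lfn_to_guid if lfn.startswith(prefix)}
        )
```

Registering the same LFN twice with different PFNs models a file replicated on two storage elements; `whereis` then returns both replicas.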

Job execution
[Diagram: central services (Job Manager, Job Broker, TaskQueue, and the File catalogue with LFN, GUID and metadata) and Sites A, B and C. Site labels include CE, CREAM CE, JA, JOB, xrootd and MonALISA.]

New in v2.20

TaskQueue database layout
Single DB
InnoDB tables
  – Row locking
  – Foreign keys
  – Transactions not used…
Lookup tables
2 JDLs per job
JDL fields mapped to columns
Link to full graph

Brokering
Avoid ClassAd matching
  – Fewer fields to parse
Match in a single SQL statement
Four attempts at matching:
  – With packages already installed
  – With any packages
  – With remote data and packages already installed
  – With remote data, any packages
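The four matching attempts can be sketched with an in-memory SQLite table. This is a minimal illustration, not the AliEn schema: the table `queue`, its columns `job_id`, `data_site` and `package`, and the functions `make_queue` and `match` are hypothetical. It also assumes that the first two attempts require the input data to be local to the site, and that each job needs a single package.

```python
import sqlite3


def make_queue(jobs):
    """Create an in-memory table of waiting jobs (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE queue (job_id INTEGER PRIMARY KEY, data_site TEXT, package TEXT)"
    )
    db.executemany("INSERT INTO queue VALUES (?, ?, ?)", jobs)
    return db


def match(db, site, installed):
    """Return (attempt_number, job_id) for the first attempt that finds a job."""
    installed = list(installed)
    if installed:
        pkg = "package IN ({})".format(",".join("?" * len(installed)))
    else:
        pkg = "0 = 1"  # nothing installed: this condition can never match
    attempts = [
        (f"data_site = ? AND {pkg}", [site] + installed),  # 1. local data, packages installed
        ("data_site = ?", [site]),                         # 2. local data, any packages
        (pkg, installed),                                  # 3. remote data, packages installed
        ("1 = 1", []),                                     # 4. remote data, any packages
    ]
    for number, (where, params) in enumerate(attempts, start=1):
        # Each attempt is one SQL statement; no per-job ClassAd parsing is needed.
        row = db.execute(
            f"SELECT job_id FROM queue WHERE {where} ORDER BY job_id LIMIT 1", params
        ).fetchone()
        if row:
            return number, row[0]
    return None
```

Ordering the attempts from strictest to loosest means an agent only falls back to remote data or to installing packages when no cheaper job is waiting.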

File brokering
[Diagram: File 1 to File 5 distributed over Site A, Site B and Site C]
Current schema
  – Submit 4 jobs (diagram labels: File1, File 4, File2, File3, File 5)
Broker per file
  – Submit 3 empty subjobs
  – When a job starts, analyze as much as possible (diagram labels: File1, 2, 4, 5; File 3)
  – If nothing left, just exit
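The "broker per file" idea can be illustrated as follows. This is a sketch under assumptions: the function `run_subjob`, the `replicas` mapping and the replica layout used in the usage note are hypothetical, since the original diagram does not survive in the transcript. An empty subjob starts at a site, claims every still-pending file that has a replica there, and exits if nothing is left.

```python
def run_subjob(site, replicas, pending):
    """Claim all pending files with a replica at `site`.

    `replicas` maps a file name to the set of sites holding it.
    `pending` is the shared set of files not yet analyzed; it is updated in place.
    An empty result means the subjob has nothing to do and should just exit.
    """
    claimed = [name for name in sorted(pending) if site in replicas[name]]
    pending.difference_update(claimed)
    return claimed
```

With five files spread over three sites, the first subjob to start takes everything it can reach locally, later subjobs take what remains, and a subjob that finds nothing exits immediately.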

More TaskQueue
MaxWaitingTime: amount of time that a job can stay in ‘WAITING’
  – If time exceeded, job ends up in error
  – New state: ERROR_EW (Expired Waiting)
Retrial:
  – Number of times that a single job can be resubmitted
  – Resubmission done by central services
Reusing JobId in resubmission
Direct removal of KILLED jobs
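The waiting-time expiry and the resubmission counter can be sketched as a small state model. Only `MaxWaitingTime`, `WAITING` and `ERROR_EW` come from the slide; the `Job` class, its field names, and the functions `expire_waiting` and `resubmit` are hypothetical. The sketch also assumes that any job in an error state is eligible for resubmission, which the slide does not state.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    status: str = "WAITING"
    waiting_since: float = 0.0       # time at which the job entered WAITING
    max_waiting_time: float = 3600.0  # seconds allowed in WAITING
    retries_left: int = 0             # remaining resubmissions


def expire_waiting(job, now):
    """Move a job to ERROR_EW if it has waited longer than its limit."""
    if job.status == "WAITING" and now - job.waiting_since > job.max_waiting_time:
        job.status = "ERROR_EW"
    return job.status


def resubmit(job, now):
    """Central-service resubmission: same job_id, back to WAITING, one retry used."""
    if job.status.startswith("ERROR") and job.retries_left > 0:
        job.retries_left -= 1
        job.status = "WAITING"
        job.waiting_since = now
        return True
    return False
```

Because the job object is reused, the job identifier stays the same across resubmissions, which mirrors the "Reusing JobId in resubmission" item.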

Some results…
DB time to insert a job and apply 8 status changes: [chart not preserved in the transcript]
Time to process all 250M ALICE jobs: 4.8 days
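The 4.8-day figure can be turned into a per-job cost with simple arithmetic. The sketch below assumes the figure is the total database time for 250 million jobs, each consisting of one insert plus eight status changes; the function name `throughput` is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def throughput(jobs, days, ops_per_job):
    """Derive rates from a total job count and the total processing time."""
    seconds = days * SECONDS_PER_DAY
    jobs_per_second = jobs / seconds
    return {
        "jobs_per_second": jobs_per_second,
        "ms_per_job": 1000.0 / jobs_per_second,
        "ops_per_second": jobs_per_second * ops_per_job,
    }


# 1 insert + 8 status changes = 9 database operations per job
rates = throughput(jobs=250_000_000, days=4.8, ops_per_job=9)
print(rates)  # about 603 jobs/s, about 1.66 ms per job, about 5400 operations/s
```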

Service communication
Replacing SOAP with JSON
  – Less overhead (no XML encoding)
  – Easier to interact with other clients
Backward incompatible change
To be deployed in ALICE…
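The overhead argument can be made concrete by serializing the same call both ways. This is an illustration only: the helpers `as_json` and `as_soap`, the method name `getJobStatus` and the message layout are hypothetical and do not reproduce the AliEn wire format. The SOAP side is a hand-built SOAP 1.1 envelope.

```python
import json
from xml.sax.saxutils import escape


def as_json(method, params):
    """Encode a call as a compact JSON object."""
    return json.dumps({"method": method, "params": params})


def as_soap(method, params):
    """Encode the same call as a minimal SOAP 1.1 envelope."""
    args = "".join(
        f"<arg{i}>{escape(str(p))}</arg{i}>" for i, p in enumerate(params)
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        f"<soap:Body><{method}>{args}</{method}></soap:Body>"
        "</soap:Envelope>"
    )


call = ("getJobStatus", [1234])  # hypothetical method and argument
print(len(as_json(*call)), len(as_soap(*call)))
```

For a small call like this one, the envelope and namespace declarations dominate the SOAP message, while the JSON form carries little beyond the method name and arguments.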

SOAP vs JSON
Apache web server
32 hosts for clients
  – 16 cores
  – 8000 calls per client
Without SSL

Catalogue
InnoDB tables
  – Row locking
  – Transactions
  – Foreign keys
To be deployed in ALICE…

What is next?

And for the next versions…
Trust model
File popularity
Interactive jobs
Correlate Monitoring data
Multi core jobagents
Catalogue crawler
Error classification
Distributed brokering

File catalogue
Removal of GUID
  – Decrease size of the catalogue
  – Storage on the sites based on lfn+timestamp
Using file system instead of Database
  – Keep database for metadata, quotas, SE
Improve handling of zip archives
  – More than 80% of the LFNs are inside an archive

TaskQueue
Compression of JDL
  – And/or storing diffs
Brokering alternatives:
  – 2-level brokering: JA asks CM, CM asks the CS in bulk
  – Combining jobs with similar input, and dispatching them together
Multicore jobagent
  – One agent per core or per machine?
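Both JDL ideas, compression and storing diffs, can be sketched with the standard library. This is a hedged illustration: it treats a JDL as a plain dictionary of fields, and the helpers `serialize`, `compress_jdl`, `decompress_jdl`, `make_diff` and `apply_diff` are hypothetical. How AliEn would actually store JDLs is not described in the slides.

```python
import json
import zlib


def serialize(jdl):
    """Stable byte representation of a JDL held as a dict."""
    return json.dumps(jdl, sort_keys=True).encode("utf-8")


def compress_jdl(jdl):
    """Option 1: store the whole JDL, compressed."""
    return zlib.compress(serialize(jdl))


def decompress_jdl(blob):
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def make_diff(master, sub):
    """Option 2: store only what differs from a master JDL."""
    changed = {k: v for k, v in sub.items() if k not in master or master[k] != v}
    removed = [k for k in master if k not in sub]
    return {"changed": changed, "removed": removed}


def apply_diff(master, diff):
    """Rebuild a subjob JDL from the master JDL and its stored diff."""
    jdl = {k: v for k, v in master.items() if k not in diff["removed"]}
    jdl.update(diff["changed"])
    return jdl
```

The two options can be combined: subjobs of one master job usually share most fields, so a diff against the master is small, and the master itself compresses well when its input list is repetitive.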

Human grid
Chile, Trust model
USA, JA memory
Armenia, XML model File Popularity
India, File deletion
South Korea, Quota system
Germany, ORACLE
Italy, CREAMCE
Scotland, VO to VO
Switzerland, Main dev.
China, Trust Model

Summary
Parts of AliEn v2.20 already deployed for ALICE!
  – Needs another intervention, with 48h downtime
  – PANDA runs all the latest components
TaskQueue speed improved drastically
  – 40 times insertion rate
  – 20 times resubmission time
  – Improved concurrency
Plenty of areas to develop and contribute