+ AliEn site services and monitoring
Miguel Martinez Pedreira

+ What is AliEn?
AliEn (ALICE Environment) is a lightweight open-source Grid framework. It is built around other open-source components and combines a Web Service model with a Distributed Agent model.
It is designed to meet the offline computing needs of a HEP experiment: massive amounts of data imply distributing both storage and processing.
It started within the ALICE Off-line Project at CERN and constitutes the production environment for simulation, reconstruction, and analysis of physics data of the ALICE Experiment.
The current status of ALICE grid operation can be found at MonALISA Grid Monitoring.
AliEn development - Miguel Martinez Pedreira

+ Virtual Organisations
Users (job submission for data analysis)
Central management (the “brain” of the GRID)
Sites (the “muscle” of the GRID)

+ Distributed Analysis

+ AliEn summary
3-layer system that leverages the deployed resources of the underlying WLCG infrastructures and services
Interfaces to AliRoot via a ROOT plugin (TAlien) that implements the AliEn API
Complex workflows, including distributed analysis, are built on top of the AliEn API
Used by ALICE and PANDA

+ Site services summary
CE
  Component that runs in a loop and submits jobs to the batch system
  Communicates with the JobBroker to check whether the cluster parameters and the waiting jobs fit
ClusterMonitor (and CMreport)
  Proxies several kinds of communication, such as process information from jobs; mainly connects nodes to the VoBox, gets job slots for the CE, and handles other external calls
  CMreport parses the messages stored by the CM and sends them to the CS (job logs)
MonaLisa
  Monitoring component. Controls services and sends all kinds of information to the ML central repository
PackMan
  See CVMFS later. Used to manage which software packages (AliRoot, etc.) are installed on the nodes, to trigger installations, and so on. The previous technology used for this was torrent

+ JobAgent
Also commonly called a ‘Pilot’
Runs the job itself
Starts from a script sent to the batch system by the CE
Gets a job from the TaskQueue
Initialization, sandbox, proxy, environment…
Several processes: 1 communication daemon, 1 control process, 1 worker process
Measures CPU, memory, and disk usage; sends heartbeats…
Uploads output files to the storage elements
Continues running other jobs if possible
Has its own TTL
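The "continues running other jobs if possible" and "own TTL" points above can be pictured with a small simulation. This is only a sketch under stated assumptions, not AliEn code: `run_pilot`, the queue file format (`<job_id> <estimated_seconds>` per line), and the rule "stop when the next job does not fit in the remaining TTL" are all hypothetical stand-ins for the TaskQueue and the real JobAgent logic. No job is actually executed; the loop only prints what it would do.

```shell
#!/bin/sh
# Hypothetical sketch of a pilot loop; NOT the real JobAgent.
# run_pilot QUEUE_FILE TTL_SECONDS
#   QUEUE_FILE stands in for the TaskQueue: one "<job_id> <estimated_seconds>" per line.
run_pilot() {
    queue="$1"
    ttl="$2"
    while read -r job_id cost; do
        # Assumed rule: only take a job that fits in the remaining TTL.
        if [ "$cost" -gt "$ttl" ]; then
            echo "stop: job $job_id needs ${cost}s, only ${ttl}s left"
            break
        fi
        echo "run: job $job_id"
        ttl=$((ttl - cost))
    done < "$queue"
    echo "exit: ${ttl}s of TTL unused"
}

# Demo with three made-up jobs and a 100 s TTL.
queue_file=$(mktemp)
printf '%s\n' "101 30" "102 50" "103 40" > "$queue_file"
run_pilot "$queue_file" 100
rm -f "$queue_file"
```

With the demo queue, the first two jobs fit (30 s + 50 s), the third does not fit in the remaining 20 s, and the pilot exits.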

+ Site services
[Diagram: JobAgent, ClusterMonitor, CMreport, MonaLisa, CE, JobAgent, Central Services, DBs]

+ CVMFS

+ New (WLCG) VoBox pseudo-script
Put host certificates under /etc/grid-security
    mkdir -p /etc/grid-security
    cp /etc/grid-security
    chmod 400 /etc/grid-security/hostkey.pem
Install WLCG software
    yum install yum-priorities yum-protectbase
    rpm -Uvh
    rpm -Uvh
    yum install wlcg-vobox
Configure VoBox
    mkdir yaim
    # put yaim files in the created folder
    /opt/glite/yaim/bin/yaim -c -s /root/yaim/voboxalice-site-info.def -n VOBOX
Open ports [22, 1975, 1093, 8084, 9000, 7001, 8884, 9000, 9930]
Install meta-package for all library dependencies
    yum install HEP_OSlibs_SL6
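The "Open ports" step above does not say which firewall tool or protocol to use. As a rough sketch, assuming an iptables-based host firewall, the helper below only prints one candidate rule per unique port so the output can be reviewed before anything is applied. `print_port_rules` and `VOBOX_PORTS` are hypothetical names, and the protocol is left as a parameter because the slide does not state whether each port is TCP or UDP.

```shell
#!/bin/sh
# Port list copied from the slide (9000 appears twice there); duplicates are dropped below.
VOBOX_PORTS="22 1975 1093 8084 9000 7001 8884 9000 9930"

# print_port_rules PROTO
#   Prints (does not run) one iptables ACCEPT rule per unique port.
#   Assumption: iptables is the firewall in use; PROTO (tcp/udp) must be chosen per site.
print_port_rules() {
    proto="$1"
    for port in $VOBOX_PORTS; do
        echo "$port"
    done | sort -n -u | while read -r port; do
        echo "iptables -I INPUT -p $proto --dport $port -j ACCEPT"
    done
}

print_port_rules tcp
```

Printing the rules instead of running them keeps the sketch safe to try on any machine; the reviewed lines can then be adapted to the site's own firewall policy.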

+ New (WLCG) VoBox pseudo-script
Install CVMFS (squid servers?)
    rpm -Uvh release-2-4.el6.noarch.rpm
    yum install cvmfs cvmfs-keys cvmfs-init-scripts
CVMFS configuration file:
    cp cvmfs-default.local /etc/cvmfs/default.local
    cvmfs_config setup
    cvmfs_config chksetup
    cvmfs_config probe
Make sure you can see it:
    ls -l /cvmfs/alice.cern.ch/bin/
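The final "make sure you can see it" check can be wrapped in a small function that returns a status code, which is convenient in a site health script. This is a minimal sketch; `check_cvmfs_dir` is a hypothetical helper name, and "directory exists and is not empty" is an assumed definition of "visible".

```shell
#!/bin/sh
# check_cvmfs_dir DIR
#   Returns 0 if DIR exists and lists at least one entry,
#   1 if it is missing, 2 if it is empty. (Hypothetical helper.)
check_cvmfs_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "empty: $dir"
        return 2
    fi
    echo "ok: $dir"
}

# Path taken from the slide; "|| true" keeps the demo from aborting
# on machines where CVMFS is not mounted.
check_cvmfs_dir /cvmfs/alice.cern.ch/bin || true
```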

+ New (WLCG) VoBox pseudo-script
Links in $HOME/bin to AliEn bins
Proxy steps ( 7)
Some files for AliEn (under $HOME/.alien)

+ Monitoring
Reliability = yes / (yes + no)
Availability = yes / (yes + no + unknown)
What do you want to see?
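The two ratios above differ only in whether "unknown" test results count against the site. A short worked example, assuming the counts of yes / no / unknown results are already available as numbers (the function names and the sample counts are made up for illustration):

```shell
#!/bin/sh
# reliability YES NO            -> yes / (yes + no)
# availability YES NO UNKNOWN   -> yes / (yes + no + unknown)
# Both print "n/a" when there are no results to divide by.
reliability() {
    awk -v y="$1" -v n="$2" \
        'BEGIN { d = y + n; if (d == 0) print "n/a"; else printf "%.3f\n", y / d }'
}
availability() {
    awk -v y="$1" -v n="$2" -v u="$3" \
        'BEGIN { d = y + n + u; if (d == 0) print "n/a"; else printf "%.3f\n", y / d }'
}

# Made-up sample: 90 passed, 10 failed, 20 unknown.
reliability 90 10        # prints 0.900
availability 90 10 20    # prints 0.750
```

With the same 90 passes, the unknown results lower availability (0.750) but leave reliability (0.900) untouched.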


+ More monitoring
LAN / WAN traffic
Bandwidth tests
What do you want to have in ML that is useful for you?