Panda-based Software Installation

Tadashi Maeno (BNL)

Installation using Panda (1/3)

[Diagram: the operator submits an installation job to the Panda server over HTTPS, alongside production jobs submitted by ProdSys. Pilots at each site (site A, site B) pull jobs from the server over HTTPS and install the software into $OSG_APP on the worker nodes.]

Installation using Panda (2/3)

- Someone submits installation jobs to Panda through the usual HTTP interface
  - The same interface is used for production and analysis as well
  - Authentication
  - Scheduling (priority, retry, ...)
- Pilots retrieve jobs (a minimal sketch of the polling step follows this list)
  - Each pilot knows which type of job it should retrieve
  - Production pilots run the ATLAS TRF
  - Installation pilots run the installation TRF
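
To make the pull model concrete, here is a minimal sketch of how a pilot might poll the server for a job of its type. The server URL, endpoint path, and parameter names (siteName, prodSourceLabel) are assumptions for illustration; the real pilot talks to the Panda dispatcher with grid-certificate authentication.

import requests

# Hypothetical server URL; a real pilot takes this from its configuration.
PANDA_SERVER = "https://pandaserver.example.org:25443"

def get_job(site_name, prod_source_label):
    """Poll the server for one job at the given site.

    prod_source_label selects the job type this pilot should retrieve,
    so production pilots and installation pilots fetch different jobs.
    """
    response = requests.post(
        PANDA_SERVER + "/server/panda/getJob",
        data={"siteName": site_name, "prodSourceLabel": prod_source_label},
        cert=("/path/to/proxy.pem", "/path/to/proxy.key"),  # grid proxy credentials
    )
    response.raise_for_status()
    return response.text  # URL-encoded job spec; empty when no job is waiting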

Installation using Panda (3/3)

- Installation TRF (sketched below)
  - Downloads pacman-latest.tgz from http://physics.bu.edu
  - Sets up Pacman
  - Scans the destination directory to find setup.sh for the Athena runtime
  - Installs ATLAS releases and/or production caches when setup.sh is missing
  - Runs Kit Validation
- Advantages
  - Automation
  - Scalability
  - Panda infrastructure, e.g. monitoring
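
The following is a hedged sketch of the TRF steps listed above, not the actual installation TRF; the exact Pacman download path, the pacman command line, and the kit-validation script name are illustrative assumptions.

import os
import subprocess
import urllib.request

def install_release(dest_dir, release, caches):
    """Sketch of the installation TRF: fetch Pacman, check for an existing
    Athena runtime, install only if it is absent, then validate."""
    # 1. Download and unpack Pacman (the exact URL path is an assumption)
    tarball = "pacman-latest.tgz"
    urllib.request.urlretrieve("http://physics.bu.edu/" + tarball, tarball)
    subprocess.check_call(["tar", "xzf", tarball])

    # 2. Scan the destination dir for the Athena runtime's setup.sh
    setup_sh = os.path.join(dest_dir, release, "setup.sh")

    # 3. Install the release and production caches only when setup.sh is missing
    if not os.path.exists(setup_sh):
        for cache in caches:
            # The pacman invocation is illustrative; the real TRF drives
            # pacman with the options passed in jobParameters.
            subprocess.check_call(["pacman", "-get", cache], cwd=dest_dir)

    # 4. Run Kit Validation against the installed release (script name assumed)
    subprocess.check_call(["./runKV.sh", release], cwd=dest_dir)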

How to submit jobs

import userinterface.Client as Client
from taskbuffer.JobSpec import JobSpec

jobList = []
for site in ['SLACXRD', 'AGLT2']:               # list of sites
    job = JobSpec()
    job.transformation = '.../installAtlasSW'   # TRF (path elided in the slide)
    job.computingSite = site                    # site
    job.jobParameters = ("-s 12.0.6 -p 12.0.6slc3+gcc "
                         "-c AtlasProduction_12_0_6_3_i686_slc3_gcc323_opt,"
                         "AtlasProduction_12_0_6_4_i686_slc3_gcc323_opt")  # release + package + caches
    ...                                         # further attributes elided
    jobList.append(job)

Client.submitJobs(jobList)                      # submit

Remaining Issues

- The installation pilot needs write permission on $OSG_APP
  - "Normal" pilots are mapped to usatlas1 because the schedulers run with the production role
  - A special scheduler running with the software role, to map pilots to usatlas2? gLExec?
- Integration with the schedconfig DB
  - schedconfig records which releases are available at each site
  - An intelligent client is possible (see the sketch after this list):
    - Get the list of sites where the release is missing
    - Submit a batch of jobs to install the release
    - Update schedconfig when the installation succeeds
- Who is responsible for operations?
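
As an illustration of the intelligent client idea, here is a hedged sketch built on the same Client/JobSpec API as the submission example; the two schedconfig helpers are hypothetical placeholders, since the slide leaves that integration open.

import userinterface.Client as Client
from taskbuffer.JobSpec import JobSpec

def sites_missing_release(release):
    """Hypothetical schedconfig query: sites where the release is absent."""
    raise NotImplementedError  # depends on how schedconfig is exposed

def mark_release_installed(site, release):
    """Hypothetical schedconfig update after a successful installation."""
    raise NotImplementedError

def install_release_everywhere(release, job_parameters):
    # 1. Get the list of sites where the release is missing
    targets = sites_missing_release(release)

    # 2. Submit a batch of installation jobs, one per site
    jobList = []
    for site in targets:
        job = JobSpec()
        job.transformation = '.../installAtlasSW'  # path elided as in the slide
        job.computingSite = site
        job.jobParameters = job_parameters
        jobList.append(job)
    Client.submitJobs(jobList)

    # 3. After the jobs finish (tracked via Panda monitoring), record the
    #    successful installations back into schedconfig:
    # for site in targets:
    #     mark_release_installed(site, release)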

Test at SLAC

- Tried 13.0.25, as it is unused for production
- Required modifications at SLAC
  - Outbound HTTP connections
    - to BU, to download Pacman
    - to CNAF, to download the KV cache
  - Temporary write permission on $OSG_APP/13.0.25 granted to Nurcan's DN
- Submitted a job from BNL (Job=4531377)
- Installation succeeded and KV passed (log)

Conclusions

- Release installation using Panda is ready
  - Successfully tested by installing 13.0.25 at SLAC
- A few issues remain
  - Write permission for the installation pilot
  - Operator responsibility
  - Integration with schedconfig