Work Report
Xianghu Zhao, Nov 11, 2014

Status of User Jobs
- 8000 jobs in total
- 4732 jobs successful
- 1.2 TB of data transferred to IHEPD-USER

Status of User Jobs
[Plots: Running Jobs; Input Sandbox Size]

User Requirements
- Use a custom generator for simulation
- Do simulation + reconstruction + analysis in one job
- Common point: both need custom packages that do not exist in the official BOSS release

Solution
- The user compiles the packages locally before submission
- Add the corresponding .so lib files to the input sandbox in the ganga script
- Prepend the working directory ("pwd") to LD_LIBRARY_PATH before the Boss job runs
- Add any other user files needed by the job to the input sandbox and change their paths to relative ones
(a sketch of the sandbox step follows)
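Below is a minimal sketch of how the ganga script could collect the user's .so files and ship them with the job. Job, File, and inputsandbox are standard Ganga GPI usage; the lib path is the one given on the next slide, and the worker-node wrapper behaviour is an assumption.

    import glob, os

    # Collect the user's compiled libraries from the work area; the soft
    # links are resolved so the real .so files are shipped.
    lib_dir = os.path.join(os.environ['USERWORKAREA'],
                           'InstallArea', 'x86_64-slc5-gcc43-opt', 'lib')
    user_libs = [os.path.realpath(f) for f in glob.glob(lib_dir + '/*.so')]

    j = Job(application=app)  # "app" is the Boss application
    j.inputsandbox += [File(f) for f in user_libs]  # ship the .so files

    # On the worker node, the wrapper is assumed to run, before Boss:
    #   export LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH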

Generator Custom Files
- All user *.so files can be found as soft links under the directory $USERWORKAREA/InstallArea/x86_64-slc5-gcc43-opt/lib/
- These two files from the BesEvtGen package are put in the input sandbox:
  libBesEvtGen.so
  libBesEvtGenLib.so

Old Version Problems
- libcurl.so.3 not found on SL6 nodes
  - Happens during reconstruction on SL6 nodes
  - Boss versions from 664p01 onward include this file
- sqlite memory leak ("by event" jobs fail after a period of time)
  - Solved in Boss versions from 664p01 onward
- MdcTunningSvc can only use mysql (it ignores the settings in the jobOption)
  - Boss versions from 664 onward can use sqlite
- BesEventMixer cannot set a custom random trigger directory
  - Boss versions from 664 onward support a custom directory

Patches with Version
- There are too many different versions, and it is not good to hard-code all the cases in the GangaBoss code
- Use patches that are automatically included in the input sandbox and executed before the Boss job runs:
  /afs/.ihep.ac.cn/bes3/offline/ExternalLib/gangadist/scripts/*.patch
- For each Boss version, configure the patches needed in a configuration file (python dict format):
  /afs/.ihep.ac.cn/bes3/offline/ExternalLib/gangadist/scripts/Boss.conf
(a hypothetical layout is sketched below)
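The report does not show the contents of Boss.conf, so the following is only a hypothetical sketch of a per-version patch mapping in python dict format and its lookup. All keys, patch names, and the helper function are illustrative.

    # Hypothetical Boss.conf contents: map each Boss version to the patch
    # scripts it needs (python dict format, as described above).
    BOSS_PATCHES = {
        '6.6.3':     ['libcurl.patch', 'sqlite.patch'],
        '6.6.4':     ['libcurl.patch'],
        '6.6.4.p01': [],  # later releases already include the fixes
    }

    SCRIPT_DIR = '/afs/.ihep.ac.cn/bes3/offline/ExternalLib/gangadist/scripts'

    def patches_for(boss_version):
        # Return the patch files to add to the input sandbox for this
        # release; unknown versions get no patches.
        return [SCRIPT_DIR + '/' + p
                for p in BOSS_PATCHES.get(boss_version, [])]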

Adding Analysis Support
- Add a new option "anaoptsfile" to the Boss application:
    app = Boss(version=bossVersion, optsfile=optionsFile,
               recoptsfile=recOptionsFile, anaoptsfile=anaOptionsFile)
- New data type "root" for the analysis output file
- Add new files to the input sandbox: anaoptions.pkl, anadata.py, anadata.opts
- Change the splitter to generate the "anadata" files
- Add analysis to the workflow
- Save analysis logs to the output sandbox: anabosslog, anabosserr
(a full submission sketch follows)
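For context, a hedged sketch of a complete submission using the new option. The Boss(...) call is quoted from the slide above; the option file names, backend, and splitter class with its parameters are assumptions about the surrounding ganga script and may not match the real setup.

    bossVersion    = '6.6.4.p01'
    optionsFile    = 'jobOptions_sim.txt'  # simulation options (assumed name)
    recOptionsFile = 'jobOptions_rec.txt'  # reconstruction options (assumed)
    anaOptionsFile = 'jobOptions_ana.txt'  # new: analysis options (assumed)

    app = Boss(version=bossVersion, optsfile=optionsFile,
               recoptsfile=recOptionsFile, anaoptsfile=anaOptionsFile)

    j = Job(application=app, backend=Dirac())  # backend name assumed
    # The splitter (class name and parameters illustrative) is the piece
    # that now also generates the per-subjob "anadata" files.
    j.splitter = BossSplitter(evtnum=50000, evtmaxpjob=500)
    j.submit()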

Analysis Job Custom Files
- File of the user analysis package DCDCAlg: libDCDCAlg.so
- Two more packages used by An Fenfen:
  - DTagTruthMatchSvc: libDTagTruthMatchSvcLib.so, libDTagTruthMatchSvc.so
  - McDecayModeSvc: libMcDecayModeSvcLib.so, libMcDecayModeSvc.so
- Two more files needed by the analysis: deltaE.txt, Mbc.txt
- The user needs to change the paths in the job option file to relative paths

Some Problems
- A user could only submit about 1100 jobs before submission starts to fail
  - The "dirac-server" process occupies too much CPU time (about 75%) and is killed after one hour
- Submission time is longer than before
  - 8.6 MB of files need to be uploaded to the input sandbox for each job
- The input sandbox storage is growing very fast
  - 8.6 MB per job, 67 GB in total now

Future Plan: Upload Big Files to SE
- Upload all user release .so files to the SE before submitting jobs to DIRAC
- Tell the boss script which files to download from the SE
- On the node, the boss script downloads the files before executing the boss jobs
- Upload files to /bes/user/x/xxx/Upload
- Rename each file to its md5 hex digest (e.g. 7086c05f3cead1809dc8d1ac0f5c467c)
- Calculate and check the md5 before uploading, to avoid uploading the same file repeatedly
- Tell the script the real filename and md5 for the download
(a minimal md5 sketch follows)
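A minimal sketch of the planned md5 deduplication step, assuming nothing beyond the slide: hashlib is standard Python, while se_has() and upload_to_se() are placeholders for whatever DIRAC data-management calls would actually be used.

    import hashlib, os

    def md5_hex(path, chunk=1024 * 1024):
        # Return the md5 hex digest of a file, reading in chunks so big
        # .so files do not have to fit in memory.
        h = hashlib.md5()
        with open(path, 'rb') as f:
            for block in iter(lambda: f.read(chunk), b''):
                h.update(block)
        return h.hexdigest()

    def upload_once(local_path, se_dir='/bes/user/x/xxx/Upload'):
        digest = md5_hex(local_path)
        remote = os.path.join(se_dir, digest)  # stored under its md5 name
        if not se_has(remote):                 # skip files already on the SE
            upload_to_se(local_path, remote)   # placeholder for DIRAC call
        # The boss script needs (real filename, md5) to download and rename.
        return os.path.basename(local_path), digest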

DIRAC Server Upgrade
- Separate the mysql database onto a standalone server
  - badger01 <--> besdirac02
  - dirac-code -> mysql server
- Web server: also on dirac-code? A virtual machine? Another server (with CVMFS, badger02)?