CCJ Introduction
Kohei Shoji, RIKEN Nishina Center
RIKEN CCJ – PHENIX Computing Center in Japan
RIKEN CCJ serves as a principal computing site for PHENIX simulation and data analysis, located in room 258 of the Main Research Building.
– Linux CPU Farm: 2.6 GHz x 264 CPUs, 500 GB memory in total, 380 TB disk
– Data storage: 40 TB on HDD, 1400 TB on HPSS
– Data transfer between RCF and CCJ: 200 MB/sec with Grid-FTP
Overview
[System overview diagram: users log in from the Internet through the gateways ccjsun.riken.jp / ccjgw.riken.jp to the interactive nodes linux1/2/3/4/5; LSF batch jobs run on the Linux CPU Farm; NFS serves /ccj/u/username, /ccj/w/r0x/username, AFS, the databases, and the data storage (read-only on the batch nodes); AFS and the databases are mirrored daily from RCF.]
Gateway & interactive node Login to ccjsun.riken.jp / ccjgw.riken.jp Login to interactive node – most users use linux4 / linux5 which have 64bit-CPU with SL5.3 – other interactive nodes have 32bit-CPU with older SL version Common directories served by NFS – /ccj/u/username home directory quota 4GB/5GB daily backup – /ccj/w/r0x/username work directory quota 40GB/50GB NO daily backup Tips – confirm your quota by /usr/bin/rsh ccjnfs20 vxquota -v 4
Interactive node & Linux CPU Farm Setup source /opt/phenix/bin/phenix_setup.csh -a setenv ODBCINI /opt/ccj/etc/odbc.ini.ccj.analysis setenv PGHOST ccjams002.riken.go.jp setenv PGPORT 5432 or 5440 Use “-a” option because phenix_setup.csh remove PATH and LD_LIBRARY_PATH needed for LSF batch job system if “-a” is not specified “setenv”s are for database access at CCJ CCJ provides /afs/rhic.bnl.bnl.gov and database which you are familiar at RCF But they are just daily-mirror of that at RCF – Actually you can commit to CVS but it would be lost next day – The changes at RCF are reflected next day 5
Linux CPU Farm
The LSF batch job system is used at CCJ (cf. Condor at RCF)
Commands (a submission example follows below)
    bsub -q short -o stdout.txt -e stderr.txt -s "script arguments"
– select the queue according to your job length: short 2.5 hours / long 24 hours
– -o / -e write the stdout and stderr logs; they must be under /ccj/u/username, NOT under /ccj/w/r0x/username, because that directory is mounted read-only on the batch nodes
– put the script you want to run, with its arguments, in double quotation marks after -s
    bqueues    check queue status
    bjobs      check job status
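For instance, a minimal submission from the home directory might look like the following (run.csh and its argument are placeholder names):

    # submit a short job and keep its logs under the home directory
    cd /ccj/u/username
    bsub -q short -o stdout.txt -e stderr.txt -s "run.csh input.root"
    bjobs       # check whether the job is pending or running
    bqueues     # check how busy the short and long queues are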
Linux CPU Farm – local disk
If submitted jobs make many or massive accesses to NFS, everything gets slow
– people on the interactive nodes may need a few minutes or more just to get an "ls"
Do not read from or write to /ccj/u or /ccj/w/r0x inside your job
To avoid this, in your job script (see the example on the next slide):
– copy the input files to the local disk of the batch node
– do the computing / run the analysis and write the output to the local disk
– copy the output files from the local disk back to your directory
Use the dedicated command "rcpx" for the file copies, not "cp"
Linux CPU Farm – rcpx
The local disk directory you can use on a batch node is given by the environment variable $CCJ_JOBTMP
/ccj/u and /ccj/w/r0x are served by the NFS server ccjnfs20
Example job script:
    #!/bin/csh
    # copy input files from the NFS server to the local disk
    rcpx ccjnfs20:/ccj/w/r0x/username/run.csh $CCJ_JOBTMP/run.csh
    rcpx ccjnfs20:/ccj/w/r0x/username/input.root $CCJ_JOBTMP/input.root
    set input_name = $CCJ_JOBTMP/input.root
    set output_name = $CCJ_JOBTMP/output.root
    # run the analysis entirely on the local disk
    $CCJ_JOBTMP/run.csh $input_name $output_name
    # copy the output file back to the NFS server
    rcpx $CCJ_JOBTMP/output.root ccjnfs20:/ccj/w/r0x/username/output.root
    exit
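Saved as, say, myjob.csh (a placeholder name), such a script would then be submitted from under /ccj/u/username with, for example, bsub -q long -o stdout.txt -e stderr.txt -s "myjob.csh", choosing the queue to match the job length.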
Linux CPU Farm – rcpx
The rcpx command throttles the number of simultaneous accesses to the NFS server so that the server does not get overloaded
Tips
– do not forget the "ccjnfs20:" prefix, or rcpx fails
– it is better to use wildcards to reduce overhead (see the sketch below)
– the directory from which you submitted the job is the initial working directory of the batch job
– you do not need to take special care of the libraries and the database, even though they are served by NFS
– the local directory you used ($CCJ_JOBTMP) is automatically cleaned up by LSF
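As a sketch of the wildcard tip, and assuming rcpx accepts rcp-style wildcard sources (the file names here are placeholders), one call could replace many per-file copies:

    # copy all matching input files to the local disk in a single rcpx call
    rcpx "ccjnfs20:/ccj/w/r0x/username/input_*.root" $CCJ_JOBTMP/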
Linux CPU Farm – nDST
Several types of nDSTs have been placed on the local disks (see data/localdisk/index.html)
Users can access these nDSTs by using the "-r" option of bsub
Command
    bsub -q short -o stdout.txt -e stderr.txt -r -s "script arguments"
The location of the nDSTs is given by the variable $CCJ_DATADIR
The available files can be obtained with /opt/ccj/bin/filename
– this command lists the available nDST file names
Example of getting the file list in a job script:
    set files = `/opt/ccj/bin/filename | grep CNT`
    foreach file ( $files )
        /bin/ls $CCJ_DATADIR/$file >> filelist.txt
    end
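A script containing such a loop would be submitted with the -r option so that the job can see the nDSTs on the local data disks; for instance (ndst_job.csh is a placeholder name):

    # request access to the local nDST disks with -r
    bsub -q long -o stdout.txt -e stderr.txt -r -s "ndst_job.csh"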
Data storage & File transfer HPSS – tape drive to archive files NFS – If you need a space to put some amount of files, please submit the application to get the space on which files can be accessed with rcpx from batch node – scp & bbftp – transfer files from/to outside – scp to transfer files up to 100MB – bbftp to transfer files up to 50GB Grid-FTP – CCJ achieved transfer speed of 200MB/sec between RCF and CCJ – If you want to transfer new DSTs from RCF, please contact to CCJ guys 11
Summary
CCJ provides an analysis/computing environment similar to RCF
– libraries, CVS, database, etc.
The LSF batch job system is available at CCJ
– users should take care not to hang up the system by reading from or writing to NFS directly
– several types of nDSTs are available to batch jobs
HPSS and the NFS file storage are useful for carrying out analysis and simulation smoothly
Users can use scp/bbftp to transfer files from/to outside
Grid-FTP makes it possible to transfer large amounts of files between CCJ and RCF
Please contact phenix-ccj-admin at ribf.riken.jp