
Proper use of CC at Lyon J. Brunner

Batch farm / BQS

BQS classes:
– only for test jobs!!
– regular
– exceptional resource needs: very long jobs, large memory/scratch; only on the most powerful machines

Try to split or group your scripts so that they fit into the regular classes. If special queues are needed: contact me.

Batch farm / qsub options

Specify precisely which resources are needed: qsub -l resource-list
– t= : normalized CPU time
– platform=LINUX32 : SL4, 32 bit
– M=512MB : virtual memory
– scratch=10000MB : scratch size
– hpss : access to HPSS
– srb : use of SRB
– oracle : access to the Oracle server at Lyon
– u_srb_antares : access to the Antares SRB pool
– u_xrootd : use of the XRootd server to access Root files
– u_sps_antares : use of the Antares semi-permanent ("semiper") disk space
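For illustration, a submission combining several of these resources might look as follows (the script name and resource values are hypothetical; check the qsub man page for the exact separator syntax of the BQS resource list):

    # request 512 MB of virtual memory, 10 GB of scratch, HPSS and Antares SRB access
    qsub -l platform=LINUX32,M=512MB,scratch=10000MB,hpss,u_srb_antares my_production.csh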

Batch farm : HPSS/SRB

It is forbidden to:
– store small files (< 100 MB) on HPSS/SRB
– make multiple accesses to the same file from many jobs

Forbidden means: jobs from users who are detected to perform operations like the ones above are blocked. At most 50 jobs can access SRB/Antares in parallel. Small files and multi-access files (e.g. scattering tables, noise files for TE) should be stored on $GROUP_DIR or on the semiper disk space.

Interactive machines ccali / ccalisl432

It is forbidden to launch large-scale productions and/or long jobs on these machines. They are to be used only for interactive work (graphics, editing) or for test jobs that prepare batch productions. If the observed behavior does not change, daemons will be installed to kill interactive jobs which exceed certain (small) CPU time limits.

How to access/store evt files

SRB : Sput/Sget (/in2p3/mc, /in2p3/data)
HPSS : rfcp ($ANTHPSS/users)

SRB has larger buffer disks than HPSS; if you have a choice, use SRB.

Intermediate storage of output files:
– $TMPBATCH (up to 16 GB)
– Antares semiper space

The same concept applies to output Root files.
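A minimal sketch of the two transfer paths (the file names and the user subdirectory are hypothetical; the path prefixes follow the slide):

    # SRB: fetch an event file, then store an output file
    Sget /in2p3/data/run_12345.evt .
    Sput run_12345.out.evt /in2p3/mc
    # HPSS alternative, via RFIO, into the intermediate batch storage
    rfcp $ANTHPSS/users/$USER/run_12345.evt $TMPBATCH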

How to access (read) Root files

XRootd:
– access via the XRootd server ccxroot
– large buffer disk
– faster than copying from HPSS/SRB to scratch
– replace the file name of the input Root file by: root://ccxroot:1999//hpss/in2p3.fr/group/antares/mc/neutrino/mu/l05_c09_s02/TE_3N_10pe/25920/TE.MCEW.gea.anumu_lowe_109.root
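In practice (a usage sketch, assuming the standard root executable on the interactive machines), the remote file is opened exactly like a local one; only the name changes:

    # open the file through the XRootd server; no copy to scratch needed
    root -l 'root://ccxroot:1999//hpss/in2p3.fr/group/antares/mc/neutrino/mu/l05_c09_s02/TE_3N_10pe/25920/TE.MCEW.gea.anumu_lowe_109.root'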

Problems with XRootd

When using a Root version earlier than 5.20 there are problems:
– memory leak → virtual memory limit
– corrupted files

The standard Root version at CC-Lyon is 5.14. How to use 5.20?
– Add the following lines to your .cshrc before "source $THRONG_DIR/group_cshrc":
setenv ROOTSYS /usr/local/root/root_v /root
source /usr/local/shared/bin/root_env.csh
– Use antares-daq compiled with Root 5.20, e.g. $ANTARES/src/DAQ_tree/Root_5.20_080902
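Put together, the relevant .cshrc fragment would look as follows (a sketch: the version directory after root_v is elided on the original slide and is left open here):

    # select Root 5.20 before the group setup is sourced
    setenv ROOTSYS /usr/local/root/root_v<...>/root
    source /usr/local/shared/bin/root_env.csh
    # the existing group setup must stay after the two lines above
    source $THRONG_DIR/group_cshrc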

Antares semiper disk space (new!)

So far only $GROUP_DIR (AFS) was available to store files which are needed by many jobs, i.e. which must be visible from many machines; its total size is 65 GB. Now there is GPFS space /sps/antares, 2 TB:
– not on AFS, not visible from outside
– mounted on all interactive and batch machines
– direct access (cp) or via RFIO (rfcp), e.g. from HPSS

User area /sps/antares/users:
– create a subdirectory with your login name: mkdir /sps/antares/users/brunner
– use it to store small files and frequently accessed files
– use it for temporary analysis files

No disk quota has been introduced yet: be reasonable!
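A sketch of the intended workflow (the file name is hypothetical; "brunner" stands for your own login name, as on the slide): stage a frequently read file into the semiper area once, then let all batch jobs read it from there instead of from HPSS/SRB:

    # one-time creation of the user area
    mkdir /sps/antares/users/brunner
    # stage a multi-access file (e.g. a TE noise file) out of HPSS once
    rfcp $ANTHPSS/users/brunner/noise_table.root /sps/antares/users/brunner
    # batch jobs then read it directly, since /sps is mounted on all machines
    cp /sps/antares/users/brunner/noise_table.root .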