ALICE Offline Tutorial Using the AliEn Grid Client GSI, 4 th Mar. 2010.

Slides:



Advertisements
Similar presentations
Learning Unix/Linux Bioinformatics Orientation 2008 Eric Bishop.
Advertisements

CS 497C – Introduction to UNIX Lecture 26: - The Process Chin-Chih Chang
Introducing the Command Line CMSC 121 Introduction to UNIX Much of the material in these slides was taken from Dan Hood’s CMSC 121 Lecture Notes.
Linux+ Guide to Linux Certification, Second Edition
UNIX Chapter 00 A “ Quick Start ” into UNIX Operating System Mr. Mohammad Smirat.
1c.1 Assignment 2 Preliminaries Review (Full details in assignment write-up.)‏ © 2011 B. Wilkinson/Clayton Ferner. Fall 2011 Grid computing course. Modification.
Guide To UNIX Using Linux Third Edition
Installing and running COMSOL on a Windows HPCS2008(R2) cluster
1 SEEM3460 Tutorial Unix Introduction. 2 Introduction What is Unix? An operation system (OS), similar to Windows, MacOS X Why learn Unix? Greatest Software.
Using Macs and Unix Nancy Griffeth January 6, 2014 Funding for this workshop was provided by the program “Computational Modeling and Analysis of Complex.
Introduction to UNIX/Linux Exercises Dan Stanzione.
AliEn Tutorial MODEL th May, May 2009 Installation of the AliEn software AliEn and the GRID Authentication File Catalogue.
Partner Logo German Cancio – WP4-install LCFG HOW-TO - n° 1 WP4 hands-on workshop: EDG LCFGng exercises
Unix Primer. Unix Shell The shell is a command programming language that provides an interface to the UNIX operating system. The shell is a “regular”
Lesson 7-Creating and Changing Directories. Overview Using directories to create order. Managing files in directories. Using pathnames to manage files.
Back to content Final Presentation Mr. Phay Sok Thea, class “2B”, group 3, Networking Topic: Mail Client “Outlook Express” *At the end of the presentation.
Linux environment ● Graphical interface – X-window + window manager ● Text interface – terminal + shell.
Let’s Make An Form! Bonney Armstrong GD 444 Westwood College February 9, 2005.
Introduction to Shell Script Programming
Elisabetta Ronchieri - How To Use The UI command line - 10/29/01 - n° 1 How To Use The UI command line Elisabetta Ronchieri by WP1 elisabetta.ronchieri.
Week 7 Working with the BASH Shell. Objectives  Redirect the input and output of a command  Identify and manipulate common shell environment variables.
Agenda User Profile File (.profile) –Keyword Shell Variables Linux (Unix) filters –Purpose –Commands: grep, sort, awk cut, tr, wc, spell.
Unix Basics Chapter 4.
– Introduction to the Shell 10/1/2015 Introduction to the Shell – Session Introduction to the Shell – Session 2 · Permissions · Users.
Alice off-line meeting Alberto Colla Cern, October 3, 2005 AliEn How-To Alice off-line meeting Cern, October 3, 2005 Alberto Colla (Alice off-line Calibration.
An Introduction to Unix Shell Scripting
Introduction to Unix – CS 21 Lecture 9. Lecture Overview Shell description Shell choices History Aliases Topic review.
FTP Server and FTP Commands By Nanda Ganesan, Ph.D. © Nanda Ganesan, All Rights Reserved.
110/10/06 - AliEn AliEn Tutorial Solutions Panos Christakoglou University of Athens - CERN.
Linux+ Guide to Linux Certification, Third Edition
UNIX Commands. Why UNIX Commands Are Noninteractive Command may take input from the output of another command (filters). May be scheduled to run at specific.
Module 6 – Redirections, Pipes and Power Tools.. STDin 0 STDout 1 STDerr 2 Redirections.
Agenda Link of the week Use of Virtual Machine Review week one lab assignment This week’s expected outcomes Review next lab assignments Break Out Problems.
Getting started DIRAC Project. Outline  DIRAC information system  Documentation sources  DIRAC users and groups  Registration with DIRAC  Getting.
Andrei Gheata, Mihaela Gheata, Andreas Morsch ALICE offline week, 5-9 July 2010.
Stephen Burke – Data Management - 3/9/02 Partner Logo Data Management Stephen Burke, PPARC/RAL Jeff Templon, NIKHEF.
Working with AliEn Kilian Schwarz ALICE Group Meeting April
1 CSE306 Operating Systems Projects CVS/SSH tutorial.
Introduction to Programming Using C An Introduction to Operating Systems.
Executable scripts. So far We have made scripts echo hello #for example And called it hello.sh Run it as sh hello.sh This only works from current directory?
Configuring and Troubleshooting Identity and Access Solutions with Windows Server® 2008 Active Directory®
Linux Commands C151 Multi-User Operating Systems.
CS 245 – Part 1 Using Operating Systems and Networks for Programmers Jiang Guo Dept. of Computer Science California State University Los Angeles.
FTP COMMANDS OBJECTIVES. General overview. Introduction to FTP server. Types of FTP users. FTP commands examples. FTP commands in action (example of use).
JAliEn Java AliEn middleware A. Grigoras, C. Grigoras, M. Pedreira P Saiz, S. Schreiner ALICE Offline Week – June 2013.
Lab 8 Overview Apache Web Server. SCRIPTS Linux Tricks.
Lecture 02 File and File system. Topics Describe the layout of a Linux file system Display and set paths Describe the most important files, including.
Linux+ Guide to Linux Certification, Second Edition
Agenda The Bourne Shell – Part I Redirection ( >, >>,
A GANGA tutorial Professor Roger W.L. Jones Lancaster University.
AliEn Tutorial ALICE workshop Sibiu 20 th August, 2008 Pablo Saiz.
Linux Administration Working with the BASH Shell.
AAF tips and tricks Arsen Hayrapetyan Yerevan Physics Institute, Armenia.
INTRODUCTION TO SHELL SCRIPTING By Byamukama Frank
ANALYSIS TRAIN ON THE GRID Mihaela Gheata. AOD production train ◦ AOD production will be organized in a ‘train’ of tasks ◦ To maximize efficiency of full.
GRID COMPUTING.
AliEn Tutorial Panos Christakoglou University of Athens - CERN
Development Environment Basics
Basic aliensh S. Bagnasco, INFN Torino CNAF Nov 27-28, 2007.
Prepared by: Eng. Maryam Adel Abdel-Hady
CONTENT MANAGEMENT SYSTEM CSIR-NISCAIR, New Delhi
Running a job on the grid is easier than you think!
Running a job on the grid is easier than you think!
Chapter 2: System Structures
The Linux Operating System
Shell Script Assignment 1.
CSE 303 Concepts and Tools for Software Development
Intro to UNIX System and Homework 1
LING 408/508: Computational Techniques for Linguists
Introduction Paul Flynn
Presentation transcript:

ALICE Offline Tutorial Using the AliEn Grid Client GSI, 4 th Mar. 2010

Contents I Prerequisites Installation of the AliEn Grid client Connection and Login/Authentication General description of the shell Functionality and orientation Basic commands View, edit and copy files within the catalogue 2 ALICE Offline Tutorial - Using the AliEn Grid Client

Contents II Grid Jobs: Job submission, status and control Overview of the JDL files (Job Description Language) Working with the file catalogue Copying files from/to the catalogue. Creating collections of files Working with the AliEn plug-in Configure your own plug-in Run the analysis in grid via the plug-in 3 ALICE Offline Tutorial - Using the AliEn Grid Client

Prerequisites Did you follow ALL the steps for the user registration ? Do you have valid usercert.pem and userkey.pem files ? If not, you will only be able to watch this tutorial... The registration was supposed to be done at: 4 ALICE Offline Tutorial - Using the AliEn Grid Client

Installing the AliEn client Get the client installer (alien-installer) wget Make the file executable chmod +x alien-installer Just start the installer and wait: 5 ALICE Offline Tutorial - Using the AliEn Grid Client

Installing AliEn client (cont.) … or if you want to specify an alternate installation location 6 ALICE Offline Tutorial - Using the AliEn Grid Client

Installing AliEn client (cont.) If you didn’t have the folder ~/bin, the installer created it for you But you have ensure it is set in your PATH Shell Environment Variable Set it in the appropriate configuration file for your shell ! Or you’ll have to set it each time you open a new shell to login: export PATH=$PATH:~/bin 7 ALICE Offline Tutorial - Using the AliEn Grid Client

Installation – Try it out Copy grid certificates to the computer in front of you mkdir.globus Verify location of your certificate+key and their permissions ~/.globus userkey.pem usercert.pem Download the alien installer: Make the file executable Run the installer (specify alternative installation location) Check the installation went fine and you see the directories Do export PATH=$PATH:~/bin 8 ALICE Offline Tutorial - Using the AliEn Grid Client

Authenticating at the AliEn Shell (I) To access the AliEn Shell you have to authenticate every 24h by creating your access token with alien-token-init 9 ALICE Offline Tutorial - Using the AliEn Grid Client

Authenticating at the AliEn Shell (II) 10 ALICE Offline Tutorial - Using the AliEn Grid Client

Authentication – Problems Permissions on ~/.globus/userkey.pem are not private to your user chmod 400 userkey.pem Your certificate authority is exotic and not known to the server Your certificate has expired You have not given the AliEn user name as an argument to the token init command and your local user name is not identical to the AliEn user name Clock skew - your local computer time is out of the validity time of your certificate 11 ALICE Offline Tutorial - Using the AliEn Grid Client

Authentication – Try it out Do alien-token-init If asked about compiling the gapi and xrootd libs say ”no” Later, to install on your own machine and do analysis you’ll have to say “yes”. At the end you should have a valid token. 12 ALICE Offline Tutorial - Using the AliEn Grid Client

AliEn Shell Doing aliensh... Standard bash shell with grid commands Main shell features are available Command / file / path completion 13 ALICE Offline Tutorial - Using the AliEn Grid Client

AliEn Shell - Basic commands Standard Unix Shell commands work as usual: ls, cd, mkdir/rmdir, cat, more, pwd, whoami... There’s a help command to list all known commands Get a complete command list by typing Commands have ‘-h’ flag to print out a short help message 14 ALICE Offline Tutorial - Using the AliEn Grid Client

Shell – editing files All old versions of the edited file are saved in a hidden folder To delete all the old versions (NOT the file itself with your last changes ) You can choose your editor in the file ~/.alienshrc : export EDITOR='your-choice': emacs | emacs -nw | xemacs | xemacs -nw | pico | vi | vim ( “vi” is default ) 15 ALICE Offline Tutorial - Using the AliEn Grid Client

Shell – Copying files from/to the Catl. The cp command works just like the Unix Shell command but is operating on the files in the Grid Catalogue. To specify files on your local disk, use “file:” as a location prefix You may need to define the environment variable alien_CLOSE_SE pointing to a SE that is accessible and close to your location – export alien_CLOSE_SE=“ALICE::GSI::SE” You can always specify the SE location in the copy commands – 16 ALICE Offline Tutorial - Using the AliEn Grid Client

Shell – “whereis” command Where is the file tutorial.test actually stored ? 17 ALICE Offline Tutorial - Using the AliEn Grid Client

Shell – Try it out Access the alien shell Check your user name by typing whoami List the contents of your home directory Do the following in your AliEn space: Create the directories ~/bin, ~/tutorial/ and ~/tutorial/output cp /alice/cern.ch/user/s/sschrein/tutorial/tutorial_textfile ~/tutorial cp /alice/cern.ch/user/s/sschrein/tutorial/grid_tutorial.pdf ~/tutorial Now copy the grid_tutorial.pdf to your local machine Get the information of the file (whereis) tutorial_textfile Edit the file and append a comment of yourself Copy it to your local machines home directory and check it’s there and you can open it 18 ALICE Offline Tutorial - Using the AliEn Grid Client

Grid Jobs – JDL Files I Executable: Compulsory field where we give the name of the executable (must be stored in /bin or $V0/bin or ~/bin) Executable = "alienroot"; Arguments: Will be passed to the executable Arguments = “ –q -b"; Packages: Type packages in the shell to see what packages are installed Packages = { "APICONFIG::V2.2", "ROOT::v5-13" }; InputFile: The files that will be transported to the node where the job will run InputFile = { "LF:/alice/cern.ch/user/a/alip/macros/bAnalysis.C" }; Validationcommand: Specifies the script to be used as a validation script Validationcommand = "/alice/cern.ch/user/a/alienmas/validation.sh“; 19 ALICE Offline Tutorial - Using the AliEn Grid Client

Grid Jobs – JDL Files II InputData: It will require that the job will be executed in a site close to the files specified here InputData = { “LF:/alice/cern.ch/data/AliESDs.root,nodownload" }; InputDataCollection: The filename of the collection of the input data InputDataCollection = "LF:/alice/cern.ch/data/002.xml,nodownload” ; InputDataList: The filename in which the job will get the input data collection InputDataList = "collection.xml" ; InputDataListFormat: The format of the InputData list InputDataListFormat = "xml-single" ; Receive a mail when the job finishes = ; 20 ALICE Offline Tutorial - Using the AliEn Grid Client

TTL: The maximum run time of your job in seconds TTL = 7200 ; Split: Split the jobs in several sub jobs split = "se" ; MasterResubmitThreshold: Resubmit sub jobs, if less than are successful MasterResubmitThreshold= “99%" ;# or give a absolute Job number SplitMaxInputFileNumber: Max input file count of each sub job SplitMaxInputFileNumber= “100” ; OutputDir: Where the output files+archives will be stored OutputDir = "/alice/cern.ch/user/a/aliprod/analysis/output101" ; Grid Jobs – JDL Files III 21 ALICE Offline Tutorial - Using the AliEn Grid Client

Grid Jobs – JDL Files IV OutputFile: The files that will be registered in the catalogue once the job finishes OutputFile = { "myFilename" };# default 2 copies (disk=2) OutputFile = { };# give me 3 copies OutputArchive: What files will be archived in a zip file OutputArchive = { "myArchivename:*.root" };# analogue above We have a Storage Element discovery and failover mechanism, storing your output files by default always in the two topmost locations. To get more (up to 9 copies), you can specify the count. → DON'T USE THE EXPLICIT FORMAT ( you may find it old JDLs ): as e.g. OutputFile = 22 ALICE Offline Tutorial - Using the AliEn Grid Client

Grid Jobs – JDL File Example Packages = { }; Executable = "/alice/cern.ch/user/a/alienmas/bin/PPQMixexe.sh"; InputFile = { "LF:/alice/cern.ch/user/a/alienmas/PPQMix/runAnalysis.C", "LF:/alice/cern.ch/user/a/alienmas/PPQMix/PPQMixexe.root", "LF:/alice/cern.ch/user/a/alienmas/PPQMix/ConfigureCuts.C", "LF:/alice/cern.ch/user/a/alienmas/PPQMix/PPQMixTask.h", "LF:/alice/cern.ch/user/a/alienmas/PPQMix/PPQMixTask.cxx“ }; InputDataListFormat = "xml-single"; InputDataList = "wn.xml"; InputDataCollection = { "LF:/alice/cern.ch/user/a/alienmas/PPQMix/ ,nodownload" }; MasterResubmitThreshold = "99%"; Split = "se"; SplitMaxInputFileNumber = "100"; OutputArchive = { "log_archive.zip:stdout,stderr" }; OutputFile = { "output1.root" }; OutputDir = "/alice/cern.ch/user/a/alienmas/PPQMix/output/003"; TTL = 30000; Validationcommand = "/alice/cern.ch/user/a/alienmas/PPQMix/PPQMixexe_validation.sh“; Jobtag = { "comment:My analysis Job" }; 23 ALICE Offline Tutorial - Using the AliEn Grid Client

Job Validation You're supposed to have an error validation script for your jobs! 24 ALICE Offline Tutorial - Using the AliEn Grid Client #!/bin/bash #... Some missing content here echo "*******************************************************" >> stdout echo "* Time: $validatetime " >> stdout segFault=`grep -Ei "Segmentation fault" stderr` if [ "$segFault" != "" ] ; then error=1 echo "* ########## Job not validated - Segment. fault ###" >> stdout echo "$segFault" >> stdout echo "Error = $error" >> stdout fi if ! [ -f *.file ] ; then error=1 echo "Output file(s) not found. Job FAILED !" >> stdout echo "Output file(s) not found. Job FAILED !" >> stderr fi if [ $error = 0 ] ; then echo "* Job Validated *" >> stdout fi cd - exit $error

Submitting Jobs In order to submit a job, call submit together with the JDL file you’ve created Your Job will be send to the AliEn Task Queue Thereafter, it will be picked up by an AliEn Site somewhere in the world 25 ALICE Offline Tutorial - Using the AliEn Grid Client

All details: Job Status / Lifecycle (simplified) INSERTED - I - WAITING - W - ASSIGNED - A - STARTED - ST - RUNNING - R - SAVING - SV - SAVED - S - DONE - D - VALIDATION FAILED - EV - ERROR_SV - ESV - ERROR_V - EV – YES/NO Error during validation process. Validation done, but job didn't comply. The output files could not be stored. 26 ALICE Offline Tutorial - Using the AliEn Grid Client

Job Status I To check about the Grid jobs you got two commands, both have a lot of additional parameters. ps will give you a list of your jobs: top is more verbose than ps and will give you by default a list of ALL jobs in the queue. Attention, this can be a long list, better use parameters: 27 ALICE Offline Tutorial - Using the AliEn Grid Client

Job Status II – A job's JDL ps –jdl displays the job’s JDL during or after the job’s runtime. Be aware, during/after runtime the JDL contains more information. 28 ALICE Offline Tutorial - Using the AliEn Grid Client

Job Status III – A job's tracelog ps –trace all prints out the complete job trace log during or after the job’s runtime. 29 ALICE Offline Tutorial - Using the AliEn Grid Client

Job Status IV – Spy on Job Output With spy you can check the output of a job while it is still running BUT: Never spy on large files, e.g. never spy on a *.root file 30 ALICE Offline Tutorial - Using the AliEn Grid Client

Job Status V – Master Jobs masterjob will print a status of the sub jobs of the specified master job … this is only interesting/working, if you have a job that splits 31 ALICE Offline Tutorial - Using the AliEn Grid Client

Job Control – Kill a Job You can also kill a job while it is running with: kill 32 ALICE Offline Tutorial - Using the AliEn Grid Client

Submitting/Running Jobs - Problems If everything is ok with your JDL then your job is submitted and a is assigned to it. You get a submission error message, e.g. if Your JDL contains errors, e.g. the syntax is not correct A file listed in the JDL is missing A package defined in the JDL is not listed in the packman 33 ALICE Offline Tutorial - Using the AliEn Grid Client

Creating File Collections With find you can create XML collections of files: find –x > Don’t forget the output file is local on you machine, you need to upload it. 34 ALICE Offline Tutorial - Using the AliEn Grid Client

Files and Grid Jobs – Try it out Do the following... cp /alice/cern.ch/user/s/sschrein/bin/tut_testjob.sh ~/bin cp /alice/cern.ch/user/s/sschrein/tutorial/tutorial.jdl ~/tutorial cp /alice/cern.ch/user/s/sschrein/tutorial/tut_validation.sh ~/tutorial Fix the file locations inside tutorial.jdl ( each “s/sschrein” to your user’s folder) Now submit the job submit ~/tutorial/tutorial.jdl You should get an error, the JDL file has syntax problems -> find the errors and correct them ! Submit the job again Follow the job's stages 35 ALICE Offline Tutorial - Using the AliEn Grid Client

Files and Grid Jobs – Try it out Once the job has finished with Go to cd ~/tutorial/output Check what the job did, where it was running, the files and their content. That’s supposed to be it. Mission accomplished! 36 ALICE Offline Tutorial - Using the AliEn Grid Client

A helper for AliEn analysis Works as a plugin for the analysis manager (as event handlers) – One has to create and configure a AliAnalysisAlien object – See: Creates dataset, JDL, analysis macro, execution+validation scripts Submits your job and merges the results

MyAnalysis.jdl submit File catalog Analysis Manager task1 task2 task3 taskN Outputs GRID analysis via plugin MyAnalysis.C CLIENT A.Gheata, CHEP'09 38 A coll abo rati ve anal ysis fra me wor k in use for ALI CE exp eri me nt AliEn grid plugin SetGridDataDir() AddRunNumber() SetAditionalLibs() SetOutputFile() MyAnalysis.root AnalysisPlayer.C Dataset.xml WN ALIEN UI SE WN SE WN SE AM Outputs AM Outputs AM Outputs AM->StartAnalysis(“grid”) Analysis Manager TAlien AM->StartAnalysis(“local”) Terminate()

Important plug-in settings plugin->SetRunMode(const char *mode) – “full”: generate files, copy in grid, submit, merge – “offline”: generate files, user can change them – “submit”: copy files in grid, submit, merge – “terminate”: merge available results – “test”: generate files + a small dataset, run locally as a remote job plugin->SetNtestFiles(Int_t nfiles) – default 1 plugin->SetROOTVersion(const char *rootver) plugin->SetAliRootVersion(const char *alirootver) – Change whenever needed – See command: aliensh[] packages

Describing the input data plugin->SetGridDataDir(const char *datadir) – Put here the alien path before run numbers – See pcalimonitor.cern.ch for relevant data paths plugin->SetDataPattern(const char *pattern) – Use uniquely identifying patterns i.e. */pass3/*/AliESDs.root – Plugin supports making datasets on ESD, ESD tags or AOD plugin->SetRunRange(Int_t min, Int_t max) – Sets the run range to be analyzed – Enumeration of run numbers allowed – For existing data collections, use AddDataFile() Plugin->SetRunPrefix(“000”) – To be used for real data

Describing the output plugin->SetGridOutputDir(const char *dir) – Can be absolute AliEn FC path (/alien/cern.ch/…) or relative to work directory (no slashes) plugin->SetOutputFiles(“file1 file2 …”); – Allows a selection of files among the analysis outputs plugin->SetDefaultOutputs() – Enables all outputs of the tasks connected to the analysis manager plugin-SetOutputArchive(“log_archive.zip:stderr,stdout – Will save the standard output/error in a zip and all root files in another zip replicated in 2 storage elements – Note: If archiving the output you may want to omit declaring the output files

Other settings Using par files – plugin->EnablePackage(“package.par”) Using other external libraries available in AliEn – plugin->AddExternalPackage("fastjet::v2.4.0") Compiling single source files – plugin->SetAnalysisSource(“mySource.cxx”) – But files have to be uploaded to AliEn fron current directory – plugin->SetAdditionalLibs(“libJETAN.so mySource.cxx mySource.h”) Extra libraries to be loaded (besides AF ones) have to be enumerated in the same method.

Optional settings Number of files per job – plugin->SetSplitMaxInputFileNumber(Int_t n); Number of runs per master job – plugin->SetNrunsPerMaster(Int_t n) Number of files to merge in a chunk – plugin->SetMaxMergeFiles(Int_t n) Resubmit threshold – plugin->SetMasterResubmitThreshold(Int_t percentage) Process a single run per job and output to a single directory – plugin->SetOutputSingleFolder(const char *folder) ALICE Offline Tutorial - Using the AliEn Grid Client 43

Configuring and running the AliEn plugin Open CreateAlienHandler.C Change working/output directories Modify number of files/worker Make sure the run mode is set to “full” Run macro runGrid.C Inspect the job status Modify the run mode to “terminate” once job finished Run again runGrid.C

References I AliEn Website with further documentation: ALICE Analysis User Guide: UserGuide-0.0m.pdf 45 ALICE Offline Tutorial - Using the AliEn Grid Client