Chimera Workshop September 4 th, 2002. 2 Outline l Install Chimera l Run Chimera –Hello world –Convert simple shell pipeline –Some diamond, etc. l Get.

Slides:



Advertisements
Similar presentations
Shell Script Assignment 1.
Advertisements

Virtual Data and the Chimera System* Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer Science.
Utilizing the GDB debugger to analyze programs Background and application.
© Janice Regan, CMPT 102, Sept CMPT 102 Introduction to Scientific Computer Programming The software development method algorithms.
CS Lecture 03 Outline Sed and awk from previous lecture Writing simple bash script Assignment 1 discussion 1CS 311 Operating SystemsLecture 03.
Linux+ Guide to Linux Certification, Second Edition
The Unix Shell. Operating System shell The shell is a command interpreter It forms the interface between a user and the operating system When you log.
Guide To UNIX Using Linux Third Edition
Guide To UNIX Using Linux Third Edition
Introduction to Unix (CA263) Introduction to Shell Script Programming By Tariq Ibn Aziz.
Lecture 02CS311 – Operating Systems 1 1 CS311 – Lecture 02 Outline UNIX/Linux features – Redirection – pipes – Terminating a command – Running program.
Submit Host Setup (user) Tc.data file Pool.config file Properties file Vdl-gen file Input file Exitcode checking script.
Introduction to Linux and Shell Scripting Jacob Chan.
Shell Script Examples.
Second edition Your UNIX: The Ultimate Guide Das © 2006 The McGraw-Hill Companies, Inc. All rights reserved. UNIX – Shell Programming The activities of.
Shell Programming, or Scripting Shirley Moore CPS 5401 Fall August 29,
Introduction to UNIX/Linux Exercises Dan Stanzione.
Advanced File Processing
Introduction to Shell Script Programming
What is Sure BDCs? BDC stands for Batch Data Communication and is also known as Batch Input. It is a technique for mass input of data into SAP by simulating.
1 Operating Systems Lecture 3 Shell Scripts. 2 Brief review of unix1.txt n Glob Construct (metacharacters) and other special characters F ?, *, [] F Ex.
Week 7 Working with the BASH Shell. Objectives  Redirect the input and output of a command  Identify and manipulate common shell environment variables.
Agenda User Profile File (.profile) –Keyword Shell Variables Linux (Unix) filters –Purpose –Commands: grep, sort, awk cut, tr, wc, spell.
Guide To UNIX Using Linux Fourth Edition
LIN 6932 Unix Lecture 6 Hana Filip. LIN 6932 HW6 - Part II solutions posted on my website see syllabus.
Unix Talk #2 (sed). 2 You have learned…  Regular expressions, grep, & egrep  grep & egrep are tools used to search for text in a file  AWK -- powerful.
An Introduction to Unix Shell Scripting
Linux Shell Programming Tutorial 3 ENGR 3950U / CSCI 3020U Operating Systems Instructor: Dr. Kamran Sartipi.
CST8177 bash Scripting Chapters 13 and 14 in Quigley's "UNIX Shells by Example"
Marianne BargiottiBK Workshop – CERN - 6/12/ Bookkeeping Meta Data catalogue: present status Marianne Bargiotti CERN.
Writing C-shell scripts #!/bin/csh # Author: Ken Berman # Date: # Purpose: display command and parameters echo $0 echo $argv[*]
Shell Script Programming. 2 Using UNIX Shell Scripts Unlike high-level language programs, shell scripts do not have to be converted into machine language.
Advanced File Processing. 2 Objectives Use the pipe operator to redirect the output of one command to another command Use the grep command to search for.
UNIX Shell Script (1) Dr. Tran, Van Hoai Faculty of Computer Science and Engineering HCMC Uni. of Technology
Scripting Languages Course 2 Diana Trandab ă ț Master in Computational Linguistics - 1 st year
Chapter Five Advanced File Processing Guide To UNIX Using Linux Fourth Edition Chapter 5 Unix (34 slides)1 CTEC 110.
Chapter Five Advanced File Processing. 2 Objectives Use the pipe operator to redirect the output of one command to another command Use the grep command.
Linux+ Guide to Linux Certification Chapter Eight Working with the BASH Shell.
Module 6 – Redirections, Pipes and Power Tools.. STDin 0 STDout 1 STDerr 2 Redirections.
Shell Programming. Creating Shell Scripts: Some Basic Principles A script name is arbitrary. Choose names that make it easy to quickly identify file function.
Agenda Link of the week Use of Virtual Machine Review week one lab assignment This week’s expected outcomes Review next lab assignments Break Out Problems.
Stuart Wakefield Imperial College London Evolution of BOSS, a tool for job submission and tracking W. Bacchi, G. Codispoti, C. Grandi, INFN Bologna D.
1 Operating Systems Lecture 2 UNIX and Shell Scripts.
BIF713 Additional Utilities. Linux Utilities  You have learned many Linux commands. Here are some more that you can use:  Data Manipulation (Reg Exps)
GriPhyN Virtual Data System Grid Execution of Virtual Data Workflows Mike Wilde Argonne National Laboratory Mathematics and Computer Science Division.
WDO-It! 102 Workshop: Using an abstraction of a process to capture provenance UTEP’s Trust Laboratory NDR HP MP.
Chapter Five Advanced File Processing. 2 Lesson A Selecting, Manipulating, and Formatting Information.
David Adams ATLAS DIAL/ADA JDL and catalogs David Adams BNL December 4, 2003 ATLAS software workshop Production session CERN.
40 Years and Still Rocking the Terminal!
Chapter Six Introduction to Shell Script Programming.
Virtual Data Management for CMS Simulation Production A GriPhyN Prototype.
Linux+ Guide to Linux Certification, Second Edition
Lesson 6-Using Utilities to Accomplish Complex Tasks.
Aggregator Stage : Definition : Aggregator classifies data rows from a single input link into groups and calculates totals or other aggregate functions.
1 UNIX Operating Systems II Part 2: Shell Scripting Instructor: Stan Isaacs.
Linux Administration Working with the BASH Shell.
INTRODUCTION TO SHELL SCRIPTING By Byamukama Frank
GRID COMPUTING.
SUSE Linux Enterprise Desktop Administration
Pegasus WMS Extends DAGMan to the grid world
Implementation of a simple shell, xssh (Section 1 version)
Introduction Python is an interpreted, object-oriented and high-level programming language, which is different from a compiled one like C/C++/Java. Its.
Shell Script Assignment 1.
Tutorial of Unix Command & shell scriptS 5027
What is Bash Shell Scripting?
Guide To UNIX Using Linux Third Edition
Unix Talk #2 (sed).
Electronics II Physics 3620 / 6620
CST8177 Scripting 2: What?.
Frieda meets Pegasus-WMS
Presentation transcript:

Chimera Workshop September 4 th, 2002

2 Outline l Install Chimera l Run Chimera –Hello world –Convert simple shell pipeline –Some diamond, etc. l Get (some) user apps to run with shplan l Introduce The Grid –As time permits

3 Prepare Chimera Installation l Log onto your Linux system. l Ensure Java version 1.3.* –Type "$JAVA_HOME/bin/java –version" –If necessary, download from –Use Java 1.4.* at your own risk l Set JAVA_HOME environment variable –Points to Java installation directory –Example: setenv JAVA_HOME /usr/lib/java

4 Install Chimera l Download the latest tar-ball –The binary release is smaller –The source release allows patching –Let's try the source release for now… l Unpack –gtar xvzf vds-source-YYYYMMDD.VV.tar.gz l cd vds-1.0b1 –Current version is beta1

5 Starting Chimera l Source necessary shell script (all releases) –Some environments require explicit setting of VDS_HOME=`/bin/pwd` before sourcing –Bourne:. set-user-env.sh –C-Shell: source set-user-env.csh –Ignore warnings (not errors) about VDS_HOME l Run test-version program to check –Script is a small sanity test –If it fails, Chimera will not run!

6 Sample test-version Script Output # checking for JAVA_HOME # checking for CLASSPATH # checking for VDS_HOME # script JAVA_HOME=/usr/lib/java # script CLASSPATH=…/lib/java-getopt jar:…/lib/xercesImpl.jar:…/lib/chimera.jar:…/lib/antlrall.jar:…/lib/xmlParserAPIs.jar # starting java # Using recommended version of Java, OK # looking for Xerces # found xercesImpl\.jar, OK for now # looking for antlr # found antlr(all)?.jar, OK for now # looking for GNU getopt # found java-getopt-.\..\..\.jar, OK for now # looking for myself # in developer mode, ignore not finding chimera.jar # found chimera\.jar, OK for now

7 Working with Chimera files VDLt vdlt2vdlx VDLx vdlx2vdlt ins/upd VDLx gendax DAX Only the abstract planning process is shown

8 The Virtual Data Languages l VDLt is textual l Concise l For human consumption l Usually for (TR) transformation l Process by converting to VDLx l VDLx uses XML l Uses XML-Schema l For generation from scripts l Usually for (DV) derivations l Storage representation of current VDDB

9 Virtual Data Language l A "function call" paradigm –Transformations are like function definitions. –Derivations are like function calls. l Many calls to one definition –Many (zero to N) calls of the same transformation.

10 Hello World l Exercise 1: –Wrap echo 'Hello world!' into VDL l Start with the transformation TR hw( output file ) {argument = "Hello world!"; argument stdout = ${file}; } l Then "call" the transformation DV d1->hw( ); l Save into a file hw.vdl

11 What to do with it? l All VDLt must be converted into VDLx before it becomes "usable" to the VDS. vdlt2vdlx hw.vdl hw.xml l All VDLx must be stored into your VDDB, before the system can work with it insertvdc –d my.db hw.xml l or updatevdc –d my.db hw.xml

12 From VDC to DAX l Deriving the provenance of a logical file or derivation is the abstract planning process. l All files and transformations are logical! –The complete provenance will be unrolled –No external catalogs (TC,RC) are queried l Ask the catalog for the produced file gendax –d my.db –l hw –o hw1.dax –f out.txt l Or ask for the derivation (to be fixed) gendax –d my.db –l hw –o hw2.dax –i hello

13 The DAX file l The result of the Chimera abstract planner. l Richer than Condor DAGMan format. l Expressed in terms of logical entities. l Contains complete (build) lineage. l Consists of three major parts 1.All logical files necessary for the DAG. 2.All jobs necessary to produce all files. 3.The dependencies between the jobs.

14 The Transformation Catalog l Translates logical transformation into an application specific for a certain pool l Similar in all versions of Chimera l Simple, text based file –3+ columns –Blank lines and comments are ignored l Standard location –$VDS_HOME/var/tc.data –adjustable with property vds.db.tc

15 Transformation Catalog Columns l 1 st column is the pool handle –Shell planner only uses the "local" handle –Other handles for concrete planner l 2 nd column is the logical transformation –Format: _ _ _ –Name-only names are just l 3 rd column is the path to the executable l 4 th column for environment variables –Use "null" if unused –Format: key=value;key=value

16 The Shplan Replica Catalog l A replica catalog translates a logical filename into a set of physical filenames l This RC is for the shell planner only l Simple textual file –Three columns –Blank lines and comments are ignored l Standard location –$VDS_HOME/var/rc.data –Adjustable with property vds.db.rc

17 Shell planner RC Columns l 1 st column is the pool handle –Shell planner only uses the "local" handle –Other handles for concrete planner l 2 nd column is the logical filename –A LFN may contain slash etc. l 3 rd column is the path to the file l Are multiples allowed? –No, only the last (first?) match is taken

18 TC and RC Short-cuts l Application locations can come from VDL –hints.pfnHint takes precedence over TC –Shell planner may run w/o any TC l Files need not be registered with RC –Existence checks are done in file system –RC updates can be optional –Shell planner may run w/o any RC  After all, we run locally with the shell planner!

19 Preparing to Run The Shell Planner l Make sure that your TC contains a translation for logical transformation "hw" localhw/bin/echonull l Make sure that there is a RC without weird content: cp /dev/null $VDS_HOME/var/rc.data

20 Running The Shell Planner l Run the shell planner shplanner –o hw hw1.dax l Check directory "hw" –Master script:.sh –Job scripts: _.sh –Helper files: _.lst

21 Running The Shell-Plan l Shell planner generates a master plan –This is a tool-kit  no auto-run feature l Run the master plan ( cd hw &&./hw.sh ) l Can check log file for status etc. –Standard name:.log l Master script will exit with 0 on success –Exit 1, if any sub-script fails.

22 Convert A Unix Pipeline l Example 2: Full name from a username grep ^user /etc/passwd | awk –F: '{ print $5 }' 1.Create abstract transformations 2.Create concrete derivations to call them 3.Convert into VDLx 4.Add to VDDB 5.Run abstract planner 6.Run shell planner 7.Run master script

23 Abstract Transformations l Grep for an arbitrary user name TR grep( none name, output of ) { argument = "^"${name}" /etc/passwd"; argument stdout = ${of}; } l Extract a full name from a line TR awk( input line, output full ) { argument stdin = ${line}; argument stdout = ${full}; argument = "-F: '{ print $5 }' "; }

24 Concrete Derivations l Run with a concrete username… DV d2->grep( name="voeckler", ); l … and post-process the results DV d3->awk( ); l The two derivations are linked by a LFN –"grepout.txt" is produced by "d2" –"grepout.txt" is consumed by "d3"

25 Convert, Insert and Plan l Convert into VDLx vdlt2vdlx ex2.vdl ex2.xml l Insert into your VDDB insertvdc –d my.db ex2.xml l Run abstract planner gendax –d my.db –l ex2 –o ex2.dax –f awkout.txt

26 Run Shell Planner l Prepare transformation catalog localgrep/usr/bin/egrepnull localawk/opt/gnu/bin/gawknull l Run shell planner shplanner –o ex2 ex2.dax l Run master plan ( cd ex2 &&./ex2.sh )

27 Kanonical Executable for Grids l Simple application –Copies all input with indentation to all output files –Adds additional data about itself l Allows tracking –What ran where, when, which architecture, and how long l To be used as stand-in for real applications –Allows to check the DAG stages

28 The Diamond DAG l Complex structure –Fan-in –Fan-out –"left" and "right" can run in parallel l Uses input file –Register with RC l Complex file dependencies

29 The Diamond DAG II l The "black diamond" takes one input file echo –e "local\tf.a\t${HOME}/f.a" » $VDS_HOME/var/rc.data date > ${HOME}/f.a l It uses three different transformations echo –e "local\tpreprocess\t$VDS_HOME/bin/keg\tnull" » $VDS_HOME/var/tc.data echo –e "local\tfindrange\t$VDS_HOME/bin/keg\tnull" » $VDS_HOME/var/tc.data echo –e "local\tanalyze\t$VDS_HOME/bin/keg\tnull" » $VDS_HOME/var/tc.data

30 A Big Example l Overall tree structure +70 "regular" diamond DAGs on top +10 collecting nodes +1 final collector =291 jobs to run All files are generated, no RC preloading Add "multi" TR to last example echo –e "local\tmulti\t$VDS_HOME/bin/keg\tnull" » $VDS_HOME/var/tc.data