Enabling Applications on BG/EGEE Grid
Riga, 3rd July 2007
by Tomasz Szepieniec, IFJ PAN & CYFRONET



T.Szepieniec AT cyfronet.pl; 2–6 July 2007, 2nd BG Summer School, Riga

Application Support Activity
■ Goals: show application developers their way to the Grid and support them in adapting their applications to it.
■ Supported fields:
  • Identification of grid techniques that should be used
  • Grid-enabling procedures
  • Deployment procedures
  • Possibility of integration with user-friendly interfaces
  • Possibility of using performance tools
  • Production management
[Figure: Application Developers send a request for support to the Application Expert Group; the support leads to a grid-enabled, user-friendly and efficient (Baltic)Grid application.]
■ DISCLAIMER: We are NOT for:
  • Organizing user support such as a help desk, call center, etc. User support is in SA1.
  • Developing grid-enabled extensions to applications. All alterations to applications should be done by the application developers.

Tools for Application Developers
■ Migrating Desktop
  • User-friendly, graphical interface to the Grid
  • Applications can develop plug-ins to facilitate:
    ► Job submission
    ► Output analysis
  • Developed within CrossGrid and maintained by PSNC
■ OCM-G and G-PM
  • OCM-G is a grid-enabled application monitoring system that enables on-line monitoring and steering of distributed applications
    ► Special support for performance analysis of MPI applications
  • G-PM is a tool for performance analysis
  • Together they make it possible to study performance bottlenecks in grid applications
  • Developed within CrossGrid and maintained by IFJ PAN in cooperation with CYFRONET

The Idea of the Gridification Workshop
■ An experiment!
  • You come to the School with your own applications...
  • ...and at the end your application should be grid-enabled
■ Means:
  • Lectures to give you knowledge and some ideas (>3h)
  • Hands-on parts to work in pairs and really deploy your ideas on the BG Grid (6h)
  • Tutors to solve problems, show next steps, discuss ideas with, etc.

For Those Who Did Not Bring an Application...
■ We have something special!
■ BLAST: an application for searching patterns against the human genomic code (about 15 GB)
■ Try to make searching the human genome on-line with the Grid
■ See:

Parts
■ Part I - Basic Techniques
■ Part II - Beyond Limits of gLite
■ Part III - Managing Large Experiments

Enabling Applications on BG/EGEE Grid
Part I - Basic Techniques
Riga, 3rd July 2007
by Tomasz Szepieniec, IFJ PAN & CYFRONET

Contents
■ Why use the Grid?
■ Applications suitable for grid-enabling
■ Some notes about the gLite environment
■ VO manager view vs. RC view
■ Dealing with application software
■ Introduction to today's exercise

Why Use the Grid? 1/3
Motivation is important... we need motivation to face the problems that will occur.
■ Why not?
  • "It requires changes in my habits." Yes, but you can change your habits for the better.
  • "Grid fails sometimes/all the time." ~10% TRUE. The rest: go to
  • "I must alter my application." Yes, go to the application support activity for a hint.
  • "I must deal with a certificate, a Virtual Organization (VO) and a User Interface." Yes, but this is feasible. ADMINS! Enable UI

Why Use the Grid? 2/3
■ Something positive?
  • Computational resources
  • Storage resources
  • Collaboration
■ Typical motivation
  • I must deal with some computation: I would need 2 years with my PC...
  • It would be great to increase the resolution...
  • In my work I need to play with some arguments: it would be great to get the results immediately
  • We have a Project: we must show...

Miracle of Sharing Resources
■ If there is a shortage of resources, SHARING is the solution
■ Typical cycle of working with computing:
  • Preparing, Computing, Analyzing, (Writing a paper)
  • This does not apply to some researchers (e.g. those solving the Sierpinski problem)
■ It gives you more than you can obtain by keeping only your own part
[Figure: demand vs. resources, showing unused resources and unmet demand. Copied from P. Plaszczak, "Grid Computing"]

Why Use Grids? 3/3
■ You should think about the following:
  • Licensed software
  • How many times do I need to use it?
    ► Effort for grid-enabling
  • Location and size of data
  • Speed-up including overheads
    ► Parallel execution
  • Other people who use the same data
  • Security level required. Rights to data.
    ► "In Grid I Trust"
■ HINT: Before you start, find your motivation

Applications Suitable for Grids
■ Batch oriented
  • But we can deal with interactivity
■ Sequential
  • But we have MPI support (inside clusters)
  • The age of multi-core machines is starting...
■ Not very long-lasting (up to 12 hours)
  • But we can enable checkpointing
■ Demanding, e.g. requiring large RAM
  • The Resource Broker provides resources according to the specification (in practice: typical configurations)
■ Commercial
  • We can deal with it in many cases

Notes on gLite and Sites
■ Almost homogeneous solution
  • Support for IA32 and IA64
  • It is not required to compile the software on site
■ Clusters only, which means a limited variety of resource configurations
  • Typically dual processor, 1-4 cores, 1-2 GB RAM
  • Typically 1 Gb Ethernet for interconnections
  • Applications that typically use large parallel machines should remain there
■ Support for local MPI (mpich-p4, no mpich-g2)
  • Non-public pool of resources: no WN-to-WN multi-site communication; a proxy is needed for this (located on CEs, which is a drawback)
■ Frequent changes, poor quality of middleware
  • Operational effort required
■ The pure Globus solution still works
  • You can globus-run on a CE

Virtual Organization Manager View
■ Virtual Organization (VO): a group of people/institutions cooperating or having the same goals or requirements
  • VO members are registered and recognized by X.509 certificates
  • Examples:
    ► The team working on the human genome project
    ► The Gaussian users
    ► Users testing their applications in the Grid
■ Resource allocation is done on a per-VO basis: users are allowed to use only the sites that support the VO
  • FCR: Freedom of Choice of Resources, based on monitoring
■ It is the VO manager's duty to negotiate with sites
  • Subjects of negotiation would be:
    ► The limits on the queue system
    ► Specific configuration
    ► Support for VO services (VOMS, LFC)
  • No framework for this; currently:
    ► In BG -> go to SA1 (application support would help)
    ► In EGEE -> long-lasting official procedure of registration and getting resources
    ► Resource Allocation Portal in BalticGrid4Science?

Status of VO Support-Related Stuff
■ What we have:
  • Declarations of the site policy (based on percentage of resources devoted to disciplines)
  • Hidden configuration of the level of support
  • GGUS ticket if something goes wrong
  • VO assessment tool (in preparation)
■ What we want to have:
  • Possibility to plan and dynamically adapt the support level to VO needs
  • Establishing/supporting VO tracking
  • Visible VO support policy that each RC follows
  • Collaboration tool?

Some Use Cases
■ A VO supported in the infrastructure needs more computing power for 1 month, starting next week.
■ A VO needs to increase its storage/CPU ratio, so it wants to negotiate this with 20 sites and trace the process of enabling it.
■ An RC needs resources for a new VO and needs to cut (re-negotiate) resources for supported VOs.
■ Currently, in each case only -based approach is available

Plan Resource Allocation in the Portal
■ Plan the resource allocation over time
■ Support for a VO could be considered as 'contracts', each between a single RC and a single VO
■ Possibility to include policy declarations
■ Would an RC manager be happy if policy management were done this way?

Negotiate Changes
■ Both parties can propose changes to the 'contract'
■ In this case the other party is notified and negotiation over each element of the contract can proceed
■ The number of rounds is not limited

Trace the Contract Execution Process
■ A change of state can be requested by both sides
■ The other side can confirm or reject the transition
  • Some steps could include verification of work
  • e.g. checking whether the site increased the guaranteed number of CPU slots
■ Trace of actions available in the portal

Feedback Collecting
■ Execution of the contract would be assessed in the form of feedback
  • e.g. for RCs: Did the site enable support in time? Did the site provide the promised number of CPUs?
  • e.g. for VOs: Does the VO exploit guaranteed (reserved) resources? Is the proper configuration available?
■ Feedback would be collected:
  • On state transitions and/or on request
  • Semi-automatically, based on monitoring and accounting data
  • Always with the possibility to make comments, explain, etc.
■ Points could be assigned to RCs and VOs based on the feedback
  • A top 10 list of reliable sites can be published ;-)
■ Points and feedback could help in making policy-related decisions

Main Points to Start
■ First we should think about:
  • Batch mode
  • The application binaries, libraries and data needed
  • Input data
  • Output data
■ Hints:
  • You should not transfer more than 10 MB through the RB!!!
  • Use the storage feature and consider replication of the data
    ► You can also use http download if you have to...
  • Data that are used in the majority of jobs could be installed once
  • Consider putting outputs on the grid for future use
  • OutputSandbox should be used for status and logs only
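The 10 MB hint can be checked before submission. The following is a minimal Python sketch, not part of the original slides: the function name `split_inputs` is hypothetical, and the only value taken from the slide is the 10 MB limit. It packs the smallest input files into the sandbox and marks the rest for upload to grid storage.

```python
import os

# Rule of thumb from the slide: no more than 10 MB through the Resource Broker.
SANDBOX_LIMIT_BYTES = 10 * 1024 * 1024


def split_inputs(paths, limit=SANDBOX_LIMIT_BYTES):
    """Split input files into (sandbox, storage) lists.

    Files are considered from smallest to largest. A file goes to the
    sandbox while the running total stays within `limit`; everything else
    is marked for grid storage instead.
    """
    sandbox, storage, total = [], [], 0
    for path in sorted(paths, key=os.path.getsize):
        size = os.path.getsize(path)
        if total + size <= limit:
            sandbox.append(path)
            total += size
        else:
            storage.append(path)
    return sandbox, storage
```

Files in the second list would then be uploaded to a storage element and downloaded by the job itself, as the exercise later in this part describes.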

Managing VO Software
■ VO Software = Application + Data needed
■ Important:
  • VO Software is not for admins to install!!!
  • The VO should deal with the software itself!

Become a Software Manager
■ A Software Grid Manager (SGM) is a person who manages installations of application software in the Grid, i.e. libraries, other dependencies and the application itself.
■ The SGM is identified to Grid services by a special role in his/her proxy certificate extension.
■ How to become an SGM in the BGTUT VO?
  • When creating a proxy certificate you need to request a role called "lcgadmin".
  • voms-proxy-init -voms bgtut:/Role=lcgadmin

Shared Workspace
■ Sites provide special disk space for such installations (pointed to by the variable $VO_ _SW_DIR)
  • E.g. in BGTUT VO: $VO_BGTUT_SW_DIR
■ SW_DIR is shared and visible from all worker nodes
■ With an installation job you can put your software on the site
■ In the majority of VOs only a special group (so-called software managers) is allowed to do this, but in general VOs (like BG VO) all users could do it
  • This is associated with a VOMS group
■ Remember: this is common space, so think twice before submitting a change.

VO Tags
■ VO Tags are a mechanism for adding application-related information to a site's information system
■ Using this we can publish a tag meaning that package XXX is installed
  • lcg-ManageVOTag -host HOSTNAME -vo bgtut --add -tag VO_bgtut_YOUR_TAG
■ Tags become visible in the BDII within a few minutes
■ To check which tags a particular site publishes, use
  • lcg-ManageVOTag -host HOSTNAME -vo bgtut --list -tag
  • You can do this even WITHOUT being an SGM
■ To find all sites that have a given software installed (i.e. publish the proper tag):
  • Prepare a JDL and run: glite-job-listmatch
■ Remember: the information system is a vital element of the site. Give tag names that will be unique!

SW_DIR + VO Tag
■ Create a directory in SW_DIR
  • It should be named after your application or, in the case of your own experiment, after yourself
  • Put the software in this directory
■ Publish a VO Tag
  • It should be clear and include the VO name
■ Add a requirement to the JDL
  • Requirements = Member("VO_bgtut_YOUR_TAG", other.GlueHostApplicationSoftwareRunTimeEnvironment)
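To show where the tag requirement sits in a complete job description, here is a small Python sketch that writes out JDL text. The helper `build_jdl`, the script name and the output file names are illustrative assumptions; only the Requirements expression comes from the slide, and the remaining attributes are the usual basic JDL ones.

```python
def build_jdl(executable, tag, arguments="", input_files=()):
    """Return JDL text for a job that only matches sites publishing `tag`."""
    inputs = ", ".join('"%s"' % name for name in input_files)
    lines = [
        'Executable = "%s";' % executable,
        'Arguments = "%s";' % arguments,
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        'InputSandbox = {%s};' % inputs,
        # Keep the output sandbox for status and logs only.
        'OutputSandbox = {"std.out", "std.err"};',
        # Match only sites whose information system publishes the VO tag.
        'Requirements = Member("%s", '
        'other.GlueHostApplicationSoftwareRunTimeEnvironment);' % tag,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical usage: run.sh is an assumed wrapper script name.
print(build_jdl("run.sh", "VO_bgtut_YOUR_TAG", input_files=["run.sh"]))
```

Saving this text to a file and passing it to glite-job-listmatch, as the VO Tags slide suggests, should list the sites that publish the tag.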

GAUSSIAN VO - A Good Example of a VO
■ VO already accepted in EGEE, operated by CYFRONET (Krakow, Poland)
■ For users
  • Everyone who accepts the policy can join
  • Easy to start: ready scripts...
■ For admins
  • Sites with a GAUSSIAN site licence can join

Hands-on Exercise for Today
■ Work in pairs
■ Show us your application
■ Try to run a simple version of it on the Grid
  1. Tar all files and put the tarball in storage (with lcg-cr)
  2. Prepare a script that:
     1. Downloads the tarball
     2. Expands it
     3. Prepares the environment
     4. Runs the application
     5. Collects all important files, tars them and puts them on the grid (lcg-cr)
     6. Logs the status to an output file
  3. Submit the script and check that it is running well
  4. Get the file from storage and check the results
See: balticgrid.org -> Grid Operations -> BalticGridTutorials -> BG Summer School -> Exercise 1
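The six steps of the job script can be sketched as follows. This is an assumption-laden Python outline rather than the tutorial's actual script: `fetch` and `store` are hypothetical callables standing in for the grid copy commands (the slide names lcg-cr for the upload), and `APP_HOME`, `status.log` and the tarball names are made up for illustration.

```python
import os
import subprocess
import tarfile


def run_job(fetch, store, command, workdir, outputs):
    """Run one grid job following the six steps listed on the slide.

    fetch(dest_path) must download the input tarball to dest_path;
    store(src_path) must upload the results tarball. Both are placeholders
    for whatever grid data commands the site provides.
    """
    os.makedirs(workdir, exist_ok=True)
    log_path = os.path.join(workdir, "status.log")

    tarball = os.path.join(workdir, "input.tar.gz")
    fetch(tarball)                                    # 1. download the tarball
    with tarfile.open(tarball) as tar:                # 2. expand it
        tar.extractall(workdir)
    env = dict(os.environ, APP_HOME=workdir)          # 3. prepare the environment
    with open(log_path, "w") as log:                  # 4. run the application
        code = subprocess.run(command, cwd=workdir, env=env,
                              stdout=log, stderr=subprocess.STDOUT).returncode

    results = os.path.join(workdir, "results.tar.gz")
    with tarfile.open(results, "w:gz") as tar:        # 5. collect and store outputs
        for name in outputs:
            path = os.path.join(workdir, name)
            if os.path.exists(path):
                tar.add(path, arcname=name)
    store(results)

    with open(log_path, "a") as log:                  # 6. log the final status
        log.write("exit_code=%d\n" % code)
    return code
```

Passing the transfer steps in as callables keeps the sketch testable on a laptop; on a worker node they would wrap the real download and upload commands.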

Important!
■ Migrate to the REAL environment
  1. Request your personal grid certificate
     ► Local policy: ask a person from the BalticGrid project
     ► Or: find your CA on the GridPMA page
  2. Register in the BalticGrid VO (or in another VO)
     ► You need to have a certificate loaded in your browser
  3. Request an account on your local UI
  4. Enjoy the BalticGrid environment
  5. Tell us about your application (we will support you)
  6. Besides, COPY ALL FILES FROM THE BGTUT UI to a safe location

Enabling Applications on BG/EGEE Grid
Part II - Beyond Limits of gLite
Riga, 4th July 2007
by Tomasz Szepieniec, IFJ PAN & CYFRONET


Contents
■ Classification of Parallel Execution
■ Introduction to Interactivity in Grids
■ L-system-based rendering application
■ Introduction to OCM-G
■ OCM-G Frameworks for
  • Interactive use
  • Multi-site master-worker applications
  • Checkpointing

Classification of Parallel Executions
1. Parameter study
  • Just plenty of jobs running separately
2. Multi-site parallel execution based on the Master-Workers scheme
  • Jobs are submitted separately, but they connect to one Master component
  • Jobs become Slaves
3. Non-MPI parallel applications
  • Need for more than one job slot for a single job
  • The same method of requesting resources, but no mpirun
4. MPI applications

Introduction to Interactivity 1/2
■ Interactivity arises when we have on-line contact with the application on a Worker Node
■ Purpose
  • Collecting information about application status
  • Application monitoring and/or steering
  • Load balancing of chunks
  • Visualization
  • Person-in-the-loop computations
■ Means
  • The Master should be placed on a machine with a public IP and inbound connectivity
  • Outbound connectivity for worker nodes is enough

Introduction to Interactivity 2/2
■ It is possible to have an "Interactive" type of job in JDL
  • JobType = "Interactive";
  • STDIN, STDOUT, STDERR
  • Streams go through the Resource Broker
  • Hard to manage with many connections
  • Not recommended

L-system-based Rendering Application
by Krzysztof Abramowicz
■ Roles of Components
  • Master: transforms the L-system and generates a description of the scene
  • Workers: render the scene (by executing POV-Ray)
  • Aggregator: combines the images into an AVI movie
  • Interface: interactive shell for a user or script

Application Architecture
■ How it works
  • Master: on a CE or on a public UI
    ► Sends scenes to the workers to render
  • Interface: on the UI
  • Workers: submitted as separate jobs
    ► All without any specific arguments besides the address of the Master
    ► Render the scene, save the image in the grid, and inform the Master
  • Aggregator: a separate job that connects to the Master
    ► Adds new images to the movie according to information from the Master
■ The framework is dynamic: the number of worker nodes can change over time
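The connection pattern described above, where workers know only the Master's address and open outbound connections to it, can be illustrated with plain sockets. This Python sketch is not the rendering application's code: the line-based protocol, the integer tasks and the names `run_master` and `run_worker` are assumptions, and it accepts a fixed number of workers for simplicity, whereas the real framework lets the number change over time.

```python
import socket
import threading


def run_master(server, tasks, n_workers):
    """Accept n_workers connections and hand out integer tasks one at a time.

    `server` is a listening socket on the machine with inbound connectivity.
    Returns a dict mapping each task to the worker's answer.
    """
    results = {}
    lock = threading.Lock()

    def handle(conn):
        with conn, conn.makefile("r") as rfile, conn.makefile("w") as wfile:
            while True:
                with lock:
                    task = tasks.pop() if tasks else None
                if task is None:
                    wfile.write("DONE\n")      # nothing left: release the worker
                    wfile.flush()
                    return
                wfile.write("%d\n" % task)
                wfile.flush()
                answer = int(rfile.readline())
                with lock:
                    results[task] = answer

    threads = []
    for _ in range(n_workers):
        conn, _addr = server.accept()
        thread = threading.Thread(target=handle, args=(conn,))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
    return results


def run_worker(address, compute):
    """Connect outbound to the Master and process tasks until told to stop."""
    with socket.create_connection(address) as conn, \
            conn.makefile("r") as rfile, conn.makefile("w") as wfile:
        while True:
            line = rfile.readline()
            if not line or line.strip() == "DONE":
                break
            wfile.write("%d\n" % compute(int(line)))
            wfile.flush()
```

To try it locally, create the listening socket with `socket.create_server(("127.0.0.1", 0))`, start a few threads running `run_worker(server.getsockname(), lambda x: x * x)`, and call `run_master(server, list(range(10)), 3)`. On the grid, the worker threads would be separately submitted jobs.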

Results

Tomasz Szepieniec, Tomasz Duszka, Jakub Janczak
in
Including a tutorial: you can try it within the BalticGrid VO

OCM-G + G-PM = On-line Monitoring
■ OCM-G and G-PM
  • OCM-G is a grid-enabled application monitoring system that enables on-line monitoring and steering of distributed applications
    ► Special support for performance analysis of MPI applications
  • G-PM is a tool for performance analysis
  • Together they make it possible to study performance bottlenecks in grid applications
  • Newly added:
    ► Support for IA64
    ► Support for Globus 4
  • Developed within CrossGrid and now maintained by IFJ PAN in cooperation with CYFRONET

Architecture and Request Distribution
[Figure: a Tool, SM and LM components, and application processes AP1-AP4 on node1-node3 across site1 and site2. A Stop request, :thread_stop([a_1]), is distributed as :thread_stop([p_1,p_2,p_3]) and :thread_stop([p_4]), and further as :thread_stop([p_1,p_2]) and :thread_stop([p_3]).]

Extensions in OCM-G
■ Integration with gLite
  • New options for installing OCM-G (not only RPM-based installation)
    ► Scripts for installing in the Shared Workspace
    ► Quick installation with a job
  • gLite job ID available internally
    ► The process list can be obtained using it
■ New services
  • Listing a remote directory
  • Downloading files (supports parts of files)
  • Uploading files
  • Running shell commands on remote nodes
  • Monitoring of CPU usage, free memory, free disk space, open files, etc.
  • Forking and managing other processes (including attaching to standard I/O)

Support of Other Platforms and Features
■ Full support of Globus Toolkit 2.4, 3 and 4
■ MCI reactivated!
  • Option to compile without Globus (pure sockets)
■ Partial support of IA64
■ MPI instrumentation based on PMPI
■ Improved management of component life-cycle
  • The Local Monitor can now be safely disconnected and re-connected

OCM-G Java API
■ Java package enabling access to OCM-G
  • Using CoG (Java Globus API) or pure sockets
■ Multi-layer interface:
  • Layer 1: handles GSI/MCI connections and sends/receives text-based OMIS messages
  • Layer 2: stateless objects that handle tokens and operate on them
  • Layer 3: stateful objects representing OCM-G tokens

CANDLE - Successor of G-PM
■ Management and visualization of data collected by OCM-G
■ Uses the OCM-G Java API
■ Extensions of G-PM functionality:
  • Separation of data and visualization
  • Easy integration with web portals
  • Support for dynamic applications
  • Better GUI
■ Development in progress
  • Advanced prototype planned for August

Specific Requirements of BG Apps
1. Have access to output files while the job is running, to ensure that computations are proceeding correctly
  • GAMESS Application
2. Manage a large number of workers and be able to enable application-internal scheduling
  • Text Analysis Application
OCM-G can meet the above requirements through the so-called OCM-G FRAMEWORKS

GAMESS Framework
■ GAMESS: a widely used computational chemistry application
  • Typically long running times: data can be lost due to a failure on a worker node or exceeded queue limits
  • Can restart a computation based on output files
■ Using the framework
  • Normal JDL as input
  • Automatic transformation of the JDL and OCM-G environment start-up
  • Automatic synchronization (downloading to the UI) of the whole output sandbox
■ Benefits
  • The user can check whether the computation is proceeding correctly (e.g. whether results are coherent)
  • In case of failure, partial results are available


Master-Workers Applications
1. Manage a large number of workers and be able to enable application-internal scheduling
2. The framework enables on-line control of multi-site applications
  • The Tool is the master
  • Jobs running under the control of OCM-G are workers
  • We can spawn different workers on-line using OCM-G services
  • The job set can change over time (we can spawn new workers and kill existing ones)
3. Feasibility study done with DNLP (a Latvian text analysis application)

DNLP Framework
■ DNLP: an MPI-Prolog-based application for natural language (Latvian) syntax analysis
  • Interactive usage
■ The framework enables:
  • A multi-site, dynamic, interactive 'farm' of jobs
  • OCM-G is used to distribute work between workers and collect the results

Grid Pipe
■ A nice tool for making parallel jobs from a simple shell-like pipe construction
■ Simple usage: parpipe 'cat /etc/passwd | tr a A | sort'
■ -> Applications... -> Parallel Pipe

Conclusions
■ In gLite there is a possibility for:
  • Interactivity, using outbound connectivity
  • Parallel execution, even for multi-site applications
■ OCM-G
  • Provides performance monitoring
  • Provides access to the running application (including environment, files, etc.)
  • Provides a way to build interactive, multi-cluster applications
■ The Application Support activity is still open to:
  • Advise you on the (shortest) way to the grid with your applications
  • Provide tools that would be useful

Enabling Applications on BG/EGEE Grid
Part III - Managing Large Experiments
Riga, 5th July 2007
by Tomasz Szepieniec, IFJ PAN & CYFRONET

Contents
■ Problems in managing computations
■ Job management frameworks
■ User-friendly GUIs and user/grid portals

Problem to Solve
■ Aim:
  • Organize a medium-size (20k cases) computation on the grid
■ Challenges:
  1. We have a team of 8 people working on this application (application developers + experts)
  2. Experts prepare the parameters in the form of a single file
  3. The application is changing, so handling application versions is important
  4. The application does not work well for some parameters

Technical Problems
■ Jobs sometimes fail, so resubmission is required
■ The majority of jobs should go to the grid (to two different VOs), but we also want to use a local cluster
■ We should not overload the VOs, but we should use them efficiently
■ Grrr... Some sites are just wrongly configured!
■ For some failures, only the application operator should decide what to do
■ Cases are computed quickly (e.g. 10’), so we need to put more cases into a single grid job
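
The last point, packing several short cases into one grid job, can be sketched as below. This is a minimal illustration under assumptions that are not stated on the slide: the case duration is read as 10 minutes, the 240-minute target job length is arbitrary, and `pack_cases` is a hypothetical helper name.

```python
def pack_cases(case_ids, minutes_per_case, target_job_minutes):
    """Group cases so that each grid job runs for roughly target_job_minutes."""
    if minutes_per_case <= 0 or target_job_minutes <= 0:
        raise ValueError("durations must be positive")
    # At least one case per job, even if a single case exceeds the target.
    per_job = max(1, int(target_job_minutes // minutes_per_case))
    return [case_ids[i:i + per_job] for i in range(0, len(case_ids), per_job)]


# 20k cases of about 10 minutes each, aiming at jobs of about 4 hours (assumed).
jobs = pack_cases(list(range(20000)), minutes_per_case=10, target_job_minutes=240)
print(len(jobs), len(jobs[0]), len(jobs[-1]))
```

Longer jobs reduce the relative cost of submission and queueing, but a failure then loses more work, so the target length is a trade-off to tune per application.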

Good Practices 1/2
■ Organization
   - It is good to structure your computation (from the beginning!) into experiments, each with attached:
      ► Version of the software
      ► Input specification
      ► Version/name of
   - The results of each experiment should go to a different subdirectory on LFC
■ A validation mechanism is needed to distinguish raw job outputs from qualified results related to given inputs
   - Validated results should be copied to a different repository organized according to the input arguments
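
One possible way to encode this organization is sketched below. All names are assumptions made for illustration: the `Experiment` fields, the directory layout, the example paths and the `DONE` marker used by `is_valid` are hypothetical, and a real validation check would be application-specific.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    """An experiment pins the software version and the input specification."""
    name: str
    software_version: str
    input_spec: str  # e.g. name of the parameter file prepared by the experts

    def output_dir(self, base):
        """Separate output subdirectory per experiment (layout is an assumption)."""
        return posixpath.join(base, self.name, self.software_version)


def validated_path(repo, params):
    """Location of a validated result, keyed by its input arguments."""
    key = "_".join(f"{k}={params[k]}" for k in sorted(params))
    return posixpath.join(repo, key)


def is_valid(output_text):
    """Hypothetical check: the output must end with a DONE marker."""
    return output_text.strip().endswith("DONE")


exp = Experiment("exp01", "v1.2", "params-2007-07.txt")
raw_dir = exp.output_dir("/grid/myvo/results")
final = validated_path("/grid/myvo/validated", {"temp": 300, "n": 5})
print(raw_dir, final, is_valid("energy=-1.5\nDONE\n"))
```

Keeping raw outputs per experiment and validated results per input makes it possible to rerun a case with a newer software version without overwriting a result that was already qualified.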

Good Practices 2/2
■ A mechanism for listing problems should be applied
   - Automatic resubmission should be done carefully
■ To exclude wrongly configured sites you can use:
   - FCR (Freedom of Choice for Resources), but it works for the whole VO
   - A list of resources in a JDL argument
■ You need an application-level scheduler
   - To avoid overloading the grid
   - To provide the same interface to many environments
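
These practices can be combined in a small application-level scheduler, sketched below under stated assumptions. `ThrottledScheduler` and `exclusion_requirements` are hypothetical names; the `submit` callable stands in for whichever backend is used (grid VO or local cluster); and the JDL expression assumes the common gLite pattern of matching `other.GlueCEUniqueID` with `RegExp`, which should be checked against the JDL documentation of your middleware version.

```python
from collections import deque


def exclusion_requirements(bad_ces):
    """Build a JDL Requirements line rejecting the listed CEs (assumed syntax)."""
    if not bad_ces:
        return ""
    terms = [f'!RegExp("{ce}", other.GlueCEUniqueID)' for ce in bad_ces]
    return "Requirements = " + " && ".join(terms) + ";"


class ThrottledScheduler:
    """Limits the number of jobs in flight and caps automatic resubmission."""

    def __init__(self, submit, max_in_flight, max_retries):
        self.submit = submit          # backend adapter: callable(job)
        self.max_in_flight = max_in_flight
        self.max_retries = max_retries
        self.queue = deque()
        self.in_flight = set()
        self.attempts = {}            # job -> number of submissions so far
        self.needs_operator = []      # the list of problems for a human to review

    def add(self, job):
        self.queue.append(job)
        self.attempts.setdefault(job, 0)

    def pump(self):
        """Submit queued jobs until the in-flight limit is reached."""
        while self.queue and len(self.in_flight) < self.max_in_flight:
            job = self.queue.popleft()
            self.attempts[job] += 1
            self.in_flight.add(job)
            self.submit(job)

    def report(self, job, ok):
        """Record a finished job; retry failures only up to max_retries."""
        self.in_flight.discard(job)
        if ok:
            return
        if self.attempts[job] <= self.max_retries:
            self.queue.append(job)
        else:
            self.needs_operator.append(job)


submitted = []
sched = ThrottledScheduler(submitted.append, max_in_flight=2, max_retries=1)
for name in ["job-a", "job-b", "job-c"]:
    sched.add(name)
sched.pump()                       # only two jobs may be in flight
sched.report("job-a", ok=False)    # first failure: queued for one retry
sched.pump()
sched.report("job-b", ok=True)
sched.pump()
sched.report("job-a", ok=False)    # second failure: left to the operator
print(submitted, sched.needs_operator)
print(exclusion_requirements(["ce01.example.org"]))
```

Because the backend is just a callable, the same scheduler can feed several environments, and jobs that exhaust their retries end up in a list for the operator instead of being resubmitted forever.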

The Two Golden Rules
1. Users are the most important resources in GRIDs
2. It is better to spend 60 minutes writing a script than 3 minutes every day doing the same thing manually

Solutions
■ Write scripts that will manage everything
■ Use an existing framework for job submission
   - Some scripts are needed to adapt your application to the framework
   - Examples:
      ► BOSS – Batch Object Submission System
      ► Zeus Grid Toolkit
      ► GANGA

Some Features of Frameworks
■ BOSS
   - Developed for CMS
   - Feasible to use alone
   - C++ & Python API
   - Poor documentation
   - Support for various environments: fork, gLite, LSF, Condor-G, PBS
■ Zeus Grid Toolkit
   - Comes from DESY, used in HEP experiments
   - Relatively easy to enable
   - Manages data in LFC
   - Runs slowly when the number of jobs exceeds 1500
■ GANGA
   - Used in ATLAS and LHCb
   - HEP related, but not excluding other applications

Grid Portal vs Application Portal
■ Grid Portals/Tools
   - A nice user interface
   - Not always suitable for REAL WORK
      ► The possibility of providing an automatic solution is limited
   - Usually single submission only
   - Typically built by grid developers
■ Application Portals
   - Typically developed for a single application
   - Possibility to hide the grid
   - Typically built in cooperation with grid users

Portal/GUIs Examples
■ Grid Portals/GUIs
   - Migrating Desktop
   - P-GRADE Portal
      ► Workflow drawing
■ Tools to build application portals
   - GridSphere portal framework
   - GridwiseTech LCG API

MD functionality overview
■ Job submission
   - Selecting job type
   - Specifying job definition
   - Choosing requirements
   - Deciding on ranking policy
   - Picking input and/or output files
   - Defining specific job parameters (plug-in!)
   - Pre-processing job parameters (plug-in!)

MD functionality overview
■ Job monitoring
   - Tracking status of the job
   - Checking job parameters
   - Tracing job logs
   - Examining detailed job status
   - Possibility of interaction with user

MD functionality overview
■ Processing job output
   - Presenting partial results (plug-in!)
   - Visualising job output files (plug-in!)
   - Processing results (plug-in!)

Application Portal Example
Movie from the presentation of the Protein Folding Application in EUChinaGrid

Summary
■ Part I – Basic Techniques
   - Motivation
   - Role of VO Manager
   - Preparation of job submission, installation of the software in SW_DIR, managing VO-tags
■ Part II – Beyond Limits of gLite
   - Non-MPI parallel applications
   - Interactive use of grid resources
■ Part III – Managing Large Experiments
   - Principles of managing computations
   - Job managing frameworks
   - User-friendly GUI and user/grid portals
We wish you many very good grid-enabled applications!!!