Enabling Applications on BG/EGEE Grid Riga, 3rd July 2007 by Tomasz Szepieniec IFJ PAN & CYFRONET.


1 Enabling Applications on BG/EGEE Grid Riga, 3rd July 2007 by Tomasz Szepieniec IFJ PAN & CYFRONET

2 Application Support Activity
T.Szepieniec AT cyfronet.pl; 2–6 July 2007, 2nd BG Summer School, Riga
■ Goals: show application developers their way to the Grid and support them in adapting their applications to it.
■ Supported fields:
– Identification of the grid techniques that should be used
– Grid-enabling procedures
– Deployment procedures
– Integration with user-friendly interfaces
– Use of performance tools
– Production management
[Diagram: Application Developers send a request for support to the Application Expert Group; the support they receive leads to a grid-enabled, user-friendly and efficient (Baltic)Grid application.]
■ DISCLAIMER: We are NOT for:
– Organizing user support such as a help desk or call center. User support is in SA1.
– Developing grid-enabled extensions to applications. All alterations to applications should be made by the application developers.

3 Tools for Application Developers
■ Migrating Desktop
– User-friendly graphical interface to the Grid
– Applications can provide plug-ins to facilitate:
► Job submission
► Output analysis
– Developed within CrossGrid and maintained by PSNC
■ OCM-G and G-PM
– OCM-G is a grid-enabled application monitoring system that enables on-line monitoring and steering of distributed applications
► Special support for performance analysis of MPI applications
– G-PM is a tool for performance analysis
– Together they make it possible to study performance bottlenecks in grid applications
– Developed within CrossGrid and maintained by IFJ PAN in cooperation with CYFRONET

4 The Idea of Gridification Workshop
■ An experiment!
– You come to the School with your own application…
– …and by the end your application should be grid-enabled
■ Means:
– Lectures to give you knowledge and some ideas (>3h)
– Hands-on parts to work in pairs and really deploy your ideas on the BG Grid (6h)
– Tutors to solve problems, show next steps, discuss ideas with, etc.

5 For those who did not bring an application…
■ We have something special!
■ BLAST – an application for searching patterns against the human genome (about 15 GB)
■ Try to make searching the human genome on-line with the Grid
■ See: http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/information3.html

6 Parts
■ Part I – Basic Techniques
■ Part II – Beyond Limits of gLite
■ Part III – Managing Large Experiments

7 Enabling Applications on BG/EGEE Grid Part I – Basic Techniques Riga, 3rd July 2007 by Tomasz Szepieniec IFJ PAN & CYFRONET

8 Contents
■ Why use the Grid?
■ Applications suitable for grid-enabling
■ Some notes about the gLite environment
■ VO manager view vs. RC view
■ Dealing with application software
■ Introduction to today’s exercise

9 Why use the Grid? 1/3
Motivation is important… we need motivation to face the problems that will occur.
Why not?
– It requires changes in my habits. (Answer: Yes, but you can change habits for the better.)
– Grid fails sometimes/all the time. (Answer: ~10% TRUE. For the rest, go to support@balticgrid.org.)
– I must alter my application. (Answer: Yes; go to the application support activity for a hint.)
– I must deal with a certificate, a Virtual Organization (VO) and a User Interface. (Answer: Yes, but this is feasible. ADMINS! Enable UI.)

10 Why use the Grid? 2/3
■ Something positive?
– Computational resources
– Storage resources
– Collaboration
■ Typical motivation
– I must deal with some computation – I would need 2 years with my PC…
– It would be great to increase the resolution…
– In my work I need to play with some arguments – it would be great to get the results immediately
– We have a project – we must show…

11 Miracle of Sharing Resources
■ If there is a shortage of resources, SHARING is the solution
■ Typical cycle of working with computing:
– Preparing, Computing, Analyzing, (Writing a paper)
– This does not apply to every researcher (e.g. one solving the Sierpinski problem)
■ Sharing gives you more than you can obtain by keeping only your own part
[Figure: demand vs. resources, showing unused resources and unmet demand. Copied from P. Plaszczak, „Grid Computing”]

12 Why use Grids? 3/3
■ You should think about the following:
– Licensed software
– How many times will I need to use it?
► Effort for grid-enabling
– Location and size of data
– Speed-up including overheads
► Parallel execution
– Other people who use the same data
– Security level required. Rights to data.
► „In Grid I Trust”
■ HINT: Before you start, find your motivation

13 Applications Suitable for Grids
■ Batch oriented
– But we can deal with interactivity
■ Sequential
– But we have MPI support (inside clusters)
– The age of multi-core machines is starting…
■ Not very long-running (up to 12 hours)
– But we can enable checkpointing
■ Demanding, e.g. requiring large RAM
– The Resource Broker provides resources according to the specification (in practice, a typical configuration)
■ Commercial
– We can deal with this in many cases

14 Notes on gLite and Sites
■ Almost homogeneous solution
– Support for IA32 and IA64
– It is not required to compile the software on site
■ Clusters only – this means a limited variety of resource configurations
– Typically dual-processor nodes, 1–4 cores, 1–2 GB RAM
– Typically 1 Gb Ethernet for interconnections
– Applications that typically use large parallel machines should remain there
■ Support for local MPI (mpich-p4, no mpich-g2)
– Non-public pool of resources – no WN-to-WN multi-site communication; this needs a proxy (located on CEs, which is a drawback)
■ Frequent changes, poor quality of middleware
– Operational effort required
■ The pure Globus solution still works
– You can globus-run on a CE

15 Virtual Organization Manager View
■ Virtual Organization (VO) – a group of people/institutions cooperating or having the same goals or requirements
– VO members are registered and recognized by X.509 certificates
– Examples:
► The team working on the human genome project
► The Gaussian users
► Users testing their applications on the Grid
■ Resource allocation is done on a per-VO basis – users are allowed to use only the sites that support the VO
– FCR – Freedom of Choice of Resources, based on monitoring
■ It is the VO manager's duty to negotiate with sites
– Subjects of negotiation would be:
► The limits on the queue system
► Specific configuration
► Support for VO services (VOMS, LFC)
– There is no framework for this. Currently:
► In BG -> go to SA1 (application support would help)
► In EGEE -> long-lasting official procedure of registration and getting resources
► Resource Allocation Portal in BalticGrid4Science?

16 Status of VO support-related stuff
■ What we have:
– Declarations of the site policy (based on the percentage of resources devoted to disciplines)
– Hidden configuration of the level of support
– A GGUS ticket if something goes wrong
– VO assessment tool – in preparation
■ What we want to have:
– Possibility to plan and dynamically adapt the support level to VO needs
– Establishing/supporting VO tracking
– Visible VO-support policy under which the RC is running
– Collaboration tool?

17 Some use cases
■ A VO supported in the infrastructure needs more computing power for 1 month, starting next week.
■ A VO needs to increase its storage/CPU ratio, so it wants to negotiate this with 20 sites and trace the process of enabling it.
■ An RC needs resources for a new VO and needs to cut (re-negotiate) resources for the supported VOs.
■ Currently, in each case only an e-mail-based approach is available

18 Plan resource allocation in the portal
■ Plan the resource allocation in time
■ Support for a VO could be considered as 'contracts', each between a single RC and a single VO
■ Possibility to include policy declarations
■ Would an RC manager be happy if policy management were done that way?

19 Negotiate changes
■ Both parties can propose changes to the 'contract'
■ In this case the other party is notified, and negotiation over each element of the contract can proceed
■ The number of rounds is not limited

20 Trace the contract execution process
■ A change of state can be requested by both sides
■ The other side can confirm or reject the transition
– In some steps verification of the work could be included
– e.g. checking whether the site increased the guaranteed number of CPU slots
■ A trace of actions is available in the portal

21 Feedback Collecting
■ Execution of the contract can be assessed in the form of feedback
– e.g. for RCs: Did the site enable support in time? Did the site provide the promised number of CPUs?
– e.g. for VOs: Does the VO exploit the guaranteed (reserved) resources? Is the proper configuration available?
■ Feedback would be collected:
– On state transitions and/or on request
– Semi-automatically, based on monitoring and accounting data
– Always with the possibility to make comments, explain, etc.
■ Points could be assigned to RCs and VOs based on the feedback
– A top-10 list of reliable sites can be published ;-)
■ Points and feedback could help in making policy-related decisions

22 Main Points to Start
■ First we should think about:
– Batch mode
– The application binaries, libraries and data needed
– Input data
– Output data
■ Hints:
– You should not transfer more than 10 MB through the RB!!!
– Use the storage feature and consider replication of the data
► You can also use HTTP download if you have to…
– Data that are used in the majority of jobs could be installed once
– Consider putting outputs on the grid for future use
– OutputSandbox should be used for status and logs only
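The 10 MB hint can be checked before submission. The following is a minimal sketch, not part of the original slides: the function name and the smallest-first policy are assumptions made for illustration. It decides which files may travel in the sandbox and which should be put on grid storage instead.

```python
import os

SANDBOX_LIMIT_BYTES = 10 * 1024 * 1024  # the 10 MB hint from the slide


def split_by_sandbox_limit(paths, limit=SANDBOX_LIMIT_BYTES):
    """Return (sandbox, storage): files that fit together in the sandbox,
    and files that should go to grid storage instead."""
    sandbox, storage, used = [], [], 0
    # Smallest files first, so scripts and configs stay in the sandbox.
    for path in sorted(paths, key=os.path.getsize):
        size = os.path.getsize(path)
        if used + size <= limit:
            sandbox.append(path)
            used += size
        else:
            storage.append(path)
    return sandbox, storage
```

Files returned in the second list are candidates for upload to a storage element, with the job fetching them at run time.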

23 Managing VO Software
■ VO Software = Application + Data needed
■ Important:
– VO Software is not for admins to install!!!
– The VO should deal with the software itself!

24 Become Software Manager
■ A Software Grid Manager (SGM) is a person who manages installations of application software in the Grid, i.e. libraries, other dependencies and the application itself.
■ The SGM is identified to Grid services by a special role in his/her proxy certificate extension.
■ How to become an SGM in the BGTUT VO?
– When creating a proxy certificate you need to request a role called "lcgadmin":
– voms-proxy-init -voms bgtut:/Role=lcgadmin

25 Shared Workspace
■ Sites provide special disk space for such installations (pointed to by the variable $VO_<VONAME>_SW_DIR)
– E.g. in the BGTUT VO: $VO_BGTUT_SW_DIR
■ SW_DIR is shared and visible from all worker nodes
■ With an installation job you can put your software on the site
■ In the majority of VOs only a special group (the so-called software managers) is allowed to do this, but in general VOs (like the BG VO) all users can
– This is associated with a VOMS group
■ Remember: this is common space – think twice before submitting a change.

26 VO Tags
■ VO Tags are a mechanism for adding application-related information to a site's information system
■ With them we can publish a tag that means that package XXX is installed
– lcg-ManageVOTag -host HOSTNAME -vo bgtut --add -tag VO_bgtut_YOUR_TAG
■ Tags become visible in the BDII within a few minutes
■ To check which tags a particular site publishes, use
– lcg-ManageVOTag -host HOSTNAME -vo bgtut --list -tag
– You can do this even without being an SGM
■ To find all sites that have a given software package installed (i.e. publish the proper tag):
– Prepare a JDL and run: glite-job-listmatch
■ Remember: the information system is a vital element of the site. Give tags names that will be unique!
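Since tag names must be unique, it can help to generate them rather than type them by hand. The sketch below is illustrative only: the helper names and the set of permitted characters are assumptions, and the command line simply mirrors the `--add` form shown on the slide.

```python
import re


def vo_tag(vo, package):
    """Build a tag such as VO_bgtut_MYAPP.

    The allowed characters are an assumption chosen to keep tags unambiguous.
    """
    if not re.fullmatch(r"[A-Za-z0-9.\-]+", package):
        raise ValueError(f"unsuitable package name: {package!r}")
    return f"VO_{vo}_{package}"


def add_tag_command(host, vo, tag):
    """Argument list for publishing a tag, in the form shown on the slide."""
    return ["lcg-ManageVOTag", "-host", host, "-vo", vo, "--add", "-tag", tag]
```

The returned list can be passed to `subprocess.run` on a User Interface machine where the command is installed.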

27 SW_DIR + VO Tag
■ Create a directory in SW_DIR
– Name it after your application or, for your own experiment, use your name
– Put the software in this directory
■ Publish a VO Tag
– It should be clear and include the VO name
■ Add a requirement to the JDL
– Requirements = Member("VO_bgtut_YOUR_TAG", other.GlueHostApplicationSoftwareRunTimeEnvironment);
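A job description carrying this requirement could be generated as follows. This is a sketch under stated assumptions: the function name, the file names `std.out` and `std.err`, and the choice of attributes are illustrative; only the `Requirements` expression comes from the slide.

```python
def make_jdl(executable, arguments, tag, sandbox_files=()):
    """Return JDL text that only matches sites publishing the given VO tag."""
    inputs = ", ".join(f'"{name}"' for name in (executable, *sandbox_files))
    return "\n".join([
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f'InputSandbox = {{{inputs}}};',
        # Keep the output sandbox for status and logs only.
        'OutputSandbox = {"std.out", "std.err"};',
        f'Requirements = Member("{tag}", '
        'other.GlueHostApplicationSoftwareRunTimeEnvironment);',
    ])
```

Writing the result to a file and passing it to `glite-job-listmatch` would then list the sites that publish the tag.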

28 GAUSSIAN VO – A Good Example of a VO
■ VO already accepted in EGEE, operated by CYFRONET (Krakow, Poland)
■ For users
– Everyone who accepts the policy can join
– Easy to start – ready scripts…
– http://egee.grid.cyfronet.pl/Applications/gaussian-vo/
■ For admins
– Sites with a GAUSSIAN site licence can join
– http://egee.grid.cyfronet.pl/Applications/gaussian-vo-how-to-support/

29 Hands-on Exercise for Today
■ Work in pairs
■ Show us your application
■ Try to run a simple version of it on the Grid
1. Tar all files and put the tarball in storage (with lcg-cr)
2. Prepare a script that:
   1. Downloads the tarball
   2. Expands it
   3. Prepares the environment
   4. Runs the application
   5. Collects all important files, tars them and puts them on the grid (lcg-cr)
   6. Logs the status to an output file
3. Submit the script and check that it is running well
4. Get the file from storage and check the results
See: balticgrid.org -> Grid Operations -> BalticGridTutorials -> BG Summer School -> Exercise 1
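Step 2 of the exercise can be sketched as a small generator for the wrapper script. Everything specific here is an assumption for illustration: the function name, the file names `app.tar.gz`, `app.log` and `results.tar.gz`, the `output/` directory the application is presumed to write to, and the exact `lcg-cp`/`lcg-cr` options, which should be checked against the installed tools.

```python
def build_wrapper(vo, storage_element, lfn_in, lfn_out, app_command):
    """Return the text of a bash wrapper following steps 2.1 to 2.6."""
    return "\n".join([
        "#!/bin/bash",
        "# 2.1 Download the tarball from grid storage",
        f"lcg-cp --vo {vo} lfn:{lfn_in} file:$PWD/app.tar.gz || exit 1",
        "# 2.2 Expand it",
        "tar xzf app.tar.gz || exit 1",
        "# 2.3 Prepare the environment",
        'export PATH="$PWD:$PATH"',
        "# 2.4 Run the application, keeping its log",
        f"{app_command} > app.log 2>&1",
        "status=$?",
        "# 2.5 Collect the important files and store them on the grid",
        "tar czf results.tar.gz app.log output/",
        f"lcg-cr --vo {vo} -d {storage_element} -l lfn:{lfn_out} "
        "file:$PWD/results.tar.gz",
        "# 2.6 Log the status (returned through the OutputSandbox)",
        'echo "application exit status: $status"',
        "exit $status",
        "",
    ])
```

The generated script would be named as the `Executable` of the job, so that only the script and its log pass through the sandboxes.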

30 Important!
■ Migrate to the REAL environment
1. Request your personal grid certificate
► Local policy – ask a person from the BalticGrid project: http://ca.balticgrid.org/user_doc.html
► Or: find your CA on the GridPMA page
2. Register in the BalticGrid VO (or in another VO)
► You need to have the certificate loaded in your browser
► http://voms-web.balticgrid.org/
3. Request an account on your local UI
4. Enjoy the BalticGrid environment
5. Tell us about your application (we will support you)
6. Besides, COPY ALL FILES FROM the BGTUT UI to a safe location

31 Enabling Applications on BG/EGEE Grid Part II – Beyond Limits of gLite Riga, 4th July 2007 by Tomasz Szepieniec IFJ PAN & CYFRONET

32 Applications Suitable for Grids
■ Batch oriented
– But we can deal with interactivity
■ Sequential
– But we have MPI support (inside clusters)
– The age of multi-core machines is starting…
■ Not very long-running (up to 12 hours)
– But we can enable checkpointing
■ Demanding, e.g. requiring large RAM
– The Resource Broker provides resources according to the specification (in practice, a typical configuration)
■ Commercial
– We can deal with this in many cases

33 Contents
■ Classification of Parallel Execution
■ Introduction to Interactivity in Grids
■ L-system-based rendering application
■ Introduction to OCM-G
■ OCM-G Frameworks for:
– Interactive use
– Multi-site master-worker applications
– Checkpointing

34 Classification of Parallel Executions
1. Parameter study
– Just plenty of jobs running separately
2. Multi-site parallel execution based on the Master-Workers scheme
– Jobs are submitted separately, but they connect to one Master component
– Jobs become Slaves
3. Non-MPI parallel applications
– Need more than one job slot for a single job
– The same method of requesting resources, but no mpirun
4. MPI applications

35 Introduction to Interactivity 1/2
■ Interactivity arises when we have on-line contact with an application on a Worker Node
■ Purpose
– Collecting information about application status
– Application monitoring and/or steering
– Load-balancing of chunks
– Visualization
– Person-in-the-loop computations
■ Means
– The Master should be put on a machine with a public IP and inbound connectivity
– Outbound connectivity for worker nodes is enough

36 Introduction to Interactivity 2/2
■ It is possible to have an "Interactive" type of job in JDL
– JobType = "Interactive";
– STDIN, STDOUT, STDERR
– Streams go through the Resource Broker
– Hard to manage with many connections
– Not recommended

37 L-system-based Rendering Application by Krzysztof Abramowicz
■ Roles of Components
– Master – transforms the L-system and generates a description of the scene
– Workers – render the scene (by executing Povray)
– Aggregator – combines images into an AVI movie
– Interface – interactive shell for a user or script

38 Application Architecture
■ How it works
– Master – on a CE or on a public UI
► Sends scenes to render to the workers
– Interface – on a UI
– Workers – submitted as separate jobs
► All without any specific arguments besides the address of the Master
► Render the scene, save the image in the grid and inform the Master
– Aggregator – a separate job that connects to the Master
► Adds new images to the movie according to information from the Master
■ The framework is dynamic – the number of worker nodes can change over time
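The connection pattern described above, where workers know only the Master's address and open outbound connections to it, can be illustrated with plain sockets. This is a generic sketch, not the implementation used in the rendering application or in OCM-G: the line-based protocol, the `DONE` marker and all function names are assumptions, and failure handling is omitted.

```python
import queue
import socket
import threading


def handle_worker(conn, tasks, results):
    """Master side: feed one connected worker until the task queue is empty."""
    with conn, conn.makefile("r") as reader, conn.makefile("w") as writer:
        while True:
            try:
                task = tasks.get_nowait()
            except queue.Empty:
                writer.write("DONE\n")
                writer.flush()
                return
            writer.write(task + "\n")
            writer.flush()
            reply = reader.readline().strip()
            results.append((task, reply))


def run_master(server, tasks, results, expected_workers):
    """Accept inbound connections; each worker is served in its own thread."""
    threads = []
    for _ in range(expected_workers):
        conn, _addr = server.accept()
        thread = threading.Thread(target=handle_worker,
                                  args=(conn, tasks, results))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()


def run_worker(master_host, master_port, work):
    """Worker side: only an outbound connection to the Master is needed."""
    with socket.create_connection((master_host, master_port)) as conn, \
            conn.makefile("r") as reader, conn.makefile("w") as writer:
        while True:
            task = reader.readline().strip()
            if not task or task == "DONE":
                return
            writer.write(work(task) + "\n")
            writer.flush()
```

Because tasks are pulled from a shared queue, a worker that connects late simply takes whatever is left, which is one way to obtain the dynamic behaviour mentioned on the slide.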

39 Results

40 Tomasz Szepieniec, Tomasz Duszka, Jakub Janczak
http://grid.cyfronet.pl/ocmg
Including a tutorial – you can try it within the balticgrid VO

41 OCM-G + G-PM = On-line Monitoring
■ OCM-G and G-PM
– OCM-G is a grid-enabled application monitoring system that enables on-line monitoring and steering of distributed applications
► Special support for performance analysis of MPI applications
– G-PM is a tool for performance analysis
– Together they make it possible to study performance bottlenecks in grid applications
– Newly added:
► Support for IA64
► Support for Globus 4
– Developed within CrossGrid and now maintained by IFJ PAN in cooperation with CYFRONET

42 Architecture and Request Distribution
[Diagram: a Tool issues a Stop request, :thread_stop([a_1]), which is distributed through the SM and LM components to application processes AP1–AP4 running on node1–node3 across site1 and site2. The request is split into :thread_stop([p_1,p_2,p_3]) and :thread_stop([p_4]), and further into :thread_stop([p_1,p_2]) and :thread_stop([p_3]).]

43 Extensions in OCM-G
■ Integration with gLite
– New options for installing OCM-G (not only RPM-based installation)
► Scripts for installing in the Shared Workspace
► Quick installation with a job
– gLite job ID available internally
► The process list can be obtained using it
■ New services
– Listing a remote directory
– Downloading files (supports parts of files)
– Uploading files
– Running shell commands on remote nodes
– Monitoring of CPU usage, free memory, free disk space, open files, etc.
– Forking and managing other processes (including attaching to standard I/O)

44 Support of Other Platforms and Features
■ Full support for Globus Toolkit 2.4, 3 and 4
■ MCI reactivated!
– Option to compile without Globus (pure sockets)
■ Partial support for IA64
■ MPI instrumentation based on PMPI
■ Improved management of component life-cycle
– The Local Monitor can now be safely disconnected and re-connected

45 OCM-G Java API
■ Java package enabling access to OCM-G
– Using COG (Java Globus API) or pure sockets
■ Multi-layer interface:
– Layer 1 – handles GSI/MCI connections and sends/receives text-based OMIS messages
– Layer 2 – stateless objects that handle tokens and operate on them
– Layer 3 – stateful objects representing OCM-G tokens

46 CANDLE – Successor of G-PM
■ Management and visualization of data collected by OCM-G
■ Uses the OCM-G Java API
■ Extensions of G-PM functionality:
– Separation of data and visualization
– Easy integration with web portals
– Support for dynamic applications
– Better GUI
■ Development in progress
– Advanced prototype planned for August 2007

47 Specific Requirements of BG Apps
1. Have access to output files while the job is running, to ensure that the computation is going correctly (GAMESS application)
2. Manage a large number of workers and be able to enable application-internal scheduling (text analysis application)
OCM-G can meet the above requirements through so-called OCM-G FRAMEWORKS

48 GAMESS Framework
■ GAMESS – a widely used computational chemistry application
– Typically long running time – loss of data is possible due to a failure on a worker node or exceeding queue limits
– Can restart a computation based on output files
■ Using the framework
– Normal JDL as input
– Automatic transformation of the JDL and start-up of the OCM-G environment
– Automatic synchronization (downloading to the UI) of the whole output sandbox
■ Benefits
– The user can check whether the computation is going correctly (e.g. whether the results are coherent)
– In case of failure, partial results are available

49 GAMESS Framework

50 Master-Workers Applications
1. Manage a large number of workers and be able to enable application-internal scheduling
2. The framework enables on-line control over multi-site applications
– The Tool is the master
– Jobs running under the control of OCM-G are workers
– We can spawn different workers on-line using OCM-G services
– The job set can change over time (we can spawn new workers and kill existing ones)
3. Feasibility study done with DNLP (Latvian text analysis application)

51 DNLP Framework
■ DNLP – an MPI-Prolog-based application for natural language (Latvian) syntax analysis
– Interactive usage
■ The framework enables:
– A multi-site, dynamic, interactive 'farm' of jobs
– OCM-G is used to distribute work between workers and collect results

52 Grid Pipe
■ A nice tool for making parallel jobs from a simple shell-like pipe construction
■ Simple usage: parpipe 'cat /etc/passwd | tr a A | sort'
■ www.balticgrid.org -> Applications... -> Parallel Pipe

53 Conclusions
■ In gLite there is a possibility for:
– Interactivity – using outbound connectivity
– Parallel execution – even for multi-site applications
■ OCM-G http://grid.cyfronet.pl/ocmg
– Provides performance monitoring
– Provides access to the running application (including environment, files, etc.)
– Provides a way to build interactive, multi-cluster applications
■ The Application Support activity is still open to:
– Advise you on the (shortest) way to the grid with your applications
– Provide tools that would be useful

54 Enabling Applications on BG/EGEE Grid Part III – Managing Large Experiments Riga, 5th July 2007 by Tomasz Szepieniec IFJ PAN & CYFRONET

55 Contents
■ Problems in managing computations
■ Job managing frameworks
■ User-friendly GUIs and user/grid portals

56 Problem to Solve
■ Aim:
– Organize a medium-size (20k cases) computation on the grid
■ Challenges:
1. We have a team of 8 people working on this application (application developers + experts)
2. Experts prepare the parameters in the form of a single file
3. The application is changing, so handling versions of the application is important
4. The application does not work well for some parameters

57 Technical Problems
■ Jobs sometimes fail – resubmission is required
■ The majority of jobs should go to the grid (to two different VOs), but we also want to use a local cluster
■ We should not overload the VOs, but use them efficiently
■ Grrr… Some sites are just wrongly configured!
■ For some failures only the application operator should decide what to do
■ Cases are computed quickly (e.g. 10 minutes) – we need to put more cases into a single grid job
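The last point, packing several short cases into one grid job, can be sketched as a simple grouping function. The function name and the two-hour target used in the example below are assumptions for illustration; the slides only state that cases take about 10 minutes.

```python
def group_cases(cases, minutes_per_case, target_job_minutes):
    """Split a list of cases into chunks that each fill roughly one job."""
    per_job = max(1, target_job_minutes // minutes_per_case)
    return [cases[i:i + per_job] for i in range(0, len(cases), per_job)]
```

For the 20k cases mentioned earlier, `group_cases(list(range(20000)), 10, 120)` yields 1667 jobs of 12 cases each, except the last, which holds the remaining 8.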

58 Good Practices 1/2
■ Organization
– It is good to structure your computation (from the beginning!) into experiments, each with attached:
► Version of the software
► Input specification
► Version/name of
– Results of each experiment should go to a different subdirectory on the LFC
■ A validation mechanism is needed to distinguish raw job outputs from qualified results related to some inputs
– Validated results should be copied to a different repository organized according to input arguments
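One possible way to encode this organization is a small record per experiment that derives its storage locations. The class, field names and directory layout below are hypothetical; the slides prescribe only that each experiment has its own LFC subdirectory and that validated results live in a separate repository organized by input.

```python
from dataclasses import dataclass
import posixpath


@dataclass(frozen=True)
class Experiment:
    name: str
    software_version: str
    input_spec: str  # e.g. the name of the experts' parameter file

    def results_dir(self, lfc_root):
        """Raw job outputs: one LFC subdirectory per experiment."""
        return posixpath.join(lfc_root, "experiments", self.name, "raw")

    def validated_path(self, lfc_root, case_id):
        """Validated results: a separate tree keyed by the input case."""
        return posixpath.join(lfc_root, "validated",
                              self.software_version, case_id)
```

Keeping the software version in the validated path makes it possible to tell which release produced a given result when the application changes.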

59 Good Practices 2/2
■ A mechanism for listing problems should be applied
– Automatic resubmission should be done carefully
■ For excluding wrongly configured sites you can use:
– FCR (Freedom of Choice of Resources), but it works for the whole VO
– A list of resources in a JDL argument
■ You need an application-level scheduler
– So as not to overload the grid
– To provide the same interface to many environments
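Two of these practices, throttling submissions and resubmitting carefully, can be sketched in a few lines. The function names, the retry limit and the returned labels are assumptions for illustration rather than features of any of the frameworks discussed.

```python
def plan_submissions(pending, running_count, max_in_flight):
    """Split pending cases into (submit_now, keep_waiting) so that at most
    max_in_flight jobs are on the grid at once."""
    free_slots = max(0, max_in_flight - running_count)
    return pending[:free_slots], pending[free_slots:]


def on_failure(job_id, attempts, max_retries=2):
    """Count a failure; resubmit a few times, then hand over to the operator."""
    attempts[job_id] = attempts.get(job_id, 0) + 1
    return "resubmit" if attempts[job_id] <= max_retries else "ask_operator"
```

Bounding the number of automatic retries keeps a systematically failing case, for example one with bad parameters, from consuming resources indefinitely and leaves the final decision to the application operator.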

60 The Two Golden Rules
1. Users are the most important resource in Grids
2. It is better to spend 60 minutes writing a script than 3 minutes every day doing the task manually

61 Solutions
■ Write scripts that will manage everything
■ Use an existing framework for job submission
– Some scripts are needed to adapt your application to a framework
– Examples:
► BOSS – Batch Oriented Submission System http://boss.bo.infn.it/
► Zeus Grid Toolkit http://www-zeus.desy.de/~wrona/grid/index.php
► GANGA http://ganga.web.cern.ch/ganga/

62 Some Features of Frameworks
■ BOSS
– Developed for CMS
– Feasible to use alone
– C++ & Python API
– Poor documentation
– Support for various environments: fork, gLite, LSF, Condor-G, PBS
■ Zeus Grid Toolkit
– Comes from DESY, used in HEP experiments
– Relatively easy to enable
– Manages data in LFC
– Runs slowly when the number of jobs exceeds 1500
■ GANGA
– Used in ATLAS and LHCb
– HEP-related, but not excluding other applications

63 Grid Portal vs Application Portal
■ Grid Portals/Tools
– A nice user interface
– Not always suitable for REAL WORK
► The possibility of providing automatic solutions is limited
– Usually single submission only
– Typically built by grid developers
■ Application Portals
– Typically developed for a single application
– Possibility to hide the grid
– Typically built in cooperation with grid users

64 Portal/GUI Examples
■ Grid Portals/GUIs
– Migrating Desktop
► http://desktop.psnc.pl/
– P-Grade Portal
► Workflow drawing
► http://www.lpds.sztaki.hu/pgrade/
■ Tools to build application portals
– GridSphere portal framework
► http://www.gridsphere.org/
– GridwiseTech LCG API
► http://www.gridwisetech.com/content/view/91/96/lang,en/

65 MD functionality overview
■ Job submission
– Selecting job type
– Specifying job definition
– Choosing requirements
– Deciding on ranking policy
– Picking input and/or output files
– Defining specific job parameters (plug-in!)
– Pre-processing job parameters (plug-in!)

66 MD functionality overview
■ Job monitoring
– Tracking status of the job
– Checking job parameters
– Tracing job logs
– Examining detailed job status
– Possibility of interaction with the user

67 MD functionality overview
■ Processing job output
– Presenting partial results (plug-in!)
– Visualising job output files (plug-in!)
– Processing results (plug-in!)

68 Application Portal Example
Movie from the presentation of the Protein Folding Application in EUChinaGrid

69 Summary
■ Part I – Basic Techniques
– Motivation
– Role of VO Manager
– Preparation of job submission, installation of the software in SW_DIR, managing VO tags
■ Part II – Beyond Limits of gLite
– Non-MPI parallel applications
– Interactive use of grid resources
■ Part III – Managing Large Experiments
– Principles of managing computations
– Job managing frameworks
– User-friendly GUIs and user/grid portals
We wish you many very good grid-enabled applications!!!

