Published by Stanley Lickert. Modified over 9 years ago.
2010-04-27
EDGI: European Desktop Grid Initiative
EDGI gUSE portal user guide
EDGI is supported by the FP7 Capacities Programme under contract nr RI-261556.
Prerequisites
To use and understand this training material, trainees must be familiar with:
- using a gLite infrastructure for job execution
- using a gUSE web-based portal for job execution
If any knowledge of the above tools is missing, tutorials and manuals can be found at the following URLs:
- gLite: https://indico.egi.eu/indico/conferenceDisplay.py?confId=231
- gUSE: http://sourceforge.net/projects/guse/files/WS-PGRADE-Cookbook_2012_05_31.pdf/download
Contents
- Single job submission to EDGI through gLite using the EDGI gUSE portal
- Metajob submission to EDGI through gLite using the EDGI gUSE portal
- (GBAC submission to EDGI through gLite using the EDGI gUSE portal)
Each type of submission is explained step by step.
Step 0: get a proxy I.
- Go to “Security”, then “Certificates”
- Click “Download”
- Fill in the form as follows:
  - Hostname: n40.hpcc.sztaki.hu
  - Port: 7512
  - Username: SS2012_EDGI
  - Password: SS2012_EDGI
- Click “Download”
Step 0: get a proxy II.
- Click “Associate to VO”
- Select “edgiprod.vo.edgi-grid.eu”
- Click “OK”
Single job submission
Step 1: Choose an application from the EDGI AR
- Let us select the dsp application from the EDGI Application Repository (AR): http://edgi-repo.cpc.wmin.ac.uk/repository/
Step 2: Create a workflow for the selected application
- Using the graph editor, create your workflow.
- The number of inputs and outputs can be taken from the information about the application stored in the AR (edgi-repo.cpc.wmin.ac.uk).
- Once the graph and the concrete workflow derived from it exist, the workflow must appear in the list of workflows dialog.
Figure: Single job workflow for the DSP application
Step 3: Configure your workflow
- The selected workflow is configured in this dialog.
- To start the configuration, click on the job (orange box).
- Job properties are:
  - Type
  - Application Repository
  - Boinc Job
  - Resource
  - Role
  - Replication
  - Parameter
Step 4: Set type and repository
- “Type” selects the submitter that serves as the backend of the portal, i.e. it determines the destination (grid) to which the binary jobs are delivered.
- Set the type of your job to “EDGI”.
- “Application Repository” is the repository from which the portal takes the list of available applications.
- Set it to “Prod-AR”.
Step 5: Select application and supporting VO
- The list of applications is queried from the application repository selected previously.
- Select the application named “dsp”.
- The list of supporting VOs is also queried from the repository.
- Select the production VO named “edgiprod.vo.edgi-grid.eu”.
Step 6: Select resource and define its required role
- The resources (computing elements and their queues) are now listed.
- Select “cr1.edgi-grid.eu:8443/cream-pbs-edgidemo”.
- In the EDGI infrastructure a “Role” is needed in order to submit to a given queue. For the selected queue this role is “edgidemo”.
- Set the “Role” to “edgidemo”. Make sure you have this role for your certificate in the edgiprod VO!
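
The resource string selected in Step 6 appears to follow the usual CREAM computing-element form host:port/cream-<batch system>-<queue>. As a minimal sketch under that assumption (the class, function and field names are illustrative and not part of the portal), the string can be split to see which queue, and therefore which role, is involved:

```python
from typing import NamedTuple


class CreamResource(NamedTuple):
    host: str
    port: int
    batch_system: str
    queue: str


def parse_cream_resource(resource: str) -> CreamResource:
    """Split 'host:port/cream-<batch>-<queue>' into its parts.

    Assumes the conventional CREAM CE identifier layout; other
    resource types listed by the portal may not match it.
    """
    endpoint, _, service = resource.partition("/")
    host, _, port = endpoint.partition(":")
    # Split at most twice so that a queue name may itself contain '-'.
    prefix, batch_system, queue = service.split("-", 2)
    if prefix != "cream" or not port.isdigit():
        raise ValueError(f"not a CREAM resource string: {resource!r}")
    return CreamResource(host, int(port), batch_system, queue)


res = parse_cream_resource("cr1.edgi-grid.eu:8443/cream-pbs-edgidemo")
print(res.queue)  # in this tutorial the required role has the same name
```

In this tutorial the queue name and the required role happen to coincide; that need not hold for other queues.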
Step 7: Set the command-line argument
- “Replicate settings in all jobs” needs to be set only if multiple jobs are defined. You can leave this option unset.
- “Parameter” is the string your application receives as its command-line arguments.
- For this dsp example, set it to “-f 22 -i 22 -p 723 -n pools.txt”.
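
The “Parameter” string from Step 7 is handed to the application as its command line. Assuming it is split on whitespace in the usual shell-like way (how the portal actually tokenises it is not stated in this guide), the following sketch shows the argument list the dsp executable would receive:

```python
import shlex

parameter = "-f 22 -i 22 -p 723 -n pools.txt"

# Tokenise the string the way a POSIX shell would.
argv = shlex.split(parameter)

# In this example every flag is followed by exactly one value.
options = dict(zip(argv[0::2], argv[1::2]))

print(argv)
print(options["-n"])
```

The value after -n is “pools.txt”, the same name that is later given to the input port's internal file name, so the two settings presumably have to agree.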
Step 8: Set input and output
- The dsp application has one input and one output port (file).
- Set the “Input Port’s Internal File Name” to “pools.txt”.
- “Source of input …” should be of type “upload”.
- With the “Browse” button select the “pools.txt” file from your machine. You can download an example from: http://edgi-repo.cpc.wmin.ac.uk/repository/download?appid=1001&filename=example_pools_1.txt
- Set the “Output Port’s Internal File Name” to “cost.txt”, as the dsp application generates this file upon successful completion.
Step 9: Save your settings, upload and submit
- Save your settings; the portal automatically uploads the input file you selected from your machine. After completion, you will see the dialog on the left.
- Go to the workflow list view and push the submit button to start an instance of your configured workflow (job).
- In the dialog that appears, leave everything unchanged (for novice users)!
Step 10: track the status of your workflow
- Go to the list of workflows view and select “details” of your workflow.
- You should see this while your workflow is running…
- …and this when it has finished successfully.
Step 11: get the output of your execution
- When your workflow has finished, select “details” for the workflow instance and push the button labeled “Download file output”.
- Save and extract the file the portal gives you; it contains the file “cost.txt”, the output of the dsp application.
Meta (multiple) job submission
What is a metajob?
- A MetaJob is a collection of jobs.
- This type of job was developed in EDGI to support the transfer of a huge number of jobs (i.e. 1000-10000) through gLite in a seamless way.
- gLite does not recognise a MetaJob as such; it treats it as one single job.
- The list of jobs must be described in a metajob definition file.
- The metajob definition file travels as one extra input file of the job.
- The metajob is only extracted into individual jobs at the Desktop Grid site.
- Input files for each job instance must be uploaded to an HTTP server.
- For each input file, the URL from which it can be downloaded is needed.
- A few directives can be used in metajob definition files.
- The output files are finally returned in one single zipped file.
MetaJob in the EDGI infrastructure
[Figure: architecture diagram. Labelled components: ARC grid (ARC MCE), gLite grid (CREAM MCE), Eucalyptus/Amazon cloud, 3G Bridge (User IF, Bridge IF), Attic FS, attic, monitor, AR, Monitor UI, DG Project, DG clients on volunteer/institutional resources. Annotations: “MetaJob as a single job”, “Unfolding”, “Huge number of jobs”, “Single job”.]
Demonstrated at EGI UF, 12th of April, 2011, at booth #10 with 10,000 jobs through gLite.
Step 0: prepare your inputs
- Upload your individual input files to a web server:
    http://mishra.lpds.sztaki.hu/edgidemo/download/dsp_inputs/pools1.txt
    …
    http://mishra.lpds.sztaki.hu/edgidemo/download/dsp_inputs/pools10000.txt
- Create the description of your metajob:
    %Required 100%
    %SuccessAt 100%
    %Comment pools1.txt
    Arguments = "-i 0 -n pools.txt -f 22 -p 723"
    Input = pools.txt=http://mishra.lpds.sztaki.hu/edgidemo/download/dsp_inputs/pools1.txt
    Queue
    […]
    %Comment pools100.txt
    Input = pools.txt=http://mishra.lpds.sztaki.hu/edgidemo/download/dsp_inputs/pools100.txt
    Queue
- Alternatively, you can download an example at http://mishra.lpds.sztaki.hu/edgidemo/download/dsp_metajob_configs/_3gb-metajob-dsp-100
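
Writing 100 or more near-identical entries by hand is error-prone, so a metajob description like the one in Step 0 can be generated. The sketch below only reproduces the directives visible in that example (%Required, %SuccessAt, %Comment, Arguments, Input, Queue). The function name is illustrative, and repeating the Arguments line for every job is an assumption: the example shows it only for the first entry.

```python
BASE_URL = "http://mishra.lpds.sztaki.hu/edgidemo/download/dsp_inputs"
ARGUMENTS = "-i 0 -n pools.txt -f 22 -p 723"


def build_metajob(job_count: int) -> str:
    """Return metajob description text for job_count dsp runs."""
    lines = ["%Required 100%", "%SuccessAt 100%", ""]
    for i in range(1, job_count + 1):
        lines += [
            f"%Comment pools{i}.txt",
            f'Arguments = "{ARGUMENTS}"',
            # <internal file name>=<URL the input is downloaded from>
            f"Input = pools.txt={BASE_URL}/pools{i}.txt",
            "Queue",
            "",
        ]
    return "\n".join(lines)


text = build_metajob(100)
# Show the header and the first job entry.
print("\n".join(text.splitlines()[:7]))
```

The generated text would then be saved under a name starting with “_3gb-metajob”, for example “_3gb-metajob-dsp-100”, before uploading it to the portal.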
Step 1: Choose an application from the EDGI AR
- Let us select the dsp application from the EDGI Application Repository (AR): http://edgi-repo.cpc.wmin.ac.uk/repository/
Step 2: Create a workflow for the selected application
- Using the graph editor, create your workflow.
- The number of inputs and outputs can be taken from the information about the application stored in the AR (edgi-repo.cpc.wmin.ac.uk).
- Once the graph and the concrete workflow derived from it exist, the workflow must appear in the list of workflows dialog.
- IMPORTANT: add one extra input port to the job. You will need it later!
Figure: Single job workflow for the DSP application
Steps 3-8: Repeat the steps defined for single job submission
- Configure your dsp application the same way as in the single job submission tutorial.
- The only change is that the job needs two input ports instead of one. The second input is defined on the next slide.
Step 9: configure the extra input file of your dsp job
- Modify the dsp job, or create a new one, so that it has two input ports and one output port (file).
- On the left you can see how it should look.
- Then upload your metajob configuration file.
- You can take the example from Step 0.
- The file name must start with “_3gb-metajob”. For the dsp application the file name is “_3gb-metajob-dsp-100”; it contains the definition of 100 jobs.
- You can download an example at http://mishra.lpds.sztaki.hu/edgidemo/download/dsp_metajob_configs/_3gb-metajob-dsp-100
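
Before uploading, it can be worth checking the two things Step 9 relies on: the file name prefix and the number of jobs defined. The following is a rough sketch, assuming each Queue line defines exactly one job as in the example from Step 0; the function name and the two-job sample are made up for illustration.

```python
def count_queued_jobs(filename: str, content: str) -> int:
    """Check the required name prefix and count the Queue lines."""
    if not filename.startswith("_3gb-metajob"):
        raise ValueError("metajob file name must start with '_3gb-metajob'")
    queued = 0
    for line in content.splitlines():
        tokens = line.split()
        # A line whose first word is 'Queue' closes one job definition.
        if tokens and tokens[0] == "Queue":
            queued += 1
    return queued


sample = """%Required 100%
%SuccessAt 100%
%Comment pools1.txt
Arguments = "-i 0 -n pools.txt -f 22 -p 723"
Input = pools.txt=http://mishra.lpds.sztaki.hu/edgidemo/download/dsp_inputs/pools1.txt
Queue
%Comment pools2.txt
Input = pools.txt=http://mishra.lpds.sztaki.hu/edgidemo/download/dsp_inputs/pools2.txt
Queue
"""

jobs = count_queued_jobs("_3gb-metajob-dsp-2", sample)
print(jobs)
```

For the downloadable “_3gb-metajob-dsp-100” file the same check would be expected to report 100 jobs.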
Step 10: submit, track the status and download the result
- Submit your workflow and track its status.
- Wait until it finishes (this may take some time, as you are running 100 jobs).
- You should see this when your workflow has completed successfully.
- Go to the details of your workflow instance and download its output.
Step 11: extract the output file to get the results of the individual runs
- Extracting:
    tar zxvf cost.txt
    ./outputs/<subjob id>/cost.txt
    …
    ./outputs/<subjob id>/cost.txt
- See the mapping between your individual job definitions and the job ids; each subjob id is the name of the directory storing the output files of your application for that subjob.
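
The per-subjob results can also be collected programmatically instead of unpacking the archive by hand. This sketch assumes that the downloaded file is a gzip-compressed tar archive laid out as outputs/<subjob id>/cost.txt, as the listing in Step 11 suggests. The function name is illustrative, and the demo archive with its two ids is fabricated purely so that the example runs on its own.

```python
import io
import os
import tarfile
import tempfile
from pathlib import PurePosixPath


def read_subjob_outputs(archive_path: str, output_name: str = "cost.txt") -> dict:
    """Map each subjob id to the text of its output file."""
    results = {}
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # PurePosixPath drops a leading './', so both spellings match.
            parts = PurePosixPath(member.name).parts
            if (member.isfile() and len(parts) == 3
                    and parts[0] == "outputs" and parts[2] == output_name):
                data = archive.extractfile(member).read()
                results[parts[1]] = data.decode("utf-8", errors="replace")
    return results


# Demo: build a tiny stand-in archive (made-up ids and contents).
demo_path = os.path.join(tempfile.mkdtemp(), "cost.txt")
with tarfile.open(demo_path, "w:gz") as archive:
    for subjob_id, body in [("id-a", b"1.5\n"), ("id-b", b"2.5\n")]:
        info = tarfile.TarInfo(name=f"./outputs/{subjob_id}/cost.txt")
        info.size = len(body)
        archive.addfile(info, io.BytesIO(body))

outputs = read_subjob_outputs(demo_path)
print(sorted(outputs))
```

Reading members directly, rather than extracting everything to disk, avoids writing files from an archive whose paths have not been inspected.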
GBAC job submission
Overview
Why?
- The majority of Desktop Grid resources are Windows-based (68.9%), whereas the majority of scientific applications run on Linux.
- Desktop Grids cannot run arbitrary applications, only those that were “deployed” beforehand.
How?
- A single “generic” application is deployed, which executes every “real” application (including yours) in a virtual machine (running Linux) on the Desktop Grid resources.
Overview
[Figure: EDGI portal, 3G Bridge, BOINC server (an EDGI DG site), and BOINC clients running GBAC with the application and inputs]
1. A user submits her application and inputs via the portal, which transfers the job to the 3G Bridge through gLite.
2. The 3G Bridge detects that the application is “legacy” (not BOINC native) and redirects the binaries and inputs to the GBAC native BOINC application at EDGeS@Home.
3. Clients (who have VirtualBox installed) download the BOINC native GBAC application together with the submitted application and its inputs.
4. GBAC starts a Linux virtual machine (using VirtualBox).
5. GBAC copies the application and inputs into the virtual machine.
6. The application is executed in the Linux VM.
7. The result is fetched from the VM by GBAC.
8. The VM is shut down and discarded.
9. GBAC finishes and the result is returned to EDGeS@Home from the client.
10. The results are returned to gLite through the 3G Bridge from EDGeS@Home.
Step 1: create a legacy application and its description file
- Let us create a simple script, which represents our legacy application.
- Save it as “myhelloworld.sh”.
- GBAC needs to know the name of the file that is to be launched on the resource.
- This file name must be given in an XML file with the format shown on the left.
- The XML file must be named “gbac_job.xml”.
- The executable name is the script saved previously: “myhelloworld.sh”.
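
The script itself is only shown as an image on the slide, so the body below is a hypothetical stand-in: a shell script that writes a greeting to a file called “output”, the name used for the output port later in this tutorial. The sketch creates the script and gives it a local dry run, which assumes a POSIX shell is available as sh. The format of gbac_job.xml is not reproduced in this text and is therefore not generated here.

```python
import os
import stat
import subprocess
import tempfile

# Hypothetical script body; the real one is only visible on the slide.
SCRIPT = """#!/bin/sh
echo "Hello World" > output
"""

workdir = tempfile.mkdtemp()
script_path = os.path.join(workdir, "myhelloworld.sh")
with open(script_path, "w") as handle:
    handle.write(SCRIPT)
# Mark the script as executable for its owner.
os.chmod(script_path, os.stat(script_path).st_mode | stat.S_IXUSR)

# Local dry run before uploading the script through the portal.
subprocess.run(["sh", script_path], cwd=workdir, check=True)
with open(os.path.join(workdir, "output")) as handle:
    result = handle.read()
print(result)
```

Checking locally that the script really produces a file named “output” helps, because the output port configured in the portal has to match that name.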
Step 2: create a workflow with a single job
- The job must be created with two input ports:
  - 0: for the gbac_job.xml file
  - 1: for our legacy application (script: “myhelloworld.sh”)
- The job must be created with as many output ports as the script generates output files, i.e. one (“output”).
- Create the workflow based on the graph; in this example it is named “GBAC-EDGIDEMO” (see on the left).
Step 3: configure the GBAC job
- Configure the job the same way as for single job submission.
- The only difference is the name of the job: it must be “gbac” (see on the left).
Step 4: configure input and output ports
- Set the name of port 0 to “gbac_job.xml” and upload the XML file you created in Step 1.
- Set the name of port 1 to “myhelloworld.sh” and upload the script you created in Step 1.
- Set the name of port 2 to “output”, as this is the name of the file your “myhelloworld.sh” generates.
Step 5: submit, track the status and download the result
- Submit your workflow and track its status.
- Wait until it finishes.
- You should see this when your workflow has completed successfully.
- Go to the details of your workflow instance and download its output.
Step 6: extract the output file and check its content
- After unzipping, a file named “output” must be among the extracted files. There are also some additional logs generated by the portal or GBAC.
- Opening the file “output”, you will see the result of the “myhelloworld.sh” script.
Thank you for your attention!
www.edgi-project.eu