
1 CLUSTER 2005 Tutorial Boston, 26th September 2005 Programming with GRID superscalar Rosa M. Badia Toni Cortes Pieter Bellens, Vasilis Dialinos, Jesús Labarta, Josep M. Pérez, Raül Sirvent

2 CLUSTER 2005 Tutorial Boston, 26th September 2005 Tutorial Detailed Description Introduction to GRID superscalar (55%) 9:00AM-10:30AM 1.GRID superscalar objective 2.Framework overview 3.A sample GRID superscalar code 4.Code generation: gsstubgen 5.Automatic configuration and compilation: Deployment center 6.Runtime library features Break 10:30-10:45am Programming with GRID superscalar (45%) 10:45AM-Noon 7.Users interface: The IDL file GRID superscalar API User resource constraints and performance cost Configuration files Use of checkpointing 8.Use of the Deployment center 9.Programming Examples

3 CLUSTER 2005 Tutorial Boston, 26th September 2005 Introduction to GRID superscalar 1.GRID superscalar objective 2.Framework overview 3.A sample GRID superscalar code 4.Code generation: gsstubgen 5.Automatic configuration and compilation: Deployment center 6.Runtime library features

4 CLUSTER 2005 Tutorial Boston, 26th September 2005 1. GRID superscalar Objective  Ease the programming of GRID applications  Basic idea: apply the superscalar processor model to the Grid, raising the granularity from ns (instructions) to seconds/minutes/hours (tasks)

5 CLUSTER 2005 Tutorial Boston, 26th September 2005 1. GRID superscalar Objective  Reduce the development complexity of Grid applications to the minimum –writing an application for a computational Grid may be as easy as writing a sequential application  Target applications: composed of tasks, most of them repetitive –Granularity of the tasks at the level of simulations or programs –Data objects are files

6 CLUSTER 2005 Tutorial Boston, 26th September 2005 2. Framework overview 1.Behavior description 2.Elements of the framework

7 CLUSTER 2005 Tutorial Boston, 26th September 2005 Input/output files 2.1 Behavior description for (int i = 0; i < MAXITER; i++) { newBWd = GenerateRandom(); subst (referenceCFG, newBWd, newCFG); dimemas (newCFG, traceFile, DimemasOUT); post (newBWd, DimemasOUT, FinalOUT); if(i % 3 == 0) Display(FinalOUT); } fd = GS_Open(FinalOUT, R); printf("Results file:\n"); present (fd); GS_Close(fd);

8 CLUSTER 2005 Tutorial Boston, 26th September 2005 2.1 Behavior description [Figure: the main program is turned into a task graph (repeated Subst, DIMEMAS, EXTRACT chains, Display tasks and a final GS_open) executed on the CIRI Grid]

9 CLUSTER 2005 Tutorial Boston, 26th September 2005 2.1 Behavior description [Figure: the same task graph at a later point of the execution on the CIRI Grid]

10 CLUSTER 2005 Tutorial Boston, 26th September 2005 2.2 Elements of the framework 1.Users interface 2.Code generation: gsstubgen 3.Deployment center 4.Runtime library

11 CLUSTER 2005 Tutorial Boston, 26th September 2005 2.2 Elements of the framework 1.Users interface –Assembly language for the GRID Well defined operations and operands Simple sequential programming on top of it (C/C++, Perl, …) –Three components: Main program Subroutines/functions Interface Definition Language (IDL) file

12 CLUSTER 2005 Tutorial Boston, 26th September 2005 2.2 Elements of the framework 2. Code generation: gsstubgen –Generates the code necessary to build a Grid application from a sequential application Function stubs (master side) Main program (worker side)

13 CLUSTER 2005 Tutorial Boston, 26th September 2005 2.2 Elements of the framework 3. Deployment center –Designed to help the user with Grid configuration setting and with the deployment of applications in local and remote servers

14 CLUSTER 2005 Tutorial Boston, 26th September 2005 2.2 Elements of the framework 4. Runtime library –Transparent access to the Grid –Automatic parallelization between operations at run-time Uses architectural concepts from microprocessor design Instruction window (DAG), Dependence analysis, scheduling, locality, renaming, forwarding, prediction, speculation,…

15 CLUSTER 2005 Tutorial Boston, 26th September 2005  Three components: –Main program –Subroutines/functions –Interface Definition Language (IDL) file  Programming languages: –C/C++, Perl –Prototype version for Java and shell script 3. A sample GRID superscalar code

16 CLUSTER 2005 Tutorial Boston, 26th September 2005  Main program: A Typical sequential program for (int i = 0; i < MAXITER; i++) { newBWd = GenerateRandom(); subst (referenceCFG, newBWd, newCFG); dimemas (newCFG, traceFile, DimemasOUT); post (newBWd, DimemasOUT, FinalOUT); if(i % 3 == 0) Display(FinalOUT); } fd = GS_Open(FinalOUT, R); printf("Results file:\n"); present (fd); GS_Close(fd); 3. A sample GRID superscalar code

17 CLUSTER 2005 Tutorial Boston, 26th September 2005 3. A sample GRID superscalar code void dimemas(in File newCFG, in File traceFile, out File DimemasOUT) { char command[500]; putenv("DIMEMAS_HOME=/usr/local/cepba-tools"); sprintf(command, "/usr/local/cepba-tools/bin/Dimemas -o %s %s", DimemasOUT, newCFG ); GS_System(command); }  Subroutines/functions void display(in File toplot) { char command[500]; sprintf(command, "./display.sh %s", toplot); GS_System(command); }

18 CLUSTER 2005 Tutorial Boston, 26th September 2005 interface MC { void subst(in File referenceCFG, in double newBW, out File newCFG); void dimemas(in File newCFG, in File traceFile, out File DimemasOUT); void post(in File newCFG, in File DimemasOUT, inout File FinalOUT); void display(in File toplot); };  Interface Definition Language (IDL) file –CORBA-IDL like interface: In/Out/InOut files Scalar values (in or out) –The subroutines/functions listed in this file will be executed in a remote server in the Grid. 3. A sample GRID superscalar code

19 CLUSTER 2005 Tutorial Boston, 26th September 2005 3. A sample GRID superscalar code  GRID superscalar programming requirements –Main program (master side): Begin/finish with calls to GS_On, GS_Off Open/close files with: GS_FOpen, GS_Open, GS_FClose, GS_Close Possibility of explicit synchronization: GS_Barrier Possibility of declaration of speculative areas: GS_Speculative_End(func) –Subroutines/functions (worker side): Temporary files in the local directory, or ensure uniqueness of names per subroutine invocation GS_System instead of system All input/output files required must be passed as arguments Possibility of throwing exceptions: GS_Throw
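As a minimal master-side sketch of these requirements (the filenames are hypothetical; subst is the IDL task from the sample code above):

#include "app.h"   /* header generated by gsstubgen from app.idl */

int main(int argc, char **argv)
{
    FILE *fp;
    GS_On();                                  /* initialize the runtime before any task call */
    subst("reference.cfg", 2.5, "new.cfg");   /* task from the IDL file, executed on the Grid */
    GS_Barrier();                             /* optional explicit synchronization */
    fp = GS_FOpen("new.cfg", R);              /* files in the main program go through the GS API */
    /* ... read the result ... */
    GS_FClose(fp);
    GS_Off(0);                                /* waits for pending tasks and finalizes */
    return 0;
}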

20 CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen [Diagram: gsstubgen reads app.idl and generates the client-side files (app.h, app-stubs.c, app_constraints.cc, app_constraints_wrapper.cc, app_constraints.h) and the server-side file app-worker.c; app.c and app-functions.c are provided by the user]

21 CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen
app-stubs.c – IDL function stubs
app.h – IDL function headers
app_constraints.cc – user resource constraints and performance cost
app_constraints.h, app_constraints_wrapper.cc
app-worker.c – main program for the worker side (calls to user functions)

22 CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen #include … int gs_result; void Subst(file referenceCFG, double seed, file newCFG) { /* Marshalling/Demarshalling buffers */ char *buff_seed; /* Allocate buffers */ buff_seed = (char *)malloc(atoi(getenv("GS_GENLENGTH"))+1); /* Parameter marshalling */ sprintf(buff_seed, "%.20g", seed); Execute(SubstOp, 1, 1, 1, 0, referenceCFG, buff_seed, newCFG); /* Deallocate buffers */ free(buff_seed); } … Sample stubs file

23 CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen #include … int main(int argc, char **argv) { enum operationCode opCod = (enum operationCode)atoi(argv[2]); IniWorker(argc, argv); switch(opCod) { case SubstOp: { double seed; seed = strtod(argv[4], NULL); Subst(argv[3], seed, argv[5]); } break; … } EndWorker(gs_result, argc, argv); return 0; } Sample worker main file

24 CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen #include "mcarlo_constraints.h" #include "user_provided_functions.h" string Subst_constraints(file referenceCFG, double seed, file newCFG) { string constraints = ""; return constraints; } double Subst_cost(file referenceCFG, double seed, file newCFG) { return 1.0; } … Sample constraints skeleton file

25 CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen #include … typedef ClassAd (*constraints_wrapper) (char **_parameters); typedef double (*cost_wrapper) (char **_parameters); // Prototypes ClassAd Subst_constraints_wrapper(char **_parameters); double Subst_cost_wrapper(char **_parameters); … // Function tables constraints_wrapper constraints_functions[4] = { Subst_constraints_wrapper, … }; cost_wrapper cost_functions[4] = { Subst_cost_wrapper, … }; Sample constraints wrapper file (1)

26 CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen ClassAd Subst_constraints_wrapper(char **_parameters) { char **_argp; // Generic buffers char *buff_referenceCFG; char *buff_seed; // Real parameters char *referenceCFG; double seed; // Read parameters _argp = _parameters; buff_referenceCFG = *(_argp++); buff_seed = *(_argp++); //Datatype conversion referenceCFG = buff_referenceCFG; seed = strtod(buff_seed, NULL); string _constraints = Subst_constraints(referenceCFG, seed); ClassAd _ad; ClassAdParser _parser; _ad.Insert("Requirements", _parser.ParseExpression(_constraints)); // Free buffers return _ad; } Sample constraints wrapper file (2)

27 CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen double Subst_cost_wrapper(char **_parameters) { char **_argp; // Generic buffers char *buff_referenceCFG; char *buff_seed; // Real parameters char *referenceCFG; double seed; // Allocate buffers // Read parameters _argp = _parameters; buff_referenceCFG = *(_argp++); buff_seed = *(_argp++); //Datatype conversion referenceCFG = buff_referenceCFG; seed = strtod(buff_seed, NULL); double _cost = Subst_cost(referenceCFG, seed); // Free buffers return _cost; } … Sample constraints wrapper file (3)

28 CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen Binary building [Diagram: on the client, app.c, app-stubs.c, app_constraints.cc and app_constraints_wrapper.cc are linked with the GRID superscalar runtime and GT2; on each server i, app-worker.c and app-functions.c are built and reached through the Globus services (gsiftp, gram)]

29 CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen Putting all together: involved files
User provided files: app.c, app-functions.c, app.idl
Files generated from IDL: app-stubs.c, app_constraints_wrapper.cc, app_constraints.cc, app-worker.c, app.h, app_constraints.h
Files generated by the deployer: (projectname).xml, config.xml

30 CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen GRID superscalar applications architecture

31 CLUSTER 2005 Tutorial Boston, 26th September 2005 5. Deployment center  Java based GUI. Allows: –Specification of grid computation resources: host details, libraries location… –Allows selection of Grid configuration  Grid configuration checking process: –Aliveness of host (ping) –Globus service is checked by submitting a simple test –Sends a remote job that copies the code needed in the worker, and compiles it  Automatic deployment –sends and compiles code in the remote workers and the master  Configuration files generation

32 CLUSTER 2005 Tutorial Boston, 26th September 2005 5. Deployment center Automatic deployment

33 CLUSTER 2005 Tutorial Boston, 26th September 2005 6. Runtime library features  Initial prototype over Condor and MW  Current version over Globus 2.4, Globus 4.0, ssh/scp, Ninf-G2  File transfer, security and other features provided by the middleware (Globus, …)

34 CLUSTER 2005 Tutorial Boston, 26th September 2005 6. Runtime library features 1.Data dependence analysis 2.File Renaming 3.Task scheduling 4.Resource brokering 5.Shared disks management and file transfer policy 6.Scalar results collection 7.Checkpointing at task level 8.API functions implementation

35 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.1 Data-dependence analysis  Data dependence analysis –Detects RaW, WaR, WaW dependencies based on file parameters  Oriented to simulations, FET solvers, bioinformatics applications –Main parameters are data files  Tasks' Directed Acyclic Graph is built based on these dependencies [Figure: DAG with Subst → DIMEMAS → EXTRACT chains and a Display task]
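As an illustration of how such dependencies can be detected from the in/out file parameters of two tasks (a sketch of the idea only, not the actual runtime code; the types and helpers are made up):

#include <string.h>

typedef struct { const char *in[4]; const char *out[4]; int nin, nout; } task_t;

static int uses(const char * const *v, int n, const char *f) {
    for (int i = 0; i < n; i++)
        if (strcmp(v[i], f) == 0) return 1;
    return 0;
}

/* dependence of a later task t2 on an earlier task t1 through file f */
static const char *dependence(const task_t *t1, const task_t *t2, const char *f) {
    if (uses(t1->out, t1->nout, f) && uses(t2->in,  t2->nin,  f)) return "RaW";
    if (uses(t1->in,  t1->nin,  f) && uses(t2->out, t2->nout, f)) return "WaR";
    if (uses(t1->out, t1->nout, f) && uses(t2->out, t2->nout, f)) return "WaW";
    return "none";
}

Only the RaW edges need to remain in the task graph; the WaR and WaW ones are removed by renaming, as the next slide shows.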

36 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.2 File renaming  WaW and WaR dependencies are avoidable with renaming While (!end_condition()) { T1 (…,…, "f1"); T2 ("f1", …, …); T3 (…,…,…); } [Figure: task instances T1_1, T2_1, T3_1, T1_2, T2_2, T3_2, …, T1_N; the WaR and WaW dependencies on "f1" disappear once each new version is renamed to "f1_1", "f1_2", …]

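A sketch of the renaming idea (illustrative only, not the runtime's implementation; the helper name is made up):

#include <stdio.h>

static int f1_version = 0;

/* every new writer of logical file "f1" gets a fresh physical name, so earlier */
/* readers and writers of "f1" are not disturbed (WaR and WaW edges disappear)  */
static void physical_name_for_write(const char *logical, char *physical, size_t len)
{
    snprintf(physical, len, "%s_%d", logical, ++f1_version);   /* f1 -> f1_1, f1_2, ... */
}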
37 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.3 Task scheduling  Distributed between the Execute call, the callback function and the GS_Barrier call  Possibilities –The task can be submitted immediately after being created –Task waiting for resource –Task waiting for data dependency  Task submission composed of –File transfer –Task submission –All specified in Globus RSL (for Globus case)

38 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.3 Task scheduling  Temporary directory created in the server working directory for each task  Calls to Globus: –globus_gram_client_job_request –globus_gram_client_callback_allow –globus_poll_blocking  End of task notification: asynchronous state-change callback monitoring system –globus_gram_client_callback_allow() –callback_func function  Data structures update in Execute function, GRID superscalar primitives and GS_Barrier  GS_Barrier primitive before ending the program that waits for all tasks (performed inside GS_Off)

39 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.4 Resource brokering  When a task is ready for execution, the scheduler tries to allocate a resource  Broker receives a request –The ClassAd library is used to match resource ClassAds with task ClassAds –If more than one resource fulfils the constraints, the resource r which minimizes FT(t,r) + ET(t,r) is selected: FT = file transfer time to resource r ET = execution time of task t on resource r (using the user-provided cost function)
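A sketch of that selection step (hypothetical types and names; the real broker works on ClassAds):

typedef struct { double FT; double ET; int fulfils_constraints; } candidate_t;

/* returns the index of the resource minimizing FT + ET, or -1 if none matches */
static int select_resource(const candidate_t *r, int n)
{
    int best = -1;
    double best_cost = 0.0;
    for (int i = 0; i < n; i++) {
        if (!r[i].fulfils_constraints) continue;
        double cost = r[i].FT + r[i].ET;
        if (best < 0 || cost < best_cost) { best = i; best_cost = cost; }
    }
    return best;
}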

40 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.5 Shared disks management and file transfer policy [Figure: file transfer policy with per-server working directories: T1 runs on server 1 with files f1 and f4, T2 on server 2 with f4 and f7; f1 is sent from the client, the intermediate f4 (temp.) is transferred between servers, and the result f7 returns to the client]

41 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.5 Shared disks management and file transfer policy [Figure: shared working directories (NFS): the client, server 1 and server 2 see the same working directory, so T1 and T2 access f1, f4 and f7 without explicit transfers]

42 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.5 Shared disks management and file transfer policy [Figure: shared input disks: input directories visible from the client and both servers]

43 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.6 Scalar results collection  Collection of output parameters which are not files –Main code cannot continue until the scalar result value is obtained Partial barrier synchronization … grid_task_1 ("file1.txt", "file2.cfg", var_x); if (var_x>10){ …  Socket and file mechanisms provided (var_x is the output variable)

44 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.7 Task level checkpointing  Inter-task checkpointing  Recovers sequential consistency in the out-of-order execution of tasks [Figure: tasks 0-6 on a timeline, marked as completed, running or committed; successful execution]

45 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.7 Task level checkpointing  Inter-task checkpointing  Recovers sequential consistency in the out-of-order execution of tasks [Figure: tasks 0-6, failing execution: one task fails, running tasks are cancelled, and only the tasks finished correctly up to the committed point are kept]

46 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.7 Task level checkpointing  Inter-task checkpointing  Recovers sequential consistency in the out-of-order execution of tasks [Figure: tasks 0-6, restart execution: committed tasks are skipped, the failing task is re-executed and execution continues normally]

47 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.7 Task level checkpointing  On fail: from N versions of a file to one version (last committed version)  Transparent to application developer

48 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.8 API functions implementation –Master side GS_On GS_Off GS_Barrier GS_FOpen GS_FClose GS_Open GS_Close GS_Speculative_End(func) –Worker side GS_System GS_Throw

49 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.8 API functions implementation  Implicit task synchronization – GS_Barrier –Inserted in the user main program when required –Main program execution is blocked –globus_poll_blocking() called –Once all tasks are finished the program may resume

50 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.8 API functions implementation  GRID superscalar file management API primitives: –GS_FOpen –GS_FClose –GS_Open –GS_Close  Mandatory for file management operations in the main program  Opening a file with write option –Data dependence analysis –Renaming is applied  Opening a file with read option –Partial barrier until the task that is generating that file as output finishes  Internally, file management functions are handled as local tasks –Task node inserted –Data-dependence analysis –Function locally executed  Future work: offer a C library with GS semantics (source code with typical calls could be used)
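A short sketch of these semantics from the main program's point of view (the file name is hypothetical; the fragment sits between GS_On() and GS_Off()):

FILE *fp;

fp = GS_FOpen("final.out", W);   /* write: handled as a local task; dependence */
                                 /* analysis and renaming are applied          */
fprintf(fp, "header\n");
GS_FClose(fp);

fp = GS_FOpen("final.out", R);   /* read: partial barrier until the task that  */
                                 /* generates final.out as output has finished */
/* ... read the contents ... */
GS_FClose(fp);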

51 CLUSTER 2005 Tutorial Boston, 26th September 2005 6.8 API functions implementation  GS_Speculative_End(func) / GS_Throw  Any worker can call GS_Throw at any moment  The task that raises the GS_Throw is the last valid task (all sequential tasks after that must be undone)  The speculative part is considered from the task that throws the exception until the GS_Speculative_End  Possibility of calling a local function when the exception is detected.

52 CLUSTER 2005 Tutorial Boston, 26th September 2005 6. Runtime library features app.c LocalHost app-functions.c Call sequence without GRID superscalar

53 CLUSTER 2005 Tutorial Boston, 26th September 2005 6. Runtime library features app.c app-stubs.c GRID superscalar runtime app_constraints_wrapper.cc app_constraints.cc GT2 LocalHost RemoteHost app-functions.c app-worker.c Call sequence with GRID superscalar

54 CLUSTER 2005 Tutorial Boston, 26th September 2005 Tutorial Detailed Description Introduction to GRID superscalar (55%) 9:00AM-10:30AM 1.GRID superscalar objective 2.Framework overview 3.A sample GRID superscalar code 4.Code generation: gsstubgen 5.Automatic configuration and compilation: Deployment center 6.Runtime library features Break 10:30-10:45am Programming with GRID superscalar (45%) 10:45AM-Noon 7.Users interface: The IDL file GRID superscalar API User resource constraints and performance cost Configuration files Use of checkpointing 8.Use of the Deployment center 9.Programming Examples

55 CLUSTER 2005 Tutorial Boston, 26th September 2005 7. Users interface 1.The IDL file 2.GRID superscalar API 3.User resource constraints and performance cost 4.Configuration files 5.Use of checkpointing

56 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.1 The IDL file  GRID superscalar uses a simplified interface definition language based on the CORBA IDL standard  The IDL file describes the headers of the functions that will be executed on the GRID interface MYAPPL { void myfunction1(in File file1, in scalar_type scalar1, out File file2); void myfunction2(in File file1, in File file2, out scalar_type scalar1); void myfunction3(inout scalar_type scalar1, inout File file1); };  Requirement –All functions must be void  All parameters defined as in, out or inout.  Types supported –filenames: special type –integers, floating point, booleans, characters and strings  Scalar_type can be: –short, int, long, float, double, boolean, char, and string

57 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.1 The IDL file

58 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.1 The IDL file  Example: –Initial call void subst (char *referenceCFG, double newBWd, char *newCFG); –IDL interface void subst (in File referenceCFG, in double newBWd, out File newCFG); (referenceCFG: input filename, newBWd: input parameter, newCFG: output filename)

59 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.1 The IDL file  Example: –Initial call: void subst (char *referenceCFG, double newBWd, int *outval); –IDL interface void subst (in File referenceCFG, in double newBWd, out int outval); –Although the output parameter type changes in the IDL file, no changes are required in the function code (referenceCFG: input filename, newBWd: input parameter, outval: output integer)

60 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API  Master side –GS_On –GS_Off –GS_Barrier –GS_FOpen –GS_FClose –GS_Open –GS_Close –GS_Speculative_End(func)  Worker side –GS_Throw –GS_System

61 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API  Initialization and finalization void GS_On(); void GS_Off(int code); –Mandatory –GS_On(): at the beginning of main program code or at least before any call to functions listed in the IDL file (task call) Initializations –GS_Off (code): at the end of main program code or at least after any task call Finalizations Waits for all pending tasks Code: 0: normal end -1: error. Will store necessary checkpoint information to enable later restart.
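A minimal sketch of how the two calls frame the main program (the error test is a placeholder):

int main(int argc, char **argv)
{
    GS_On();                     /* before any task call */
    /* ... calls to the tasks listed in the IDL file ... */
    if (0 /* some unrecoverable error detected */) {
        GS_Off(-1);              /* error end: checkpoint information is kept for a later restart */
        return 1;
    }
    GS_Off(0);                   /* normal end: waits for all pending tasks */
    return 0;
}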

62 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API  Synchronization void GS_Barrier(); –Can be called at any point of the main program code –Waits until all previously called tasks have finished –Can reduce performance!
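For example (a sketch; cfg[i] and out[i] are hypothetical file-name arrays and dimemas is the IDL task used earlier):

for (i = 0; i < N; i++)
    dimemas(cfg[i], traceFile, out[i]);   /* tasks may run concurrently on the Grid */
GS_Barrier();                             /* blocks until all N tasks have finished */
/* everything submitted before this point is now complete */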

63 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API  File primitives: FILE * GS_FOpen (char *filename, int mode); int GS_FClose(FILE *f); int GS_Open(char *pathname, int mode); int GS_Close(int fd);  Modes: R (reading), W (writing) and A (append)  Examples: FILE * fp; char STRAUX[20]; … fp = GS_FOpen("myfile.ext", R); // Read something fscanf( fp, "%s", STRAUX); GS_FClose (fp);

64 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API  Examples: int filedesc; … filedesc = GS_Open("myfile.ext", W); // Write something write(filedesc, "abc", 3); GS_Close (filedesc);  Behavior: –At user level, the same as fopen or open (or fclose and close)  Where to use it: –In the main program (not required in worker code)  When to use it: –Always when opening/closing files between GS_On() and GS_Off() calls

65 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API  Exception handling void GS_Speculative_End(void (*fptr)()); GS_Throw –Enables exception handling from task functions to the main program –A speculative area can be defined in the main program which is not executed when an exception is thrown –The user can provide a function that is executed in the main program when an exception is raised

66 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API Main program code: while (j<MAX_ITERS){ getRanges(Lini, BWini, &Lmin, &Lmax, &BWmin, &BWmax); for (i=0; i<ITERS; i++){ L[i] = gen_rand(Lmin, Lmax); BW[i] = gen_rand(BWmin, BWmax); Filter("nsend.cfg", L[i], BW[i], "tmp.cfg"); Dimemas("tmp.cfg", "nsend_rec_nosm.trf", Elapsed_goal, "dim_out.txt"); Extract("tmp.cfg", "dim_out.txt", "final_result.txt"); } getNewIniRange("final_result.txt", &Lini, &BWini); j++; } GS_Speculative_End(my_func); Task code: void Dimemas(char * cfgFile, char * traceFile, double goal, char * DimemasOUT) { … putenv("DIMEMAS_HOME=/aplic/DIMEMAS"); sprintf(aux, "/aplic/DIMEMAS/bin/Dimemas -o %s %s", DimemasOUT, cfgFile); gs_result = GS_System(aux); distance_to_goal = distance(get_time(DimemasOUT), goal); if (distance_to_goal < goal*0.1) { printf("Goal Reached!!! Throwing exception.\n"); GS_Throw; } } (my_func is the function executed in the main program when an exception is thrown)

67 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API  Wrapping legacy code in tasks’ code int GS_System(char *command); –At user level has the same behaviour as a system() call. void dimemas(in File newCFG, in File traceFile, out File DimemasOUT) { char command[500]; putenv("DIMEMAS_HOME=/usr/local/cepba-tools"); sprintf(command, "/usr/local/cepba-tools/bin/Dimemas -o %s %s", DimemasOUT, newCFG ); GS_System(command); }

68 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost  For each task specified in the IDL file the user can provide: –A resource constraints function –A performance cost modelling function  Resource constraints function: –Specifies constraints on the resources that can be used to execute the given task

69 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost Attributes currently supported:
Attribute | Description | Type
OpSys | Operating system | String
Mem | Physical memory (MB) | Integer
QueueName | Name of the queue | String
MachineName | Name of the machine | String
NetKbps | Available bandwidth | Double
Arch | Processor architecture | String
NumWorkers | Number of simultaneous jobs that can be run | Integer
GFlops | Processor performance (GF), theoretical or effective | Double
NCPUs | Number of CPUs of the machine | Integer
SoftNameList | List of software available in the machine | ClassAd list of strings

70 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost  Resource constraints specification –A function interface is generated for each IDL task by gsstubgen in file {appname}_constraints.cc –The name of the function is {task_name}_constraints –The function initially generated is a default function (always evaluates true)

71 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost  Example: –IDL file (mc.idl) interface MC { void subst(in File referenceCFG, in double newBW, out File newCFG); void dimemas(in File newCFG, in File traceFile, out File DimemasOUT); void post(in File newCFG, in File DimemasOUT, inout File FinalOUT); void display(in File toplot); }; –Generated function in mc_constraints.cc string Subst_constraints(file referenceCFG, double seed) { return "true"; }

72 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost  Resource constraints specification (ClassAds strings) string Subst_constraints(file referenceCFG, double seed) { return "(other.Arch == \"powerpc\")"; }
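Constraints can also combine several attributes from the table above, for instance (the concrete values are just an example):

string Subst_constraints(file referenceCFG, double seed)
{
    /* require Linux, at least 1 GB of memory and a reasonably fast link */
    return "(other.OpSys == \"Linux\") && (other.Mem >= 1024) && (other.NetKbps >= 100000)";
}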

73 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost  Performance cost modelling function –Specifies a model of the performance cost of the given task –As with the resource constraint function, a default function is generated that always returns “1” double Subst_cost(file referenceCFG, double newBW) { return 1.0; }

74 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost  Built-in functions: int GS_Filesize(char *name); double GS_GFlops();  Example double Subst_cost(file referenceCFG, double newBW) { double time; time = (GS_Filesize(referenceCFG)/1000000) * GS_GFlops(); return(time); }

75 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files  Two configuration files: –Grid configuration file $HOME/.gridsuperscalar/config.xml –Project configuration file {project_name}.gsdeploy  Both are xml files  Both generated by deployment center

76 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files  Grid configuration file –Saved automatically by the deployment center  Contains information about –Available resources in the Grid (server hosts) –Characteristics of the resource Processor architecture Operating system Processor performance Number of CPUs Memory available Queues available …

77 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files  Grid configuration file

78 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files <worker fqdn="kadesh.cepba.upc.es" globusLocation="/usr/gt2" gssLocation="/aplic/GRID-S/HEAD" minPort="20340" maxPort="20460" bandwidth="1250000" architecture="Power3" operatingSystem="AIX" gFlops="1.5" memorySize="512" cpuCount="16"> …

79 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files  Project configuration file –Generated by the deployment center to save all the information required to run a given application  Contains information about –Resources selected for execution –Concrete characteristics selected by the user Queues Number of concurrent tasks in each server … –Project information Location of binaries in localhost Location of binaries in servers …

80 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files  Resources information......

81 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files Shared disks information:...

82 CLUSTER 2005 Tutorial Boston, 26th September 2005 7.5 Use of checkpointing  When running a GRID superscalar application, information for the checkpointing is automatically stored in file “.tasks.chk”  The checkpointing file simply lists the tasks that have finished  To recover: just restart application as usual  To start again from the beginning: erase “.tasks.chk” file

83 CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center  Initialization of the deployment center

84 CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center  Adding a new host in the Grid configuration

85 CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center  Create a new project

86 CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center  Selection of hosts for a project

87 CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center  Deployment of main program (master) in localhost

88 CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center  Execution of application after deployment

89 CLUSTER 2005 Tutorial Boston, 26th September 2005 9. Programming examples 1.Programming experiences –GHyper: computation of molecular potential energy hypersurfaces –fastDNAml: likelihood of phylogenetic trees 2.Simple programming examples  Matrix multiply  Mean calculation  Performance modelling

90 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences  GHyper –A problem of great interest in the physical-molecular field is the evaluation of molecular potential energy hypersurfaces –Previous approach: Hire a student Execute sequentially a set of N evaluations –Implemented with GRID superscalar as a simple program Iterative structure (simple for loop) Concurrency automatically exploited and run in the Grid with GRID superscalar

91 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences  GHyper –Application: computation of molecular potential energy hypersurfaces –Run 1 Total execution time: 17 hours Number of executed tasks: 1120 Each task between 45 and 65 minutes BSC (Barcelona): 28 processors IBM Power4 UCLM (Ciudad Real): 11+11 processors AMD + Pentium IV Univ. de Puebla (Mexico) 14 processors AMD64

92 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences  Run 2  Total execution time: 31 hours  Number of executed tasks: 1120 BSC (Barcelona): 8 processors IBM Power4 UCLM (Ciudad Real): 8+8 processors AMD + Pentium IV Univ. de Puebla (Mexico) 8 processors AMD64

93 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences Two-dimensional potential energy hypersurface for acetone as a function of the 1 and 2 angles

94 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences  fastDNAml  Starting point: code from Olsen et al. –Sequential code –Biological application that evaluates maximum likelihood phylogenetic inference  MPI version by Indiana University –Used with PACX-MPI for the HPC-Challenge contest at SC2003

95 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences Structure of the sequential application: –Iterative algorithm that builds the solution incrementally Solution: maximum likelihood phylogenetic tree –In each iteration, the tree is extended by adding a new taxon –In each iteration 2i-5 possible trees are evaluated –Additionally, each iteration performs a local rearrangement phase with 2i-6 additional evaluations –In the sequential algorithm, although these evaluations are independent, they are performed sequentially

96 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences  GRID superscalar implementation –Selection of IDL tasks: Evaluation function: –Tree information stored in TreeFile before calling GSevaluate –GS_FOpen and GS_FClose used for files related to the evaluation –Automatic parallelization is achieved interface GSFASTDNAML { void GSevaluate(in File InputData, in File TreeFile, out File EvaluatedTreeFile); };
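A sketch of how the master side can drive this task in iteration i (helper names such as write_tree_file and the file-name arrays are hypothetical; the real code follows the fastDNAml algorithm):

for (t = 0; t < num_candidate_trees; t++) {                    /* 2i-5 candidates in iteration i */
    write_tree_file(candidate[t], tree_file[t]);               /* local helper: store tree info  */
    GSevaluate(input_data, tree_file[t], evaluated_file[t]);   /* Grid task from the IDL file    */
}
/* reading the evaluation results (e.g. through GS_FOpen) acts as the barrier */
/* before the best tree is selected and the next taxon is added              */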

97 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences  Task graph automatically generated by the GRID superscalar runtime: [Figure: tree evaluation tasks of iterations i-1, i and i+1, separated by barriers]

98 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences  With some data set, evaluation is a fast task  Optimization 1: tree clustering –Several tree evaluations are grouped into a single evaluation task –Reduces task initialization and Globus overhead –Reduces parallelism  Optimization 2: local executions –Initial executions are executed locally  Both optimizations are combined

99 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences Policies –DEFAULT The number of evaluations grows with the iterations. All evaluations have the same number of trees (MAX_PACKET_SIZE) –UNFAIR Same as DEFAULT, but with a maximum of NUM_WORKER evaluations –FAIR Each iteration has the same fixed number of evaluations (NUM_WORKER)

100 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences  Heterogeneous computational Grid: –IBM machine with 816 Nighthawk Power3 processors and 94 p630 Power4 processors (Kadesh) –IBM xSeries 250 with 4 Intel Pentium III processors (Khafre) –Bull NovaScale 5160 with 8 Itanium2 processors (Kharga) –Parsytec CCi-8D with 16 Pentium II processors (Kandake) –some of the authors' laptops  For some of the machines the production queues were used  All machines located in Barcelona

101 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences
KD | KG | KF | LT | Policy | Elapsed time
5 | 2 | 4 | 0 | UNFAIR | 33000 s
8 | 0 | 0 | 1 | UNFAIR | 15978 s
8 | 2 | 0 | 1 | UNFAIR | 16520 s
8 | 2 | 2 | 0 | DEFAULT | 24216 s
8 | 2 | 0 | 0 | FAIR | 14235 s
 All results for the HPC-Challenge data set  PACX-MPI gets better performance with similar configurations (less than 10800 s)  However, it is using all the resources during all the execution time!

102 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences  An SSH/SCP GRID superscalar version has been developed  Especially interesting for large clusters  New heterogeneous computational Grid configuration: –Machines from previous results –Machines from a site in Madrid, basically Pentium III and Pentium IV based

103 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences
upc.edu | rediris.es | ucm.es | Elapsed time
2 | 1 | 7 | 6605 s
1 | 1 | 7 | 7129 s
 Even using a computational Grid spread over larger distances, the performance is doubled  PACX-MPI version elapsed time: 9240 s (second configuration case)

104 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Matrix multiply: A × B = C, with 3×3 hypermatrices A[i][j], B[i][j], C[i][j] (i, j = 0..2) Hypermatrices: each element of the matrix is a matrix Each internal matrix stored in a file

105 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples  Program structure: –Inner product: C[i][j] += A[i][k]·B[k][j], accumulated over k –Each A[i][k]·B[k][j] is a matrix multiplication itself: Encapsulated in a function matmul(A_i_k, B_k_j, C_i_j);

106 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Sequential code: main program int main(int argc, char **argv) { char *f1, *f2, *f3; int i, j, k; IniMatrixes(); for (i = 0; i < MSIZE; i++) { for (j = 0; j < MSIZE; j++) { for (k = 0; k < MSIZE; k++) { f1 = getfilename("A", i, k); f2 = getfilename("B", k, j); f3 = getfilename("C", i, j); matmul(f1, f2, f3); } } } return 0; }

107 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Sequential code: matmul function code void matmul(char *f1, char *f2, char *f3){ block *A,*B,*C; A = get_block(f1, BSIZE, BSIZE); B = get_block(f2, BSIZE, BSIZE); C = get_block(f3, BSIZE, BSIZE); block_mul(A, B, C); put_block(C, f3); delete_block(A); delete_block(B); delete_block(C); } static block *block_mul(block *A, block *B, block *C) { int i, j, k; for (i = 0; i < A->rows; i++) { for (j = 0; j < B->cols; j++) { for (k = 0; k < A->cols; k++) { C->data[i][j] += A->data[i][k] * B->data[k][j]; } } } return C; }

108 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples GRID superscalar code: IDL file interface MATMUL { void matmul(in File f1, in File f2, inout File f3); }; GRID superscalar code: main program int main(int argc, char **argv){ char *f1, *f2, *f3; int i, j, k; GS_On(); IniMatrixes(); for (i = 0; i < MSIZE; i++) { for (j = 0; j < MSIZE; j++) { for (k = 0; k < MSIZE; k++) { f1 = getfilename("A", i, k); f2 = getfilename("B", k, j); f3 = getfilename("C", i, j); matmul(f1, f2, f3); } } } GS_Off(0); return 0; } NO CHANGES REQUIRED TO FUNCTIONS!

109 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples  Mean calculation –Simple iterative example executed LOOPS times –In each iteration a given number of random numbers are generated –The mean of the random numbers is calculated

110 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Sequential code: main program int main( int argc, char * argv[] ) { FILE *results_fp; int i, mn; for ( i = 0; i < LOOPS; i ++ ) { gen_random( "random.txt" ); mean( "random.txt", "results.txt" ); } results_fp = fopen( "results.txt", "r" ); for( i = 0; i < LOOPS; i ++ ) { fscanf( results_fp, "%d", &mn ); printf( "mean %i : %d\n", i, mn ); } fclose( results_fp ); return 0; }

111 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Sequential code: gen_random function code void gen_random( char * rnumber_file ) { FILE *rnumber_fp; int i; rnumber_fp = fopen(rnumber_file, "w"); for ( i = 0; i < MAX_RANDOM_NUMBERS; i++ ) { int r = 1 + (int) (RANDOM_RANGE*rand()/(RAND_MAX+1.0)); fprintf( rnumber_fp, "%d ", r ); } fclose(rnumber_fp); }

112 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Sequential code: mean function code void mean( char * rnumber_file, char * results_file ) { FILE *rnumber_fp, *results_fp; int r, sum = 0; div_t div_res; int i; rnumber_fp = fopen(rnumber_file, "r"); for ( i = 0; i < MAX_RANDOM_NUMBERS; i++ ) { fscanf( rnumber_fp, "%d ", &r ); sum += r; } fclose(rnumber_fp); results_fp = fopen(results_file, "a"); div_res = div( sum, MAX_RANDOM_NUMBERS ); fprintf( results_fp, "%d ", div_res.quot ); fclose(results_fp); }

113 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples GRID superscalar code: IDL file interface MEAN { void gen_random( out File rnumber_file ); void mean( in File rnumber_file, inout File results_file ); }; GRID superscalar code: main program int main( int argc, char * argv[] ) { FILE *results_fp; int i, mn; GS_On(); for ( i = 0; i < LOOPS; i ++ ) { gen_random( "random.txt" ); mean( "random.txt", "results.txt" ); } results_fp = GS_FOpen( "results.txt", R ); for( i = 0; i < LOOPS; i ++ ) { fscanf( results_fp, "%d", &mn ); printf( "mean %i : %d\n", i, mn ); } GS_FClose( results_fp ); GS_Off(0); } NO CHANGES REQUIRED TO FUNCTIONS!

114 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Performance modelling: IDL file interface MC { void subst(in File referenceCFG, in double newBW, out File newCFG); void dimemas(in File newCFG, in File traceFile, out File DimemasOUT); void post(in File newCFG, in File DimemasOUT, inout File FinalOUT); void display(in File toplot); };

115 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Performance modelling: main program GS_On(); for (int i = 0; i < MAXITER; i++) { newBWd = GenerateRandom(); subst (referenceCFG, newBWd, newCFG); dimemas (newCFG, traceFile, DimemasOUT); post (newBWd, DimemasOUT, FinalOUT); if(i % 3 == 0) Display(FinalOUT); } fd = GS_Open(FinalOUT, R); printf("Results file:\n"); present (fd); GS_Close(fd); GS_Off(0);

116 CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Performance modelling: dimemas function code void dimemas(in File newCFG, in File traceFile, out File DimemasOUT) { char command[500]; putenv("DIMEMAS_HOME=/usr/local/cepba-tools"); sprintf(command, "/usr/local/cepba-tools/bin/Dimemas -o %s %s", DimemasOUT, newCFG ); GS_System(command); }

117 CLUSTER 2005 Tutorial Boston, 26th September 2005 More information  GRID superscalar home page: http://www.cepba.upc.edu/grid  Rosa M. Badia, Jesús Labarta, Raül Sirvent, Josep M. Pérez, José M. Cela, Rogeli Grima, "Programming Grid Applications with GRID Superscalar", Journal of Grid Computing, Volume 1 (Number 2): 151-170 (2003).  Vasilis Dialinos, Rosa M. Badia, Raul Sirvent, Josep M. Perez and Jesus Labarta, "Implementing Phylogenetic Inference with GRID superscalar", Cluster Computing and Grid 2005 (CCGRID 2005), Cardiff, UK, 2005. grid-superscalar@ac.upc.edu

