1
CLUSTER 2005 Tutorial Boston, 26th September 2005 Programming with GRID superscalar Rosa M. Badia Toni Cortes Pieter Bellens, Vasilis Dialinos, Jesús Labarta, Josep M. Pérez, Raül Sirvent
2
CLUSTER 2005 Tutorial Boston, 26th September 2005 Tutorial Detailed Description Introduction to GRID superscalar (55%) 9:00AM-10:30AM 1.GRID superscalar objective 2.Framework overview 3.A sample GRID superscalar code 4.Code generation: gsstubgen 5.Automatic configuration and compilation: Deployment center 6.Runtime library features Break 10:30-10:45am Programming with GRID superscalar (45%) 10:45AM-Noon 7.Users interface: The IDL file GRID superscalar API User resource constraints and performance cost Configuration files Use of checkpointing 8.Use of the Deployment center 9.Programming Examples
3
CLUSTER 2005 Tutorial Boston, 26th September 2005 Introduction to GRID superscalar 1.GRID superscalar objective 2.Framework overview 3.A sample GRID superscalar code 4.Code generation: gsstubgen 5.Automatic configuration and compilation: Deployment center 6.Runtime library features
4
CLUSTER 2005 Tutorial Boston, 26th September 2005 1. GRID superscalar Objective Ease the programming of GRID applications Basic idea: apply the superscalar processor model (operations at the ns scale) to the Grid, where tasks take seconds, minutes or hours
5
CLUSTER 2005 Tutorial Boston, 26th September 2005 1. GRID superscalar Objective Reduce the development complexity of Grid applications to the minimum –writing an application for a computational Grid may be as easy as writing a sequential application Target applications: composed of tasks, most of them repetitive –Granularity of the tasks at the level of simulations or programs –Data objects are files
6
CLUSTER 2005 Tutorial Boston, 26th September 2005 2. Framework overview 1.Behavior description 2.Elements of the framework
7
CLUSTER 2005 Tutorial Boston, 26th September 2005 2.1 Behavior description Main program operating on input/output files: for (int i = 0; i < MAXITER; i++) { newBWd = GenerateRandom(); subst (referenceCFG, newBWd, newCFG); dimemas (newCFG, traceFile, DimemasOUT); post (newBWd, DimemasOUT, FinalOUT); if(i % 3 == 0) Display(FinalOUT); } fd = GS_Open(FinalOUT, R); printf("Results file:\n"); present (fd); GS_Close(fd);
8
CLUSTER 2005 Tutorial Boston, 26th September 2005 2.1 Behavior description [Figure: the main program unfolds into a graph of Subst, DIMEMAS and EXTRACT tasks, with periodic Display tasks and a final GS_open, to be executed on the CIRI Grid]
9
CLUSTER 2005 Tutorial Boston, 26th September 2005 2.1 Behavior description [Figure: the same Subst/DIMEMAS/EXTRACT task graph while its tasks are being executed concurrently on the CIRI Grid]
10
CLUSTER 2005 Tutorial Boston, 26th September 2005 2.2 Elements of the framework 1.Users interface 2.Code generation: gsstubgen 3.Deployment center 4.Runtime library
11
CLUSTER 2005 Tutorial Boston, 26th September 2005 2.2 Elements of the framework 1.Users interface –Assembly language for the GRID Well defined operations and operands Simple sequential programming on top of it (C/C++, Perl, …) –Three components: Main program Subroutines/functions Interface Definition Language (IDL) file
12
CLUSTER 2005 Tutorial Boston, 26th September 2005 2.2 Elements of the framework 2. Code generation: gsstubgen –Generates the code necessary to build a Grid application from a sequential application Function stubs (master side) Main program (worker side)
13
CLUSTER 2005 Tutorial Boston, 26th September 2005 2.2 Elements of the framework 3. Deployment center –Designed for helping user Grid configuration setting Deployment of applications in local and remote servers
14
CLUSTER 2005 Tutorial Boston, 26th September 2005 2.2 Elements of the framework 4. Runtime library –Transparent access to the Grid –Automatic parallelization between operations at run-time Uses architectural concepts from microprocessor design Instruction window (DAG), Dependence analysis, scheduling, locality, renaming, forwarding, prediction, speculation,…
15
CLUSTER 2005 Tutorial Boston, 26th September 2005 Three components: –Main program –Subroutines/functions –Interface Definition Language (IDL) file Programming languages: –C/C++, Perl –Prototype version for Java and shell script 3. A sample GRID superscalar code
16
CLUSTER 2005 Tutorial Boston, 26th September 2005 Main program: A Typical sequential program for (int i = 0; i < MAXITER; i++) { newBWd = GenerateRandom(); subst (referenceCFG, newBWd, newCFG); dimemas (newCFG, traceFile, DimemasOUT); post (newBWd, DimemasOUT, FinalOUT); if(i % 3 == 0) Display(FinalOUT); } fd = GS_Open(FinalOUT, R); printf("Results file:\n"); present (fd); GS_Close(fd); 3. A sample GRID superscalar code
17
CLUSTER 2005 Tutorial Boston, 26th September 2005 3. A sample GRID superscalar code void dimemas(in File newCFG, in File traceFile, out File DimemasOUT) { char command[500]; putenv("DIMEMAS_HOME=/usr/local/cepba-tools"); sprintf(command, "/usr/local/cepba-tools/bin/Dimemas -o %s %s", DimemasOUT, newCFG ); GS_System(command); } Subroutines/functions void display(in File toplot) { char command[500]; sprintf(command, "./display.sh %s", toplot); GS_System(command); }
18
CLUSTER 2005 Tutorial Boston, 26th September 2005 interface MC { void subst(in File referenceCFG, in double newBW, out File newCFG); void dimemas(in File newCFG, in File traceFile, out File DimemasOUT); void post(in File newCFG, in File DimemasOUT, inout File FinalOUT); void display(in File toplot); }; Interface Definition Language (IDL) file –CORBA IDL-like interface: In/Out/InOut files Scalar values (in or out) –The subroutines/functions listed in this file will be executed in a remote server in the Grid. 3. A sample GRID superscalar code
19
CLUSTER 2005 Tutorial Boston, 26th September 2005 3. A sample GRID superscalar code GRID superscalar programming requirements –Main program (master side): Begin/finish with calls to GS_On, GS_Off Open/close files with: GS_FOpen, GS_Open, GS_FClose, GS_Close Possibility of explicit synchronization: GS_Barrier Possibility of declaration of speculative areas: GS_Speculative_End(func) –Subroutines/functions (worker side): Temporary files on the local directory, or ensure uniqueness of names per subroutine invocation GS_System instead of system All input/output files required must be passed as arguments Possibility of throwing exceptions: GS_Throw
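These requirements can be collected in a minimal master-side skeleton. The sketch below is not tutorial code: the task process() and the file names are hypothetical, and it assumes the application header generated by gsstubgen for an application called app.

/* Minimal master-side skeleton; process() is a hypothetical task declared in app.idl */
#include <stdio.h>
#include "app.h"                            /* assumed: header generated by gsstubgen  */

int main(int argc, char **argv)
{
    FILE *fp;
    int i;

    GS_On();                                /* start the runtime (mandatory)           */

    for (i = 0; i < 10; i++)
        process("input.dat", "output.dat"); /* IDL task: tracked and run on the Grid   */

    GS_Barrier();                           /* optional explicit synchronization point */

    fp = GS_FOpen("output.dat", R);         /* files written by tasks must be opened
                                               with the GS_* primitives                */
    /* ... read the results ... */
    GS_FClose(fp);

    GS_Off(0);                              /* wait for pending tasks and shut down    */
    return 0;
}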
20
CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen [Diagram: gsstubgen reads app.idl and generates the client-side files app.h, app-stubs.c, app_constraints.cc, app_constraints.h and app_constraints_wrapper.cc, and the server-side file app-worker.c; these are combined with the user-provided app.c and app-functions.c]
21
CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen app-stubs.c IDL function stubs app.h IDL functions headers app_constraints.cc User resource constraints and performance cost app_constraints.h app_constraints_wrapper.cc app-worker.c Main program for the worker side (calls to user functions)
22
CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen #include … int gs_result; void Subst(file referenceCFG, double seed, file newCFG) { /* Marshalling/Demarshalling buffers */ char *buff_seed; /* Allocate buffers */ buff_seed = (char *)malloc(atoi(getenv("GS_GENLENGTH"))+1); /* Parameter marshalling */ sprintf(buff_seed, "%.20g", seed); Execute(SubstOp, 1, 1, 1, 0, referenceCFG, buff_seed, newCFG); /* Deallocate buffers */ free(buff_seed); } … Sample stubs file
23
CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen #include … int main(int argc, char **argv) { enum operationCode opCod = (enum operationCode)atoi(argv[2]); IniWorker(argc, argv); switch(opCod) { case SubstOp: { double seed; seed = strtod(argv[4], NULL); Subst(argv[3], seed, argv[5]); } break; … } EndWorker(gs_result, argc, argv); return 0; } Sample worker main file
24
CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen #include "mcarlo_constraints.h" #include "user_provided_functions.h" string Subst_constraints(file referenceCFG, double seed, file newCFG) { string constraints = ""; return constraints; } double Subst_cost(file referenceCFG, double seed, file newCFG) { return 1.0; } … Sample constraints skeleton file
25
CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen #include … typedef ClassAd (*constraints_wrapper) (char **_parameters); typedef double (*cost_wrapper) (char **_parameters); // Prototypes ClassAd Subst_constraints_wrapper(char **_parameters); double Subst_cost_wrapper(char **_parameters); … // Function tables constraints_wrapper constraints_functions[4] = { Subst_constraints_wrapper, … }; cost_wrapper cost_functions[4] = { Subst_cost_wrapper, … }; Sample constraints wrapper file (1)
26
CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen ClassAd Subst_constraints_wrapper(char **_parameters) { char **_argp; // Generic buffers char *buff_referenceCFG; char *buff_seed; // Real parameters char *referenceCFG; double seed; // Read parameters _argp = _parameters; buff_referenceCFG = *(_argp++); buff_seed = *(_argp++); //Datatype conversion referenceCFG = buff_referenceCFG; seed = strtod(buff_seed, NULL); string _constraints = Subst_constraints(referenceCFG, seed); ClassAd _ad; ClassAdParser _parser; _ad.Insert("Requirements", _parser.ParseExpression(_constraints)); // Free buffers return _ad; } Sample constraints wrapper file (2)
27
CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen double Subst_cost_wrapper(char **_parameters) { char **_argp; // Generic buffers char *buff_referenceCFG; char *buff_seed; // Real parameters char *referenceCFG; double seed; // Allocate buffers // Read parameters _argp = _parameters; buff_referenceCFG = *(_argp++); buff_seed = *(_argp++); //Datatype conversion referenceCFG = buff_referenceCFG; seed = strtod(buff_seed, NULL); double _cost = Subst_cost(referenceCFG, seed); // Free buffers return _cost; } … Sample constraints wrapper file (3)
28
CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen Binary building [Diagram: on the client, app.c, app-stubs.c, app_constraints.cc and app_constraints_wrapper.cc are linked with the GRID superscalar runtime and GT2; on each server i, app-functions.c and app-worker.c build the worker binary; client and servers communicate through the Globus services gsiftp and gram]
29
CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen Putting it all together: involved files –User-provided files: app.idl, app.c, app-functions.c –Files generated from the IDL: app.h, app-stubs.c, app_constraints.cc, app_constraints.h, app_constraints_wrapper.cc, app-worker.c –Files generated by the deployer: config.xml, (projectname).xml
30
CLUSTER 2005 Tutorial Boston, 26th September 2005 4. Code generation: gsstubgen GRID superscalar applications architecture
31
CLUSTER 2005 Tutorial Boston, 26th September 2005 5. Deployment center Java-based GUI. Allows: –Specification of grid computation resources: host details, libraries location… –Selection of the Grid configuration Grid configuration checking process: –Aliveness of host (ping) –Globus service is checked by submitting a simple test –Sends a remote job that copies the code needed in the worker, and compiles it Automatic deployment –sends and compiles code in the remote workers and the master Configuration files generation
32
CLUSTER 2005 Tutorial Boston, 26th September 2005 5. Deployment center Automatic deployment
33
CLUSTER 2005 Tutorial Boston, 26th September 2005 6. Runtime library features Initial prototype over Condor and MW Current version over Globus 2.4, Globus 4.0, ssh/scp, Ninf-G2 File transfer, security and other features provided by the middleware (Globus, …)
34
CLUSTER 2005 Tutorial Boston, 26th September 2005 6. Runtime library features 1.Data dependence analysis 2.File Renaming 3.Task scheduling 4.Resource brokering 5.Shared disks management and file transfer policy 6.Scalar results collection 7.Checkpointing at task level 8.API functions implementation
35
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.1 Data-dependence analysis Data dependence analysis –Detects RaW, WaR, WaW dependencies based on file parameters Oriented to simulations, FET solvers, bioinformatic applications –Main parameters are data files Tasks’ Directed Acyclic Graph is built based on these dependencies [Figure: resulting DAG of Subst, DIMEMAS and EXTRACT tasks feeding a Display task]
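As an illustration of how dependences are derived from file parameters, consider the sketch below; gen() and use() are hypothetical IDL tasks, not part of the tutorial example, and the fragment is assumed to sit between GS_On() and GS_Off() in the master program.

/* Hypothetical IDL tasks: void gen(out File f); void use(in File f, out File g); */
gen("data.txt");               /* writes data.txt                                  */
use("data.txt", "res1.txt");   /* RaW on data.txt: must wait for gen()             */
use("data.txt", "res2.txt");   /* RaW on data.txt, but independent of the previous
                                  use(): both use() calls may run concurrently     */
gen("data.txt");               /* WaR/WaW on data.txt with the calls above:
                                  removable by renaming (see 6.2)                  */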
36
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.2 File renaming WaW and WaR dependencies are avoidable with renaming While (!end_condition()) { T1 (…,…, “f1”); T2 (“f1”, …, …); T3 (…,…,…); } [Figure: the loop unrolls into task instances T1_1, T2_1, T3_1, T1_2, T2_2, T3_2, … T1_N; the WaR dependence of each T1 on the previous T2 and the WaW dependence between consecutive T1 instances on “f1” are removed by renaming “f1” to “f1_1”, “f1_2”, …]
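In effect, the runtime behaves as if each iteration wrote a fresh instance of “f1”. The sketch below shows the renamed execution corresponding to the figure; the instance names follow the figure, and the renaming is performed internally by the runtime, not written by the programmer.

/* Iteration 1, after renaming by the runtime */
T1(/*...*/ "f1_1");   /* writes instance f1_1                                   */
T2("f1_1" /*...*/);   /* reads f1_1                                             */
T3(/*...*/);
/* Iteration 2 */
T1(/*...*/ "f1_2");   /* no WaW/WaR on "f1": this T1 may start before the T2 of
                         iteration 1 has read f1_1                              */
T2("f1_2" /*...*/);
T3(/*...*/);
/* The runtime keeps track of which instance is the current version of "f1" */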
37
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.3 Task scheduling Distributed between the Execute call, the callback function and the GS_Barrier call Possibilities –The task can be submitted immediately after being created –Task waiting for resource –Task waiting for data dependency Task submission composed of –File transfer –Task submission –All specified in Globus RSL (for Globus case)
38
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.3 Task scheduling Temporary directory created in the server working directory for each task Calls to globus: –globus_gram_client_job_request –globus_gram_client_callback_allow –globus_poll_blocking End of task notification: Asynchronous state-change callbacks monitoring system –globus_gram_client_callback_allow() –callback_func function Data structures update in Execute function, GRID superscalar primitives and GS_Barrier GS_Barrier primitive before ending the program that waits for all tasks (performed inside GS_Off)
39
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.4 Resource brokering When a task is ready for execution, the scheduler tries to allocate a resource Broker receives a request –The classAd library is used to match resource ClassAds with task ClassAds –If more than one resource fulfils the constraints, the resource r that minimizes FT(r) + ET(t,r) is selected, where: FT(r) = file transfer time to resource r ET(t,r) = execution time of task t on resource r (using the user-provided cost function)
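The selection rule above can be sketched as a small helper; this is illustrative code, not the actual runtime: it assumes the candidate resources whose ClassAds match the task constraints have already been collected, together with their estimated FT and ET values.

/* Pick the resource that minimizes FT(r) + ET(t,r) among the matching candidates */
typedef struct { int id; double ft; double et; } candidate_t;

static int select_resource(const candidate_t *cand, int n)
{
    int best = -1;
    double best_cost = 0.0;
    for (int i = 0; i < n; i++) {
        double cost = cand[i].ft + cand[i].et;  /* file transfer + estimated execution time */
        if (best < 0 || cost < best_cost) {
            best = i;
            best_cost = cost;
        }
    }
    return (best < 0) ? -1 : cand[best].id;     /* -1 if no resource matched */
}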
40
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.5 Shared disks management and file transfer policy [Figure: file transfer policy with per-server working directories — T1 runs on server 1 with input f1 and output f4, T2 runs on server 2 with input f4 and output f7; f1 is transferred from the client, f4 is a temporary file transferred between servers, and f7 is returned to the client]
41
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.5 Shared disks management and file transfer policy [Figure: the same example when client, server 1 and server 2 share the working directories through NFS — f1, f4 and f7 are visible everywhere and no transfers are needed]
42
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.5 Shared disks management and file transfer policy [Figure: shared input disks — the input directories are mounted by the client and by both servers]
43
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.6 Scalar results collection Collection of output parameters which are not files –Main code cannot continue until the scalar result value is obtained Partial barrier synchronization Socket and file mechanisms provided Example (var_x is the output variable): … grid_task_1 (“file1.txt”, “file2.cfg”, var_x); if (var_x>10){ …
44
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.7 Task level checkpointing Inter-task checkpointing Recovers sequential consistency in the out-of-order execution of tasks [Figure: successful execution — tasks 0 to 6 in the states committed, completed and running]
45
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.7 Task level checkpointing Inter-task checkpointing Recovers sequential consistency in the out-of-order execution of tasks [Figure: failing execution — tasks 0 to 6 showing committed, completed and running tasks; when a task fails, the tasks after it are cancelled and the tasks that finished correctly are kept]
46
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.7 Task level checkpointing Inter-task checkpointing Recovers sequential consistency in the out-of-order execution of tasks [Figure: restart execution — the tasks that finished correctly before the failure are not re-executed and execution continues normally from the failing task]
47
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.7 Task level checkpointing On fail: from N versions of a file to one version (last committed version) Transparent to application developer
48
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.8 API functions implementation –Master side GS_On GS_Off GS_Barrier GS_FOpen GS_FClose GS_Open GS_Close GS_Speculative_End(func) –Worker side GS_System GS_Throw
49
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.8 API functions implementation Implicit task synchronization – GS_Barrier –Inserted in the user main program when required –Main program execution is blocked –globus_poll_blocking() called –Once all tasks are finished the program may resume
50
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.8 API functions implementation GRID superscalar file management API primitives: –GS_FOpen –GS_FClose –GS_Open –GS_Close Mandatory for file management operations in the main program Opening a file with write option –Data dependence analysis –Renaming is applied Opening a file with read option –Partial barrier until the task that is generating that file as an output finishes Internally, file management functions are handled as local tasks –Task node inserted –Data-dependence analysis –Function locally executed Future work: offer a C library with GS semantics (source code with typical calls could be used)
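A sketch of the read case described above: opening for reading a file that a pending task produces blocks only until that task finishes (a partial barrier), while unrelated tasks keep running. The tasks gen_report() and other_task() are hypothetical IDL tasks; the fragment is assumed to sit between GS_On() and GS_Off() in the master program.

FILE *fp;
char line[256];

gen_report("input.dat", "report.txt");   /* IDL task, runs remotely                     */
other_task("input.dat", "stats.txt");    /* independent task, unaffected by the barrier */

fp = GS_FOpen("report.txt", R);          /* partial barrier: waits only for gen_report  */
fgets(line, sizeof(line), fp);           /* read the already-synchronized result        */
GS_FClose(fp);                           /* handled internally as a local task          */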
51
CLUSTER 2005 Tutorial Boston, 26th September 2005 6.8 API functions implementation GS_Speculative_End(func) / GS_Throw Any worker can call GS_Throw at any moment The task that raises the GS_Throw is the last valid task (all sequential tasks after it must be undone) The speculative part spans from the task that throws the exception until the GS_Speculative_End Possibility of calling a local function when the exception is detected.
52
CLUSTER 2005 Tutorial Boston, 26th September 2005 6. Runtime library features Calls sequence without GRID superscalar [Diagram: app.c calls app-functions.c directly; everything runs on LocalHost]
53
CLUSTER 2005 Tutorial Boston, 26th September 2005 6. Runtime library features Calls sequence with GRID superscalar [Diagram: on LocalHost, app.c calls app-stubs.c, which invokes the GRID superscalar runtime together with app_constraints.cc and app_constraints_wrapper.cc; the runtime uses GT2 to reach the RemoteHost, where app-worker.c calls app-functions.c]
54
CLUSTER 2005 Tutorial Boston, 26th September 2005 Tutorial Detailed Description Introduction to GRID superscalar (55%) 9:00AM-10:30AM 1.GRID superscalar objective 2.Framework overview 3.A sample GRID superscalar code 4.Code generation: gsstubgen 5.Automatic configuration and compilation: Deployment center 6.Runtime library features Break 10:30-10:45am Programming with GRID superscalar (45%) 10:45AM-Noon 7.Users interface: The IDL file GRID superscalar API User resource constraints and performance cost Configuration files Use of checkpointing 8.Use of the Deployment center 9.Programming Examples
55
CLUSTER 2005 Tutorial Boston, 26th September 2005 7. Users interface 1.The IDL file 2.GRID superscalar API 3.User resource constraints and performance cost 4.Configuration files 5.Use of checkpointing
56
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.1 The IDL file GRID superscalar uses a simplified interface definition language based on the CORBA IDL standard The IDL file describes the headers of the functions that will be executed on the GRID interface MYAPPL { void myfunction1(in File file1, in scalar_type scalar1, out File file2); void myfunction2(in File file1, in File file2, out scalar_type scalar1); void myfunction3(inout scalar_type scalar1, inout File file1); }; Requirement –All functions must be void All parameters defined as in, out or inout. Types supported –filenames: special type –integers, floating point, booleans, characters and strings Scalar_type can be: –short, int, long, float, double, boolean, char, and string
57
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.1 The IDL file
58
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.1 The IDL file Example: –Initial call void subst (char *referenceCFG, double newBWd, char *newCFG); –IDL interface void subst (in File referenceCFG, in double newBWd, out File newCFG); (referenceCFG: input filename, newBWd: input parameter, newCFG: output filename)
59
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.1 The IDL file Example: –Initial call: void subst (char *referenceCFG, double newBWd, int *outval); –IDL interface void subst (in File referenceCFG, in double newBWd, out int outval); –Although the output parameter type changes in the IDL file, no changes are required in the function code (referenceCFG: input filename, newBWd: input parameter, outval: output integer)
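For illustration, the worker-side function can keep its original C signature; the body below is a hypothetical sketch (the original subst() computation is not shown in the tutorial), meant only to show that the out int parameter stays a plain pointer argument.

/* Worker-side function: unchanged with respect to the sequential version */
void subst(char *referenceCFG, double newBWd, int *outval)
{
    /* ... process referenceCFG with the new bandwidth value newBWd ... */
    *outval = 0;   /* scalar result; the runtime ships it back to the master
                      (see 6.6, scalar results collection)                 */
}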
60
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API Master side –GS_On –GS_Off –GS_Barrier –GS_FOpen –GS_FClose –GS_Open –GS_Close –GS_Speculative_End(func) Worker side –GS_Throw –GS_System
61
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API Initialization and finalization void GS_On(); void GS_Off(int code); –Mandatory –GS_On(): at the beginning of the main program code, or at least before any call to functions listed in the IDL file (task call) Initializations –GS_Off(code): at the end of the main program code, or at least after any task call Finalizations Waits for all pending tasks code: 0 for normal end; -1 for error (the necessary checkpoint information is stored to enable a later restart)
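A short sketch of the error-exit convention: the local check prepare_input() and its failure condition are hypothetical; the GS_On()/GS_Off() calls follow the rules above.

GS_On();
if (prepare_input("params.cfg") != 0) {   /* hypothetical local check, no task called yet */
    GS_Off(-1);   /* abnormal end: checkpoint information is stored for a later restart   */
    return 1;
}
/* ... task calls ... */
GS_Off(0);        /* normal end: waits for all pending tasks                              */
return 0;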
62
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API Synchronization void GS_Barrier(); –Can be called at any point of the main program code –Waits until all previously called tasks have finished –Can reduce performance!
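A typical use is a barrier between two phases, at the price of draining all parallelism at that point. In the sketch below, simulate() and merge() are hypothetical IDL tasks and N is an application constant.

char name[N][32];
int i;

for (i = 0; i < N; i++) {
    snprintf(name[i], sizeof(name[i]), "res_%d.txt", i);
    simulate("conf.cfg", name[i]);        /* N independent tasks run concurrently       */
}

GS_Barrier();                             /* block here until all of them have finished */

merge(name[0], name[1], "final.txt");     /* second phase, decided by the master        */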
63
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API File primitives: FILE * GS_FOpen (char *filename, int mode); int GS_FClose(FILE *f); int GS_Open(char *pathname, int mode); int GS_Close(int fd); Modes: R (reading), W (writing) and A (append) Examples: FILE * fp; char STRAUX[20]; … fp = GS_FOpen(“myfile.ext”, R); // Read something fscanf( fp, "%s", STRAUX); GS_FClose (fp);
64
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API Examples: int filedesc; … filedesc= GS_Open(“myfile.ext”, W); // Write something write(filedesc,”abc”, 3); GS_Close (filedesc); Behavior: –At user level, the same as fopen or open (or fclose and close) Where to use it: –In the main program (not required in worker code) When to use it: –Whenever opening/closing files between the GS_On() and GS_Off() calls
65
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API Exception handling void GS_Speculative_End(void (*fptr)()); GS_Throw –Enables exception handling from task functions to the main program –A speculative area can be defined in the main program which is not executed when an exception is thrown –The user can provide a function that is executed in the main program when an exception is raised
66
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API Main program code: while (j<MAX_ITERS){ getRanges(Lini, BWini, &Lmin, &Lmax, &BWmin, &BWmax); for (i=0; i<ITERS; i++){ L[i] = gen_rand(Lmin, Lmax); BW[i] = gen_rand(BWmin, BWmax); Filter("nsend.cfg", L[i], BW[i], "tmp.cfg"); Dimemas("tmp.cfg", "nsend_rec_nosm.trf", Elapsed_goal, "dim_out.txt"); Extract("tmp.cfg", "dim_out.txt", "final_result.txt"); } getNewIniRange("final_result.txt",&Lini, &BWini); j++; } GS_Speculative_End(my_func); // my_func: function executed when an exception is thrown Task code: void Dimemas(char * cfgFile, char * traceFile, double goal, char * DimemasOUT) { … putenv("DIMEMAS_HOME=/aplic/DIMEMAS"); sprintf(aux, "/aplic/DIMEMAS/bin/Dimemas -o %s %s", DimemasOUT, cfgFile); gs_result = GS_System(aux); distance_to_goal = distance(get_time(DimemasOUT), goal); if (distance_to_goal < goal*0.1) { printf("Goal Reached!!! Throwing exception.\n"); GS_Throw; } }
67
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.2 GRID superscalar API Wrapping legacy code in tasks’ code int GS_System(char *command); –At user level has the same behaviour as a system() call. void dimemas(in File newCFG, in File traceFile, out File DimemasOUT) { char command[500]; putenv("DIMEMAS_HOME=/usr/local/cepba-tools"); sprintf(command, "/usr/local/cepba-tools/bin/Dimemas -o %s %s", DimemasOUT, newCFG ); GS_System(command); }
68
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost For each task specified in the IDL file the user can provide: –A resource constraints function –A performance cost modelling function Resource constraints function: –Specifies constraints on the resources that can be used to execute the given task
69
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost Attributes currently supported (Attribute | Description | Type):
OpSys | Operating system | String
Mem | Physical memory (MB) | Integer
QueueName | Name of the queue | String
MachineName | Name of the machine | String
NetKbps | Available bandwidth | Double
Arch | Processor architecture | String
NumWorkers | Number of simultaneous jobs that can be run | Integer
GFlops | Processor performance (GF), theoretical or effective | Double
NCPUs | Number of CPUs of the machine | Integer
SoftNameList | List of software available in the machine | ClassAd list of strings
70
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost Resource constraints specification –A function interface is generated for each IDL task by gsstubgen in file {appname}_constraints.cc –The name of the function is {task_name}_constraints –The function initially generated is a default function (always evaluates true)
71
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost Example: –IDL file (mc.idl) interface MC { void subst(in File referenceCFG, in double newBW, out File newCFG); void dimemas(in File newCFG, in File traceFile, out File DimemasOUT); void post(in File newCFG, in File DimemasOUT, inout File FinalOUT); void display(in File toplot); }; –Generated function in mc_constraints.cc string Subst_constraints(file referenceCFG, double seed) { return “true”; }
72
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost Resource constraints specification (ClassAds strings) string Subst_constraints(file referenceCFG, double seed) { return "(other.Arch == \"powerpc\")“; }
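A slightly richer sketch combining attributes from the table in this section; the expression follows the ClassAd string style of the example above, and the concrete values (operating system and memory threshold) are arbitrary, chosen only for illustration.

string Dimemas_constraints(file newCFG, file traceFile)
{
    /* require an AIX machine with at least 1 GB of physical memory */
    return "(other.OpSys == \"AIX\") && (other.Mem >= 1024)";
}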
73
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost Performance cost modelling function –Specifies a model of the performance cost of the given task –As with the resource constraint function, a default function is generated that always returns “1” double Subst_cost(file referenceCFG, double newBW) { return 1.0; }
74
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.3 User resource constraints and performance cost Built-in functions: int GS_Filesize(char *name); double GS_GFlops(); Example double Subst_cost(file referenceCFG, double newBW) { double time; time = (GS_Filesize(referenceCFG)/1000000) * GS_GFlops(); return(time); }
75
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files Two configuration files: –Grid configuration file $HOME/.gridsuperscalar/config.xml –Project configuration file {project_name}.gsdeploy Both are xml files Both generated by deployment center
76
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files Grid configuration file –Saved automatically by the deployment center Contains information about –Available resources in the Grid (server hosts) –Characteristics of the resource Processor architecture Operating system Processor performance Number of CPUs Memory available Queues available …
77
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files Grid configuration file
78
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files <worker fqdn="kadesh.cepba.upc.es" globusLocation="/usr/gt2" gssLocation="/aplic/GRID-S/HEAD" minPort="20340" maxPort="20460" bandwidth="1250000" architecture="Power3" operatingSystem="AIX" gFlops="1.5" memorySize="512" cpuCount="16"> …
79
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files Project configuration file –Generated by the deployment center to save all the information required to run a given application Contains information about –Resources selected for execution –Concrete characteristics selected by the user Queues Number of concurrent tasks in each server … –Project information Location of binaries in localhost Location of binaries in servers …
80
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files Resources information......
81
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.4 Configuration files Shared disks information:...
82
CLUSTER 2005 Tutorial Boston, 26th September 2005 7.5 Use of checkpointing When running a GRID superscalar application, information for the checkpointing is automatically stored in file “.tasks.chk” The checkpointing file simply lists the tasks that have finished To recover: just restart application as usual To start again from the beginning: erase “.tasks.chk” file
83
CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center Initialization of the deployment center
84
CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center Adding a new host in the Grid configuration
85
CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center Create a new project
86
CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center Selection of hosts for a project
87
CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center Deployment of main program (master) in localhost
88
CLUSTER 2005 Tutorial Boston, 26th September 2005 8. Use of the deployment center Execution of application after deployment
89
CLUSTER 2005 Tutorial Boston, 26th September 2005 9. Programming examples 1.Programming experiences –Ghyper: computation of molecular potential energy hypersurfaces –fastDNAml: likelihood of phylogenetic trees 2.Simple programming examples Matrix multiply Mean calculation Performance modelling
90
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences GHyper –A problem of great interest in the physical-molecular field is the evaluation of molecular potential energy hypersurfaces –Previous approach: Hire a student Execute sequentially a set of N evaluations –Implemented with GRID superscalar as a simple program Iterative structure (simple for loop) Concurrency automatically exploited and run on the Grid with GRID superscalar
91
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences GHyper –Application: computation of molecular potential energy hypersurfaces –Run 1 Total execution time: 17 hours Number of executed tasks: 1120 Each task between 45 and 65 minutes BSC (Barcelona): 28 processors IBM Power4 UCLM (Ciudad Real): 11+11 processors AMD + Pentium IV Univ. de Puebla (Mexico) 14 processors AMD64
92
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences Run 2 Total execution time: 31 hours Number of executed tasks: 1120 BSC (Barcelona): 8 processors IBM Power4 UCLM (Ciudad Real): 8+8 processors AMD + Pentium IV Univ. de Puebla (Mexico) 8 processors AMD64
93
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences Two-dimensional potential energy hypersurface for acetone as a function of the ϑ1 and ϑ2 angles
94
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences fastDNAml Starting point: code from Olsen et al. –Sequential code –Biological application that evaluates maximum likelihood phylogenetic inference MPI version by Indiana University –Used with PACX-MPI for the HPC-Challenge context in SC2003
95
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences Structure of the sequential application: –Iterative algorithm that builds the solution incrementally Solution: maximum likelihood phylogenetic tree –In each iteration, the tree is extended by adding a new taxon –In each iteration 2i-5 possible trees are evaluated –Additionally, each iteration performs a local rearrangement phase with 2i-6 additional evaluations –In the sequential algorithm, although these evaluations are independent, they are performed sequentially
96
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences GRID superscalar implementation –Selection of IDL tasks: Evaluation function: –Tree information stored in TreeFile before calling GSevaluate –GS_FOpen and GS_FClose used for the files related to the evaluation –Automatic parallelization is achieved interface GSFASTDNAML{ void GSevaluate (in File InputData, in File TreeFile, out File EvaluatedTreeFile ); };
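A sketch of how the master-side loop of one iteration could drive the evaluations; only GSevaluate() comes from the IDL file above, while the helper write_candidate_tree(), the file-naming scheme and the variable names are hypothetical.

/* Iteration i of the enclosing loop: evaluate the 2i-5 candidate trees
   obtained by inserting the new taxon                                   */
char treefile[64], evalfile[64];
int t;

for (t = 0; t < 2 * i - 5; t++) {
    snprintf(treefile, sizeof(treefile), "tree_%d_%d.txt", i, t);
    snprintf(evalfile, sizeof(evalfile), "eval_%d_%d.txt", i, t);
    write_candidate_tree(treefile, i, t);          /* hypothetical local helper           */
    GSevaluate("input.phy", treefile, evalfile);   /* independent tasks: run concurrently */
}
/* Selecting the best tree (e.g. reading the evaluation files with GS_FOpen)
   synchronizes with the evaluations before the next iteration starts        */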
97
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences Task graph automatically generated by the GRID superscalar runtime: [Figure: for iterations i-1, i and i+1, a set of independent tree-evaluation tasks per iteration, separated by barriers]
98
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences With some data sets, evaluation is a fast task Optimization 1: tree clustering –Several tree evaluations are grouped into a single evaluation task –Reduces task initialization and Globus overhead –Reduces parallelism Optimization 2: local executions –Initial evaluations are executed locally Both optimizations are combined
99
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences Policies –DEFAULT The number of evaluations grows with the iterations. All evaluations have the same number of trees (MAX_PACKET_SIZE) –UNFAIR Same as DEFAULT, but with a maximum of NUM_WORKER evaluations –FAIR Each iteration has the same fixed number of evaluations (NUM_WORKER)
100
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences Heterogeneous computational Grid: –IBM-based machine with 816 Nighthawk Power3 processors and 94 p630 Power4 processors (Kadesh) –IBM xSeries 250 with 4 Intel Pentium III (Khafre) –Bull Novascale 5160 with 8 Itanium2 processors (Kharga) –Parsytec CCi-8D with 16 Pentium II processors (Kandake) –some of the authors' laptops For some of the machines the production queues were used All machines located in Barcelona
101
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences
KD | KG | KF | LT | Policy | Elapsed time
5 | 2 | 4 | 0 | UNFAIR | 33000 s
8 | 0 | 0 | 1 | UNFAIR | 15978 s
8 | 2 | 0 | 1 | UNFAIR | 16520 s
8 | 2 | 2 | 0 | DEFAULT | 24216 s
8 | 2 | 0 | 0 | FAIR | 14235 s
All results for the HPC-Challenge data set PACX-MPI gets better performance with similar configurations (less than 10800 s) However it is using all the resources during all the execution time!
102
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences An SSH/SCP GRID superscalar version has been developed Especially interesting for large clusters New heterogeneous computational Grid configuration: –Machines from previous results –Machines from a site in Madrid, basically Pentium III and Pentium IV based
103
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.1 Programming experiences
upc.edu | rediris.es | ucm.es | Elapsed time
2 | 1 | 7 | 6605 s
1 | 1 | 7 | 7129 s
Even using a larger-distance computational Grid, the performance is doubled PACX-MPI version elapsed time: 9240 s (second configuration case)
104
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Matrix multiply: C = A × B, where A, B and C are 3×3 hypermatrices (blocks A[0,0]…A[2,2], B[0,0]…B[2,2], C[0,0]…C[2,2]) Hypermatrices: each element of the matrix is a matrix Each internal matrix stored in a file
105
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Program structure: –Inner product: C[i][j] += A[i][k]·B[k][j], accumulated over k –Each A[i][k]·B[k][j] is a matrix multiplication itself: Encapsulated in a function matmul(A_i_k, B_k_j, C_i_j);
106
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Sequential code: main program int main(int argc, char **argv) { char *f1, *f2, *f3; int i, j, k; IniMatrixes(); for (i = 0; i < MSIZE; i++) { for (j = 0; j < MSIZE; j++) { for (k = 0; k < MSIZE; k++) { f1 = getfilename("A", i, k); f2 = getfilename("B", k, j); f3 = getfilename("C", i, j); matmul(f1, f2, f3); } } } return 0; }
107
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Sequential code: matmul function code void matmul(char *f1, char *f2, char *f3){ block *A,*B,*C; A = get_block(f1, BSIZE, BSIZE); B = get_block(f2, BSIZE, BSIZE); C = get_block(f3, BSIZE, BSIZE); block_mul(A, B, C); put_block(C, f3); delete_block(A); delete_block(B); delete_block(C); } static block *block_mul(block *A, block *B, block *C) { int i, j, k; for (i = 0; i < A->rows; i++) { for (j = 0; j < B->cols; j++) { for (k = 0; k < A->cols; k++) { C->data[i][j] += A->data[i][k] * B->data[k][j]; } } } return C; }
108
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples GRID superscalar code: IDL file interface MATMUL { void matmul(in File f1, in File f2, inout File f3); }; GRID superscalar code: main program int main(int argc, char **argv){ char *f1, *f2, *f3; int i, j, k; GS_On(); IniMatrixes(); for (i = 0; i < MSIZE; i++) { for (j = 0; j < MSIZE; j++) { for (k = 0; k < MSIZE; k++) { f1 = getfilename("A", i, k); f2 = getfilename("B", k, j); f3 = getfilename("C", i, j); matmul(f1, f2, f3); } } } GS_Off(0); return 0; } NO CHANGES REQUIRED TO FUNCTIONS!
109
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Mean calculation –Simple iterative example executed LOOPS times –In each iteration a given number of random numbers are generated –The mean of the random numbers is calculated
110
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Sequential code: main program int main( int argc, char * argv[] ) { FILE *results_fp; int i, mn; for ( i = 0; i < LOOPS; i ++ ) { gen_random( “random.txt” ); mean( “random.txt”, “results.txt”); } results_fp = fopen( “results.txt”, "r" ); for( i = 0; i < LOOPS; i ++ ) { fscanf( results_fp, "%d", &mn ); printf( "mean %i : %d\n", i, mn ); } fclose( results_fp); return 0; }
111
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Sequential code: gen_random function code void gen_random( char * rnumber_file ) { FILE *rnumber_fp; int i; rnumber_fp = fopen(rnumber_file, "w"); for ( i = 0; i < MAX_RANDOM_NUMBERS; i++ ) { int r = 1 + (int) (RANDOM_RANGE*rand()/(RAND_MAX+1.0)); fprintf( rnumber_fp, "%d ", r ); } fclose(rnumber_fp); }
112
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Sequential code: mean function code void mean( char * rnumber_file, char * results_file ) { FILE *rnumber_fp, *results_fp; int r, sum = 0; div_t div_res; int i; rnumber_fp = fopen(rnumber_file, "r"); for ( i = 0; i < MAX_RANDOM_NUMBERS; i++ ) { fscanf( rnumber_fp, "%d ", &r ); sum += r; } fclose(rnumber_fp); results_fp = fopen(results_file, "a"); div_res = div( sum, MAX_RANDOM_NUMBERS ); fprintf( results_fp, "%d ", div_res.quot ); fclose(results_fp); }
113
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples GRID superscalar code: IDL file interface MEAN { void gen_random( out File rnumber_file ); void mean( in File rnumber_file, inout File results_file ); }; GRID superscalar code: main program int main( int argc, char * argv[] ) { FILE *results_fp; int i, mn; GS_On(); for ( i = 0; i < LOOPS; i ++ ) { gen_random( “random.txt” ); mean( “random.txt”, “results.txt” ); } results_fp = GS_FOpen( “results.txt”, R ); for( i = 0; i < LOOPS; i ++ ) { fscanf( results_fp, "%d", &mn ); printf( "mean %i : %d\n", i, mn ); } GS_FClose( results_fp ); GS_Off(0); } NO CHANGES REQUIRED TO FUNCTIONS!
114
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Performance modelling: IDL file interface MC { void subst(in File referenceCFG, in double newBW, out File newCFG); void dimemas(in File newCFG, in File traceFile, out File DimemasOUT); void post(in File newCFG, in File DimemasOUT, inout File FinalOUT); void display(in File toplot); };
115
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Performance modelling: main program GS_On(); for (int i = 0; i < MAXITER; i++) { newBWd = GenerateRandom(); subst (referenceCFG, newBWd, newCFG); dimemas (newCFG, traceFile, DimemasOUT); post (newBWd, DimemasOUT, FinalOUT); if(i % 3 == 0) Display(FinalOUT); } fd = GS_Open(FinalOUT, R); printf("Results file:\n"); present (fd); GS_Close(fd); GS_Off(0);
116
CLUSTER 2005 Tutorial Boston, 26th September 2005 9.2 Simple programming examples Performance modelling: dimemas function code void dimemas(in File newCFG, in File traceFile, out File DimemasOUT) { char command[500]; putenv("DIMEMAS_HOME=/usr/local/cepba-tools"); sprintf(command, "/usr/local/cepba-tools/bin/Dimemas -o %s %s", DimemasOUT, newCFG ); GS_System(command); }
117
CLUSTER 2005 Tutorial Boston, 26th September 2005 More information GRID superscalar home page: http://www.cepba.upc.edu/grid Rosa M. Badia, Jesús Labarta, Raül Sirvent, Josep M. Pérez, José M. Cela, Rogeli Grima, “Programming Grid Applications with GRID Superscalar”, Journal of Grid Computing, Volume 1 (Number 2): 151-170 (2003). Vasilis Dialinos, Rosa M. Badia, Raul Sirvent, Josep M. Perez and Jesus Labarta, "Implementing Phylogenetic Inference with GRID superscalar", Cluster Computing and Grid 2005 (CCGRID 2005), Cardiff, UK, 2005. grid-superscalar@ac.upc.edu