
1 InstantGrid: A Framework for On-Demand Grid Point Construction
R.S.C. Ho, K.K. Yin, D.C.M. Lee, D.H.F. Hung, C.L. Wang, and F.C.M. Lau
Dept. of Computer Science, The University of Hong Kong

Grid point construction is a difficult task
- Different grid users/applications demand different execution environments (EEs)
- Managing, and switching between, different EEs incurs considerable system administration overhead
- E.g. computing grid (MPICH-G2, etc.) vs. service grid (GT3); different OS distributions/versions, libraries, etc.

Our solution: InstantGrid
- A framework for efficient construction of grid points
- Convenient system administration for multiple EEs
- Instant EE construction in remote nodes
- Complete transparency to user applications
- Supports in-memory execution, which protects data on the hard disk from malicious access

2 The InstantGrid Framework
- All EEs are installed, configured, and managed in central InstantGrid servers
- Cluster/grid nodes obtain customized EEs through the network (i.e., the "dissemination" process)

The framework consists of the following key elements:
- Application-centric software grouping
- Proactive software configuration
- Discriminative file sharing mechanisms
- Options for file storage in compute nodes
- An EE dissemination service

Single Linux Image Management (SLIM): the infrastructure for EE dissemination
SLIM is able to deliver customized EEs for:
- HPC cluster/grid systems
- Linux desktops
- Diskless Linux nodes

3 Application-centric Software Grouping
- Software is grouped together to match the specific requirements of applications
- An EE is a collection of software components, which include an OS, system libraries, grid/cluster middleware, applications, and the user data
- Customized EE "images" for different users/applications
- Facilitates software management and dissemination

Sample EEs:
(a) A service-oriented grid point
(b) A frontend node for HPC job submission
(c) A typical cluster node which processes jobs dispatched from the frontend node
(b)+(c): A single EE group indicating the software requirements of a cluster-based grid point, which includes a gatekeeper and a number of compute nodes
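The grouping described on this slide can be pictured as a small data model. The following is a minimal sketch only, not InstantGrid's actual representation: the class names, field names, and the middleware assigned to each sample EE are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExecutionEnvironment:
    """One EE: the software stack a single node role needs (hypothetical model)."""
    name: str
    os: str
    system_libraries: tuple = ()
    middleware: tuple = ()
    applications: tuple = ()
    user_data: tuple = ()


@dataclass
class EEGroup:
    """EEs that together satisfy the requirements of one application."""
    name: str
    members: list = field(default_factory=list)

    def middleware(self):
        """Return the sorted set of middleware used anywhere in the group."""
        return sorted({m for ee in self.members for m in ee.middleware})


# Hypothetical sample EEs mirroring (a), (b) and (c) on the slide.
# Which middleware each role carries is an assumption, not taken from the slide.
service_point = ExecutionEnvironment("service-grid-point", "Linux", middleware=("GT3",))
frontend = ExecutionEnvironment("hpc-frontend", "Linux", middleware=("GT3", "MPICH-G2"))
compute = ExecutionEnvironment("cluster-node", "Linux", middleware=("MPICH-G2",))

# (b)+(c): a single EE group for a cluster-based grid point.
cluster_grid_point = EEGroup("cluster-based-grid-point", [frontend, compute])
print(cluster_grid_point.middleware())
```

Keeping each EE immutable and grouping by application means a group can be handed to a dissemination step as one unit, which is the point the slide makes about easier software management.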

4 Proactive Software Configuration
- Traditionally, software is installed and configured incrementally
- InstantGrid advocates "configuration before dissemination"
- Try to configure all software in the central server if possible
- The EEs disseminated are (almost) ready to run

Discriminative File Sharing Mechanism
- Full replication is impractical due to the large size of typical EEs
- Updating files through NFS is slow
- InstantGrid adopts a hybrid approach: replicate (frequently updated files) + NFS (other files)

Options for File Storage in Compute Nodes
- "Full-copy to RAM": files stored entirely in physical memory
- "Full-copy to HD": files stored on the hard disk
- "Copy-if-needed": files stored on the hard disk; only new files are copied

EE Dissemination Service
- The service is offered through a DHCP server, a TFTP server, and an NFS server
- When a client machine boots up, it obtains its IP address from the DHCP server and its kernel from the TFTP server
- It then constructs the pre-defined EE by replicating writable files to local storage and mounting the read-only directories through NFS
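The hybrid file sharing and the three storage options can be combined into a single planning step on the client. The sketch below is illustrative only and rests on assumptions: the function name `plan_dissemination`, the path-to-writable mapping, and the reading of "only new files are copied" as "files not already on the local disk" are not taken from the slides.

```python
from enum import Enum


class StorageOption(Enum):
    FULL_COPY_TO_RAM = "full-copy to RAM"
    FULL_COPY_TO_HD = "full-copy to HD"
    COPY_IF_NEEDED = "copy-if-needed"


def plan_dissemination(ee_files, local_files, option):
    """Split an EE's paths into those to replicate and those to NFS-mount.

    ee_files: dict mapping path -> True if writable (frequently updated),
              False if read-only.
    local_files: set of paths already present on the node's hard disk.
    """
    writable = sorted(p for p, is_writable in ee_files.items() if is_writable)
    read_only = sorted(p for p, is_writable in ee_files.items() if not is_writable)

    if option is StorageOption.COPY_IF_NEEDED:
        # Assumption: "new" means not yet present on the local disk.
        to_copy = [p for p in writable if p not in local_files]
    else:
        # Both full-copy options replicate every writable path.
        to_copy = writable

    target = "ram" if option is StorageOption.FULL_COPY_TO_RAM else "hd"
    return {"copy": to_copy, "copy_target": target, "nfs_mount": read_only}


# Hypothetical EE layout: /etc and /var are writable, /usr is read-only.
ee = {"/etc": True, "/var": True, "/usr": False}
plan = plan_dissemination(ee, {"/etc"}, StorageOption.COPY_IF_NEEDED)
print(plan)
```

Under these assumptions, a node that already holds `/etc` copies only `/var` and mounts `/usr` over NFS, which is consistent with copy-if-needed being the faster option in the evaluation.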

5 Example: Constructing a service-oriented grid point
1. Software installation at SLIM server
2. Client boots and obtains kernel
3. OS image/App disseminated
4. Process to generate certificates

6 Performance Evaluation
- Conducted in HKU CS's Gideon Cluster (Pentium 4 x 300; Fast Ethernet; each node has 512 MB RAM and a 40 GB IDE hard disk)
- Two tests: (a) a cluster-based grid point, and (b) standalone grid points
- A 256-node cluster-based grid point can be constructed from scratch in three (copy-if-needed) to five (full-copy to hard disk) minutes
- Standalone grid points take longer to construct; the bottleneck lies mainly in the process of generating host certificates

Future Work
- To devise standard protocols for communicating EE specifications between the InstantGrid servers and compute nodes
- To optimize InstantGrid's performance in WAN

