Research Achievements Kenji Kaneda
Agenda
- Research background and goal
- Overview of my research achievements
  - Phoenix
  - Virtual Private Grid
- Summary and recent activities
Research Background and Goal
Background
- Grid computing
  - Parallel computing that harnesses many widely distributed resources
  - E.g., an aggregate of PC clusters spread over multiple LANs
Traditional Parallel Computing vs. Grid Computing
- Traditional parallel computing
  - Reliable processors
  - Single-LAN resources
- Grid computing
  - Unreliable processors
  - Multi-LAN resources
Difficulty in Grid Computing
- Frequent machine/network failures
  - E.g., one machine failure per day
- Restricted connectivity
  - Administrative policies restrict communication between machines
  - E.g., firewall, NAT, DHCP
[Figure: gateways, a TCP connection, and a firewall]
Research Goal
- Allow a user to harness a computational grid as in traditional parallel computing
  - Fault tolerance
  - Transparent communication on WANs
My Research Achievements
- Design/implementation of middleware
  - Phoenix
    - Parallel programming library for accommodating dynamically joining/leaving resources
  - Virtual Private Grid
    - Command shell for utilizing hundreds of computers spread over multiple LANs
Phoenix
Phoenix
- Parallel programming library for accommodating dynamically joining/leaving resources
  - Programming model for supporting migration of application states
  - Transparent communication mechanism for WANs
Programming Model for Supporting Migration of Application States
- Subsumes a regular message passing model
- Provides a namespace that does not depend on physical machines
  - Programmer uses this name to specify a message destination
  - Programmer can write a program without being aware of physical machines
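A minimal sketch of the idea on this slide: messages are addressed to a name that is independent of any physical machine, so the name (and the work it stands for) can move when a machine leaves. This is not the Phoenix API. The class `NameRegistry`, its methods, and the names `task-7`, `machine-A`, and `machine-B` are all hypothetical, and the single in-memory registry is a simplification chosen only to keep the example short.

```python
from collections import defaultdict, deque


class NameRegistry:
    """Maps virtual names to the physical node that currently hosts them.

    Hypothetical illustration only; this is not the Phoenix API.
    """

    def __init__(self):
        self.owner = {}                      # virtual name -> physical node id
        self.mailboxes = defaultdict(deque)  # physical node id -> queued messages

    def assign(self, name, node):
        """Bind a virtual name to a physical node (initial placement)."""
        self.owner[name] = node

    def migrate(self, name, new_node):
        """Move a virtual name, and messages still queued for it, to another node."""
        old_node = self.owner[name]
        if new_node == old_node:
            return
        keep = deque()
        for dest, payload in self.mailboxes[old_node]:
            if dest == name:
                self.mailboxes[new_node].append((dest, payload))
            else:
                keep.append((dest, payload))
        self.mailboxes[old_node] = keep
        self.owner[name] = new_node

    def send(self, name, payload):
        """Send to a virtual name; the sender never names a physical machine."""
        self.mailboxes[self.owner[name]].append((name, payload))

    def receive(self, node):
        """Return the next message queued at a physical node, or None."""
        box = self.mailboxes[node]
        return box.popleft() if box else None


registry = NameRegistry()
registry.assign("task-7", "machine-A")
registry.send("task-7", "hello")         # queued at machine-A
registry.migrate("task-7", "machine-B")  # machine-A leaves; the name moves
registry.send("task-7", "world")         # same name, new location
first = registry.receive("machine-B")
second = registry.receive("machine-B")
```

The sender's code is identical before and after the migration, which is the property the slide attributes to the machine-independent namespace.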
Transparent Communication Mechanism for WANs
- Overlay network construction
- Application-level message routing
- Processes can communicate with one another
  - even if networks are not fully connected
  - even if connection topologies change dynamically
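A rough sketch of application-level routing over an overlay, assuming a breadth-first search over the currently known links. The slides do not say which routing algorithm Phoenix uses, so the search, the helper names (`next_hops`, `route`, `add_link`, `remove_node`), and the node names are assumptions made for illustration.

```python
from collections import deque


def next_hops(links, source):
    """Breadth-first search over overlay links.

    Returns {destination: first hop on a shortest path from source}.
    """
    first_hop = {}
    visited = {source}
    queue = deque()
    for neighbor in sorted(links.get(source, ())):
        visited.add(neighbor)
        first_hop[neighbor] = neighbor
        queue.append(neighbor)
    while queue:
        node = queue.popleft()
        for neighbor in sorted(links.get(node, ())):
            if neighbor not in visited:
                visited.add(neighbor)
                first_hop[neighbor] = first_hop[node]
                queue.append(neighbor)
    return first_hop


def route(links, source, destination):
    """Forward hop by hop, as each relay would; returns the path or None."""
    path = [source]
    current = source
    while current != destination:
        table = next_hops(links, current)
        if destination not in table:
            return None  # no overlay path exists right now
        current = table[destination]
        path.append(current)
    return path


def add_link(links, a, b):
    """Record a bidirectional overlay connection between two processes."""
    links.setdefault(a, set()).add(b)
    links.setdefault(b, set()).add(a)


def remove_node(links, node):
    """Drop a process that has left, together with its connections."""
    for peer in links.pop(node, set()):
        links[peer].discard(node)


# a1 and b1 cannot connect directly; only the (hypothetical) gateways can.
links = {}
add_link(links, "a1", "gwA")
add_link(links, "gwA", "gwB")
add_link(links, "gwB", "b1")
before = route(links, "a1", "b1")

# The topology changes: a new relay joins and the old one leaves.
add_link(links, "gwA", "gwB2")
add_link(links, "gwB2", "b1")
remove_node(links, "gwB")
after = route(links, "a1", "b1")
```

In this sketch `a1` still reaches `b1` after the topology change because each relay recomputes its next hop from the current set of links.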
Demonstration
- Boot processes on 3 subnets
- Add processes dynamically
Demonstration
Experiments (1/3)
- Speedup with fixed resources
  - POV-Ray: 78x speedup using 104 processors on 3 LANs
  - LU: comparable to MPICH (on a single LAN)
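As a quick reading aid for the POV-Ray figure, the snippet below converts the reported speedup into parallel efficiency (speedup divided by processor count). The function name `parallel_efficiency` is illustrative; only the numbers 78 and 104 come from the slide.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup that was achieved."""
    return speedup / processors


# POV-Ray figures reported on the slide: 78x speedup on 104 processors.
povray_efficiency = parallel_efficiency(78, 104)
print(f"POV-Ray parallel efficiency: {povray_efficiency:.0%}")
```

Under this definition, the POV-Ray run reaches three quarters of ideal linear speedup across the 3 LANs.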
Experiments (2/3)
- Speedup with dynamic resources
  - POV-Ray quickly takes advantage of dynamically added resources
Experiments (3/3)
- Parallel shogi (Japanese chess) program on 720 laptop PCs
  - 7 to 8x speedup
Related Work
- Grid-enabled MPIs
  - E.g., MPICH-G [G. Bosilca et al. SC ’02]
  - Based on a traditional message passing model
  - Difficult to support dynamic changes of resources
- Communication libraries for Grids
  - E.g., Ibis [A. Denis et al. HPDC ’04]
  - Static message routing
Summary ~ Phoenix ~
- Parallel programming library for dynamically changing resources
- Good speedup with a large number of machines on multiple LANs
Virtual Private Grid
Virtual Private Grid (VPG)
- Command shell for utilizing hundreds of computers spread over multiple LANs
Features (1/2)
- User can submit jobs without caring about administrative restrictions
  - E.g., a pipeline (|) with output redirection (>) that spans hosts behind firewalls and NAT
[Figure: host1, host2, and host3 separated by firewalls and NAT; cmd1 and cmd2 execute on different hosts and the output is written to file3]
Features (2/2)
- Fault tolerance
  - VPG can continue to run even if some machines are added/deleted dynamically
  - No central server is required
Demonstration
- Environment
  - 3 LANs
  - CPU: SPARC, x86, MIPS, PowerPC
  - OS: Solaris, Linux, IRIX
Demonstration
Related Work
- Grid job submission tools
  - E.g., Globus, Condor-G
  - Difficult to submit jobs to machines under administrative restrictions
Summary ~ Virtual Private Grid ~
- Command shell for utilizing hundreds of computers spread over multiple LANs
- Fast job submission to more than 100 machines
Summary and Recent Activities
Summary ~ My Research Achievements ~
- Middleware for Grid computing
  - Phoenix
  - Virtual Private Grid
Recent Activities (1/2)
- Virtual SMP
  - Emulates a multiprocessor machine on loosely coupled computers
  - Virtual dual-processor machine on two single-processor machines
Recent Activities (2/2)
- Virtual SMP
  - Easy utilization of distributed resources with a common OS (e.g., Windows, Linux)