JICOS: A Java-Centric Distributed Computing Service
  Computer Science Department, UC Santa Barbara
Introduction: Project Goals
  Minimize job completion time despite large communication latency
  Jobs complete with high probability despite faulty components
  Application program is oblivious to:
    Number of processors
    Inter-process communication
    Hardware faults
Introduction: Fundamental Issue: Heterogeneity
  [Figure labels: machines M1 to M5, …; operating systems OS1 to OS5; Heterogeneous machine/OS; Functionally homogeneous JVM]
JICOS Attributes
  Heterogeneous hardware & OS
  Easy to program
  Fault-tolerant compute servers (hosts)
  Adaptively parallel
  Small-grain parallel computation
Overview
  Computational model
  API
  Architecture
  Performance
  Benefit summary
  Plans
Computational Model: DAC task graph
  [Figure: task graph for f(4). Decomposition tasks: f(4); f(3), f(2); f(2), f(1), f(1), f(0); f(1), f(0). Composition tasks: four + tasks]
Computational Model: Embedded in an Environment
  Read input object
  Read/write shared object
  [Figure: the f(4) task graph (tasks f(4) down to f(0), four + tasks) embedded in an environment]
API
  Two task classes: F and Sum
  [Figure: task graph fragment with tasks f(4), f(3), f(2), f(1), f(0), and +]
API
  public class F extends Task {
      private int n;

      public F(int n) { this.n = n; }

      public Object execute(Env e) {
          if ( n < 2 ) {
              return new Integer( 1 );
          } else {
              compute( new F( n-1 ) );
              compute( new F( n-2 ) );
              return new Sum();
          }
      }
  }
  [Figure: task graph state: f(4)]
API
  [Animation, 7 frames, each repeating the F code: every F task with n >= 2 is replaced by two F subtasks and a + task. The graph grows from f(4), f(3), f(2) with one + task to four pending + tasks, after which the base-case tasks f(1) and f(0) begin to execute.]
API
  public class Sum extends Task {
      public Object execute(Env e) {
          Integer I = ((Integer) getInput(0));
          Integer J = ((Integer) getInput(1));
          int sum = I.intValue() + J.intValue();
          return new Integer( sum );
      }
  }
  [Figure: task graph state: f(1), f(0), and four + tasks]
API
  [Animation, 2 frames, each repeating the F code: the remaining base-case tasks f(1) and f(0) execute; three + tasks are pending.]
API
  [Animation, 3 frames, each repeating the Sum code: the pending + tasks execute one at a time (three, then two, then one remaining).]
API recap
  public class F extends Task {
      private int n;

      public F(int n) { this.n = n; }

      public Object execute(Env e) {
          if ( n < 2 ) {
              return new Integer( 1 );
          } else {
              compute( new F( n-1 ) );
              compute( new F( n-2 ) );
              return new Sum();
          }
      }
  }
  [Figure: complete task graph for f(4), with four + tasks]
API recap
  public class Sum extends Task {
      public Object execute(Env e) {
          int i = ((Integer) getInput(0));
          int j = ((Integer) getInput(1));
          return new Integer( i + j );
      }
  }
  [Figure: complete task graph for f(4), with four + tasks]
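The F and Sum classes can be tried out without a JICOS installation. What follows is a minimal single-JVM sketch, not the JICOS runtime: the Task, Env, and LocalRunner classes are hypothetical stand-ins written for this illustration. The rule that a returned Task acts as a successor receiving the results of the spawned subtasks is an assumption inferred from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the JICOS environment; unused in this sketch.
class Env {}

// Hypothetical stand-in for the JICOS Task base class.
abstract class Task {
    final List<Task> spawned = new ArrayList<>();   // subtasks registered via compute()
    final List<Object> inputs = new ArrayList<>();  // results delivered by subtasks

    protected void compute(Task t) { spawned.add(t); }
    protected Object getInput(int i) { return inputs.get(i); }
    public abstract Object execute(Env e);
}

class F extends Task {
    private final int n;

    F(int n) { this.n = n; }

    @Override
    public Object execute(Env e) {
        if (n < 2) {
            return Integer.valueOf(1);   // base case: f(0) = f(1) = 1
        } else {
            compute(new F(n - 1));
            compute(new F(n - 2));
            return new Sum();            // successor that combines the two results
        }
    }
}

class Sum extends Task {
    @Override
    public Object execute(Env e) {
        int i = (Integer) getInput(0);
        int j = (Integer) getInput(1);
        return Integer.valueOf(i + j);
    }
}

class LocalRunner {
    // Assumption: a returned Task is a successor whose inputs are the
    // results of the subtasks spawned by the task that returned it.
    static Object run(Task task, Env env) {
        Object result = task.execute(env);
        if (!(result instanceof Task)) {
            return result;               // plain value: this task is finished
        }
        Task successor = (Task) result;
        for (Task child : task.spawned) {
            // Sequential here; a distributed service would hand these to hosts.
            successor.inputs.add(run(child, env));
        }
        return run(successor, env);
    }

    public static void main(String[] args) {
        System.out.println(run(new F(4), new Env()));  // prints 5
    }
}
```

With f(0) = f(1) = 1, the graph for f(4) contains four Sum tasks and evaluates to 5, matching the four + tasks in the task graph.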
Architecture Goals
  Virtualize compute cycles
  Store/coordinate partial results
  Self-organizing
  Independent of hardware/OS
  Scale from LAN to Internet
Architecture
  JICOS has 3 service component classes:
    Hosting Service Provider (HSP): clients interact solely with the HSP; the HSP manages the other service components
    Task server: a task space
    Host: executes tasks
Architecture
  [Figure labels: Client; Hosting Service Provider; …]
Architecture: Adaptive parallelism
  [Figure labels: Client; HSP; …]
Architecture: Tolerates faulty hosts
  [Figure labels: Client; HSP; …]
Architecture: Latency hiding/reduction
  Task caching
  Task pre-fetch
  Task server computation
  [Figure: task tree with nodes numbered 1 to 13. Legend: root task (node 1); task is cached; task issues pre-fetch; task is pre-fetched]
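The pre-fetch idea can be illustrated with a small host loop that requests its next task before it starts executing the current one, so the fetch round trip overlaps with computation. This is only a sketch of the general technique under stated assumptions, not the JICOS implementation: PrefetchingHost, PrefetchDemo, and the Supplier standing in for the task server are hypothetical names, and the 50 ms delays are arbitrary simulated values. Task caching and task server computation are not modeled.

```java
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical host loop: overlaps fetching the next task with executing the current one.
class PrefetchingHost {
    private final Callable<Runnable> fetch;

    // taskServer.get() stands in for a remote call; it returns null when no tasks remain.
    PrefetchingHost(Supplier<Runnable> taskServer) {
        this.fetch = taskServer::get;
    }

    // Runs tasks until the server is empty; returns how many were executed.
    int runAll() {
        ExecutorService fetcher = Executors.newSingleThreadExecutor();
        int executed = 0;
        try {
            Future<Runnable> next = fetcher.submit(fetch);
            while (true) {
                Runnable task = next.get();    // blocks only if the pre-fetch is still in flight
                if (task == null) {
                    return executed;
                }
                next = fetcher.submit(fetch);  // issue the pre-fetch before executing
                task.run();
                executed++;
            }
        } catch (InterruptedException | ExecutionException ex) {
            throw new RuntimeException(ex);
        } finally {
            fetcher.shutdown();
        }
    }
}

class PrefetchDemo {
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Queue<Runnable> pending = new ConcurrentLinkedQueue<>();
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < 5; i++) {
            pending.add(() -> { pause(50); done.incrementAndGet(); });  // 50 ms of simulated work
        }
        // Each fetch pays a simulated 50 ms round trip to the task server.
        Supplier<Runnable> slowServer = () -> { pause(50); return pending.poll(); };
        int executed = new PrefetchingHost(slowServer).runAll();
        System.out.println("executed " + executed + " tasks");
    }
}
```

Because the next fetch is issued before `task.run()`, a host in this sketch waits for the task server only when a fetch takes longer than the task it overlaps with.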
JICOS Speedup: 150-City TSP
  1 processor: 6 hours, 18 minutes
  52 processors: 8 minutes
  16K tasks; average task time: 1.5 seconds
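The speedup implied by these two timings can be worked out directly. The sketch below uses the standard definitions (speedup = one-processor time / p-processor time; efficiency = speedup / p); the class and method names are illustrative only. The slide reports times rounded to the minute, so the results are approximate.

```java
import java.util.Locale;

class SpeedupCalc {
    // Speedup: how many times faster the parallel run is than the serial run.
    static double speedup(double serialMinutes, double parallelMinutes) {
        return serialMinutes / parallelMinutes;
    }

    // Efficiency: speedup divided by the number of processors used.
    static double efficiency(double serialMinutes, double parallelMinutes, int processors) {
        return speedup(serialMinutes, parallelMinutes) / processors;
    }

    public static void main(String[] args) {
        double serial = 6 * 60 + 18;  // 6 hours, 18 minutes = 378 minutes
        double parallel = 8;          // 8 minutes on 52 processors
        int processors = 52;
        System.out.println(String.format(Locale.ROOT,
                "speedup %.2f, efficiency %.3f",
                speedup(serial, parallel),
                efficiency(serial, parallel, processors)));
        // prints: speedup 47.25, efficiency 0.909
    }
}
```

That is a speedup of roughly 47 on 52 processors, or about 91% efficiency.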
Benefit Summary
  API is easy to use
  Heterogeneous hardware & OS
  Adaptive parallelism
  Tolerates faulty hosts
  Hides/reduces communication latency
  Small-grain parallel computation
Possible Projects: Application as a Service
  Jicos application as web service: allows non-Java programs to “use” Jicos.
  [Figure labels: Non-Java Program; Non-Java Interface; XML-encoded TSP instance; Web Server; TSP; Java; Jicos]
Possible Project: Application Servers
  [Figure labels: SAT; TSP; ILP; ?; JICOS]
Possible Project
  [Figure labels: Web Server (x3); TSP Server; IP Server; SAT Server; JICOS (x3)]
Thanks! Questions? http://cs.ucsb.edu/projects/jicos
Introduction
  “Listen to the technology!” (Carver Mead)
  What is the technology telling us?
    Internet’s idle cycles/sec growing rapidly
    Bandwidth is increasing & getting cheaper
    Communication latency is not decreasing
    Human technology is getting neither cheaper nor faster
Plans
  More performance experiments
    More NP-hard optimizations
    Larger (SDSC Data Star, 256 processors)
    Heterogeneous
  More NP-hard optimizations
    Investigate upper & lower bounds
  Use Jini to enable:
    Secure communication (SSL)
    Dynamic service discovery
  Task servers
    Fault-tolerant
    Adaptive network topology
  More programming models