Published by Dorcas Cameron. Modified over 9 years ago.
1
Programming Distributed Systems with High Level Abstractions
Douglas Thain
University of Notre Dame
23 October 2008
2
Distributed Systems
Scale: 2 – 100s – 1000s – millions
Domains: Single or Multi
Users: 1 – 10 – 100 – 1000 – 10000
Naming: Direct, Virtual
Scheduling: Timesharing / Space Sharing
Interface: Allocate CPU / Execute Job
Security: None / IP / PKI / KRB …
Storage: Embedded / External
3
Cloud Computing?
Scale: 2 – 100s – 1000s – 10000s
Domains: Single or Multi
Users: 1 – 10 – 100 – 1000 – 10000
Naming: Direct, Virtual
Scheduling: Timesharing / Space Sharing
Interface: Allocate CPU / Execute Job
Security: None / IP / PKI / KRB …
Storage: Embedded / External
4
Grid Computing?
Scale: 2 – 100s – 1000s – 10000s
Domains: Single or Multi
Users: 1 – 10 – 100 – 1000 – 10000
Naming: Direct, Virtual
Scheduling: Timesharing / Space Sharing
Interface: Allocate CPU / Execute Job
Security: None / IP / PKI / KRB …
Storage: Embedded / External
5
An Assembly Language of Distributed Computing
Fundamental Operations
– TransferFile( source, destination )
– ExecuteJob( host, exe, input, output )
– AllocateVM( cpu, mem, disk, opsys )
Semantics of Assembly are Subtle:
– When do instructions commit?
– Delay slots before control transfers?
– What exceptions are valid for each opcode?
– Precise or imprecise exceptions?
– What is the cost of each instruction?
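To make the "assembly language" analogy concrete, the three fundamental operations can be written down as plain instruction records. This is only a sketch: the field types, the units for mem and disk, and every host name and path below are hypothetical, and nothing here executes on a real system.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class TransferFile:
    source: str
    destination: str


@dataclass(frozen=True)
class ExecuteJob:
    host: str
    exe: str
    input: str
    output: str


@dataclass(frozen=True)
class AllocateVM:
    cpu: int      # number of cores (assumed unit)
    mem: int      # megabytes (assumed unit)
    disk: int     # gigabytes (assumed unit)
    opsys: str


Instruction = Union[TransferFile, ExecuteJob, AllocateVM]


def describe(program: List[Instruction]) -> List[str]:
    """Return the opcode name of each instruction, in program order."""
    return [type(op).__name__ for op in program]


# A hypothetical four-instruction "program": allocate, stage in, run, stage out.
program: List[Instruction] = [
    AllocateVM(cpu=1, mem=1024, disk=10, opsys="linux"),
    TransferFile("local:/data/in.dat", "node1:/tmp/in.dat"),
    ExecuteJob("node1", "/bin/compare", "/tmp/in.dat", "/tmp/out.dat"),
    TransferFile("node1:/tmp/out.dat", "local:/data/out.dat"),
]

print(describe(program))
```

Even this tiny program leaves the slide's questions open: nothing in the records says when a transfer commits or which failures each step may raise.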
6
Programming in Assembly Stinks
You know the problems:
– Stack management.
– Garbage collection.
– Type checking.
– Co-location of data and computation.
– Query optimizations.
– Function shipping or data shipping?
– How many nodes should I harness?
7
Abstractions for Distributed Computing
Abstraction: a declarative specification of the computation and data of a workload.
A restricted pattern, not meant to be a general purpose programming language.
Avoid the really terrible cases.
Provide users with a bright path.
Data structures instead of file systems.
8
All-Pairs Abstraction
AllPairs( set A, set B, function F ) returns matrix M where M[i][j] = F( A[i], B[j] ) for all i,j
[Figure: sets A1..An and B1..Bn are passed to AllPairs(A,B,F), which applies F to every pair to fill the matrix.]
Moretti, Bulosan, Flynn, Thain, AllPairs: An Abstraction… IPDPS 2008
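The definition M[i][j] = F( A[i], B[j] ) can be read as a sequential reference implementation. The sketch below only illustrates the semantics on a single machine; it is not the distributed implementation described in the talk, and the comparison function abs_diff is a made-up stand-in for F.

```python
from typing import Callable, List, Sequence, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")
R = TypeVar("R")


def all_pairs(set_a: Sequence[X], set_b: Sequence[Y],
              f: Callable[[X, Y], R]) -> List[List[R]]:
    """Return matrix M with M[i][j] = f(set_a[i], set_b[j]) for all i, j."""
    return [[f(a, b) for b in set_b] for a in set_a]


def abs_diff(x: int, y: int) -> int:
    """Hypothetical comparison function standing in for F."""
    return abs(x - y)


matrix = all_pairs([1, 5, 9], [2, 5], abs_diff)
print(matrix)  # [[1, 4], [3, 0], [7, 4]]
```

Because the caller states only A, B, and F, an implementation is free to decide how to batch the work and where to place the data.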
9
Example Application
Goal: Design robust face comparison function.
[Figure: F applied to two pairs of images, producing scores of 0.05 and 0.97.]
10
Similarity Matrix Construction
[Figure: upper-triangular similarity matrix computed by F]
1   .8  .1  0   0   .1
    1   0   .1  .1  0
        1   0   .1  .3
            1   0   0
                1   .1
                    1
Current Workload: 4000 images, 256 KB each, 10s per F (five days)
Future Workload: 60000 images, 1MB each, 1s per F (three months)
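The workload figures can be checked with a little arithmetic. The sketch below assumes that the full N x N matrix is computed and that the parenthetical times refer to an ideal parallel run on the 500 CPUs mentioned elsewhere in the talk; the slide states neither assumption, so treat the result as an order-of-magnitude check only.

```python
SECONDS_PER_DAY = 86_400
ASSUMED_CPUS = 500  # assumption: not stated on this slide


def workload(images: int, seconds_per_f: float, cpus: int):
    """Return (comparisons, total CPU-seconds, ideal wall-clock days)."""
    comparisons = images * images             # full N x N matrix
    cpu_seconds = comparisons * seconds_per_f
    wall_days = cpu_seconds / cpus / SECONDS_PER_DAY
    return comparisons, cpu_seconds, wall_days


current = workload(4000, 10, ASSUMED_CPUS)
future = workload(60000, 1, ASSUMED_CPUS)

print(current[0], round(current[2], 1))  # 16000000 3.7
print(future[0], round(future[2], 1))    # 3600000000 83.3
```

Under these assumptions the ideal times come out near 3.7 days and 83 days, the same order as the "five days" and "three months" quoted on the slide.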
11
http://www.cse.nd.edu/~ccl/viz
12
Non-Expert User Using 500 CPUs
Try 1: Each F is a batch job. Failure: Dispatch latency >> F runtime.
Try 2: Each row is a batch job. Failure: Too many small ops on FS.
Try 3: Bundle all files into one package. Failure: Everyone loads 1GB at once.
Try 4: User gives up and attempts to solve an easier or smaller problem.
[Figures: for each try, a head node (HN) dispatching F jobs to CPUs.]
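The failure in Try 1 can be illustrated with a very simple cost model: if every job pays a fixed dispatch latency, the share of time spent on useful work depends on how many evaluations of F each job carries. The model and the 30-second latency below are hypothetical and not taken from the talk, and the file system effects behind Try 2 and Try 3 are not modeled at all.

```python
def useful_fraction(evals_per_job: int, seconds_per_f: float,
                    dispatch_latency: float) -> float:
    """Fraction of a job's time spent computing F rather than waiting on dispatch."""
    work = evals_per_job * seconds_per_f
    return work / (work + dispatch_latency)


# Hypothetical numbers: F takes 1 s, dispatching a batch job takes 30 s.
for k in (1, 100, 10_000):
    print(k, round(useful_fraction(k, 1.0, 30.0), 3))
# 1 0.032
# 100 0.769
# 10000 0.997
```

With one F per job almost all of the time goes to dispatch, which is why some form of batching is needed before the other bottlenecks even come into view.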
17
What is the right metric?
Speedup?
– Seq Runtime / Parallel Runtime
Parallel Efficiency?
– Speedup / N CPUs?
Neither works, because the number of CPUs varies over time and between runs.
Cost Efficiency
– Work Completed / Resources Consumed
– Person-Miles / Gallon
– Results / CPU-hours
– Results / $$$
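The three metrics can be written as small functions. The sketch below uses made-up numbers; the point is that cost efficiency is still well defined when the CPU count changes during a run, because CPU-hours can be summed over intervals, whereas speedup and parallel efficiency need a single fixed N.

```python
from typing import List, Tuple


def speedup(seq_runtime: float, par_runtime: float) -> float:
    return seq_runtime / par_runtime


def parallel_efficiency(seq_runtime: float, par_runtime: float, n_cpus: int) -> float:
    return speedup(seq_runtime, par_runtime) / n_cpus


def cpu_hours(usage: List[Tuple[int, float]]) -> float:
    """Sum CPU-hours over intervals given as (cpus in use, hours)."""
    return sum(cpus * hours for cpus, hours in usage)


def cost_efficiency(results: int, consumed_cpu_hours: float) -> float:
    """Results per CPU-hour."""
    return results / consumed_cpu_hours


# Hypothetical run whose CPU count varies over time.
usage = [(100, 2.0), (500, 1.0), (250, 4.0)]
print(speedup(1000.0, 4.0))                    # 250.0
print(parallel_efficiency(1000.0, 4.0, 500))   # 0.5
print(cpu_hours(usage))                        # 1700.0
print(cost_efficiency(3400, cpu_hours(usage)))  # 2.0
```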
18
All-Pairs Abstraction
19
Classify Abstraction
Classify( T, R, N, P, F )
T = testing set
R = training set
N = # of partitions
F = classifier
[Figure: diagram with labels T, R, P, T1, T2, T3, three instances of F, V1, V2, V3, and CV.]
Moretti, Steinhauser, Thain, Chawla, Scaling up Classifiers to Cloud Computers, ICDM 2008.
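One possible reading of Classify( T, R, N, P, F ) is sketched below. The slide does not define P or the combining step, so several things here are assumptions: P is taken to be a partitioning function that splits the testing set T into N pieces (following the T1..T3 labels in the figure), F is taken to map a training set and a batch of test items to predicted labels, and the per-partition outputs are simply collected. The toy nearest-neighbor classifier and the data are invented for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, str]  # (feature value, label); toy representation


def round_robin(items: Sequence[float], n: int) -> List[List[float]]:
    """Hypothetical partitioner P: deal items into n partitions."""
    return [list(items[i::n]) for i in range(n)]


def nearest_neighbor(training: Sequence[Example],
                     tests: Sequence[float]) -> List[str]:
    """Hypothetical classifier F: label each test item by its nearest training example."""
    return [min(training, key=lambda ex: abs(ex[0] - x))[1] for x in tests]


def classify(T: Sequence[float], R: Sequence[Example], N: int,
             P: Callable[[Sequence[float], int], List[List[float]]],
             F: Callable[[Sequence[Example], Sequence[float]], List[str]]
             ) -> List[Tuple[float, str]]:
    partitions = P(T, N)                        # T1 .. TN
    votes = [F(R, part) for part in partitions]  # V1 .. VN, independent tasks
    # Collect the per-partition outputs into one list of (item, label) pairs.
    return [(x, label)
            for part, preds in zip(partitions, votes)
            for x, label in zip(part, preds)]


training = [(1.0, "low"), (2.0, "low"), (9.0, "high"), (10.0, "high")]
tests = [1.5, 8.0, 0.5, 9.5]
result = classify(tests, training, 2, round_robin, nearest_neighbor)
print(result)
```

In this reading each call to F is independent of the others, which is what would let an implementation run the partitions on separate machines.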
21
BXGrid Abstractions
S = Select( color="brown" )
B = Transform( S, F )
M = AllPairs( A, B, F )
[Figure: records tagged by eye and eye color (L brown, L blue, R brown, ...) are selected into S1, S2, S3, transformed by F, compared with AllPairs, and summarized as an ROC curve.]
Bui, Thomas, Kelly, Lyon, Flynn, Thain BXGrid: A Repository and Experimental Abstraction… in review 2008.
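The three-step pipeline can be mimicked in memory to show how the abstractions compose. This is a toy sketch, not BXGrid itself: the record fields, the contents of set A, and the two functions passed in are invented, and where the slide uses F for both Transform and AllPairs, the sketch passes two different hypothetical functions.

```python
from typing import Any, Callable, Dict, List, Sequence

Record = Dict[str, Any]


def select(records: Sequence[Record], **criteria: Any) -> List[Record]:
    """Keep the records whose fields match every keyword criterion."""
    return [r for r in records
            if all(r.get(k) == v for k, v in criteria.items())]


def transform(records: Sequence[Any], f: Callable[[Any], Any]) -> List[Any]:
    """Apply f to each record independently."""
    return [f(r) for r in records]


def all_pairs(a: Sequence[Any], b: Sequence[Any],
              f: Callable[[Any, Any], Any]) -> List[List[Any]]:
    """Return M with M[i][j] = f(a[i], b[j])."""
    return [[f(x, y) for y in b] for x in a]


# Hypothetical repository contents.
records = [
    {"id": "S1", "color": "brown", "value": 3},
    {"id": "S2", "color": "blue", "value": 7},
    {"id": "S3", "color": "brown", "value": 8},
]

S = select(records, color="brown")
B = transform(S, lambda r: r["value"])         # [3, 8]
A = [3, 5]                                     # assumed second set
M = all_pairs(A, B, lambda x, y: abs(x - y))   # [[0, 5], [2, 3]]
print(M)
```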
22
Implementing Abstractions
S = Select( color="brown" )
B = Transform( S, F )
M = AllPairs( A, B, F )
[Figure: the operations run on a DBMS Relational Database (2x), an Active Storage Cluster (16x), and a Condor Pool (500x).]
24
Compatibility of Abstractions?
[Figure: boxes labeled Assembly Language, Map-Reduce, All-Pairs, Classify.]
25
Compatibility of Abstractions?
[Figure: boxes labeled Assembly Language, Map-Reduce, All-Pairs, Classify, with "???" between the abstractions.]
Mismatch: MR relies on data partition. AP relies on data re-use.
Mismatch: Classify partitions logically. MR partitions physically.
26
Compatibility of Abstractions?
[Figure: boxes labeled Assembly Language, Map-Reduce, All-Pairs, Classify, Swift, Dryad.]
More General, Less Optimized?
27
From Clouds to Multicore
Next Step: AP Implementation that runs well on Single CPU, Multicore, Cloud, or Cloud of Multicores.
[Figure: two copies of the abstraction diagram (Assembly Language, Map-Reduce, All-Pairs, Classify, Dryad, Swift), one labeled CPU and one labeled CPU, $$$, RAM.]
28
Acknowledgments
Cooperative Computing Lab
– http://www.cse.nd.edu/~ccl
Grad Students:
– Chris Moretti
– Hoang Bui
– Michael Albrecht
– Li Yu
NSF Grants CCF-0621434, CNS-0643229
Undergraduate Students
– Mike Kelly
– Rory Carmichael
– Mark Pasquier
– Christopher Lyon
– Jared Bulosan