
1 Using Abstractions to Scale Up Applications to Campus Grids Douglas Thain University of Notre Dame 28 April 2009

2 Outline What is a Campus Grid? Challenges of Using Campus Grids. Solution: Abstractions Examples and Applications –All-Pairs: Biometrics –Wavefront: Economics –Assembly: Genomics Combining Abstractions Together

3 What is a Campus Grid? A campus grid is an aggregation of all available computing power found in an institution: –Idle cycles from desktop machines. –Unused cycles from dedicated clusters. Examples of campus grids: –600 CPUs at the University of Notre Dame –2000 CPUs at the University of Wisconsin –13,000 CPUs at Purdue University

4 Condor Provides robust batch queueing on a complex distributed system. Resource owners control consumption: –“Only run jobs on this machine at night.” –“Prefer biology jobs over physics jobs.” End users express needs: –“Only run this job where RAM>2GB” –“Prefer to run on machines …” http://www.cs.wisc.edu/condor

5

6

7

8

9

10 The Assembly Language of Campus Grids User Interface: –N x { run program X with files F and G } System Properties: –Wildly varying resource availability. –Heterogeneous resources. –Unpredictable preemption. Effect on Applications: –Jobs can’t run for too long... –But, they can’t run too quickly, either! –Use file I/O for inter-process communication. –Bad choices cause chaos on the network and heartburn for system administrators.

11 I have 10,000 iris images acquired in my research lab. I want to reduce each one to a feature space, and then compare all of them to each other. I want to spend my time doing science, not struggling with computers. I have a laptop. I own a few machines. I can get cycles from ND and Purdue. Now What?

12 Observation In a given field of study, a single person may repeat the same pattern of work many times, making slight changes to the data and algorithms. If we knew in advance the intended pattern, then we could do a better job of mapping a complex application to a complex system.

13 Abstractions for Distributed Computing Abstraction: a declarative specification of the computation and data of a workload. A restricted pattern, not meant to be a general purpose programming language. Uses data structures instead of files. Provides users with a bright path. Regular structure makes it tractable to model and predict performance.

14 All-Pairs Abstraction AllPairs( set A, set B, function F ) returns matrix M where M[i][j] = F( A[i], B[j] ) for all i,j. [diagram: F is applied to every pairing of A1..An with B1..Bn to fill the matrix] Invocation: allpairs A B F.exe. Moretti, Bulosan, Flynn, Thain, AllPairs: An Abstraction… IPDPS 2008

15 Example Application Goal: Design robust face comparison function. [diagram: F scores one pair of face images 0.05 and another pair 0.97]

16 Similarity Matrix Construction [diagram: a partially filled matrix of pairwise F scores] Current Workload: 4000 images, 256 KB each, 10s per F (five days). Future Workload: 60000 images, 1MB each, 1s per F (three months).

17 Non-Expert User on a Campus Grid Try 1: Each F is a batch job. Failure: Dispatch latency >> F runtime. Try 2: Each row is a batch job. Failure: Too many small ops on FS. Try 3: Bundle all files into one package. Failure: Everyone loads 1GB at once. Try 4: User gives up and attempts to solve an easier or smaller problem.

18 All-Pairs Abstraction AllPairs( set A, set B, function F ) returns matrix M where M[i][j] = F( A[i], B[j] ) for all i,j. [diagram: F applied to every pairing of A1..An with B1..Bn]

19 % allpairs compare.exe adir bdir
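
The command line above is the whole user interface; the sketch below is only a sequential reference for what the abstraction computes, not the distributed implementation. The compare function and file names are invented stand-ins for the user's F (compare.exe) and the image directories.

# Sequential reference for the All-Pairs semantics (illustrative only):
# M[i][j] = F(A[i], B[j]) for every element of sets A and B.
# The real engine partitions this work across the campus grid.

def all_pairs(A, B, F):
    """Return the |A| x |B| matrix M with M[i][j] = F(A[i], B[j])."""
    return [[F(a, b) for b in B] for a in A]

# Hypothetical similarity function standing in for compare.exe.
def compare(a, b):
    return 1.0 if a == b else 0.0

A = ["iris_001.tif", "iris_002.tif"]
B = ["iris_001.tif", "iris_003.tif"]
M = all_pairs(A, B, compare)
print(M)   # [[1.0, 0.0], [0.0, 0.0]]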

20

21 Distribute Data Via Spanning Tree

22

23 An Interesting Twist Send the absolute minimum amount of data needed to each of N nodes from a central server: –Each job must run on exactly 1 node. –Data distribution time: O( D sqrt(N) ) Send all data to all N nodes via spanning tree distribution: –Any job can run on any node. –Data distribution time: O( D log(N) ) It is both faster and more robust to send all data to all nodes via spanning tree.
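
A back-of-the-envelope illustration of this claim. Only the asymptotic shapes O(D·sqrt(N)) and O(D·log N) come from the slide; the 1 GB dataset size and 100 Mb/s per-link rate are assumptions made purely for the sketch.

import math

D_bytes = 1e9          # 1 GB dataset (assumed)
rate = 100e6 / 8       # 100 Mb/s link, in bytes/s (assumed)

for N in (16, 64, 256, 1024):
    minimal = D_bytes / rate * math.sqrt(N)   # minimal-data from a central server: O(D sqrt(N))
    spanning = D_bytes / rate * math.log2(N)  # full copy to every node via spanning tree: O(D log N)
    print(f"N={N:5d}  minimal-data: {minimal:7.0f}s  spanning-tree: {spanning:7.0f}s")

At N=1024 the spanning tree finishes in roughly a quarter of the time of the minimal-data strategy, while also letting any job run on any node.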

24 Choose the Right # of CPUs

25 What is the right metric?

26 What’s the right metric? Speedup? –Seq Runtime / Parallel Runtime Parallel Efficiency? –Speedup / N CPUs? Neither works, because the number of CPUs varies over time and between runs. Better Choice: Cost Efficiency –Work Completed / Resources Consumed –Cars: Miles / Gallon –Planes: Person-Miles / Gallon –Results / CPU-hours –Results / $$$
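
A worked example of the suggested metric. The run names and numbers below are invented; the point is only that results per CPU-hour can always be computed from accounting logs, even when the pool size changes mid-run and speedup is ill-defined.

# Cost efficiency: work completed per resources consumed (slide 26).
runs = {
    # name: (results completed, CPU-hours actually consumed) -- invented numbers
    "Night run (pool grew 50 -> 300 CPUs)": (16_000_000, 2_400),
    "Day run   (pool stuck near 80 CPUs)":  (16_000_000, 2_000),
}
for name, (results, cpu_hours) in runs.items():
    print(f"{name}: {results / cpu_hours:,.0f} results per CPU-hour")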

27 All-Pairs Abstraction

28 Wavefront ( R[x,0], R[0,y], F(x,y,d) ) [diagram: the boundary cells R[x,0] and R[0,y] are given; each interior cell R[x,y] is produced by F from its x-neighbor, y-neighbor, and diagonal (d) neighbor, so results fill in along a diagonal wavefront]

29 % wavefront func.exe infile outfile
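
As with All-Pairs, a minimal sequential sketch of the Wavefront semantics may help. The argument order of F and the toy boundary values are assumptions made for illustration; the real F might be a dynamic-programming step in sequence alignment or an economics value-function update.

# Sequential reference for the Wavefront recurrence (illustrative only):
#   R[x][y] = F(R[x-1][y], R[x][y-1], R[x-1][y-1])
# given the boundary rows R[x][0] and R[0][y].

def wavefront(row0, col0, F):
    n = len(row0)                          # assume a square problem with matching boundaries
    R = [[None] * n for _ in range(n)]
    R[0] = list(row0)                      # R[0][y] boundary
    for x in range(n):
        R[x][0] = col0[x]                  # R[x][0] boundary
    for x in range(1, n):
        for y in range(1, n):
            R[x][y] = F(R[x - 1][y], R[x][y - 1], R[x - 1][y - 1])
    return R

# Toy F that just sums its three neighbors.
result = wavefront([0, 1, 2, 3], [0, 1, 2, 3], lambda x, y, d: x + y + d)
print(result[3][3])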

30 The Performance Problem Dispatch latency really matters: a delay in one result holds up all of its children. If we dispatch larger sub-problems: –Concurrency on each node increases. –Distributed concurrency decreases. If we dispatch smaller sub-problems: –Concurrency on each node decreases. –Spend more time waiting for jobs to be dispatched. So, model the system to choose the block size. And, build a fast-dispatch execution system.

31 Wavefront ( R[x,0], R[0,y], F(x,y,d) ) [diagram: the same wavefront recurrence, now dispatched as 2x2 blocks of cells per job] Block Size = 2

32 Model of 1000x1000 Wavefront

33 [diagram: a wavefront master keeps a queue of tasks and a tally of tasks done, and dispatches work to 100s of workers at Notre Dame, Purdue, and Wisconsin; each worker receives the function and input (“put F.exe”, “put in.txt”), executes F.exe to produce out.txt, and returns the result (“get out.txt”)]

34 [diagram: the same master/worker protocol, now showing the wavefront program itself feeding F tasks into the work queue for the 100s of workers at Notre Dame, Purdue, and Wisconsin]
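
A hypothetical sketch of the put/exec/get protocol pictured in the two diagrams above. The Worker class, send_file, and the round-robin master loop are invented for illustration and are not the project's actual work-queue API; real workers run remotely, cache the executable, and tolerate preemption and failure.

import queue

class Worker:
    """Stand-in for a remote worker process reached over the network."""
    def __init__(self, name):
        self.name, self.files = name, {}
    def send_file(self, name, data):          # "put F.exe" / "put in.txt"
        self.files[name] = data
    def execute(self, cmd):                   # "exec F.exe" producing out.txt
        self.files["out.txt"] = f"ran {cmd} on {self.files.get('in.txt')}"
    def fetch_file(self, name):               # "get out.txt"
        return self.files[name]

def master(tasks, workers):
    pending, results = queue.Queue(), []
    for t in tasks:
        pending.put(t)
    while not pending.empty():
        for w in workers:                     # naive round-robin dispatch, no failures modelled
            if pending.empty():
                break
            task = pending.get()
            w.send_file("F.exe", b"...")      # shipped per task here; a real master caches it
            w.send_file("in.txt", task)
            w.execute("F.exe")
            results.append(w.fetch_file("out.txt"))
    return results

print(master(["cell(1,1)", "cell(1,2)"], [Worker("nd-01"), Worker("purdue-07")]))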

35 500x500 Wavefront on ~200 CPUs

36 Wavefront on a 200-CPU Cluster

37 Wavefront on a 32-Core CPU

38 The Genome Assembly Problem [diagram: chemical sequencing breaks the full sequence AGTCGATCGATCGATAATCGATCCTAGCTAGCTACGA into millions of “reads” 100s of bytes long (e.g. AGTCGATCGATCGAT, TCGATAATCGATCCTAGCTA, AGCTAGCTACGA); computational assembly reconstructs the original sequence from the overlapping reads]

39 Sample Genomes

Genome                 Reads   Data    Pairs   Sequential Time
A. gambiae scaffold    101K    80MB    738K    12 hours
A. gambiae complete    180K    1.4GB   12M     6 days
S. bicolor             7.9M    5.7GB   84M     30 days

40 Assemble( set S, Test(), Align(), Assm() ) [diagram: sequence data (AGTCGATCGATCGATAATC…) feeds the I/O-bound Test stage, which proposes candidate pairs (“0 is similar to 1”, “1 is similar to 3”, “1 is similar to 4”); the CPU-bound Align stage turns candidate pairs of reads into a list of alignments; the RAM-bound Assem stage produces the assembled sequence]
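
A toy sketch of the three stages named in Assemble(S, Test(), Align(), Assm()). The k-mer heuristic, overlap scoring, and "assembly" below are deliberately trivial stand-ins, used only to show how candidate selection (I/O bound), alignment (CPU bound), and assembly (RAM bound) hand data to one another.

def test(reads, k=5):
    """I/O-bound stage: propose candidate pairs that share a k-mer (toy heuristic)."""
    index = {}
    for i, r in enumerate(reads):
        for j in range(len(r) - k + 1):
            index.setdefault(r[j:j + k], []).append(i)
    return sorted({tuple(sorted((a, b))) for hits in index.values()
                   for a in hits for b in hits if a != b})

def align(reads, pair):
    """CPU-bound stage: score one candidate pair by longest suffix/prefix overlap (toy)."""
    a, b = reads[pair[0]], reads[pair[1]]
    best = 0
    for n in range(1, min(len(a), len(b)) + 1):
        if a[-n:] == b[:n]:
            best = n
    return pair, best

def assemble(reads, alignments):
    """RAM-bound stage: order reads along the strongest overlaps (a real assembler emits contigs)."""
    return sorted((s for s in alignments if s[1] > 0), key=lambda s: -s[1])

reads = ["AGTCGATCGATCGAT", "TCGATAATCGATCCTAGCTA", "AGCTAGCTACGA"]
pairs = test(reads)
alignments = [align(reads, p) for p in pairs]
print(assemble(reads, alignments))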

41 Distributed Genome Assembly [diagram: an align master keeps a queue of tasks and a tally of tasks done for the test, align, and assemble stages, and dispatches work to 100s of workers at Notre Dame, Purdue, and Wisconsin; detail of a single worker: it receives align.exe and in.txt, executes it, and returns out.txt]

42 Small Genome (101K reads)

43 Medium Genome (180K reads)

44 Large Genome (7.9M reads)

45 From Workstation to Grid

46 What’s the Upshot? We can do full-scale assemblies as a routine matter on existing conventional machines. Our solution is faster (wall-clock time) than the next fastest assembler run on 1024x BG/L. You could almost certainly do better with a dedicated cluster and a fast interconnect, but such systems are not universally available. Our solution opens up research in assembly to labs with “NASCAR” instead of “Formula-One” hardware.

47 What Other Abstractions Might Be Useful? Map( set S, F(s) ) Explore( F(x), x: [a….b] ) Minimize( F(x), delta ) Minimax( state s, A(s), B(s) ) Search( state s, F(s), IsTerminal(s) ) Query( properties ) -> set of objects FluidFlow( V[x,y,z], F(v), delta )

48 How do we connect multiple abstractions together? Need a meta-language, perhaps with its own atomic operations for simple tasks. Need to manage (possibly large) intermediate storage between operations. Need to handle data type conversions between almost-compatible components. Need type reporting and error checking to avoid expensive errors. If abstractions are feasible to model, then it may be feasible to model entire programs.

49 Connecting Abstractions in BXGrid S = Select( color=“brown” ) B = Transform( S, F ) M = AllPairs( A, B, F ) [diagram: Select filters iris records by eye color (Lbrown, Lblue, Rbrown, …) into S1, S2, S3; Transform applies F to each selected record; All-Pairs compares every pairing with F, yielding an ROC curve] Bui, Thomas, Kelly, Lyon, Flynn, Thain, BXGrid: A Repository and Experimental Abstraction… poster at IEEE eScience 2008
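
A local, single-machine sketch of chaining the three operations on this slide. The select, transform, and all_pairs functions are plain Python stand-ins for the distributed BXGrid implementations, and the iris records and similarity score are invented.

def select(records, **criteria):
    """Pick the records matching every criterion (e.g. color='brown')."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]

def transform(records, F):
    """Apply F (e.g. feature extraction) to each selected record."""
    return [F(r) for r in records]

def all_pairs(A, B, F):
    """M[i][j] = F(A[i], B[j]) -- same semantics as slides 14 and 18."""
    return [[F(a, b) for b in B] for a in A]

irises = [
    {"id": "S1", "color": "brown", "pixels": [1, 2, 3]},
    {"id": "S2", "color": "blue",  "pixels": [4, 5, 6]},
    {"id": "S3", "color": "brown", "pixels": [1, 2, 4]},
]

S = select(irises, color="brown")
B = transform(S, lambda r: set(r["pixels"]))               # toy "feature" extraction
M = all_pairs(B, B, lambda a, b: len(a & b) / len(a | b))  # toy similarity score
print(M)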

50

51

52 Implementing Abstractions S = Select( color=“brown” ) B = Transform( S, F ) M = AllPairs( A, B, F ) [diagram: the chain runs across a relational database (2x) backed by a DBMS, an active storage cluster (16x), and a Condor pool (500x)]

53 Largest Combination so Far Complete Select/Transform/All-Pairs biometric experiment on 58,396 irises from the Face Recognition Grand Challenge. To our knowledge, the largest experiment ever run on publicly available data. Competing biometric research relies on samples of 100-1000 images, which can miss important population effects. Reduced computation time from 800 days to 10 days, making it feasible to repeat multiple times for a graduate thesis.

54

55 Abstractions Redux Campus grids provide enormous computing power, but are very challenging to use effectively. An abstraction provides a robust, scalable solution to a category of problems. Multiple abstractions can be chained together to solve very large problems. Could a menu of abstractions cover a significant fraction of the application mix?

56 Acknowledgments Cooperative Computing Lab: –http://www.cse.nd.edu/~ccl Grad Students: –Chris Moretti –Hoang Bui –Li Yu –Mike Olson –Michael Albrecht Faculty: –Patrick Flynn –Nitesh Chawla –Kenneth Judd –Scott Emrich Undergrads: –Mike Kelly –Rory Carmichael –Mark Pasquier –Christopher Lyon –Jared Bulosan NSF Grants CCF-0621434, CNS-0643229

