Using Small Abstractions to Program Large Distributed Systems Douglas Thain University of Notre Dame 11 December 2008

Using Small Abstractions to Program Large Distributed Systems Douglas Thain University of Notre Dame 11 December 2008 (And multicore computers!)

Clusters, clouds, and grids give us access to zillions of CPUs. How do we write programs that can run effectively in large systems?

I have 10,000 iris images acquired in my research lab. I want to reduce each one to a feature space, and then compare all of them to each other. I want to spend my time doing science, not struggling with computers. I have a laptop. I own a few machines. I can buy time from Amazon or TeraGrid. Now what?

How do I program a CPU? I write the algorithm in a language that I find convenient: C, Fortran, Python, etc. The compiler chooses instructions for the CPU, even if I don't know assembly. The operating system allocates memory, moves data between disk and memory, and manages the cache. To move to a different CPU, recompile or use a VM, but don't change the program.

How do I program the grid/cloud?
- Split the workload into pieces.
  - How much work to put in a single job?
- Decide how to move data.
  - Demand paging, streaming, or file transfer?
- Express the problem in a workflow language or programming environment.
  - DAG / MPI / Pegasus / Taverna / Swift?
- Babysit the problem as it runs.
  - Worry about disk, network, and failures.

How do I program on 128 cores?
- Split the workload into pieces.
  - How much work to put in a single thread?
- Decide how to move data.
  - Shared memory, message passing, or streaming?
- Express the problem in a workflow language or programming environment.
  - OpenMP, MPI, PThreads, Cilk, …
- Babysit the problem as it runs.
  - Implement application-level checkpoints.

Tomorrow’s distributed systems will be clouds of multicore computers. So, we have to solve both problems with a single model.

Observation
In a given field of study, a single person may repeat the same pattern of work many times, making slight changes to the data and algorithms.
Examples everyone knows:
- Parameter sweep on a simulation code.
- BLAST search across multiple databases.
Are there other examples?

Abstractions for Distributed Computing
Abstraction: a declarative specification of the computation and data of a workload.
- A restricted pattern, not meant to be a general-purpose programming language.
- Uses data structures instead of files.
- Provides users with a bright path.
- Regular structure makes it tractable to model and predict performance.

All-Pairs Abstraction
AllPairs( set A, set B, function F ) returns matrix M where M[i][j] = F( A[i], B[j] ) for all i,j
[Diagram: every element of set A is compared against every element of set B by the function F, filling the result matrix M.]
Command-line form: allpairs A B F.exe
Moretti, Bulosan, Flynn, Thain, AllPairs: An Abstraction… IPDPS 2008
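To make the semantics concrete, here is a minimal sequential sketch in Python of what AllPairs computes. The function name and the toy example are illustrative, not the tool's actual API, and a real implementation distributes the work across many CPUs.

    # Minimal sequential sketch of the All-Pairs semantics above. A real
    # implementation partitions the matrix and dispatches blocks of work to
    # many CPUs; the names here are illustrative only.
    def all_pairs(A, B, F):
        """Return matrix M where M[i][j] = F(A[i], B[j])."""
        return [[F(a, b) for b in B] for a in A]

    # Example: a toy similarity function over two small sets.
    M = all_pairs([1, 2, 3], [4, 5, 6], lambda a, b: abs(a - b))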

Example Application
Goal: design a robust face comparison function F.
[Figure: F applied to example image pairs, yielding scores such as 0.05 and 0.97.]

Similarity Matrix Construction
Current workload: 4,000 images, 256 KB each, 10 s per F (five days).
Future workload: images, 1 MB each, 1 s per F (three months).
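For scale, a rough back-of-envelope estimate of the current workload, assuming the ~500-CPU pool mentioned on the next slide; the CPU count and the full n-by-n matrix are assumptions made for this illustration.

    # Back-of-envelope check of the current workload, assuming a ~500-CPU
    # pool (an assumption taken from the next slide, not stated here).
    n_images = 4000
    secs_per_f = 10
    cpus = 500

    comparisons = n_images ** 2                   # full n x n similarity matrix
    total_cpu_seconds = comparisons * secs_per_f
    wall_clock_days = total_cpu_seconds / cpus / 86400
    print(round(wall_clock_days, 1))              # ~3.7 days, i.e. roughly "five days"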

Non-Expert User Using 500 CPUs
Try 1: Each F is a batch job. Failure: dispatch latency >> F runtime.
Try 2: Each row is a batch job. Failure: too many small operations on the file system.
Try 3: Bundle all files into one package. Failure: everyone loads 1 GB at once.
Try 4: The user gives up and attempts to solve an easier or smaller problem.

All-Pairs Abstraction (recap)
AllPairs( set A, set B, function F ) returns matrix M where M[i][j] = F( A[i], B[j] ) for all i,j

How Many CPUs on the Cloud?

What is the right metric?

How to measure in clouds?
- Speedup? Sequential runtime / parallel runtime.
- Parallel efficiency? Speedup / N CPUs.
Neither works, because the number of CPUs varies over time and between runs.
An alternative: cost efficiency.
- Work completed / resources consumed
- Person-miles / gallon
- Results / CPU-hours
- Results / $$$
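A minimal sketch of the proposed metric, with made-up numbers, just to show that it does not depend on knowing how many CPUs were active at any moment.

    # Sketch of the cost-efficiency idea: results per unit of resource
    # consumed, which stays meaningful even when the number of CPUs
    # fluctuates during a run. The numbers below are illustrative only.
    results_completed = 250_000        # e.g., cells of an All-Pairs matrix
    cpu_hours_consumed = 1_800         # summed over all workers, however many

    cost_efficiency = results_completed / cpu_hours_consumed
    print(f"{cost_efficiency:.1f} results per CPU-hour")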

All-Pairs Abstraction

Wavefront( R[x,0], R[0,y], F(x,y,d) )
[Diagram: a recurrence over a 2-D grid. Given the boundary row R[x,0] and column R[0,y], each cell R[x,y] is computed by F from its neighboring cells and the diagonal value d, so completed results sweep across the matrix as a wavefront.]
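A minimal sequential sketch of the recurrence, assuming a square result and that F's arguments are the left, lower, and diagonal neighbors; the exact argument order is an assumption for illustration.

    # Minimal sequential sketch of the Wavefront recurrence: given the
    # boundary row R[x][0] and column R[0][y], each interior cell is computed
    # by F from its three already-computed neighbors. A distributed
    # implementation runs each anti-diagonal of ready cells in parallel.
    def wavefront(row0, col0, F):
        n = len(row0)                            # assume a square n x n result
        R = [[None] * n for _ in range(n)]
        R[0] = list(row0)                        # boundary R[0][y]
        for x in range(n):
            R[x][0] = col0[x]                    # boundary R[x][0]
        for x in range(1, n):
            for y in range(1, n):
                R[x][y] = F(R[x - 1][y], R[x][y - 1], R[x - 1][y - 1])
        return R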

Implementing Wavefront
[Diagram: a distributed master decomposes the complete input into blocks and assigns them to multicore masters; each multicore master runs F on its local cores against its block of input and returns output that the distributed master assembles into the complete result.]

The Performance Problem
Dispatch latency really matters: a delay in one job holds up all of its children.
If we dispatch larger sub-problems:
- Concurrency on each node increases.
- Distributed concurrency decreases.
If we dispatch smaller sub-problems:
- Concurrency on each node decreases.
- More time is spent waiting for jobs to be dispatched.
So, model the system to choose the block size, and build a fast-dispatch execution system.
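A deliberately simplified model of this trade-off, not the model used in the published work; the constants are made up, but it shows why an intermediate block size wins.

    # Simplified model of wavefront turnaround time as a function of block
    # size b, for an n x n problem with per-cell cost t seconds and per-job
    # dispatch latency D seconds. Illustration only; values are invented.
    def turnaround(n, b, t, D):
        block_waves = 2 * (n // b) - 1       # block anti-diagonals on the critical path
        time_per_block = D + (b * b) * t     # dispatch latency + sequential compute
        return block_waves * time_per_block

    # Larger blocks amortize dispatch latency; smaller blocks expose more parallelism.
    for b in (10, 50, 100, 250):
        print(b, turnaround(n=1000, b=b, t=0.01, D=30))

With these illustrative numbers the minimum falls around b = 50: big enough to amortize the 30 s dispatch latency, small enough to keep many blocks in flight.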

Model of 1000x1000 Wavefront

500x500 Wavefront on ~200 CPUs

Wavefront on a 200-CPU Grid

Wavefront on a 32-Core CPU

Classify Abstraction
Classify( T, R, N, P, F )
T = testing set, R = training set, N = # of partitions, F = classifier
[Diagram: P splits the testing set T into partitions T1…TN; each partition is classified by F against the training set R, producing partial results V1…VN that are combined into the complete output.]
Moretti, Steinhaeuser, Thain, Chawla, Scaling Up Classifiers to Cloud Computers, ICDM 2008.
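A minimal sequential sketch of one plausible reading of the slide's labels: P partitions the testing set, F classifies each partition against the training set, and the partial results are merged. Both the partitioner and the merge step are assumptions for illustration.

    # Sketch of Classify as labelled on the slide; how P splits the data and
    # how the partial results are merged are assumptions for illustration.
    def classify(T, R, N, P, F):
        partitions = P(T, N)                            # produce T1 ... TN
        partial = [F(part, R) for part in partitions]   # run in parallel in practice
        return [label for part in partial for label in part]

    # Illustrative partitioner: split T into roughly N equal chunks.
    def split_evenly(T, N):
        k = max(1, len(T) // N)
        return [T[i:i + k] for i in range(0, len(T), k)]

    # e.g., classify(test_images, train_images, N=10, P=split_evenly, F=my_classifier)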

From Abstractions to a Distributed Language

What Other Abstractions Might Be Useful?
- Map( set S, F(s) )
- Explore( F(x), x: [a…b] )
- Minimize( F(x), delta )
- Minimax( state s, A(s), B(s) )
- Search( state s, F(s), IsTerminal(s) )
- Query( properties ) -> set of objects
- FluidFlow( V[x,y,z], F(v), delta )
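Hedged sequential sketches of two entries from this list, just to pin down their intended semantics; distributed versions would farm the independent F calls out to workers, and the parameter choices are illustrative.

    # Sketches of two candidate abstractions; names and defaults are illustrative.
    def map_abstraction(S, F):
        """Map( set S, F(s) ): apply F to every member of S."""
        return [F(s) for s in S]

    def explore(F, a, b, steps=100):
        """Explore( F(x), x: [a..b] ): evaluate F across a parameter range."""
        xs = [a + (b - a) * i / (steps - 1) for i in range(steps)]
        return {x: F(x) for x in xs}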

How do we connect multiple abstractions together?
- Need a meta-language, perhaps with its own atomic operations for simple tasks.
- Need to manage (possibly large) intermediate storage between operations.
- Need to handle data type conversions between almost-compatible components.
- Need type reporting and error checking to avoid expensive errors.
If abstractions are feasible to model, then it may be feasible to model entire programs.

Connecting Abstractions in BXGrid
S = Select( color="brown" )
B = Transform( S, F )
M = AllPairs( A, B, F )
[Diagram: a Select over the repository (e.g., by eye color) produces a set S of images; Transform applies F to each member of S; AllPairs compares the transformed results, and the similarity matrix feeds an ROC curve.]
Bui, Thomas, Kelly, Lyon, Flynn, Thain, BXGrid: A Repository and Experimental Abstraction… poster at IEEE eScience 2008.

Implementing Abstractions
S = Select( color="brown" )  → relational database (2x)
B = Transform( S, F )        → active storage cluster (16x)
M = AllPairs( A, B, F )      → Condor pool (500x)

What is the Most Useful ABI?
Functions can come in many forms:
- Unix program
- C function in source form
- Java function in binary form
Datasets come in many forms:
- Set: list of files, delimited single file, or database query.
- Matrix: sparse element list, or binary layout.
Our current implementations require a particular form. With a carefully stated ABI, abstractions could work with many different user communities.
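One hedged sketch of what such an ABI wrapper could look like, assuming a "two file names in, a score on stdout" convention for Unix-program functions; this convention and the class name are illustrative, not the project's actual ABI.

    # Wrap one form of F (a Unix program) behind a single call signature that
    # an abstraction engine can invoke. The convention is an assumption.
    import subprocess

    class UnixProgramFunction:
        """F as a Unix program: F.exe fileA fileB -> similarity score on stdout."""
        def __init__(self, path):
            self.path = path

        def __call__(self, file_a, file_b):
            result = subprocess.run([self.path, file_a, file_b],
                                    capture_output=True, text=True, check=True)
            return float(result.stdout.strip())

    # Usage: F = UnixProgramFunction("./F.exe"); score = F("a.jpg", "b.jpg")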

What is the type system?
Files have an obvious technical type: JPG, BMP, TXT, PTS, …
But they also have a logical type: Face, Iris, Fingerprint, etc. (This comes out of the BXGrid repository.)
The meta-language can easily perform automatic conversions between technical types, and between some logical types:
- JPG/Face -> BMP/Face via ImageMagick
- JPG/Iris -> BIN/IrisCode via ComputeIrisCode
- JPG/Face -> JPG/Iris is not allowed.
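A small sketch of how such a two-level type check might be expressed; the tool names come from the slide, but the registry structure is illustrative.

    # A conversion is permitted only if a rule is registered for that exact
    # (technical, logical) pair; everything else is rejected.
    CONVERSIONS = {
        (("JPG", "Face"), ("BMP", "Face")):     "ImageMagick",
        (("JPG", "Iris"), ("BIN", "IrisCode")): "ComputeIrisCode",
    }

    def conversion_tool(src, dst):
        tool = CONVERSIONS.get((src, dst))
        if tool is None:
            raise TypeError(f"no conversion from {src} to {dst}")
        return tool

    conversion_tool(("JPG", "Face"), ("BMP", "Face"))     # allowed, via ImageMagick
    # conversion_tool(("JPG", "Face"), ("JPG", "Iris"))   # raises TypeError: not allowed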

Abstractions Redux
Mapping general-purpose programs to arbitrary distributed/multicore systems is algorithmically complex and full of pitfalls.
But mapping a single abstraction is a tractable problem that can be optimized, modeled, and re-used.
Can we combine multiple abstractions together to achieve both expressive power and tractable performance?

Acknowledgments
Cooperative Computing Lab
Grad students: Chris Moretti, Hoang Bui, Karsten Steinhaeuser, Li Yu, Michael Albrecht
Undergrads: Mike Kelly, Rory Carmichael, Mark Pasquier, Christopher Lyon, Jared Bulosan
Faculty: Patrick Flynn, Nitesh Chawla, Kenneth Judd, Scott Emrich
NSF Grants CCF , CNS