Piccolo – Paper Discussion
Big Data Reading Group, 9/20/2010
Motivation / Goals
- Rising demand for distributed computation: PageRank, k-means, n-body simulation
- Data-centric frameworks simplify programming
- Existing models (e.g. MapReduce) are insufficient: designed for large-scale data analysis rather than for in-memory computation
- Goals: make in-memory computations fast, enable asynchronous computation
Overview
- Global in-memory key-value tables for sharing state
- Concurrently running instances of kernel applications modify the global state
- Locality optimized (user-specified policies)
- Reduced synchronization (accumulation, global barriers)
- Checkpoint-based recovery
System Design
Table interface
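The original slide presents the table interface as a figure. As a rough stand-in, here is a minimal Python sketch of the operations the paper describes: get, put, and an accumulating update on a partitioned key-value table. The class and method names are illustrative assumptions, not the paper's actual (C++) API.

```python
# Hypothetical sketch of a Piccolo-style table; the real system
# exposes a similar interface in C++. Names are illustrative only.
class Table:
    def __init__(self, num_partitions, accumulate=None):
        self.partitions = [dict() for _ in range(num_partitions)]
        self.accumulate = accumulate  # e.g. lambda old, new: old + new

    def _partition(self, key):
        return hash(key) % len(self.partitions)

    def get(self, key):
        return self.partitions[self._partition(key)][key]

    def put(self, key, value):
        self.partitions[self._partition(key)][key] = value

    def update(self, key, delta):
        # Updates are merged through the accumulator, so concurrent
        # writers to the same key never conflict.
        p = self.partitions[self._partition(key)]
        p[key] = self.accumulate(p[key], delta) if key in p else delta
```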
Optimization
- Ensure locality: group kernels with partitions, group partitions; guarantee that a partition resides completely on a single machine
- Reduce synchronization: accumulation avoids write/write conflicts (example below); no pairwise kernel synchronization, global barriers are sufficient
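To make the accumulation point concrete, the following hedged example reuses the hypothetical Table sketched above: with a commutative combiner such as addition, two kernels can update the same key without any pairwise locking, because the order in which deltas are merged does not matter.

```python
# Two kernel instances updating the same counter key. With an
# additive accumulator the result is order-independent, so no
# write/write conflict handling is needed. (Run sequentially here
# for simplicity; the point is the merge, not the scheduling.)
counts = Table(num_partitions=4, accumulate=lambda old, new: old + new)

def kernel(n_updates):
    for _ in range(n_updates):
        counts.update("hits", 1)  # merged via the accumulator

kernel(1000)
kernel(1000)
assert counts.get("hits") == 2000  # same total in any interleaving
```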
Load balancing
- Assigning partitions: round robin, optimized for data location
- Work stealing: biggest task first (the master estimates task size from the number of keys in a partition); the master decides (see the sketch below)
- Restrictions: a running task cannot be killed, since it modifies shared state and restoring is very expensive; partitions need to be moved instead
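A sketch of the "biggest task first" heuristic mentioned above, assuming the master tracks an estimated key count per pending partition (all names here are hypothetical):

```python
# Hypothetical master-side work stealing: when a worker goes idle,
# hand it the largest not-yet-started partition ("biggest task
# first"); size is estimated from the number of keys. Running tasks
# are never killed, since they modify shared state.
def steal_task(pending_sizes):
    """pending_sizes: dict of partition id -> estimated key count."""
    if not pending_sizes:
        return None
    biggest = max(pending_sizes, key=pending_sizes.get)
    del pending_sizes[biggest]
    return biggest

# Example: partition 7 (90k keys) is assigned before partition 3.
assert steal_task({3: 10_000, 7: 90_000}) == 7
```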
Table migration
Migrating a table from worker w_a to worker w_b:
1. The master sends message M1 to all workers.
2. All workers flush pending updates to w_a and send all new requests to w_b; w_b buffers these requests.
3. w_a sends its paused state to w_b.
4. Once all workers acknowledge phase 1, the master sends M2 to w_a and w_b.
5. w_a flushes the remainder to w_b and leaves the "paused" state; w_b first works through the buffered requests, then resumes normal operation.
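The handoff can be summarized in code. This is a deliberately simplified, self-contained simulation of the two phases; the message names M1 and M2 follow the slide, while the Worker fields are invented stand-ins for the real machinery:

```python
# Minimal simulation of the two-phase handoff described above.
class Worker:
    def __init__(self, name):
        self.name = name
        self.target = None       # where this worker sends requests
        self.buffer = []         # requests buffered during migration
        self.state = {}          # table state held by this worker

def migrate(workers, w_a, w_b):
    # Phase 1 (M1 from the master): flush to w_a, redirect to w_b.
    for w in workers:
        w.target = w_b           # new requests now go to w_b
    w_b.buffer = []              # w_b buffers them until phase 2
    w_b.state = dict(w_a.state)  # w_a sends its paused state to w_b
    # Phase 2 (M2, after all workers acknowledged phase 1):
    w_a.state.clear()            # w_a flushes the rest, leaves "paused"
    for req in w_b.buffer:       # w_b works the buffered requests
        w_b.state.update(req)    # first, then resumes normal operation
```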
Fault tolerance
- User-assisted checkpoint / restore using Chandy-Lamport snapshots
- Asynchronous computation: periodic checkpoints; synchronous computation: checkpoints at barriers
- Problem: when to start the barrier checkpoint; the replay log might get very long, or the checkpoint might not use enough free CPU time before the barrier
- Solution: start when the first worker has finished all of its jobs (see the sketch below)
- No checkpoint during table migration, and vice versa
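A sketch of the optimized trigger, under the assumption that the master can observe per-worker completion and whether a migration is in flight (both names are hypothetical):

```python
# Hypothetical trigger for the optimized barrier checkpoint: start
# the Chandy-Lamport style snapshot as soon as the first worker has
# finished, so checkpointing overlaps the tail of the iteration and
# the replay log stays short.
def should_start_checkpoint(worker_done, migration_in_progress):
    """worker_done: list of booleans, one per worker."""
    if migration_in_progress:
        return False             # no checkpoint during table migration
    return any(worker_done)      # first finished worker triggers it

# Example: worker 1 is done, workers 0 and 2 are still computing.
assert should_start_checkpoint([False, True, False], False)
```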
Applications
- PageRank, k-means, n-body, matrix multiplication: parallel, iterative computations with local reads + local/remote writes, or local/remote reads + local writes; these can also be implemented as multiple MapReduce jobs (see the PageRank sketch below)
- Distributed web crawler: relies on idempotent operations, cannot be realized in MapReduce
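As a concrete example of the programming model, here is a hedged sketch of one PageRank iteration in the Piccolo style, reusing the hypothetical Table from above; the paper's own example is written against its C++ API.

```python
# One PageRank iteration as a Piccolo-style kernel: local reads of
# the graph partition and current ranks, accumulating (additive)
# writes to the next rank table, a global barrier between iterations.
DAMPING = 0.85

def pagerank_kernel(graph_partition, curr, nxt):
    for page, out_links in graph_partition.items():
        share = DAMPING * curr.get(page) / len(out_links)
        for target in out_links:
            nxt.update(target, share)  # accumulated with +, no conflicts
    # a global barrier would follow here; the (1 - DAMPING) base rank
    # is added when the tables are swapped for the next iteration
```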
Scaling
[Figure: scaling results, one panel for fixed input size and one for scaled input size]
Comparison with Hadoop / MPI
- PageRank and k-means (vs. Hadoop): Piccolo is 4x and 11x faster; for PageRank, Hadoop spends 50% of its time sorting and joining data streams and 15% on (de)serialization, plus reading and writing HDFS
- Matrix multiplication (vs. MPI): Piccolo is 10% faster; MPI repeatedly waits for the slowest node
Work stealing / slow worker / checkpoints
- Work stealing / slow worker: PageRank has skewed partitions; one slow worker (50% CPU)
- Checkpoints: naïve, start after all workers have finished; optimized, start after the first worker has finished
Checkpoint limits / scalability
- Hypothetical data center, typical machine uptime of 1 year
- Worst-case scenario; possibly optimistic (it looked different on some older slides)
Distributed Crawler
- 32 machines saturate 100 Mbps
- There are single servers that achieve this, but Piccolo would scale higher
Summary
- Piccolo provides an easy-to-use distributed shared memory model
- It imposes many restrictions: simple interface, reduced synchronization, relaxed consistency, accumulation, locality
- But it performs well on iterative computations, saving disk round trips compared to MapReduce
- A specialized tool for data-intensive in-memory computing