1
Scala Parallel Collections
Aleksandar Prokopec, Tiark Rompf
Scala Team, EPFL
2
Introduction
multi-core programming – not straightforward
need better higher-order abstractions
libraries and tools have only begun using these new capabilities
collections – everywhere
3
Goals
efficient parallel implementations of most collection methods
find the common abstractions needed to implement them
retain consistency with the existing collection framework
smoothly integrate new methods into the existing framework
4
Scala Collection Framework
most operations implemented in terms of an abstract method:

def foreach[U](f: T => U): Unit

new collections are created using builders:

trait Builder[Elem, To]
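As a sketch of this pattern, here is a toy builder and a transformer method written purely against foreach plus a builder. SimpleBuilder and collectEven are illustrative names, not the framework's actual classes:

```scala
// Toy builder: accumulates elements and produces a List (illustrative only).
class SimpleBuilder[Elem] {
  private var elems: List[Elem] = Nil
  def +=(x: Elem): this.type = { elems = x :: elems; this }
  def result(): List[Elem] = elems.reverse
}

object BuilderDemo {
  // A transformer written only in terms of traversal plus a builder,
  // mirroring how the framework implements methods like filter.
  def collectEven(xs: List[Int]): List[Int] = {
    val b = new SimpleBuilder[Int]
    xs.foreach(x => if (x % 2 == 0) b += x)
    b.result()
  }

  def main(args: Array[String]): Unit =
    println(collectEven(List(1, 2, 3, 4, 5, 6))) // List(2, 4, 6)
}
```

Because every transformer funnels through foreach and a builder, parallelizing those two pieces is the natural starting point.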
5
Example
the filter method:

def filter(p: A => Boolean): Repr = {
  val b = newBuilder
  for (x <- this) if (p(x)) b += x
  b.result
}

List(1, 2, 3, 4, 5, 6, 7).filter(_ % 2 == 0)
[figure: the elements 1–7 flow through the predicate into the Builder, producing 2, 4, 6]
6
Parallel operations
parallel traversal should be easy for some data structures
could filter be parallelized by having a concurrent builder?
3 problems:
– order may not be preserved anymore – sequences?
– performance concerns
– there are more complicated methods such as span
7
Method span
assume an array (keep it simple)

array.span(_ >= 0)

splits the array into prefixElems (the longest prefix of elements satisfying the predicate) and suffixElems (everything from the first counterexample on)
[figure: an array with negative elements, partitioned into prefixElems and suffixElems]
um... not a good idea
8
Method reduce
span seems inherently sequential
we'll get back to this – let's try something simpler: reduce

def reduce[U >: T](op: (U, U) => U): U

takes an associative operator and applies it between all the elements (examples: addition, concatenation)
9
Method reduce
assume the associative operator is concatenation

val s = "Tell your friends and family to use Scala."
s.split(" ").toArray.reduce(_ + _)

[figure: the words are concatenated pairwise with + until a single string remains: "TellyourfriendsandfamilytouseScala."]
10
Method reduce
we might have more processors
this is a well-known pattern from parallel programming
but we need the right abstraction
[figure: binary reduction tree over 1–8: pairs sum to 3, 7, 11, 15; then 10, 26; then 36]
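The tree-shaped reduction in the figure can be sketched as a divide-and-conquer recursion; treeReduce below is a sequential stand-in for what the parallel runtime does across processors (the name is illustrative):

```scala
object TreeReduce {
  // Divide-and-conquer reduce: split the index range in half, reduce each
  // half, then combine the two results with op. Run sequentially here;
  // a parallel runtime would evaluate the two halves on different
  // processors, which is sound only because op is associative.
  def treeReduce[T](xs: Array[T])(op: (T, T) => T): T = {
    require(xs.nonEmpty, "reduce of empty collection")
    def go(lo: Int, hi: Int): T =
      if (hi - lo == 1) xs(lo)
      else {
        val mid = (lo + hi) / 2
        op(go(lo, mid), go(mid, hi))
      }
    go(0, xs.length)
  }

  def main(args: Array[String]): Unit =
    println(treeReduce(Array(1, 2, 3, 4, 5, 6, 7, 8))(_ + _)) // 36
}
```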
11
Method split
we can implement methods such as reduce, foreach, count, find and forall, assuming we can divide the collection
new abstract operation:

def split: Seq[Repr]

returns a non-trivial partition of the collection
12
Method split
def split: Seq[Repr]
how to implement?
– copy elements
– produce a wrapper
– use data structure properties (e.g. tree)
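The wrapper approach might look like the sketch below: splitting produces views over index ranges, so no elements are copied. ArraySlice is an illustrative name, not the framework's class:

```scala
// Sketch: splitting an array by producing wrappers over index ranges,
// so no elements are copied on split.
final case class ArraySlice[T](arr: Array[T], from: Int, until: Int) {
  def length: Int = until - from
  def split: Seq[ArraySlice[T]] = {
    val mid = from + length / 2
    Seq(copy(until = mid), copy(from = mid))
  }
  def elements: Seq[T] = arr.slice(from, until).toSeq
}

object SplitDemo {
  def main(args: Array[String]): Unit = {
    val parts = ArraySlice(Array(1, 2, 3, 4, 5, 6), 0, 6).split
    // Two non-trivial halves sharing the underlying array:
    parts.foreach(p => println(p.elements)) // (1, 2, 3) then (4, 5, 6)
  }
}
```

Each half can itself be split again, which is what lets the runtime keep subdividing until it has enough tasks for all processors.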
13
Method filter
the split abstraction suffices for accessor methods
for transformer methods such as filter it is not sufficient – the partial results must be merged
[figure: partitions 1, 2, 3, 4 and 5, 6, 7, 8 filter to 2, 4 and 6, 8, merged into 2, 4, 6, 8; partitions 3, 1, 8, 0 and 2, 2, 1, 9 filter to 8, 0 and 2, 2, merged into 8, 0, 2, 2; the final result is 2, 4, 6, 8, 8, 0, 2, 2]
14
Method combine
we need another abstraction:

def combine[Other >: Repr](that: Other): Other

creates a collection that contains all the elements of this collection and that collection
15
Method combine
def combine[Other >: Repr](that: Other): Other
how to implement?
– copy elements
– use lazy evaluation (elements end up copied twice)
– use specialized data structures
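The first option, combining by copying, is the simplest to sketch. The code below is illustrative only; real implementations may defer the copy (lazy evaluation) or use data structures that merge cheaply:

```scala
object CombineDemo {
  // Eager combine: allocate a fresh array and copy both operands into it.
  def combine(a: Array[Int], b: Array[Int]): Array[Int] = {
    val out = new Array[Int](a.length + b.length)
    System.arraycopy(a, 0, out, 0, a.length)
    System.arraycopy(b, 0, out, a.length, b.length)
    out
  }

  def main(args: Array[String]): Unit =
    println(combine(Array(2, 4), Array(6, 8)).toList) // List(2, 4, 6, 8)
}
```

The drawback is clear: every merge on the way up the task tree pays a full copy, which is why the lazy and specialized-structure alternatives matter.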
16
Lazy collection evaluation
merge occurs more than once
each processor adds results to its own builder
evaluation occurs in the root
[figure: partitions 1, 2, 3, 4 / 5, 6, 7, 8 and 3, 1, 8, 0 / 2, 2, 1, 9 filter to 2, 4 / 6, 8 and 8, 0 / 2, 2; the intermediate merges are cheap, and at the root a single array is allocated and 2, 4, 6, 8, 8, 0, 2, 2 is copied into it]
17
Lazy collection evaluation advantages: – easier to apply to existing collections – for certain data structures copying is cheap (arrays) – merging is very cheap disadvantages: – copying occurs twice – affects cheap operations – garbage collection occurs more often
18
Specialized data structures
some data structures can be merged efficiently (trees, heaps, skip lists…)
immutable vectors – immutable sequences with efficient splitting and concatenation
19
Method span
each processor keeps 2 builders
merge has 2 cases:
– a counterexample occurs in the left partition
– no counterexample in the left partition
[figure: an array split among processors; the first counterexample (a negative element) determines where the prefix ends]
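The two merge cases can be sketched as follows, treating each processor's partial result as a (prefix, suffix) pair. mergeSpan is an illustrative helper, not the library's API:

```scala
object SpanMerge {
  // Merge two adjacent partial span results; each is (prefix satisfying
  // the predicate, rest).
  // Case 1: the left partition had no counterexample (empty suffix), so
  //   the prefix continues into the right partition's prefix.
  // Case 2: the left partition had a counterexample, so everything in
  //   the right partition belongs to the suffix.
  def mergeSpan[T](left: (List[T], List[T]),
                   right: (List[T], List[T])): (List[T], List[T]) =
    if (left._2.isEmpty) (left._1 ++ right._1, right._2)
    else (left._1, left._2 ++ right._1 ++ right._2)

  def main(args: Array[String]): Unit = {
    val xs = List(3, 9, 2, 4, -5, 2, 4, 7)
    val (l, r) = xs.splitAt(4)
    // Merging the two partial spans gives the same answer as the
    // sequential span:
    println(mergeSpan(l.span(_ >= 0), r.span(_ >= 0)))
    println(xs.span(_ >= 0))
  }
}
```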
20
Method find
some methods don't always traverse the entire collection

Array(1, 4, 9, 16, 9, 4, 1, 0).find(_ > 10)

returns Some(16)
in a parallel implementation, other processors should be informed that an element was found
21
Signalling trait
inherited by all parallel collections
allows processors to send signals
contains an abort flag which is periodically checked
– implemented as a volatile field
[figure: processors scanning 1, 4, 9, 16, 9, 4, 1, 0 communicate through the Signalling trait once Some(16) is found]
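A minimal sketch of the abort-flag idea, assuming a volatile boolean that worker tasks poll as they scan their partitions (AbortFlag and findFirst are illustrative names):

```scala
// A volatile boolean that workers poll periodically; once any worker
// finds a result it flips the flag and the others can stop early.
class AbortFlag {
  @volatile private var aborted = false
  def abort(): Unit = aborted = true
  def isAborted: Boolean = aborted
}

object FindDemo {
  // A find-style scan over one partition that checks the flag as it goes.
  def findFirst(xs: Array[Int], p: Int => Boolean, flag: AbortFlag): Option[Int] = {
    var i = 0
    while (i < xs.length && !flag.isAborted) {
      if (p(xs(i))) { flag.abort(); return Some(xs(i)) }
      i += 1
    }
    None
  }

  def main(args: Array[String]): Unit = {
    val flag = new AbortFlag
    println(findFirst(Array(1, 4, 9, 16, 9, 4, 1, 0), _ > 10, flag)) // Some(16)
    println(flag.isAborted) // true
  }
}
```

A volatile field is enough here because the flag only ever moves from false to true; no compare-and-swap is needed for a one-way signal.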
22
Signalling trait
the abort flag can be used to signal other processors that they should stop
it can be used for find, exists, forall, sameElements, …
what about takeWhile?

array.takeWhile(_ < 100)

[figure: the array of squares 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256]
23
Signalling trait
need to convey information about where the element has been found
atomic index flag using compare-and-swap
changes are monotonic!
[figure: the squares array again; a processor finding that 100 fails the predicate lowers the index flag from MAX to 9]
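A sketch of such a monotonic index flag, built on compare-and-swap (IndexFlag and setIfSmaller are illustrative names):

```scala
import java.util.concurrent.atomic.AtomicInteger

// Workers publish the smallest index at which a counterexample was seen.
// setIfSmaller only ever decreases the stored value, so updates are
// monotonic and losing a CAS race is harmless: the retry loop simply
// re-reads and tries again, or gives up if a smaller index is already in.
class IndexFlag {
  private val idx = new AtomicInteger(Int.MaxValue)
  def setIfSmaller(i: Int): Unit = {
    var cur = idx.get
    while (i < cur && !idx.compareAndSet(cur, i)) cur = idx.get
  }
  def value: Int = idx.get
}

object IndexFlagDemo {
  def main(args: Array[String]): Unit = {
    val flag = new IndexFlag
    flag.setIfSmaller(9)
    flag.setIfSmaller(12) // ignored: not smaller than 9
    flag.setIfSmaller(4)
    println(flag.value) // 4
  }
}
```

For takeWhile, a worker whose partition starts beyond the flagged index can stop immediately, since its elements cannot be part of the result.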
24
Load balancing
processor availability and data processing cost may not be uniform
fine-grained division – more tasks than processors
[figure: a long array (…, 750, 751, 752, 753, 754, 755) divided into many small tasks; a processor that finishes early reports Done!]
25
Work-stealing
need to schedule tasks to processors – work stealing
each processor has a task queue
when it runs out of tasks – it steals from other queues
[figure: proc 1 and proc 2 each with a task queue; one steals a task from the other – steal!]
26
Adaptive work-stealing
still, a large number of tasks can lead to overhead
solution: adaptive partitioning
27
Adaptive work-stealing
ensures better load balancing
[figure: proc 1 and proc 2 with adaptively sized tasks; the idle processor steals – steal!]
28
Package hierarchy
subpackage of the collection package

collection
├── mutable
├── immutable
└── parallel
    ├── mutable
    └── immutable
29
Class hierarchy
consistent with existing collections
clients can refer to parallel collections transparently

Iterable
├── Map
├── Seq
├── Set
└── ParallelIterable
    ├── ParallelMap
    ├── ParallelSeq
    └── ParallelSet
30
How to use
be aware of side-effects:

var k = 0
array.foreach(k += _)

parallel collections are not concurrent collections
careful with small collections – the cost of setup may be higher
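To make the side-effect hazard concrete, here is the pattern above next to a side-effect-free alternative. The race only manifests when foreach runs on a parallel collection, but the rewrite is the point:

```scala
object SideEffectDemo {
  def main(args: Array[String]): Unit = {
    val array = Array(1, 2, 3, 4, 5)

    // Shared mutable state: fine sequentially, but a data race if this
    // were a parallel collection running foreach on many threads.
    var k = 0
    array.foreach(k += _)
    println(k) // 15

    // Safe alternative: an aggregating operation with an associative
    // operator and no shared state.
    println(array.reduce(_ + _)) // 15
  }
}
```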
31
How to use
parallel ranges – a way to parallelize for-loops

for (i <- (0 until 1000).par) yield {
  var num = i
  var lst: List[Int] = Nil
  while (num > 0) {
    lst ::= num % 2
    num = num / 2
  }
  lst
}
32
Benchmarks
microbenchmarks with low cost per-element operations (columns: number of processors)

foreach           1     2     4     6     8
Sequential     1227     –     –     –     –
ParallelArray  1180   797   529   449   421
Extra166       1195   757   544   442   403

reduce            1     2     4     6     8
Sequential      949     –     –     –     –
ParallelArray   832   551   375   328   297
Extra166        890   566   363   300   282
33
Benchmarks
microbenchmarks with low cost per-element operations (columns: number of processors)

filter            1     2     4     6     8
Sequential      611     –     –     –     –
ParallelArray   476   333   235   216   208
Extra166        581   372   296   280   264

find              1     2     4     6     8
Sequential     1181     –     –     –     –
ParallelArray   961   608   410   331   300
Extra166        841   602   393   309   294
34
Current state
arrays – ParallelArray
ranges – ParallelRange
views – ParallelView
working on – ParallelVector and ParallelHashMap
35
Conclusion
good performance results
nice integration with existing collections
more parallel collections being worked on
will be integrated into Scala 2.8.1