Scala Parallel Collections Aleksandar Prokopec EPFL.

Presentation transcript:

Scala collections

    for {
      s <- surnames
      n <- names
      if s endsWith n
    } yield (n, s)

Example match: McDonald. Sequential running time: 1040 ms.

Scala parallel collections

    for {
      s <- surnames.par
      n <- names.par
      if s endsWith n
    } yield (n, s)

Running time: 575 ms on 2 cores, 305 ms on 4 cores.

for comprehensions: nested parallelized bulk operations

    surnames.par.flatMap { s =>
      names.par.filter(n => s endsWith n).map(n => (n, s))
    }
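The equivalence between the for comprehension and the chained bulk operations can be checked on small data. The following is a minimal sketch that runs sequentially (no .par) so it needs only the standard library; the sample surnames and names are hypothetical, since the slides do not list the real input.

```scala
object ForDesugar {
  // Hypothetical sample data; the slides do not show the actual input lists.
  val surnames = Vector("McDonald", "Johnson", "Smith")
  val names = Vector("Donald", "John", "son")

  // For-comprehension form of the query.
  val viaFor: Vector[(String, String)] =
    for { s <- surnames; n <- names if s.endsWith(n) } yield (n, s)

  // The same query written as explicit bulk operations.
  val viaOps: Vector[(String, String)] =
    surnames.flatMap { s =>
      names.filter(n => s.endsWith(n)).map(n => (n, s))
    }
}
```

Both values hold the same pairs in the same order, which is why adding .par to the generators is enough to parallelize every nested operation.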

Nested parallelism: parallel within parallel (composition)

    surnames.par.flatMap { s =>
      surnameToCollection(s) // may invoke parallel ops
    }

Nested parallelism: going recursive (recursive algorithms)

    def vowel(c: Char): Boolean = ...

    def gen(n: Int, acc: Seq[String]): Seq[String] =
      if (n == 0) acc
      else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield
        if (s.length == 0) s + c
        else if (vowel(s.last) && !vowel(c)) s + c
        else if (!vowel(s.last) && vowel(c)) s + c
        else s

    gen(5, Array(""))

Sequential running time: 1545 ms.

The parallel version changes only the types and the initial collection:

    def gen(n: Int, acc: ParSeq[String]): ParSeq[String] =
      if (n == 0) acc
      else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield
        if (s.length == 0) s + c
        else if (vowel(s.last) && !vowel(c)) s + c
        else if (!vowel(s.last) && vowel(c)) s + c
        else s

    gen(5, ParArray(""))

Running time: 1575 ms on 1 core, 809 ms on 2 cores, 530 ms on 4 cores.

So, I just use par and I’m home free?

How to think parallel

Character count: use case for foldLeft

    val txt: String = ...

    txt.foldLeft(0) {
      case (a, ' ') => a
      case (a, c) => a + 1
    }

foldLeft goes left to right, applying _ + 1 to each character of ABCDEF in turn, so it is not parallelizable. Going left to right is not really necessary, though: ABC and DEF can each be counted independently (3 each) and the two partial counts merged with _ + _ to give 6.

Character count in parallel

    txt.fold(0) {
      case (a, ' ') => a
      case (a, c) => a + 1
    }

fold is not applicable here. The operator has type (Int, Char) => Int, but merging two partial counts with _ + _ needs an operator of type (Int, Int) => Int, and fold takes only one operator.

Character count: use case for aggregate

    txt.aggregate(0)({
      case (a, ' ') => a
      case (a, c) => a + 1
    }, _ + _)

aggregate takes two operators. The first combines an aggregation with an element (here _ + 1 for every non-space character). The second combines two aggregations (here _ + _).
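To see why the two operators are enough, the sketch below simulates what aggregate does: it folds each chunk independently with the element operator and then merges the partial counts with the combining operator. This is an illustration on the plain standard library, not the library's implementation; the helper simulatedAggregate and its chunkSize parameter are hypothetical, and a real scheduler chooses the partitions itself.

```scala
object CharCount {
  // Aggregation-with-element operator: count a character unless it is a space.
  def seqop(a: Int, c: Char): Int = if (c == ' ') a else a + 1

  // Aggregation-with-aggregation operator: merge two partial counts.
  def combop(x: Int, y: Int): Int = x + y

  // Hypothetical helper: fold every chunk on its own, then combine the partial results.
  def simulatedAggregate(txt: String, chunkSize: Int): Int =
    txt.grouped(chunkSize).map(_.foldLeft(0)(seqop)).foldLeft(0)(combop)

  val txt = "ABC DEF"
  val sequential: Int = txt.foldLeft(0)(seqop)
  val chunked: Int = simulatedAggregate(txt, 3)
}
```

Because addition is associative and 0 is its neutral element, the chunked result agrees with the left-to-right fold for any chunk size.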

Word count: another use case for foldLeft

    txt.foldLeft((0, true)) {
      case ((wc, _), ' ') => (wc, true)
      case ((wc, true), x) => (wc + 1, false)
      case ((wc, false), x) => (wc, false)
    }

Example text: "Folding me softly."

Initial accumulation (0, true): 0 words so far, and the last character is treated as a space.
A space: the last seen character is a space.
A non-space after a space: a new word.
A non-space after a non-space: no new word.

Word count in parallel

P1 processes "Folding me " and gets wc = 2; rs = 1 (one space on the right). P2 processes "softly." and gets wc = 1; ls = 0 (no spaces on the left). Merged: wc = 3.

Word count must assume arbitrary partitions

P1 processes "Foldin" and gets wc = 1; rs = 0. P2 processes "g me softly." and gets wc = 3; ls = 0. One word spans the boundary, so the merged result is still wc = 3.

Word count: initial aggregation

    txt.par.aggregate((0, 0, 0))

The triple holds (# spaces on the left, # words, # spaces on the right). The initial value (0, 0, 0) stands for the empty string "".

Word count: combining two aggregations

    ...
    }, {
      case ((0, 0, 0), res) => res
      case (res, (0, 0, 0)) => res
      case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
      case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
    })

Cases 1 and 2: one side is empty, e.g. "" with "Folding me", or "softly." with "", so the other side is the result.
Case 3: no space on either side of the boundary, e.g. "Folding m" with "e softly.", so the word spanning the boundary was counted twice and 1 is subtracted.
Case 4: a space at the boundary, e.g. "Folding me" with " softly.", so the word counts simply add up.

Word count: combining an aggregation with an element

    txt.par.aggregate((0, 0, 0))({
      case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1)
      case ((ls, 0, _), c) => (ls, 1, 0)
      case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
      case ((ls, wc, 0), c) => (ls, wc, 0)
      case ((ls, wc, rs), c) => (ls, wc + 1, 0)
    ...

Case 1 (a space after only spaces): 0 words and a space, so add one more space on each side.
Case 2 (e.g. " m"): 0 words and a non-space, so one word and no spaces on the right side.
Case 3 (e.g. " me_"): nonzero words and a space, so one more space on the right side.
Case 4 (e.g. " me sof"): nonzero words, last character a non-space and current character a non-space, so no change.
Case 5 (e.g. " me s"): nonzero words, last character a space and current character a non-space, so one more word.

Word count in parallel (using parallel strings?)

    txt.par.aggregate((0, 0, 0))({
      case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1)
      case ((ls, 0, _), c) => (ls, 1, 0)
      case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
      case ((ls, wc, 0), c) => (ls, wc, 0)
      case ((ls, wc, rs), c) => (ls, wc + 1, 0)
    }, {
      case ((0, 0, 0), res) => res
      case (res, (0, 0, 0)) => res
      case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
      case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
    })
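The requirement that the operators work for arbitrary partitions can be tested without any parallelism. The sketch below copies the two operators from the slides into named functions and checks every two-way split of the example sentence. The names seqop, combop, count and splitCounts are introduced here for illustration, and the code avoids .par so that it runs on the plain standard library (recent Scala versions ship parallel collections as a separate module).

```scala
object WordCount {
  // (spaces on the left, words, spaces on the right)
  type Acc = (Int, Int, Int)

  // Aggregation-with-element operator, as on the slides.
  def seqop(acc: Acc, ch: Char): Acc = (acc, ch) match {
    case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
    case ((ls, 0, _), _)     => (ls, 1, 0)
    case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
    case ((ls, wc, 0), _)    => (ls, wc, 0)
    case ((ls, wc, _), _)    => (ls, wc + 1, 0)
  }

  // Aggregation-with-aggregation operator, as on the slides.
  def combop(l: Acc, r: Acc): Acc = (l, r) match {
    case ((0, 0, 0), res)               => res
    case (res, (0, 0, 0))               => res
    case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
    case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
  }

  // Sequential aggregation of one partition.
  def count(s: String): Acc = s.foldLeft((0, 0, 0))(seqop)

  // Word count obtained for every possible two-way split of txt.
  def splitCounts(txt: String): Seq[Int] =
    (0 to txt.length).map { i =>
      val (left, right) = txt.splitAt(i)
      combop(count(left), count(right))._2
    }
}
```

For "Folding me softly." every split point yields 3 words, including the split "Foldin" / "g me softly." that cuts a word in half.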

Word count: string not really parallelizable

    scala> (txt: String).par
    collection.parallel.ParSeq[Char] = ParArray(…)

The result has a different internal representation: it is a ParArray, so calling par copies the string contents into an array.

Conversions: going parallel

    // par is efficient - no copying
    mutable.{Array, ArrayBuffer, ArraySeq}
    mutable.{HashMap, HashSet}
    immutable.{Vector, Range}
    immutable.{HashMap, HashSet}

Most other collections construct a new parallel collection!

    sequential                      parallel
    Array, ArrayBuffer, ArraySeq    mutable.ParArray
    mutable.HashMap                 mutable.ParHashMap
    mutable.HashSet                 mutable.ParHashSet
    immutable.Vector                immutable.ParVector
    immutable.Range                 immutable.ParRange
    immutable.HashMap               immutable.ParHashMap
    immutable.HashSet               immutable.ParHashSet

Custom collections

    class ParString(val str: String)
    extends parallel.immutable.ParSeq[Char] {
      def apply(i: Int) = str.charAt(i)
      def length = str.length
      def seq = new WrappedString(str)
      // required: def splitter: Splitter[Char]
      def splitter = new ParStringSplitter(0, str.length)
      ...

Custom collection: splitter definition

    class ParStringSplitter(var i: Int, len: Int)
    extends Splitter[Char] {

Splitters are iterators:

      def hasNext = i < len
      def next = {
        val r = str.charAt(i)
        i += 1
        r
      }

Splitters must be duplicated:

      def dup = new ParStringSplitter(i, len)

Splitters know how many elements remain:

      def remaining = len - i

Splitters can be split:

      def psplit(sizes: Int*): Seq[ParStringSplitter] = {
        val splitted = new ArrayBuffer[ParStringSplitter]
        for (sz <- sizes) {
          val next = (i + sz) min len
          splitted += new ParStringSplitter(i, next)
          i = next
        }
        splitted
      }
    }
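The splitter contract (iterate, duplicate, report the remaining size, split into given sizes) can be tried out in isolation. The class below is a hypothetical, simplified stand-in that extends only Iterator[Char]; it is not the library's Splitter trait, and it takes the string as a constructor argument so that the example is self-contained.

```scala
import scala.collection.mutable.ArrayBuffer

object SplitterSketch {
  // Hypothetical stand-in for a splitter over the index range [i, until) of str.
  final class StringSplitter(str: String, private var i: Int, until: Int)
      extends Iterator[Char] {
    def hasNext: Boolean = i < until
    def next(): Char = { val r = str.charAt(i); i += 1; r }

    // Number of elements this splitter has not yet traversed.
    def remaining: Int = until - i

    // Independent copy positioned at the same index.
    def dup: StringSplitter = new StringSplitter(str, i, until)

    // Cut the remaining range into consecutive pieces of the requested sizes.
    def psplit(sizes: Int*): Seq[StringSplitter] = {
      val parts = new ArrayBuffer[StringSplitter]
      for (sz <- sizes) {
        val end = math.min(i + sz, until)
        parts += new StringSplitter(str, i, end)
        i = end
      }
      parts.toSeq
    }
  }

  // Split "Folding me" into pieces of 4 and 6 characters and read each piece back.
  val parts: Seq[String] =
    new StringSplitter("Folding me", 0, 10).psplit(4, 6).map(_.mkString)
}
```

Each piece can then be traversed by a different processor, which is what makes the aggregate operators above applicable to a string without copying it.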

Word count: now with parallel strings

    new ParString(txt).aggregate((0, 0, 0))({
      case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1)
      case ((ls, 0, _), c) => (ls, 1, 0)
      case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
      case ((ls, wc, 0), c) => (ls, wc, 0)
      case ((ls, wc, rs), c) => (ls, wc + 1, 0)
    }, {
      case ((0, 0, 0), res) => res
      case (res, (0, 0, 0)) => res
      case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
      case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
    })

Word count: performance

The sequential foldLeft version takes 100 ms. The ParString aggregate version takes 137 ms, 70 ms and 35 ms as the number of cores grows.

Hierarchy

    general:     GenTraversable, GenIterable, GenSeq
    sequential:  Traversable, Iterable, Seq
    parallel:    ParIterable, ParSeq

Hierarchy

    def nonEmpty(sq: Seq[String]) = {
      val res = new mutable.ArrayBuffer[String]()
      for (s <- sq) {
        if (s.nonEmpty) res += s
      }
      res
    }

If the parameter type is changed to ParSeq[String], the for loop runs in parallel, and the loop body has side effects: ArrayBuffer is not synchronized! This is also why ParSeq is not a subtype of Seq. A version that accepts both sequential and parallel sequences takes a GenSeq and synchronizes the update:

    def nonEmpty(sq: GenSeq[String]) = {
      val res = new mutable.ArrayBuffer[String]()
      for (s <- sq) {
        if (s.nonEmpty) res.synchronized { res += s }
      }
      res
    }
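The effect of the synchronized block can be reproduced with plain threads. The sketch below is an illustration under assumptions: the function nonEmptyConcurrent and the idea of handing one chunk to each thread are hypothetical and stand in for what a parallel for loop would do internally.

```scala
import scala.collection.mutable

object SyncAppend {
  // Hypothetical illustration: one thread per chunk appends to a shared buffer.
  def nonEmptyConcurrent(chunks: Seq[Seq[String]]): Seq[String] = {
    val res = new mutable.ArrayBuffer[String]()
    val threads = chunks.map { chunk =>
      new Thread(new Runnable {
        def run(): Unit =
          // The lock makes each append atomic; without it the buffer could be corrupted.
          for (s <- chunk) if (s.nonEmpty) res.synchronized { res += s }
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    res.toSeq
  }
}
```

The result contains every non-empty string exactly once, but the order depends on thread scheduling, so callers that need a stable order have to sort the result or avoid shared mutable state altogether.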

Thank you! Examples at: git://github.com/axel22/sd.git

Accessors vs. transformers: some methods need more than just splitters

Accessors: foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, …

Transformers: map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, …

Transformers return collections! Sequential collections build them with builders; parallel collections build them with combiners.

Builders: building a sequential collection

A ListBuilder starts from Nil, receives the elements 2, 4, 6 through +=, and hands back the finished list through result.

Combiners: building parallel collections

    trait Combiner[-Elem, +To] extends Builder[Elem, To] {
      def combine[N <: Elem, NewTo >: To]
        (other: Combiner[N, NewTo]): Combiner[N, NewTo]
    }

A combiner should either use an efficient merge operation or do lazy evaluation.

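Parallel arrays

[Figure: the chunks 1, 2, 3, 4 and 5, 6, 7, 8 are filtered to 2, 4 and 6, 8; the chunks 3, 1, 8, 0 and 2, 2, 1, 9 are filtered to 8, 0 and 2, 2. Labeled steps: merge, allocate, copy.]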
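The merge, allocate and copy steps can be sketched with a combiner that postpones building the array until result is called. The class ChunkCombiner below is hypothetical and much simpler than the library's combiners: combine only concatenates the lists of chunks, and the final array is allocated once and filled chunk by chunk.

```scala
import scala.collection.mutable.ArrayBuffer

object CombinerSketch {
  // Hypothetical simplified combiner; not the library's Combiner trait.
  final class ChunkCombiner(private val chunks: Vector[ArrayBuffer[Int]]) {
    // Append an element to the last chunk.
    def +=(x: Int): this.type = { chunks.last += x; this }

    // Merge: concatenate the chunk lists; no element is copied at this point.
    def combine(other: ChunkCombiner): ChunkCombiner =
      new ChunkCombiner(chunks ++ other.chunks)

    // Allocate the result array once, then copy every chunk into its place.
    def result(): Array[Int] = {
      val out = new Array[Int](chunks.map(_.size).sum)
      var offset = 0
      for (c <- chunks) { c.copyToArray(out, offset); offset += c.size }
      out
    }
  }

  def newCombiner(): ChunkCombiner = new ChunkCombiner(Vector(new ArrayBuffer[Int]))

  // What one processor would do for its part of a filter: keep the even numbers.
  def filterEven(xs: Seq[Int]): ChunkCombiner = {
    val c = newCombiner()
    for (x <- xs if x % 2 == 0) c += x
    c
  }

  val merged: Array[Int] =
    filterEven(Seq(1, 2, 3, 4)).combine(filterEven(Seq(5, 6, 7, 8))).result()
}
```

In this sketch the copy step runs sequentially; since every chunk has a known offset in the result, the copies are independent of each other and could be done in parallel.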

Parallel hash tables

A transformer such as filter, called on a ParHashMap, fills several ParHashCombiners. How to merge them? Buckets! Each ParHashCombiner keeps its elements grouped into buckets, so combine joins the corresponding buckets with no copying, and the combined buckets are then used to build the resulting ParHashMap.