Eclipse Collections: An Inside Look (March 2016). Copyright © 2016 Goldman Sachs. All rights reserved.



Agenda
- What is Eclipse Collections?
- Eclipse Collections vs others
- Eclipse Collections by examples
- UnifiedMap: The memory saver
- UnifiedSet: It is a Set not a Map

What is Eclipse Collections?

What is Eclipse Collections
- Eclipse Collections
  - Feature-rich Java Collections framework
- Eclipse Foundation
  - Eclipse Collections released under EDL and EPL
- Open for contributions
- Eclipse Collections Kata
  - Tutorial developed to learn the framework

Eclipse Collections vs Others

Eclipse Collections vs Others
Columns: Features | EC 7.0.0 | Java 8 | Guava | Trove | Scala
- Interfaces: EC 7.0.0: Readable, Mutable, Immutable, FixedSize, Lazy | Java 8: Mutable, Stream | Guava: Mutable, Fluent | Trove: Mutable | Scala: Readable, Mutable, Immutable, Lazy
- Iteration Styles: EC 7.0.0: Eager/Lazy, Serial/Parallel | Java 8: Lazy, Serial/Parallel | Guava: Lazy, Serial | Trove: Eager, Serial | Scala: Eager/Lazy, Serial/Parallel (Lazy Only)
- Remaining feature rows (the per-library check marks were lost in extraction; only the annotations survive):
  - Rich API
  - Optimized Set & Map: (+Bag)
  - Immutable Collections
  - Primitive Collections: (+Bag, +Immutable)
  - Multimaps: (+Bag, +SortedBag), (+Linked), (Multimap trait)
  - Bags (Multisets)
  - BiMaps

Eclipse Collections vs Java 8 Streams
Eclipse Collections
- Eager & Lazy, Serial & Parallel
- Memory efficient containers
- Primitive containers (all 8)
- Immutable containers
- More container types
- More iteration patterns
- “With” method patterns
- “target” method patterns
- Covariant return types
- Java 5+ compatible
Java 8 Streams
- Functional APIs
- Lazy only
- Single use
- Serial & Parallel
- Primitive streams (3 types)
- Extensible Collectors
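
The Java 8 Streams traits listed on this slide (lazy only, single use, three primitive stream types) can be observed with the JDK alone. The following is a small sketch, not taken from the slides; the class name `StreamTraitsDemo` and its method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class StreamTraitsDemo {
    // Lazy: the filter lambda runs only once a terminal operation pulls elements.
    // Returns the number of filter calls, or -1 if any call happened too early.
    static int countFilterCalls(List<Integer> values) {
        int[] calls = {0};
        Stream<Integer> pipeline = values.stream().filter(v -> {
            calls[0]++;
            return v % 2 == 0;
        });
        int beforeTerminal = calls[0];         // nothing has been evaluated yet
        pipeline.collect(Collectors.toList()); // terminal operation triggers evaluation
        return beforeTerminal == 0 ? calls[0] : -1;
    }

    // Single use: a second terminal operation on the same stream throws.
    static boolean secondUseFails(List<Integer> values) {
        Stream<Integer> stream = values.stream();
        stream.count();
        try {
            stream.count();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // Primitive streams: the JDK ships IntStream, LongStream and DoubleStream.
    static int sumOfSquares(int upToInclusive) {
        return IntStream.rangeClosed(1, upToInclusive).map(i -> i * i).sum();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4);
        System.out.println("filter calls: " + countFilterCalls(values));
        System.out.println("second use fails: " + secondUseFails(values));
        System.out.println("sum of squares: " + sumOfSquares(3));
    }
}
```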

Eclipse Collections by Example

UnifiedMap: The memory saver

HashMap
- JDK HashMap is backed by an array of Entry objects (Node in Java 8)
- Each Entry holds key, value, next and hash
    transient Node<K,V>[] table
    Node<K,V> implements Map.Entry<K,V>

UnifiedMap
- EC UnifiedMap is backed by an array
    protected transient Object[] table
- Keys and values sit in consecutive slots, which improves cache locality

UnifiedMap: The Unification
    /**
     * Entry objects are not stored in the table
     * like in java.util.HashMap. Instead of trying
     * to deal with collisions in the main array
     * using Entry objects, we put a special object
     * in the key slot and put a regular Object[]
     * in the value slot. The array contains the
     * key value pairs in consecutive slots, just
     * like the main array, but it's a linear list
     * with no hashing.
     * The final result is a Map implementation
     * that's leaner than java.util.HashMap and
     * faster than Trove's THashMap. The best of
     * both approaches unified together, and thus
     * the name UnifiedMap.
     */

UnifiedMap
    public V get(Object key) {
        // Keys sit at even slots; the matching value is in the next slot.
        int index = this.index(key);
        Object cur = this.table[index];
        if (cur != null) {
            Object val = this.table[index + 1];
            // CHAINED_KEY marks a collision: the value slot holds an Object[] of key/value pairs.
            if (cur == CHAINED_KEY) {
                return this.getFromChain((Object[]) val, (K) key);
            }
            if (this.nonNullTableObjectEquals(cur, (K) key)) {
                return (V) val;
            }
        }
        return null;
    }
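
The layout described on the last three slides can be sketched in plain Java. This is a deliberately simplified illustration and not the Eclipse Collections source: `ToyUnifiedMap` is a made-up name, the table never resizes, removal is omitted, and null keys are not supported. Only the idea comes from the slides: keys and values in consecutive `Object[]` slots, with a sentinel in the key slot and a linear `Object[]` chain in the value slot on collision.

```java
import java.util.Arrays;

final class ToyUnifiedMap<K, V> {
    // Sentinel stored in a key slot when several keys hash to the same index.
    private static final Object CHAINED_KEY = new Object();

    private final Object[] table; // key at even index, value at the next odd index
    private final int mask;

    ToyUnifiedMap(int powerOfTwoCapacity) {
        if (Integer.bitCount(powerOfTwoCapacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        this.table = new Object[powerOfTwoCapacity * 2];
        this.mask = powerOfTwoCapacity - 1;
    }

    private int index(Object key) {
        int h = key.hashCode();
        h ^= h >>> 16;           // simple bit spreading (an assumption of this sketch)
        return (h & mask) << 1;  // always even: the key slot
    }

    @SuppressWarnings("unchecked")
    V put(K key, V value) {
        int i = index(key);
        Object cur = table[i];
        if (cur == null) {                       // empty slot pair
            table[i] = key;
            table[i + 1] = value;
            return null;
        }
        if (cur == CHAINED_KEY) {                // existing chain
            return putInChain(i, (Object[]) table[i + 1], key, value);
        }
        if (cur.equals(key)) {                   // same key: replace value
            V old = (V) table[i + 1];
            table[i + 1] = value;
            return old;
        }
        // First collision: move the old pair and the new pair into a chain.
        table[i + 1] = new Object[] {cur, table[i + 1], key, value};
        table[i] = CHAINED_KEY;
        return null;
    }

    @SuppressWarnings("unchecked")
    private V putInChain(int i, Object[] chain, K key, V value) {
        for (int j = 0; j < chain.length; j += 2) {
            if (chain[j] == null) {
                chain[j] = key;
                chain[j + 1] = value;
                return null;
            }
            if (chain[j].equals(key)) {
                V old = (V) chain[j + 1];
                chain[j + 1] = value;
                return old;
            }
        }
        Object[] grown = Arrays.copyOf(chain, chain.length + 4); // chain full: grow it
        grown[chain.length] = key;
        grown[chain.length + 1] = value;
        table[i + 1] = grown;
        return null;
    }

    @SuppressWarnings("unchecked")
    V get(Object key) {
        int i = index(key);
        Object cur = table[i];
        if (cur == null) {
            return null;
        }
        if (cur == CHAINED_KEY) {                // linear scan, no hashing inside the chain
            Object[] chain = (Object[]) table[i + 1];
            for (int j = 0; j < chain.length && chain[j] != null; j += 2) {
                if (chain[j].equals(key)) {
                    return (V) chain[j + 1];
                }
            }
            return null;
        }
        return cur.equals(key) ? (V) table[i + 1] : null;
    }
}
```

Constructing the map with a capacity of 1 forces every key into the same slot pair, which is a convenient way to exercise the chained path.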

Memory Comparison: Comparing Maps
Save half the memory

Performance Comparison: Comparing Maps
Map.get() JMH Tests
Axes: Element, Ops/sec (higher is better)

Performance Comparison: Comparing Maps
Map.put() JMH Tests (non-presized)
Axes: Element, Ops/sec (higher is better)

Performance Comparison: Comparing Maps
Map.put() JMH Tests (presized)
Axes: Element, Ops/sec (higher is better)

UnifiedSet: It is a Set not a Map

HashSet
- JDK HashSet is backed by HashMap
    private transient HashMap<E,Object> map
- The value in each (key, value) pair is wasted space
- Key, value pairs in HashMap are wrapped in an Entry, which wastes additional memory
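
The pattern this slide criticizes can be sketched as follows. `MapBackedSet` is a hypothetical class written for illustration; it mirrors the general approach of storing every element as a map key with one shared dummy value, so each element still pays for a map entry and a value reference it never needs.

```java
import java.util.HashMap;

final class MapBackedSet<E> {
    // One shared placeholder; every key maps to it, so the value slot carries no information.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // Returns true if the element was not already present.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    // Returns true if the element was present and has been removed.
    boolean remove(Object element) {
        return map.remove(element) == PRESENT;
    }

    int size() {
        return map.size();
    }
}
```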

UnifiedSet
- EC UnifiedSet is backed by an array
    protected transient Object[] table
- Not backed by a Map

Memory Comparison: Comparing Sets
Uses about 4x less memory

Learn more at GS.com/Engineering © 2016 Goldman Sachs. This presentation should not be relied upon or considered investment advice. Goldman Sachs does not warrant or guarantee to anyone the accuracy, completeness or efficacy of this presentation, and recipients should not rely on it except at their own risk. This presentation may not be forwarded or disclosed except with this disclaimer intact.

Appendix – More Features

More Features: The “as” methods (O(1) cost)

More Features: The “to” methods (O(n) cost)
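
The code shown on these two slides did not survive in the transcript. One reading of the titles, which is an assumption here, is that an “as” method wraps the existing collection in constant time while a “to” method copies every element. The sketch below shows that difference with JDK classes only; it does not use the Eclipse Collections API, and `ViewVersusCopyDemo` with its method names is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ViewVersusCopyDemo {
    // O(1): wraps the source; later changes to the source show through the view.
    static List<String> asUnmodifiableView(List<String> source) {
        return Collections.unmodifiableList(source);
    }

    // O(n): copies every element; later changes to the source are not seen.
    static List<String> toCopy(List<String> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> view = asUnmodifiableView(source);
        List<String> copy = toCopy(source);
        source.add("c");
        System.out.println("view size: " + view.size());
        System.out.println("copy size: " + copy.size());
    }
}
```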

More Features: Handling Exceptions
Panels: Using Java Collections | Using Eclipse Collections

Appendix – Primitive Specialization

Primitive Specialization: Primitive Object
- Boxing is expensive
  - Reference + Header + alignment
  - Wrapper classes are immutable
- Unboxing is expensive

Primitive Specialization: Primitive Object
From Oracle Java guide: “You can’t put an int (or other primitive value) into a collection. Collections can only hold object references, so you have to box primitive values into the appropriate wrapper class. An Integer is not a substitute for an int; autoboxing and unboxing blur the distinction between primitive types and reference types, but they do not eliminate it.”
Source:
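
To make the quoted point concrete, the sketch below sums the same values through a `List<Integer>` and through an `int[]`. It uses only the JDK; `BoxingDemo` and its method names are hypothetical. The results are equal, but the list version creates or looks up an `Integer` object per element on the way in and unboxes it on the way out.

```java
import java.util.ArrayList;
import java.util.List;

final class BoxingDemo {
    // Every add() boxes the int into an Integer; every read unboxes it again.
    static long sumBoxed(int count) {
        List<Integer> boxed = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            boxed.add(i);        // autoboxing, equivalent to Integer.valueOf(i)
        }
        long sum = 0;
        for (Integer value : boxed) {
            sum += value;        // unboxing, equivalent to value.intValue()
        }
        return sum;
    }

    // The array stores the int values directly, with no wrapper objects.
    static long sumPrimitive(int count) {
        int[] raw = new int[count];
        for (int i = 0; i < count; i++) {
            raw[i] = i;
        }
        long sum = 0;
        for (int value : raw) {
            sum += value;
        }
        return sum;
    }

    // A boxed value is a real object and can be referenced as one.
    static boolean boxedValueIsAnObject() {
        Object boxed = 42;
        return boxed instanceof Integer;
    }
}
```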

Primitive Specialization: List
- Java has object and primitive arrays
- Java 8 has primitive Streams
  - They cannot be stored in a field
  - They are not reusable
- Java does not have primitive Lists, Sets, Maps

Primitive Specialization: Primitive Lists
- intList, longList, shortList, floatList, doubleList, charList, byteList, booleanList
- Primitive Sets are similar
- Primitive Maps have combinations of primitives
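
The idea behind a primitive list can be sketched in a few lines: a growable container that keeps its elements in an `int[]` so nothing is boxed. `ToyIntList` below is a made-up illustration and should not be read as the Eclipse Collections implementation or API.

```java
import java.util.Arrays;

final class ToyIntList {
    private int[] items = new int[8]; // backing storage holds raw ints, no wrappers
    private int size;

    void add(int value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2); // double the capacity when full
        }
        items[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return items[index];
    }

    int size() {
        return size;
    }

    long sum() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += items[i];
        }
        return total;
    }

    // Returns a trimmed copy so callers cannot modify the backing array.
    int[] toArray() {
        return Arrays.copyOf(items, size);
    }
}
```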

Primitive Specialization: Primitive Lists – Memory Comparison

Primitive Specialization: Primitive Sets – Memory Comparison

Primitive Specialization: Primitive Maps – Performance Comparison

Primitive Specialization: Primitive Maps – Memory Comparison

Appendix – Resources

Resources
- Eclipse Collections Home Page
- Eclipse Collections on GitHub
- GS Collections Memory Benchmark: collections/presentations/GSC_Memory_Tests.pdf
- Conference Talks and Meetups: collections/wiki/Conference-talks-and-meetups