Relativistic Red Black Trees. Relativistic Programming Concurrent reading and writing improves performance and scalability – concurrent readers may disagree.

Slides:



Advertisements
Similar presentations
Chapter 10: Data Structures II
Advertisements

Comp 122, Spring 2004 Binary Search Trees. btrees - 2 Comp 122, Spring 2004 Binary Trees  Recursive definition 1.An empty tree is a binary tree 2.A node.
Chapter 4: Trees Part II - AVL Tree
Topic 23 Red Black Trees "People in every direction No words exchanged No time to exchange And all the little ants are marching Red and black antennas.
B+-Trees (PART 1) What is a B+ tree? Why B+ trees? Searching a B+ tree
AA Trees another alternative to AVL trees. Balanced Binary Search Trees A Binary Search Tree (BST) of N nodes is balanced if height is in O(log N) A balanced.
A balanced life is a prefect life.
10/31/20111 Relativistic Red-Black Trees Philip Howard 10/31/2011
BST Data Structure A BST node contains: A BST contains
Self-Balancing Search Trees Chapter 11. Chapter 11: Self-Balancing Search Trees2 Chapter Objectives To understand the impact that balance has on the performance.
B + -Trees (Part 1) Lecture 20 COMP171 Fall 2006.
B + -Trees (Part 1). Motivation AVL tree with N nodes is an excellent data structure for searching, indexing, etc. –The Big-Oh analysis shows most operations.
Self-Balancing Search Trees Chapter 11. Chapter Objectives  To understand the impact that balance has on the performance of binary search trees  To.
B + -Trees (Part 1) COMP171. Slide 2 Main and secondary memories  Secondary storage device is much, much slower than the main RAM  Pages and blocks.
B + -Trees COMP171 Fall AVL Trees / Slide 2 Dictionary for Secondary storage * The AVL tree is an excellent dictionary structure when the entire.
Important Problem Types and Fundamental Data Structures
Binary Search Trees Chapter 7 Objectives
C o n f i d e n t i a l HOME NEXT Subject Name: Data Structure Using C Unit Title: Trees.
Binary Trees Chapter 6.
By : Budi Arifitama Pertemuan ke Objectives Upon completion you will be able to: Create and implement binary search trees Understand the operation.
CPSC 335 BTrees Dr. Marina Gavrilova Computer Science University of Calgary Canada.
1 Multiway trees & B trees & 2_4 trees Go&Ta Chap 10.
Priority Queues and Heaps Bryce Boe 2013/11/20 CS24, Fall 2013.
Data Structures Arrays both single and multiple dimensions Stacks Queues Trees Linked Lists.
ICS 220 – Data Structures and Algorithms Week 7 Dr. Ken Cosh.
CHAPTER 71 TREE. Binary Tree A binary tree T is a finite set of one or more nodes such that: (a) T is empty or (b) There is a specially designated node.
Data Structures - CSCI 102 Binary Tree In binary trees, each Node can point to two other Nodes and looks something like this: template class BTNode { public:
Compiled by: Dr. Mohammad Alhawarat BST, Priority Queue, Heaps - Heapsort CHAPTER 07.
Chapter 19: Binary Trees. Objectives In this chapter, you will: – Learn about binary trees – Explore various binary tree traversal algorithms – Organize.
AVL Trees Amanuel Lemma CS252 Algoithms Dec. 14, 2000.
Analysis of Red-Black Tree Because of the rules of the Red-Black tree, its height is at most 2log(N + 1). Meaning that it is a balanced tree Time Analysis:
Introduction to Algorithms Jiafen Liu Sept
Balanced Trees (AVL and RedBlack). Binary Search Trees Optimal Behavior ▫ O(log 2 N) – perfectly balanced tree (e.g. complete tree with all levels filled)
CSIT 402 Data Structures II
Beyond (2,4) Trees What do we know about (2,4)Trees? Balanced
Chapter 13 B Advanced Implementations of Tables – Balanced BSTs.
1 Balanced Trees There are several ways to define balance Examples: –Force the subtrees of each node to have almost equal heights –Place upper and lower.
1 Data Structures and Algorithms Searching Red-Black and Other Dynamically BalancedTrees.
B + -Trees. Motivation An AVL tree with N nodes is an excellent data structure for searching, indexing, etc. The Big-Oh analysis shows that most operations.
Red–black trees.  Define the red-black tree properties  Describe and implement rotations  Implement red-black tree insertion  We will skip red-black.
1 Joe Meehean.  We wanted a data structure that gave us... the smallest item then the next smallest then the next and so on…  This ADT is called a priority.
© 2006 Pearson Education Chapter 10: Non-linear Data Structures Presentation slides for Java Software Solutions for AP* Computer Science A 2nd Edition.
Week 8 - Wednesday.  What did we talk about last time?  Level order traversal  BST delete  2-3 trees.
CS 367 Introduction to Data Structures Lecture 9.
1 Chapter 7 Objectives Upon completion you will be able to: Create and implement binary search trees Understand the operation of the binary search tree.
Binary Search Trees (BSTs) 18 February Binary Search Tree (BST) An important special kind of binary tree is the BST Each node stores some information.
CS510 Concurrent Systems Jonathan Walpole. RCU Usage in Linux.
Rooted Tree a b d ef i j g h c k root parent node (self) child descendent leaf (no children) e, i, k, g, h are leaves internal node (not a leaf) sibling.
B-TREE. Motivation for B-Trees So far we have assumed that we can store an entire data structure in main memory What if we have so much data that it won’t.
Week 15 – Wednesday.  What did we talk about last time?  Review up to Exam 1.
Lecture 10COMPSCI.220.FS.T Binary Search Tree BST converts a static binary search into a dynamic binary search allowing to efficiently insert and.
CS 367 Introduction to Data Structures Lecture 8.
1 Joe Meehean. A A B B D D I I C C E E X X A A B B D D I I C C E E X X  Terminology each circle is a node pointers are edges topmost node is the root.
CS 307 Fundamentals of Computer ScienceRed Black Trees 1 Topic 19 Red Black Trees "People in every direction No words exchanged No time to exchange And.
1 Binary Search Trees  Average case and worst case Big O for –insertion –deletion –access  Balance is important. Unbalanced trees give worse than log.
Read-Copy-Update Synchronization in the Linux Kernel 1 David Ferry, Chris Gill CSE 522S - Advanced Operating Systems Washington University in St. Louis.
Red-Black Trees an alternative to AVL trees. Balanced Binary Search Trees A Binary Search Tree (BST) of N nodes is balanced if height is in O(log N) A.
Binary Search Trees Chapter 7 Objectives
AA Trees.
File Organization and Processing Week 3
Multiway Search Trees Data may not fit into main memory
B+-Trees.
B+-Trees.
Balanced Trees (AVL and RedBlack)
Red-Black Trees.
Binary Tree and General Tree
Kernel Synchronization II
Multi-Way Search Trees
2-3-4 Trees Red-Black Trees
Balanced binary search trees
Presentation transcript:

Relativistic Red Black Trees

Relativistic Programming Concurrent reading and writing improves performance and scalability – concurrent readers may disagree on the order of concurrent updates – orders may be non-linearizable – is this OK? Relativistic programming provides tools for enforcing the order that is necessary for correctness – linearizability is not always necessary!

The Natural World is Relativistic

Implications for Parallel Computing Communication over distance takes time – recipients at different distances will receive information/results at different times – potential to receive information out of order Forcing all recipients to agree on the order can only be done by delaying receipt – i.e. nobody gets it until the slowest has received it – delays slow down computation – the approach is inherently non-scalable

The Natural World is also Causal

Implications for Parallel Computing Scalability is all about allowing local computation to proceed unhindered – but violations of causality are confusing Delays can be introduced to preserve causality – i.e., if two updates are causally related, we can ensure that all readers see them in their correct order (the order the programmer specified) – if they are unrelated, then don’t force all to agree on an order: its unnecessary and expensive

Simple (atomic) Operations List delete operation - readers either see it or they don't - no ordering problems - no incoherency

Complex (non-atomic) Operations List move operation - readers might see before state, after state, or two during states

Dealing with Complex Operations What options do we have for list move? – do we add new before removing old? – do we remove old before adding new? What can concurrent readers observe? – both old and new? – neither old nor new? – old but not new? – new but not old? Which options are OK, and how can we enforce them?

Hiding Incorrect States If its OK to see either old, or new, or both, then we must hide the “neither” state – any reader that fails to find the old must find the new – is it enough to insert the new before removing the old? – if the new appears earlier in the list than the old then we need to wait for readers before we delete the old! If we must only see the before or after state then we need atomic transactions (later)

Relativistic Programming Primitives Generalization of RCU primitives – write-lock, write-unlock for synchronization among writers – start-read, end-read to delimit read sections – wait-for-readers to wait for all pre-existing readers – deferred-free to safely reclaim memory – publish, read to remove reordering problems

Rules for Relativistic Programming Writers must keep data in a continually consistent state – a reader must be able to safely traverse a data structure at any time – individual updates must take effect at a well-defined point with respect to a concurrent reader this is like linearizability but it does not necessarily imply that all readers must agree on the order of unrelated updates!

Rules for Relativistic Programming When updates must be seen in order, writers must insert the appropriate delay – wait-for-readers – sometimes rp-read is enough Readers must delimit their read sections and not hold references between read sections – just like conventional locking rules To simplify CPU and compiler reordering issues readers must use the rp-read to access shared data writers update shared data using the rp-publish

Rules for Relativistic Programming Sometimes writers can reason about read traversal order and avoid using wait-for-readers – readers will naturally see updates in order even without it – example: moving an element from earlier to later in a singly linked list, by copying then removing, guarantees that all readers will see one or the other or both

But How Generally Useful is This? We have some primitives and a few simple rules They work well for simple list operations They also work well for hash-tables But is this enough for more complex data structures?

Red Black Trees RB-trees store sorted pairs They guarantee O(log(N)) performance for inserts, deletes, and lookups – they limit the depth of the tree (partially balanced) – updates require restructuring of the tree They are very difficult to parallelize – difficult to avoid deadlock with per-node locking – most implementations use a single global lock – they are a stiff challenge for Relativistic Programming!

Red Black Tree Properties All nodes on the left branch of a subtree have a key less than that of the root of the subtree All nodes on the right branch of a subtree have a key greater or equal to that of the root Each node has a color (red or black) Both children of a red node are black The black depth of every leaf is the same – this is the balance property

Rebalancing Updates potentially require the tree to be restructured to maintain its balance properties – changes can affect any node between the update and root – locking requires a lock for each affected node acquiring locks on the way up can deadlock with readers that are descending acquiring all possible locks ahead of time degenerates to coarse grain locking – conventional approaches use a single lock for the tree no concurrency

Restructuring is also a concern for RP A thread searching for B could fail to find it

Restructuring is also a concern for RP A thread searching for B could fail to find it

Lookups Readers ignore color and do not access parent pointers Readers stop searching when they find a key that matches Updates can change colors and place multiple copies in the tree so long as they appear in valid sort positions, without affecting readers Relativistic lookups are essentially sequential code!

Inserts A new node is always inserted as a leaf Concurrent readers will see it if they dereference its pointer before the update publishes the pointer If the insert breaks the tree’s color or balance properties it must be recolored or rebalanced – that's the! tricky part

Deletes Nodes only deleted from the bottom of the tree – this may require some restructuring (called a swap) Readers will either see the deleted node, or they will not, depending on when they dereference its pointer The node's memory must not be reclaimed while readers are still accessing it – use rp-free If the delete leaves the tree unbalanced it must be restructured

Swap and Deletion of Node B Before During After Move B's next node (C) to B's position - create new copy of C (C') - assign B's children to C' - give C' B's parent (B is now unreachable) - defer free of B's memory Defer deletion of C - wait for readers, then reassign C's children to C's parent (E) Defer free of C's memory

Code for Swap

Special Case Swap No copy is necessary C adopts B's left child (A) A appears twice in the tree (its temporarily a DAG) - this does not affect readers Defer free of B

Code for Special Case Swap

Diagonal Restructure

Code for Diagonal Restructure

Zig Restructure

Code for Zig Restructure

Read (lookup) Performance

Sequential Write (insert/delete) Performance

Write-side Synchronization Global locking – no concurrent writes Fine-grain locking – deadlock CCAVL – concurrent, but different data structure STM - transactional memory – disjoint access parallelism (to be discussed later)

Concurrent Write (insert/delete) Performance

Concurrent Read (lookup) Performance

Linearizability Lookup, insert and delete operations are linearizable – there is a well defined point at which they take effect – they are primitive operations with only one update (can’t be seen out of order!)

Linearizability Lookups – last rp-read is the linearization point Inserts – rp-publish is the linearization point Deletes – store is the linearization point Traversals are not necessarily linearizable

Traversals Traversals visit the nodes in order – they make use of the next operation Traversals are challenging for RP because they read more than one location – restructures might cause nodes to be missed or duplicated

Traversals Three approaches explored: – treat a traversal as a single indivisible operation acquire a lock and hold it for the duration replace the write lock with a reader-writer lock – O (N log (N)) relativistic traversals can't use parent pointers use relativistic lookups to construct the traversal (traversal take O (N log (N))) updates only wait for lookups traversal is non-linearizable

Traversals Third approach: O(N) relativistic traversal – requires modification of update operations to allow readers to use parent pointers – requires additional node copies to preserve the parent pointers

Traversals

Conclusions Relativistic programming can be applied to a data structure as complex as a Red-Black Tree with excellent performance and scalability results

Spare Slides

Reader-visible States in Swap

Reader-visible States in Diagonal Restructure

Reader-visible States in Zig Restructure