A Transaction-Friendly Dynamic Memory Manager for Embedded Multicore Systems Maurice Herlihy Joint with Thomas Carle, Dimitra Papagiannopoulou Iris Bahar,

Slides:



Advertisements
Similar presentations
Transactional Memory Parag Dixit Bruno Vavala Computer Architecture Course, 2012.
Advertisements

The many faces of TM Tim Harris. Granularity Distributed, large-scale atomic actions Composable shared memory data structures Leaf shared memory data.
Maurice Herlihy (DEC), J. Eliot & B. Moss (UMass)
Raphael Eidenbenz Roger Wattenhofer Roger Wattenhofer Good Programming in Transactional Memory Game Theory Meets Multicore Architecture.
Dynamic Memory Allocation (also see pointers lectures) -L. Grewe.
Steal-on-abort Improving Transactional Memory Performance through Dynamic Transaction Reordering Mohammad Ansari University of Manchester.
Transactional Locking Nir Shavit Tel Aviv University (Joint work with Dave Dice and Ori Shalev)
Transactional Memory Supporting Large Transactions Anvesh Komuravelli Abe Othman Kanat Tangwongsan Hardware-based.
McRT-Malloc: A Scalable Non-Blocking Transaction Aware Memory Allocator Ali Adl-Tabatabai Ben Hertzberg Rick Hudson Bratin Saha.
U NIVERSITY OF M ASSACHUSETTS, A MHERST – Department of Computer Science The Implementation of the Cilk-5 Multithreaded Language (Frigo, Leiserson, and.
Pessimistic Software Lock-Elision Nir Shavit (Joint work with Yehuda Afek Alexander Matveev)
Transactional Memory Overview Olatunji Ruwase Fall 2007 Oct
Nested Parallelism in Transactional Memory Kunal Agrawal, Jeremy T. Fineman and Jim Sukha MIT.
PARALLEL PROGRAMMING with TRANSACTIONAL MEMORY Pratibha Kona.
Presented by: Ofer Kiselov & Omer Kiselov Supervised by: Dmitri Perelman Final Presentation.
TOWARDS A SOFTWARE TRANSACTIONAL MEMORY FOR GRAPHICS PROCESSORS Daniel Cederman, Philippas Tsigas and Muhammad Tayyab Chaudhry.
Lock-free Cuckoo Hashing Nhan Nguyen & Philippas Tsigas ICDCS 2014 Distributed Computing and Systems Chalmers University of Technology Gothenburg, Sweden.
1 Johannes Schneider Transactional Memory: How to Perform Load Adaption in a Simple And Distributed Manner Johannes Schneider David Hasenfratz Roger Wattenhofer.
1 MetaTM/TxLinux: Transactional Memory For An Operating System Hany E. Ramadan, Christopher J. Rossbach, Donald E. Porter and Owen S. Hofmann Presenter:
[ 1 ] Agenda Overview of transactional memory (now) Two talks on challenges of transactional memory Rebuttals/panel discussion.
Lock vs. Lock-Free memory Fahad Alduraibi, Aws Ahmad, and Eman Elrifaei.
University of Michigan Electrical Engineering and Computer Science 1 Parallelizing Sequential Applications on Commodity Hardware Using a Low-Cost Software.
Lecture 36: Programming Languages & Memory Management Announcements & Review Read Ch GU1 & GU2 Cohoon & Davidson Ch 14 Reges & Stepp Lab 10 set game due.
Selfishness in Transactional Memory Raphael Eidenbenz, Roger Wattenhofer Distributed Computing Group Game Theory meets Multicore Architecture.
1 New Architectures Need New Languages A triumph of optimism over experience! Ian Watson 3 rd July 2009.
Memory Layout C and Data Structures Baojian Hua
Why The Grass May Not Be Greener On The Other Side: A Comparison of Locking vs. Transactional Memory Written by: Paul E. McKenney Jonathan Walpole Maged.
An Integrated Hardware-Software Approach to Transactional Memory Sean Lie Theory of Parallel Systems Monday December 8 th, 2003.
Real-Time Concepts for Embedded Systems Author: Qing Li with Caroline Yao ISBN: CMPBooks.
CS 11 C track: lecture 5 Last week: pointers This week: Pointer arithmetic Arrays and pointers Dynamic memory allocation The stack and the heap.
Automatic Data Partitioning in Software Transactional Memories Torvald Riegel, Christof Fetzer, Pascal Felber (TU Dresden, Germany / Uni Neuchatel, Switzerland)
Programming Paradigms for Concurrency Part 2: Transactional Memories Vasu Singh
Accelerating Precise Race Detection Using Commercially-Available Hardware Transactional Memory Support Serdar Tasiran Koc University, Istanbul, Turkey.
Reduced Hardware NOrec: A Safe and Scalable Hybrid Transactional Memory Alexander Matveev Nir Shavit MIT.
Storage Bindings Allocation is the process by which the memory cell or collection of memory cells is assigned to a variable. These cells are taken from.
State Teleportation How Hardware Transactional Memory can Improve Legacy Data Structures Maurice Herlihy and Eli Wald Brown University.
JVSTM and its applications João Software Engineering Group.
Hybrid Transactional Memory Sanjeev Kumar, Michael Chu, Christopher Hughes, Partha Kundu, Anthony Nguyen, Intel Labs University of Michigan Intel Labs.
On the Performance of Window-Based Contention Managers for Transactional Memory Gokarna Sharma and Costas Busch Louisiana State University.
CS162 Week 5 Kyle Dewey. Overview Announcements Reactive Imperative Programming Parallelism Software transactional memory.
A Methodology for Creating Fast Wait-Free Data Structures Alex Koganand Erez Petrank Computer Science Technion, Israel.
CS510 Concurrent Systems Why the Grass May Not Be Greener on the Other Side: A Comparison of Locking and Transactional Memory.
CS- 492 : Distributed system & Parallel Processing Lecture 7: Sun: 15/5/1435 Foundations of designing parallel algorithms and shared memory models Lecturer/
Software Transactional Memory Should Not Be Obstruction-Free Robert Ennals Presented by Abdulai Sei.
© 2008 Multifacet ProjectUniversity of Wisconsin-Madison Pathological Interaction of Locks with Transactional Memory Haris Volos, Neelam Goyal, Michael.
How D can make concurrent programming a piece of cake Bartosz Milewski D Programming Language.
Hany E. Ramadan and Emmett Witchel University of Texas at Austin TRANSACT 2/15/2009 The Xfork in the Road to Coordinated Sibling Transactions.
Efficient Dynamic Heap Allocation of Scratch-Pad Memory Ross McIlroy, Peter Dickman and Joe Sventek Carnegie Trust for the Universities of Scotland.
3/12/2013Computer Engg, IIT(BHU)1 OpenMP-1. OpenMP is a portable, multiprocessing API for shared memory computers OpenMP is not a “language” Instead,
1 Dynamic Memory Allocation. 2 In everything we have done so far, our variables have been declared at compile time. In these slides, we will see how to.
By Anand George SourceLens.org Copyright. All rights reserved. Content Owner - Meera R (meera at sourcelens.org)
On Transactional Memory, Spinlocks and Database Transactions Khai Q. Tran Spyros Blanas Jeffrey F. Naughton (University of Wisconsin Madison)
SMP Basics KeyStone Training Multicore Applications Literature Number: SPRPxxx 1.
4 November 2005 CS 838 Presentation 1 Nested Transactional Memory: Model and Preliminary Sketches J. Eliot B. Moss and Antony L. Hosking Presented by:
Commutativity and Coarse-Grained Transactions Maurice Herlihy Brown University Joint work with Eric Koskinen and Matthew Parkinson (POPL 10)
Novel Paradigms of Parallel Programming Prof. Smruti R. Sarangi IIT Delhi.
Ran Liu (Fudan Univ. Shanghai Jiaotong Univ.)
Maurice Herlihy, Victor Luchangco, Mark Moir, William N. Scherer III
Maurice Herlihy and J. Eliot B. Moss,  ISCA '93
Irina Calciu Justin Gottschlich Tatiana Shpeisman Gilles Pokam
Algorithmic Improvements for Fast Concurrent Cuckoo Hashing
PHyTM: Persistent Hybrid Transactional Memory
Faster Data Structures in Transactional Memory using Three Paths
Dynamic Memory Allocation
Jonathan Mak & Alan Mycroft University of Cambridge
Concurrency: Mutual Exclusion and Process Synchronization
Transactions with Nested Parallelism
Concurrency in Smart Contracts
CS703 – Advanced Operating Systems
Dynamic Performance Tuning of Word-Based Software Transactional Memory
Presentation transcript:

A Transaction-Friendly Dynamic Memory Manager for Embedded Multicore Systems Maurice Herlihy Joint with Thomas Carle, Dimitra Papagiannopoulou Iris Bahar, Tali Moreshet

Modern Embedded Systems now many core, distributed memory Parallel data structures Locks don’t scale Transactions do (más o menos) 2

Dynamic Memory Management 3 Software’s « oldest profession> Usually provided by OS/Libraries For parallel data structures on embedded platforms with HTM

High-End Embedded Systems 4 simplicity small memory footprint resource needs roughly known in advance but not always exactly

Dynamic Memory Management Heap is linked list or binary tree Applications explicitly malloc() and free() memory 5 Size: 16 bytes Start: 0x Size: 64 bytes Start: 0x

Principles of dynamic memory management Allocate a 32-byte chunk 6 Size: 16 bytes Start: 0x Size: 64 bytes Start: 0x Too small Large enough

Dynamic Memory Management Allocate a 32-byte chunk 7 Size: 16 bytes Start: 0x Size: 64 bytes Start: 0x

Dynamic Memory Management Allocate a 32-byte chunk 8 Size: 16 bytes Start: 0x Size: 64 bytes Start: 0x Size: 32 Bytes Start: 0x Size: 32 Bytes Start: 0x

Principles of dynamic memory management Allocate a 32-byte chunk 9 Size: 16 bytes Start: 0x Size: 64 bytes Start: 0x Size: 32 Bytes Start: 0x Size: 32 Bytes Start: 0x

Principles of dynamic memory management Allocate a 32-byte chunk 10 Size: 16 bytes Start: 0x Size: 32 Bytes Start: 0x Size: 32 Bytes Start: 0x Freeing is similar …

Dynamic Memory Management is a concurrent data structure problem steps must be atomic 11

Fast Path Thread-local pools successful malloc() and free() need no synchronization Calls happen inside transaction Speculative allocation can speed up the execution 12

Slow Path If local pool exhausted, must allocate from shared heap Allocate within transaction means more conflicts Instead: 1.abort current transaction 2.allocate fresh local pool 3.restart transaction 13

Benefits Transactions have smaller footprints Disentangles application shared heap management Transactions more likely to commit 14

Transaction-friendly memory management At application startup: 1.One thread initializes the heap 2.Each thread allocates its own local pool 15

Transaction-friendly dynamic memory management 16 Enter transaction Next Instruction ? malloc() Allocate from local pool Enough memory? Return allocated chunk yes Abort transaction Enter allocation transaction Free remaining pool and allocate fresh pool Commit allocation transaction no

Transaction-friendly dynamic memory management 17 Enter transaction Next Instruction ? free() Free to local pool

Evaluation STAMP Vacation benchmark 1, 2, 4, 8 and 16 cores Local pools large enough so no refill needed Tested: transactional with local pools lock-based with local pools lock-based without local pools 18

Evaluation Vacation benchmark (Normal run) 19 worse better

Evaluation (2) Vacation benchmark (Refill mechanism on one core) 8 cores local pool sizes: 64, 128, 256, 512, 1024, 2048 bytes Transactional synchronization 20

Evaluation (2) Vacation benchmark (Refill mechanism on one core) 21

Evaluation (3) Vacation benchmark (Refill mechanism on all but one core) 8 cores local pool sizes: 64, 128, 256, 512, 1024 or 2048 bytes Transactional synchronization 22

Evaluation (3) Vacation benchmark (Refill mechanism on all but one core) 23 worst case: 20% increase in execution time « Wall » between 512 and 1024 bytes: refills may induce additional conflicts

Conclusions Dynamic memory management for Embedded TM Results Better than locking, in-transaction malloc More flexible than static allocation Future work More benchmarks Explore new fallback reprovision strategies Memory stealing between threads 24