Presented by: Ofer Kiselov & Omer Kiselov Supervised by: Dmitri Perelman Final Presentation.

1 Presented by: Ofer Kiselov & Omer Kiselov Supervised by: Dmitri Perelman Final Presentation

2 Overview
• Repeating the midterm presentation on the following subjects:
  * Software Transactional Memory abstraction
  * STM implementation example - TL2 overview
  * Aborts in STM
  * Unnecessary aborts in STM
  * Project goal
  * Implementation
  * Overview
• Online part – implementation
• Online logging
• Evaluation
• Hardware
• Deuce
• Benchmarks
• Results
• Conclusion and analysis
• Nice to have
• Future work

3 Importance Of Parallel Programming
• Frequency barrier – single-core processor performance can no longer improve.
• Switch to multi-cores.
• Parallel programs allow utilizing multi-core processors.
• Need for synchronization when accessing shared data.

4 Transactional Memory – why?
• Current synchronization – locks
  * Coarse-grained – limit parallelism
  * Fine-grained – high programming complexity
  * Error-prone (deadlocks / livelocks)
• Transactional memory solution
  * Intuitive for a programmer
  * Provides a “transaction” abstraction for a critical section (operations executed atomically)
  * Implemented in both software and hardware.

5 Why Do Aborts Happen?
[Diagram: OBJECT1 (O1) and OBJECT2 (O2) accessed by transactions T1–T4; legend: Aborted / Committed]
• T1, T2 and T3 read from O1 and write to O2.
• T4 reads from O2 and writes to O1.
• To maintain consistency, if T4 commits, T1, T2 and T3 must abort!

6 Unnecessary Aborts
• Aborts are bad
  * work is lost, resources are wasted, throughput decreases
• Some aborts are necessary
  * continuing the run would violate correctness
• And some aborts are not
  * Analyzing whether the algorithm should abort is too expensive.
• “Unnecessary” abort: it could be avoided
  * keep more versions, better check of transactional dependencies.
[Diagram: objects o1 and o2; transactions T1, T2, T3 labeled C and A]

7 Project Goals
• Build a software analysis tool that:
  * measures abort statistics for a given run
  * evaluates how many of the aborts were unnecessary
  * evaluates the damage to performance
• “Will it pay off to add designs to stop the unnecessary aborts?”

8 Project Formation
• An offline part for analyzing the run:
  * reads the log of the run.
  * gathers statistics.
  * analyzes unnecessary aborts.
• An online part for logging the run:
  * is inserted into a specific algorithm
  * runs in a benchmark
  * flushes the run info to an XML log file

9 Offline Part
Parser
• Every log line represents a transactional action
  * represented by the LogLine abstract class
• Parser responsibility:
  * iterate over the XML
  * create appropriate LogLine instances
• LogLine factories for different operation types
  * transactional start
  * read operation
  * write operation
  * transactional commit
Analyzer
• Gives basic statistics regarding the transactions run.
• Counts aborts per reason.
• Counts reads and writes.
• Counts transactions.
• Inserts the path into the Run Descriptor ADT struct.
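The parser described on this slide could be sketched roughly as follows. This is a minimal illustration, not the project's code: the XML layout (`<line op="..." tx="..." obj="..."/>`), the subclass names, and the `LogParserSketch` class are all assumptions made for the example; only the LogLine abstract class and the four operation types come from the slide.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// One transactional action read from the log (field names are hypothetical).
abstract class LogLine {
    final int txId;
    LogLine(int txId) { this.txId = txId; }
    abstract String kind();
}

final class StartLine extends LogLine {
    StartLine(int txId) { super(txId); }
    String kind() { return "start"; }
}

final class ReadLine extends LogLine {
    final String objectId;
    ReadLine(int txId, String objectId) { super(txId); this.objectId = objectId; }
    String kind() { return "read"; }
}

final class WriteLine extends LogLine {
    final String objectId;
    WriteLine(int txId, String objectId) { super(txId); this.objectId = objectId; }
    String kind() { return "write"; }
}

final class CommitLine extends LogLine {
    CommitLine(int txId) { super(txId); }
    String kind() { return "commit"; }
}

final class LogParserSketch {
    // One factory per operation type, keyed by the (assumed) "op" attribute.
    private static final Map<String, Function<Element, LogLine>> FACTORIES = new HashMap<>();
    static {
        FACTORIES.put("start", e -> new StartLine(tx(e)));
        FACTORIES.put("read", e -> new ReadLine(tx(e), e.getAttribute("obj")));
        FACTORIES.put("write", e -> new WriteLine(tx(e), e.getAttribute("obj")));
        FACTORIES.put("commit", e -> new CommitLine(tx(e)));
    }

    private static int tx(Element e) {
        return Integer.parseInt(e.getAttribute("tx"));
    }

    // Iterates over the XML and creates the appropriate LogLine instances.
    static List<LogLine> parse(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception ex) {
            throw new IllegalStateException("could not read log", ex);
        }
        NodeList nodes = doc.getElementsByTagName("line");
        List<LogLine> lines = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            Function<Element, LogLine> factory = FACTORIES.get(e.getAttribute("op"));
            if (factory == null) {
                throw new IllegalArgumentException("unknown op: " + e.getAttribute("op"));
            }
            lines.add(factory.apply(e));
        }
        return lines;
    }
}
```

Keeping the factories in a map means a new operation type (for example an abort record) only needs one more entry and one more LogLine subclass.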

10 Transactional Dependencies Run Descriptor is a precedence graph!

11 RUN DESCRIPTOR
[Diagram: transactions T1 and T4 connected to OBJECT1 and OBJECT2 (each with a Version2) by Reader and Writer edges, with a WaR dependency]
In order to create the graph, we needed to establish a way to turn the basic run into a graph.

12 ABORTS ANALYZER
• Searches for unnecessary aborts in the RUN DESCRIPTOR
• Speculatively adds the edges of the aborted transaction to the RUN DESCRIPTOR
• Using DFS, finds cycles in the precedence graph.
  * Cycles represent necessary aborts
• Removes the edges at the end of the analysis.
• Built as a visitor pattern
  * Flexible for more complex analysis
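The speculative check on this slide can be illustrated with a small sketch. It assumes the precedence graph is a plain adjacency map keyed by transaction name and that edges are given as two-element arrays; the class `PrecedenceGraphSketch` and its method names are hypothetical, and the visitor structure of the real analyzer is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PrecedenceGraphSketch {
    // Adjacency sets: an edge A -> B means A must be serialized before B.
    private final Map<String, Set<String>> edges = new HashMap<>();

    void addEdge(String from, String to) {
        edges.computeIfAbsent(from, k -> new HashSet<>()).add(to);
        edges.computeIfAbsent(to, k -> new HashSet<>());
    }

    void removeEdge(String from, String to) {
        Set<String> out = edges.get(from);
        if (out != null) {
            out.remove(to);
        }
    }

    boolean hasEdge(String from, String to) {
        return edges.getOrDefault(from, Collections.<String>emptySet()).contains(to);
    }

    // Depth-first search; reaching a node that is still on the DFS stack means a cycle.
    boolean hasCycle() {
        Set<String> inProgress = new HashSet<>();
        Set<String> done = new HashSet<>();
        for (String node : edges.keySet()) {
            if (dfs(node, inProgress, done)) {
                return true;
            }
        }
        return false;
    }

    private boolean dfs(String node, Set<String> inProgress, Set<String> done) {
        if (done.contains(node)) {
            return false;
        }
        if (!inProgress.add(node)) {
            return true; // back edge found
        }
        for (String next : edges.getOrDefault(node, Collections.<String>emptySet())) {
            if (dfs(next, inProgress, done)) {
                return true;
            }
        }
        inProgress.remove(node);
        done.add(node);
        return false;
    }

    // Speculatively adds the aborted transaction's edges, checks for a cycle,
    // then removes the added edges again. A cycle means the abort was necessary.
    boolean abortWasNecessary(List<String[]> speculativeEdges) {
        List<String[]> added = new ArrayList<>();
        for (String[] e : speculativeEdges) {
            if (!hasEdge(e[0], e[1])) {
                addEdge(e[0], e[1]);
                added.add(e);
            }
        }
        try {
            return hasCycle();
        } finally {
            for (String[] e : added) {
                removeEdge(e[0], e[1]);
            }
        }
    }
}
```

For instance, if committed T4 read an object that aborted T1 later wrote, and T1 read an object that T4 later wrote, the speculative edges T4 → T1 and T1 → T4 form a cycle, so that abort would be classified as necessary.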

13 Online part
Our goals:
• Run benchmarks to prepare the statistics for the offline part.
• Be sure that the measurements don’t distort the scheduling picture.

14 Platform Supporting STM
Introducing Deuce STM, created by: Guy Korland, Nir Shavit, Pascal Felber, Igor Berman
• Deuce STM is an open source Java STM environment.
• With Deuce STM, if the method:
    public void doThing() {…}
  is not thread-safe, then
    @Atomic
    public void doThing() {…}
  is!
Source code (excerpt; the rest of the class is not shown on the slide):
    final public class Context implements org.deuce.transaction.Context {
        private static String objectId(Object reference, long field) {
            return Long.toString(System.identityHashCode(reference) + field);
        }
        final static AtomicInteger clock = new AtomicInteger(0);

15 How To Utilize Deuce for Logging
• Modified the code to call logging utils.
• More exception types to distinguish between different abort types.
[Diagram: Logger, Deuce Framework, TL2 Algorithm, and Transactions Code (Start, Read, Write, Commit); caption: “A Perfectly Scalable Code”]

16 Online Part Implementation Version 1
Main problem: adding to a priority queue damages parallelism and lowers performance.

17 Online Part Implementation Version 2
The Back End Collector
• The threads don’t do any extra actions to log the run.
[Diagram: steps 1–3, with the labels “The Loglines have ended” and “The program has ended”]
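The slide does not spell out how the back-end collector works, so the following is only one plausible sketch of the idea: each worker thread appends to its own private buffer (no shared queue on the hot path), and a collector merges the buffers once the workers have finished. The class `ThreadLocalLogSketch`, the use of `System.nanoTime()` as the ordering key, and the `demo` helper are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

final class ThreadLocalLogSketch {
    static final class Entry {
        final long timestamp;
        final String text;
        Entry(long timestamp, String text) { this.timestamp = timestamp; this.text = text; }
    }

    // Every thread registers its private buffer here exactly once.
    private final ConcurrentLinkedQueue<List<Entry>> allBuffers = new ConcurrentLinkedQueue<>();

    private final ThreadLocal<List<Entry>> buffer = ThreadLocal.withInitial(() -> {
        List<Entry> b = new ArrayList<>();
        allBuffers.add(b);
        return b;
    });

    // Hot path: append to the caller's own buffer, no locking and no shared queue.
    void log(String text) {
        buffer.get().add(new Entry(System.nanoTime(), text));
    }

    // Collector: call only after all logging threads have finished (e.g. after join()).
    List<String> collect() {
        List<Entry> merged = new ArrayList<>();
        for (List<Entry> b : allBuffers) {
            merged.addAll(b);
        }
        // Stable sort, so entries of one thread keep their relative order on ties.
        merged.sort(Comparator.comparingLong((Entry e) -> e.timestamp));
        List<String> out = new ArrayList<>();
        for (Entry e : merged) {
            out.add(e.text);
        }
        return out;
    }

    // Small driver: starts worker threads, waits for them, then collects the log.
    static List<String> demo(int threads, int perThread) {
        ThreadLocalLogSketch logger = new ThreadLocalLogSketch();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    logger.log("T" + id + " op " + i);
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return logger.collect();
    }
}
```

Compared with a shared priority queue, the cost of ordering the log lines moves from every read and write to a single merge after the run.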

18 What Do We Check?
• Commit rate
• Unnecessary aborts (classified by types)
• Wasted work
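The slides do not give formulas for these measures, so the sketch below uses natural definitions as assumptions: commit ratio as commits over all finished transaction attempts, unnecessary-abort percentage as unnecessary aborts over all aborts, and wasted reads as reads performed by aborted attempts over all reads. The class `RunStatsSketch` and its field names are hypothetical.

```java
final class RunStatsSketch {
    final long commits;
    final long aborts;
    final long unnecessaryAborts;
    final long readsInCommitted;
    final long readsInAborted;

    RunStatsSketch(long commits, long aborts, long unnecessaryAborts,
                   long readsInCommitted, long readsInAborted) {
        this.commits = commits;
        this.aborts = aborts;
        this.unnecessaryAborts = unnecessaryAborts;
        this.readsInCommitted = readsInCommitted;
        this.readsInAborted = readsInAborted;
    }

    // Guarded division so an empty run yields 0 instead of NaN.
    private static double ratio(long part, long whole) {
        return whole == 0 ? 0.0 : (double) part / whole;
    }

    // Assumed definition: commits / (commits + aborts).
    double commitRatio() {
        return ratio(commits, commits + aborts);
    }

    // Assumed definition: unnecessary aborts / all aborts.
    double unnecessaryAbortRatio() {
        return ratio(unnecessaryAborts, aborts);
    }

    // Assumed definition: reads done by aborted attempts / all reads.
    double wastedReadRatio() {
        return ratio(readsInAborted, readsInCommitted + readsInAborted);
    }
}
```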

19 Testbenches
• SSCA2 – short transactions, low contention, high memory utilization.
• Vacation – high contention, medium-length transactions, mostly reads.
• AVL tree – customizable contention, medium-length transactions.
  * Random choice between add, remove or search for a random integer in the tree.
  * Ability to change the integer range for custom contention.
  * Created by us.
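The workload of the AVL tree benchmark can be pictured with a single-threaded sketch. It is not the benchmark itself: `java.util.TreeSet` (a red-black tree) stands in for the authors' AVL tree, there are no transactions, and the class `TreeWorkloadSketch` and its parameters are hypothetical. It only shows how a smaller key range makes operations hit the same keys more often, which is what raises contention once each operation runs as a transaction on several threads.

```java
import java.util.Random;
import java.util.TreeSet;

final class TreeWorkloadSketch {
    // Stand-in for the AVL tree; in the real benchmark each operation would be a transaction.
    final TreeSet<Integer> tree = new TreeSet<>();
    int adds;
    int removes;
    int searches;

    // Runs `operations` random operations on keys drawn from [0, keyRange).
    void run(int operations, int keyRange, long seed) {
        Random rnd = new Random(seed);
        for (int i = 0; i < operations; i++) {
            int key = rnd.nextInt(keyRange);
            switch (rnd.nextInt(3)) {
                case 0:
                    tree.add(key);
                    adds++;
                    break;
                case 1:
                    tree.remove(key);
                    removes++;
                    break;
                default:
                    tree.contains(key);
                    searches++;
                    break;
            }
        }
    }
}
```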

20 Hardware
• Benchmarks run on Trinity:
  * 8 quad-cores
  * 132 GB RAM
• The machine was otherwise idle during our runs.

21 Simulation Results – AVL tree
[Figures: Commit Ratio; Percentage of Unnecessary Aborts; Amount of Aborts & Unnecessary Aborts; Percentage of Wasted Reads. All graphs are a function of the number of threads.]

22 Simulation Results – SSCA2
[Figures: Commit Ratio; Percentage of Unnecessary Aborts; Amount of Aborts & Unnecessary Aborts; Percentage of Wasted Reads. All graphs are a function of the number of threads.]

23 Simulation Results – Vacation
[Figures: Commit Ratio; Percentage of Unnecessary Aborts; Amount of Aborts & Unnecessary Aborts; Percentage of Wasted Reads. All graphs are a function of the number of threads.]

24 Simulation Results – AVL tree
[Figures: all graphs are a function of the number of threads.]

25 Simulation Results – SSCA2
[Figures: all graphs are a function of the number of threads.]

26 Simulation Results – Vacation
[Figures: all graphs are a function of the number of threads.]

27 Simulation Results – AVL tree
[Figures: Percentage of Aborts by type; Amount of Aborts by type. All graphs are a function of the number of threads.]

28 Simulation Results – SSCA2
[Figures: Percentage of Aborts by type; Amount of Aborts by type. All graphs are a function of the number of threads.]

29 Simulation Results – Vacation
[Figures: Percentage of Aborts by type; Amount of Aborts by type. All graphs are a function of the number of threads.]

30 Logger impact on performance
• Logger access obviously demands more from the Deuce framework.
  * More memory accesses
  * More exception types
  * On every read & write
• How much distortion does the logger cause?
[Figure: AVL test with logging – commit ratio]

31 Conclusions
• Parallelism increases → the abort rate, the unnecessary abort rate and the wasted work rate increase as well.
• Parallelism increases → more aborts are caused by locked objects.
• To improve STM performance over highly parallel workloads, algorithms may be improved to prevent unnecessary aborts.

32 Nice To Have
• Automatically exporting the precedence graph as a Microsoft Visio drawing.
• Possibility to analyze according to abort types.
• GUI.
• Expansion of the simulation to more algorithms and testbenches, which would make it possible to compare performance between algorithms.

33 Future Work
• Drop in abort rates after 128 threads due to a drop in concurrency – further analysis is required.
• Unfit versions cause a lot of aborts.
  * The new SMV algorithm may solve this problem.

34 BIBLIOGRAPHY
• I. Keidar and D. Perelman. On avoiding spare aborts in transactional memory. In Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures, pages 59–68, 2009.
• I. Keidar and D. Perelman. SMV: Selective Multi-Versioning STM.
• D. Dice, O. Shalev, and N. Shavit. Transactional locking II. In Proceedings of the 20th International Symposium on Distributed Computing, pages 194–208, 2006.
• M. Herlihy, V. Luchangco, M. Moir, and W. N. Scherer, III. Software transactional memory for dynamic-sized data structures. In Proceedings of the twenty-second annual symposium on Principles of distributed computing, pages 92–101, 2003.

