
1 Applications of Non-Blocking Data Structures to Real-Time Systems
Seminar for the degree of Licentiate of Philosophy
Håkan Sundell, phs@cs.chalmers.se
Computing Science, Chalmers University of Technology

2 Background
ARTES project: "Applications of wait/lock-free protocols to real-time systems"
Started in March 1999.
One active Ph.D. student.
Project leader: Philippas Tsigas

3 Schedule
Introduction
  – Real-Time Systems
  – Synchronization
Shared Data Objects: Snapshots
  – Evaluation
The Effect of Using Timing Information
  – Snapshot
  – Shared Register
Software engineering part
Conclusions & Future Work

4 Real-Time Systems
Uni- or multi-processor system
Interconnection network
  – e.g. the Controller Area Network (CAN)
[Figure label: CPU]

5 Real-Time Systems
Shared memory
  – Uniform Memory Access (UMA)
  – Non-Uniform Memory Access (NUMA)
[Figure labels: CPU, Cache, bus, Memory]

6 Real-Time Systems
Cooperating tasks
  – Timing constraints
Inter-task communication: shared data objects
  – Needs synchronization
[Figure: tasks T1, T2, T3]

7 Schedule
Introduction
  – Real-Time Systems
  – Synchronization
Shared Data Objects: Snapshots
  – Evaluation
The Effect of Using Timing Information
  – Snapshot
  – Shared Register
Software engineering part
Conclusions & Future Work

8 Synchronization
Synchronization using locks
  – Uses semaphores, spinning, disabling interrupts
  – Negative
      Blocking
      Priority inversion
      Risk of deadlock
  – Positive
      Execution time guarantees are easy to derive, but pessimistic
Pattern: take lock ... do operation ... release lock
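
The take lock / do operation / release lock pattern can be sketched in C with a POSIX mutex. This is only an illustration: the `locked_counter` type and its two functions are hypothetical names chosen here and are not taken from the talk.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared data object: a counter guarded by one mutex. */
typedef struct {
    pthread_mutex_t lock;
    long value;
} locked_counter;

void locked_counter_init(locked_counter *c) {
    int rc = pthread_mutex_init(&c->lock, NULL);
    assert(rc == 0);
    (void)rc;
    c->value = 0;
}

/* Adds delta and returns the new value. Any other task calling this
   function blocks until the lock holder has released the mutex. */
long locked_counter_add(locked_counter *c, long delta) {
    pthread_mutex_lock(&c->lock);      /* take lock */
    c->value += delta;                 /* do operation */
    long result = c->value;
    pthread_mutex_unlock(&c->lock);    /* release lock */
    return result;
}
```

The blocking in `pthread_mutex_lock` is where priority inversion can arise: a high-priority task may have to wait for a lower-priority lock holder.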

9 Non-blocking Synchronization
Lock-free synchronization
  – Retries until the operation is not interfered with by other operations
  – Interference is usually detected through a shared variable that indicates a busy state or similar
Pattern: change a flag to a unique value, or remember the current state ... do the operation while preserving the active structure ... check for the same value or state and then validate the changes, otherwise retry
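
A minimal sketch of the remember / operate / validate / retry pattern, assuming C11 atomics and a single shared word. The function name `lockfree_add` is hypothetical, and compare-and-swap is only one of several ways to detect interference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Lock-free add: returns the value that this call installed. */
long lockfree_add(atomic_long *shared, long delta) {
    assert(shared != NULL);
    long seen = atomic_load(shared);          /* remember current state */
    long wanted;
    do {
        wanted = seen + delta;                /* do the operation on a private copy */
        /* Validate: install the result only if the shared word is unchanged.
           On failure, seen is refreshed with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(shared, &seen, wanted));
    return wanted;
}
```

No task ever blocks here, but the loop has no bound of its own: under constant interference a single caller may retry indefinitely.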

10 Non-blocking Synchronization
Lock-free synchronization
  – Negative
      No execution time guarantees: an operation can retry forever and can therefore starve
  – Positive
      Avoids blocking and priority inversion
      Avoids deadlock
      Fast execution on average

11 Non-blocking Synchronization
  – Uses atomic synchronization primitives
  – Uses shared memory
Wait-free synchronization
  – Every operation always finishes in a finite number of its own steps
  – Negative
      Complex algorithms
      Memory consuming
[Figure labels: Test&Set, Compare&Swap, Copying, Helping, Announcing, Split operation, ???]
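
The two atomic primitives named on this slide can be sketched with C11 atomics. The wrapper names `test_and_set` and `compare_and_swap` are hypothetical; the calls inside them are the standard `<stdatomic.h>` operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Test&Set: atomically sets the flag and returns its previous state. */
bool test_and_set(atomic_flag *flag) {
    assert(flag != NULL);
    return atomic_flag_test_and_set(flag);
}

/* Compare&Swap: stores desired only if *addr still equals expected.
   Returns true when the swap took place. */
bool compare_and_swap(atomic_int *addr, int expected, int desired) {
    assert(addr != NULL);
    return atomic_compare_exchange_strong(addr, &expected, desired);
}
```

Each call completes in a bounded number of the caller's own steps. Whether an algorithm built from these primitives is lock-free or wait-free depends on how it uses them, for example whether it loops on a failed swap.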

12 Non-blocking Synchronization
Wait-free synchronization
  – Positive
      Execution time guarantees
      Fast execution
      Avoids blocking and priority inversion
      Avoids deadlock
      Avoids starvation
      Same implementation on both single- and multiprocessor systems

13 Schedule
Introduction
  – Real-Time Systems
  – Synchronization
Shared Data Objects: Snapshots
  – Evaluation
The Effect of Using Timing Information
  – Snapshot
  – Shared Register
Software engineering part
Conclusions & Future Work

14 Shared Data Objects
Correctness criterion for concurrent operations: linearizability
  – Every concurrent execution can be transformed into an equivalent serial sequence of atomic operations that preserves the partial order
[Figure: Read and Write operations on a time axis t, with points t_i, t_j, t_k]

15 Snapshot
  – A consistent instantaneous state of a set of shared variables that are logically related
  – One reader (scanner)
      Reads the whole set of variables in one atomic step
  – Many writers (updaters)
      Each write updates only one variable

16 Snapshot: Correctness
Atomicity / linearizability criteria
[Figure: timelines of Write and Read operations on component c_i, marking the value returned by the scanner; cases labelled YES and NO]

17 Snapshot: Correctness
Atomicity / linearizability criteria
[Figure: timelines of Write and Read operations on components c_i and c_j, marking the value returned by the scanner; cases labelled NO]

18 Schedule
Introduction
  – Real-Time Systems
  – Synchronization
Shared Data Objects: Snapshots
  – Evaluation
The Effect of Using Timing Information
  – Snapshot
  – Register
Software engineering part
Conclusions & Future Work

19 What are we evaluating
Wait-free snapshot algorithm by Ermedahl et al.
  – 3 register copies for each component
  – Uses the Test&Set atomic primitive for synchronization
[Figure labels: Used by writer, Used by reader]

20 Analysis
Real-time system: measured schedulability
Created "realistic" scenarios on a theoretical 68020 uni-processor system
  – Real RTOS parameters
  – Manual WCET analysis at cycle level
  – 1 scanner (5 components), 24 updaters (10 real-time tasks, 15 interrupts)
  – Fixed-priority response time analysis
  – Schedulable without any synchronization
  – Adding lock-free, wait-free or semaphore synchronization

21 Analysis: Schedulability (%)

22 Experiments
Simulation
  – RT-simulator written in Erlang by Ermedahl and Sjödin
      Fixed-priority preemptive scheduler
      Semaphores
      Messages
  – Subset of the scenarios used in the analysis

23 Experiments: Schedulability (%)

24 Experiments
Multi-node: simulation of CAN bus, 1 MHz
  – 10 nodes connected using messages
  – Local snapshots on each node
  – 1 super-snapshot task on 1 node
  – Subset of the scenarios used for the single-node analysis

25 Experiments: R_snap for multi-node

26 Schedule
Introduction
  – Real-Time Systems
  – Synchronization
Shared Data Objects: Snapshots
  – Evaluation
The Effect of Using Timing Information
  – Snapshot
  – Register
Software engineering part
Conclusions & Future Work

27 Timing Information
Previously used by Chen and Burns in 1999.
  – Assumes a system with periodic fixed-priority scheduling
  – Notation from standard real-time response time analysis
  – Uses information about
      Periods, T
      Worst-case computation times, C
      Worst-case response times, R
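
For readers unfamiliar with the notation, the standard fixed-priority response time recurrence R_i = C_i + sum over higher-priority tasks j of ceil(R_i / T_j) * C_j can be sketched as below. This is the textbook analysis without blocking or jitter terms, not the specific analysis used in the talk; the `task` struct, the function name and the assumption that deadlines equal periods are choices made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Tasks are sorted by decreasing priority: index 0 is the highest. */
typedef struct {
    long T;   /* period */
    long C;   /* worst-case computation time */
} task;

/* Iterates the recurrence to its fixed point.
   Returns the worst-case response time R of task i,
   or -1 if R would exceed the period (deadline assumed equal to T). */
long response_time(const task *tasks, size_t i) {
    long r = tasks[i].C;
    for (;;) {
        long next = tasks[i].C;
        for (size_t j = 0; j < i; j++) {
            assert(tasks[j].T > 0);
            long releases = (r + tasks[j].T - 1) / tasks[j].T;  /* ceil(r / T_j) */
            next += releases * tasks[j].C;
        }
        if (next > tasks[i].T) return -1;   /* not schedulable */
        if (next == r) return r;            /* fixed point reached */
        r = next;
    }
}
```

The iteration terminates because r never decreases and is bounded by the period of task i.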

28 Schedule
Introduction
  – Real-Time Systems
  – Synchronization
Shared Data Objects: Snapshots
  – Evaluation
The Effect of Using Timing Information
  – Snapshot
  – Register
Software engineering part
Conclusions & Future Work

29 Snapshot
Back to basics: unbounded memory protocol
  – The reader increases the global snapshot index and scans backwards.
[Figure: one value array per component c_1 ... c_i ... over time t, indexed by the snapshot index; legend: ? = previous values / nil, w = writer position]
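
One way to read this slide is sketched below, under loud assumptions: each component keeps an array of slots, an updater writes at the current snapshot index, and the scanner advances the index and then searches each component backwards for the newest non-nil value. The names, the `NIL` marker and the fixed `SLOTS` size (standing in for unbounded memory) are all hypothetical, and the sketch has not been checked against the protocol as actually presented.

```c
#include <assert.h>
#include <stdatomic.h>

#define COMPONENTS 3
#define SLOTS 64      /* finite stand-in for the unbounded memory */
#define NIL (-1)      /* marks a slot that no writer has filled */

static atomic_int slots[COMPONENTS][SLOTS];
static atomic_int snapshot_index;

/* Fill every slot with NIL, then give each component an initial value. */
void snapshot_init(int initial) {
    for (int c = 0; c < COMPONENTS; c++) {
        for (int s = 0; s < SLOTS; s++)
            atomic_store(&slots[c][s], NIL);
        atomic_store(&slots[c][0], initial);
    }
    atomic_store(&snapshot_index, 0);
}

/* Updater: write the new value at the current snapshot index. */
void snapshot_update(int component, int value) {
    assert(value != NIL);
    int idx = atomic_load(&snapshot_index);
    atomic_store(&slots[component][idx], value);
}

/* Scanner: advance the index, then scan each component backwards
   from the previous index to the newest non-NIL value. */
void snapshot_scan(int out[COMPONENTS]) {
    int prev = atomic_fetch_add(&snapshot_index, 1);
    assert(prev + 1 < SLOTS);
    for (int c = 0; c < COMPONENTS; c++) {
        for (int s = prev; s >= 0; s--) {
            int v = atomic_load(&slots[c][s]);
            if (v != NIL) { out[c] = v; break; }
        }
    }
}
```

Neither operation waits for the other: an update is a single store, and a scan visits at most `prev + 1` slots per component.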

30 Snapshot
Bounded memory: cyclical buffers
  – The needed buffer length depends on how fast the updaters are compared to the scanner
  – Each component can have a different buffer length

31 Timing Information
Bounding
  – Needed buffer length for component k
  – Can be refined even further
where T_s is the period of the snapshot task and T_w is the period of the writer tasks

32 Experiments
Using a Sun Enterprise 10000 multiprocessor computer
1 scanner task and 10 updater tasks, one on each CPU
Comparing two wait-free snapshot algorithms
  – Using timing information
  – Using Test-and-Set synchronization

33 Experiments
Scenarios with different ratios between scanner/updater:
  – Measuring response time for scan versus update operations
Ratio          500/50  200/50  100/50  50/50  50/100  50/200  50/500
Buffer length  3       3       3       4      6       10      22

34 Experiments
Scan operation - Average Response Time

35 Experiments
Update operation - Average Response Time

36 Schedule
Introduction
  – Real-Time Systems
  – Synchronization
Shared Data Objects: Snapshots
  – Evaluation
The Effect of Using Timing Information
  – Snapshot
  – Shared Register
Software engineering part
Conclusions & Future Work

37 Shared Register
Target domain: shared memory (even without cache coherency)
Wait-free atomic shared buffer by Vitanyi et al.
  – A matrix of 1-reader 1-writer registers
  – Each register contains a value/tag pair encoded as one value
[Figure: matrix of registers R_11, R_12, ..., R_21, R_22, ... labelled Readers and Writers; R_ij is written by processor i and read by processor j; each register holds a tag and a value]
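
Encoding the value/tag pair as one value means a single word-sized read or write transfers both fields together. A minimal sketch, assuming a 64-bit word split evenly into a 32-bit tag and a 32-bit value; the split and the `reg_*` names are hypothetical and not taken from the talk.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t reg_word;   /* one register = one machine word */

static_assert(sizeof(reg_word) == 8, "register word must be 64 bits");

/* Upper 32 bits hold the tag, lower 32 bits hold the value. */
reg_word reg_pack(uint32_t tag, uint32_t value) {
    return ((reg_word)tag << 32) | (reg_word)value;
}

uint32_t reg_tag(reg_word w)   { return (uint32_t)(w >> 32); }
uint32_t reg_value(reg_word w) { return (uint32_t)(w & 0xFFFFFFFFu); }
```

The fewer bits the tag needs, the more of the word is left for the value, which is why bounding the tag matters.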

38 Shared Register
Algorithm:
  – A reader scans its column for the highest tag and returns the corresponding value
  – A writer scans its column and writes the next tag, together with the new value, to its row
Unbounded maximum size for the tag field in the value/tag pair
  – Assume 8 writer tasks with a 10 ms period
      The maximum tag after one hour is 2880000, which needs 22 bits!
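
The figure on this slide follows from simple arithmetic: 8 writers each writing once per 10 ms produce 8 * 360000 = 2880000 writes in an hour, and 2^21 < 2880000 < 2^22. The sketch below reproduces it, assuming every write advances the tag by exactly one; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Largest tag after interval_ms if every write increments the tag by one. */
uint64_t max_tag_after(uint64_t writers, uint64_t period_ms, uint64_t interval_ms) {
    assert(period_ms > 0);
    return writers * (interval_ms / period_ms);
}

/* Number of bits needed to represent every value in 0..max_tag. */
unsigned bits_needed(uint64_t max_tag) {
    unsigned bits = 0;
    while (max_tag > 0) {
        bits++;
        max_tag >>= 1;
    }
    return bits;
}
```

With these definitions, `bits_needed(max_tag_after(8, 10, 3600000))` evaluates to 22, matching the slide.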

39 Timing Information
Analyzing the maximum possible difference between tags observable by a task at two consecutive invocations of the algorithm
  – In any possible execution:
      T_max is the longest period
      R_max is the longest response time
      T_wr is the period of the writer tasks
Recycling tags:
  – Newer tags can restart from zero when a certain tag value is reached
  – To be able to decide whether a tag is newer, we need to have:
[Figure: tag values v_1 ... v_4 placed on a range from 0 to N, with v_3 and v_4 shown again after wrapping around]
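
The condition on N is not preserved in this transcript. As an illustration only, the sketch below uses the common half-window rule from serial-number arithmetic: if two tags that are compared never differ by more than D and the tag range satisfies N > 2D, the newer tag is the one that is ahead by less than half the range. This rule and the name `tag_is_newer` are assumptions of the sketch, not the bound derived in the talk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Tags live in 0..n-1 and wrap to zero after n-1.
   Returns true if tag a was issued after tag b, assuming the two tags
   are never more than (n - 1) / 2 steps apart. */
bool tag_is_newer(uint32_t a, uint32_t b, uint32_t n) {
    assert(n > 0 && a < n && b < n);
    /* Distance from b forward to a, modulo n (64-bit to avoid overflow). */
    uint64_t ahead = ((uint64_t)a + n - b) % n;
    return ahead != 0 && ahead < ((uint64_t)n + 1) / 2;
}
```

For example, with n = 100 the tag 2 counts as newer than 98, because it is only 4 steps ahead once the wrap-around is taken into account.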

40 Examples
Example task scenario on 8 processors:
Task  Period   Task  Period
Wr1   1000     Rd1   500
Wr2   900      Rd2   450
Wr3   800      Rd3   400
Wr4   700      Rd4   350
Wr5   600      Rd5   300
Wr6   500      Rd6   250
Wr7   400      Rd7   200
Wr8   300      Rd8   150
The unbounded algorithm would have reached tag 68400 in one hour, needing >16 bits

41 Schedule
Introduction
  – Real-Time Systems
  – Synchronization
Shared Data Objects: Snapshots
  – Evaluation
The Effect of Using Timing Information
  – Snapshot
  – Register
Software engineering part
Conclusions & Future Work

42 Background
Multithreaded programming needs communication.
Communication uses shared data structures like stacks, queues, lists and so on.
This needs synchronization!
Locks (mutual exclusion) have several drawbacks, especially for real-time systems.
Non-blocking solutions are often complex to implement and have non-standard interfaces.

43 NOBLE: A Non-Blocking Inter-Process Communication Library
Designed with the following properties:
  – Functionality: Stacks, Queues, Lists, Snapshot, Register… with clear specifications
  – Programmer friendly: #include, NBL
  – Easy to adapt existing solutions: provides locks as well as non-blocking synchronization

44 NOBLE: A Non-Blocking Inter-Process Communication Library
Designed with the following properties (cont.):
  – Efficient: object-oriented design, "virtual functions and inheritance with base classes" in C
  – Portable: modular design, platform-dependent code separated
  – Adaptable for different programming languages: C, C++, standard dynamic linked library

45 Examples
#include
First create a global variable handling the shared data object, for example a stack:
    NBLStack *stack;
    stack = NBLStackCreateLF(10000);
When some thread wants to do some operation:
    NBLStackPush(stack, item);
or
    item = NBLStackPop(stack);

46 Examples
When the data structure is not in use anymore:
    NBLStackFree(stack);
To change the synchronization mechanism, only one line of code has to be changed!
    stack = NBLStackCreateLF(10000);
replaced with
    stack = NBLStackCreateLB();

47 Experiment
A set of 50000 random operations performed multithreaded on each data structure, with either low or high contention.
Comparing the different synchronization mechanisms and implementations available.
Varying the number of threads from 1 to 30.
Performed on multiprocessors:
  – Sun Enterprise 10000 with 64 CPUs, Solaris
  – Compaq PC with 2 CPUs, Win32

48 Experiments: Linked List (high contention)

49 Status
Multiprocessor support
  – Sun Solaris (Sparc)
  – Win32 (Intel x86)
  – SGI (Mips): evaluation stage
  – Linux (Intel x86): evaluation stage
Extensive manual
Web site up and running, http://www.cs.chalmers.se/~noble

50 Schedule
Introduction
  – Real-Time Systems
  – Synchronization
Shared Data Objects: Snapshots
  – Evaluation
The Effect of Using Timing Information
  – Snapshot
  – Register
Software engineering part
Conclusions & Future Work

51 Conclusions
Contributions:
  – Evaluations of snapshot
      Non-blocking performs better than lock-based in all cases.
      Lock-free performs best on uni-processor systems.
  – The effect of using timing information
      Snapshot and shared register algorithms can be simplified, and their performance increased significantly.
      Efficient recycling of time-stamps is possible.

52 Conclusions
Contributions (cont.):
  – A library of non-blocking protocols
      Easy to use, efficient and portable
      Non-blocking protocols always perform better than lock-based, especially on multi-processor systems.
Concluding judgment:
  – Non-blocking protocols are highly applicable to real-time systems.
      Lock-free protocols seem very promising and will be applicable to real-time systems with applied analysis.

53 Future work
NOBLE
  – Adapt to a commercial RTOS (Enea OSE).
  – Extend to embedded systems
      Simpler uni- and multi-processor systems, including 8-bit processors with, without, or with different support for atomic synchronization primitives.
Timing information
  – Create lock-free translations to fulfill real-time systems properties
  – General time-stamp recycling scheme
  – More non-blocking protocols

