
1 A Novel Test Coverage Metric for Concurrently-Accessed Software Components
Serdar Tasiran, Tayfun Elmas, Guven Bolukbasi, M. Erkan Keremoglu
Koç University, Istanbul, Turkey
PLDI 2005, June 12-15, Chicago, U.S.

Hi all. I'm Tayfun Elmas from Koç University. In this talk I'll present a technique for detecting concurrency errors, in which we watch for refinement violations at runtime. This is joint work with my advisor Serdar Tasiran and with Shaz Qadeer from Microsoft Research.

2 Our Focus
Widely-used software systems are built on concurrently-accessed software components:
file systems, databases, internet services, standard Java and C# class libraries.
These components use intricate synchronization mechanisms to improve performance, which makes them prone to concurrency errors.
Concurrency errors cause data loss/corruption and are difficult to detect and reproduce through testing.

Many widely-used software applications are built on concurrent data structures. Examples are file systems, databases, internet services, and some standard Java and C# class libraries. These systems frequently use intricate synchronization mechanisms to get better performance in a concurrent environment, and this makes them prone to concurrency errors. Concurrency errors can have serious consequences, such as data loss or corruption. Unfortunately, these errors are typically hard to detect and reproduce through pure testing-based techniques.

3 The Location Pairs Metric
Goal of the metric: to help answer the question "If I am worried about concurrency errors only, what unexamined scenario should I try to trigger?"
Coverage metrics are a link between validation tools:
they communicate partial results and testing goals between tools, and direct tools toward unexplored, distinct new executions.
The "location pairs" (LP) metric is directed at concurrency errors ONLY.
Focus: "high-level" data races, atomicity violations, refinement violations.
All variables may be lock-protected, yet operations may still not be implemented atomically.

4 Outline
Runtime Refinement Checking
Examples of Refinement/Atomicity Violations
The "Location Pairs" Metric
Discussion, Ongoing Work

5 Refinement as Correctness Criterion
Client threads invoke operations concurrently: e.g. Thread 1 calls LookUp(3) while Thread 2 calls Insert(3), Thread 3 calls Insert(4), and Thread 4 calls Delete(3).
Data structure operations should appear to client threads to be executed atomically, in a linear order.
[Figure: the component implementation interleaves the threads' actions, e.g. Call Insert(3), A[0].elt=3, Unlock A[0], Call LookUp(3), Call Insert(4), A[1].elt=4, Unlock A[1], read A[0], Return "true", Return "success", Call Delete(3), A[0].elt=null.]

6 Runtime Refinement Checking
For each execution of Impl there must exist an "equivalent", atomic execution of the data structure Spec.
Spec: an "atomized" version of Impl, obtained from Impl itself, in which client methods run one at a time.
Refinement as correctness criterion: more thorough than assertions, more observability than pure testing.
Runtime verification: check refinement using execution traces. This can handle industrial-scale programs and is intermediate between testing and exhaustive verification.

Linearizability and atomicity are more restrictive; the flexibility in the spec gives us a more powerful method for proving the correctness of some tricky implementations. In our approach to verifying concurrent data structures we use refinement as the correctness criterion. The benefits of this choice are that refinement is a more thorough condition than method-local assertions and that it provides more observability than pure testing. Correctness conditions like linearizability and atomicity require that for each execution of Impl in a concurrent environment there exists an equivalent atomic execution of the same Impl. Refinement instead uses a separate specification: for each execution of Impl, it requires the existence of an equivalent atomic execution of that Spec. The specification we use is more permissive than the Impl; for example, the spec allows methods to terminate exceptionally to model failure due to resource contention in a concurrent environment, whereas the Impl would not allow some of those method executions to fail. We check refinement at runtime using execution traces of the implementation, which lets us handle industrial-scale programs. Our approach can be regarded as intermediate between testing and exhaustive verification with respect to the coverage of the explored execution space.

7 The VYRD Tool
A multi-threaded test drives Impl; every action is written to a shared sequential log (e.g. Call LookUp(3), Call Insert(3), A[0].elt=3, Unlock A[0], Call Delete(3), Call Insert(4), read A[0], A[1].elt=4, Return "true", Unlock A[1], Return "success", A[0].elt=null, ...).
A replay mechanism reads the logged actions and re-executes them on a separate instance, Impl-replay; at commit points it runs the corresponding methods atomically on Spec.
An abstraction function (for checking view-refinement), implemented as an extra method of the data structure, computes the current value of the view variable from the current data structure state.
The refinement checker compares traceImpl and traceSpec: at certain points for each method it takes "state snapshots" and checks the consistency of the data structure contents. (A minimal sketch of this pipeline follows below.)

Vyrd analyzes execution traces of the Impl generated by test programs, using two separate threads. The testing thread runs a test harness that generates test programs; a test program makes concurrent method calls to the Impl, and during its run the corresponding execution trace is recorded in a shared sequential log. The verification thread reads the execution trace from the log. Since the verification thread follows the testing thread from behind, it cannot access the instantaneous state of the Impl; the replay module therefore re-executes actions from the log on a separate instance of the Impl, called Impl-replay, and executes atomic methods on the Spec at commit points. During replay, the mechanism also computes the view variables when it reaches a commit point and annotates the commit actions in the traces with them. The refinement checker module then checks the resulting traces of the Impl and the Spec for I/O- and view-refinement. The two threads can run in an online or offline setting: in online checking both run simultaneously, while in offline checking the verification thread runs after the whole test program finishes.
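To make the two-thread pipeline concrete, here is a minimal Java sketch of the log-and-replay skeleton, under our own naming: LogEntry, logAction, replayOnImplReplay, runAtomicallyOnSpec, and viewOf are hypothetical stand-ins, not VYRD's actual API (the real tool logs concrete memory and lock actions).

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class VyrdSketch {
    // One entry per atomic action observed while the test runs on Impl.
    record LogEntry(String action, boolean isCommitPoint) {}

    private final BlockingQueue<LogEntry> log = new LinkedBlockingQueue<>();

    // Testing thread: append each observed action to the shared sequential log.
    void logAction(String action, boolean isCommitPoint) throws InterruptedException {
        log.put(new LogEntry(action, isCommitPoint));
    }

    // Verification thread: it lags behind the testing thread, so it replays
    // the log on a second copy of the implementation (Impl-replay) and, at
    // each commit point, runs the method atomically on Spec and compares views.
    void verificationLoop() throws InterruptedException {
        while (true) {  // runs online alongside the test, or offline afterwards
            LogEntry e = log.take();
            replayOnImplReplay(e.action());
            if (e.isCommitPoint()) {
                runAtomicallyOnSpec(e.action());
                if (!viewOf("implReplay").equals(viewOf("spec")))
                    throw new AssertionError("view-refinement violated at " + e.action());
            }
        }
    }

    // Stubs standing in for the real replay machinery.
    void replayOnImplReplay(String action) { /* apply the logged action to Impl-replay */ }
    void runAtomicallyOnSpec(String action) { /* run the atomic Spec method */ }
    Object viewOf(String instance) { return ""; /* stub abstraction function */ }
}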

8 The Vyrd Experience
Scalable method: caught previously undetected, serious but subtle bugs in industrial-scale designs:
Boxwood (30K LOC)
Scan file system (Windows NT)
Java libraries with known bugs
Reasonable runtime overhead.
Key novelty: checking refinement improves observability; it catches bugs that are triggered but not observed by testing. A significant improvement.

9 The Boxwood Project Experience
[Figure: Boxwood architecture. A BLinkTree module (root, internal, and leaf pointer nodes at levels n down to 0, plus data nodes) sits on a Cache module (dirty and clean cache entries), which reads and writes through a Chunk Manager module (global disk allocator, replicated disk managers).]

We ran Vyrd on the Boxwood project from Microsoft. We verified all modules of this system and caught an interesting, tricky error that had gone undetected. The goal of the Boxwood project is to build a distributed abstract storage infrastructure for applications with high data storage and retrieval requirements. Boxwood has a concurrent B-link tree implementation in the BLinkTree module. The BLinkTree module uses a Cache module to store and retrieve the data at tree nodes quickly, and the Cache module makes its data persistent through a Chunk Manager module, which implements a distributed storage system that abstracts the storage layer from the upper layers.

10 Refinement vs. Testing: Improved Observability
Using Vyrd, we caught previously undetected bugs in the Boxwood Cache and the Scan file system (Windows NT).
Bug manifestation: the cache entry is correct and marked "clean", but permanent storage holds corrupted data.
Hard to catch through testing: as long as "Read"s hit in the cache, the return value is correct.
Caught through testing only if:
the cache fills and the clean entry in the cache is evicted (it is not written again to permanent storage, since the entry is marked "clean"),
the entry is then read from permanent storage after eviction,
with no "Write"s to the entry in the meantime.

11 Outline
Runtime Refinement Checking
Examples of Refinement/Atomicity Violations
The "Location Pairs" Metric
Discussion, Ongoing Work

12 Idea behind the LP metric
Observation: the bug occurs whenever Method1 executes up to line X, a context switch occurs, and Method2 starts execution from line Y, provided there is a data dependency between
Method1's code "right before" line X (BlockX), and
Method2's code "right after" line Y (BlockY).
The description of the bug in the log follows the pattern above.
The only requirement on program state, other threads, etc. is to make the interleaving above possible; this may require many other threads, complicated program state, ...
A "one-bit" data abstraction captures the error scenario:
depdt: is there a data dependency between BlockX and BlockY? (A sketch of this check follows below.)
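As an illustration of the one-bit abstraction, here is a Java sketch of the depdt check, under the assumption that each block is summarized by the sets of shared locations it reads and writes. The names (BlockSummary, depdt) are ours, not the paper's.

import java.util.HashSet;
import java.util.Set;

final class DataDependence {
    // A block's effect on shared state, summarized as read and write sets.
    record BlockSummary(Set<String> reads, Set<String> writes) {}

    // BlockX and BlockY are data-dependent if one writes a location the
    // other reads or writes (a RAW, WAR, or WAW conflict on shared state).
    static boolean depdt(BlockSummary x, BlockSummary y) {
        return intersects(x.writes(), y.reads())
            || intersects(x.reads(), y.writes())
            || intersects(x.writes(), y.writes());
    }

    private static boolean intersects(Set<String> a, Set<String> b) {
        Set<String> tmp = new HashSet<>(a);
        tmp.retainAll(b);
        return !tmp.isEmpty();
    }
}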

13 java.lang.StringBuffer (excerpts)

public synchronized StringBuffer append(StringBuffer sb) {
    int len = sb.length();
    int newCount = count + len;
    if (newCount > value.length)
        ensureCapacity(newCount);
    sb.getChars(0, len, value, count);
    count = newCount;
    return this;
}

public synchronized void setLength(int newLength) {
    ...
    if (count < newLength) {
        ...
    } else {
        count = newLength;
        ...
    }
}
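Both methods are individually synchronized, yet append(sb) is not atomic with respect to sb: it locks this, not sb, so sb.length() and sb.getChars() are two separate atomic steps, and a concurrent setLength on sb can shrink the buffer in between. Below is a hypothetical test harness (ours, not from the slides) that can trigger this; the interleaving is timing-dependent and not guaranteed to occur on any given run.

public class StringBufferRace {
    public static void main(String[] args) throws InterruptedException {
        for (int trial = 0; trial < 100_000; trial++) {
            StringBuffer target = new StringBuffer();
            StringBuffer sb = new StringBuffer("abcdefghij");
            Thread appender = new Thread(() -> {
                try {
                    target.append(sb);   // reads sb.length(), then sb.getChars()
                } catch (IndexOutOfBoundsException e) {
                    // setLength(0) ran between the two reads of sb
                    System.out.println("atomicity violation triggered: " + e);
                    System.exit(1);
                }
            });
            Thread shrinker = new Thread(() -> sb.setLength(0));
            appender.start();
            shrinker.start();
            appender.join();
            shrinker.join();
        }
        System.out.println("no violation observed (interleaving not hit)");
    }
}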

14 Concurrency Bug in Cache
[Figure: interleaving of Write(handle, AB) and Flush(). Initially the cache holds X.Y for handle while the chunk manager holds T.Z. Write(handle, AB) starts, Flush() starts; in the intermediate states the cache and the chunk manager hold different byte arrays for the same handle. Write(handle, AB) ends with A.B in the cache, but Flush() ends with A.Y in the chunk manager: corrupted data in persistent storage.]

This slide demonstrates the error Vyrd caught in the Cache module. Think of the dirty cache as containing the data X.Y while the persistent storage for the same handle contains T.Z. At the end, corrupted data has been written to persistent storage. This breaks the invariant that, for a clean cache entry, the byte arrays in the cache and the chunk manager are the same.

15 Boxwood Cache (excerpts, C#)

private static void CpToCache(byte[] buf, CacheEntry te, int lsn, Handle h) {
    for (int i = 0; i < buf.Length; i++) {
        te.data[i] = buf[i];
    }
    te.lsn = lsn;
}

public static void Flush(int lsn) {
    ...
    lock (clean) {
        BoxMain.alloc.Write(h, te.data, te.data.Length, 0, WRITE_TYPE_RAW);
    }
    ...
}

16 Outline
Runtime Refinement Checking
Examples of Refinement/Atomicity Violations
The "Location Pairs" Metric
Discussion, Ongoing Work

17 Locations and actions in StringBuffer.append

public synchronized StringBuffer append(StringBuffer sb) {
1    int len = sb.length();
2    int newCount = count + len;
3    if (newCount > value.length)
4        ensureCapacity(newCount);
5    sb.getChars(0, len, value, count);
6    count = newCount;
7    return this;
8 }

One thread's execution of this method, as a sequence of actions (CFG locations L sit between consecutive actions):
acquire(this)
invoke sb.length()
int len = sb.length()
int newCount = count + len
if (newCount > value.length) expandCapacity(newCount)
invoke sb.getChars()
sb.getChars(0, len, value, count)
count = newCount
return this

18 Coverage FSM State
State: (LX, pend1, LY, pend2, depdt), for a method pair (Method1, Method2):
LX: location in the CFG of Method1
LY: location in the CFG of Method2
pend1: is an "interesting" action in Method1 expected next?
pend2: is an "interesting" action in Method2 expected next?
depdt: do the actions following LX and LY have a data dependency?
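A compact way to read the tuple is as a record; this is an illustrative sketch in which CFG locations are modeled as plain ints, and the field and type names are ours, not the paper's.

record LpState(
    int lx,         // location in Method1's control-flow graph
    boolean pend1,  // is an "interesting" Method1 action expected next?
    int ly,         // location in Method2's control-flow graph
    boolean pend2,  // is an "interesting" Method2 action expected next?
    boolean depdt   // are the blocks after LX and LY data-dependent?
) {}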

19 Coverage FSM (example)
State (L1, !pend1, L3, !pend2, depdt), with transitions t1: L1 → L2 and t2: L3 → L4.

20 Coverage Goal
The "pend1" bit gets set when the depdt bit is TRUE and Method2 takes an action.
Intuition: Method1's dependent action must follow.
Must cover all (reachable) transitions of the form
p = (LXp, TRUE, LY, pend2p, depdtp) → q = (LXq, pend1q, LY, pend2q, depdtq)
p = (LX, pend1p, LYp, TRUE, depdtp) → q = (LX, pend1q, LYq, pend2q, depdtq)
Separate coverage FSM for each method pair, FSM(Method1, Method2); cover the required transitions in each FSM. (A sketch of the bookkeeping follows below.)
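A sketch of what this coverage bookkeeping could look like, reusing the LpState record sketched above; MethodPair and Transition are hypothetical names of ours. Per method pair, it records which required transitions (those leaving a state with a pending bit set) the tests have exercised.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class LpCoverage {
    record MethodPair(String m1, String m2) {}
    record Transition(LpState from, LpState to) {}

    private final Map<MethodPair, Set<Transition>> covered = new HashMap<>();

    // Called by the monitor whenever the coverage FSM for (m1, m2) moves.
    // Only transitions out of a state with a pending bit set count as
    // coverage targets (the "required" transitions on this slide).
    void observe(MethodPair pair, LpState from, LpState to) {
        if (from.pend1() || from.pend2()) {
            covered.computeIfAbsent(pair, k -> new HashSet<>())
                   .add(new Transition(from, to));
        }
    }

    int coveredCount(MethodPair pair) {
        return covered.getOrDefault(pair, Set.of()).size();
    }
}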

21 Important Details
Action: an atomically executed code fragment, as defined by the language.
Method calls:
call action: the method call, plus all lock acquisitions;
return action: the total net effect of the method, executed atomically, plus lock releases.
Separate coverage FSM for each method pair, FSM(Method1, Method2); cover the required transitions in each FSM.
But what if there is interesting concurrency inside a called method? It is considered separately, when that method is itself one of the method pair: if Method1 calls Method3, this is considered when FSM(Method3, Method2) is covered.

22 Outline
Runtime Refinement Checking
Examples of Refinement/Atomicity Violations
The "Location Pairs" Metric
Discussion, Ongoing Work

23 Empirical evidence
Does this metric correspond well with high-level concurrency errors?
Errors captured by the metric: 100% metric coverage ⇒ bug guaranteed to be triggered.
Triggered vs. detected: view-refinement checking may be needed to improve observability.
Preliminary study: bugs in Java class libraries, the bug found in the Boxwood cache, the bug found in the Scan file system, and the bug categories reported in E. Farchi, Y. Nir, S. Ur, "Concurrent Bug Patterns and How to Test Them", 17th Intl. Parallel and Distributed Processing Symposium (IPDPS '03).
How many are covered by random testing? How does coverage change over time? We don't know yet; we are implementing a coverage measurement tool.

24 Reducing the Coverage FSM
Method-local actions: a basic block consisting of method-local actions is considered a single atomic action.
Pure blocks [Flanagan & Qadeer, ISSTA '04]: a "pure" execution of a pure block does not affect the global state.
Example: acquire a lock, read a global variable, decide the resource is not free, release the lock.
Considered a "no-op"; modeled by a "bypass transition" in the coverage FSM, which does not need to be covered. (A sketch of a pure block follows below.)
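For illustration, a pure execution path in Java might look like this hypothetical sketch (ours, not from the paper): on the "not free" path the block acquires the lock, reads global state, and releases without writing anything, so the whole execution is a no-op.

final class Resource {
    private final Object lock = new Object();
    private boolean free = true;

    // The "not free" path acquires the lock, reads the global variable,
    // and releases without any write: a pure execution, modeled as a
    // bypass transition that need not be covered.
    boolean tryAcquire() {
        synchronized (lock) {
            if (!free) return false;  // pure: global state untouched
            free = false;             // impure: resource actually taken
            return true;
        }
    }
}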

25 Discussion
The metric is NOT for deciding when to stop testing/verification.
Intended use: testing or runtime verification is applied to the program, and the list of non-covered coverage targets is provided to the programmer.
Intuition: given an unexercised scenario, the programmer must have a simple reason to believe that the scenario is not possible, or that the scenario is safe.
Given an uncovered coverage target, the programmer either
provides hints to the coverage tool to rule the target out, or
assumes that the coverage target is a possibility and writes a test to trigger it, or
makes sure that no concurrency error would result if the coverage target were to be exercised.

26 Future Work: Approximating the Reachable LP Set
Number of locations per method in Boxwood: ~10, after factoring out atomic and pure blocks.
LP reachability is undecidable; the metric is only intended as an aid to the programmer: What have I tested? What should I try to test? Make sure an LP does not lead to an error if it looks like it can be exercised.
Future work: better approximate the reachable LP set. Do conservative reachability analysis of the coverage FSM using predicate abstraction; the programmer can add predicates for better FSM reduction.


28 Multiset Implementation: LookUp
The multiset data structure has highly concurrent implementations of Insert, Delete, InsertPair, and LookUp.

LookUp(x)
  for i = 1 to n
    acquire(A[i])
    if (A[i].content == x && A[i].valid)
      release(A[i])
      return true
    release(A[i])
  return false

[Figure: an array A with content and valid fields, e.g. contents 9, 8, 6, 5, 3, null, 2 with their valid bits.]
(A Java rendering of LookUp follows below.)

Our motivating data structure is a multiset. Notice that several copies of the same integer can be in the multiset, like 3 and 8 in this example. The implementation represents the multiset by an array A with two fields: the content field stores the integer element, and the Boolean valid field tells us whether the element is included in the multiset. One representation of the multiset above could be as in the figure. On the right is the implementation of the LookUp method, which queries whether a given integer x is in the multiset: it traverses the array A linearly, locking elements one by one and checking whether the content is x and the valid field is set.
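For concreteness, here is a runnable Java rendering of the LookUp pseudocode, assuming a fixed-size slot array with one lock per slot (the slide's acquire/release); this is a sketch under our own naming, not Boxwood's or the paper's code.

import java.util.concurrent.locks.ReentrantLock;

class Multiset {
    static final class Slot {
        final ReentrantLock lock = new ReentrantLock();
        Integer content;   // null when the slot is empty
        boolean valid;     // the element counts only when valid is true
    }

    private final Slot[] a;

    Multiset(int n) {
        a = new Slot[n];
        for (int i = 0; i < n; i++) a[i] = new Slot();
    }

    // Traverse the array, locking one slot at a time: fine-grained locking,
    // so other operations can proceed concurrently on other slots.
    boolean lookUp(int x) {
        for (Slot s : a) {
            s.lock.lock();
            try {
                if (s.content != null && s.content == x && s.valid)
                    return true;
            } finally {
                s.lock.unlock();
            }
        }
        return false;
    }
}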

29 Multiset Testing
[Trace: Call Insert(3), Call LookUp(3), Return "success", Call Insert(4), Return "true", Call Delete(3), interleaved with the actions A[0].elt=3, Unlock A[0], A[1].elt=4, Unlock A[1], read A[0], A[0].elt=null.]
We don't know which happened first: Insert(3) or Delete(3)? Should 3 be in the multiset at the end? We must accept both possibilities as correct.
Common practice: run a long multi-threaded test, then perform sanity checks on the final state.

30 Multiset I/O Refinement
Impl trace: Call Insert(3), Call LookUp(3), Return "success", Call Insert(4), Return "true", Call Delete(3), with the actions A[0].elt=3, Unlock A[0], A[1].elt=4, Unlock A[1], read A[0], A[0].elt=null.
Spec trace: Call Insert(3), M = M ∪ {3}, Return "success"; Call LookUp(3), check 3 ∈ M, Return "true"; Call Insert(4), M = M ∪ {4}; Call Delete(3), M = M \ {3}. The multiset state evolves M = Ø, {3}, {3, 4}, {4}.
Witness ordering: each Impl method's commit action (Unlock A[0] for Insert(3), read A[0] for LookUp(3), Unlock A[1] for Insert(4), A[0].elt=null for Delete(3)) determines where the corresponding atomic Spec method runs. (A sketch of the Spec-side check follows below.)

In this slide we explain how we check I/O refinement; again, we use the Insert operation instead of InsertPair to simplify the picture.
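A sketch of the Spec side of this check: replay each method atomically, in witness (commit) order, on a plain java.util.Set, and check that the Impl's observed return value matches the Spec's. The names are ours, and a plain set is used for brevity (the real spec is a multiset and keeps duplicates).

import java.util.HashSet;
import java.util.Set;

final class MultisetSpecChecker {
    private final Set<Integer> m = new HashSet<>();  // the Spec state M

    void commitInsert(int x, String observedReturn) {
        m.add(x);                                    // M = M ∪ {x}
        check("success", observedReturn, "Insert(" + x + ")");
    }

    void commitLookUp(int x, String observedReturn) {
        String specReturn = m.contains(x) ? "true" : "false";  // check x ∈ M
        check(specReturn, observedReturn, "LookUp(" + x + ")");
    }

    void commitDelete(int x, String observedReturn) {
        m.remove(x);                                 // M = M \ {x}
        check("success", observedReturn, "Delete(" + x + ")");
    }

    private static void check(String spec, String impl, String op) {
        if (!spec.equals(impl))
            throw new AssertionError(op + ": Impl returned " + impl
                                     + " but Spec returns " + spec);
    }
}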

31 View-refinement: View Variables
State correspondence: hypothetical "view" variables must match at commit points.
A "view" variable's value is the abstract data structure state; it is updated atomically, once, by each method.
For A[1..n]: extract content wherever valid = true, e.g. viewImpl = {3, 3, 5, 5, 8, 8, 9}.
[Figure: array A with content and valid fields; entries whose valid bit is false are excluded from the view.]
(A sketch of the abstraction function follows below.)

The state correspondence is obtained by matching view variables from the Impl and the Spec at commit points. A view variable is a hypothetical variable that extracts an abstract state of the data structure; this abstract state is updated or observed atomically by each method. The view variables that carry the state information of the Impl and the Spec are denoted viewImpl and viewSpec, respectively. The view for the multiset data structure is the set of integers stored in the multiset. The view variable for the multiset Impl extracts the elements in the content fields whose corresponding valid fields are true; thus the view variable for the multiset in the figure does not contain the invalid entries. The view variable for the Spec gets its elements from the set M.
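A sketch of the abstraction function, written as an extra method that could be added to the Multiset class sketched earlier (as on this slide, it is an extra method of the data structure, run at commit points):

// Abstraction function: computes the view variable, i.e. the multiset of
// contents whose valid bit is set; invalid slots are excluded.
java.util.List<Integer> view() {
    java.util.List<Integer> v = new java.util.ArrayList<>();
    for (Slot s : a) {
        if (s.valid && s.content != null)
            v.add(s.content);        // duplicates kept: the view is a multiset
    }
    v.sort(Integer::compare);        // canonical order for comparing views
    return v;
}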

32 View-refinement: Checking
Spec trace: Call Insert(3), M = M ∪ {3}, Return "success"; Call LookUp(3), check 3 ∈ M, Return "true"; Call Insert(4), M = M ∪ {4}; Call Delete(3), M = M \ {3}; M evolves Ø, {3}, {3, 4}, {4}.
Witness ordering: at each commit point the views are computed and compared:
after Commit Insert(3) (A[0].elt=3): viewImpl = {3}, viewSpec = {3}
after Commit LookUp(3): viewImpl = {3}, viewSpec = {3}
after Commit Insert(4) (A[1].elt=4): viewImpl = {3,4}, viewSpec = {3,4}
after Commit Delete(3) (A[0].elt=null): viewImpl = {4}, viewSpec = {4}

The views are computed by running the abstraction function at each commit point; the checking procedure is similar to checking I/O refinement.

