Quality Concurrent SW - Fight the Complexity of SW

Presentation transcript:

Quality Concurrent SW - Fight the Complexity of SW Moonzoo Kim CS Dept. KAIST

Testing is a Complex and Challenging Task!
Bill Gates at Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), Seattle, Washington, November 8, 2002:
• “… When you look at a big commercial software company like Microsoft, there's actually as much testing that goes in as development. We have as many testers as we have developers. Testers basically test all the time, and developers basically are involved in the testing process about half the time…”
• “… We've probably changed the industry we're in. We're not in the software industry; we're in the testing industry, and writing the software is the thing that keeps us busy doing all that testing.”
• “…The test cases are unbelievably expensive; in fact, there's more lines of code in the test harness than there is in the program itself. Often that's a ratio of about three to one.”
Bill Gates goes as far as saying that Microsoft is “not a software company but a testing company”; that is how much time, staff, and money go into testing. Even so, the testing is not complete.

Ex. Testing a Triangle Decision Program
Input: Read three integer values from the command line. The three values represent the lengths of the sides of a triangle.
Output: Tell whether the triangle is
• Scalene: no two sides are equal
• Isosceles: exactly two sides are equal
• Equilateral: all sides are equal
Create a set of test cases for this program: (3,4,5), (2,2,1), (1,1,1)?

Precondition (Input Validity) Check
• Condition 1: a > 0, b > 0, c > 0
• Condition 2: a < b + c
  Ex. (4, 2, 1) is an invalid triangle
  Permutations of the above condition: a < b + c, b < a + c, c < a + b
• What if b + c exceeds 2^32 (i.e., overflow)? long vs. int vs. short vs. char
• Developers often fail to consider implicit preconditions, which is the cause of many hard-to-find bugs
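The two conditions above can be sketched in C as follows. This is a minimal illustration, not the program used in the lecture: the enum names and the function `classify_triangle` are hypothetical, and rewriting `a < b + c` as `a - b < c` is one assumed way to sidestep the overflow the slide asks about.

```c
#include <limits.h>

/* Hypothetical result codes; the slides do not name them. */
typedef enum {
    TRI_INVALID,
    TRI_SCALENE,
    TRI_ISOSCELES,
    TRI_EQUILATERAL
} triangle_kind;

/* Classify a triangle after checking both preconditions. */
triangle_kind classify_triangle(int a, int b, int c)
{
    /* Condition 1: every side must be positive. */
    if (a <= 0 || b <= 0 || c <= 0)
        return TRI_INVALID;

    /* Condition 2 and its permutations, written as a - b < c rather than
       a < b + c. With all sides positive the subtraction cannot overflow,
       whereas b + c could exceed INT_MAX. */
    if (a - b >= c || b - c >= a || c - a >= b)
        return TRI_INVALID;

    if (a == b && b == c)
        return TRI_EQUILATERAL;
    if (a == b || b == c || a == c)
        return TRI_ISOSCELES;
    return TRI_SCALENE;
}
```

With this sketch, `classify_triangle(4, 2, 1)` is rejected as invalid, and inputs near `INT_MAX` are still classified without evaluating a sum that could overflow.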

# of test cases required? 4, 10, 50, or 100?
“Software Testing: A Craftsman’s Approach”, 2nd ed., by P. C. Jorgensen (no check for positive inputs)
# of feasible unique execution paths? 11 paths
Guess what test cases are needed:
a,b,c = 1,1,1: result=0
a,b,c = 2,1,1: result=2
a,b,c = 2,1,2: result=1
a,b,c = 1,2,1: result=2
a,b,c = 3,2,1: result=2
a,b,c = 2,1,3: result=2
a,b,c = 4,3,2: result=3
a,b,c = 2,3,1: result=2
a,b,c = 3,2,2: result=1
a,b,c = 2,2,1: result=1
a,b,c = 1,1,2: result=2

Software vs. Magic Circle
Magic circle:
• Written by a human magician line by line
• Requires knowledge of magic spells
• Summoned monsters are far more powerful than the magic spell itself
• The summoned demon is often uncontrollable, and disaster occurs
Software:
• Written by software developers line by line
• Requires programming expertise
• SW executes complicated tasks that are far more complex than the code itself
• The software often behaves in unpredicted ways, and crashes occur
Source: Naver Webtoon “그 판타지 세계에서 사는법” (“How to Live in That Fantasy World”) by 촌장

Safety Problems due to Poor Quality of SW

Research Trends toward Quality Systems
• Academic research on developing embedded systems has reached a stable stage
• Research focus has moved from the mere functionality of systems to their quality: energy-efficient design, easy maintenance, dynamic configuration, etc.
• Software reliability is one of the most highly pursued qualities
  USENIX Security 2015 Best paper “Under-Constrained Symbolic Execution: Correctness Checking for Real Code” @ Stanford University
  ASPLOS 2011 Best paper “S2E: a platform for in-vivo multi-path analysis for software systems” @ EPFL
  OSDI 2008 Best paper “KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs” @ Stanford
  NSDI 2007 Best paper “Life, Death, and the Critical Transition: Finding Liveness Bugs in Systems Code” @ U.C. San Diego

Formal Analysis of Software as a Foundational and Promising CS Research Area
2007 ACM Turing Awardees: Prof. Edmund Clarke, Dr. Joseph Sifakis, Prof. E. Allen Emerson
• For the contribution of moving model checking from pure research to industrial reality
2013 ACM Turing Awardee: Dr. Leslie Lamport
• For fundamental contributions to the theory and practice of distributed and concurrent systems
• Happens-before relation, sequential consistency, Bakery algorithm, TLA+ (Temporal Logic of Actions), and LaTeX

Motivation for Concurrency Analysis
• Multithreaded programming is becoming more and more popular
  37% of all open-source C# applications and 87% of large applications in active code repositories use multithreading [Okur & Dig FSE 2012]

                                            Small (1K-10K)  Medium (10K-100K)  Large (>100K)
  # of all projects in the study            6020            1553               205
  # of projects with multithreading         1761            916                178
  # of projects with parallel library uses  412             203                40

• However, multithreaded programming remains error-prone
  Writing a correct multithreaded program is difficult
  Testing a written multithreaded program is difficult
• “Most of my subjects (interviewees) have found that the hardest bugs to track down are in concurrent code … And almost every one seems to think that ubiquitous multi-core CPUs are going to force some serious changes in the way software is written”
  P. Siebel, Coders at Work (2009), interviews with 15 top programmers of our times: Jamie Zawinski, Brad Fitzpatrick, Douglas Crockford, Brendan Eich, Joshua Bloch, Joe Armstrong, Simon Peyton Jones, Peter Norvig, Guy Steele, Dan Ingalls, L Peter Deutsch, Ken Thompson, Fran Allen, Bernie Cosell, Donald Knuth
2019-01-03

Concurrency
Concurrent programs have very high complexity due to non-deterministic scheduling
Ex.
    int x=0, y=0, z=0;
    void p() { x=y+1; y=z+1; z=x+1; }
    void q() { y=z+1; z=x+1; x=y+1; }
Final values of x,y,z listed on the slide for p() running concurrently with q(): 2,1,2; 2,1,3; 2,2,3; 2,3,3; 2,4,3; 3,2,3; 4,3,2; 4,3,5
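To make the effect of scheduling concrete, the sketch below replays every interleaving of p() and q() in a single thread and collects the distinct final values. It assumes each assignment statement executes atomically, which the slide does not state; the type `state` and the functions `explore_interleavings` and `has_outcome` are hypothetical names introduced for this illustration.

```c
#include <stdbool.h>

#define STEPS_PER_THREAD 3
#define TOTAL_STEPS (2 * STEPS_PER_THREAD)
#define MAX_OUTCOMES 20   /* at most one outcome per interleaving */

typedef struct { int x, y, z; } state;

/* Statement k (0..2) of p(): x=y+1; y=z+1; z=x+1; */
static void step_p(state *s, int k)
{
    switch (k) {
    case 0:  s->x = s->y + 1; break;
    case 1:  s->y = s->z + 1; break;
    default: s->z = s->x + 1; break;
    }
}

/* Statement k (0..2) of q(): y=z+1; z=x+1; x=y+1; */
static void step_q(state *s, int k)
{
    switch (k) {
    case 0:  s->y = s->z + 1; break;
    case 1:  s->z = s->x + 1; break;
    default: s->x = s->y + 1; break;
    }
}

/* True if (x,y,z) is among the first n recorded outcomes. */
bool has_outcome(const state *out, int n, int x, int y, int z)
{
    for (int i = 0; i < n; i++)
        if (out[i].x == x && out[i].y == y && out[i].z == z)
            return true;
    return false;
}

/* Replays every interleaving, stores the distinct final states in out,
   and returns how many there are. *schedules receives the number of
   interleavings explored. */
int explore_interleavings(state out[MAX_OUTCOMES], int *schedules)
{
    int distinct = 0;
    *schedules = 0;

    /* Bit i of mask says which thread runs at step i (1 = p, 0 = q).
       A mask is a valid schedule when exactly three bits are set. */
    for (unsigned mask = 0; mask < (1u << TOTAL_STEPS); mask++) {
        int p_steps = 0;
        for (int i = 0; i < TOTAL_STEPS; i++)
            p_steps += (int)((mask >> i) & 1u);
        if (p_steps != STEPS_PER_THREAD)
            continue;

        state s = { 0, 0, 0 };
        int pc_p = 0, pc_q = 0;   /* next statement of each thread */
        for (int i = 0; i < TOTAL_STEPS; i++) {
            if ((mask >> i) & 1u)
                step_p(&s, pc_p++);
            else
                step_q(&s, pc_q++);
        }
        (*schedules)++;

        if (!has_outcome(out, distinct, s.x, s.y, s.z))
            out[distinct++] = s;
    }
    return distinct;
}
```

Under the atomicity assumption there are 20 interleavings of the six statements, and the outcomes listed on the slide appear among the recorded final states. With finer-grained atomicity (separate loads and stores) the number of interleavings would be larger.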

Concurrent Programming is Error-prone
• Correctness of concurrent programs is hard to achieve
  Interactions between threads must be handled carefully
  Non-deterministic thread scheduling produces a large # of thread executions
  Testing techniques for sequential programs do not work properly
• Ex. Peterson mutual exclusion (from Dr. Moritz Hammer’s Visualisierung)
  2 processes: 30 states
  3 processes: 853 states
  4 processes: 55043 states

An Example of Mutual Exclusion Protocol
    char cnt=0, x=0, y=0, z=0;
    void process() {
        char me = _pid + 1;   /* me is 1 or 2 */
    again:
        x = me;
        if (y == 0 || y == me) ; else goto again;
        z = me;
        if (x == me) ;
        y = me;
        if (z == me) ;
        /* enter critical section */
        cnt++;
        assert(cnt == 1);
        cnt--;
        goto again;
    }
Counter example (violation detected!):
    Process 0: x = 1; if (y == 0 || y == 1); z = 1; if (x == 1); y = 1; if (z == 1); cnt++
    Process 1: x = 2; if (y == 0 || y == 2); z = 2; if (x == 2); y = 2; if (z == 2)
[Figure labels: software locks, critical section, mutual exclusion algorithm]

Research on Concurrent Program Analysis
• We have developed static/dynamic techniques for finding bugs in large concurrent programs effectively
• Coverage-guided testing is a promising technique for effective/efficient concurrent program quality assurance
[Figure: techniques placed along precision and scalability axes, ranging over model checking, testing, and code pattern-based analysis. Labels: CUVE [ISSTA’12], MOKERT [MBT’08], COBET [JSS’12]; industry-size concurrent programs; coverage-guided thread scheduling generation; model-based testing; sync. pattern-based bug detection; race bug classification; bug pattern + static analysis]

Techniques to Test Concurrent Programs
• Pattern-based bug detection
  Find predefined patterns of suspicious synchronizations [Eraser, Atomizer, CalFuzzer]
  Limitation: focused on specific bugs
• Systematic testing
  Explore all possible thread scheduling cases [CHESS, Fusion]
  Limitation: limited scalability
• Random testing
  Generate random thread schedules [ConTest]
  Limitation: may not investigate new interleavings
• Direct thread scheduling for high test coverage

Challenge: Thread Schedule Generation
[Figure: in the testing environment, a thread scheduler interleaves the statements (1 to 5) of each thread into interleaved executions 1, 2, and 3]
• The basic thread scheduler repeats similar schedules in a fixed environment
• Testing with the basic thread scheduler is not effective at generating the diverse schedules that are possible in field environments

Limitation of Random Testing Technique
[Figure: in the testing environment, a noise injector perturbs the thread scheduler, which interleaves the statements (1 to 5) of Thread-1, Thread-2, and Thread-3 into differing interleaved executions]
Limitation
• The probability of covering subtle behavior may be very low
• No guarantee that the technique keeps discovering new behaviors
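The second limitation can be illustrated with a small simulation. The sketch below draws random schedules for two threads of three statements each and counts how many distinct schedules actually appear. It is not ConTest or any real noise injector: the generator, the bitmask encoding, and the names `random_schedule` and `distinct_schedules` are assumptions made for this example.

```c
#include <stdint.h>

#define THREADS 2
#define STEPS 3                      /* statements per thread */
#define TOTAL_STEPS (THREADS * STEPS)

/* Small xorshift generator so a run is reproducible for a given seed. */
static uint32_t next_random(uint32_t *state)
{
    uint32_t v = *state;
    v ^= v << 13;
    v ^= v >> 17;
    v ^= v << 5;
    *state = v;
    return v;
}

/* Draws one schedule: at every step, pick at random a thread that still
   has statements left. Bit i of the result is the thread (0 or 1) that
   runs at step i. */
unsigned random_schedule(uint32_t *rng)
{
    int left[THREADS] = { STEPS, STEPS };
    unsigned mask = 0;

    for (int i = 0; i < TOTAL_STEPS; i++) {
        int t = (int)(next_random(rng) & 1u);
        if (left[t] == 0)
            t = 1 - t;               /* chosen thread is done; run the other */
        left[t]--;
        mask |= (unsigned)t << i;
    }
    return mask;
}

/* Runs `trials` random schedules and returns how many distinct schedules
   were seen. Repeated schedules add no new behavior. */
int distinct_schedules(int trials, uint32_t seed)
{
    unsigned char seen[1u << TOTAL_STEPS] = { 0 };
    uint32_t rng = seed ? seed : 1u; /* xorshift state must be nonzero */
    int distinct = 0;

    for (int i = 0; i < trials; i++) {
        unsigned m = random_schedule(&rng);
        if (!seen[m]) {
            seen[m] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

Comparing `distinct_schedules(n, seed)` against n for growing n shows how quickly new trials start repeating schedules already seen. This sampling is also not uniform over interleavings, so some schedules are inherently less likely than others.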

Systematic Testing Technique
[Figure: in the testing environment, the systematic testing technique interleaves the statements (1 to 5) of Thread-1, Thread-2, and Thread-3 into interleaved execution-1, interleaved execution-2, ..., interleaved execution-10, ...]
# of interleavings: (5+5+5)! / (5! 5! 5!)
Limitation
• Not efficient for checking diverse behaviors
• A test might be skewed toward checking local behaviors
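The count on the slide is the multinomial coefficient for three threads of five statements each. A minimal sketch for evaluating it follows; the function names `binomial` and `count_interleavings` are hypothetical, and the code assumes every thread executes the same number of atomic statements and that the result fits in 64 bits.

```c
#include <stdint.h>

/* Binomial coefficient C(n, k). Each intermediate value is itself a
   binomial coefficient, so the division is always exact.
   No overflow check: intended for small inputs only. */
uint64_t binomial(unsigned n, unsigned k)
{
    uint64_t r = 1;

    if (k > n)
        return 0;
    if (k > n - k)
        k = n - k;
    for (unsigned i = 1; i <= k; i++)
        r = r * (n - k + i) / i;
    return r;
}

/* Number of interleavings of `threads` threads that each execute `steps`
   atomic statements: (threads*steps)! / (steps!)^threads.
   Computed as a product of binomials: choose the positions of each
   thread's statements among the slots used so far. */
uint64_t count_interleavings(unsigned threads, unsigned steps)
{
    uint64_t total = 1;
    unsigned placed = 0;

    for (unsigned t = 0; t < threads; t++) {
        placed += steps;
        total *= binomial(placed, steps);
    }
    return total;
}
```

For the slide's setting, `count_interleavings(3, 5)` evaluates to 756756, which is why exploring every schedule does not scale even for very small programs.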

Pattern-directed Testing Technique
[Figure: in the testing environment, a pattern-based bug predictor feeds a bug-driven scheduler, which interleaves the statements (1 to 5) of Thread-1, Thread-2, and Thread-3 into interleaved executions 1, 2, and 3]
Limitation
• Not effective at discovering diverse behaviors
• The technique may miss concurrency faults that were not predicted
• No guarantee of getting a chance to induce the predicted faults

Coverage-based Testing Technique
[Figure: in the testing environment, a coverage measure applies a coverage metric to the interleaved executions of Thread-1, Thread-2, and Thread-3 (statements 1 to 5 each) and passes coverage info. to a coverage-based scheduler]
Advantage
• High correlation between concurrent coverage and fault-finding
• Engineers can measure testing progress explicitly
Note that coverage-based testing is a standard testing process in sequential program testing
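The idea of steering schedule selection by coverage can be sketched with a toy simulation. The metric below (one entry per context switch, identified by the thread switched from, its statement, and the statement the other thread runs next) is a hypothetical stand-in, not the metric used by the tools named in these slides. The schedules are replayed symbolically for two threads of three statements each, and `coverage_guided_selection` is an assumed name.

```c
#include <stdbool.h>
#include <string.h>

#define STEPS 3                  /* statements per thread */
#define TOTAL_STEPS (2 * STEPS)

/* Hypothetical metric: covered[t][a][b] is set once some schedule switches
   from statement a of thread t straight to statement b of the other thread. */
static bool covered[2][STEPS][STEPS];

/* Records the context switches of one schedule (bit i of mask is the thread
   running at step i) and returns how many entries were newly covered. */
static int add_coverage(unsigned mask)
{
    int pc[2] = { 0, 0 };
    int prev_thread = -1, prev_stmt = 0;
    int gained = 0;

    for (int i = 0; i < TOTAL_STEPS; i++) {
        int t = (int)((mask >> i) & 1u);
        int stmt = pc[t]++;
        if (prev_thread >= 0 && prev_thread != t &&
            !covered[prev_thread][prev_stmt][stmt]) {
            covered[prev_thread][prev_stmt][stmt] = true;
            gained++;
        }
        prev_thread = t;
        prev_stmt = stmt;
    }
    return gained;
}

/* Walks over all valid schedules and keeps only those that increase
   coverage. Returns the number of schedules kept; *covered_count receives
   the number of covered entries at the end. */
int coverage_guided_selection(int *covered_count)
{
    int kept = 0;

    memset(covered, 0, sizeof covered);
    *covered_count = 0;

    for (unsigned mask = 0; mask < (1u << TOTAL_STEPS); mask++) {
        int ones = 0;
        for (int i = 0; i < TOTAL_STEPS; i++)
            ones += (int)((mask >> i) & 1u);
        if (ones != STEPS)
            continue;            /* each thread must run exactly STEPS times */

        int gained = add_coverage(mask);
        if (gained > 0) {
            kept++;
            *covered_count += gained;
        }
    }
    return kept;
}
```

The covered count gives the explicit progress measure mentioned above: testing can stop once it reaches the 18 entries this toy metric admits. A real coverage-based scheduler would steer live thread execution rather than enumerate schedules up front.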