New Ideas Track: Testing MapReduce-Style Programs Christoph Csallner, Leonidas Fegaras, Chengkai Li csallner@uta.edu, fegaras@uta.edu, cli@uta.edu Computer.

Slides:



Advertisements
Similar presentations
Leonardo de Moura Microsoft Research. Z3 is a new solver developed at Microsoft Research. Development/Research driven by internal customers. Free for.
Advertisements

Modular and Verified Automatic Program Repair Francesco Logozzo, Thomas Ball RiSE - Microsoft Research Redmond.
Finding bugs: Analysis Techniques & Tools Symbolic Execution & Constraint Solving CS161 Computer Security Cho, Chia Yuan.
MapReduce.
Satisfiability Modulo Theories (An introduction)
Automatic test case generation for programs that are coded against interfaces and annotations or use native code Mainul Islam Supervisor: Dr. Christoph.
Michael Alves, Patrick Dugan, Robert Daniels, Carlos Vicuna
1/20 Generalized Symbolic Execution for Model Checking and Testing Charngki PSWLAB Generalized Symbolic Execution for Model Checking and Testing.
GS 540 week 6. HMM basics Given a sequence, and state parameters: – Each possible path through the states has a certain probability of emitting the sequence.
Pexxxx White Box Test Generation for
Dynamically Discovering Likely Program Invariants to Support Program Evolution Michael Ernst, Jake Cockrell, William Griswold, David Notkin Presented by.
Automatically Extracting and Verifying Design Patterns in Java Code James Norris Ruchika Agrawal Computer Science Department Stanford University {jcn,
Presenter: Joshan V John Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan & Tien N. Nguyen Iowa State University, USA Instructor: Christoph Csallner 1 Joshan.
Software Testing and QA Theory and Practice (Chapter 4: Control Flow Testing) © Naik & Tripathy 1 Software Testing and Quality Assurance Theory and Practice.
Guide to Assignment 3 Programming Tasks 1 CSE 2312 Computer Organization and Assembly Language Programming Vassilis Athitsos University of Texas at Arlington.
School of Computing and Mathematics, University of Huddersfield Computing Science: WEEK 17 Announcement: next few weeks… 9 nd Feb: Comparative Programming.
CUTE: A Concolic Unit Testing Engine for C Technical Report Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
DySy: Dynamic Symbolic Execution for Invariant Inference.
A Simple Method for Extracting Models from Protocol Code David Lie, Andy Chou, Dawson Engler and David Dill Computer Systems Laboratory Stanford University.
Introduction to Problem Solving. Steps in Programming A Very Simplified Picture –Problem Definition & Analysis – High Level Strategy for a solution –Arriving.
jFuzz – Java based Whitebox Fuzzing
CSE4334/5334 DATA MINING CSE4334/5334 Data Mining, Fall 2014 Department of Computer Science and Engineering, University of Texas at Arlington Chengkai.
A Test Case + Mock Class Generator for Coding Against Interfaces Mainul Islam, Christoph Csallner Software Engineering Research Center (SERC) Computer.
4.3 Functions. Functions Last class we talked about the idea and organization of a function. Today we talk about how to program them.
Guide to Assignment 3 and 4 Programming Tasks 1 CSE 2312 Computer Organization and Assembly Language Programming Vassilis Athitsos University of Texas.
Grigore Rosu Founder, President and CEO Professor of Computer Science, University of Illinois
CUTE: A Concolic Unit Testing Engine for C Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
The Potential of Sampling for Dynamic Analysis Joseph L. GreathouseTodd Austin Advanced Computer Architecture Laboratory University of Michigan PLAS, San.
HW7: Due Dec 5th 23:59 1.Describe test cases to reach full path coverage of the triangle program by completing the path condition table below. Also, draw.
Debugging and Testing Hussein Suleman March 2007 UCT Department of Computer Science Computer Science 1015F.
Testing and Debugging UCT Department of Computer Science Computer Science 1015F Hussein Suleman March 2009.
Lecture 9 Symbol Table and Attributed Grammars
A Review of Software Testing - P. David Coward
By Laksman Veeravagu and Luis Barrera
Lazy Preemption to Enable Path-Based Analysis of Interrupt-Driven Code
Control Flow Testing Handouts
Introduction to the C Language
Handouts Software Testing and Quality Assurance Theory and Practice Chapter 4 Control Flow Testing
CSE 3302 Programming Languages
LESSON 20.
Outline of the Chapter Basic Idea Outline of Control Flow Testing
A Test Case + Mock Class Generator for Coding Against Interfaces
runtime verification Brief Overview Grigore Rosu
TriggerScope Towards detecting logic bombs in android applications
+ + Testing Web-Based Applications With the
White-Box Testing Using Pex
New Ideas Track: Testing MapReduce-Style Programs
High Coverage Detection of Input-Related Security Faults
It is great that we automate our tests, but why are they so bad?
Introduction to the C Language
Chapter 4 LOOPS © Bobby Hoggard, Department of Computer Science, East Carolina University / These slides may not be used or duplicated without permission.
Tuan Anh Nguyen, Christoph Csallner
Dynamic Symbolic Data Structure Repair
Program Slicing Baishakhi Ray University of Virginia
SAT-Based Area Recovery in Technology Mapping
Data Structures & Algorithms Union-Find Example
RECURSION Annie Calpe
Guest Lecture by David Johnston
Data Structure and Algorithms
POWERPOINT PRESENTATION
CS639: Data Management for Data Science
CSC-682 Advanced Computer Security
Diagnostic Evaluation
CUTE: A Concolic Unit Testing Engine for C
Basic Concepts of Algorithm
Priority Queues CSE 373 Data Structures.
CSE 1020:Software Development
Pointer analysis John Rollinson & Kaiyuan Li
SOFTWARE ENGINEERING INSTITUTE
Presentation transcript:

New Ideas Track: Testing MapReduce-Style Programs Christoph Csallner, Leonidas Fegaras, Chengkai Li csallner@uta.edu, fegaras@uta.edu, cli@uta.edu Computer Science and Engineering Department, University of Texas at Arlington (UTA), USA MapReduce = New programming model New programming model  New bugs /* Report avg of top-3 salaries, if avg>100k */ public void reduce(String dept, Iterator<Integer> salaries) { int sum = 0; int i = 0; while (salaries.hasNext() && i<3) { sum += salaries.next(); i += 1; } emit( (i>0 && sum/i > 100000)? sum/i : -1); I just implemented this lovely piece of MapReduce code. Life is great  Q: For the same input: Will your program always return the same result? A: No! Why did nobody warn me?  Can we build tools for finding such bugs automatically? Example: MapReduce programs have to satisfy certain Correctness Conditions: Map: (key;value)* Group By Key Reduce: avg. of first 3 Input Output For example: Reduce must not rely on a particular order of input values: For each input list of values L, for each permutation P: reduce(key, L) == reduce(key, P(L)) ...;dept; salary ...; A; 250,000 ...; X; 220,000 ...; F; 220,000 ...; Z; 210,000 ...; O; 150,000 ...; T; 150,000 ...; A; 150,000 ...; A; 140,000 ...; E; 100,000 ...; S; 100,000 ...; A; 95,000 ...; C; 95,000 ... A; 250,000 X; 220,000 F; 220,000 Z; 210,000 A; 250,000 A; 150,000 A; 140,000 A; 95,000 ... 180,000 Observations: 165,000 MapReduce programs are small, have little control-flow, few execution paths Dynamic symbolic execution systematically explores execution paths, precisely Approach: O; 150,000 T; 150,000 A; 150,000 A; 140,000 Derive path condition, return value, store in indexed execution tree Index leaf nodes by length of input list SiblingPath: Triggered by input list of same length Encode potential violation of correctness condition in constraint system Encode as variables: L = (L[0], L[1], ..); P(L) = (L[p[0]], L[p[1]], ..) Assert PathCond; Assert SubstituteIndices(SiblingPath, Permutation); Assert Result ≠ SubstituteIndices(SiblingResult, Permutation); Infer input values L and permutation P with off-the-shelf constraint solver Convert solution to test case, run, confirm violation B; ... ... ... ? C; ... ... E; 100,000 S; 100,000 A; 95,000 C; 95,000 ... Values may arrive in any order, depending on network, etc. ...