A Comprehensive Study on Real World Concurrency Bugs in Node.js

Presentation transcript:

A Comprehensive Study on Real World Concurrency Bugs in Node.js
The 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE 2017)
Jie Wang, Wensheng Dou, Yu Gao, Chushu Gao, Feng Qin, Jun Wei
Institute of Software, Chinese Academy of Sciences; The Ohio State University

Node.js
A server-side, event-driven framework for JavaScript.
Many apps are built with Node.js.

Event-driven model in Node.js
Example (Example.js), run by the event loop thread:
    print('Hello');
    writeFile('foo', data, cb1);
    readFile('foo', cb2);
[Figure: task queue, worker pool, and event queue, with callbacks cb1 and cb2]

Concurrency bugs in Node.js
Two errors may happen; here, the file operations run in the wrong order in the worker pool.
Example.js (event loop thread):
    print('Hello');
    writeFile('foo', data, cb1);
    readFile('foo', cb2);
Expected: writeFile → readFile
Buggy: readFile → writeFile (the read fails because the file does not exist)

Concurrency bugs in Node.js
Two errors may happen; here, the callbacks run in the wrong order on the event queue.
Example.js (event loop thread):
    print('Hello');
    writeFile('foo', data, cb1);
    readFile('foo', cb2);
Callbacks:
    cb1: content = data
    cb2: content.toString()
Expected: cb1 → cb2
Buggy: cb2 → cb1 (content is null when cb2 runs)
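The callback-order error can be reproduced with a small sketch. The helpers fakeWriteFile and fakeReadFile are hypothetical stand-ins for the slide's writeFile and readFile; they complete after a chosen delay so that each callback order can be forced deterministically rather than left to the worker pool.

```javascript
// Hypothetical stand-ins for the slide's writeFile/readFile: each one
// completes after `delayMs`, so the callback order can be forced.
function fakeWriteFile(data, delayMs, cb) {
  setTimeout(() => cb(data), delayMs);
}
function fakeReadFile(delayMs, cb) {
  setTimeout(cb, delayMs);
}

// Runs the slide's two callbacks and reports what cb2 observed.
function runScenario(writeDelayMs, readDelayMs) {
  return new Promise((resolve) => {
    let content = null;
    // cb1: content = data
    fakeWriteFile(Buffer.from('hello'), writeDelayMs, (data) => {
      content = data;
    });
    // cb2: content.toString()
    fakeReadFile(readDelayMs, () => {
      try {
        resolve(content.toString());
      } catch (err) {
        resolve('error: ' + err.constructor.name);
      }
    });
  });
}

const expectedOrder = runScenario(10, 50); // cb1 runs before cb2
const buggyOrder = runScenario(50, 10);    // cb2 runs before cb1

Promise.all([expectedOrder, buggyOrder]).then(([ok, bad]) => {
  console.log('cb1 -> cb2:', ok);
  console.log('cb2 -> cb1:', bad);
});
```

In the second scenario cb2 dereferences content while it is still null, which is the failure the slide labels "content is null".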

Motivation
Concurrency bugs in Node.js are different from those in multi-threaded programs:
    Single thread vs. multiple threads
    Event-driven vs. shared memory
There is a lack of understanding about concurrency bugs in Node.js.

Methodology
Search bugs by keywords in GitHub: concurrent, race, ….
Manually confirm bugs: Are they real concurrency bugs? Do they have fixes?
Analyze and categorize: What are their bug patterns? How are they fixed?
1,583 bug reports → 57 bugs from 52 projects → bug characteristics and findings

Distribution of subjects
Different functionality: 52 projects (12 server apps + 6 desktop apps + 34 libraries)
High popularity, well-maintained
[Figure: box plots of #stars, #revisions, and #issues per project]

Research questions
RQ1: What are common bug patterns and root causes?
RQ2: Do concurrency bugs have severe impacts?
RQ3: How are concurrency bugs triggered?
RQ4: How do developers fix concurrency bugs?

RQ1. Bug patterns
1. Order violation (17/57 = 30%)
The intended order between two events is violated.
[Figure: intended order vs. wrong order]

RQ1. Bug patterns
2. Atomicity violation (37/57 = 65%)
The intended atomicity among several events is violated.
"on" handler:
    var run = (delayTime) => {
      client = initClient();
      …
      if (delayTime > 0) {
        return setTimeout(() => {
          if (client.sync(callback) === true) return run(0);
        });
      }
    }
"off" handler:
    client = null
If "off" runs between "on" and the "setTimeout" callback, client.sync fails with a null pointer error.
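The interleaving on this slide can be sketched with an EventEmitter. Everything here is illustrative: makeDevice, the stub client, and the `guarded` flag are hypothetical names, and the null check is only one possible guard, not necessarily the fix applied in the studied project.

```javascript
const { EventEmitter } = require('events');

// Hypothetical device: "on" creates a client and defers a sync,
// "off" drops the client.
function makeDevice(guarded) {
  const device = new EventEmitter();
  let client = null;
  device.result = new Promise((resolve) => {
    device.on('on', () => {
      client = { sync: () => true }; // stand-in for initClient()
      setTimeout(() => {
        // Guarded variant: re-check the shared state inside the callback.
        if (guarded && client === null) return resolve('skipped');
        try {
          resolve(client.sync() ? 'synced' : 'not synced');
        } catch (err) {
          resolve('error: ' + err.constructor.name);
        }
      }, 0);
    });
  });
  device.on('off', () => { client = null; });
  return device;
}

function interleave(guarded) {
  const device = makeDevice(guarded);
  device.emit('on');  // schedules the timer callback
  device.emit('off'); // runs before the timer callback fires
  return device.result;
}

const unguarded = interleave(false);
const guardedRun = interleave(true);
Promise.all([unguarded, guardedRun]).then(([a, b]) => console.log(a, '|', b));
```

All three pieces of code run on the same thread; the violation comes purely from the "off" event being processed between the "on" handler and its deferred callback.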

RQ1 – Bug patterns Finding 1. Atomicity violation bugs dominate

RQ1 – Bug patterns
Atomicity violation bugs in different programming paradigms
Node.js: p = v, p = null, and *p all involve one thread.
Multi-thread: p = v, p = null, and *p involve multiple threads.
Implication. New approaches for detecting atomicity violations in a single thread are needed.

RQ1. Why are bugs introduced?
Finding 2. Event triggering (74%) and asynchronous operations (33%) are the two main sources of concurrency bugs.
    1. Event triggering (42/57 = 74%)
    2. Asynchronous operations (19/57 = 33%)
Implication. New bug detection approaches should cover races in the worker pool task queue.
[Figure: worker pool and event queue with callbacks cb1, cb2, cb3]

RQ1. How are racing events scheduled?
Wrong usage of API protocols
Example:
    Promise.all([
      self.User.findOrCreate({id: 42}, {val: 47}, { ts: t }),
      self.User.findOrCreate({id: 42}, {val: 49}, { ts: t })
    ]);
"I would expect findOrCreate() method to be atomic. However …"
[Figure: the find() and create() steps of findOrCreate-1 and findOrCreate-2 interleave]
Finding 6. API usage misunderstanding is the main cause of concurrency bugs. The misunderstandings include wrong assumptions about ordering, wrong assumptions about atomicity, and misuse of functionality or parameters.
Implication. Besides the general method signature and its description, library documentation should specify the happens-before relations in the event protocol and the atomicity of each API.
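The findOrCreate race can be illustrated without a real database. The sketch below uses a hypothetical in-memory table whose find() and create() are asynchronous; it is not the API of the library in the bug report. The promise-chain queue is one possible way to serialize calls, shown only as an assumption about how atomicity could be restored.

```javascript
// Hypothetical in-memory table; find() and create() are asynchronous,
// as calls to a real database would be.
function makeTable() {
  const rows = [];
  const tick = () => new Promise((resolve) => setImmediate(resolve));
  return {
    rows,
    async find(id) { await tick(); return rows.find((r) => r.id === id); },
    async create(row) { await tick(); rows.push(row); return row; },
  };
}

// Non-atomic: another event can run between find() and create().
async function findOrCreate(table, id, val) {
  const existing = await table.find(id);
  return existing || table.create({ id, val });
}

// One way to restore atomicity: queue calls on a promise chain.
function makeSerialized(table) {
  let tail = Promise.resolve();
  return (id, val) => {
    const result = tail.then(() => findOrCreate(table, id, val));
    tail = result.catch(() => {}); // keep the chain alive after failures
    return result;
  };
}

async function demo() {
  const racy = makeTable();
  await Promise.all([findOrCreate(racy, 42, 47), findOrCreate(racy, 42, 49)]);

  const safe = makeTable();
  const serialized = makeSerialized(safe);
  await Promise.all([serialized(42, 47), serialized(42, 49)]);

  return { racyRows: racy.rows.length, safeRows: safe.rows.length };
}

const outcome = demo();
outcome.then((counts) => console.log(counts));
```

In the racy run both calls finish find() before either create() has inserted a row, so two rows with id 42 are created; the serialized run creates one.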

RQ1. How are racing events scheduled?
Finding 3. About half of the bugs use high-level API protocols in an improper way.
Implication. Asynchronous API protocols can help resolve concurrency bugs in Node.js.
Schedule strategy, cases, and #bugs:
    Native API, setTimeout: 7 (11%)
    Native API, setInterval: 1 (3%)
    Native API, setImmediate: 1 (2%)
    Native API, process.nextTick: 2 (2%)
    API protocol: 27 (47%)
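As a reminder of how the native scheduling APIs in this table relate, the following sketch records the order in which a synchronous statement, process.nextTick, setTimeout, and setImmediate run. It relies only on documented Node.js behavior: nextTick callbacks run before timers and immediates, while the relative order of setTimeout(…, 0) and setImmediate scheduled from the main module is not guaranteed.

```javascript
const order = [];

const done = new Promise((resolve) => {
  let pending = 2;
  const finish = () => { if (--pending === 0) resolve(order); };

  setTimeout(() => { order.push('setTimeout'); finish(); }, 0);
  setImmediate(() => { order.push('setImmediate'); finish(); });
  process.nextTick(() => order.push('nextTick'));
  order.push('sync'); // runs first, before any scheduled callback
});

// 'sync' and 'nextTick' always come first; the last two entries may swap.
done.then((o) => console.log(o.join(' -> ')));
```

Code that silently assumes one particular order of the last two callbacks is the kind of order assumption the study classifies as a concurrency bug.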

RQ2: Bug impacts Finding 4. Almost all bugs cause severe consequences, e.g., crashes

RQ3: Triggering conditions
Finding 5. Simple input conditions can trigger most bugs (77% need simple inputs).
Implication. Developers can focus on testing applications with simple inputs first.
Preconditions, cases, and #bugs:
    External requests: 1 request: 22; 2 requests: 30 (20); >=3 requests: 3
    Config/Param: 4
    Deploy environment

RQ3: Racing resources
Finding 6. About 40% of the bugs contend for databases and files rather than variables.
Implication. New bug detection approaches should also consider races on databases and files.

RQ3: Triggering scope
Finding 7. Most bugs require no more than 4 racing events (96% need <=4 events).
Implication. Testing can be simplified to check no more than 4 events.
Triggering scopes, cases, and #bugs:
    Racing events (2, 3, 4, >4): 19, 14, 20
    Involved processes (1): 54
    Racing resources: 49

RQ4: Bug fixing
Finding 8. Most bugs are not fixed by adding synchronization.
Fix strategies (columns: Order, Atomicity, Starvation, Total):
    Add synchronization: 7, 14 (25%)
    Bypassing: 5, 9
    Tolerance: 1, 4, 5 (9%)
    Ignoring/retrying: 2, 3 (5%)
    Switching to atomic APIs: 4 (7%)
    Moving code: 2 (4%)
    Data privatization
    Changing priority: 3
    Other: 10 (18%)
    Total: 17, 37, 57
Annotations: avoid buggy event order; do not change the event order.

Bug fixing example
Tolerance: the fixed code is bug-free in both orders of the onend and onheaders events.
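A tolerance-style fix can be sketched as a handler pair that produces the same result whichever event arrives first. The response object, the event names 'headers' and 'end', and the collect helper are hypothetical and only mirror the slide's onheaders and onend labels; the code from the studied project is not shown in the transcript.

```javascript
const { EventEmitter } = require('events');

// Hypothetical response object; the event names mirror the slide's
// onheaders/onend handlers.
function collect(response) {
  let headers = null;
  let ended = false;
  let summary = null;
  // Finish only once both events have been seen, in either order.
  const maybeFinish = () => {
    if (headers !== null && ended && summary === null) {
      summary = 'status ' + headers.status;
    }
  };
  response.on('headers', (h) => { headers = h; maybeFinish(); });
  response.on('end', () => { ended = true; maybeFinish(); });
  return () => summary;
}

function run(eventOrder) {
  const response = new EventEmitter();
  const getSummary = collect(response);
  for (const name of eventOrder) {
    response.emit(name, { status: 200 }); // the 'end' handler ignores the payload
  }
  return getSummary();
}

const headersFirst = run(['headers', 'end']);
const endFirst = run(['end', 'headers']);
console.log(headersFirst, '|', endFirst);
```

The event order is left unchanged; the handlers simply tolerate both orders by recording state and completing when the second event arrives.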

RQ4: Bug fixing
Finding 8. Most bugs are not fixed by adding synchronization.
Implication. Automated bug fixing approaches should consider other fix strategies.

Summary
A comprehensive study on concurrency bugs in Node.js
57 concurrency bugs from 52 popular projects
Many interesting findings and implications:
    Atomicity violation bugs dominate
    About half of the bugs use high-level API protocols improperly
    …
The subjects and their categorization are available: http://www.tcse.cn/~wsdou/project/NodeCB

Questions? Thank you!