Understanding Real-World Concurrency Bugs in Go

Slides:



Advertisements
Similar presentations
Software Assurance Metrics and Tool Evaluation (SAMATE) Michael Kass National Institute of Standards and Technology
Advertisements

Distributed Systems Major Design Issues Presented by: Christopher Hector CS8320 – Advanced Operating Systems Spring 2007 – Section 2.6 Presentation Dr.
Understanding and Detecting Real-World Performance Bugs
Programming in Occam-pi: A Tutorial By: Zain-ul-Abdin
Effective Static Deadlock Detection
Chapter 6 Concurrency: Deadlock and Starvation Operating Systems: Internals and Design Principles, 6/E William Stallings Patricia Roy Manatee Community.
Concurrency Important and difficult (Ada slides copied from Ed Schonberg)
A Randomized Dynamic Program Analysis for Detecting Real Deadlocks Koushik Sen CS 265.
Structured Thread Models Kahn Process Networks, CSP, Go 1Dennis Kafura – CS5204 – Operating Systems.
Go Language * Go - Routines * Channels. New Concepts Do not communicate by sharing memory; instead, share memory by communicating. creating shared memory.
Peeking into Your App without Actually Seeing It: UI State Inference and Novel Android Attacks Qi Alfred Chen, Zhiyun Qian†, Z. Morley Mao University of.
The Path to Multi-core Tools Paul Petersen. Multi-coreToolsThePathTo 2 Outline Motivation Where are we now What is easy to do next What is missing.
Asynchronous Assertions Eddie Aftandilian and Sam Guyer Tufts University Martin Vechev ETH Zurich and IBM Research Eran Yahav Technion.
1 School of Computing Science Simon Fraser University, Canada PCP: A Probabilistic Coverage Protocol for Wireless Sensor Networks Mohamed Hefeeda and Hossein.
Vertically Integrated Analysis and Transformation for Embedded Software John Regehr University of Utah.
Precept 3 COS 461. Concurrency is Useful Multi Processor/Core Multiple Inputs Don’t wait on slow devices.
Microsoft Research Faculty Summit Yuanyuan(YY) Zhou Associate Professor University of Illinois, Urbana-Champaign.
An Integrated Framework for Dependable Revivable Architectures Using Multi-core Processors Weiding Shi, Hsien-Hsin S. Lee, Laura Falk, and Mrinmoy Ghosh.
1 Concurrency: Deadlock and Starvation Chapter 6.
Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics Shan Lu, Soyeon Park, Eunsoo Seo and Yuanyuan Zhou Appeared.
Replay Debugging for Distributed Systems Dennis Geels, Gautam Altekar, Ion Stoica, Scott Shenker.
Highly Available ACID Memory Vijayshankar Raman. Introduction §Why ACID memory? l non-database apps: want updates to critical data to be atomic and persistent.
Discussion Week 3 TA: Kyle Dewey. Overview Concurrency overview Synchronization primitives Semaphores Locks Conditions Project #1.
Exceptions and Mistakes CSE788 John Eisenlohr. Big Question How can we improve the quality of concurrent software?
1 Concurrency: Deadlock and Starvation Chapter 6.
CSE 486/586 CSE 486/586 Distributed Systems PA Best Practices Steve Ko Computer Sciences and Engineering University at Buffalo.
Lecture 4: Parallel Programming Models. Parallel Programming Models Parallel Programming Models: Data parallelism / Task parallelism Explicit parallelism.
Workflow Early Start Pattern and Future's Update Strategies in ProActive Environment E. Zimeo, N. Ranaldo, G. Tretola University of Sannio - Italy.
Pallavi Joshi* Mayur Naik † Koushik Sen* David Gay ‡ *UC Berkeley † Intel Labs Berkeley ‡ Google Inc.
What Change History Tells Us about Thread Synchronization RUI GU, GUOLIANG JIN, LINHAI SONG, LINJIE ZHU, SHAN LU UNIVERSITY OF WISCONSIN – MADISON, USA.
CS 152: Programming Language Paradigms May 7 Class Meeting Department of Computer Science San Jose State University Spring 2014 Instructor: Ron Mak
Use of Coverity & Valgrind in Geant4 Gabriele Cosmo.
Introduction to Software Testing. Types of Software Testing Unit Testing Strategies – Equivalence Class Testing – Boundary Value Testing – Output Testing.
1 Announcements The fixing the bug part of Lab 4’s assignment 2 is now considered extra credit. Comments for the code should be on the parts you wrote.
1 Advanced Behavioral Model Part 1: Processes and Threads Part 2: Time and Space Chapter22~23 Speaker: 陳 奕 全 Real-time and Embedded System Lab 10 Oct.
Real Time Programming Language. Intro A programming language represents the nexus of design and structure. But misuse of the programming language can.
MMAC: A Mobility- Adaptive, Collision-Free MAC Protocol for Wireless Sensor Networks Muneeb Ali, Tashfeen Suleman, and Zartash Afzal Uzmi IEEE Performance,
13-1 Chapter 13 Concurrency Topics Introduction Introduction to Subprogram-Level Concurrency Semaphores Monitors Message Passing Java Threads C# Threads.
CAPP: Change-Aware Preemption Prioritization Vilas Jagannath, Qingzhou Luo, Darko Marinov Sep 6 th 2011.
Eraser: A dynamic Data Race Detector for Multithreaded Programs Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, Thomas Anderson Presenter:
Reachability Testing of Concurrent Programs1 Reachability Testing of Concurrent Programs Richard Carver, GMU Yu Lei, UTA.
Testing Concurrent Programs Sri Teja Basava Arpit Sud CSCI 5535: Fundamentals of Programming Languages University of Colorado at Boulder Spring 2010.
PRES: Probabilistic Replay with Execution Sketching on Multiprocessors Soyeon Park and Yuanyuan Zhou (UCSD) Weiwei Xiong, Zuoning Yin, Rini Kaushik, Kyu.
Scaling For (Almost) Free
Programming in Go(lang)
Deterministic Communication with SpaceWire
Last Class: Introduction
Healing Data Races On-The-Fly
Presented by: Daniel Taylor
Rewriting the LBD DNS Load Balancer with concurrency in GO
CS240: Advanced Programming Concepts
Outline Other synchronization primitives
Outline What does the OS protect? Authentication for operating systems
Chapter 8 – Software Testing
Threads Threads.
Other Important Synchronization Primitives
Lazy Diagnosis of In-Production Concurrency Bugs
New trends in parallel computing
INTER-PROCESS COMMUNICATION
Outline What does the OS protect? Authentication for operating systems
Real-time Software Design
Setac: A Phased Deterministic Testing Framework for Scala Actors
Inter-Process Communication and Synchronization
Transactional Events Kevin Donnelly, Boston University
A Comprehensive Study on Real World Concurrency Bugs in Node.js
Threading And Parallel Programming Constructs
Go.
Lecture Topics: 11/1 Hand back midterms
Concurrency in GO CS240 23/8/2017.
Mitigating the Effects of Flaky Tests on Mutation Testing
Presentation transcript:

Understanding Real-World Concurrency Bugs in Go Tengfei Tu1, Xiaoyu Liu2, Linhai Song1, and Yiying Zhang2 1Pennsylvania State University 2Purdue University

Golang A young but widely-used programming lang. Designed for efficient and reliable concurrency Provide lightweight threads, called goroutines Support both message passing and shared memory message Lightweight threads (goroutines) Memory

Massage Passing vs. Shared Memory Thread 1 Thread 2 Thread 1 Thread 2 Memory Message Passing Shared Memory Concurrency Bug

Does Go Do Better? Message passing better than shared memory? How well does Go prevent concurrency bugs?

171 Real-World Go Concurrency Bugs The 1st Empirical Study Collect 171 Go concurrency bugs from 6 apps through manually inspecting GitHub commit log How we conduct the study? Taxonomy based on two orthogonal dimensions Root causes and fixing strategies Evaluate two built-in concurrency bug detectors Cause BlotDB × × shared memory × × message passing 171 Real-World Go Concurrency Bugs Behavior blocking non-blocking

Highlighted Results Message passing can make a lot of bugs sometimes even more than shared memory 9 observations for developers’ references 8 insights to guide future research in Go

Outline Introduction A real bug example Go concurrency bug study Taxonomy Blocking Bug Non-blocking Bug Conclusions

Outline Introduction A real bug example Go concurrency bug study Taxonomy Blocking Bug Non-blocking Bug Conclusions Introduction A real bug example Go concurrency bug study Taxonomy Blocking Bug Non-blocking Bug Conclusions

Message Passing in Go How to pass messages across goroutines? Channel: unbuffered channel vs. buffered channel Select: waiting for multiple channel operations Goroutine 1 Goroutine 2 Goroutine 1 Goroutine 2 select { case <- ch1: … case <- ch2: … } Non-deterministic ch <- m ch <- m m <- ch m <- ch unbuffered channel buffered channel select

An Example of Go Concurrency Bug Parent Goroutine func finishRequest(t sec) r object { ch := make(chan object) go func() result := fn() ch <- result }() select { case result = <- ch: return result case <- time.timeout(t): return nil } } //Kubernetes#5316 failure

An Example of Go Concurrency Bug Parent Goroutine func finishRequest(t sec) r object { ch := make(chan object) go func() result := fn() ch <- result }() select { case result = <- ch: return result case <- time.timeout(t): return nil } } //Kubernetes#5316 failure

An Example of Go Concurrency Bug Parent Goroutine Child Goroutine func finishRequest(t sec) r object { ch := make(chan object) go func() result := fn() ch <- result }() select { case result = <- ch: return result case <- time.timeout(t): return nil } } //Kubernetes#5316 go func() result := fn() ch <- result }() failure

An Example of Go Concurrency Bug Parent Goroutine Child Goroutine func finishRequest(t sec) r object { ch := make(chan object) go func() result := fn() ch <- result }() select { case result = <- ch: return result case <- time.timeout(t): return nil } } //Kubernetes#5316 blocking and goroutine leak go func() result := fn() ch <- result }() timeout signal failure

An Example of Go Concurrency Bug Parent Goroutine Child Goroutine func finishRequest(t sec) r object { ch := make(chan object) go func() result := fn() ch <- result }() select { case result = <- ch: return result case <- time.timeout(t): return nil } } //Kubernetes#5316 , 1) go func() result := fn() ch <- result }() not blocking any more failure

New Concurrency Features in Go func finishRequest(t sec) r object { ch := make(chan object) go func() result := fn() ch <- result }() select { case result = <- ch: return result case <- time.timeout(t): return nil } } //Kubernetes#5316 buffered channel vs. unbuffered channel anonymous function use select to wait for multiple channels failure

New Concurrency Features in Go func finishRequest(t sec) r object { ch := make(chan object) go func() result := fn() ch <- result }() select { case result = <- ch: return result case <- time.timeout(t): return nil } } //Kubernetes#5316 buffered channel vs. unbuffered channel anonymous function use select to wait for multiple channels failure

Outline Introduction A real bug example Go concurrency bug study Taxonomy Blocking Bug Non-blocking Bug Conclusions Introduction A real bug example Go concurrency bug study Taxonomy Blocking Bug Non-blocking Bug Conclusions

Bug Taxonomy Categorize bugs based on two dimensions Root cause: shared memory vs. message passing func finishRequest(t sec) r object { ch := make(chan object) go func() result := fn() ch <- result }() select { case result = <- ch: return result case <- time.timeout(t): return nil } } //Kubernetes#5316 Cause shared memory channel message passing

Bug Taxonomy × × × × Categorize bugs based on two dimensions Root cause: shared memory vs. message passing Behavior: blocking vs. non-blocking func finishRequest(t sec) r object { ch := make(chan object) go func() result := fn() ch <- result }() select { case result = <- ch: return result case <- time.timeout(t): return nil } } //Kubernetes#5316 Cause shared memory × × blocking × × message passing Behavior blocking non-blocking

Bug Taxonomy × × × × Categorize bugs based on two dimensions Root cause: shared memory vs. message passing Behavior: blocking vs. non-blocking Cause Behavior blocking non-blocking shared memory 36 69 message passing 49 17 Cause shared memory × × 105 × × message passing 66 Behavior 85 86 blocking non-blocking

Concurrency Usage Study Observation: Share memory synchronizations are used more often in Go applications.

Outline Introduction A real bug example Go concurrency bug study Taxonomy Blocking Bug Non-blocking Bug Conclusions Introduction A real bug example Go concurrency bug study Taxonomy Blocking Bug Non-blocking Bug Conclusions

Root Causes Conducting blocking operations to protect shared memory accesses to pass message across goroutines Message Passing Shared Memory

(mis)Protecting Shared Memory Observation: Most blocking bugs caused by shared memory synchronizations have the same causes as traditional languages.

Misuse of Channel ch <- m m <- ch Goroutine 1 Goroutine 2 blocking

Misuse of Channel with Lock func goroutine1() { m.Lock() ch <- request m.Unlock() } func goroutine2() { for { m.Lock() m.Unlock() request <- ch }

Observation Observation: more blocking bugs in our studied Go applications are caused by wrong message passing. 20% More

Implication Implication: we call for attention to the potential danger in programming with message passing. 20% More

Outline Introduction A real bug example Go concurrency bug study Taxonomy Blocking Bug Non-blocking Bug Conclusions Introduction A real bug example Go concurrency bug study Taxonomy Blocking Bug Non-blocking Bug Conclusions

Root Causes Failing to protect shared memory Errors during message passing Shared Memory Message Passing

Traditional Bugs > 50%

Misusing Channel close(ch) close(ch) ch <- m close(ch) Thread 1 panic! panic!

Implication Implication: new concurrency mechanisms Go introduced can themselves be the reasons of more concurrency bugs.

Conclusions 1st empirical study on go concurrency bugs Future works shared memory vs. message passing blocking bugs vs. non-blocking bugs paper contains more details (contact us for more) Future works Statically detecting go concurrency bugs checkers built based on identified buggy patterns Already found concurrency bugs in real applications

Thanks a lot!

171 Real-World Go Concurrency Bugs Questions? Cause × × shared memory BlotDB × × message passing Behavior 171 Real-World Go Concurrency Bugs blocking non-blocking Data Set: https://github.com/system-pclub/go-concurrency-bugs