Tanakorn Leesatapornwongsa, Haryadi S. Gunawi. ISSTA ’15.

Presentation transcript:

Tanakorn Leesatapornwongsa Haryadi S. Gunawi

ISSTA ’15. Nodes node1, node2, and node3 communicate over TCP/UDP.

Nodes: node1, node2, node3. Messages: A, B, C.
Message processing order:
1. Node 2 processes A
2. Node 3 processes B
3. Node 2 processes C

Nodes: node1, node2, node3. Messages: A, B, C.
Message processing order:
1. Network delays A
2. Node 3 processes B
3. Node 2 processes C
4. Node 2 processes A

Nodes: node1, node2, node3. Messages: A, B, C.
Possible message processing orders:

Order 1:
1. Node 2 processes A
2. Node 3 processes B
3. Node 2 processes C

Order 2:
1. Node 3 processes B
2. Node 2 processes A
3. Node 2 processes C

Order 3:
1. Node 3 processes B
2. Node 2 processes C
3. Node 2 processes A

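Model checking server with an interposition layer around node1, node2, and node3. The nodes' outgoing messages (A, B and C, D) are intercepted, and the server releases them in one chosen order: A, B, C, D.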

Model checking server with an interposition layer around node1, node2, and node3. The same intercepted messages (A, B and C, D) are re-executed in other orders:
- D, A, C, B
- A, B, D, C
- D, C, B, A
- ...
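The reordering shown on this slide can be sketched as a plain permutation enumerator. This is an illustrative sketch only, not SAMC's code: the class name `MessageOrderings` and the method `allOrderings` are hypothetical, and messages are reduced to string labels.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: list every order in which a set of intercepted
// messages could be released, as a brute-force exploration would.
class MessageOrderings {

    // Returns all permutations of the pending messages.
    static List<List<String>> allOrderings(List<String> pending) {
        List<List<String>> result = new ArrayList<>();
        permute(new ArrayList<>(pending), 0, result);
        return result;
    }

    // Classic in-place permutation: fix position `start`, recurse on the rest.
    private static void permute(List<String> msgs, int start, List<List<String>> out) {
        if (start == msgs.size()) {
            out.add(new ArrayList<>(msgs));
            return;
        }
        for (int i = start; i < msgs.size(); i++) {
            Collections.swap(msgs, start, i);
            permute(msgs, start + 1, out);
            Collections.swap(msgs, start, i); // undo before trying the next choice
        }
    }

    public static void main(String[] args) {
        List<List<String>> orders = allOrderings(List.of("A", "B", "C", "D"));
        System.out.println(orders.size() + " orderings");
    }
}
```

Four in-flight messages already give 4! = 24 orderings, which is why the later slides look for ways to skip orderings that cannot change the outcome.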

Outline:
- SAMC demo
- Integration of SAMC
- Real integration
- Conclusion

Demo program: leader election
- Find which node has the biggest ID at election time
- There must be only one leader!

Example: node1 (V=1), node2 (V=2), node3 (V=3). Support moves from 1 to 2 to 3, so Leader = 3.
- On startup, each node supports itself
- Each node broadcasts its support
- If the received ID is smaller, do nothing
- If it is bigger, change support to it
- After a support change, broadcast again
- Stop when a majority agrees
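The election rules above can be sketched as a small single-process simulation. This is a minimal sketch under stated assumptions, not the demo program itself: the names `LeaderElectionSketch`, `Vote`, and `electLeader` are hypothetical, messages are delivered in FIFO order, and the run continues until the network is quiet instead of stopping as soon as a majority agrees.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical simulation of the slide's election rules for nodes 1..n.
class LeaderElectionSketch {

    // A vote message addressed to one receiver, carrying the sender's support.
    static final class Vote {
        final int to;
        final int support;
        Vote(int to, int support) { this.to = to; this.support = support; }
    }

    // Queue a copy of the sender's current support for every other node.
    static void broadcast(int from, int support, int n, Deque<Vote> network) {
        for (int to = 1; to <= n; to++) {
            if (to != from) network.add(new Vote(to, support));
        }
    }

    // Returns the ID supported by a majority of nodes, or -1 if there is none.
    static int electLeader(int n) {
        int[] support = new int[n + 1];
        Deque<Vote> network = new ArrayDeque<>();
        for (int id = 1; id <= n; id++) {
            support[id] = id;               // on startup, support yourself
            broadcast(id, id, n, network);
        }
        while (!network.isEmpty()) {
            Vote v = network.poll();
            if (v.support > support[v.to]) { // bigger ID: change support, re-broadcast
                support[v.to] = v.support;
                broadcast(v.to, v.support, n, network);
            }                                // smaller or equal ID: do nothing
        }
        int[] count = new int[n + 1];
        for (int id = 1; id <= n; id++) count[support[id]]++;
        for (int id = 1; id <= n; id++) {
            if (count[id] > n / 2) return id;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("Leader = " + electLeader(3));
    }
}
```

With in-order delivery the three-node run ends with every node supporting node 3. A model checker earns its keep on the other delivery orders, where an early stop on a partial majority can leave two nodes each believing they are the leader.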

Run SAMC with 2 exploration algorithms:
- Brute force
  - Slow and inefficient
- Local-message independence (LMI)
  - Fast white-box testing
  - Requires semantic information: message semantics and system state

Replaying a buggy execution path:
- Use the execution path output to replay the run
- Debug the execution up to the desired step
This makes it easy for developers to debug code and fix bugs.

Demo summary:
- Re-order all messages as we want
- Report the execution path and the execution result
- SAMC is semantic-aware
  - Supports semantic-aware exploration algorithms
- Fast model checking
  - SAMC with LMI can catch the 2-leader bug in 3 executions!
- Execution replay function

Aspect-oriented programming for the interposition layer (LeaderElectionAspect.aj):
- Written separately, so it does not clutter the system code
- Intercepts at the message-sending method
- Passes message semantics to the server via key-value pairs

Exploration algorithms in the SAMC server:
- Basic algorithms
  - Brute force, random, etc.
- Extendable dynamic partial order reduction (DPOR)
  - Implement LMI by adding application-specific logic to DPOR
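As an illustration of the kind of application-specific logic meant here, the sketch below encodes one independence rule for the demo's leader election. It is an assumption-laden simplification, not SAMC's DPOR extension: the names `LmiSketch`, `isDiscarded`, and `needsReordering` are hypothetical, and the only rule modeled is that a vote the receiver would discard commutes with any other vote to the same node.

```java
// Hypothetical LMI-style rule for two votes pending at the same node.
class LmiSketch {

    // A vote is discarded when it does not exceed the receiver's current support.
    static boolean isDiscarded(int currentSupport, int vote) {
        return vote <= currentSupport;
    }

    // Receiver logic from the election rules: keep the larger ID.
    static int process(int currentSupport, int vote) {
        return Math.max(currentSupport, vote);
    }

    // If either vote would be discarded, swapping the two cannot change the
    // receiver's state, so the second ordering can be skipped.
    static boolean needsReordering(int currentSupport, int voteA, int voteB) {
        return !isDiscarded(currentSupport, voteA) && !isDiscarded(currentSupport, voteB);
    }

    public static void main(String[] args) {
        // Node supports 3; votes 1 and 2 are both discarded: one order suffices.
        System.out.println(needsReordering(3, 1, 2));
        // Node supports 1; votes 2 and 3 both change state: explore both orders.
        System.out.println(needsReordering(1, 2, 3));
    }
}
```

The rule needs both the message content and the receiver's local state, which is the semantic information the interposition layer reports to the server.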

Workload driver: extend the abstract class WorkloadDriver to define
- How to start / stop / reset the system
- How to start the workload we want to check
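A driver of this shape might look like the sketch below. Only the class name WorkloadDriver comes from the slide; the method names, the `WorkloadDriverSketch` and `RecordingDriver` classes, and the lifecycle order are assumptions made for illustration. The stand-in records its actions in a log instead of launching real processes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shape of a workload driver; method names are illustrative only.
abstract class WorkloadDriverSketch {
    abstract void startSystem();
    abstract void stopSystem();
    abstract void resetSystem();
    abstract void runWorkload();

    // One model-checking execution: clean state, boot, drive workload, tear down.
    final void runOneExecution() {
        resetSystem();
        startSystem();
        runWorkload();
        stopSystem();
    }
}

// Stand-in driver that records what it would do instead of spawning processes.
class RecordingDriver extends WorkloadDriverSketch {
    final List<String> log = new ArrayList<>();
    private final int numNodes;

    RecordingDriver(int numNodes) { this.numNodes = numNodes; }

    @Override void startSystem() {
        for (int id = 1; id <= numNodes; id++) log.add("start node" + id);
    }
    @Override void stopSystem() { log.add("stop all nodes"); }
    @Override void resetSystem() { log.add("reset state"); }
    @Override void runWorkload() { log.add("trigger leader election"); }

    public static void main(String[] args) {
        RecordingDriver driver = new RecordingDriver(3);
        driver.runOneExecution();
        driver.log.forEach(System.out::println);
    }
}
```

Resetting before every execution matters because the model checker reruns the same workload many times, once per explored ordering, and each run must start from the same state.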

Start Java processes that run SampleSys with the given config files.

Specification verifier: extend the abstract class SpecificationVerifier
- Does the system behave according to the specification?
  - How many leaders are there?
  - Does everyone agree on one leader?
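The two checks on this slide can be sketched as follows. Only the class name SpecificationVerifier comes from the slide; `SpecificationVerifierSketch`, `OneLeaderVerifier`, the `verify` method, and the map from node ID to the leader that node believes in are hypothetical stand-ins for however the real verifier reads system state.

```java
import java.util.HashSet;
import java.util.Map;

// Hypothetical shape of a specification verifier.
abstract class SpecificationVerifierSketch {
    // Returns true when the observed system state satisfies the specification.
    abstract boolean verify();
}

// Checks the leader election spec: exactly one leader, and everyone agrees on it.
class OneLeaderVerifier extends SpecificationVerifierSketch {
    // Maps each node ID to the ID of the leader that node believes in.
    private final Map<Integer, Integer> leaderByNode;

    OneLeaderVerifier(Map<Integer, Integer> leaderByNode) {
        this.leaderByNode = leaderByNode;
    }

    @Override
    boolean verify() {
        // How many leaders? Count nodes that consider themselves the leader.
        long selfElected = leaderByNode.entrySet().stream()
                .filter(e -> e.getKey().equals(e.getValue()))
                .count();
        // Does everyone agree? All nodes must name the same leader.
        boolean allAgree = new HashSet<>(leaderByNode.values()).size() == 1;
        return selfElected == 1 && allAgree;
    }

    public static void main(String[] args) {
        System.out.println(new OneLeaderVerifier(Map.of(1, 3, 2, 3, 3, 3)).verify());
        System.out.println(new OneLeaderVerifier(Map.of(1, 3, 2, 2, 3, 3)).verify());
    }
}
```

The second call in `main` models the 2-leader bug: nodes 2 and 3 each consider themselves leader, so the check fails.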

Real integration:
- Non-determinism
  - Network communication
  - Disk I/O
  - Machine crash / machine restart
- Model check 5 versions
- Reproduce 7 old bugs
  - Leader election and atomic broadcast protocol
  - Some require multiple crashes and reboots
- Find 1 new bug

Number of executions needed to reproduce old bugs

Issue#   | Protocol | Brute force | Random | Semantic-Aware
ZK-335   | ZAB      |             |        |
ZK-790   | ZLE      |             |        |
ZK-975   | ZLE      |             |        |
ZK-1075  | ZLE      |             |        |
ZK-1419  | ZLE      |             |        |
ZK-1492  | ZLE      |             |        |
ZK-1653  | ZAB      |             |        |

ZAB = ZooKeeper atomic broadcast protocol
ZLE = ZooKeeper leader election protocol

Conclusion:
- Semantic awareness enables fast model checking
- AOP for the interposition layer
- The SAMC server is extendable and comes with a replay function
- SAMC can be integrated with real systems

- Timeout interposition
- Catching performance bugs
- Step-by-step replay function

Code can be found at

Model checking server with node1, node2, and node3. Intercepted messages: A, B and C, D. Explored order: A, B, C, D.

- Comes with extendable dynamic partial order reduction (DPOR)
  - Implement LMI by adding application-specific logic to DPOR
- Testers write a workload driver
  - What workload to feed to the system
  - How to check the correctness of the system

- AOP for the interposition layer
  - Written separately, so it does not clutter the system code
  - Intercepts at the sending method
  - Forwards message semantics to the model checking server

    // Match every call to Sender.write(ElectionMessage) and bind the sender.
    pointcut write(Sender sender, ElectionMessage msg) :
        call(public void Sender.write(ElectionMessage)) && this(sender) &&...;

    // Replace the send: describe the message to the server instead of sending it.
    void around(Sender sender, ElectionMessage msg) : write(sender, msg) {
        LeaderElectionPacket packet = new LeaderElectionPacket(...);
        packet.addKeyValue(LeaderElectionPacket.EVENT_ID_KEY, hash(msg, sender.otherId));
        packet.addKeyValue(LeaderElectionPacket.SOURCE_KEY, id);
        packet.addKeyValue(LeaderElectionPacket.DESTINATION_KEY, sender.otherId);
        // Each key is paired with its matching getter (the slide had these two swapped).
        packet.addKeyValue(LeaderElectionPacket.LEADER_KEY, msg.getLeader());
        packet.addKeyValue(LeaderElectionPacket.ROLE_KEY, msg.getRole());
        // Remember the packet and its sender, keyed by packet ID.
        nodeSenderMap.put(packet.getId(), packet);
        msgSenderMap.put(packet.getId(), sender);
        try {
            modelCheckingServer.offerPacket(packet);
        } catch (RemoteException e) {
            e.printStackTrace();
        }
    }