Team 6: Ali Nickparsa, Yoshimichi Nakatsuka, Yuya Shiraki


Implementation of Failover Handling of Broker Network in Big Active Data

Motivation & Goals
  Project goal: continuous feeding of information to subscribers in the BAD architecture
  Obstacles: a broker may fail due to disasters, hardware issues, software errors, or human error
  Mission: implement a broker failover mechanism to make the broker network resilient
    Failure detection
    Failure handling via client reconnection
  Side benefits: the proposed mechanism can be reused for other functions, including load balancing and client handoffs
  [Figure: broker network with a failed broker]

Related Work
  Components of failover handling schemes
  Replication
    Copy the master node's DB to all nodes and propagate updates [Chubby]
    Store data on multiple nodes on a virtual ring [Dynamo DB]
    Store each data item on multiple nodes of the placement group [Ceph]
  Failure detection
    Nodes send periodic messages to each other [Chubby, Dynamo DB, Ceph]
    Maintain a data structure of the currently active nodes [ZooKeeper]
  Recovery from failure
    Re-elect a master node using an election algorithm [Chubby]
    Remove the old node from the group and add a new node [Dynamo DB, Ceph]

Design Specifics
  Broker ↔ BCS (Broker Coordinating Service)
    BCS checks broker status periodically
    BCS updates the broker list
  Client ↔ BCS
    Client sends a broker discovery message to the BCS when its connection to a broker is lost
  Client ↔ Broker
    Client reconnects to a new broker
    Client sends the sequence number of the last message it received to convey its state
    Broker pulls the client's subscriptions from AsterixDB
  [Figure: three brokers and the BCS; labels: connection request, periodic heartbeat, broker discovery, broker list]
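The interactions on the Design Specifics slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's implementation: the class and function names (BrokerCoordinatingService, Broker, reconnect), the 3-second timeout, the in-memory broker list, and the idea that the new broker replays messages newer than the client's last sequence number are all hypothetical. The real system talks to AsterixDB and uses network connections, which are replaced here by plain Python objects.

```python
import time
from dataclasses import dataclass, field


@dataclass
class BrokerCoordinatingService:
    """Hypothetical in-memory BCS that tracks brokers by last heartbeat time."""
    timeout: float = 3.0  # assumed heartbeat timeout in seconds
    last_seen: dict = field(default_factory=dict)  # broker id -> last heartbeat

    def heartbeat(self, broker_id, now=None):
        """Record a periodic heartbeat from a broker."""
        self.last_seen[broker_id] = time.monotonic() if now is None else now

    def check_brokers(self, now=None):
        """Drop brokers whose heartbeat is older than the timeout; return them."""
        now = time.monotonic() if now is None else now
        failed = [b for b, t in self.last_seen.items() if now - t > self.timeout]
        for broker_id in failed:
            del self.last_seen[broker_id]
        return failed

    def discover(self, exclude=None, now=None):
        """Answer a broker discovery message with a live broker id, or None."""
        self.check_brokers(now)
        candidates = sorted(b for b in self.last_seen if b != exclude)
        return candidates[0] if candidates else None


@dataclass
class Broker:
    """Hypothetical broker; `log` stands in for its (sequence, message) history."""
    broker_id: str
    log: list = field(default_factory=list)

    def resume(self, last_seq):
        """Assumed behavior: return messages newer than the client's last one."""
        return [(seq, msg) for seq, msg in self.log if seq > last_seq]


def reconnect(bcs, brokers, failed_id, last_seq, now=None):
    """Client-side failover: ask the BCS for a new broker, then resume there."""
    new_id = bcs.discover(exclude=failed_id, now=now)
    if new_id is None:
        return None, []
    return new_id, brokers[new_id].resume(last_seq)
```

A client that loses its broker would call reconnect with the failed broker's id and its last received sequence number. Passing an explicit `now` value makes the timeout logic easy to exercise without waiting on a real clock.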

Testing & Evaluation Plan
  Environment
    Docker containers: 1 AsterixDB, 1 BCS, 2 brokers, clients
  Evaluation
    Reconnection delay vs. timeout of the periodic WebSocket status check
    Reconnection delay vs. number of users to move