A Collaborative Monitoring Mechanism for Making a Multitenant Platform Accountable (HotCloud '10)
By Xuanran Zong
Background
Applications are moving to the cloud:
– Pay-as-you-go basis
– Resource multiplexing
– Reduced over-provisioning cost
Cloud service uncertainty:
– How do clients know whether the cloud provider handles their data and logic correctly?
  – Logic correctness
  – Consistency constraints
  – Performance
Service level agreement (SLA)
To ensure data and logic are handled correctly, the service provider offers a service level agreement to its clients:
– Performance, e.g. one EC2 compute unit provides roughly the CPU capacity of a 1.0–1.2 GHz 2007 Opteron or Xeon processor
– Availability, e.g. the service will be up 99.9% of the time
SLA problems
– Few means are provided for clients to hold an SLA accountable when a problem occurs
  – Accountable means we know who is responsible when things go wrong
  – Monitoring is provided by the provider itself
– Clients are often required to furnish all the evidence by themselves to be eligible to claim credit for an SLA violation
EC2 SLA Reference:
Accountability service
– Provided by a third party
– Responsibilities:
  – Collect evidence based on the SLA
  – Runtime compliance checking and problem detection
Problem description
Clients have a set of end-points {ep_0, ep_1, …, ep_n-1} that operate on data stored in a multitenant environment.
Many things can go wrong:
– Data is modified without the owner's permission
– A consistency requirement is broken
The accountability service should detect these issues and provide evidence.
System architecture
– A wrapper is provided by the third party
– The wrapper captures the input/output of each end-point ep_i and sends it to the accountability service (a minimal sketch follows)
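A minimal sketch of what such a wrapper could look like, assuming the end-point is an ordinary callable and the accountability service exposes a log_operation method; the class and field names here are illustrative, not the paper's actual interface.

```python
import time


class EndpointWrapper:
    """Intercepts calls to a tenant end-point ep_i and forwards an operation log."""

    def __init__(self, endpoint, accountability_service):
        self.endpoint = endpoint               # the wrapped end-point ep_i
        self.service = accountability_service  # third-party accountability service

    def __call__(self, operation, *args, **kwargs):
        result = self.endpoint(operation, *args, **kwargs)
        # Capture the input and output of the call and ship them as a log message.
        self.service.log_operation({
            "timestamp": time.time(),
            "endpoint": getattr(self.endpoint, "__name__", "ep_i"),
            "operation": operation,
            "input": args,
            "output": result,
        })
        return result
```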
Accountability service
The accountability service maintains a view of the data state:
– Reflects what the data should be from the users' perspective
– Aggregates the users' data-update requests to calculate the data state
– Authenticates query results against the calculated data state
Evidence collection and processing
The logging wrapper w_ep extracts operation information and sends a log message to the accountability service W (see the dispatch sketch below):
– If it is an update service, W updates the MB-tree
– If it is a query service, W authenticates the result against the MB-tree, checking correctness and completeness
– The MB-tree maintains the data state
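The dispatch logic in W might look roughly like the following. The paper keeps the data state in a Merkle B-tree (next slide); a plain dictionary stands in for it here so the update/query split stays visible, and all names and the log-message layout are assumptions of this sketch.

```python
class AccountabilityServiceW:
    def __init__(self):
        self.data_state = {}                       # stand-in for the MB-tree

    def process_log(self, log):
        op = log["operation"]
        if op in ("insert", "update"):             # update services change the state
            key, value = log["input"]
            self.data_state[key] = value
        elif op == "delete":
            self.data_state.pop(log["input"], None)
        elif op == "query":                        # query services get authenticated
            key = log["input"]
            expected = self.data_state.get(key)
            if log["output"] != expected:
                # Evidence of a violation: the returned result disagrees with
                # the state implied by the owners' own update requests.
                raise RuntimeError(f"authentication failed for key {key!r}")
```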
Data state calculation
– Use a Merkle B-tree (MB-tree) to maintain the data state
– By combining the items in the verification object (VO), we can recompute the root of the MB-tree and compare it with the maintained root, revealing the correctness and completeness of the query result (a simplified verification sketch follows)
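A simplified illustration of VO-based authentication: it assumes a plain binary Merkle path rather than the MB-tree's fan-out and boundary records, and the hashing scheme and function names are choices made for this sketch only.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def recompute_root(result_records, vo_path):
    """Recompute the Merkle root from the query result plus the VO.

    vo_path is a list of (sibling_hash, sibling_is_left) pairs from leaf to root.
    """
    node = h(b"|".join(r.encode() for r in result_records))
    for sibling, sibling_is_left in vo_path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node


def authenticate(result_records, vo_path, trusted_root) -> bool:
    # Correctness and completeness hold only if the recomputed root matches
    # the root that the accountability service maintains.
    return recompute_root(result_records, vo_path) == trusted_root
```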
Consistency issue
What if the log messages arrive out of order?
– Assume eventual consistency
– Assume clocks are synchronized
– W maintains a sliding window of log messages sorted by timestamp (a sketch of the reordering buffer follows)
– The window size is determined by the maximum delay of delivering a log message from a client to W
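One possible shape for that reordering buffer, under the stated assumptions of synchronized clocks and a known maximum delivery delay; the heap-based layout and names are mine, not the paper's.

```python
import heapq
import itertools
import time


class SlidingWindow:
    """Buffers log messages and releases them in timestamp order."""

    def __init__(self, max_delay_seconds: float):
        self.max_delay = max_delay_seconds    # max client-to-W delivery delay
        self._buffer = []                     # min-heap ordered by timestamp
        self._tie = itertools.count()         # breaks ties between equal timestamps

    def add(self, log):
        heapq.heappush(self._buffer, (log["timestamp"], next(self._tie), log))

    def release(self, now=None):
        """Yield, in timestamp order, every buffered log old enough that no
        earlier message can still be in flight."""
        now = time.time() if now is None else now
        while self._buffer and self._buffer[0][0] <= now - self.max_delay:
            yield heapq.heappop(self._buffer)[2]
```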
Collaborative monitoring mechanism
The current approach is centralized, which raises availability, scalability, and trustworthiness concerns.
Let's make it distributed:
– The data state is maintained by a set of services
– Each service maintains a view of the data state
Design choice I
The log is sent to one data state service, which then propagates it to the other services synchronously.
– Pros: strong consistency; a request can be answered by any service
– Cons: large overhead due to synchronous communication
Design choice II
The log is sent to one service, which propagates it to the others asynchronously.
– Pros: better logging performance
– Cons: uncertainty in answering an authentication request
Their design
Somewhere between the two extremes:
– Partition the key range into a few disjoint regions
– A log message is sent only to its designated region
– Log messages are propagated synchronously within a region and asynchronously across regions
– An authentication request is directed to the service whose region overlaps most with the request range (see the routing sketch below)
  – Answer with certainty if the request range falls inside the service's region
  – Otherwise, wait for the asynchronous propagation to catch up
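A sketch of how the routing rules could be implemented, assuming the key space is split into disjoint contiguous regions with one data-state service per region; the region representation, containment check, and function names are assumptions of this sketch, not the paper's mechanism.

```python
def overlap(region, query_range):
    """Length of the intersection of two half-open ranges (lo, hi)."""
    lo = max(region[0], query_range[0])
    hi = min(region[1], query_range[1])
    return max(0, hi - lo)


def route_log(regions, key):
    """A log message goes only to the service owning the key's designated region."""
    for service, (lo, hi) in regions.items():
        if lo <= key < hi:
            return service
    raise KeyError(key)


def route_authentication(regions, query_range):
    """Direct an authentication request to the service whose region overlaps
    the query range the most. If the range is not fully contained in that
    region, the service must wait for asynchronous updates from the others."""
    service = max(regions, key=lambda s: overlap(regions[s], query_range))
    lo, hi = regions[service]
    fully_contained = lo <= query_range[0] and query_range[1] <= hi
    return service, fully_contained
```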
Evaluation: overhead
– Centralized design
– Where does the overhead come from?
Evaluation: VO calculation overhead
Evaluation: performance improvement with multiple data state services
Discussion
– The paper articulates the problem clearly and shows one solution that employs a third party to make the data state accountable.
– Which part is the main overhead: communication or VO calculation?
– The distributed design does not help much when the query range is large.
– Are people willing to sacrifice performance (at least doubling the time) to make the service accountable?
– Can a similar design be used to make other aspects accountable, for instance performance?