Wiera: Towards Flexible Multi-Tiered Geo-Distributed Cloud Storage Instances Zhe Zhang

CAP Theorem. For a distributed system, partition tolerance (P) is an imperative property, so the real tradeoff is between availability (A) and consistency (C).

Cost Model in AWS. Cloud providers use a cost model to charge customers, and there is a tradeoff between cost and speed.

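A rough way to see the cost/speed tradeoff is to price the same dataset on different AWS storage tiers. The sketch below is illustrative only: the per-GB prices are approximations chosen for the example, not current AWS list prices or figures from the talk.

```python
# Minimal sketch of a monthly storage cost estimate.
# Per-GB prices are illustrative approximations (USD per GB-month), not AWS list prices.
PRICE_PER_GB_MONTH = {
    "ebs_ssd": 0.10,    # fast block storage
    "ebs_hdd": 0.045,   # throughput-optimized block storage
    "s3":      0.023,   # standard object storage
    "s3_ia":   0.0125,  # infrequent-access object storage
}

def monthly_storage_cost(size_gb: float, tier: str) -> float:
    """Estimated cost of keeping size_gb on the given tier for one month."""
    return size_gb * PRICE_PER_GB_MONTH[tier]

# The same 10 TB dataset is roughly 8x cheaper to keep on S3-IA than on EBS SSD.
print(monthly_storage_cost(10 * 1024, "ebs_ssd"))   # ~1024.0
print(monthly_storage_cost(10 * 1024, "s3_ia"))     # ~128.0
```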
Agenda: Introduction, Dynamic Consistency, Dynamic Primary Locations, Reducing Cost with Multi-Tier Storage, Exploiting Remote Storage Tiers, Conclusions.

Wiera Architecture. Wiera User Interface: coordination. Global Policy Manager: user-defined policies. Tiera Server Manager: manages Tiera instances.

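A minimal sketch of how the three components named on this slide might fit together. All class and method names below are hypothetical illustrations, not the actual Wiera API.

```python
# Hypothetical sketch of the three Wiera components named on the slide.
class GlobalPolicyManager:
    """Stores user-defined policies (consistency, placement, cost goals)."""
    def __init__(self):
        self.policies = {}

    def register(self, name: str, policy: dict) -> None:
        self.policies[name] = policy


class TieraServerManager:
    """Manages per-datacenter Tiera instances (local multi-tier storage)."""
    def __init__(self):
        self.instances = {}

    def start_instance(self, datacenter: str, tiers: list[str]) -> None:
        self.instances[datacenter] = {"tiers": tiers, "status": "running"}


class WieraUserInterface:
    """Coordination entry point: users submit a policy, Wiera deploys instances."""
    def __init__(self, policies: GlobalPolicyManager, servers: TieraServerManager):
        self.policies = policies
        self.servers = servers

    def deploy(self, name: str, policy: dict) -> None:
        self.policies.register(name, policy)
        for dc in policy["datacenters"]:
            self.servers.start_instance(dc, policy["tiers"])
```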
Agenda: Introduction, Dynamic Consistency, Dynamic Primary Locations, Reducing Cost with Multi-Tier Storage, Exploiting Remote Storage Tiers, Conclusions.

Strong Consistency and Weak Availability. User a sends a put request to A. A acquires a global lock, synchronously updates the object in B and C, then releases the lock and responds to user a. Only after that do users start reading from the replicas, and every replica returns the consistent object.

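The put path described above can be written as a small protocol sketch: acquire a global lock, update every replica synchronously, then release the lock and reply. The function and parameter names are hypothetical, for illustration only.

```python
# Hypothetical sketch of the strongly consistent put path described above.
def strong_put(key, value, lock_service, replicas):
    lock_service.acquire(key)           # get a global lock
    try:
        for replica in replicas:        # A, B, C
            replica.write(key, value)   # synchronously update the object on every replica
    finally:
        lock_service.release(key)       # release the lock
    return "OK"                         # respond only after all replicas are updated
```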
Weak Consistency and Strong Availability. User a sends a put request to A and immediately gets a response. Users then read the object, but b and c get an outdated version until the update eventually propagates to B and C; after that, all users read the consistent object.

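The eventually consistent put can be sketched the same way: write locally, acknowledge immediately, and propagate to the other replicas in the background. The names and the thread-based propagation are assumptions for illustration.

```python
import threading

# Hypothetical sketch of the eventually consistent put path described above.
def eventual_put(key, value, local_replica, remote_replicas):
    local_replica.write(key, value)     # write at A only
    # Propagate to B and C asynchronously; remote readers may see stale data until this completes.
    def propagate():
        for replica in remote_replicas:
            replica.write(key, value)
    threading.Thread(target=propagate, daemon=True).start()
    return "OK"                         # respond to the user immediately
```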
Medium Consistency and Availability. User a gets a response only after all servers have received the object, and all get requests are forwarded to the primary A.

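In this primary-backup (medium) mode, reads issued at a backup are forwarded to the primary. A minimal sketch of that read path, with hypothetical names:

```python
# Hypothetical sketch of read forwarding in the primary-backup mode.
def get(key, local_server, primary):
    if local_server is primary:
        return local_server.read(key)   # the primary serves reads directly
    return primary.read(key)            # backups forward the get to the primary (A)
```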
Wiera Solution. There is a tradeoff between latency and consistency, so Wiera dynamically switches consistency models: MultiPrimaries (high latency, strong consistency), PrimaryBackup (medium latency, medium consistency), EventualConsistency (low latency, weak consistency).

Dynamic Consistency. Wiera reduces latency by switching away from MultiPrimaries once latency goes beyond a threshold, and recovers consistency when latency meets the requirement again.

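The switching rule on the last two slides can be sketched as a latency check over the three models, ordered from weak to strong. The threshold logic and names below are illustrative assumptions, not the actual Wiera implementation.

```python
# Hypothetical sketch of latency-driven consistency switching.
MODELS = ["EventualConsistency", "PrimaryBackup", "MultiPrimaries"]  # weak -> strong

def choose_model(observed_latency_ms: float, threshold_ms: float, current: str) -> str:
    idx = MODELS.index(current)
    if observed_latency_ms > threshold_ms:
        # Latency too high: step down to a weaker, lower-latency model.
        return MODELS[max(0, idx - 1)]
    # Latency meets the requirement: step back up toward strong consistency.
    return MODELS[min(len(MODELS) - 1, idx + 1)]
```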
Agenda: Introduction, Dynamic Consistency, Dynamic Primary Locations, Reducing Cost with Multi-Tier Storage, Exploiting Remote Storage Tiers, Conclusions.

Imbalanced Traffic on Primary Locations. When the primary is located in the USA but most get requests come from Asia, those requests must be forwarded across regions to the primary, adding latency; moving the primary to Asia removes the forwarding for that traffic.

Changing the Primary. Relocating the primary maintains good consistency while reducing latency.

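Choosing the primary location can be sketched as picking the region that currently issues the most requests. The per-region counters and the example numbers are hypothetical.

```python
# Hypothetical sketch of moving the primary to the busiest region.
def pick_primary(request_counts: dict[str, int]) -> str:
    """request_counts maps a region name (e.g. 'USA', 'Asia') to its recent request count."""
    return max(request_counts, key=request_counts.get)

# Example: most traffic now comes from Asia, so the primary moves there.
print(pick_primary({"USA": 1200, "Asia": 8700}))   # -> "Asia"
```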
Agenda: Introduction, Dynamic Consistency, Dynamic Primary Locations, Reducing Cost with Multi-Tier Storage, Exploiting Remote Storage Tiers, Conclusions.

Motivation. Data popularity follows a Zipfian distribution: only a small portion of the data is hot, and cold data is rarely accessed. This suggests using different storage media for different popularity levels: hot data needs fast access, while cold data can tolerate slower access.

Tradeoff between Cost and Latency. From expensive but fast to cheap but slow: Tier 1 (EBS with SSD), Tier 2 (EBS with HDD), Tier 3 (S3), Tier 4 (S3-IA).

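Popularity-based placement over these four tiers can be sketched as a simple threshold policy. The access-count thresholds below are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch of popularity-based placement over the four tiers above.
def place(access_count_per_day: int) -> str:
    if access_count_per_day > 1000:
        return "Tier1: EBS (SSD)"    # hot data: fast but expensive
    if access_count_per_day > 100:
        return "Tier2: EBS (HDD)"
    if access_count_per_day > 1:
        return "Tier3: S3"
    return "Tier4: S3-IA"            # cold data: slow but cheap
```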
Reducing Cost Using Multi-Tier Storage. (Graphs: performance of accessing local tiered storage vs. performance of accessing a centralized cold store.) Tiering is a money saver: for 10 TB of data it can save about $300.

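As a rough sanity check of the savings claim, the arithmetic below assumes illustrative prices of about $0.045/GB-month for EBS HDD and $0.0125/GB-month for S3-IA; these are approximations for the example, not the paper's measured numbers.

```python
# Rough arithmetic behind the "10 TB can save ~$300" claim.
# Prices are illustrative approximations (USD per GB-month), not the paper's numbers.
size_gb = 10 * 1024           # 10 TB
ebs_hdd = 0.045               # throughput-optimized block storage
s3_ia   = 0.0125              # infrequent-access object storage

monthly_saving = size_gb * (ebs_hdd - s3_ia)
print(round(monthly_saving))  # ~333, i.e. on the order of $300 per month
```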
Agenda: Introduction, Dynamic Consistency, Dynamic Primary Locations, Reducing Cost with Multi-Tier Storage, Exploiting Remote Storage Tiers, Conclusions.

Test Setup. Create a Wiera instance with PrimaryBackup consistency, with the primary on Azure and the backup on AWS; get requests at the backup are forwarded to the primary.

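The test setup could be expressed as a small policy description: one instance spanning Azure (primary) and AWS (backup) under PrimaryBackup consistency. The fields and names below are hypothetical and do not reflect the actual Wiera policy syntax.

```python
# Hypothetical policy description for the test setup; not the real Wiera policy format.
test_policy = {
    "name": "primary-backup-across-clouds",
    "consistency": "PrimaryBackup",
    "datacenters": {
        "azure-west": {"role": "primary", "tiers": ["memory", "disk"]},
        "aws-east":   {"role": "backup",  "tiers": ["memory", "disk", "s3"]},
    },
    # Gets arriving at the backup are forwarded to the primary on Azure.
    "read_forwarding": True,
}
```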
Exploiting Remote Storage Tiers. (Figure: an Azure VM and an AWS VM accessing the Azure memory tier, the Azure disk tier, and Amazon storage.)

Conclusions. Wiera is a robust geo-distributed cloud storage system that dynamically changes its configuration to meet desired metrics: latency, consistency, and cost. Criticism: when exploiting remote storage tiers, the author did not try creating a PrimaryBackup instance on AWS.

Questions?