Amazon Simple Storage Service (S3)

Outline
- Concepts
- How to Access S3
- Scalability
- Availability
- Versioning
- Consistency

Concepts of S3 – Bucket
- Container for objects stored in S3
- Unlimited size
- Organizes the namespace at the highest level
- Internet-accessible storage via HTTP/HTTPS
- Globally unique name
http://doc.s3.amazonaws.com/2006-03-01/AmazonS3.wsdl
From: http://techblogsearch.com/a/amazon-announced-new-s3-usability-enhancements.html

Concepts of S3 – Object
- Similar to files, but with no hierarchy
- Objects are immutable
- Size up to 5 TB
- Uniquely identified within a bucket by a key (name) and a version ID
http://doc.s3.amazonaws.com/2006-03-01/AmazonS3.wsdl

Concepts of S3 – Key
- Name of an object
- Unique identifier for an object within a bucket
- The object key is used to retrieve the object
http://doc.s3.amazonaws.com/2006-03-01/AmazonS3.wsdl

Concepts of S3 – Namespace
- Globally unique bucket name + object name (key)

How to Access S3 – CLI & REST
AWS Command Line Interface (CLI)
- Put Object: aws s3api put-object --bucket <bucket> --key <key> --body <file>
- Get Object: aws s3api get-object --bucket <bucket> --key <key> <outfile>
REST API
From: http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectOps.html

How to Access S3 – AWS SDK
SDK for Java
Get Object:
    AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
    S3Object object = s3Client.getObject(new GetObjectRequest(bucketName, key));
Put Object:
    AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
    s3Client.putObject(new PutObjectRequest(bucketName, keyName, file));
From: http://docs.aws.amazon.com/AmazonS3/latest/dev/RetrievingObjectUsingJava.html

CAP Theorem
- In the presence of a network partition, strong consistency and high availability cannot both be achieved at the same time
- Amazon S3 (AP)
  - High availability
  - Eventual consistency
  - Read-after-write consistency
From: http://berb.github.io/diploma-thesis/original/061_challenge.html

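Scalability – Consistent Hashing
- Avoids re-hashing everything when machines join or leave
- Each object key is mapped to a point on the edge of a ring
- Each machine is mapped to many random points (nodes) on the edge of the ring
- A machine is responsible for the arc before each of its nodes
- If a machine joins, it adds new nodes and takes over the arcs before them
- If a machine leaves, each of its arcs moves to the machine whose node comes next on the ring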
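The ring described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not how S3 or Dynamo is implemented: the class name ConsistentHashRing, the use of MD5 to place points, and the "machine#i" naming of a machine's points are all hypothetical choices made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative consistent-hash ring; all names and parameters are hypothetical. */
final class ConsistentHashRing {
    // Ring position -> machine that owns the arc ending at that position.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int pointsPerMachine;

    ConsistentHashRing(int pointsPerMachine) {
        this.pointsPerMachine = pointsPerMachine;
    }

    /** Maps a string to a ring position using the first 8 bytes of its MD5 digest. */
    static long position(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xffL);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** A joining machine adds its points; it takes over only the arcs before them. */
    void addMachine(String machine) {
        for (int i = 0; i < pointsPerMachine; i++) {
            ring.put(position(machine + "#" + i), machine);
        }
    }

    /** A leaving machine removes its points; its arcs fall to the next points on the ring. */
    void removeMachine(String machine) {
        for (int i = 0; i < pointsPerMachine; i++) {
            ring.remove(position(machine + "#" + i));
        }
    }

    /** The owner is the first point at or after the key's position, wrapping around. */
    String ownerOf(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no machines on the ring");
        }
        Map.Entry<Long, String> e = ring.ceilingEntry(position(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        ConsistentHashRing r = new ConsistentHashRing(50);
        r.addMachine("A");
        r.addMachine("B");
        r.addMachine("C");
        System.out.println("photo.jpg is stored on " + r.ownerOf("photo.jpg"));
    }
}
```

The sketch ignores the (very unlikely) case of two points hashing to the same position. The property worth noticing is that adding a machine only moves keys onto the new machine; every other key keeps its owner.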

Availability – Replication
- Pick the next n nodes on the ring (on different machines!)
- n is the replication factor
- If one node fails, the other nodes still hold the key's data
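A minimal sketch of picking the replicas for a key, assuming the ring is given as a sorted map from ring position to machine name. The class PreferenceList and the method replicasFor are hypothetical names; the walk simply continues around the ring and skips points that belong to a machine already chosen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Illustrative replica selection on a hash ring; names are hypothetical. */
final class PreferenceList {
    /**
     * Walks the ring from the key's position (wrapping around) and returns
     * up to n distinct machines, in the order they are encountered.
     */
    static List<String> replicasFor(TreeMap<Long, String> ring, long keyPosition, int n) {
        // Points at or after the key first, then the points before it (wrap-around).
        List<String> walk = new ArrayList<>(ring.tailMap(keyPosition, true).values());
        walk.addAll(ring.headMap(keyPosition, false).values());

        List<String> chosen = new ArrayList<>();
        for (String machine : walk) {
            if (chosen.size() == n) {
                break;
            }
            // Several points can belong to one machine; replicas must not share a machine.
            if (!chosen.contains(machine)) {
                chosen.add(machine);
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ring = new TreeMap<>();
        ring.put(10L, "A");
        ring.put(20L, "A");
        ring.put(30L, "B");
        ring.put(40L, "C");
        System.out.println(replicasFor(ring, 15L, 3));
    }
}
```

If fewer than n distinct machines exist, the sketch returns all of them rather than failing; a real system would have to decide how to handle that case.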

Versioning – Basics
- Keep multiple versions of an object in one bucket
- Protect objects from unintended overwrites and deletions
- Retrieve previous versions of objects
- Writes are faster
  - PUT doesn't overwrite: it pushes a new version
  - GET returns the most recent version
- DELETE doesn't wipe the object; a subsequent GET returns "not found"
From: http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#overview
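The PUT, GET, and DELETE behaviour listed above can be modelled with a small in-memory class. This is a toy model, not the S3 API: VersionedBucket and its methods are hypothetical, and version IDs are plain list indexes here, whereas S3 assigns opaque string IDs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Toy in-memory model of a versioned bucket; not the real S3 API. */
final class VersionedBucket {
    // Per key, the full history of versions; a null entry is a delete marker.
    private final Map<String, List<String>> history = new HashMap<>();

    /** PUT never overwrites: it pushes a new version and returns its (assumed integer) ID. */
    int put(String key, String data) {
        Objects.requireNonNull(data, "data");
        return push(key, data);
    }

    /** DELETE does not wipe anything: it pushes a delete marker on top of the history. */
    int delete(String key) {
        return push(key, null);
    }

    /** GET returns the most recent version, or null ("not found") if it is a delete marker. */
    String get(String key) {
        List<String> versions = history.get(key);
        return versions == null ? null : versions.get(versions.size() - 1);
    }

    /** Older versions stay retrievable by ID, even after a DELETE. */
    String getVersion(String key, int versionId) {
        List<String> versions = history.get(key);
        if (versions == null || versionId < 0 || versionId >= versions.size()) {
            return null;
        }
        return versions.get(versionId);
    }

    private int push(String key, String dataOrMarker) {
        List<String> versions = history.computeIfAbsent(key, k -> new ArrayList<>());
        versions.add(dataOrMarker);
        return versions.size() - 1;
    }

    public static void main(String[] args) {
        VersionedBucket bucket = new VersionedBucket();
        int first = bucket.put("report.txt", "draft");
        bucket.put("report.txt", "final");
        bucket.delete("report.txt");
        System.out.println("latest: " + bucket.get("report.txt"));
        System.out.println("first version: " + bucket.getVersion("report.txt", first));
    }
}
```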

Versioning
- Vector clocks capture causality between different versions of the same object
- Notation: D is an object; Sx, Sy, Sz are nodes; [Sx, 1] is a vector clock entry [node, count]
- One vector clock is associated with every version of every object
- The relationship between two versions of an object is determined by examining their vector clocks
- If every counter in D1's clock is less than or equal to the corresponding counter in the second clock, D1 is an ancestor of the second version and can be forgotten
- Otherwise, the two changes conflict and require reconciliation
Version evolution of an object over time:
D1([Sx, 1]) -> D2([Sx, 2]) -> D3([Sx, 2], [Sy, 1]) and D4([Sx, 2], [Sz, 1])
D3 ? D4: neither is an ancestor of the other, so the two versions must be reconciled
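The ancestor rule can be checked mechanically. The following is a minimal sketch, with hypothetical class and method names, that rebuilds the D1 to D4 example from the slide; a missing counter is treated as 0.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative immutable vector clock; names are hypothetical. */
final class VectorClock {
    private final Map<String, Integer> counters;

    VectorClock() {
        this(new HashMap<String, Integer>());
    }

    private VectorClock(Map<String, Integer> counters) {
        this.counters = counters;
    }

    /** Returns a copy with the given node's counter incremented by one. */
    VectorClock tick(String node) {
        Map<String, Integer> copy = new HashMap<>(counters);
        copy.merge(node, 1, Integer::sum);
        return new VectorClock(copy);
    }

    /**
     * True if every counter here is <= the corresponding counter in other,
     * i.e. this version is an ancestor of (or identical to) the other.
     */
    boolean isAncestorOf(VectorClock other) {
        for (Map.Entry<String, Integer> e : counters.entrySet()) {
            if (e.getValue() > other.counters.getOrDefault(e.getKey(), 0)) {
                return false;
            }
        }
        return true;
    }

    /** Two versions conflict when neither is an ancestor of the other. */
    boolean conflictsWith(VectorClock other) {
        return !isAncestorOf(other) && !other.isAncestorOf(this);
    }

    public static void main(String[] args) {
        VectorClock d1 = new VectorClock().tick("Sx"); // D1([Sx, 1])
        VectorClock d2 = d1.tick("Sx");                // D2([Sx, 2])
        VectorClock d3 = d2.tick("Sy");                // D3([Sx, 2], [Sy, 1])
        VectorClock d4 = d2.tick("Sz");                // D4([Sx, 2], [Sz, 1])
        System.out.println("D1 ancestor of D2: " + d1.isAncestorOf(d2));
        System.out.println("D3 conflicts with D4: " + d3.conflictsWith(d4));
    }
}
```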

Consistency
Read-after-write consistency
- Applies to PUTs of new objects in an S3 bucket
- Guarantees visibility of new data to all applications
- Allows you to retrieve objects immediately after creation
Eventual consistency
- Applies to overwrite PUTs and DELETEs of objects in an S3 bucket
- Weak consistency, chosen to achieve high availability
- If no additional updates are made to a given data item, all reads of that item will eventually return the same value
- Updates to a single key are atomic

Eventual Consistency
- Both W1 (write 1) and W2 (write 2) complete before the start of R1 (read 1) and R2 (read 2)
- Consistent read: R1 and R2 both return color = ruby
- Eventually consistent read: R1 and R2 might return color = red, color = ruby, or no results, depending on the amount of time that has elapsed
From: http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html

Consistency Protocol
- A consistency protocol similar to those used in quorum systems maintains consistency among the replicas
- The protocol has two key configurable values, R and W
  - R is the minimum number of nodes that must participate in a successful read operation
  - W is the minimum number of nodes that must participate in a successful write operation
  - N is the total number of nodes
- Setting R and W so that R + W > N yields a quorum-like system: every read set overlaps every write set
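Why R + W > N matters can be verified by brute force for small N. The sketch below uses hypothetical names and an assumed bitmask encoding of node sets. It enumerates every possible write set of size W and read set of size R among N nodes and reports whether they always share at least one node, which is what lets a read see the latest successful write.

```java
/** Brute-force check of quorum overlap for small N; names are hypothetical. */
final class QuorumCheck {
    /**
     * Returns true if every set of r nodes shares at least one node with
     * every set of w nodes, out of n nodes in total. Node sets are encoded
     * as bitmasks, so n must be small (here: at most 20).
     */
    static boolean readAndWriteSetsAlwaysOverlap(int n, int r, int w) {
        if (n < 1 || n > 20 || r < 1 || r > n || w < 1 || w > n) {
            throw new IllegalArgumentException("need 1 <= r, w <= n <= 20");
        }
        for (int writeSet = 0; writeSet < (1 << n); writeSet++) {
            if (Integer.bitCount(writeSet) != w) {
                continue;
            }
            for (int readSet = 0; readSet < (1 << n); readSet++) {
                if (Integer.bitCount(readSet) != r) {
                    continue;
                }
                if ((writeSet & readSet) == 0) {
                    return false; // a read could miss the latest write entirely
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("N=3, R=2, W=2: " + readAndWriteSetsAlwaysOverlap(3, 2, 2));
        System.out.println("N=3, R=1, W=1: " + readAndWriteSetsAlwaysOverlap(3, 1, 1));
    }
}
```

Lower values of R or W reduce latency for reads or writes respectively, at the cost of giving up the overlap guarantee once R + W drops to N or below.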
