PNUTS: YAHOO!’S HOSTED DATA SERVING PLATFORM FENGLI ZHANG.



CONTENT
- REQUIREMENTS OF WEB APPLICATIONS
- PNUTS ARCHITECTURE
- EXPERIMENT
- CONCLUSION

REQUIREMENTS OF WEB APPLICATIONS
- SCALABILITY: ARCHITECTURAL SCALABILITY, SCALE LINEARLY
- GEOGRAPHIC SCOPE: DATA REPLICAS ON MULTIPLE CONTINENTS
- HIGH AVAILABILITY: DESPITE FAILURES, APPS WILL STILL BE ABLE TO READ
- RELAXED CONSISTENCY GUARANTEES: TOLERATE STALE OR REORDERED DATA

WHAT IS PNUTS?
PNUTS: A MASSIVELY PARALLEL AND GEOGRAPHICALLY DISTRIBUTED DATABASE SYSTEM FOR YAHOO!’S WEB APPLICATIONS.
PNUTS PROVIDES:
- DATA STORAGE ORGANIZED AS HASHED OR ORDERED TABLES
- LOW LATENCY FOR LARGE NUMBERS OF CONCURRENT REQUESTS, INCLUDING UPDATES AND QUERIES
- NOVEL PER-RECORD CONSISTENCY GUARANTEES

DETAILED ARCHITECTURE

TABLET SPLITTING

DATA STORAGE AND RETRIEVAL
- DATA STORAGE IS ORGANIZED AS HASHED OR ORDERED TABLES.
- TO DETERMINE WHICH STORAGE UNIT IS RESPONSIBLE FOR A GIVEN RECORD TO BE READ OR WRITTEN BY THE CLIENT, WE MUST FIRST DETERMINE WHICH TABLET CONTAINS THE RECORD, AND THEN DETERMINE WHICH STORAGE UNIT HAS THAT TABLET.
- BOTH OF THESE FUNCTIONS ARE CARRIED OUT BY THE ROUTER.
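The two-step lookup described above can be sketched as follows for a hashed table. This is a minimal illustration under stated assumptions, not the PNUTS implementation: the class name `Router`, the 16-bit hash space, the boundary values, and the storage unit names are all hypothetical.

```python
import bisect
import hashlib

HASH_SPACE = 1 << 16  # hypothetical hash space size, chosen for illustration


def hash_key(key: str) -> int:
    """Map a record key to an integer in [0, HASH_SPACE)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")


class Router:
    """Hypothetical router: finds the tablet, then the storage unit."""

    def __init__(self, boundaries, tablet_to_unit):
        # boundaries is sorted; tablet i covers the hash interval
        # [boundaries[i-1], boundaries[i]), with 0 and HASH_SPACE implied
        # at the two ends.
        self.boundaries = boundaries
        self.tablet_to_unit = tablet_to_unit

    def tablet_for_hash(self, h: int) -> int:
        # Step 1: binary search over the interval boundaries.
        return bisect.bisect_right(self.boundaries, h)

    def storage_unit_for(self, key: str) -> str:
        # Step 2: look up which storage unit holds that tablet.
        tablet = self.tablet_for_hash(hash_key(key))
        return self.tablet_to_unit[tablet]


router = Router(
    boundaries=[16384, 32768, 49152],
    tablet_to_unit={0: "su1", 1: "su2", 2: "su1", 3: "su3"},
)
unit = router.storage_unit_for("alice")
```

For an ordered table the same binary search would apply to the key itself rather than to its hash.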

DATA STORAGE AND RETRIEVAL

- ROUTERS CONTAIN ONLY A CACHED COPY OF THE INTERVAL MAPPING
- THE MAPPING IS OWNED BY THE TABLET CONTROLLER
- THE TABLET CONTROLLER DETERMINES WHEN TO MOVE A TABLET BETWEEN STORAGE UNITS AND WHEN A LARGE TABLET MUST BE SPLIT
- ROUTERS PERIODICALLY POLL THE TABLET CONTROLLER TO GET ANY CHANGES TO THE MAPPING
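The relationship between the tablet controller (owner of the mapping) and a router (holder of a cached copy) might look like the sketch below. The class names, the `snapshot` and `poll` methods, and the use of a version counter to detect changes are assumptions made for illustration; the slides do not say how changes are detected.

```python
class TabletController:
    """Hypothetical owner of the authoritative tablet-to-storage-unit mapping."""

    def __init__(self, tablet_to_unit):
        self._mapping = dict(tablet_to_unit)
        self.version = 0  # assumed change counter, not from the source

    def move_tablet(self, tablet, unit):
        # The controller decides when a tablet moves between storage units.
        self._mapping[tablet] = unit
        self.version += 1

    def snapshot(self):
        # Return a copy so routers never share state with the controller.
        return self.version, dict(self._mapping)


class CachingRouter:
    """Hypothetical router that serves lookups from a cached mapping."""

    def __init__(self, controller):
        self.controller = controller
        self.version, self.cache = controller.snapshot()

    def poll(self):
        """Refresh the cache; return True if the mapping had changed."""
        version, mapping = self.controller.snapshot()
        if version != self.version:
            self.version, self.cache = version, mapping
            return True
        return False

    def unit_for(self, tablet):
        return self.cache[tablet]


controller = TabletController({0: "su1", 1: "su2"})
caching_router = CachingRouter(controller)
controller.move_tablet(1, "su3")
stale_answer = caching_router.unit_for(1)   # still the old unit until the next poll
changed = caching_router.poll()
fresh_answer = caching_router.unit_for(1)
```

Between polls the router can return a stale location, which is the cost of keeping only a cached copy.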

QUERY PROCESSING: ACCESSING DATA

MULTI-RECORD REQUEST

UPDATES

ASYNCHRONOUS REPLICATION AND CONSISTENCY
EXAMPLE OF EVENTUAL CONSISTENCY
A USER WISHES TO DO A SEQUENCE OF 2 UPDATES TO HIS RECORD:
- U1: REMOVE HIS MOTHER FROM THE LIST OF PEOPLE WHO CAN VIEW HIS PHOTOS
- U2: POST SPRING-BREAK PHOTOS
IF A REPLICA APPLIES U2 BEFORE U1, A READER OF THAT REPLICA IS ABLE TO SEE A STATE OF THE RECORD THAT NEVER SHOULD HAVE EXISTED: THE PHOTOS HAVE BEEN POSTED BUT THE CHANGE IN ACCESS CONTROL HAS NOT TAKEN PLACE.
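The anomaly in this example can be reproduced with a small simulation. This is only a sketch: the record layout (a `viewers` set and a `photos` list) and all function names are hypothetical, chosen to mirror U1 and U2 above.

```python
def apply_update(record, update):
    """Return a new record state with one update applied."""
    new = {"viewers": set(record["viewers"]), "photos": list(record["photos"])}
    kind, value = update
    if kind == "remove_viewer":
        new["viewers"].discard(value)
    elif kind == "post_photo":
        new["photos"].append(value)
    return new


def states_seen(initial, updates):
    """List every intermediate state a reader of this replica could observe."""
    states, record = [], initial
    for update in updates:
        record = apply_update(record, update)
        states.append(record)
    return states


def mother_sees_photos(record):
    return "mother" in record["viewers"] and "spring-break" in record["photos"]


initial = {"viewers": {"mother", "friend"}, "photos": []}
u1 = ("remove_viewer", "mother")
u2 = ("post_photo", "spring-break")

in_order = states_seen(initial, [u1, u2])    # order the user intended
reordered = states_seen(initial, [u2, u1])   # order an eventually consistent replica might use

leak_in_order = any(mother_sees_photos(s) for s in in_order)
leak_reordered = any(mother_sees_photos(s) for s in reordered)
```

Both replicas converge to the same final state, yet only the reordered one passes through the state in which the photos are visible to the mother.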

RECORD TIMELINE CONSISTENCY
RECORD-LEVEL MASTERING:
- ONE OF THE REPLICAS IS DESIGNATED AS THE MASTER, INDEPENDENTLY FOR EACH RECORD, AND ALL UPDATES TO THAT RECORD ARE FORWARDED TO THE MASTER.
- THE REPLICA RECEIVING THE MAJORITY OF WRITE REQUESTS FOR A PARTICULAR RECORD BECOMES THE MASTER FOR THAT RECORD.
PER-RECORD TIMELINE CONSISTENCY:
- ALL REPLICAS OF A GIVEN RECORD APPLY ALL UPDATES TO THE RECORD IN THE SAME ORDER.
- THE RECORD CARRIES A SEQUENCE NUMBER THAT IS INCREMENTED ON EVERY WRITE.
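A minimal sketch of how a per-record master and sequence numbers can give every replica the same update order is shown below. The class names and the buffering of out-of-order deliveries in the replica are assumptions for illustration; the slides state only that updates go through the master and that the sequence number increases on every write.

```python
class RecordMaster:
    """Hypothetical master for one record: the single point that orders writes."""

    def __init__(self):
        self.seq = 0

    def accept(self, new_value):
        # Every write forwarded to the master gets the next sequence number.
        self.seq += 1
        return self.seq, new_value


class Replica:
    """Hypothetical replica that applies updates strictly in sequence order."""

    def __init__(self):
        self.value = None
        self.applied_seq = 0
        self.pending = {}  # updates that arrived early, keyed by sequence number

    def receive(self, seq, new_value):
        self.pending[seq] = new_value
        # Apply as many consecutive updates as are available; only the
        # latest applied version of the record is kept.
        while self.applied_seq + 1 in self.pending:
            self.applied_seq += 1
            self.value = self.pending.pop(self.applied_seq)


master = RecordMaster()
w1 = master.accept("SLEEPING")
w2 = master.accept("AWAKE")

replica_a, replica_b = Replica(), Replica()
replica_a.receive(*w1)
replica_a.receive(*w2)

replica_b.receive(*w2)            # arrives early, so it is held back
held_back = replica_b.value
replica_b.receive(*w1)            # now both updates apply, in master order
```

A replica may lag behind the master, but it never moves backward on the record's timeline or exposes a state the master did not produce.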

PER-RECORD TIMELINE CONSISTENCY
WE (CURRENTLY) KEEP ONLY ONE VERSION OF A RECORD AT EACH REPLICA.

RECORD TIMELINE CONSISTENCY
TRANSACTIONS:
- ALICE CHANGES STATUS FROM “SLEEPING” TO “AWAKE”
- ALICE CHANGES LOCATION FROM “HOME” TO “WORK”

TIMELINE CONSISTENCY COMES AT A PRICE
- WRITES NOT ORIGINATING IN THE RECORD'S MASTER REGION ARE FORWARDED TO THE MASTER AND HAVE LONGER LATENCY
- WHEN THE MASTER REGION IS DOWN, THE RECORD IS UNAVAILABLE FOR WRITES

EXPERIMENTAL SETUP
- THREE PNUTS REGIONS: 2 WEST COAST, 1 EAST COAST
- 5 STORAGE UNITS, 2 MESSAGE BROKERS, 1 ROUTER
- WEST: DUAL 2.8 GHZ XEON, 4GB RAM, 6 DISK RAID 5 ARRAY
- EAST: QUAD 2.13 GHZ XEON, 4GB RAM, 1 SATA DISK
- WORKLOAD: REQUESTS/SECOND, 0-50% WRITES, 80% LOCALITY

INSERT
INSERTS REQUIRED 75.6 MS PER INSERT IN WEST 1 (TABLET MASTER), MS PER INSERT INTO THE NON-MASTER WEST 2, AND MS PER INSERT INTO THE NON-MASTER EAST.

SCALABILITY

CONCLUSION AND ONGOING WORK
PNUTS IS AN INTERESTING RESEARCH PRODUCT
- RESEARCH: CONSISTENCY, PERFORMANCE, FAULT TOLERANCE, RICH FUNCTIONALITY
- PRODUCT: MAKE IT WORK, KEEP IT (RELATIVELY) SIMPLE, LEARN FROM EXPERIENCE AND REAL APPLICATIONS
ONGOING WORK
- INDEXES AND MATERIALIZED VIEWS
- BUNDLED UPDATES
- BATCH QUERY PROCESSING

THANK YOU! QUESTIONS?