PNUTS: Yahoo!’s Hosted Data Serving Platform

Presentation transcript:

PNUTS: Yahoo!’s Hosted Data Serving Platform Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver and Ramana Yerneni Yahoo! Research

How do I build a cool new web app? Option 1: Code it up! Make it live! Scale it later. It gets posted to Slashdot. Scale it now! (Flickr, Twitter, MySpace, Facebook, …)

How do I build a cool new web app? Option 2: Make it industrial strength! Evaluate scalable database backends; evaluate scalable indexing systems; evaluate scalable caching systems; architect data partitioning schemes; architect data replication schemes; architect monitoring and reporting infrastructure; write application; go live; realize it doesn’t scale as well as you hoped; rearchitect around bottlenecks. 1 year later: ready to go!

Example: social network updates Brian Sonja Jimi Brandon Kurt What are my friends up to? Sonja: Brandon:

Example: social network updates 6 Jimi <ph.. 8 Mary <re.. 12 Sonja <ph.. 15 Brandon <po.. <photo> <title>Flower</title> <url>www.flickr.com</url> </photo> 16 Mike <ph.. 17 Bob <re..

What do we need from our DBMS? Web applications need: scalability (and the ability to scale linearly); geographic scope; high availability. Web applications typically have: simplified query needs (no joins, aggregations); relaxed consistency needs (applications can tolerate stale or reordered data).

What is PNUTS?

What is PNUTS? CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ). Example rows (ID, StockNumber, Status): A 42342 E; B 42521 W; C 66354 W; D 12352 E; E 75656 C; F 15677 E. Features: parallel database; geographic replication; indexes and views; structured, flexible schema; hosted, managed infrastructure.

Query model. Per-record operations: Get, Set, Delete. Multi-record operations: Multiget, Scan, Getrange. Web service (RESTful) API.
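
The six operations on this slide can be illustrated with a small in-memory stand-in. This is only a sketch: the class name `ToyTable`, the method signatures, and the half-open range convention in `getrange` are assumptions made for illustration, not the PNUTS REST API.

```python
class ToyTable:
    """In-memory stand-in for one table (hypothetical, not the PNUTS API)."""

    def __init__(self):
        self._rows = {}

    # Per-record operations
    def get(self, key):
        return self._rows.get(key)

    def set(self, key, value):
        self._rows[key] = value

    def delete(self, key):
        self._rows.pop(key, None)

    # Multi-record operations
    def multiget(self, keys):
        # Return only the keys that exist.
        return {k: self._rows[k] for k in keys if k in self._rows}

    def scan(self):
        # Return every record in key order.
        return sorted(self._rows.items())

    def getrange(self, lo, hi):
        # Assumed convention: lo <= key < hi, results in key order.
        return [(k, v) for k, v in sorted(self._rows.items()) if lo <= k < hi]


table = ToyTable()
for key, value in [("Brian", 1), ("Sonja", 2), ("Jimi", 3)]:
    table.set(key, value)
table.delete("Jimi")
```

After these calls, `table.getrange("A", "C")` returns only the record for `Brian`, and `table.multiget(["Brian", "Jimi"])` omits the deleted key.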

Detailed architecture Clients Data-path components REST API Routers Message Broker Tablet controller Storage units

Detailed architecture Local region Remote regions Clients REST API Routers YMB Tablet controller Storage units

Tablet splitting and balancing. Each storage unit has many tablets (horizontal partitions of the table). Tablets may grow over time; overfull tablets split. A storage unit may become a hotspot; shed load by moving tablets to other servers.
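
A minimal sketch of the two ideas on this slide, splitting an overfull tablet and moving a tablet off a hot storage unit. The function names, the split-at-midpoint rule, and the "move the largest tablet to the least loaded unit" policy are all assumptions for illustration; the slides do not say how PNUTS chooses split points or balancing targets.

```python
def split_tablet(tablet, max_records):
    """Split a sorted list of keys in half if it holds more than max_records."""
    if len(tablet) <= max_records:
        return [tablet]
    mid = len(tablet) // 2
    return [tablet[:mid], tablet[mid:]]


def shed_load(units):
    """Move one tablet from the most loaded unit to the least loaded one.

    `units` maps a storage unit name to its list of tablets; load is
    approximated by record count (an assumption, real load would be traffic).
    """
    def load(name):
        return sum(len(t) for t in units[name])

    hot = max(units, key=load)
    cold = min(units, key=load)
    if hot != cold and len(units[hot]) > 1:
        tablet = max(units[hot], key=len)
        units[hot].remove(tablet)
        units[cold].append(tablet)
    return units


halves = split_tablet(["a", "b", "c", "d", "e"], max_records=4)
balanced = shed_load({"su1": [["a", "b", "c"], ["d", "e"]], "su2": [["f"]]})
```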

Query processing

Range queries. Router's tablet map: MIN-Canteloupe → SU1; Canteloupe-Lime → SU3; Lime-Strawberry → SU2; Strawberry-MAX → SU1. The router splits the query Grapefruit…Pear? into Grapefruit…Lime? (SU3) and Lime…Pear? (SU2). Keys shown per tablet: Apple, Avocado, Banana, Blueberry; Canteloupe, Grape, Kiwi, Lemon; Lime, Mango, Orange; Strawberry, Tomato, Watermelon.
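
The router's lookup on this slide can be sketched as a binary search over sorted tablet boundaries. The boundaries and owners below are copied from the slide's map; the function names, the use of an empty string for MIN, and the half-open interval convention are assumptions for illustration.

```python
from bisect import bisect_right

# Tablet i covers [BOUNDS[i], BOUNDS[i + 1]); "" stands in for MIN and the
# last tablet extends to MAX. Spelling of the keys follows the slide.
BOUNDS = ["", "Canteloupe", "Lime", "Strawberry"]
OWNERS = ["SU1", "SU3", "SU2", "SU1"]


def route_key(key):
    """Return the storage unit whose tablet contains key."""
    return OWNERS[bisect_right(BOUNDS, key) - 1]


def route_range(lo, hi):
    """Split [lo, hi) at tablet boundaries into (sub_lo, sub_hi, owner) pieces."""
    pieces = []
    i = bisect_right(BOUNDS, lo) - 1
    start = lo
    while start < hi:
        upper = BOUNDS[i + 1] if i + 1 < len(BOUNDS) else None
        end = hi if upper is None or hi <= upper else upper
        pieces.append((start, end, OWNERS[i]))
        start = end
        i += 1
    return pieces


pieces = route_range("Grapefruit", "Pear")
```

Here `pieces` holds two sub-ranges, Grapefruit to Lime on SU3 and Lime to Pear on SU2, matching the split shown on the slide.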

Updates. (Diagram: routers, message brokers, and storage units (SU), with numbered message steps. 1: Write key k; 2: Write key k; 3: Write key k; 4: unlabeled; 5: SUCCESS; 6: Write key k; 7: Sequence # for key k; 8: Sequence # for key k.)

Asynchronous replication and consistency

Asynchronous replication

Consistency model. Goal: make it easier for applications to reason about updates and cope with asynchrony. What happens to a record with primary key “Brian”? (Timeline, Generation 1: record inserted, seven updates, delete; versions v. 1 through v. 8.)

Consistency model: Read. (Timeline, Generation 1: stale versions followed by the current version.)

Consistency model: Read up-to-date. (Timeline, Generation 1, v. 1 through v. 8: stale versions followed by the current version.)

Consistency model: Read ≥ v.6. (Timeline, Generation 1, v. 1 through v. 8: stale versions followed by the current version.)

Consistency model: Write. (Timeline, Generation 1: stale versions followed by the current version.)

Consistency model: Write if = v.7. The timeline marks ERROR. (Timeline, Generation 1, v. 1 through v. 8: stale versions followed by the current version.)

Consistency model. Mechanism: per-record mastership. Write if = v.7 is marked ERROR. (Timeline, Generation 1, v. 1 through v. 8: stale versions followed by the current version.)
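
The five calls on the consistency slides can be sketched against a single record that has one master copy and one lagging replica. This is a toy model under stated assumptions: the class and method names are hypothetical, replication is a manual `replicate()` call, and a failed conditional write raises an exception to stand in for the slide's ERROR.

```python
class VersionMismatch(Exception):
    """Stand-in for the ERROR shown on the 'Write if = v.7' slide."""


class Record:
    def __init__(self, value):
        self.master = (1, value)    # (version, value) at the record's master
        self.replica = (1, value)   # local copy, may lag behind the master

    def write(self, value):
        # All writes go through the master, which assigns the next version.
        version = self.master[0] + 1
        self.master = (version, value)
        return version

    def write_if_version(self, expected, value):
        # "Write if = v.N": succeed only if the master is still at version N.
        if self.master[0] != expected:
            raise VersionMismatch(f"expected v.{expected}, found v.{self.master[0]}")
        return self.write(value)

    def replicate(self):
        # Asynchronous replication, triggered by hand in this sketch.
        self.replica = self.master

    def read_any(self):
        # "Read": may return a stale version.
        return self.replica

    def read_latest(self):
        # "Read up-to-date": always the current version.
        return self.master

    def read_at_least(self, min_version):
        # "Read >= v.N": use the replica if it is fresh enough, else the master.
        return self.replica if self.replica[0] >= min_version else self.master


record = Record("a")
for value in "bcdefgh":
    record.write(value)     # master advances to v.8, replica stays at v.1
```

With the master at v.8 and the replica still at v.1, `record.read_any()` returns the stale v.1, `record.read_at_least(6)` falls back to the master, and `record.write_if_version(7, "x")` raises `VersionMismatch`.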

Experiments

Experimental setup. Production PNUTS code, enhanced with ordered table type. Three PNUTS regions: 2 west coast, 1 east coast; 5 storage units, 2 message brokers, 1 router. West: dual 2.8 GHz Xeon, 4 GB RAM, 6-disk RAID 5 array. East: quad 2.13 GHz Xeon, 4 GB RAM, 1 SATA disk. Workload: 1200-3600 requests/second; 0-50% writes; 80% locality.

Scalability

Request skew

Size of range scans

Related work. Distributed and parallel databases: especially query processing and transactions. BigTable, Dynamo, S3, SimpleDB, SQL Server Data Services, Cassandra. Distributed filesystems: Ceph, Boxwood, Sinfonia. Distributed (P2P) hash tables: Chord, Pastry, … Database replication: master-slave, epidemic/gossip, synchronous…

Conclusions and ongoing work. PNUTS is an interesting research product. Research: consistency, performance, fault tolerance, rich functionality. Product: make it work, keep it (relatively) simple, learn from experience and real applications. Ongoing work: indexes and materialized views; bundled updates; batch query processing.

Thanks! cooperb@yahoo-inc.com research.yahoo.com