Authors Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana.


1 Authors: Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni. Presenters: Daniel Burgener, Gautam Bhawsar

2 Road Map Introduction Background Requirements PNUTS Overview & Functionality System Architecture & Applications Experimental Results Comparison to Competitors Conclusion & Future Work

3 Road Map Introduction Background Requirements PNUTS Overview & Functionality System Architecture & Applications Experimental Results Comparison to Competitors Conclusion & Future Work

4 Introduction PNUTS is: a massively parallel, geographically distributed database system; designed by Yahoo!; used by its web applications; shared between several applications

5 Road Map Introduction Background Requirements PNUTS Overview & Functionality System Architecture & Applications Experimental Results Comparison to Competitors Conclusion & Future Work

6 Background Pub/Sub Model: sending applications (publishers) and receiving applications (subscribers) communicate through an asynchronous messaging paradigm. * Taken from http://msdn.microsoft.com/en-us/library/ms978603.aspx
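The pub/sub model on this slide can be illustrated with a minimal in-process sketch. The `Broker` class and its `subscribe`, `publish`, and `poll` methods are hypothetical names invented for this example; they are not the API of YMB or of any real messaging system. The point is only that the publisher does not wait for subscribers, and each subscriber later receives messages in publish order.

```python
from collections import defaultdict, deque


class Broker:
    """Toy message broker (hypothetical, for illustration only)."""

    def __init__(self):
        # topic -> {subscriber name -> queue of pending messages}
        self._queues = defaultdict(dict)

    def subscribe(self, topic, name):
        self._queues[topic][name] = deque()

    def publish(self, topic, message):
        # The publisher returns immediately; subscribers consume later.
        for queue in self._queues[topic].values():
            queue.append(message)

    def poll(self, topic, name):
        """Return and clear one subscriber's pending messages, in publish order."""
        queue = self._queues[topic][name]
        messages = list(queue)
        queue.clear()
        return messages


broker = Broker()
broker.subscribe("updates", "east")
broker.subscribe("updates", "west")
broker.publish("updates", "u1")
broker.publish("updates", "u2")
east_messages = broker.poll("updates", "east")
west_messages = broker.poll("updates", "west")
```

Each subscriber has its own queue, so a slow subscriber does not block the publisher or the other subscribers.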

7 Road Map Introduction Background Requirements PNUTS Overview & Functionality System Architecture & Applications Experimental Results Comparison to Competitors Conclusion & Future Work

8 Requirements PNUTS is designed to meet the following requirements: Scalability; Response Time and Geographic Scope; High Availability & Fault Tolerance; Relaxed Consistency Guarantees

9 Road Map Introduction Background Requirements PNUTS Overview & Functionality System Architecture & Applications Experimental Results Comparison to Competitors Conclusion & Future Work

10 PNUTS Overview & Functionality Data Model & Features Fault Tolerance Pub-Sub Message System Record-level Mastering Hosting

11 PNUTS Overview & Functionality (Cont’d) Functionality Data & Query Model Consistency Model

12 Data & Query Model Simplified relational data model: organizes data into tables of records with attributes; allows arbitrary structure inside a record (“blob”). Schemas are flexible: a new attribute can be added without halting query or update activity; records may leave attributes empty. Query language: supports selection and projection on a single table; updates & deletes by primary key only

13 Consistency Model Hides the complexity of replication. Sits between general serializability & eventual consistency. Per-record timeline consistency: “All replicas of a given record apply all updates to the record in the same order”

14 Consistency Model (Cont’d) Supports a range of API calls with different levels of consistency: Read-any; Read-critical(required_version); Read-latest; Write; Test-and-set-write(required_version)
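The differences between these calls can be sketched with a toy single-record model. This is an illustration only: the `Record` class, its one lagging replica, and the `sync_replica` helper are assumptions made for the example, not the PNUTS client API. Versions are plain integers that increase with each write.

```python
class StaleVersionError(Exception):
    """Raised when test-and-set-write sees an unexpected version."""


class Record:
    """Toy record with one master copy and one possibly stale replica."""

    def __init__(self):
        self.master = (0, None)   # (version, value)
        self.replica = (0, None)  # may lag behind the master

    def write(self, value):
        version = self.master[0] + 1
        self.master = (version, value)
        return version

    def test_and_set_write(self, required_version, value):
        # Write only if the current version is exactly the one the caller read.
        if self.master[0] != required_version:
            raise StaleVersionError(required_version)
        return self.write(value)

    def sync_replica(self):
        # Hypothetical helper standing in for asynchronous replication.
        self.replica = self.master

    def read_any(self):
        # Cheapest read: may return a stale (but valid) version.
        return self.replica

    def read_critical(self, required_version):
        # Returns a version at least as new as required_version.
        if self.replica[0] >= required_version:
            return self.replica
        return self.master

    def read_latest(self):
        return self.master


rec = Record()
v1 = rec.write("a")
rec.sync_replica()
v2 = rec.write("b")  # the replica has not seen this update yet
```

In this state `read_any` may still return version 1, while `read_critical(v2)` and `read_latest` must return version 2, and a `test_and_set_write` that expects version 1 is rejected.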

15 Road Map Introduction Background Requirements PNUTS Overview & Functionality System Architecture & Applications Experimental Results Comparison to Competitors Conclusion & Future Work

16 System Arch. & App. Data tables are horizontally partitioned into groups of records called tablets

17 System Arch. & App. (Cont’d) Data Storage & Retrieval Replication & Consistency PNUTS Applications

18 Data Storage & Retrieval Ordered table: the primary-key space of a table is divided into intervals; each interval corresponds to one tablet; the router stores the interval mapping; for a given primary key, binary search is used to find the tablet
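The router lookup for an ordered table can be sketched with Python's standard `bisect` module. The boundary keys and tablet names below are made up for illustration; the sketch assumes each tablet covers the half-open key interval starting at its lower bound.

```python
import bisect

# Hypothetical interval map: tablets[i] covers keys in
# [lower_bounds[i], lower_bounds[i + 1]); "" stands for the minimum key.
lower_bounds = ["", "banana", "grape", "peach"]
tablets = ["tablet-0", "tablet-1", "tablet-2", "tablet-3"]


def find_tablet(key):
    """Binary-search the sorted lower bounds for the interval containing key."""
    index = bisect.bisect_right(lower_bounds, key) - 1
    return tablets[index]
```

Because the lookup is a binary search over the interval boundaries, its cost grows logarithmically with the number of tablets.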

19 Data Storage & Retrieval (Cont’d) Hash-organized table: n-bit hash function H(), 0 ≤ H() < 2^n; the range [0, 2^n) is divided into intervals; each interval corresponds to a single tablet. To map a key to a tablet: 1. Hash the key. 2. Search the set of intervals using binary search
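The two-step mapping for hash-organized tables can be sketched as follows. The choice of MD5 truncated to 16 bits and the four equal-width intervals are assumptions for the example; the slides do not say which hash function or interval layout PNUTS uses.

```python
import bisect
import hashlib

N_BITS = 16  # assumed hash width n, so 0 <= H(key) < 2**16

# Hypothetical split of [0, 2**16) into four intervals, one per tablet.
lower_bounds = [0x0000, 0x4000, 0x8000, 0xC000]


def h(key):
    """Step 1: hash the key into [0, 2**N_BITS)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (1 << N_BITS)


def tablet_for_hash(value):
    """Step 2: binary-search the interval lower bounds."""
    return bisect.bisect_right(lower_bounds, value) - 1


def find_tablet(key):
    return tablet_for_hash(h(key))
```

After hashing, the lookup is the same binary search used for ordered tables, just over hash values instead of primary keys.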

20 Replication & Consistency No separate redo log. The system uses asynchronous replication to ensure low-latency updates. Yahoo! Message Broker (YMB) is used for replication & logging because: 1. YMB takes multiple steps to ensure messages are not lost before they are applied to the database; 2. YMB is designed for wide-area replication

21 Replication & Consistency (Cont’d) Consistency via YMB & mastership: per-record timeline consistency; one copy of a record is designated the master; all updates are directed to the master copy. This is called record-level mastering: mastership is assigned on a record-by-record basis; different records in the same table can be mastered in different clusters. All updates are propagated to non-master replicas by publishing them to YMB, and are delivered in commit order
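Record-level mastering can be sketched with a toy model in which a Python list stands in for YMB. The `Cluster` and `Region` classes, the rule that the first writer of a key becomes its master, and the `deliver` method are all assumptions made for this illustration, not the PNUTS implementation.

```python
class Region:
    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> (version, value)

    def apply(self, key, version, value):
        self.store[key] = (version, value)


class Cluster:
    """Toy multi-region table with one master region per record."""

    def __init__(self, region_names):
        self.regions = {name: Region(name) for name in region_names}
        self.master_of = {}  # key -> name of the master region
        self.log = []        # stand-in for YMB: updates in commit order
        self.delivered = 0

    def write(self, origin, key, value):
        # Assumption: the first region to write a key becomes its master.
        master_name = self.master_of.setdefault(key, origin)
        master = self.regions[master_name]
        # Writes from any region are forwarded to the master, which
        # assigns the next version and commits the update.
        version = master.store.get(key, (0, None))[0] + 1
        master.apply(key, version, value)
        self.log.append((master_name, key, version, value))
        return master_name, version

    def deliver(self):
        """Apply pending updates to non-master replicas in commit order."""
        for master_name, key, version, value in self.log[self.delivered:]:
            for region in self.regions.values():
                if region.name != master_name:
                    region.apply(key, version, value)
        self.delivered = len(self.log)


cluster = Cluster(["west", "east"])
first = cluster.write("west", "alice", "v1")
second = cluster.write("east", "alice", "v2")  # forwarded to the west master
third = cluster.write("east", "bob", "x")      # bob is mastered in the east
east_before = dict(cluster.regions["east"].store)
cluster.deliver()
```

Because every update to a record passes through its single master and replicas apply the log in order, all replicas see the same sequence of versions for that record.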

22 Replication & Consistency (Cont’d) Recovery from failure (3 steps): 1. The tablet controller requests a copy from the source tablet. 2. A “checkpoint message” is published to YMB. 3. The source tablet is copied to the destination region.

23 PNUTS Applications User Database Social Applications Content Meta-Data Listings Management Session Data

24 Road Map Introduction Background Requirements PNUTS Overview & Functionality System Architecture & Applications Experimental Results Comparison to Competitors Conclusion & Future Work

25 Experimental Results Three-region PNUTS cluster: 2 regions on the west coast and 1 on the east coast. Storage engine for hash tables: “Yahoo! proprietary disk-based hashtable”. Storage engine for ordered tables: MySQL using InnoDB. Written primarily in C++; some components written in PHP & Perl

26 Experimental Results (Cont’d) Experimental parameters: the following experiments show the impact of several factors on the average request latency

27 Varying Load

28 Varying Read/Write Ratio

29 Varying Skew

30 Varying Number of Storage Units

31 Varying Size of Range Scan

32 Road Map Introduction Background Requirements PNUTS Overview & Functionality System Architecture & Applications Experimental Results Comparison to Competitors Conclusion & Future Work

33 Comparison to Competitors Google BigTable: to the authors' knowledge, no support for geographic replication, secondary indexes, materialized views, creating multiple tables, or hash-organized tables

34 Comparison to Competitors Amazon Dynamo Eventual consistency too weak No support for ordered tables

35 Comparison to Competitors Sharding No automated data migration No shard splitting

36 Comparison to Competitors DFS Hard to scale Less rich database functionality

37 Road Map Introduction Background Requirements PNUTS Overview & Functionality System Architecture & Applications Experimental Results Comparison to Competitors Conclusion & Future Work

38 Conclusion PNUTS is: a massively parallel, geographically distributed database system; designed by Yahoo! to be used by its web applications; Yahoo!'s hosted data serving platform. The architecture of PNUTS is based on a record-level consistency model. It delivers data management as a hosted service

39 Future Work Improving query functionality: enforce constraints such as referential integrity; complex ad hoc queries such as join & group-by. Query optimization techniques: provide a better technique than simple incremental scanning. Add more API calls to the consistency model: Bundled Update; Relaxed Consistency

40 Thank You

