Published by Vanessa Chase. Modified over 9 years ago.
1
Authors: Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni
Presenters: Daniel Burgener, Gautam Bhawsar
2
Road Map
Introduction; Background; Requirements; PNUTS Overview & Functionality; System Architecture & Applications; Experimental Results; Comparison to Competitors; Conclusion & Future Work
4
Introduction
PNUTS is a massively parallel, geographically distributed database system. It was designed by Yahoo! for use by its web applications and is shared between several applications.
6
Background
Pub/Sub model*: sending applications (publishers) and receiving applications (subscribers) communicate through an asynchronous messaging paradigm.
* Taken from http://msdn.microsoft.com/en-us/library/ms978603.aspx
8
Requirements
PNUTS is designed to meet the following requirements: scalability; response time and geographic scope; high availability and fault tolerance; relaxed consistency guarantees.
10
PNUTS Overview & Functionality
Data model & features; fault tolerance; pub-sub message system; record-level mastering; hosting.
11
PNUTS Overview & Functionality (Cont’d)
Functionality: data & query model; consistency model.
12
Data & Query Model
Simplified relational data model: data is organized into tables of records with attributes, and a “blob” attribute allows arbitrary structure inside a record.
Schemas are flexible: a new attribute can be added without halting query or update activity, and a record may leave attributes empty.
Query language: supports selection and projection over a single table; updates and deletes must specify the primary key.
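As an illustration of the data and query model above, here is a minimal in-memory sketch. The `Table` class and its `put`, `delete`, and `select` methods are hypothetical names chosen for this example and are not the PNUTS API; the sketch only mirrors the slide's points (flexible schema, empty attributes, single-table selection and projection, primary-key deletes).

```python
from typing import Any, Callable, Dict, List


class Table:
    """Toy single-table store (hypothetical); each record is a dict of attributes."""

    def __init__(self) -> None:
        self.records: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, **attrs: Any) -> None:
        # Flexible schema: new attributes can appear on any write.
        self.records.setdefault(key, {}).update(attrs)

    def delete(self, key: str) -> None:
        # Deletes address a record by primary key only.
        self.records.pop(key, None)

    def select(self, predicate: Callable[[Dict[str, Any]], bool],
               columns: List[str]) -> List[Dict[str, Any]]:
        # Selection (predicate) followed by projection (columns);
        # attributes a record does not have come back as None.
        return [{c: rec.get(c) for c in columns}
                for _, rec in sorted(self.records.items())
                if predicate(rec)]


users = Table()
users.put("alice", city="Paris")
users.put("bob", city="Rome", age=30)  # "age" is a newly added attribute
print(users.select(lambda r: r.get("age") is None, ["city", "age"]))
```

The query returns only the record for "alice", with its missing "age" attribute reported as empty.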
13
Consistency Model
Hides the complexity of replication from applications.
Sits between general serializability and eventual consistency.
Per-record timeline consistency: “all replicas of a given record apply all updates to the record in the same order.”
14
Consistency Model (Cont’d)
Supports a range of API calls with different levels of consistency: read-any; read-critical(required_version); read-latest; write; test-and-set-write(required_version).
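The difference between these calls can be sketched with one record that has a master copy and a single, possibly stale, replica. This is a simplified model under stated assumptions: the `Record` class, the `propagate` helper, and the `(version, value)` tuples are hypothetical and only illustrate the semantics the call names suggest, not the actual PNUTS client interface.

```python
class Record:
    """One record with a master copy and one lagging replica (hypothetical model)."""

    def __init__(self):
        self.master = (0, None)   # (version, value) at the master copy
        self.replica = (0, None)  # local copy; may be stale

    def write(self, value):
        # Every write goes to the master and bumps the version.
        version = self.master[0] + 1
        self.master = (version, value)
        return version

    def test_and_set_write(self, required_version, value):
        # Write only if the current version is exactly required_version.
        if self.master[0] != required_version:
            return None
        return self.write(value)

    def propagate(self):
        # Stand-in for asynchronous replication catching up.
        self.replica = self.master

    def read_any(self):
        # Cheapest read: may return a stale version.
        return self.replica

    def read_critical(self, required_version):
        # Returns a version at least as new as required_version.
        if self.replica[0] >= required_version:
            return self.replica
        return self.master

    def read_latest(self):
        # Always reflects every successful write.
        return self.master


r = Record()
v1 = r.write("a")
stale = r.read_any()            # replica has not caught up yet
fresh = r.read_critical(v1)     # falls back to the master copy
rejected = r.test_and_set_write(0, "b")   # version 0 is no longer current
accepted = r.test_and_set_write(v1, "b")
```

In this sketch a caller that just wrote version `v1` can use `read_critical(v1)` to be sure it sees its own write, while `read_any` trades freshness for lower latency.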
16
System Arch. & App.
Data tables are horizontally partitioned into groups of records called tablets.
17
System Arch. & App. (Cont’d)
Data storage & retrieval; replication & consistency; PNUTS applications.
18
Data Storage & Retrieval
Ordered table: the primary-key space of a table is divided into intervals, and each interval corresponds to one tablet. The router stores the interval mapping. For a given primary key, binary search over the intervals finds the tablet.
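The router lookup for an ordered table can be sketched with Python's standard `bisect` module. The boundary keys and tablet names below are made up for illustration; the only assumption is that tablet i covers the half-open key interval from its lower bound up to the next lower bound.

```python
from bisect import bisect_right

# Hypothetical interval mapping: ORDERED_BOUNDS[i] is the inclusive lower
# bound of the key interval served by ORDERED_TABLETS[i]. The first bound is
# the empty string so that every key falls into some interval.
ORDERED_BOUNDS = ["", "g", "n", "t"]
ORDERED_TABLETS = ["tablet-0", "tablet-1", "tablet-2", "tablet-3"]


def find_ordered_tablet(primary_key: str) -> str:
    # bisect_right returns the index of the first bound greater than the key,
    # so the interval containing the key is the one just before it.
    return ORDERED_TABLETS[bisect_right(ORDERED_BOUNDS, primary_key) - 1]


print(find_ordered_tablet("apple"))
print(find_ordered_tablet("g"))
print(find_ordered_tablet("zebra"))
```

Because the bounds are kept sorted, each lookup costs a logarithmic number of comparisons in the number of tablets.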
19
Data Storage & Retrieval (Cont’d)
Hash-organized table: an n-bit hash function H() is used, with 0 ≤ H() < 2^n. The hash space [0 ... 2^n) is divided into intervals, and each interval corresponds to a single tablet.
To map a key to a tablet: 1. Hash the key. 2. Search the set of intervals using binary search.
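The two mapping steps for a hash-organized table can be sketched as follows. The choice of n = 16, the use of MD5 as the hash, and the four equal-width intervals are assumptions made for this example; the slides do not say which hash function or interval layout PNUTS uses.

```python
import hashlib
from bisect import bisect_right

N_BITS = 16  # assumed value of n for this sketch

# Hypothetical interval mapping over [0, 2^n): HASH_BOUNDS[i] is the inclusive
# lower bound of the hash interval served by tablet number i.
HASH_BOUNDS = [0x0000, 0x4000, 0x8000, 0xC000]


def H(key: str) -> int:
    # Step 1: hash the key into the range 0 <= H(key) < 2^n.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (1 << N_BITS)


def find_hash_tablet(key: str) -> int:
    # Step 2: binary search over the interval bounds.
    return bisect_right(HASH_BOUNDS, H(key)) - 1


print(find_hash_tablet("alice"))
```

Hashing spreads keys evenly across tablets, at the cost of losing key order, which is why range scans are a feature of ordered tables only.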
20
Replication & Consistency
No separate redo log. The system uses asynchronous replication to ensure low-latency updates.
Yahoo! Message Broker (YMB) is used for replication & logging because: 1. It takes multiple steps to ensure messages are not lost before they are applied to the database. 2. YMB is designed for wide-area replication.
21
Replication & Consistency (Cont’d)
Consistency via YMB & mastership: per-record timeline consistency is achieved by designating one copy of a record as the master and directing all updates to the master copy. This is called record-level mastering.
Mastership is assigned on a record-by-record basis, so different records in the same table can be mastered in different clusters.
All updates are propagated to non-master replicas by publishing them to YMB, which delivers them in commit order.
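Record-level mastering can be sketched with two regions and a FIFO queue standing in for YMB. Everything named here (`Region`, `update`, `deliver`, the region names, and the `masters` map) is hypothetical; the sketch only shows that updates are committed at each record's master and reach the other replicas in commit order.

```python
from collections import deque


class Region:
    """One replica of the table (hypothetical): key -> (sequence number, value)."""

    def __init__(self, name):
        self.name = name
        self.store = {}


def update(key, value, masters, regions, broker):
    # All updates for a record are directed to that record's master region.
    master = regions[masters[key]]
    seq = master.store.get(key, (0, None))[0] + 1
    master.store[key] = (seq, value)
    # Publish the committed update; the FIFO queue preserves commit order.
    broker.append((key, seq, value))


def deliver(masters, regions, broker):
    # Asynchronous propagation: non-master replicas apply updates in order.
    while broker:
        key, seq, value = broker.popleft()
        for name, region in regions.items():
            if name != masters[key]:
                region.store[key] = (seq, value)


regions = {"west": Region("west"), "east": Region("east")}
masters = {"alice": "west", "bob": "east"}  # mastership is per record
broker = deque()                            # stand-in for YMB

update("alice", "a1", masters, regions, broker)
update("alice", "a2", masters, regions, broker)
update("bob", "b1", masters, regions, broker)
seen_before = regions["east"].store.get("alice")  # not yet replicated
deliver(masters, regions, broker)
```

After `deliver` runs, both regions hold the same latest version of each record, even though "alice" and "bob" are mastered in different regions.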
22
Replication & Consistency (Cont’d)
Recovery from failure (3 steps): 1. The tablet controller requests a copy from the source tablet. 2. A “checkpoint message” is published to YMB. 3. The source tablet is copied to the destination region.
23
PNUTS Applications
User database; social applications; content meta-data; listings management; session data.
25
Experimental Results
Three-region PNUTS cluster: 2 regions on the west coast and 1 on the east coast.
Storage engine for hash tables: “Yahoo! proprietary disk-based hashtable”.
Storage engine for ordered tables: MySQL using InnoDB.
Written primarily in C++, with some components written in PHP & Perl.
26
Experimental Results (Cont’d)
Experimental parameters
The following experiments show the impact of several factors on the average latency per request.
27
Varying Load
28
Varying Read/Write Ratio
29
Varying Skew
30
Varying Number of Storage Units
31
Varying Size of Range Scan
33
Comparison to Competitors
Google BigTable: geographic replication; secondary indexes; materialized views; ability to create multiple tables; hash-organized tables.
34
Comparison to Competitors
Amazon Dynamo: eventual consistency is too weak; no support for ordered tables.
35
Comparison to Competitors
Sharding: no automated data migration; no shard splitting.
36
Comparison to Competitors
DFS: hard to scale; less rich database functionality.
38
Conclusion
PNUTS, Yahoo!'s hosted data serving platform, is a massively parallel, geographically distributed database system designed by Yahoo! to be used by its web applications.
The architecture of PNUTS is based on a record-level consistency model.
It delivers data management as a hosted service.
39
Future Work
Improving query functionality: enforce constraints such as referential integrity; support complex ad hoc queries such as join & group-by.
Query optimization techniques: provide a better technique than simple incremental scanning.
Add more API calls to the consistency model: bundled updates; relaxed consistency.
40
Thank You