PNUTS: YAHOO!’S HOSTED DATA SERVING PLATFORM FENGLI ZHANG.


1 PNUTS: YAHOO!’S HOSTED DATA SERVING PLATFORM FENGLI ZHANG

2 CONTENT
- REQUIREMENTS OF WEB APPLICATIONS
- PNUTS ARCHITECTURE
- EXPERIMENT
- CONCLUSION

3 REQUIREMENTS OF WEB APPLICATIONS
- SCALABILITY: ARCHITECTURAL SCALABILITY, SCALE LINEARLY
- GEOGRAPHIC SCOPE: DATA REPLICAS ON MULTIPLE CONTINENTS
- HIGH AVAILABILITY: DESPITE FAILURES, APPS WILL STILL BE ABLE TO READ
- RELAXED CONSISTENCY GUARANTEES: TOLERATE STALE OR REORDERED DATA

4 WHAT IS PNUTS?
PNUTS: A MASSIVELY PARALLEL AND GEOGRAPHICALLY DISTRIBUTED DATABASE SYSTEM FOR YAHOO!’S WEB APPLICATIONS.
PNUTS PROVIDES:
- DATA STORAGE ORGANIZED AS HASHED OR ORDERED TABLES
- LOW LATENCY FOR LARGE NUMBERS OF CONCURRENT REQUESTS INCLUDING UPDATES AND QUERIES
- NOVEL PER-RECORD CONSISTENCY GUARANTEES

5 DETAILED ARCHITECTURE

6 TABLET SPLITTING

7 DATA STORAGE AND RETRIEVAL
- DATA STORAGE IS ORGANIZED AS HASHED OR ORDERED TABLES.
- TO DETERMINE WHICH STORAGE UNIT IS RESPONSIBLE FOR A GIVEN RECORD TO BE READ OR WRITTEN BY THE CLIENT, WE MUST FIRST DETERMINE WHICH TABLET CONTAINS THE RECORD, AND THEN DETERMINE WHICH STORAGE UNIT HAS THAT TABLET.
- BOTH OF THESE FUNCTIONS ARE CARRIED OUT BY THE ROUTER.
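The two-step lookup described on this slide can be sketched as follows. This is only an illustration under stated assumptions: the 16-bit hash space, the tablet boundaries, the storage unit names (`su1`, `su2`, `su3`) and the function names are all hypothetical, and the slides do not say which hash function PNUTS uses (MD5 is a stand-in here).

```python
import bisect
import hashlib
from typing import Tuple

# Hypothetical interval mapping over an assumed hash space [0, 2**16).
# TABLET_BOUNDS[i] is the inclusive lower bound of tablet i.
TABLET_BOUNDS = [0x0000, 0x4000, 0x8000, 0xC000]
# Hypothetical placement: the storage unit assumed to hold each tablet.
TABLET_TO_UNIT = ["su1", "su3", "su2", "su1"]


def key_hash(key: str) -> int:
    """Map a record key to a position in the hash space (MD5 is a stand-in)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")  # value in 0..65535


def find_tablet(position: int) -> int:
    """Step 1: binary-search the interval mapping for the tablet holding position."""
    return bisect.bisect_right(TABLET_BOUNDS, position) - 1


def route(key: str) -> Tuple[int, str]:
    """Step 2: look up which storage unit has that tablet."""
    tablet = find_tablet(key_hash(key))
    return tablet, TABLET_TO_UNIT[tablet]
```

For an ordered table, the same binary search would presumably run over the primary keys themselves rather than over their hashes.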

8 DATA STORAGE AND RETRIEVAL

9
- ROUTERS CONTAIN ONLY A CACHED COPY OF THE INTERVAL MAPPING
- THE MAPPING IS OWNED BY THE TABLET CONTROLLER
- THE TABLET CONTROLLER DETERMINES WHEN TO MOVE A TABLET BETWEEN STORAGE UNITS AND WHEN A LARGE TABLET MUST BE SPLIT
- ROUTERS PERIODICALLY POLL THE TABLET CONTROLLER TO GET ANY CHANGES TO THE MAPPING
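A minimal sketch of the owner/cache relationship on this slide. The class names, the version counter and the `poll` method are assumptions made for illustration; the slides only state that the tablet controller owns the mapping and that routers poll it for changes.

```python
class TabletController:
    """Owns the authoritative tablet-to-storage-unit mapping."""

    def __init__(self, mapping):
        self._mapping = dict(mapping)
        self.version = 1  # assumed change counter, not described in the slides

    def move_tablet(self, tablet, new_unit):
        """Reassign a tablet to another storage unit and bump the version."""
        self._mapping[tablet] = new_unit
        self.version += 1

    def snapshot(self):
        """Return the current version and a copy of the mapping."""
        return self.version, dict(self._mapping)


class Router:
    """Holds only a cached copy of the mapping, refreshed by polling."""

    def __init__(self, controller):
        self._controller = controller
        self.version, self.cache = controller.snapshot()

    def poll(self):
        """Refresh the cache if the controller has changed; return True if it did."""
        version, mapping = self._controller.snapshot()
        if version == self.version:
            return False
        self.version, self.cache = version, mapping
        return True

    def unit_for(self, tablet):
        """Answer from the cache, which may be stale between polls."""
        return self.cache[tablet]
```

Between polls a router can hand out a stale storage unit, so a real system needs some way to recover from a misdirected request; that path is not modeled here.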

10 QUERY PROCESSING ACCESSING DATA

11 MULTI-RECORD REQUEST

12 UPDATES

13 ASYNCHRONOUS REPLICATION AND CONSISTENCY
EXAMPLE OF EVENTUAL CONSISTENCY: A USER WISHES TO DO A SEQUENCE OF 2 UPDATES TO HIS RECORD:
- U1: REMOVE HIS MOTHER FROM THE LIST OF PEOPLE WHO CAN VIEW HIS PHOTOS
- U2: POST SPRING-BREAK PHOTOS
IF A REPLICA APPLIES U2 BEFORE U1, A READER OF THAT REPLICA CAN SEE A STATE OF THE RECORD THAT NEVER SHOULD HAVE EXISTED: THE PHOTOS HAVE BEEN POSTED BUT THE CHANGE IN ACCESS CONTROL HAS NOT TAKEN PLACE.
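The anomaly can be reproduced with a small simulation. The record layout and the helper names below are hypothetical; the point is only that a replica applying the two updates in the opposite order exposes an intermediate state the user never intended, even though both replicas end up identical.

```python
import copy


def u1_remove_mother(record):
    """U1: remove the mother from the list of allowed viewers."""
    record["viewers"].discard("mother")


def u2_post_photos(record):
    """U2: post the spring-break photos."""
    record["photos"].append("spring-break")


def new_record():
    """Hypothetical starting state of the user's record."""
    return {"viewers": {"mother", "friend"}, "photos": []}


def apply_in_order(record, updates):
    """Apply updates one at a time and return each state a reader could observe."""
    states = []
    for update in updates:
        update(record)
        states.append(copy.deepcopy(record))
    return states


def mother_can_see_photos(state):
    """True for the state that never should have existed."""
    return "mother" in state["viewers"] and bool(state["photos"])


# Replica that applies the updates in the order the user issued them.
origin_states = apply_in_order(new_record(), [u1_remove_mother, u2_post_photos])
# Replica that receives the same updates reordered.
reordered_states = apply_in_order(new_record(), [u2_post_photos, u1_remove_mother])
```

Only the reordered replica ever passes through a state where `mother_can_see_photos` is true.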

14 RECORD TIMELINE CONSISTENCY
RECORD-LEVEL MASTERING:
- ONE OF THE REPLICAS IS DESIGNATED AS THE MASTER, INDEPENDENTLY FOR EACH RECORD, AND ALL UPDATES TO THAT RECORD ARE FORWARDED TO THE MASTER.
- THE REPLICA RECEIVING THE MAJORITY OF WRITE REQUESTS FOR A PARTICULAR RECORD BECOMES THE MASTER FOR THAT RECORD.
PER-RECORD TIMELINE CONSISTENCY:
- ALL REPLICAS OF A GIVEN RECORD APPLY ALL UPDATES TO THE RECORD IN THE SAME ORDER.
- THE RECORD CARRIES A SEQUENCE NUMBER THAT IS INCREMENTED ON EVERY WRITE.
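A minimal sketch of how a per-record sequence number can force every replica onto the same timeline. The class names are hypothetical, and the buffering of out-of-order arrivals is an assumption made so the example is self-contained; the slides do not describe how updates are delivered to replicas.

```python
class RecordMaster:
    """The master replica for one record: stamps each write with the next sequence number."""

    def __init__(self):
        self.seq = 0

    def write(self, value):
        self.seq += 1
        return (self.seq, value)


class Replica:
    """Applies updates strictly in sequence-number order."""

    def __init__(self):
        self.seq = 0          # sequence number of the last applied update
        self.value = None     # the single version of the record kept here
        self.history = []     # values in the order they were applied
        self._pending = {}    # assumed buffer for updates that arrive early

    def receive(self, update):
        seq, value = update
        if seq <= self.seq:
            return  # already applied; ignore the duplicate
        self._pending[seq] = value
        # Apply every update that is now contiguous with what was applied so far.
        while self.seq + 1 in self._pending:
            self.seq += 1
            self.value = self._pending.pop(self.seq)
            self.history.append(self.value)
```

Usage: have the master stamp three writes, deliver them in order to one replica and shuffled to another; both replicas end with the same `history`, so neither ever exposes a state outside the record's timeline.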

15 PER-RECORD TIMELINE CONSISTENCY WE (CURRENTLY) KEEP ONLY ONE VERSION OF A RECORD AT EACH REPLICA.

16 RECORD TIMELINE CONSISTENCY
TRANSACTIONS:
- ALICE CHANGES STATUS FROM “SLEEPING” TO “AWAKE”
- ALICE CHANGES LOCATION FROM “HOME” TO “WORK”

17 TIMELINE CONSISTENCY COMES AT A PRICE
- WRITES NOT ORIGINATING IN THE RECORD’S MASTER REGION ARE FORWARDED TO THE MASTER AND HAVE LONGER LATENCY
- WHEN THE MASTER REGION IS DOWN, THE RECORD IS UNAVAILABLE FOR WRITES

18 EXPERIMENTAL SETUP
- THREE PNUTS REGIONS: 2 WEST COAST, 1 EAST COAST
- 5 STORAGE UNITS, 2 MESSAGE BROKERS, 1 ROUTER
- WEST: DUAL 2.8 GHZ XEON, 4 GB RAM, 6-DISK RAID 5 ARRAY
- EAST: QUAD 2.13 GHZ XEON, 4 GB RAM, 1 SATA DISK
- WORKLOAD: 1200-3600 REQUESTS/SECOND, 0-50% WRITES, 80% LOCALITY

19 INSERT
INSERTS REQUIRED 75.6 MS PER INSERT IN WEST 1 (TABLET MASTER), 131.5 MS PER INSERT INTO THE NON-MASTER WEST 2, AND 315.5 MS PER INSERT INTO THE NON-MASTER EAST.

20

21 SCALABILITY

22 CONCLUSION AND ONGOING WORK
PNUTS IS AN INTERESTING RESEARCH PRODUCT
- RESEARCH: CONSISTENCY, PERFORMANCE, FAULT TOLERANCE, RICH FUNCTIONALITY
- PRODUCT: MAKE IT WORK, KEEP IT (RELATIVELY) SIMPLE, LEARN FROM EXPERIENCE AND REAL APPLICATIONS
ONGOING WORK
- INDEXES AND MATERIALIZED VIEWS
- BUNDLED UPDATES
- BATCH QUERY PROCESSING

23 THANK YOU! QUESTIONS?

