PNUTS: YAHOO!’S HOSTED DATA SERVING PLATFORM FENGLI ZHANG
CONTENT: REQUIREMENTS OF WEB APPLICATIONS; PNUTS ARCHITECTURE; EXPERIMENT; CONCLUSION
REQUIREMENTS OF WEB APPLICATIONS. SCALABILITY: ARCHITECTURAL SCALABILITY, SCALE LINEARLY. GEOGRAPHIC SCOPE: DATA REPLICAS ON MULTIPLE CONTINENTS. HIGH AVAILABILITY: UNDER FAILURES, APPS WILL STILL BE ABLE TO READ. RELAXED CONSISTENCY GUARANTEES: TOLERATE STALE OR REORDERED DATA.
WHAT IS PNUTS? PNUTS: A MASSIVELY PARALLEL AND GEOGRAPHICALLY DISTRIBUTED DATABASE SYSTEM FOR YAHOO!’S WEB APPLICATIONS. PNUTS PROVIDES: DATA STORAGE ORGANIZED AS HASHED OR ORDERED TABLES; LOW LATENCY FOR LARGE NUMBERS OF CONCURRENT REQUESTS, INCLUDING UPDATES AND QUERIES; NOVEL PER-RECORD CONSISTENCY GUARANTEES.
DETAILED ARCHITECTURE
TABLET SPLITTING
DATA STORAGE AND RETRIEVAL. DATA STORAGE IS ORGANIZED AS HASHED OR ORDERED TABLES. IN ORDER TO DETERMINE WHICH STORAGE UNIT IS RESPONSIBLE FOR A GIVEN RECORD TO BE READ OR WRITTEN BY THE CLIENT, WE MUST FIRST DETERMINE WHICH TABLET CONTAINS THE RECORD, AND THEN DETERMINE WHICH STORAGE UNIT HAS THAT TABLET. BOTH OF THESE FUNCTIONS ARE CARRIED OUT BY THE ROUTER.
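THE TWO-STEP LOOKUP ABOVE CAN BE SKETCHED AS FOLLOWS. THIS IS A MINIMAL ILLUSTRATION, NOT THE PNUTS IMPLEMENTATION: THE CLASS `Router`, ITS METHODS, THE SPLIT-POINT REPRESENTATION, AND THE USE OF MD5 FOR HASHED TABLES ARE ALL ASSUMPTIONS MADE FOR THE EXAMPLE.

```python
import bisect
import hashlib


class Router:
    """Toy router (hypothetical): record key -> tablet -> storage unit."""

    def __init__(self, boundaries, tablet_to_unit, hashed=False):
        # boundaries: sorted split points of the key (or hash) space.
        # N split points define N + 1 tablets, numbered 0..N.
        self.boundaries = list(boundaries)
        self.tablet_to_unit = dict(tablet_to_unit)
        self.hashed = hashed

    def _position(self, key):
        # Hashed tables place a record by the hash of its key (MD5 is an
        # assumption here); ordered tables place it by the key itself.
        if self.hashed:
            return hashlib.md5(key.encode("utf-8")).hexdigest()
        return key

    def tablet_for(self, key):
        # Step 1: binary search over the split points finds the tablet.
        return bisect.bisect_right(self.boundaries, self._position(key))

    def storage_unit_for(self, key):
        # Step 2: look up which storage unit currently holds that tablet.
        return self.tablet_to_unit[self.tablet_for(key)]


# Ordered table with three split points, hence four tablets.
ordered = Router(
    boundaries=["banana", "grape", "peach"],
    tablet_to_unit={0: "su1", 1: "su2", 2: "su1", 3: "su3"},
)
print(ordered.tablet_for("lemon"), ordered.storage_unit_for("lemon"))
```

A BINARY SEARCH KEEPS THE LOOKUP LOGARITHMIC IN THE NUMBER OF TABLETS, WHICH MATTERS BECAUSE EVERY READ AND WRITE PASSES THROUGH THIS STEP.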
DATA STORAGE AND RETRIEVAL
ROUTERS CONTAIN ONLY A CACHED COPY OF THE INTERVAL MAPPING. THE MAPPING IS OWNED BY THE TABLET CONTROLLER. THE TABLET CONTROLLER DETERMINES WHEN TO MOVE A TABLET BETWEEN STORAGE UNITS AND WHEN A LARGE TABLET MUST BE SPLIT. ROUTERS PERIODICALLY POLL THE TABLET CONTROLLER TO GET ANY CHANGES TO THE MAPPING.
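THE RELATIONSHIP BETWEEN THE ROUTER'S CACHE AND THE TABLET CONTROLLER CAN BE SKETCHED AS BELOW. THE CLASSES `TabletController` AND `CachingRouter` AND THE VERSION COUNTER USED TO DETECT CHANGES ARE HYPOTHETICAL; THE SLIDES DO NOT SAY HOW CHANGES ARE DETECTED.

```python
class TabletController:
    """Owns the authoritative tablet -> storage unit mapping (toy model)."""

    def __init__(self, mapping):
        self._mapping = dict(mapping)
        self.version = 1  # assumed change counter, bumped on every change

    def move_tablet(self, tablet, new_unit):
        # The controller decides when a tablet moves between storage units.
        self._mapping[tablet] = new_unit
        self.version += 1

    def snapshot(self):
        # Return a copy so routers never share the controller's state.
        return self.version, dict(self._mapping)


class CachingRouter:
    """Holds only a cached copy of the mapping and refreshes it by polling."""

    def __init__(self, controller):
        self.controller = controller
        self.version, self.cache = controller.snapshot()

    def poll(self):
        # Called periodically; returns True if the cache was refreshed.
        version, mapping = self.controller.snapshot()
        if version == self.version:
            return False
        self.version, self.cache = version, mapping
        return True

    def storage_unit_for_tablet(self, tablet):
        return self.cache[tablet]


controller = TabletController({0: "su1", 1: "su2"})
router = CachingRouter(controller)
controller.move_tablet(1, "su3")
stale = router.storage_unit_for_tablet(1)   # still the old unit
router.poll()
fresh = router.storage_unit_for_tablet(1)   # now the new unit
print(stale, fresh)
```

BETWEEN POLLS THE ROUTER MAY HOLD A STALE MAPPING, AS THE `stale` LOOKUP SHOWS; BECAUSE THE CONTROLLER OWNS THE MAPPING, THE ROUTER ITSELF HOLDS NO STATE THAT CANNOT BE REBUILT.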
QUERY PROCESSING ACCESSING DATA
MULTI-RECORD REQUEST
UPDATES
ASYNCHRONOUS REPLICATION AND CONSISTENCY. EXAMPLE OF EVENTUAL CONSISTENCY: A USER WISHES TO DO A SEQUENCE OF 2 UPDATES TO HIS RECORD. U1: REMOVE HIS MOTHER FROM THE LIST OF PEOPLE WHO CAN VIEW HIS PHOTOS. U2: POST SPRING-BREAK PHOTOS. IF A REPLICA APPLIES U2 BEFORE U1, A READER IS ABLE TO READ A STATE OF THE RECORD THAT NEVER SHOULD HAVE EXISTED: THE PHOTOS HAVE BEEN POSTED BUT THE CHANGE IN ACCESS CONTROL HAS NOT TAKEN PLACE.
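THE ANOMALY CAN BE REPRODUCED WITH A SMALL SIMULATION. THE RECORD LAYOUT (`viewers`, `photos`) AND THE HELPER NAMES ARE INVENTED FOR THE EXAMPLE; ONLY THE TWO UPDATES U1 AND U2 COME FROM THE SLIDE.

```python
def apply_update(record, update):
    """Return a new record with the update's fields overwritten."""
    new_record = dict(record)
    new_record.update(update)
    return new_record


def observable_states(initial, updates):
    """Every state a reader could see while updates are applied in order."""
    states = [initial]
    for update in updates:
        states.append(apply_update(states[-1], update))
    return states


def mother_can_see_photos(state):
    return "mother" in state["viewers"] and len(state["photos"]) > 0


initial = {"viewers": frozenset({"mother", "friend"}), "photos": ()}
u1 = {"viewers": frozenset({"friend"})}     # U1: remove mother from viewers
u2 = {"photos": ("spring-break.jpg",)}      # U2: post spring-break photos

# A replica that applies the updates in the order they were issued.
in_order = observable_states(initial, [u1, u2])
# A replica that receives them reordered, as eventual consistency permits.
reordered = observable_states(initial, [u2, u1])

print(any(mother_can_see_photos(s) for s in in_order))
print(any(mother_can_see_photos(s) for s in reordered))
```

BOTH REPLICAS END IN THE SAME FINAL STATE; THE DIFFERENCE IS THE INTERMEDIATE STATE EXPOSED BY THE REORDERED REPLICA.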
RECORD TIMELINE CONSISTENCY. RECORD-LEVEL MASTERING: ONE OF THE REPLICAS IS DESIGNATED AS THE MASTER, INDEPENDENTLY FOR EACH RECORD, AND ALL UPDATES TO THAT RECORD ARE FORWARDED TO THE MASTER. THE REPLICA RECEIVING THE MAJORITY OF WRITE REQUESTS FOR A PARTICULAR RECORD BECOMES THE MASTER FOR THAT RECORD. PER-RECORD TIMELINE CONSISTENCY: ALL REPLICAS OF A GIVEN RECORD APPLY ALL UPDATES TO THE RECORD IN THE SAME ORDER. THE RECORD CARRIES A SEQUENCE NUMBER THAT IS INCREMENTED ON EVERY WRITE.
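A MINIMAL SKETCH OF THESE TWO IDEAS FOLLOWS. ALL NAMES (`RecordMaster`, `Replica`, `choose_master`) ARE HYPOTHETICAL. BUFFERING OUT-OF-ORDER UPDATES AT THE REPLICA IS AN ASSUMPTION OF THIS SKETCH, AS IS THE STRICT-MAJORITY RULE OVER A LIST OF RECENT WRITE ORIGINS.

```python
from collections import Counter


class RecordMaster:
    """The single replica that orders all writes to one record."""

    def __init__(self):
        self.seq = 0

    def write(self, value):
        # The sequence number is incremented on every write.
        self.seq += 1
        return self.seq, value


class Replica:
    """Applies updates strictly in sequence-number order."""

    def __init__(self):
        self.seq = 0
        self.value = None
        self.history = []   # values in the order they were applied
        self.pending = {}   # updates that arrived too early (assumption)

    def deliver(self, seq, value):
        self.pending[seq] = value
        # Apply every update whose predecessor has already been applied.
        while self.seq + 1 in self.pending:
            self.seq += 1
            self.value = self.pending.pop(self.seq)
            self.history.append(self.value)


def choose_master(write_origins, current_master):
    """Pick the region that issued a strict majority of the given writes."""
    if not write_origins:
        return current_master
    region, count = Counter(write_origins).most_common(1)[0]
    return region if 2 * count > len(write_origins) else current_master


master = RecordMaster()
updates = [master.write(v) for v in ("a", "b", "c")]

near, far = Replica(), Replica()
for seq, value in updates:
    near.deliver(seq, value)
for seq, value in (updates[2], updates[0], updates[1]):   # arrives reordered
    far.deliver(seq, value)

print(near.history == far.history)
print(choose_master(["east", "east", "west"], current_master="west"))
```

BECAUSE ONE MASTER ASSIGNS THE SEQUENCE NUMBERS, EVERY REPLICA MOVES THROUGH THE SAME SERIES OF VERSIONS, EVEN IF SOME REPLICAS LAG BEHIND OTHERS.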
PER-RECORD TIMELINE CONSISTENCY. WE (CURRENTLY) KEEP ONLY ONE VERSION OF A RECORD AT EACH REPLICA.
RECORD TIMELINE CONSISTENCY. TRANSACTIONS: ALICE CHANGES STATUS FROM “SLEEPING” TO “AWAKE”; ALICE CHANGES LOCATION FROM “HOME” TO “WORK”.
TIMELINE CONSISTENCY COMES AT A PRICE. WRITES NOT ORIGINATING IN THE RECORD'S MASTER REGION ARE FORWARDED TO THE MASTER AND HAVE LONGER LATENCY. WHEN THE MASTER REGION IS DOWN, THE RECORD IS UNAVAILABLE FOR WRITES.
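THE TWO COSTS CAN BE EXPRESSED AS A SMALL ROUTING DECISION. THE FUNCTION `route_write`, THE EXCEPTION `MasterUnavailable`, AND THE REGION NAMES ARE HYPOTHETICAL AND ONLY ILLUSTRATE THE BEHAVIOR DESCRIBED ABOVE.

```python
class MasterUnavailable(Exception):
    """Raised when the record's master region cannot accept writes."""


def route_write(origin_region, master_region, regions_up):
    """Decide how a write to one record is handled.

    Returns "local" when the write originates in the master region and
    "forwarded" when it must travel to the master first (longer latency).
    """
    if master_region not in regions_up:
        # No other replica may order writes for this record.
        raise MasterUnavailable(f"master region {master_region!r} is down")
    return "local" if origin_region == master_region else "forwarded"


print(route_write("west", "west", {"west", "east"}))
print(route_write("east", "west", {"west", "east"}))
```

READS ARE NOT COVERED BY THIS FUNCTION; THE UNAVAILABILITY DESCRIBED ABOVE APPLIES TO WRITES.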
EXPERIMENTAL SETUP. THREE PNUTS REGIONS: 2 WEST COAST, 1 EAST COAST. 5 STORAGE UNITS, 2 MESSAGE BROKERS, 1 ROUTER. WEST: DUAL 2.8 GHZ XEON, 4 GB RAM, 6-DISK RAID 5 ARRAY. EAST: QUAD 2.13 GHZ XEON, 4 GB RAM, 1 SATA DISK. WORKLOAD: REQUESTS/SECOND, 0-50% WRITES, 80% LOCALITY.
INSERT. INSERTS REQUIRED 75.6 MS PER INSERT IN WEST 1 (TABLET MASTER), ... MS PER INSERT INTO THE NON-MASTER WEST 2, AND ... MS PER INSERT INTO THE NON-MASTER EAST.
SCALABILITY
CONCLUSION AND ONGOING WORK. PNUTS IS AN INTERESTING RESEARCH PRODUCT. RESEARCH: CONSISTENCY, PERFORMANCE, FAULT TOLERANCE, RICH FUNCTIONALITY. PRODUCT: MAKE IT WORK, KEEP IT (RELATIVELY) SIMPLE, LEARN FROM EXPERIENCE AND REAL APPLICATIONS. ONGOING WORK: INDEXES AND MATERIALIZED VIEWS, BUNDLED UPDATES, BATCH QUERY PROCESSING.
THANK YOU! QUESTIONS?