Rim Moussa University Paris 9 Dauphine Experimental Performance Analysis of LH* RS Parity Management Workshop on Distributed Data Structures: WDAS 2002
2 Objectives 1. Contribute towards improving the effectiveness of the 1st prototype [M. Ljungström]: Data Bucket Split 2. Propose scenarios for High Availability and Bucket Recovery
3 Overview 1. Why SDDS ? 2. LH* RS Data Structure 3. File Creation 4. High Availability 5. Recovery 6. Conclusion & Future Work
4 Motivation Information volume grows by 30% / year Bottleneck of disk access and CPUs Failures are frequent Scalability High Performance High Availability
5 Hardware architecture Modular Architecture Best cost/performance ratio Need Network-based Storage Systems SDDS Multicomputers
6 SDDS Dynamic file growth Client Network Client … Data Buckets Inserts … … Coordinator … Overloaded Split Records Transfer
7 SDDS (cont.) No Centralized directory access Client Network Query IAM … … … … Data Buckets Query
8 High Availability ? Distribution Nodes’ Failure Parity Calculus High storage cost Data Replication
9 LH* RS in a few words SDDS Distribution –Hashing Function Parity Calculus –Reed-Solomon Codes
10 LH* RS – File Structure r [ e -1 … -1 ] P Insert Rank er 4 Data Buckets Key Data Field Rank Key list Parity Field
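The parity record on this slide (rank, key list, parity field) can be sketched as a small data structure. This is a minimal illustration, not the prototype's code: the names `ParityRecord`, `apply` and `xor_bytes` are hypothetical, and plain XOR stands in for the Reed-Solomon calculus, which covers only the single-parity case.

```python
from dataclasses import dataclass, field

GROUP_SIZE = 4   # data buckets per group, as on the slide
EMPTY = -1       # key-list marker: no record at this rank in that bucket


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, padding the shorter one with zero bytes."""
    n = max(len(a), len(b))
    a, b = a.ljust(n, b"\0"), b.ljust(n, b"\0")
    return bytes(x ^ y for x, y in zip(a, b))


@dataclass
class ParityRecord:
    rank: int
    keys: list = field(default_factory=lambda: [EMPTY] * GROUP_SIZE)
    parity: bytes = b""

    def apply(self, slot: int, key: int, delta: bytes) -> None:
        """Record the key of data bucket `slot` and fold `delta` into the parity.

        Assumption: XOR replaces Reed-Solomon here, so an insert and the
        matching delete send the same delta (the record's data field).
        """
        self.keys[slot] = key
        self.parity = xor_bytes(self.parity, delta)
```

Inserting a record with key 10001 at rank 4 into the second data bucket would be `ParityRecord(rank=4).apply(1, 10001, data)`; applying the same delta again with `EMPTY` as key undoes it.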
11 LH* RS : Split Scenario Splitting Data Bucket New Data Bucket e r Delete e of rank r e r* Insert e in rank r*
12 Why the use of TCP/IP ? Flow Control: no more loss of messages, even if parity sites are overloaded, in opposition to UDP (Ljungström thesis) Parity Buckets coherence: Serialize Communication at PBs, Critical sections
13 Hardware Testbed 6 machines: Pentium III, 730 MHz, 128 MB Ethernet network: max bandwidth of 100 Mbps 1 entity (bucket, client) / machine Configuration tested: 1 Client, a group of 4 Data Buckets, k Parity Buckets, k ∈ {0, 1, 2}
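The split scenario can be sketched as follows: each record e that moves leaves rank r in the splitting bucket and takes a new rank r* in the new bucket, which produces one delete and one insert for the parity buckets. This is only a sketch under assumptions: the function name, the tuple layout of the updates, ranks starting at 1, and records that stay keeping their rank are all illustrative choices, and the linear-hashing function is taken as `key mod 2^(level+1)`.

```python
def split_bucket(records, bucket_no, level):
    """Split one data bucket with the linear-hashing function key mod 2**(level+1).

    records: dict rank -> (key, data) held by the splitting bucket.
    Returns (staying, moved, parity_updates).
    """
    new_bucket_no = bucket_no + 2 ** level
    staying, moved, parity_updates = {}, {}, []
    next_rank = 1  # assumption: the new bucket hands out ranks 1, 2, 3, ...
    for rank in sorted(records):
        key, data = records[rank]
        if key % 2 ** (level + 1) == bucket_no:
            staying[rank] = (key, data)  # assumption: rank unchanged
            continue
        moved[next_rank] = (key, data)
        # "Delete e of rank r" for the parity of the splitting bucket's group
        parity_updates.append(("delete", bucket_no, rank, key))
        # "Insert e in rank r*" for the parity of the new bucket's group
        parity_updates.append(("insert", new_bucket_no, next_rank, key))
        next_rank += 1
    return staying, moved, parity_updates
```

Splitting bucket 0 at level 2 sends every key with `key % 8 == 4` to bucket 4 and leaves the others in place.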
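The critical-section idea can be illustrated with a lock that serializes parity updates arriving from several data buckets at once. This is a hedged sketch, not the prototype: the class `ParityBucket`, the helper `apply_streams`, integer parity values and XOR as the parity operation are all assumptions, and threads stand in for the TCP connections.

```python
import threading


class ParityBucket:
    """Parity bucket whose updates are serialized by one critical section."""

    def __init__(self):
        self._lock = threading.Lock()
        self.parity = {}  # rank -> parity value (int, XOR-based stand-in)

    def update(self, rank, delta):
        # Critical section: the read-modify-write on a parity record
        # must not interleave with an update from another data bucket.
        with self._lock:
            self.parity[rank] = self.parity.get(rank, 0) ^ delta


def apply_streams(bucket, streams):
    """Run one sender thread per data bucket; each stream is [(rank, delta), ...]."""
    def worker(stream):
        for rank, delta in stream:
            bucket.update(rank, delta)

    threads = [threading.Thread(target=worker, args=(s,)) for s in streams]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Because XOR is commutative, the final parity does not depend on the order in which the senders are scheduled, as long as each individual update is atomic.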
14 File Creation Performance Insert Time/ record 0.40 ms 0.44 ms 0.48 ms +10% 4.9 ms 6.5 ms 8 ms Ack Key 10001
15 High Availability File size grows: degradation of the high availability of the file Solution: add a Parity Bucket/ Group
16 Parity Bucket Creation New Parity Bucket Coordinator Data Bucket’s group Insertion Your Content ? Autogenerate
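Parity bucket creation can be sketched as the new parity bucket asking every data bucket of the group for its content and folding the answers into parity records. The sketch below rests on assumptions: the function name and the in-memory dictionaries are hypothetical, payloads are integers, and XOR replaces the Reed-Solomon calculus.

```python
def build_parity_bucket(group):
    """Build a parity bucket from the content of a group of data buckets.

    group: list of data buckets, each a dict rank -> (key, payload_int).
    Returns a dict rank -> (key_list, parity), where key_list[i] is the key
    stored by data bucket i at that rank, or -1 if it has none.
    """
    bucket = {}
    for slot, data_bucket in enumerate(group):      # "Your Content ?" per bucket
        for rank, (key, payload) in data_bucket.items():
            keys, parity = bucket.get(rank, ([-1] * len(group), 0))
            keys[slot] = key
            bucket[rank] = (keys, parity ^ payload)  # XOR stand-in
    return bucket
```

In this form the computation itself is a single pass over the records of the group.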
17 Parity Bucket Creation Perf Connection Time/ Total Time: 79.93%, 65.71%, 57.89%
18 Data Bucket Recovery 2 Scenarios: UDP, TCP/IP
19 Recovery Client Group g of Data Buckets Coordinator Query Group g Failure !
20 Coordinator Probe Data Buckets Parity Buckets Recovery Manager
21 UDP-based recovery scenario Recovery Manager Available Buckets r [ k1 k2 k3 k4 ] P 2 1* 2 Spare Buckets Deduce Record/ key = k3 Deduce Record/ key = k2 Insert Record/ key = k2 ! Insert Record/ key = k3 ! 3 Record/ key = k1 ? Record/ key = k4 ? Record/ key = r ?
22 UDP-based Recovery Scenario 0.55 ms 0.65 ms 0.76 ms Just 0.1 ms to compute a record/ iteration
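The record-by-record scenario can be sketched as one iteration per rank: read the parity record, ask each available bucket for the record with the listed key, and deduce the missing one. This is a simplified sketch under stated assumptions: a single failed bucket, integer payloads and XOR instead of Reed-Solomon, with `recover_record` and `fetch_by_key` as hypothetical names.

```python
def recover_record(parity_record, fetch_by_key, failed_slot):
    """Deduce the record of the failed bucket for one rank.

    parity_record: (key_list, parity) for that rank; key_list[i] is the key
    held by data bucket i, or -1 if it holds nothing at this rank.
    fetch_by_key(slot, key): returns the payload stored by an available bucket;
    in the UDP scenario this is one request/reply exchange per record.
    """
    keys, parity = parity_record
    value = parity
    for slot, key in enumerate(keys):
        if slot != failed_slot and key != -1:
            value ^= fetch_by_key(slot, key)  # "Record/ key = k ?"
    return keys[failed_slot], value           # record to insert in the spare
```

Each recovered record costs one message exchange per available bucket, so communication rather than computation dominates each iteration.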
23 TCP/IP-based recovery scenario Recovery Manager Available Buckets 1* 2 Spare Buckets Deduces Records having rank [r, slice-1] 3 Buffer of Records to Insert ! Records of rank [r, slice-1] ? Records of key [r, slice-1] ?
24 TCP/IP-based Recovery Scenario Communication Time >> Process Time. Increasing the slice size gives better performance results. b = records file of recs records/B
25 Discussion TCP/IP vs UDP Reliability Performance gain Best improvements when slice = entire bucket content (31250 recs). Indeed, UDP | TCP/IP | Gain, 1 DB: s | 6.7 s | 160%
26 Conclusion Implementation of a new split algorithm Use TCP/IP instead of UDP Use critical sections to manage concurrent update requests at the Parity Buckets Parity Buckets Management Efficient Data Buckets Recovery
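The effect of the slice size can be sketched by recovering a bucket slice by slice and counting the bulk requests issued. The sketch assumes a single failed bucket, XOR instead of Reed-Solomon, integer payloads stored in lists indexed by rank, and one bulk request per source per slice; `slices` and `recover_bucket` are hypothetical names.

```python
def slices(first_rank, last_rank, slice_size):
    """Yield inclusive rank ranges of at most slice_size ranks each."""
    rank = first_rank
    while rank <= last_rank:
        yield rank, min(rank + slice_size - 1, last_rank)
        rank += slice_size


def recover_bucket(parity, survivors, slice_size):
    """Recover a failed bucket slice by slice.

    parity: list of parity values, index = rank - 1.
    survivors: one list of payloads per available data bucket, same indexing.
    Returns (recovered_payloads, number_of_bulk_requests).
    """
    recovered, requests = [], 0
    for low, high in slices(1, len(parity), slice_size):
        requests += 1 + len(survivors)  # parity bucket + each available bucket
        for i in range(low - 1, high):
            value = parity[i]
            for bucket in survivors:
                value ^= bucket[i]
            recovered.append(value)
    return recovered, requests
```

The number of requests falls as the slice grows, down to one request per source when the slice covers the entire bucket content.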
27 Future Work More performance Measurements Variation of Parity Calculus
References [LS00] [Ljungström, 2000] CERIA & U. Linkoping [Rizzo] [Luby] [XB99]
Demo of the Prototype Friday – Poster Session CERIA Lab. B017
End Thank you for your Attention