Manageability, Availability and Performance in Porcupine: A Highly Scalable, Cluster-based Mail Service. Yasushi Saito, Brian N. Bershad and Henry M. Levy.


1 Manageability, Availability and Performance in Porcupine: A Highly Scalable, Cluster-based Mail Service
Yasushi Saito, Brian N. Bershad and Henry M. Levy, University of Washington

2 Kalyan Boggavarapu Lehigh University CSE 498 What is Porcupine?
• Highly scalable mail server
• "Cluster-based Internet mail server using SMTP"
• Why do we need another mail service?
  - Conventional systems do not exploit the heterogeneity of the nodes.
  - Conventional systems are not efficient.
  - Conventional systems use legacy software.

3 Disadvantages of Conventional Mail Servers
• Manageability: Earlier systems must be configured manually. The system has to be retuned for every node added to the distributed file system, so a lot of work is involved whenever a node fails or a new node is added.
• Availability: This depends on how well the system tolerates the loss of a node. Conventional systems are less fault tolerant: when a node fails, the users assigned to that node temporarily cannot access their mail.
• Performance: Performance does not scale in proportion to the number of nodes, and there is no dynamic load balancing.

4 Goals
• Manageability
• Availability
• Performance: billions of messages per day

5 System Overview

6 How Porcupine Achieves Its Goals

7 Key Data Structures
• Mailbox fragment
• Mail map
• User profile database
• User profile soft state (set of users)
• User map
• Cluster membership list
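How these structures relate to one another can be sketched in Python. This is a minimal illustration and not Porcupine's actual implementation. The class and field names, the eight-bucket user map, and the use of CRC32 as the hash function are all assumptions made for the example. It also assumes that the user map is a small table, indexed by a hash of the user name, that names the node managing each user.

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Node:
    """One cluster node and the state it might hold (hypothetical layout)."""
    name: str
    # Hard state: mailbox fragments stored on this node, keyed by user name.
    fragments: Dict[str, List[str]] = field(default_factory=dict)
    # Soft state: mail map for the users this node manages
    # (user name -> names of the nodes holding that user's fragments).
    mail_map: Dict[str, Set[str]] = field(default_factory=dict)


def bucket_of(user: str, buckets: int) -> int:
    """Hash a user name into a user-map bucket (CRC32 is an assumption)."""
    return zlib.crc32(user.encode("utf-8")) % buckets


def manager_of(user: str, user_map: List[str]) -> str:
    """Look up which node manages `user` according to the user map."""
    return user_map[bucket_of(user, len(user_map))]


# Cluster membership list and user map, assumed to be known to every node.
membership = ["A", "B", "C"]
user_map = list("BCABABAC")  # one manager name per hash bucket

bob_manager = manager_of("bob", user_map)
```

Because the lookup only needs the user name and the user map, any node can find a user's manager without asking a central directory.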

8 Data Structure Managers

9 A cluster of 2

10 Receiving a Message

11 Load Balancing
• Equal distribution of data among the nodes
• Identify the hot spots and divide the load accordingly
• Test bed
  - Systems: 30
  - Ethernet: 100 Mbps
  - OS: Linux 2.2.7
  - Mean message size: 4.7 KB; max 1 MB
  - Number of users: 5M
  - Authentication: no
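One way to picture dynamic load balancing is a delivery routine that sends each incoming message to the least-loaded candidate node. The sketch below is only an illustration. Measuring load as a count of pending operations, and the `pick_node` helper itself, are assumptions that do not come from the slides.

```python
from typing import Dict, List


def pick_node(load: Dict[str, int], candidates: List[str]) -> str:
    """Return the candidate with the smallest load; ties are broken by name."""
    if not candidates:
        raise ValueError("no candidate nodes")
    return min(candidates, key=lambda node: (load[node], node))


# Hypothetical load figures: pending operations per node.
load = {"A": 12, "B": 3, "C": 7}

target = pick_node(load, ["A", "B", "C"])
load[target] += 1  # the chosen node now has one more pending operation
```

Re-evaluating the choice for every message lets a hot spot shed new work as soon as its load figure rises above that of its peers.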

12 Manageability

13 Porcupine reconfigures automatically
• Without: fall in number of messages ≈ 100
• With: fall in number of messages ≈ 50

14 Availability

15 Mail map consistency
• C fails before the update.
  - No problem: the message is replicated.
• C deleted all the messages of Bob (A), but the update failed.
  - No problem: A will delete the dangling pointers.
• A fails before the update.
  - A new manager will take the update later.
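The second case, where a node has deleted a user's messages but the mail map still lists that node, can be sketched as a cleanup pass run on behalf of the user's manager. This is an illustrative sketch only. The `prune_dangling` function and the dictionary layout are hypothetical and are not taken from the slides.

```python
from typing import Dict, List, Set


def prune_dangling(
    mail_map: Dict[str, Set[str]],
    fragments_on_node: Dict[str, Dict[str, List[str]]],
    user: str,
) -> Set[str]:
    """Drop mail-map entries for nodes that hold no fragment for `user`.

    Returns the set of node names that were removed as dangling pointers.
    """
    listed = mail_map.get(user, set())
    live = {
        node for node in listed
        if fragments_on_node.get(node, {}).get(user)
    }
    mail_map[user] = live
    return listed - live


# The mail map still lists C for bob, but C has already deleted his messages.
mail_map = {"bob": {"A", "C"}}
fragments_on_node = {"A": {"bob": ["msg1"]}, "C": {}}

removed = prune_dangling(mail_map, fragments_on_node, "bob")
```

A stale entry costs at most one wasted lookup, which is why a failed update in this direction is harmless.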

16 States of Replication
• Hard state: passwords and user logins, written permanently. Data that must not be lost.
• Soft state: the user-to-nodes mapping. This can be reconstructed after a loss.

17 Hard State Replication
• Aim: consistency
• Type: per-message, per-user
• Effect: efficient during normal operation

18 Effect of Replication

19 Soft-state Reconstruction
[Figure: timeline for nodes A, B and C showing each node's user map (BCABABAC, BAABABAB, ACACACAC) and mail map entries (bob: {A,C}, joe: {C}, suzy: {A,B}, ann: {B}) across two steps: 1. Membership protocol and user map recomputation; 2. Distributed disk scan.]
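The two reconstruction steps can be sketched as follows. This is a simplified illustration under stated assumptions: handing a departed node's buckets to the survivors in round-robin order, the CRC32 hash, and both function names are hypothetical, since the slides do not say how buckets are reassigned. Under these assumptions the user map BCABABAC becomes BAABABAB when C leaves, which is consistent with the strings shown on the slide.

```python
import zlib
from itertools import cycle
from typing import Dict, List, Set


def recompute_user_map(user_map: List[str], live_nodes: Set[str]) -> List[str]:
    """Step 1: reassign buckets owned by departed nodes to the survivors."""
    if not live_nodes:
        raise ValueError("no live nodes")
    survivors = cycle(sorted(live_nodes))
    return [
        owner if owner in live_nodes else next(survivors)
        for owner in user_map
    ]


def rebuild_mail_maps(
    user_map: List[str],
    fragments_on_node: Dict[str, Dict[str, List[str]]],
) -> Dict[str, Dict[str, Set[str]]]:
    """Step 2: each node scans its disk and reports fragments to managers.

    Returns manager name -> (user name -> nodes holding that user's mail).
    """
    mail_maps: Dict[str, Dict[str, Set[str]]] = {}
    for node, fragments in fragments_on_node.items():
        for user in fragments:
            bucket = zlib.crc32(user.encode("utf-8")) % len(user_map)
            manager = user_map[bucket]
            mail_maps.setdefault(manager, {}).setdefault(user, set()).add(node)
    return mail_maps


old_map = list("BCABABAC")
new_map = recompute_user_map(old_map, {"A", "B"})  # node C has left

# Hard state that survived on disk (hypothetical contents).
disks = {"A": {"suzy": ["m1"]}, "B": {"suzy": ["m2"], "ann": ["m3"]}}
mail_maps = rebuild_mail_maps(new_map, disks)
```

The mail maps are rebuilt purely from mailbox fragments on disk, which is what makes it safe to treat them as soft state.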

20 Advantages of Porcupine
• Best use of resources
• Self-configuration
• Dynamic load balancing
• Result:
  - Geographically distributed cluster servers
  - Highly scalable
  - Fault tolerant
• Future work
  - Better membership protocol
  - Applying Porcupine to other applications such as Usenet

21 Sources
• The porcupine figure in all slides is from http://www.bluebison.net/yosemite/porcupine.htm
• Diagrams in slides 17 and 19 are from slides at http://www.hpl.hp.com/personal/Yasushi_Saito/pubs.html#publications

