
Manageability, Availability and Performance in Porcupine: A Highly Scalable, Cluster-based Mail Service

Yasushi Saito, Brian N. Bershad and Henry M. Levy

University of Washington


What is Porcupine?

  • Highly scalable mail server

  • “Cluster based internet mail server using SMTP”

  • Why do we need another mail service?

    • Conventional systems do not exploit the heterogeneity of the nodes.

    • Conventional systems are not efficient.

    • Conventional systems use legacy software.

Kalyan Boggavarapu Lehigh University CSE 498


Disadvantages of Conventional Mail Servers

  • Manageability:

    • Earlier systems must be configured manually.

    • The system has to be tuned for each node added to the distributed file system.

    • As a result, a lot of work is involved whenever a node fails or a new node is added.

  • Availability:

    • This depends on how well the system tolerates the loss of a node.

    • Conventional systems are less fault tolerant.

    • When a node fails, the users on that node temporarily cannot access their mail.

  • Performance:

    • Performance does not grow in proportion to the number of nodes in the system.

    • No dynamic load balancing.

Goals

  • Manageability

  • Availability

  • Performance

Billions of messages per day


System Overview

How Porcupine Achieves Its Goals

Key Data Structures

  • Mailbox fragment

  • Mail map

  • User profile database

  • User profile soft state (set of users)

  • User map

  • Cluster membership list
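
The six structures listed above can be pictured as fields of a per-node record. The following is a minimal sketch only: the class names, field names, and the use of CRC32 as the hash are assumptions made for illustration, not something stated on the slides.

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MailboxFragment:
    """The messages of one user that are stored on one node."""
    user: str
    node: str
    messages: List[str] = field(default_factory=list)


@dataclass
class NodeState:
    """Illustrative per-node view of the structures on this slide."""
    name: str
    # Cluster membership list: the nodes this node believes are alive.
    membership: Set[str] = field(default_factory=set)
    # User map: hash bucket -> node that manages the users in that bucket.
    user_map: List[str] = field(default_factory=list)
    # Mail map: user -> nodes that hold a mailbox fragment for that user.
    mail_map: Dict[str, Set[str]] = field(default_factory=dict)
    # User profile database: persistent profile entries.
    profiles: Dict[str, dict] = field(default_factory=dict)
    # User profile soft state: in-memory copy of profiles for managed users.
    profile_cache: Dict[str, dict] = field(default_factory=dict)
    # Mailbox fragments stored on this node, keyed by user.
    fragments: Dict[str, MailboxFragment] = field(default_factory=dict)


def manager_of(user: str, user_map: List[str]) -> str:
    """Hash the user name into the user map to find the managing node."""
    bucket = zlib.crc32(user.encode("utf-8")) % len(user_map)
    return user_map[bucket]
```

With a user map such as `["A", "B", "A", "B"]`, `manager_of("bob", user_map)` always returns the same node for the same name, which is what lets any node route a request for a user without central coordination.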

Data Structure Managers

A cluster of 2

Receiving a Message

Load Balancing

  • Equal distribution of data among the nodes

  • Identify the hot spots and divide the load accordingly

  • Test Bed

    • Systems: 30

    • Ethernet: 100 Mbps

    • OS: Linux 2.2.7

    • Mean message size: 4.7 KB; max 1 MB

    • Number of users: 5M

    • Authentication: No
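
The two load-balancing bullets above (find hot spots, then divide the load) can be sketched as follows. This is an illustration under stated assumptions: the load metric (a plain number per node), the 1.5x threshold, and the function names are all hypothetical and are not taken from the slides.

```python
from statistics import mean


def hot_spots(loads, factor=1.5):
    """Return nodes whose load exceeds `factor` times the cluster average."""
    avg = mean(loads.values())
    return sorted(node for node, load in loads.items() if load > factor * avg)


def pick_node(loads, candidates=None):
    """Pick the least-loaded node, optionally restricted to `candidates`.

    Ties are broken by node name so the choice is deterministic.
    """
    pool = candidates if candidates else loads.keys()
    return min(pool, key=lambda node: (loads[node], node))


# Hypothetical load figures, e.g. pending operations per node.
loads = {"A": 10, "B": 2, "C": 3}
print(hot_spots(loads))               # nodes well above the average load
print(pick_node(loads))               # target for the next message
print(pick_node(loads, ["A", "C"]))   # same, limited to two candidate nodes
```

Restricting the choice to a candidate list is one way to trade balance against locality: a wider list spreads load more evenly, a narrower one keeps a user's mail on fewer nodes.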


Manageability

Porcupine Reconfigures Automatically

Without: fall in number of messages = approx. 100

With: fall in number of messages = approx. 50


Availability

Mail map consistency

  • C fails before the update

    • No problem: the message is replicated.

  • C deleted all the messages of Bob (A), but the update failed.

    • No problem: A will delete the dangling pointers.

  • A fails before the update

    • A new manager will take the update later.
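
The second case above (a mail map entry that still points at a node holding no messages) can be sketched as a cleanup pass. This sketch assumes that A is the node holding Bob's mail map and that it can learn which users each node still stores; here a plain dictionary stands in for that information, and the function name is hypothetical.

```python
def prune_dangling(mail_map, stored):
    """Drop mail map pointers to nodes that no longer hold mail for the user.

    mail_map: user -> set of nodes believed to hold a mailbox fragment
    stored:   node -> {user: list of messages actually stored on that node}
    Returns a new mail map; the input is left unchanged.
    """
    cleaned = {}
    for user, nodes in mail_map.items():
        cleaned[user] = {
            node for node in nodes
            if stored.get(node, {}).get(user)  # empty or missing -> dangling
        }
    return cleaned


# C has deleted Bob's messages, but the mail map still lists C.
mail_map = {"bob": {"A", "C"}}
stored = {"A": {"bob": ["msg-1"]}, "C": {}}
print(prune_dangling(mail_map, stored))
```

The point of the sketch is that a stale pointer is harmless: it costs one wasted lookup and is then removed, so the failed update never has to be retried by C.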

States of Replication

  • Hard State

    • Passwords and user logins are written permanently.

    • Data that should not be lost.

  • Soft State

    • User-to-node mapping.

    • This can be reconstructed after a loss.

Hard State Replication

  • Aim: consistency

  • Type: per-message, per-user

  • Effect: efficient during normal operation

Effect of Replication


Soft-state Reconstruction

[Figure: timeline of soft-state reconstruction across nodes A, B and C. Step 1: membership protocol and usermap recomputation. Step 2: distributed disk scan. Entries shown once the scan completes: bob: {A,C}, joe: {C}, suzy: {A,B}, ann: {B}.]
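
The two reconstruction steps named on this slide can be sketched as follows. This is a simplified illustration, not the actual protocol: the round-robin bucket assignment, the CRC32 hash, the bucket count, and all function names are assumptions. The example data reuses the entries visible on the slide (bob, joe, suzy, ann).

```python
import zlib


def recompute_user_map(live_nodes, buckets=8):
    """Step 1, after the membership protocol has agreed on the live nodes:
    spread the hash buckets of the user map over those nodes."""
    nodes = sorted(live_nodes)
    return [nodes[i % len(nodes)] for i in range(buckets)]


def manager_of(user, user_map):
    """Hash the user name into the user map to find the managing node."""
    return user_map[zlib.crc32(user.encode("utf-8")) % len(user_map)]


def disk_scan(fragments, user_map):
    """Step 2: every node scans its disk and reports each user it stores
    to that user's manager, which rebuilds the mail map entry."""
    mail_maps = {node: {} for node in set(user_map)}
    for node, users in fragments.items():
        for user in users:
            manager = manager_of(user, user_map)
            mail_maps[manager].setdefault(user, set()).add(node)
    return mail_maps


# Which users have mailbox fragments on which node (from the slide's example).
fragments = {"A": ["bob", "suzy"], "B": ["suzy", "ann"], "C": ["bob", "joe"]}
user_map = recompute_user_map({"A", "B", "C"})
for manager, entries in sorted(disk_scan(fragments, user_map).items()):
    print(manager, entries)
```

Because the mail map is derived entirely from what is on disk, it never needs to be replicated or logged: losing it only costs the time of one scan.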


Advantages of Porcupine

  • Best use of resources

  • Self-configuration

  • Dynamic load balancing

  • Result:

    • Geographically distributed cluster servers

    • Highly scalable

    • Fault tolerant

  • Future work

    • Better membership protocol

    • Applying Porcupine to other applications, such as Usenet.

Sources

  • The porcupine figure in all slides is from

    http://www.bluebison.net/yosemite/porcupine.htm

  • Diagrams in slides 17 and 19 are from the slides at http://www.hpl.hp.com/personal/Yasushi_Saito/pubs.html#publications
