Manageability, Availability and Performance in Porcupine: A Highly Scalable, Cluster-based Mail Service


Yasushi Saito, Brian N. Bershad and Henry M. Levy

University of Washington



What is Porcupine?

  • Highly scalable mail server

  • “Cluster based internet mail server using SMTP”

  • Why do we need another mail service?

    • Conventional systems do not exploit the heterogeneity of the nodes.

    • Conventional systems are not efficient.

    • Conventional systems use legacy software.

Kalyan Boggavarapu Lehigh University CSE 498



Disadvantages of Conventional Mail Servers

  • Manageability:

    • Earlier systems must be configured manually.

    • The system has to be retuned for each node newly added to the distributed file system.

    • As a result, a lot of manual work is involved whenever a node fails or a new node is added.

  • Availability:

    • This depends on how well the system tolerates the loss of a node.

    • Conventional systems are less fault tolerant.

    • When a node fails, the users on that node temporarily cannot access their mail.

  • Performance:

    • Performance does not scale in proportion to the number of nodes in the system.

    • No dynamic load balancing


Goals

  • Manageability

  • Availability

  • Performance

Billions of messages per day


System Overview


How Porcupine Achieves Its Goals


Key Data Structures

  • Mailbox fragment

  • Mail map

  • User profile database

  • User profile soft state (set of users)

  • User map

  • Cluster membership list
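
The slide lists these structures by name only. As a rough illustration, the sketch below models a few of them in Python. The field names, the bucket count, the round-robin bucket assignment and the CRC32 hash are all assumptions made for this example, not details taken from the slides.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical per-node state; every field name here is illustrative."""
    name: str
    # user -> messages stored on this node (a mailbox fragment per user)
    mailbox_fragments: dict = field(default_factory=dict)
    # user -> set of nodes believed to hold a fragment for that user
    mail_map: dict = field(default_factory=dict)
    # user -> profile record
    user_profiles: dict = field(default_factory=dict)


def build_user_map(members, buckets=8):
    """Assign each hash bucket to a member node, round-robin (assumed policy)."""
    ordered = sorted(members)
    return [ordered[i % len(ordered)] for i in range(buckets)]


def manager_of(user, user_map):
    """Return the node that the user map assigns to this user name."""
    bucket = zlib.crc32(user.encode()) % len(user_map)
    return user_map[bucket]
```

Under these assumptions, `manager_of("bob", build_user_map(["A", "B", "C"]))` returns the same node on every member that holds the same cluster membership list, which is the property a shared user map needs.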


Data Structure Managers


A cluster of 2


Receiving a Message


Load Balancing

  • Equal distribution of data among the nodes

  • Identify the hot-spots and divide the load accordingly

  • Test Bed

    • Systems: 30

    • Ethernet: 100 Mbps

    • OS: Linux 2.2.7

    • Mean message size: 4.7 KB; max 1 MB

    • Number of users: 5M

    • Authentication: No
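
The idea of spotting hot spots and dividing the load accordingly can be sketched as a least-loaded choice among candidate nodes. This is a minimal illustration, assuming each node reports a single integer load value; the metric, the function names and the tie-breaking rule are hypothetical and not taken from the slides.

```python
def pick_node(load, candidates):
    """Return the candidate with the smallest reported load.

    load maps node name -> pending work (a hypothetical metric).
    Ties are broken by node name so the choice is deterministic.
    """
    if not candidates:
        raise ValueError("no candidate nodes")
    return min(candidates, key=lambda n: (load.get(n, 0), n))


def deliver(load, candidates):
    """Pick a node for one message and account for the added work."""
    node = pick_node(load, candidates)
    load[node] = load.get(node, 0) + 1
    return node
```

Because every delivery raises the chosen node's load, repeated calls steer new messages away from a node that is already busy.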


Manageability


Porcupine Reconfigures Automatically

Without: fall in number of messages ≈ 100

With: fall in number of messages ≈ 50


Availability


Mail Map Consistency

  • C fails before the update

    • No problem: the message is replicated.

  • C deleted all the messages of Bob (A), but the update failed

    • No problem: A will delete the dangling pointers.

  • A fails before the update

    • A new manager will take the update later.
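
The dangling-pointer case can be sketched as lazy cleanup on read: when a node listed in the mail map turns out to hold nothing for the user, its entry is dropped. This is a minimal sketch under assumptions; the function name and the plain dictionaries standing in for remote calls are hypothetical.

```python
def read_mail(user, mail_map, fragments):
    """Collect a user's messages and prune stale mail map entries.

    mail_map:  user -> set of node names believed to hold a fragment
    fragments: node name -> {user: [messages]} (stand-in for remote calls)
    """
    messages = []
    # sorted() copies the set, so discarding entries inside the loop is safe.
    for node in sorted(mail_map.get(user, set())):
        stored = fragments.get(node, {}).get(user)
        if stored:
            messages.extend(stored)
        else:
            # Dangling pointer: the node no longer holds mail for this user.
            mail_map[user].discard(node)
    return messages
```

With `mail_map = {"bob": {"A", "C"}}` and a node C that has already deleted Bob's messages, one read leaves the entry as `{"A"}`.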


States of Replication

  • Hard State

    • Passwords and user logins are written permanently.

    • Data that should not be lost.

  • Soft State

    • User-to-node mapping.

    • This can be reconstructed after a loss.


Hard State Replication

  • Aim: consistency

  • Type: per-message, per-user

  • Effect: efficient during normal operation


Effect of Replication



Soft-state Reconstruction

[Figure: timeline of soft-state reconstruction across nodes A, B and C]

  • 1. Membership protocol; user map recomputation

  • 2. Distributed disk scan

  • Entries shown in the figure: bob: {A,C}, joe: {C}, suzy: {A,B}, ann: {B}
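
The two reconstruction steps can be sketched as follows: first recompute the user map from the current membership, then have every node scan its disk and report each user it holds mail for to that user's manager. This is a minimal single-process sketch; the bucket count, the CRC32 hash, the round-robin assignment and all function names are assumptions, not details from the slides.

```python
import zlib


def recompute_user_map(live_nodes, buckets=4):
    """Step 1: map each hash bucket to a live node (assumed round-robin)."""
    ordered = sorted(live_nodes)
    return [ordered[i % len(ordered)] for i in range(buckets)]


def rebuild_mail_maps(disks, buckets=4):
    """Step 2: rebuild every manager's mail map from a scan of all disks.

    disks: node name -> set of users with a mailbox fragment on that node.
    Returns manager node -> {user: set of nodes holding that user's mail}.
    """
    user_map = recompute_user_map(disks.keys(), buckets)
    mail_maps = {node: {} for node in disks}
    for node, users in disks.items():  # stands in for the distributed disk scan
        for user in users:
            manager = user_map[zlib.crc32(user.encode()) % buckets]
            mail_maps[manager].setdefault(user, set()).add(node)
    return mail_maps
```

Feeding in disks that match the figure's entries (`A` holds bob and suzy, `B` holds suzy and ann, `C` holds bob and joe) yields bob: {A,C}, suzy: {A,B}, joe: {C} and ann: {B}, spread across whichever nodes manage those users.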


Advantages of Porcupine

  • Best use of resources

  • Self-configuration

  • Dynamic load balancing

  • Result:

    • Geographically distributed cluster servers

    • Highly scalable

    • Fault tolerant

  • Future work

    • Better membership protocol

    • Applying Porcupine to other applications such as Usenet


Sources

  • The porcupine figure in all slides is from http://www.bluebison.net/yosemite/porcupine.htm

  • Diagrams in slides 17 and 19 are from the slides at http://www.hpl.hp.com/personal/Yasushi_Saito/pubs.html#publications
