TIPC as TML

TIPC as TML

draft-maloy-tipc-01.txt

Jon Maloy, Ericsson
Steven Blake, Modularnet
Maarten Koning, WindRiver
Jamal Hadi Salim, Znyx
Hormuzd Khosravi, Intel

IETF-61, Washington DC, Nov 2004



TIPC

  • A transport protocol for cluster environments

    • Connectionless and Connection Oriented; Reliable or Unreliable.

    • Reliable or Unreliable Multicast

    • Usage not limited to ForCES context

  • A framework for detecting, supervising and maintaining cluster topology

  • Available as a portable open source package under the BSD licence

    • 12,000 lines of C code; 112 kbyte Linux kernel module

    • Runs on four operating systems so far, with more to come

  • Proven concept, used and deployed in several Ericsson products


ForCES Protocol Framework

(Diagram: the CE PL and FE PL (ForCES Protocol) exchange ForCES Protocol Messages through the CE TML and FE TML, each running over a transport such as IP, TCP, RapidIO or Ethernet.)


TIPC as L2 TML

(Diagram: the CE PL and FE PL (ForCES Protocol) exchange ForCES Protocol Messages through a TIPC TML on each side, running over an L2 transport such as RapidIO or Ethernet.)


Interface Adaptation

(Diagram: as in the previous figure, but with an Interface Adaptation layer inserted between each PL and its TIPC TML; the TMLs run over an L2 transport such as RapidIO or Ethernet.)


Jon maloy ericsson steven blake modularnet maarten koning windriver jamal hadi salim znyx

Fulfilling Requirements (1)

  • Reliability

    • Reliable transport in all modes

    • Can be made unreliable per socket/direction

  • Security

    • Only secure within closed networks

    • No explicit authentication/encryption support yet, but planned

    • Not IP-based: no router will forward TIPC messages

  • Congestion Control

    • At three levels: Connection/Transport, Signalling Link and Carrier level

    • Gives feedback to the PL layer if a connection is broken or a message is rejected

  • Multicast/Broadcast

    • Supported


Jon maloy ericsson steven blake modularnet maarten koning windriver jamal hadi salim znyx

Fulfilling Requirements (2)

  • Timeliness

    • Immediate delivery (no Nagle algorithm)

    • Inter-node delivery time on the order of 100 microseconds

  • HA Considerations

    • L2 link failure detection and failover handled transparently for the user

    • Connection aborted with an error code if no redundant carrier is available

    • Peer node failure detection after 0.5-1.5 seconds

  • Encapsulation

    • 24 bytes of extra header; 40 bytes for connectionless messages

  • Priorities

    • Supports 4 message importance priorities, determining congestion levels and abort/rejection levels

    • Are 8 levels really needed?


Connection Directly on TIPC

(Diagram: the CE hosts a CE Object with FB X and FB Y; the FE hosts an FE Object with LFB 1 and LFB 2; the FBs and LFBs connect to each other directly over TIPC.)


Connections via FE/CE Object

(Diagram: the CE hosts a CE Object with FB X and FB Y; the FE hosts an FE Object with LFB 1 and LFB 2; connections between the FBs and LFBs are relayed through the CE Object and FE Object over TIPC.)


Connection Usage

  • Traffic Data Connection: low priority; reliable CE->FE, unreliable FE->CE

  • Control Connection: high priority; reliable in both directions

(Diagram: the CE (CE Object, FB X, FB Y) and the FE (FE Object, LFB 1, LFB 2) communicate over TIPC using these two connections.)


Functional Addressing: Unicast

  • Function Address

    • Persistent, reusable 64 bit port identifier assigned by user

    • Consists of type number and instance number

  • Function Address Sequence

    • Sequence of function addresses with the same type

(Diagram: a Client Process calls sendto(type=foo, instance=33); Server Process, Partition A has bind(type=foo, lower=0, upper=99) and Server Process, Partition B has bind(type=foo, lower=100, upper=199); the message "foo,33" is delivered to Partition A.)
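
The unicast lookup above can be sketched in a few lines of Python. This is an illustrative model only (the NameTable class and partition strings are invented for the example, not TIPC's actual name-table implementation): servers bind instance ranges for a type, and a sendto() with a single instance resolves to the one partition whose range covers it.

```python
# Illustrative sketch of TIPC unicast functional addressing.

class NameTable:
    def __init__(self):
        self.partitions = []          # entries: (type, lower, upper, owner)

    def bind(self, ftype, lower, upper, owner):
        self.partitions.append((ftype, lower, upper, owner))

    def lookup(self, ftype, instance):
        # Return the owner whose bound range covers this instance
        for t, lo, hi, owner in self.partitions:
            if t == ftype and lo <= instance <= hi:
                return owner
        return None

table = NameTable()
table.bind("foo", 0, 99, "Partition A")
table.bind("foo", 100, 199, "Partition B")

# sendto(type=foo, instance=33) is delivered to Partition A
assert table.lookup("foo", 33) == "Partition A"
```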


Address Mapping - Unicast

(Diagram: on the CE, FB X calls tml_bind(RSVP,77) at the TML API, which maps to bind(RSVP,77,77) at the TIPC API, creating the CE Object endpoint "RSVP77"; on the FE, LFB 1 calls tml_bind(meter,44), which maps to bind(meter,44,44), creating the FE Object endpoint "Meter44"; CE and FE are connected over TIPC.)


Connection Setup

(Diagram: CE node 8 hosts "RSVP77" (CE Object, FB X, bound via tml_bind(RSVP,77) / bind(RSVP,77,77)); FE node 17 hosts "Meter44" (FE Object, LFB 1); LFB 1 calls tml_connect(RSVP,77, CEID=8) at the TML API, which maps to connect(RSVP,77,node=8) at the TIPC API.)

If instance numbers are coordinated over the whole cluster there is no need for LFBs to know the CEID.


Functional Addressing: Multicast

  • Based on Function Address Sequences

    • Any partition overlapping with the range used in the destination address will receive a copy of the message

    • Client defines the “multicast group” per call

(Diagram: a Client Process calls sendto(type=foo, lower=33, upper=133); Server Process, Partition A (bound to 0-99) and Server Process, Partition B (bound to 100-199) both overlap that range and each receives a copy, "foo,33,133".)
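
The overlap rule can be sketched as a one-liner (illustrative code, not the TIPC implementation): every partition whose bound range intersects the destination range receives a copy.

```python
# Illustrative sketch of TIPC multicast functional addressing: a copy of
# the message goes to every partition whose range overlaps [lower, upper].

def multicast_targets(partitions, ftype, lower, upper):
    # Ranges [lo, hi] and [lower, upper] overlap iff lo <= upper and lower <= hi
    return [owner for t, lo, hi, owner in partitions
            if t == ftype and lo <= upper and lower <= hi]

partitions = [("foo", 0, 99, "Partition A"),
              ("foo", 100, 199, "Partition B")]

# sendto(type=foo, lower=33, upper=133) overlaps both bound ranges
targets = multicast_targets(partitions, "foo", 33, 133)
assert targets == ["Partition A", "Partition B"]
```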


Address Mapping - Multicast

(Diagram: on the CE, FB X calls tml_mcast(meter_mc,group=X) at the TML API, which maps to sendto(meter_mc,X,X) at the TIPC API; on the FE, "Meter13" and "Meter44" have called tml_join(meter_mc,X), mapped to bind(meter_mc,X,X); both receive the multicast over TIPC.)



Questions???


Why TIPC in ForCES?

  • Congestion control at three levels

    • Connection level, signalling link level and media level

    • Based on 4 importance priorities

  • Simple to configure

    • Each node needs to know its own identity, that is all

    • Automatic neighbour detection using multicast/broadcast

  • Lightweight, Reactive Connections

    • Immediate connection abortion at node/process failure or overload

  • Topology Subscription Service

    • Functional and physical topology


Functional View

(Diagram of the TIPC stack: a User Adapter API on top, with Socket API, Port API and other API adapters; a node-internal core providing address subscription, address resolution, address table distribution, connection supervision, route/link selection, reliable multicast, neighbour detection, link establishment/supervision/failover, fragmentation/de-fragmentation, packet bundling, congestion control and sequence/retransmission control; and a Bearer Adapter API at the bottom, with bearers for Ethernet, UDP, SCTP, Infiniband and mirrored memory.)


Network Topology

(Diagram: Zone <1> contains Cluster <1.1> and Cluster <1.2>, with Node <1.2.3> in Cluster <1.2>; Zone <2> contains Cluster <2.1>, which includes Slave Node <2.1.3333>; the zones are interconnected via Internet/Intranet.)
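
The <zone.cluster.node> identifiers pack into TIPC's 32-bit network address: 8 bits of zone, 12 bits of cluster, 12 bits of node. A quick sketch makes the hex node addresses used later in the deck (e.g. 0x1001003 for Node <1.1.3>) easy to read:

```python
# TIPC packs <zone.cluster.node> into a 32-bit network address:
# 8 bits zone, 12 bits cluster, 12 bits node.

def tipc_addr(zone, cluster, node):
    return (zone << 24) | (cluster << 12) | node

def tipc_split(addr):
    return (addr >> 24, (addr >> 12) & 0xFFF, addr & 0xFFF)

assert tipc_split(tipc_addr(1, 2, 3)) == (1, 2, 3)   # Node <1.2.3>
assert tipc_addr(1, 1, 3) == 0x1001003               # Node <1.1.3>
```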


Functional Addressing: Unicast

  • Function Address

    • Persistent, reusable 64 bit port identifier assigned by user

    • Consists of type number and instance number

  • Function Address Sequence

    • Sequence of function addresses with the same type

(Diagram: a Client Process calls sendto(type=foo, instance=33); Server Process, Partition A has bind(type=foo, lower=0, upper=99) and Server Process, Partition B has bind(type=foo, lower=100, upper=199); the message "foo,33" is delivered to Partition A.)


Functional Addressing: Multicast

  • Based on Function Address Sequences

    • Any partition overlapping with the range used in the destination address will receive a copy of the message

    • Client defines the “multicast group” per call

(Diagram: a Client Process calls sendto(type=foo, lower=33, upper=133); Server Process, Partition A (bound to 0-99) and Server Process, Partition B (bound to 100-199) both overlap that range and each receives a copy, "foo,33,133".)


Location Transparency

  • Location of server not known by client

    • Lookup of physical destination performed on-the-fly

    • Efficient, no secondary messaging involved

(Diagram: on Node <1.1.1>, a Client Process calls sendto(type=foo, lower=33, upper=133); Server Process, Partition A (bound to 0-99) and Server Process, Partition B (bound to 100-199) receive "foo,33,133" without the client knowing where they reside.)


Location Transparency

  • Location of server not known by client

    • Lookup of physical destination performed on-the-fly

    • Efficient, no secondary messaging involved

(Diagram: the same scenario, now spread over Node <1.1.1> and Node <1.1.2>; the client still sends "foo,33,133" without knowing where the servers reside.)


Location Transparency

  • Location of server not known by client

    • Lookup of physical destination performed on-the-fly

    • Efficient, no secondary messaging involved

(Diagram: the same scenario spread over Nodes <1.1.1>, <1.1.2> and <1.1.3>; the client still addresses the servers purely by function address.)


Address Binding

  • Many sockets may bind to the same partition

    • Closest-First or Round-Robin algorithm chosen by client

(Diagram: a Client Process calls sendto(type=foo, lower=33, upper=133); Server Process, Partition A and Server Process, Partition A’ both have bind(type=foo, lower=0, upper=99); "foo,33,133" is delivered according to the chosen algorithm.)


Address Binding

  • Many sockets may bind to the same partition

    • Closest-First or Round-Robin algorithm chosen by client

  • Same socket may bind to many partitions

(Diagram: Server Process, Partition A+B’ binds both (foo, 0-99) and (foo, 100-199), while Server Process, Partition B binds (foo, 100-199); the client sends "foo,33,133".)


Address Binding

  • Many sockets may bind to the same partition

    • Closest-First or Round-Robin algorithm chosen by client

  • Same socket may bind to many partitions

  • Same socket may bind to different functions

(Diagram: Server Process, Partition A binds both (foo, 0-99) and (bar, 0-999); Server Process, Partition B binds (foo, 100-199); the client sends "foo,33,133".)
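
The two selection algorithms mentioned above can be sketched as follows. The class and method names are hypothetical, invented for this illustration (TIPC exposes the choice through its send options, not through an API like this):

```python
import itertools

# Illustrative sketch: several sockets bound to the same partition, with
# the client choosing Closest-First or Round-Robin selection among them.

class BoundPartition:
    def __init__(self, sockets):
        self.sockets = sockets                 # (distance_in_hops, name)
        self._rr = itertools.cycle(sockets)

    def closest_first(self):
        # Prefer the replica at the shortest distance (e.g. on-node)
        return min(self.sockets, key=lambda s: s[0])[1]

    def round_robin(self):
        return next(self._rr)[1]

part = BoundPartition([(2, "Partition A"), (0, "Partition A'")])
assert part.closest_first() == "Partition A'"     # on-node replica wins
assert [part.round_robin() for _ in range(3)] == \
       ["Partition A", "Partition A'", "Partition A"]
```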


Functional Topology Subscription

  • Function Address/Address Partition bind/unbind events

(Diagram: a Client Process calls subscribe(type=foo, lower=0, upper=500); as Server Process, Partition A does bind(type=foo, lower=0, upper=99) and Server Process, Partition B does bind(type=foo, lower=100, upper=199), the client receives the events "foo,0,99" and "foo,100,199".)
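
The event flow above can be modelled in a few lines (TopologyService and the event tuples are invented for the illustration, not TIPC's topology-service API): a subscriber names a (type, lower, upper) range and receives an event for every bind whose range overlaps it.

```python
# Illustrative sketch of functional topology subscription.

class TopologyService:
    def __init__(self):
        self.subscriptions = []       # (type, lower, upper, event_list)

    def subscribe(self, ftype, lower, upper):
        events = []
        self.subscriptions.append((ftype, lower, upper, events))
        return events

    def bind(self, ftype, lower, upper):
        # Notify every subscriber whose range overlaps the new binding
        for t, lo, hi, events in self.subscriptions:
            if t == ftype and lo <= upper and lower <= hi:
                events.append(("bound", ftype, lower, upper))

svc = TopologyService()
events = svc.subscribe("foo", 0, 500)
svc.bind("foo", 0, 99)        # Partition A comes up
svc.bind("foo", 100, 199)     # Partition B comes up
assert events == [("bound", "foo", 0, 99), ("bound", "foo", 100, 199)]
```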


Network Topology Subscription

  • Node/Cluster/Zone availability events

    • Same mechanism as for function events

(Diagram: a Client Process on Node <1.1.1> calls subscribe(type=node, lower=0x1001000, upper=0x1001009); TIPC on Node <1.1.2> binds (node, 0x1001002) and TIPC on Node <1.1.3> binds (node, 0x1001003), so the client receives the events "node,0x1001002" and "node,0x1001003".)


ForCES Applied on TIPC

(Diagram: a Network Equipment with a Control Element running OSPF, RIP, COPS, CLI, SNMP and other applications, connected via ForCES Protocol/TIPC to a Forwarding Element hosting LFB <IPv4F,1>, LFB <IPv4F,5>, LFB <CNT,17> and LFB <CNT,32>.)


ForCES Applied on TIPC

(Diagram: the same Network Equipment scaled out: three Control Elements and two Forwarding Elements, each FE hosting LFBs (<IPv4F,1>, <IPv4F,5>, <CNT,17>, <CNT,32>), all interconnected via ForCES Protocol/TIPC, with the Forwarding Elements attached to the Internet.)


CONNECTIONS

  • Establishment based on functional addressing

    • Selectable lookup algorithm, partitioning, redundancy, etc.

  • No protocol messages exchanged during setup/shutdown

    • Only payload carrying messages

  • Traditional TCP-style connection setup/shutdown as an alternative

  • End-to-end flow control

  • SOCK_SEQPACKET

  • SOCK_STREAM

  • SOCK_RDM for connectionless and multicast

  • SOCK_DGRAM can easily be added if needed

  • Same with “Unreliable SOCK_SEQPACKET”
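
The "no setup messages" point can be illustrated with a toy model (the Endpoint class and its fields are invented for this sketch, not TIPC code): the first payload message itself establishes the connection state on both endpoints, so no dedicated handshake messages travel on the wire.

```python
# Toy model of TIPC's implicit connection setup: the first payload
# message both creates the connection and delivers data.

class Endpoint:
    def __init__(self, name):
        self.name = name
        self.peer = None
        self.inbox = []

    def sendto(self, dest, payload):
        # The first data message doubles as the connect request
        if self.peer is None:
            self.peer, dest.peer = dest, self
        dest.inbox.append(payload)

client = Endpoint("Client Process")
server = Endpoint("Partition B")
client.sendto(server, b"first request")   # connect + data in one message

assert server.peer is client              # connection established
assert server.inbox == [b"first request"] # payload delivered
```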


CONNECTIONS

  • No protocol messages exchanged during setup/shutdown

    • Only payload carrying messages

(Diagram: a Client Process calls sendto(type=foo, instance=117); the message "foo,117" reaches Server Process, Partition B.)


CONNECTIONS

  • No protocol messages exchanged during setup/shutdown

    • Only payload carrying messages

(Diagram: Server Process, Partition B responds with connect(client) and send() towards the Client Process.)


CONNECTIONS

  • No protocol messages exchanged during setup/shutdown

    • Only payload carrying messages

(Diagram: Client Process and Server Process, Partition B; label: connect(server).)


CONNECTIONS

  • Immediate “abortion” event in case of peer process crash

(Diagram: Server Process, Partition B crashes; the Client Process receives an abort event.)


CONNECTIONS

  • Immediate “abortion” event in case of peer node crash

(Diagram: Client Process and Server Process, Partition B on Nodes <1.1.5> and <1.1.3>; when the peer's node crashes, the surviving side receives an abort event.)


CONNECTIONS

  • Immediate “abortion” event in case of communication failure

(Diagram: Client Process and Server Process, Partition B on Nodes <1.1.5> and <1.1.3>; when communication between the nodes fails, an abort event is delivered.)


CONNECTIONS

  • Immediate “abortion” event in case of node overload

(Diagram: Client Process and Server Process, Partition B on Nodes <1.1.5> and <1.1.3>; on node overload the Client Process receives an abort event.)


Network Redundancy

  • Retransmission protocol and congestion control at signalling link level

  • Normally two links per node pair, for full load sharing and redundancy

(Diagram: Client Process and Server Process, Partition B on Nodes <1.1.5> and <1.1.3>, connected by two parallel links.)


Network Redundancy

  • Retransmission protocol and congestion control at signalling link level

  • Normally two links per node pair, for full load sharing and redundancy

  • Smooth failover in case of single link failure, with no consequences for user level connections

(Diagram: the same node pair with one of the two links failed; traffic continues on the remaining link.)


Remaining Work

Implementation

  • Reliable Multicast not fully implemented yet (expected end of Q1)

  • Re-stabilization after most recent changes

  • Re-implementation of multi-cluster neighbour detection and link setup

Protocol

  • Fully manual inter-cluster link setup

  • Guaranteeing Name Table consistency between clusters

  • Slave node Name Table reduction

  • ?????



http://tipc.sourceforge.net



QUESTIONS ??

