National Sun Yat-sen University Embedded System Laboratory

Efficient Network Interface Architecture for Network-on-Chips

Presenter : Cheng_Ta Wu

Masoumeh Ebrahimi, Masoud Daneshtalab, N P Sreejesh, Pasi Liljeberg, Hannu Tenhunen

Department of Information Technology, University of Turku, Turku, Finland


  • Abstract
  • What’s the problem
  • Related works
  • The proposed method
  • Experimental results

In this paper, we present a novel network interface architecture for on-chip networks to increase memory parallelism and to improve resource utilization. The proposed architecture exploits the AXI transaction-based protocol to be compatible with existing IP cores. Experimental results with a synthetic test case demonstrate that the proposed architecture outperforms the conventional architecture in terms of latency.

What’s the problem
  • According to our observation, the utilization of the reorder buffer in network interfaces (NIs) is very low. Therefore, traditional buffer management is not efficient enough for NIs.
Related works


  • Supporting shared memory abstraction and flexible network configuration
  • NISAR (network interface architecture supporting adaptive routing)
  • Transaction ID renaming
  • Increasing latency
  • Low buffer utilization, and no support for burst transactions
  • Moving the reorder buffer resources from the NI into the network routers
  • Using global synchronization, the performance might be degraded, and the hardware overhead cost is too high

The proposed method
  • Master-side NI architecture
  • Slave-side NI architecture
Master-side NI architecture
  • Both NIs are partitioned into two paths
    • Forward path: transferring the requests to the network
      • AXI-Queue, Packetizer unit, Reorder unit
    • Reverse path: receiving the responses from the network
      • Packet-Queue, Depacketizer unit, Reorder unit
Introduction to the AXI-Queue and Packetizer unit
  • AXI-Queue:
    • Performs the arbitration between the write and read transaction channels and stores requests in the write or read request buffers.
    • If admitted by the reorder unit, the request message is sent to the Packetizer unit.
  • Packetizer:
    • Converts incoming messages from the AXI-Queue into header and data flits.
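
The Packetizer step can be illustrated with a minimal sketch. The slides do not give the flit format, so the flit width (`FLIT_BYTES`), the header fields, and the names `Request`, `Flit`, and `packetize` are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List

FLIT_BYTES = 4  # assumed flit payload width in bytes (not stated in the slides)


@dataclass
class Request:
    tid: int            # AXI transaction ID
    seq: int            # sequence number prepared by the reorder unit
    dest: int           # destination node (hypothetical field)
    is_write: bool
    data: bytes = b""   # write data; empty for a read request


@dataclass
class Flit:
    kind: str           # "header" or "data"
    fields: dict


def packetize(req: Request) -> List[Flit]:
    """Split one admitted request into a header flit followed by data flits."""
    header = Flit("header", {"dest": req.dest, "tid": req.tid,
                             "seq": req.seq, "write": req.is_write})
    flits = [header]
    # Cut the write data into flit-sized chunks; the last chunk may be shorter.
    for off in range(0, len(req.data), FLIT_BYTES):
        flits.append(Flit("data", {"bytes": req.data[off:off + FLIT_BYTES]}))
    return flits
```

Under these assumptions, a read request yields only a header flit, while a 10-byte write yields one header flit and three data flits.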
Introduction to the Packet-Queue and Depacketizer unit
  • Packet-Queue:
    • Receives packets from the router.
    • If the packet is out of order (according to the sequence number), it is transmitted to the reorder buffer; otherwise it is delivered to the Depacketizer unit directly.
  • Depacketizer:
    • Restores packets coming from either the reorder buffer or the Packet-Queue into the original data format of the AXI master core.
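
The in-order versus out-of-order decision made on the reverse path can be sketched as follows. This is a simplified model, not the authors' hardware: the class name `PacketQueueRouter`, the per-ID expected-sequence counter starting at 0, and the string return values are assumptions for illustration. Releasing buffered packets later is deliberately left out of this sketch.

```python
from collections import defaultdict


class PacketQueueRouter:
    """Decides where each arriving response packet goes."""

    def __init__(self):
        # Expected sequence number per transaction ID (assumed to start at 0).
        self.expected = defaultdict(int)
        self.reorder_buffer = []   # out-of-order packets wait here
        self.delivered = []        # packets handed to the Depacketizer

    def receive(self, tid, seq, payload):
        if seq == self.expected[tid]:
            # In order: bypass the reorder buffer.
            self.delivered.append((tid, seq, payload))
            self.expected[tid] += 1
            return "depacketizer"
        # Out of order: park the packet until its predecessors arrive.
        self.reorder_buffer.append((tid, seq, payload))
        return "reorder_buffer"
```

For example, if the packet with sequence number 1 arrives before sequence number 0 for the same transaction ID, the first goes to the reorder buffer and the second goes straight to the Depacketizer.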
Introduction to the Reorder unit
  • Includes a Status-Register, a Status-Table, a Reorder buffer, and a Reorder-Table
  • In the forward path:
    • Prepares the sequence number for the corresponding transaction ID and prevents overflow of the reorder buffer through the admittance mechanism.
  • In the reverse path:
    • Determines where outstanding packets from the Packet-Queue should be sent (reorder buffer or Depacketizer) and when packets in the reorder buffer can be released to the Depacketizer.
Introduction to the Reorder unit (cont.)
  • Status-Register and Status-Table:
    • Status-Register:
      • An n-bit register in which each bit corresponds to one of the AXI transaction IDs. It records whether one or more messages with that transaction ID have been issued and are still outstanding.
    • Status-Table:
      • Each entry of this table tracks the messages that share one transaction ID and includes a valid tag (v), the transaction ID (T-ID), the number of outstanding transactions (N-T), and the expected sequence number (E-S).
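
The bookkeeping described above can be modelled in a few lines. The slides do not state how the sequence number is derived, so this sketch assumes the next sequence number equals E-S + N-T, and it simplifies the Status-Table to one entry per transaction ID. The names `StatusEntry`, `StatusUnit`, `issue`, and `complete` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StatusEntry:
    valid: bool = False    # v
    tid: int = 0           # T-ID
    outstanding: int = 0   # N-T
    expecting: int = 0     # E-S


class StatusUnit:
    def __init__(self, n_ids: int):
        self.status_register = 0  # n-bit; bit i is set while ID i has outstanding messages
        # Simplification: one table entry per transaction ID.
        self.table = [StatusEntry(tid=i) for i in range(n_ids)]

    def issue(self, tid: int) -> int:
        """Forward path: return the sequence number for a new message."""
        entry = self.table[tid]
        if not (self.status_register >> tid) & 1:
            # First outstanding message for this ID: start a fresh entry.
            entry.valid, entry.outstanding, entry.expecting = True, 0, 0
            self.status_register |= 1 << tid
        seq = entry.expecting + entry.outstanding  # assumed derivation
        entry.outstanding += 1
        return seq

    def complete(self, tid: int) -> None:
        """Reverse path: one in-order response was delivered for this ID."""
        entry = self.table[tid]
        entry.expecting += 1
        entry.outstanding -= 1
        if entry.outstanding == 0:
            entry.valid = False
            self.status_register &= ~(1 << tid)
```

With this derivation, delivering a response raises E-S and lowers N-T by one each, so the next issued sequence number is unaffected by deliveries and always continues the series.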

Size_nm: size of new message

Size_AOM: size of all outstanding messages
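
The transcript defines Size_nm and Size_AOM but does not preserve the admittance condition itself. A plausible reading, stated here purely as an assumption, is that a new message is admitted only if it still fits in the reorder buffer together with all outstanding messages. The function name `admit` and the parameter `reorder_buffer_size` are hypothetical.

```python
def admit(size_nm: int, size_aom: int, reorder_buffer_size: int) -> bool:
    """Assumed admittance rule: the new message plus all outstanding
    messages must fit in the reorder buffer in the worst case, where
    every response arrives out of order."""
    return size_nm + size_aom <= reorder_buffer_size


# Usage: with 10 units outstanding and a 16-unit reorder buffer,
# a 4-unit message is admitted and an 8-unit message must wait.
small_ok = admit(size_nm=4, size_aom=10, reorder_buffer_size=16)
large_ok = admit(size_nm=8, size_aom=10, reorder_buffer_size=16)
```

A rule of this shape guarantees that the reorder buffer cannot overflow, at the cost of stalling requests in the AXI-Queue when many messages are outstanding.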

Introduction to the Reorder unit (cont.)
  • Reorder-table and reorder-buffer
    • Each row of the Reorder-Table corresponds to an out-of-order packet stored in the reorder buffer.
    • The Reorder-Table includes the valid tag (v), the transaction ID (T-ID), the sequence number (S-N), and the head pointer (P).
    • Whenever an in-order packet is delivered to the Depacketizer unit, the Depacketizer controller checks the Reorder-Table for a valid stored packet with the same transaction ID and the next sequence number. If one exists, the stored packet is released from the reorder unit to the Depacketizer unit.
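
The release check can be sketched as a lookup keyed on transaction ID and next sequence number. This is an illustrative software model only: the names `ReorderRow`, `ReorderUnit`, `store`, and `release_after` are hypothetical, the reorder buffer is modelled as a dictionary indexed by head pointer, and the sketch assumes that a released packet counts as an in-order delivery, so the check repeats for its successor.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReorderRow:
    valid: bool   # v
    tid: int      # T-ID
    seq: int      # S-N
    head: int     # P: head pointer into the reorder buffer


class ReorderUnit:
    def __init__(self):
        self.table: List[ReorderRow] = []
        self.buffer = {}      # head pointer -> stored packet payload
        self._next_ptr = 0    # simplistic pointer allocation

    def store(self, tid: int, seq: int, payload) -> None:
        """Park an out-of-order packet and record it in the table."""
        ptr = self._next_ptr
        self._next_ptr += 1
        self.buffer[ptr] = payload
        self.table.append(ReorderRow(True, tid, seq, ptr))

    def release_after(self, tid: int, delivered_seq: int) -> list:
        """Called after (tid, delivered_seq) reached the Depacketizer."""
        released = []
        want = delivered_seq + 1
        while True:
            row = next((r for r in self.table
                        if r.valid and r.tid == tid and r.seq == want), None)
            if row is None:
                return released
            row.valid = False                       # free the table row
            released.append(self.buffer.pop(row.head))
            want += 1                               # look for the successor
```

For instance, if packets 2 and 1 of one transaction ID are buffered and packet 0 is then delivered in order, both buffered packets are released, in sequence order.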
Slave-side NI architecture
  • To avoid losing the order of the header information carried by arriving requests, a FIFO is used.
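
One way to picture the role of this FIFO is sketched below, under the assumption that the slave core answers requests in arrival order, so each response can be paired with the oldest stored header. The class `SlaveNI` and the header fields `src`, `tid`, and `seq` are hypothetical names used only for illustration.

```python
from collections import deque


class SlaveNI:
    def __init__(self):
        self.header_fifo = deque()  # header info of requests awaiting a response

    def on_request(self, header: dict) -> None:
        """Keep the header of an arriving request, in arrival order."""
        self.header_fifo.append(header)

    def on_response(self, data) -> dict:
        """Pair a response from the slave core with the oldest stored header
        (assumes the slave core responds in request-arrival order)."""
        header = self.header_fifo.popleft()
        return {"dest": header["src"], "tid": header["tid"],
                "seq": header["seq"], "data": data}
```

Because the header travels back with the response, the master-side NI can read the transaction ID and sequence number it needs for reordering.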
Experimental results
  • In the first configuration (A), out of 25 nodes, ten nodes are assumed to be processors (master cores with master NIs) and the other fifteen nodes are memories (slave cores with slave NIs).
  • In the second configuration (B), each node is considered to have a processor and a memory (a master core with a master NI, and a slave core with a slave NI).
  • Latency is defined as the number of cycles between the initiation of a request operation issued by a master and the time when the response is completely delivered to the master from the memory.
  • The request rate is defined as the ratio of successful read/write request injections into the NI to the total number of injection attempts.
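
The two metrics defined above reduce to simple arithmetic. The sketch below uses made-up numbers purely for illustration; they are not results from the paper, and the function names are hypothetical.

```python
def latency_cycles(issue_cycle: int, response_done_cycle: int) -> int:
    """Cycles from request initiation at the master until the response
    is completely delivered back to the master."""
    return response_done_cycle - issue_cycle


def request_rate(successful_injections: int, injection_attempts: int) -> float:
    """Successful read/write injections into the NI over all attempts."""
    return successful_injections / injection_attempts


# Illustrative values only (not measurements from the paper).
example_latency = latency_cycles(issue_cycle=100, response_done_cycle=142)
example_rate = request_rate(successful_injections=30, injection_attempts=40)
```

Here `example_latency` is 42 cycles and `example_rate` is 0.75.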

The baseline architecture follows references [5] and [6].

Research tree

Efficient Network Interface Architecture for Network-on-chip

Protocol Transducer Synthesis using Divide and Conquer approach

Automatic Interface Synthesis based on the Classification of Interface Protocols of IPs