
Processing while routing: a network-on-chip-based parallel system

Processing while routing: a network-on-chip-based parallel system. S.R. Fernandes, B.C. Oliveira, M. Costa, I.S. Silva. Computers & Digital Techniques, IET, 2009. Reporter: 陳健豪.


Presentation Transcript


  1. Processing while routing: a network-on-chip-based parallel system • S.R. Fernandes, B.C. Oliveira, M. Costa, I.S. Silva • Computers & Digital Techniques, IET, 2009 • Reporter: 陳健豪

  2. OUTLINE • Introduction • Related works • IPNoSys architecture • Results • Conclusions

  3. Introduction • Technology integration has increased to the point where multi-core processor architectures are a market reality. • Bus-based designs remain useful only while the number of cores in the processor is kept small. • Larger core counts call for more powerful interconnects, such as a network-on-chip (NoC).

  4. Introduction • A NoC, however, requires more chip area and more power. • This paper proposes the IPNoSys system, in which the routers are responsible for executing operations in addition to routing.

  5. Related works • NoC

  6. IPNoSys architecture • The NoC is not only an interconnection mechanism but also an active element in the execution of applications. • Square 2D mesh • XY routing policy • Virtual cut-through (VCT) and wormhole switching • Virtual channels • Credit-based flow control • Distributed arbitration and input buffering
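
The XY routing policy listed above can be illustrated with a minimal Python sketch. It assumes routers are addressed by `(x, y)` coordinates on the mesh; the function name `xy_route` and the coordinate convention are hypothetical and do not come from the paper. A packet first travels along the X dimension until its column matches the destination, then along the Y dimension.

```python
def xy_route(src, dst):
    """Return the list of (x, y) routers visited from src to dst.

    Assumed convention: the packet moves along X first, then along Y.
    """
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    # Phase 1: correct the X coordinate one hop at a time.
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append((x, y))
    # Phase 2: correct the Y coordinate one hop at a time.
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

Because every packet finishes its X moves before it makes any Y move, the route between two routers is fixed, which is why XY routing is often chosen for simple mesh NoCs.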

  7. IPNoSys architecture

  8. IPNoSys architecture • Each router includes an arithmetic logic unit (ALU), allowing it to perform the most common arithmetic and logic operations found in applications. • Because it both routes and processes packets, the router is called a routing and processing unit (RPU). • The memory modules are accessed through memory access cores (MACs).
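
The idea of processing while routing can be sketched roughly as follows. This is only an illustration under strong assumptions: the packet is modelled as a running value plus a list of pending `(operation, operand)` pairs, which is not the IPNoSys packet format, and the names `rpu_step` and `route_and_process` are hypothetical. Each simulated RPU executes the instruction at the head of the packet and forwards the shortened packet to the next hop.

```python
import operator

# Hypothetical instruction set for the sketch; the real RPU ALU may differ.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def rpu_step(packet):
    """Simulate one RPU: execute the head instruction, forward the rest."""
    value, instructions = packet
    if not instructions:
        return packet  # nothing left to execute, the packet is only routed
    (op, operand), remaining = instructions[0], instructions[1:]
    return (OPS[op](value, operand), remaining)


def route_and_process(packet, hops):
    """Pass the packet through `hops` RPUs in sequence."""
    for _ in range(hops):
        packet = rpu_step(packet)
    return packet


if __name__ == "__main__":
    packet = (2, [("add", 3), ("mul", 4), ("sub", 1)])
    print(route_and_process(packet, 3))  # (19, [])
```

In this toy model the packet shrinks by one instruction per hop, so a packet with three instructions is fully executed after visiting three routers.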

  9. IPNoSys architecture

  10. IPNoSys architecture

  11. IPNoSys architecture • spiral complement routing algorithm:

  12. IPNoSys architecture • Deadlock treatment:

  13. The number of virtual channels is the number of times that a packet may pass through the same physical channel in the same direction. • In our case the maximum is three times (Fig. 3). • Thus, the IPNoSys system treats deadlock through a solution called local execution.

  14. IPNoSys architecture • Packet format

  15. IPNoSys architecture • Routing and processing unit (RPU)

  16. IPNoSys architecture

  17. IPNoSys architecture • Memory access core • The MACs placed in the corners are responsible for reading the packets from memory and injecting them into the NoC.

  18. Results • Implemented in cycle-accurate SystemC • Different NoC dimensions • Three simulation cases: simple counter, DCT, RLE

  19. Results: Simple counter • Sequential and parallel execution

  20. The IPNoSys system reduced the maximum number of instructions performed by around 80% when comparing sequential and parallel execution.

  21. Results: DCT • The 2D-DCT is widely used in image compression.

  22. The DCT application has many data dependencies, which is the worst case for instruction-level parallelism (ILP).

  23. The memory required by IPNoSys increases slightly with more parallelism because of the increase in communication.

  24. Results: RLE • Run-length encoding (RLE) is suited for compressing any type of data regardless of its information content. • For example, an uncompressed string formed by 15 'A' characters would normally require 15 bytes to store: AAAAAAAAAAAAAAA.
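
The RLE example above can be reproduced with a short Python sketch. It shows a generic run-length encoder and decoder, not the RLE application code evaluated on IPNoSys; the function names and the `(count, character)` output format are assumptions made for illustration.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (count, char) pair."""
    return [(len(list(group)), char) for char, group in groupby(text)]


def rle_decode(runs):
    """Expand (count, char) pairs back into the original string."""
    return "".join(char * count for count, char in runs)


if __name__ == "__main__":
    print(rle_encode("A" * 15))  # [(15, 'A')]
```

The 15-character string collapses into a single run, which is why long runs of repeated symbols compress well under RLE.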

  25. This means that the number of packets decreases, on average, by the end of its execution.

  26. Detailed comparison (STORM vs. IPNoSys) • STORM • Instances with one, two, four or 15 SPARC V8 processors • 2D-mesh NoC • Two, three, five and 16 routers, respectively • Cache-coherent, directory-based MP-SoC platform • XY routing scheme

  27. Conclusions • This paper presented IPNoSys, an innovative NoC-based architecture that does not use traditional processors. • The architecture's execution capability is independent of the number of application instructions and of the NoC dimensions. • In DCT, the execution time on IPNoSys is 3.5 times smaller than the STORM best case, which shows the efficiency of the parallelism in this system. • In RLE, IPNoSys also performed better than STORM.
