Software-Compiled System Design:  A Methodology for Field Programmable System-on-Chip Design


Presentation Transcript


1. Software-Compiled System Design: A Methodology for Field Programmable System-on-Chip Design EE201A – VLSI Architectures and Design Methods May 13, 2003 Doug Johnson

2. Methodology Target Devices
  • Field Programmable System-on-Chip (FPSoC) definition
      • Over 1 million usable gates in FPGA logic
      • Embedded processor cores
      • On-board peripherals and memory
  • Usage
      • Rapid prototypes (verification stage and ASIC RTL sign-off)
      • Low-NRE implementation
      • Reconfigurable systems
  • Examples
      • Altera Excalibur
          • Apex FPGA (up to 1M gates)
          • NIOS (32/16-bit RISC soft core)
          • ARM922T (32-bit, AMBA)

3. System Design Challenges
  • FPSoC design (as SoC) presents system design challenges
      • Over half of FPGA designs include over 500K gates*
      • Requires convergence of HW/SW methodologies
  • Current flow deficiencies
      • Poorly profiled partitions lead to sub-optimal performance
      • Incompatible flow and verification between HW and SW designs
      • Gap between specification and hardware RTL
      • Impact of specification changes

4. Software-Compiled System Design
Software-compiled system design is a methodology that meets the needs of today’s complex co-design and system-level challenges. It merges best practice in hardware and software design to provide an open, flexible and quality-driven co-design flow that enables designers to explore the design space more fully, evaluate and verify hardware/software trade-offs, and rapidly prototype their design before production begins. By driving system partitioning, verification and direct optimized implementation from the system specification, it can help reduce design time by some 50% to 75% while delivering proven Quality of Results (QoR) and improving the overall Quality of Design (QoD).

5. Software-Compiled System Design Flow
System design functions:
  • Co-design: provide rapid iteration of partitioning decisions throughout the flow
  • Co-verification: drive continuous system verification from concept to hardware
  • C to RTL: generate human-readable VHDL and Verilog for ASIC RTL hand-off
  • C to FPGA (FPSoC): enable direct implementation to device-optimized programmable logic

6. System Co-design

7. DSM – Data Streaming Manager

8. System Co-verification

9. Direct FPSoC Implementation

10. Handel-C extensions to ANSI-C
Handel-C is almost identical to ANSI-C and should be familiar to anyone who has done algorithm development. The extensions not only control timing and parallelism but also include constructs to interface to external logic, instantiate RAMs and define clock domains. Features that do not make sense in hardware (recursion, malloc, etc.) have been removed from the language but can still be used in simulation.

11. SCSD Design Case Study: JPEG2000
  • Xilinx project benchmark for Platform FPGA (FPSoC)
  • Start with C description of JPEG2000 algorithm
  • Use Software-Compiled System Design methodology
  • Partition and implement JPEG2000 design
  • Compare results against original VHDL design performance

12. JPEG2000 Case Study: Flow Steps
  • Phase 1: Profile and Verify
      • Simulate C algorithm on PowerPC 405 processor ISS, identify bottlenecks
      • Establish verification plan
  • Phase 2: Partition and Verify
      • Use DSM to explore design space, move blocks to HW
      • Analyze HW/SW communication interface
      • Fine-tune partition performance
  • Phase 3: Design and Verify
      • Tune design for performance
      • Combine SW calls
      • Add HW parallelism, refine Handel-C (HC) code
      • Co-verify HW/SW (with WRS ISS)

13. JPEG2000 Case Study: Flow Steps (cont.)
  • Phase 3b: Specification Change
      • Import VHDL IP for 2D DWT algorithm
      • Treat as black box, called from HC
      • Perform RTL/HC/ISS co-simulation
  • Phase 4: Implement and Verify
      • Directly compile HC blocks to EDIF
      • Import VHDL IP hard core as EDIF
      • Compile SW to PPC under VxWorks
      • Xilinx P&R tools for Virtex-II Pro target platform
  • Initial platform: Wind River SBC405GP and Proteus FPGA daughter card
  • Platform retargeted without application code changes using PAL API
  • Final platform: Virtex-II Pro ML300 evaluation platform

14. JPEG2000 Case Study results
Our DWT block has the following specs when targeted for a 2V2000-4 part, with an input bit width of 12 and a maximum image width of 1K:
  • Lines of code: 435 (counting the number of ";")
  • Fmax: 128 MHz
  • Area: 800 slices, 9 BRAMs

15. ROI Example: Northrop Grumman
  • Northrop Grumman FPSoC design project
  • Design started in VHDL
  • Parallel effort begun using Celoxica SCSD methodology
  • X Million gate Altera FPGA, blank algorithm, x MHz target

16. Celoxica System-Design Summary

17. Handel-C Language

18. Fundamentals
  • Based on the international ANSI-C standard
      • No complex class libraries or structures
  • Language extensions for hardware implementation as part of a system-level design methodology
      • Extensions enable optimization of timing and area performance
  • Systems described in ANSI-C can be implemented in software and hardware, using the language extensions defined in Handel-C to describe hardware
  • Language developed at the University of Oxford and based upon CSP (Communicating Sequential Processes)

19. Core Language Features
  • Standard C (if, while, switch, etc.), including functions, structures and pointers
  • par {…} construct for parallelism
  • Simple model of timing: each assignment is one clock cycle
  • Arbitrary widths on variables
  • Enhanced bit manipulation operators
  • Sharing/copying expressions
  • Support for hardware constructs: multiple clock domains, RAM, ROM, external interfaces

20. Handel-C for hardware
  • No side effects in expressions
      • i.e. statements like a = b*c++; are not supported
  • No floating point
      • Floating point is not directly supported by Handel-C; library support is provided for fixed- and floating-point arithmetic
  • No run-time recursion
      • Due to the absence of any kind of ‘call stack’ in hardware
  • Limited standard library (i.e. no printf, fopen, etc.)
      • However, DK1.1 allows direct calls to external functions written in C/C++, and these could incorporate file I/O, user interaction, recursion, etc.

21. Variables
  • Handel-C has one basic type: integer
  • May be signed or unsigned
  • Can be any width, not limited to 8, 16, 32, etc.
There is only one fundamental type of variable, an int. The int type may be qualified with the unsigned keyword to indicate that the variable holds only non-negative integers. For example, two declarations can define a 5-bit signed integer x and a 13-bit unsigned integer y.
Handel-C can sometimes infer the width of variables from their usage, so it is not always necessary to define the width of every variable explicitly. The undefined keyword indicates that the compiler should attempt to infer the width of a variable. For example, in the assignment x = y, if the compiler knows the width of x, it knows that y must have the same width.
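
To make the effect of arbitrary widths concrete, here is a small Python model of what a register of a given width can hold. It is only a sketch: the helper name wrap is hypothetical, and it assumes two's-complement storage with silent wraparound on overflow, which the slide does not state.

```python
def wrap(value: int, width: int, signed: bool) -> int:
    """Reduce value to what a register of the given bit width would hold.

    Assumes two's-complement storage and wraparound on overflow.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    value &= (1 << width) - 1            # keep only the low `width` bits
    if signed and value >> (width - 1):  # top bit set means negative
        value -= 1 << width
    return value


# A 5-bit signed integer covers -16..15, so 15 + 1 wraps around.
x = wrap(15 + 1, 5, signed=True)
# A 13-bit unsigned integer covers 0..8191, so 8191 + 1 wraps to 0.
y = wrap(8191 + 1, 13, signed=False)
```

Choosing the narrowest width that fits the data is what lets the hardware use fewer registers and narrower arithmetic than a fixed 32-bit int would.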

22. Bit Manipulation Operators
Extra operators have been added to allow more ‘hardware-like’ bit manipulation. The following slides describe these operators in more detail.

23. Example Bit Manipulation
A range of bits may be selected from a value. In this example, b is assigned the four middle bits of variable a.
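
The slide's own example is not reproduced here, so the following Python sketch shows the same kind of operation with shifts and masks. It assumes a is 8 bits wide, so the four middle bits are bits 5 down to 2. The helper names (select, take_lsbs, drop_lsbs, concat) are hypothetical and are not Handel-C syntax.

```python
def select(value: int, msb: int, lsb: int) -> int:
    """Return bits msb..lsb inclusive (bit 0 is the least significant)."""
    return (value >> lsb) & ((1 << (msb - lsb + 1)) - 1)


def take_lsbs(value: int, n: int) -> int:
    """Keep only the n least significant bits."""
    return value & ((1 << n) - 1)


def drop_lsbs(value: int, n: int) -> int:
    """Discard the n least significant bits."""
    return value >> n


def concat(high: int, low: int, low_width: int) -> int:
    """Place `high` above a `low_width`-bit value `low`."""
    return (high << low_width) | low


a = 0b10110110                 # assumed 8-bit value
b = select(a, 5, 2)            # four middle bits: 0b1101
low_nibble = take_lsbs(a, 4)   # 0b0110
high_nibble = drop_lsbs(a, 4)  # 0b1011
rebuilt = concat(high_nibble, low_nibble, 4)
```

In hardware these operations cost no logic at all: selecting or concatenating bits is just wiring.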

24. Bit Manipulation 2
Other bit manipulation examples:

25. Timing model
  • Assignments and delay statements take 1 clock cycle
  • Combinatorial expressions are computed between clock edges
  • The most complex expression determines the clock period
  • Example: takes 1+n cycles (n is the number of iterations)
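
The 1+n figure can be checked with a small cycle-counting model in Python. The loop below is an assumed stand-in for the slide's example, which is not preserved in the transcript: one initial assignment followed by a loop whose body is a single assignment. The model charges one cycle per assignment and nothing for evaluating the loop condition, following the rules listed above.

```python
def count_up(n: int):
    """Model `x = 0; while (x != n) x = x + 1;` and count clock cycles."""
    cycles = 0

    x = 0
    cycles += 1          # the initial assignment takes one cycle

    while x != n:        # condition is combinational: no cycle charged
        x = x + 1
        cycles += 1      # one cycle per loop-body assignment

    return x, cycles


final_value, total_cycles = count_up(5)   # 1 + 5 cycles
```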

26. Parallelism
  • Handel-C blocks are sequential by default
  • par{…} executes statements in parallel
  • A par block completes when all of its statements complete
      • Time for the block is the time for the longest statement
  • Sequential blocks can be nested in par blocks

27. More Parallelism
  • Example: array initialisation
  • Sequential version takes 20 clock cycles
      • for() loop has 1 cycle overhead for increment
  • Parallel version takes 1 clock cycle
      • Replicated par() builds hardware to execute all 20 iterations in a single cycle
  • Allows trade-off between hardware size and performance
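
The cycle counts for sequential and parallel blocks follow from two rules: a sequential block costs the sum of its statements, and a par block costs the maximum. The Python sketch below models just those rules. The class names (Assign, Seq, Par) are hypothetical, and the model deliberately ignores loop-increment overhead, so the sequential case is written as 20 unrolled assignments.

```python
from dataclasses import dataclass


@dataclass
class Assign:
    target: str          # an assignment always costs one cycle


@dataclass
class Seq:
    body: list           # statements run one after another


@dataclass
class Par:
    body: list           # statements start together


def cycles(stmt) -> int:
    """Clock cycles needed to execute a statement tree."""
    if isinstance(stmt, Assign):
        return 1
    if isinstance(stmt, Seq):
        return sum(cycles(s) for s in stmt.body)
    if isinstance(stmt, Par):
        return max((cycles(s) for s in stmt.body), default=0)
    raise TypeError(f"unknown statement: {stmt!r}")


writes = [Assign(f"a[{i}]") for i in range(20)]
sequential_init = cycles(Seq(writes))   # one write per cycle
parallel_init = cycles(Par(writes))     # all writes in the same cycle

# A sequential block nested in a par block: the longest branch decides.
nested = cycles(Par([Seq([Assign("p"), Assign("q"), Assign("r")]), Assign("s")]))
```

The parallel version is 20 times faster here, but it needs 20 copies of the write hardware instead of one, which is the size/performance trade-off the slide refers to.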

28. Channels
  • Allow communication and synchronisation between two parallel branches
  • Semantics based on CSP (used by NASA and US Naval Research Laboratory)
      • Unbuffered (synchronous) send and receive
  • Declaration specifies the data type to be communicated
Handel-C provides channels for communicating between parallel branches of code. One branch writes to a channel and a second reads from it. Communication occurs only when both branches are ready for the transfer, at which point one item of data is transferred between them.
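
The unbuffered, synchronous behaviour described above can be imitated in software with two threads and a rendezvous. This Python sketch is an analogy only: the Channel class and its send/receive methods are hypothetical names, and threads stand in for parallel hardware branches. The key property it shows is that send does not return until a receiver has taken the value.

```python
import threading


class Channel:
    """Unbuffered channel: send blocks until a receiver takes the item."""

    def __init__(self):
        self._item = None
        self._send_lock = threading.Lock()      # one sender at a time
        self._ready = threading.Semaphore(0)    # sender signals receiver
        self._taken = threading.Semaphore(0)    # receiver signals sender

    def send(self, value):
        with self._send_lock:
            self._item = value
            self._ready.release()    # announce that a value is available
            self._taken.acquire()    # wait until the receiver has read it

    def receive(self):
        self._ready.acquire()        # wait for a sender
        value = self._item
        self._taken.release()        # let the sender continue
        return value


def demo():
    ch = Channel()
    received = []

    def producer():
        for i in range(5):
            ch.send(i)

    def consumer():
        for _ in range(5):
            received.append(ch.receive())

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Because there is no buffer, the channel both moves data and keeps the two branches in step, which is why it doubles as a synchronisation primitive.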

29. Sharing Hardware for Expressions
  • Functions provide a means of sharing hardware for expressions
  • By default, the compiler generates separate hardware for each expression
      • Hardware is idle when control flow is elsewhere in the program
  • Hardware function body is shared among call sites
By default, Handel-C generates all the hardware required for every expression in the whole program. In many programs, this means that large parts of the hardware will be idle for long periods of time. The shared expression allows hardware to be shared between different parts of the program to decrease hardware usage. The shared expression has the same format as a macro expression but does not allow recursion.

30. Replicating Hardware for Expressions
  • Inline functions are expanded at the call site
  • Provide for functional abstraction of complex hardware

31. Macro Procedures
A macro proc is similar to an inline function, but is expanded at compile time. Macro procedures also allow for arbitrary bit-width calculations. The following generates a reusable timer:

32. Signals
A signal behaves like a wire: it takes the value assigned to it, but only for that clock cycle. The value can be read back during the same clock cycle. The signal can also be given a default value.
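
The difference between a signal and an ordinary variable is that the signal does not remember its value across clock edges. A minimal Python model of that behaviour follows. The Signal class and its end_cycle method are hypothetical names used only for illustration, and the model assumes that an unassigned signal reads as its default value.

```python
class Signal:
    """Wire-like value: valid only in the cycle in which it is assigned."""

    def __init__(self, default=0):
        self.default = default
        self._value = default

    def write(self, value):
        self._value = value       # visible immediately, in the same cycle

    def read(self):
        return self._value

    def end_cycle(self):
        self._value = self.default   # nothing is stored across the clock edge


s = Signal(default=0)
s.write(7)
same_cycle = s.read()     # the value written in this cycle
s.end_cycle()
next_cycle = s.read()     # falls back to the default
```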

33. Interfaces - Introduction
Interfaces allow Handel-C designs to connect to external hardware and logic. There are three types of interface:
  • Buses: used for connecting to external pins
  • Ports: used for creating connection points for external logic, e.g. creating the ports for a VHDL entity
  • User-defined: used for including external logic blocks inside a Handel-C design, e.g. including an EDIF black box inside a design

34. Interfaces – Buses
Make connections to pins on the FPGA. Bus types:
  • Output
  • Input: direct, clocked and latched input
  • Tri-state: direct, clocked and latched tri-state

35. Interfaces – Ports
Allow connection points for external logic to be specified, e.g. defining the ports for a ‘black box’ VHDL entity.
Port types: input, output

36. Interfaces – User Defined
Allow external logic blocks to be used inside a Handel-C design, e.g. using an EDIF core.

37. Multiple Clock Domains - Example

38. Handel-C Summary
  • Handel-C is based on ANSI C
  • Well-defined semantics similar to CSP
  • Everything that can be described in Handel-C has a translation to hardware
  • Co-simulation with mixed-language descriptions such as C/C++/SystemC, SpecC and HDLs
  • Additions:
      • support for parallelism
      • channels for communication between parallel processes
      • operators for detailed control of hardware
      • constructs for RAM, ROM, interfacing, etc.

39. The Solution Framework – Hardware and Software Abstraction and Integration
This presentation is about enabling developers to make more effective use of DK1, particularly in the area of hardware acceleration of software applications. The elements to be covered provide a framework to help in the creation of complete platform solutions.

40. Solution Framework – Hardware Abstraction
Goal: to provide ‘Platform Independence’ for Handel-C applications
  • Allow migration of applications between hardware platforms
  • Insulate developer from hardware implementation
  • Reduce development time
  • Allow development focus on added value rather than on detailed hardware interfacing
  • Provide an environment similar to that expected from a software operating system
Platform independence, or hardware abstraction, insulates the developer from the specifics of the hardware platform, something with which software engineers are very familiar thanks to operating systems and the drivers they provide for hardware I/O. The goal of this part of the solution framework is to provide an environment very similar to that provided by a modern operating system. The ultimate benefit is a reduction of effort in original development or in migration between platforms, with consequently reduced cost and time to market and the ability to focus on the added-value application rather than hardware detail.

41. Solution Framework – Platform Abstraction Layer (PAL)
This is the situation now (or before PAL). In this diagram, the Handel-C application not only has to provide the value-added functionality but also has to deal with the detailed interfacing to the two I/O ports on the chip/board (these are the two ‘legs’ of the application). In an ideal world, this work would be done for the developer, reducing development time and focusing effort on added value.

42. Solution Framework – Platform Abstraction Layer (PAL)
This is the situation now for boards with a Handel-C Platform Support Library or “PSL”, such as the Celoxica RC100 and RC1000. The application programmer no longer needs to deal directly with the hardware, but the application is still not portable: the PSL is platform-specific, and the calls made to it from the application would need to be changed for each board the application is to run on.

43. Solution Framework – Platform Abstraction Layer (PAL)
In this diagram we have added the consistency required for portability in the form of the Platform Abstraction Layer, or “PAL”. This offers a consistent programming interface, the PAL API (Application Programming Interface), for accessing a Platform Support Library (PSL) that does the detailed work of communicating with the I/O interfaces. Because the PAL API is consistent across all supported platforms, the application can easily be migrated to any other platform that is supported by a PAL library and is suitable for the application.
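
The layering described above is the familiar adapter pattern: the application calls one stable interface, and a thin per-platform layer translates each call into whatever the board's support library expects. The Python sketch below illustrates only that structure. Every name in it (BoardAPsl, BoardBPsl, Pal, led_write and so on) is hypothetical; none of it is the actual Celoxica PAL or PSL API.

```python
from abc import ABC, abstractmethod


class BoardAPsl:
    """Hypothetical platform-specific library: LEDs driven by a bit mask."""

    def __init__(self):
        self.log = []

    def set_led_register(self, mask):
        self.log.append(("A", mask))


class BoardBPsl:
    """Hypothetical platform-specific library: LEDs addressed by index."""

    def __init__(self):
        self.log = []

    def led_on(self, index):
        self.log.append(("B", index))


class Pal(ABC):
    """The consistent interface the application is written against."""

    @abstractmethod
    def led_write(self, index):
        ...


class PalForBoardA(Pal):
    def __init__(self, psl):
        self.psl = psl

    def led_write(self, index):
        self.psl.set_led_register(1 << index)   # translate index to mask


class PalForBoardB(Pal):
    def __init__(self, psl):
        self.psl = psl

    def led_write(self, index):
        self.psl.led_on(index)


def application(pal: Pal):
    """Application code: identical for every supported board."""
    for i in range(3):
        pal.led_write(i)


board_a, board_b = BoardAPsl(), BoardBPsl()
application(PalForBoardA(board_a))
application(PalForBoardB(board_b))
```

The application function never changes; only the adapter chosen at build time does, which is the portability property the slide describes.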

44. Solution Framework – Platform Abstraction Layer (PAL)
As a final step, we have now added the “PAL-Core”, a collection of added functionality that is widely applicable. An example is the PS2 mouse and keyboard. The implementation of the PS2 port is specific to the platform and would be implemented in the PSL. The PAL layer provides consistent calls to PS2-type ports on all supported platforms. The PAL-Core layer uses these PAL PS2 calls to interpret the PS2 port for keyboard and mouse I/O, and provides the application with simple and consistent calls to access keyboard and mouse devices through the PS2 port. Other functions offered by the PAL-Core include video frame buffering and the PAL Console, a mechanism for providing text output on a suitable video output from an FPGA. This can be very useful for debugging applications or providing status and progress messages.

45. Solution Framework – Platform Abstraction Layer (PAL)
This shows an overall picture of both the layered PAL model and the portability offered by this approach.

46. Solution Framework – Hardware and Software Integration
Goal: to provide simple integration between microprocessor applications and Handel-C accelerators
  • Insulate developer from physical implementation
  • Insulate developer from hardware details
  • Reduce development time
  • Allow development focus on added value
  • Provide an environment similar to that expected from a software operating system
The goals of the Data Stream Manager (DSM) are very similar to those of PAL: independence from the hardware, allowing portability and reduced development time and cost. Again, the DSM provides the type of environment that would be expected of an operating system on a microprocessor: the ability to pass requests to other applications or functions without having to be concerned with the detail of how this is accomplished.

47. Solution Framework – Hardware and Software Integration
This is the situation before DSM: co-processing functions can be implemented in FPGAs/PLDs, but interconnecting them with applications on microprocessors or DSPs is an essentially manual programming task. The potential for duplication of effort is large, and the resulting complexity increases development time and cost as well as the costs of support and migration.

48. Solution Framework – Data Stream Manager (DSM)
DSM provides a layered model that facilitates interaction between applications on processors or DSPs and functions on FPGAs/PLDs. In the FPGA there is a Hardware DSM (H-DSM), whose API the Handel-C co-processing function uses to connect to the environment and through which it receives requests and returns results. In the processor/DSP there is an equivalent Software DSM (S-DSM), which allows applications to make function requests and receive returned data through a consistent API.

49. Solution Framework – Hardware Data Stream Manager (H-DSM)
This slide shows more detail of the Hardware DSM layers. Note that, in effect, the H-DSM is layered onto PAL for the hardware specifics, such as the bus used to integrate FPGA and processor.

50. Solution Framework – Software Data Stream Manager (S-DSM)
The S-DSM layers are similar to the H-DSM, but with the operating system providing the hardware interfacing through conventional device drivers.

51. DSM Implementation
  • Two components: Software DSM and Hardware DSM
  • Each side opens unidirectional ports: write port on one side, read port on the other
  • DSM takes care of buffering for optimal bus throughput
  • Connects multiple software functions to multiple hardware blocks using a simple API
  • Available for VII Pro prototyping boards and VII Pro
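
The port model above, with one side writing, the other side reading, and buffering in between, can be sketched as a pair of one-way streams. The Python model below is illustrative only: the Link class, its burst parameter and its method names are hypothetical and are not the DSM API, and the "bus" is just an in-memory queue. It shows how grouping writes into bursts reduces the number of bus transfers.

```python
from collections import deque


class Link:
    """One unidirectional stream with a write end and a read end."""

    def __init__(self, burst=4):
        self.burst = burst
        self._pending = []     # words buffered on the writer's side
        self._bus = deque()    # words that have crossed the bus
        self.transfers = 0     # number of bus transactions so far

    def write(self, word):
        self._pending.append(word)
        if len(self._pending) >= self.burst:
            self.flush()       # send a full burst in one transaction

    def flush(self):
        if self._pending:
            self._bus.extend(self._pending)
            self._pending.clear()
            self.transfers += 1

    def read(self):
        return self._bus.popleft()   # IndexError if nothing has arrived


sw_to_hw = Link(burst=4)   # software writes, hardware reads
hw_to_sw = Link(burst=4)   # hardware writes, software reads

for word in range(8):      # software side sends its request data
    sw_to_hw.write(word)
sw_to_hw.flush()           # nothing pending here, so no extra transfer

for _ in range(8):         # stand-in for the hardware function: doubles
    hw_to_sw.write(sw_to_hw.read() * 2)
hw_to_sw.flush()

results = [hw_to_sw.read() for _ in range(8)]
```

Eight words cross in two bursts rather than eight single transfers, which is the kind of saving that buffering for bus throughput is meant to deliver.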

52. DSM and CoreConnect
  • DSM can hook onto PLB as a slave device
  • Multiple functions beyond a single DSM

53. S-DSM API Calls

54. H-DSM API Calls

55. Technical Demonstration: JPEG2000 Implementation Using DK1

56. Software-Compiled System Design PHASE 1: PROFILING AND VERIFICATION

57. Software Model
  • Systems often start as a pure software model
  • Very quick to design, demonstrate and validate the algorithm
  • Provides a test framework for future development
  • Let’s see what a software model looks like…

58. Software Analysis
  • Partitioning is a difficult exercise
  • Software model can highlight compute-intensive components
  • Well-known tools are available for profiling and data-flow analysis
  • JPEG2000 profile…
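
One way to use a profile when deciding what to move into hardware is Amdahl's law: if a block accounts for a fraction f of the run time and hardware makes that block s times faster, the whole system speeds up by 1 / ((1 - f) + f / s). The Python sketch below applies the formula. The function name system_speedup is hypothetical, and the figures in the example are illustrative only; they are not measurements from the JPEG2000 case study.

```python
def system_speedup(fraction_moved: float, block_speedup: float) -> float:
    """Overall speedup from accelerating part of a program (Amdahl's law).

    fraction_moved: share of total run time spent in the moved block (0..1)
    block_speedup:  how many times faster that block runs in hardware
    """
    if not 0.0 <= fraction_moved <= 1.0:
        raise ValueError("fraction_moved must be between 0 and 1")
    if block_speedup <= 0:
        raise ValueError("block_speedup must be positive")
    return 1.0 / ((1.0 - fraction_moved) + fraction_moved / block_speedup)


# Illustrative numbers: a block taking 60% of run time, made 10x faster.
hot_block = system_speedup(0.60, 10.0)

# Accelerating a block that takes only 5% of run time barely helps,
# however fast the hardware version is.
cold_block = system_speedup(0.05, 1000.0)
```

This is why profiling comes before partitioning: the achievable gain is bounded by how much time the software actually spends in the block being moved.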

59. Software-Compiled System Design PHASE 2: PARTITIONING AND VERIFICATION

60. EE201A – VLSI Architectures and Design Methods Partitioning Typical partitioning exercise is an event. Based on experience and judgment. Partition, Implementation, Integration, verification. Software-compiled system design: partitioning is a process. Allows design exploration. Piece-by-piece partition, implement, integrate, verify.
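
The piece-by-piece loop (partition, implement, integrate, verify) can be sketched as swapping one stage of a software pipeline for a candidate implementation and checking the whole pipeline against the original on test vectors. The stage names and helper below are hypothetical, and the "hardware" candidate is just another Python function standing in for a hardware model.

```python
def scale(x):
    """Reference software stage."""
    return x * 2


def offset(x):
    """Reference software stage."""
    return x + 3


def scale_shift(x):
    """Candidate replacement for scale, written the way hardware might do it."""
    return x << 1


REFERENCE_PIPELINE = [("scale", scale), ("offset", offset)]


def run_pipeline(pipeline, value):
    for _name, func in pipeline:
        value = func(value)
    return value


def verify_partition(stages, candidate_name, candidate_impl, test_vectors):
    """Swap one stage for a candidate and compare against the reference pipeline."""
    trial = [
        (name, candidate_impl if name == candidate_name else func)
        for name, func in stages
    ]
    return all(
        run_pipeline(trial, v) == run_pipeline(stages, v) for v in test_vectors
    )
```

Because only one stage changes per step, a mismatch points directly at the stage that was just moved, which is the practical benefit of treating partitioning as a process rather than a one-off event.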

61. EE201A – VLSI Architectures and Design Methods Nexus PDK Co-design & co-verification solution for the DK design suite System partitioning System driven verification flow Cycle based High level language simulator Co-simulation with RTL and ISS Mixed language description Platform Developer’s Kit Design partitioning (DSM) HW/ SW integration (DSM) Design re-use (PAL) Application portability (PAL) Introducing Nexus PDK, the hub of your system co-design and co-verification environment. Nexus is a co-design and co-verification solution that enables system partitioning and co-simulation. Nexus is supplied with a Platform Developer's Kit (the PDK) that provides the elements needed to: enable informed system design partitioning; co-simulate mixed language descriptions; manage hardware/software integration; and simplify design implementation. The Nexus-PDK co-verification environment provides clear advantages over more typical co-simulation products. Nexus-PDK delivers a software simulation and debug environment for system design using higher-level languages. It supports multiple higher-level languages, including C, C++, SystemC and Handel-C, as well as HDLs. Co-simulation support is provided inside Nexus for mixed language descriptions such as HDL, SystemC, C or C++ and Handel-C, enabling the designer to select the language that best expresses individual elements or blocks of the design. Nexus-PDK augments the designer's productivity in the exploration of algorithmic models by allowing algorithmic development in C-based languages, or by using compatibility with Matlab algorithm design flows. Nexus creates models that can be simulated in a Simulink environment to connect algorithm development to the embedded system implementation and verification flow. Nexus-PDK also provides a Platform Developer's Kit to support the rapid development of embedded system platforms. This supports user productivity by allowing new systems to be readily created from existing software models.

62. EE201A – VLSI Architectures and Design Methods Integration – Data Streaming Manager As a final step, we have now added the “PAL-Core”. This is a collection of added functionality which is widely applicable. An example is the PS/2 mouse and keyboard. The implementation of the PS/2 port is specific to the platform and would be implemented in the PSL. The PAL layer provides consistent calls to PS/2-type ports on all supported platforms. The PAL-Core layer uses these PAL PS/2 calls to interpret the PS/2 port for keyboard and mouse I/O, and provides the application with simple and consistent calls to access keyboard and mouse devices through the PS/2 port. Other functions offered by the PAL-Core include video frame buffering and the PAL Console, a mechanism for providing text output on a suitable video output from an FPGA. This can be very useful for debugging applications or providing status and progress messages.
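
The layering described in these notes can be sketched in software: a platform-specific layer per board (the PSL), one consistent PAL-style read call over both, and a PAL-Core-style call that interprets raw bytes as keys. Every class and function name here is hypothetical and only illustrates the idea; none of it is the actual PAL or PAL-Core API.

```python
class BoardAPs2:
    """Hypothetical platform support layer (PSL) for one board."""

    def __init__(self, raw_bytes):
        self._raw = list(raw_bytes)

    def fetch_scancode(self):
        return self._raw.pop(0) if self._raw else None


class BoardBPs2:
    """A second hypothetical PSL with a different native interface."""

    def __init__(self, raw_bytes):
        self._fifo = list(raw_bytes)

    def poll(self):
        # This board reports "no data" as -1 instead of None.
        return self._fifo.pop(0) if self._fifo else -1


def pal_ps2_read(port):
    """PAL-style call: one consistent read across platforms."""
    if isinstance(port, BoardAPs2):
        return port.fetch_scancode()
    value = port.poll()
    return None if value == -1 else value


# Small illustrative subset of PS/2 scan code set 2 make codes.
SCANCODE_TO_KEY = {0x1C: "a", 0x32: "b", 0x21: "c"}


def pal_core_read_key(port):
    """PAL-Core-style call: interpret the raw byte as a key name."""
    code = pal_ps2_read(port)
    if code is None:
        return None
    return SCANCODE_TO_KEY.get(code, "?")
```

The application only ever calls `pal_core_read_key`, so moving to another board means adding a new platform class and one branch in `pal_ps2_read`, with no change to application code.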

63. EE201A – VLSI Architectures and Design Methods Using DSM for HW/ SW integration How does DSM work? Create ports in software and hardware code Use library calls to read and write to the ports. Use DSM monitor to track software and hardware transactions…
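
The three steps on this slide (create ports, read and write through library calls, watch transactions in a monitor) can be mimicked in a small software-only sketch. `MonitoredPort` and `run_round_trip` are hypothetical names, the "hardware" side is an ordinary Python expression, and the log only loosely imitates what a transaction monitor might show.

```python
from collections import deque


class MonitoredPort:
    """Hypothetical FIFO port that logs every transaction (not the real DSM API)."""

    def __init__(self, name, log):
        self.name = name
        self._fifo = deque()
        self._log = log

    def write(self, side, word):
        self._fifo.append(word)
        self._log.append((side, "write", self.name, word))

    def read(self, side):
        word = self._fifo.popleft()
        self._log.append((side, "read", self.name, word))
        return word


def run_round_trip(pixels):
    """Software sends pixels to a stand-in hardware block and reads results back."""
    log = []
    to_hw = MonitoredPort("sw_to_hw", log)
    to_sw = MonitoredPort("hw_to_sw", log)
    results = []
    for p in pixels:
        to_hw.write("SW", p)
        # Stand-in for the hardware block: invert an 8-bit pixel.
        to_sw.write("HW", 255 - to_hw.read("HW"))
        results.append(to_sw.read("SW"))
    return results, log
```

Each pixel produces four log entries (software write, hardware read, hardware write, software read), which makes it easy to see which side stalled if a transfer never completes.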
