


RAMP Stats and Monitoring

Derek Chiou, Bill Reinhart, Nikhil Patil

with Krste Asanovic and Joel Emer


Goals/Requirements

  • Provide functionality equivalent to software-based simulators at RAMP speeds

    • Full observability

    • Monitoring for events

      • Triggers for breakpoints, dumping state, etc.

    • Trace (lossy and lossless)

    • Aggregate Statistics

  • Baseline functionality automatically included

  • Resource efficient

  • Flexible

  • Dynamic and static configurability

  • Integrated with other infrastructure (component interfaces)


At Least Three Levels of Debug/Monitoring/Stats

  • Platform/Unmodel level

    • Bringing up BEE3/ACP system independent of RAMP code

    • There may be strange bugs that are only exercised by the RAMP usage model

  • Simulator (Model) level

    • Simulator may model target incorrectly

    • Monitor simulator bandwidth requirements

      • Could be very different from those of the target machine (e.g., a host cache of the target cache)

  • Target level

    • The simulator may implement the target machine correctly, but the target design itself is incorrect

    • Stats/tracing of working target

  • We focus on simulator (model)/target level, but hopefully some will be useful for platform level as well



Bill Reinhart, Nikhil A Patil

Statistics/Monitoring Philosophy

  • Instrument simulator communication (e.g., RAMP channels)

    • Communication mechanisms are logically connected to command network

    • Can export/examine/change anything being communicated

      • No need to add additional code if that is sufficient

    • Turn off to save resources when possible

  • Introduce additional communication to export where communication does not already exist

    • Use standard simulator communication (channel) interfaces

      • Automatically provides target timing information

      • Connected to a null endpoint that logically discards the data

        • Pipe to /dev/null

      • Potentially have non-timed interface, but need time reference point
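
The instrumentation idea above can be sketched in Python. This is only an illustration under stated assumptions, not RAMP code: `TappedChannel`, `null_endpoint`, and the `(target_cycle, message)` tuple layout are hypothetical names invented for the example.

```python
from collections import deque


def null_endpoint(target_cycle, message):
    """Hypothetical null endpoint: accepts exported data and drops it."""
    return None


class TappedChannel:
    """Toy timed channel whose traffic can be exported through an optional tap."""

    def __init__(self, tap=None):
        self.queue = deque()
        self.tap = tap  # None means monitoring is turned off

    def send(self, target_cycle, message):
        # Every message carries its target time, so the export is timed for free.
        if self.tap is not None:
            self.tap(target_cycle, message)
        self.queue.append((target_cycle, message))

    def recv(self):
        return self.queue.popleft()


trace = []
channel = TappedChannel(tap=lambda cycle, msg: trace.append((cycle, msg)))
channel.send(10, "fetch 0x400")   # exported to the trace
channel.tap = null_endpoint       # logically "pipe to /dev/null"
channel.send(11, "fetch 0x404")   # still delivered, but not recorded
```

The receiver sees both messages regardless of the tap, which is the point: monitoring observes existing communication without changing it.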


Simple Example

[Figure: pipeline with stages F, D, E, M, W and a State block; two compressor blocks are attached]


Required Support

  • Endpoint support

  • Channel support

  • Transport (network)

  • Naming


User vs Simulator Initiated

  • Precise User-Initiated

    • function call to read/write value at specific target time

    • Can be implemented through timed channels

      • Commands live in target time

    • Can be handled logically as a compressor

      • discard data unless there is a command

    • How far ahead in target time should a pull command be issued?

      • Too close impacts performance but enables precise control

      • Too far makes reacting to events difficult

  • Imprecise User-Initiated

    • Issue a read of state, perform whenever, report back target time

  • Simulator-Initiated

    • dump everything, filter later

    • can be slow if there is limited bandwidth, storage, filtering
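
A precise user-initiated read handled "logically as a compressor" might look like the following Python sketch. The function name and the representation of samples and commands are assumptions made for illustration; the slides do not specify an interface.

```python
def command_gated_compressor(samples, commands):
    """Pass through only the samples whose target cycle has a pending command.

    samples:  iterable of (target_cycle, value), in target-time order
    commands: target cycles at which the user asked for a read
    """
    pending = set(commands)
    for target_cycle, value in samples:
        if target_cycle in pending:
            pending.discard(target_cycle)
            yield target_cycle, value
        # Otherwise the sample is discarded: there is no command for it.


# Toy stream: the monitored value at cycle c is c * c.
stream = [(cycle, cycle * cycle) for cycle in range(6)]
reads = list(command_gated_compressor(stream, commands=[2, 4]))
```

A command whose target cycle has already gone by never matches, which illustrates why a command must be issued far enough ahead in target time.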


Required Support: Endpoint

  • Provide state connected to command network

    • Same interface as a register, drop in replacement

    • Stats counters, monitor points, control points, etc.

  • Provide default compressors/filters

    • Output every n cycles

    • Output on rollover

    • Output toggled on signal

    • Etc.
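
As a rough sketch of the default compressors listed above, the Python class below models a stats counter with the three output policies. The class name, its fields, and the `(cycle, reason, value)` record layout are hypothetical.

```python
from typing import List, Optional, Tuple


class StatsCounter:
    """Hypothetical stats counter with simple built-in output policies."""

    def __init__(self, width_bits: int, period: Optional[int] = None):
        self.limit = 1 << width_bits  # counter wraps at 2**width_bits
        self.period = period          # "output every n cycles" when set
        self.enabled = True           # "output toggled on signal"
        self.value = 0
        self.out: List[Tuple[int, str, int]] = []  # (cycle, reason, value)

    def tick(self, cycle: int, increment: int = 0) -> None:
        total = self.value + increment
        rolled_over = total >= self.limit
        self.value = total % self.limit
        if not self.enabled:
            return  # keep counting, but export nothing
        if rolled_over:
            self.out.append((cycle, "rollover", self.value))
        elif self.period and cycle % self.period == 0:
            self.out.append((cycle, "periodic", self.value))


counter = StatsCounter(width_bits=3, period=4)
for cycle in range(1, 9):
    counter.tick(cycle, increment=1)
```

With a 3-bit counter incremented once per cycle, the sketch emits one periodic record at cycle 4 and one rollover record at cycle 8.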


Required Support: Channel

  • Optional connection to control network

  • Use internal buffering to look back in time

    • Channels are implemented as circular buffers in BRAM

      • Far more storage than needed (in general)

    • Can look back in time

    • Can save bandwidth by only exporting when needed

[Figure: circular buffer with tail and head pointers]
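
The look-back idea can be sketched with a small circular buffer in Python. This models only the history side (old entries stay readable until overwritten); the class and method names are illustrative, and a real channel would also track a tail pointer for the consumer.

```python
class HistoryChannel:
    """Toy circular buffer that keeps the most recent `capacity` messages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buf = [None] * capacity
        self.head = 0    # next slot to write
        self.count = 0   # number of valid entries

    def send(self, target_cycle, payload):
        self.buf[self.head] = (target_cycle, payload)
        self.head = (self.head + 1) % self.capacity
        self.count = min(self.count + 1, self.capacity)

    def look_back(self, n):
        """Return up to n most recent entries, oldest first."""
        n = min(n, self.count)
        start = (self.head - n) % self.capacity
        return [self.buf[(start + i) % self.capacity] for i in range(n)]


channel = HistoryChannel(capacity=4)
for cycle in range(6):
    channel.send(cycle, f"m{cycle}")
recent = channel.look_back(3)
```

Because the buffer already holds recent traffic, data only needs to be exported when a trigger asks for it, which is where the bandwidth saving comes from.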


Required Support: Transport

  • Transport

    • To units: commands, configuration, state changes, etc.

    • From units: Extract target/host state, statistics, etc.

  • Could be virtual channel(s) on common physical network

  • Lossy network?

    • Lossless for now, support lossy at endpoint

  • QoS?

  • A ring or a ring of rings for simplicity

  • Ordered network simpler

    • helps reconstruction of data outside

    • But, could result in less efficiency


Required Support: Naming/Tagging

  • Naming of source of data

    • Command

      • "read P1.iCache.num_hits" names a stats register symbolically and is translated to the actual register

    • Returned data/Trace entry

      • Needs to be tagged to indicate the source of the data

  • Each stats entry also includes at least

    • Target time

    • Potentially platform/host time for platform/simulator-level debugging
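
A minimal Python sketch of the naming and tagging scheme follows. The register addresses, the `REGISTER_MAP` table, and the `StatsEntry` fields are all invented for illustration; only the name `P1.iCache.num_hits` comes from the slide.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical name table: symbolic stat name -> register address.
REGISTER_MAP: Dict[str, int] = {
    "P1.iCache.num_hits": 0x0040,
    "P1.iCache.num_misses": 0x0044,
}


@dataclass(frozen=True)
class StatsEntry:
    source: str                      # tag naming where the data came from
    target_time: int                 # always present
    value: int
    host_time: Optional[int] = None  # optional, for platform/simulator-level debug


def read_stat(name: str, registers: Dict[int, int], target_time: int) -> StatsEntry:
    """Translate a symbolic name to a register and return a tagged entry."""
    address = REGISTER_MAP[name]
    return StatsEntry(source=name, target_time=target_time, value=registers[address])


registers = {0x0040: 17, 0x0044: 3}  # stand-in for the simulator's register state
entry = read_stat("P1.iCache.num_hits", registers, target_time=1000)
```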



FPGA Debug

Hari Angepat, Chris Craik and Derek Chiou

Electrical and Computer Engineering

University of Texas at Austin


Introduction

  • FPGA simulators offer order-of-magnitude speedups

    • However, can suffer from traditional hardware issues of limited visibility and debugging challenges

  • RAMP simulators face additional complexity due to scalability requirements that may prevent instrumenting every signal in the simulator

FPGADBG


Challenge

  • How to bring software-level debugging visibility to RAMP simulators without dramatically increasing resources or affecting timing closure?


  • Revisit the idea of FPGA state readback in combination with GDB-style debug interfaces


Our Technique

  • 1) Leverage FPGA readback mechanism to exploit as much free visibility as possible

    • FPGA frame readback exists in V2Pro, V4, V5

    • Can sample flip-flop state dynamically

    • Can sample BRAM/LUT (notes on this later)

    • Can use JTAG hardware for latency-tolerant low-resource physical link
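
Sampling flip-flop state from readback data can be sketched as indexing into a bit array. The Python below is a simplification under explicit assumptions: the readback data is treated as a flat, MSB-first bit array, and device-specific details such as padding frames or inverted capture values are ignored. The function names are hypothetical.

```python
from typing import Sequence


def sample_bit(readback: bytes, bit_offset: int) -> int:
    """Return the bit at bit_offset, assuming a flat MSB-first bit array."""
    byte_index, bit_in_byte = divmod(bit_offset, 8)
    return (readback[byte_index] >> (7 - bit_in_byte)) & 1


def sample_register(readback: bytes, bit_offsets: Sequence[int]) -> int:
    """Assemble a multi-bit value; bit_offsets[i] locates bit i (LSB first)."""
    value = 0
    for index, offset in enumerate(bit_offsets):
        value |= sample_bit(readback, offset) << index
    return value


# Two bytes of made-up readback data: bits 0, 2 and 15 are set.
data = bytes([0b10100000, 0b00000001])
register = sample_register(data, bit_offsets=[15, 1, 2, 3])
```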



Our Technique

  • 2) Provide a GDB interface that can debug both a software process and an FPGA fabric simultaneously.

    • Can display FPGA netlist symbols alongside software symbols

    • Can allow for hybrid CPU/FPGA platform debugging (e.g., x86-FSB-FPGA)



FPGADBG Toolflow

[Figure: FPGADBG toolflow]

  • Software path: Software Sources (C/C++/…), Compiler with Debug Flags (-g -Ox); outputs: Symbol Table, ASCII Disassembly, Binary Executable; debugged with Software Debugger (GDB)

  • Hardware path: Hardware Sources (Verilog/VHDL/…), Synthesis with Hierarchy Name Preservation Constraints, FPGA Implementation; outputs: Logic Allocation Map, PAR Netlist, FPGA Bitstream

  • FPGADBG: interactive extension that enables non-intrusive debugging of software running on FPGA (GDB-Py)



Architecture

  • Designed as set of C/Python libraries

    • GDB Interface (plugin)

    • Netlist Frontend (parsing, mapping)

    • FPGA Backend (board comm, readback)

    • Hardware library (step control, ICAP readback)

  • GDB frontend allows connecting to software-based portions of a simulator

  • Assumes design-level support for step

    • Allows design to ensure consistent state before sampling
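
The role of design-level step support can be illustrated with a mock in Python. `MockDesign` and `step_and_sample` are stand-ins invented for this sketch and are not the FPGADBG API; the point is only that state is sampled after the design has halted at a step boundary.

```python
class MockDesign:
    """Stand-in for a design with step support (not a real FPGADBG interface)."""

    def __init__(self):
        self.cycle = 0
        self.running = True

    def halt(self):
        self.running = False

    def step(self, n=1):
        if self.running:
            raise RuntimeError("halt the design before stepping")
        self.cycle += n

    def readback(self):
        # Sampling a free-running design could capture inconsistent state.
        if self.running:
            raise RuntimeError("state may be inconsistent while running")
        return {"cycle": self.cycle}


def step_and_sample(design, steps):
    """Halt, then alternate single steps with state samples."""
    design.halt()
    samples = []
    for _ in range(steps):
        design.step()
        samples.append(design.readback())
    return samples


samples = step_and_sample(MockDesign(), steps=3)
```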



Architecture

[Figure: FPGADBG architecture. Blocks: Target Application, User Logic, Target OS, Target Virtual Machine, GDB, GDB Plugin Bindings (Python), Domain Step Control, Readback Engine (ICAP), FPGADBG Core (Python), FPGA Chip Comm (C), FPGA Readback (C), Netlist Parser (Python), IO Logic (Transport Layer), FPGA Fabric, HW/SW Simulation Platform]



Netlist Parsing

[Figure: design hierarchy with nodes Top, myREG, regOut, dout]

Logic Allocation Map:

Bit  6597758 0x005e0200   5758 Block=SLICE_X88Y18 Latch=XQ Net=dout(3)
Bit  6597838 0x005e0200   5838 Block=SLICE_X88Y16 Latch=XQ Net=dout(1)
Bit  6604350 0x005e0400   5758 Block=SLICE_X88Y18 Latch=YQ Net=dout(2)
Bit  6604430 0x005e0400   5838 Block=SLICE_X88Y16 Latch=YQ Net=dout(0)

PAR Netlist:

inst "regOut(1)" "SLICE",placed R72C45 SLICE_X88Y16  ,cfg " BXINV::BX BXOUTUSED::#OFF BYINV::BY BYINVOUTUSED::#OFF BYOUTUSED::#OFF ... DXMUX::0 DYMUX::0 F::#OFF F5USED::#OFF FFX:myREG/dout_1:#FF FFX_INIT_ATTR::INIT0 FFX_SR_ATTR::SRLOW FFY:myREG/dout_0:#FF FFY_INIT_ATTR::INIT0 FFY_SR_ATTR::SRLOW    ... ";

inst "regOut(3)" "SLICE",placed R71C45 SLICE_X88Y18  ,cfg " BXINV::BX BXOUTUSED::#OFF BYINV::BY BYINVOUTUSED::#OFF BYOUTUSED::#OFF ... DXMUX::0 DYMUX::0 F::#OFF F5USED::#OFF FFX:myREG/dout_3:#FF FFX_INIT_ATTR::INIT0 FFX_SR_ATTR::SRLOW FFY:myREG/dout_2:#FF FFY_INIT_ATTR::INIT0 FFY_SR_ATTR::SRLOW ... ";
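
The "Bit" lines on this slide can be parsed with a short regular expression. The Python sketch below assumes one entry per line and assumes the three numeric fields are the bit offset, frame address, and frame offset; the field names and the `parse_allocation_map` helper are illustrative, not part of FPGADBG.

```python
import re
from typing import Dict, List

BIT_LINE = re.compile(
    r"Bit\s+(?P<offset>\d+)\s+(?P<frame>0x[0-9a-fA-F]+)\s+(?P<frame_offset>\d+)"
    r"\s+Block=(?P<block>\S+)\s+Latch=(?P<latch>\S+)\s+Net=(?P<net>\S+)"
)
NET_INDEX = re.compile(r"(?P<name>.+)\((?P<index>\d+)\)")


def parse_allocation_map(text: str) -> List[Dict[str, object]]:
    """Parse 'Bit ...' lines into dictionaries; unrelated lines are ignored."""
    entries = []
    for match in BIT_LINE.finditer(text):
        net = match.group("net")
        indexed = NET_INDEX.fullmatch(net)
        entries.append({
            "offset": int(match.group("offset")),
            "frame": int(match.group("frame"), 16),
            "frame_offset": int(match.group("frame_offset")),
            "block": match.group("block"),
            "latch": match.group("latch"),
            "net": indexed.group("name") if indexed else net,
            "index": int(indexed.group("index")) if indexed else None,
        })
    return entries


SAMPLE = """\
Bit  6597758 0x005e0200   5758 Block=SLICE_X88Y18 Latch=XQ Net=dout(3)
Bit  6597838 0x005e0200   5838 Block=SLICE_X88Y16 Latch=XQ Net=dout(1)
Bit  6604350 0x005e0400   5758 Block=SLICE_X88Y18 Latch=YQ Net=dout(2)
Bit  6604430 0x005e0400   5838 Block=SLICE_X88Y16 Latch=YQ Net=dout(0)
"""
entries = parse_allocation_map(SAMPLE)
```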



Netlist Parsing

  • FPGA toolflow introduces optimizations and naming issues

[Figure: netlist parsing pipeline. Stages: Physical Netlist, Alias Detection, Vector Merger, Hierarchy Construction, Frame Address Mapping, Symbolic Netlist, FPGA Cmd Generator, Readback Cmd Parser, Bitstream Reorder, FPGA Board Communication, Readback Bitstream]
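
Two of the stages named in this figure, vector merging and hierarchy construction, can be sketched in Python. The naming heuristics here (a trailing `_N` or `(N)` marks bit N of a vector, and `/` separates hierarchy levels) are assumptions based on the names shown on the previous slide, and the function names are hypothetical. A scalar whose name happens to end in `_N` would be misread by this heuristic.

```python
import re
from collections import defaultdict
from typing import Dict

# Assumed convention: "base_3" or "base(3)" is bit 3 of vector "base".
BIT_NAME = re.compile(r"^(?P<base>.+?)[_(](?P<index>\d+)\)?$")


def merge_vectors(bit_values: Dict[str, int]) -> Dict[str, int]:
    """Merge per-bit signals into integer-valued vectors."""
    vectors: Dict[str, int] = defaultdict(int)
    for name, bit in bit_values.items():
        match = BIT_NAME.match(name)
        if match is None:
            vectors[name] = bit & 1  # plain scalar signal
        else:
            vectors[match.group("base")] |= (bit & 1) << int(match.group("index"))
    return dict(vectors)


def build_hierarchy(vectors: Dict[str, int]) -> dict:
    """Turn 'scope/leaf' paths into nested dictionaries."""
    root: dict = {}
    for path, value in vectors.items():
        *scopes, leaf = path.split("/")
        node = root
        for scope in scopes:
            node = node.setdefault(scope, {})
        node[leaf] = value
    return root


bits = {"myREG/dout_3": 1, "myREG/dout_2": 0, "myREG/dout_1": 1,
        "myREG/dout_0": 0, "valid": 1}
vectors = merge_vectors(bits)
tree = build_hierarchy(vectors)
```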


Limitations

  • Hardware readback has limitations:

    • RAMs require offline readback due to resource contention issues

    • FPGA frames span large vertical stripes, potentially restricting visibility if some logic cannot be disabled during sampling

    • Hierarchy must be preserved during synthesis to ensure understandable netnames

    • Step control requires design-level support



Status & Future Work

  • Current prototype implements board communication with the XUP Virtex2Pro30 with JTAG-based frame readback

  • Frontend netlist parser supports hierarchical node generation, bit vector merging, and some support for aliased signals.

  • Full GDB shell expected to be released in Q1-2009 with support for Virtex5{110/330}
