Metropolis Metamodel

Metropolis Objects

Proc1

P1

P2

Media1

I1

I2

QM1

  • Metropolis elements adhere to a “separation of concerns” point of view.

  • Processes (Computation)

Active Objects

Sequential Executing Thread

  • Media (Communication)

Passive Objects

Implement Interface Services

  • Quantity Managers (Coordination)

Schedule access to resources and quantities


Metro. Netlists and Events

QM1

  • Metropolis Architectures are created via two netlists:

  • Scheduled – generate events1 for services in the scheduled netlist.

  • Scheduling – allow these events access to the services and annotate events with quantities.

Related Work

Scheduled Netlist

Scheduling Netlist

Event1 – represents a transition in the action automata of an object. Can be annotated with any number of quantities. This allows performance estimation.

Proc1

Proc2

P1

P2

Global

Time

I1

Media1

I2

  • E. Lee and A. Sangiovanni-Vincentelli, A Unified Framework for Comparing Models of Computation, IEEE Trans. on Computer Aided Design of Integrated Circuits and Systems, Vol. 17, N. 12, pg. 1217-1229, December 1998


Key Modeling Concepts

  • An event is the fundamental concept in the framework

    • Represents a transition in the action automata of an object

    • An event is owned by the object that exports it

    • During simulation, generated events are termed as event instances

    • Events can be annotated with any number of quantities

    • Events can partially expose the state around them; constraints can then reference or influence this state

  • A service corresponds to a set of sequences of events

    • All elements in the set have a common begin event and a common end event

    • A service may be parameterized with arguments

  • E. Lee and A. Sangiovanni-Vincentelli, A Unified Framework for Comparing Models of Computation, IEEE Trans. on Computer Aided Design of Integrated Circuits and Systems, Vol. 17, N. 12, pg. 1217-1229, December 1998


Action Automata

  • Processes take actions.

    • statements and some expressions, e.g.

      y = z+port.f();, z+port.f(), port.f(), i < 10, …

    • only calls to media functions are observable actions

  • An execution of a given netlist is a sequence of vectors of events.

    • event : the beginning of an action, e.g. B(port.f()),

      the end of an action, e.g. E(port.f()), or null N

    • the i-th component of a vector is an event of the i-th process

  • An execution is legal if

    • it satisfies all coordination constraints, and

    • it is accepted by all action automata.


Execution semantics

Action automaton:

  • one for each action of each process

    • defines the set of sequences of events that can happen in executing the action

  • a transition corresponds to an event:

    • it may update shared memory variables:

      • process and media member variables

      • values of actions-expressions

    • it may have guards that depend on states of other action automata and memory variables

  • each state has a self-loop transition with the null N event.

  • all the automata have their alphabets in common:

    • transitions must be taken together in different automata, if they correspond to the same event.
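A minimal Python sketch of the acceptance rule above, under stated assumptions: events are plain strings, each automaton is a hypothetical transition table, and an event that labels no transition of an automaton leaves that automaton's state unchanged (a simplification of the shared-alphabet rule). The class and variable names are illustrative and not part of the meta-model.

```python
NULL = "N"  # the null event; every state has a self-loop on it


class ActionAutomaton:
    """Accepts the event sequences that may occur while executing one action."""

    def __init__(self, transitions, start, accepting):
        self.transitions = transitions      # {(state, event): next_state}
        self.start = start
        self.accepting = accepting
        self.alphabet = {event for (_state, event) in transitions}

    def accepts(self, events):
        state = self.start
        for event in events:
            # Assumption: the null event, and events that label no transition
            # of this automaton, leave its state unchanged.
            if event == NULL or event not in self.alphabet:
                continue
            if (state, event) not in self.transitions:
                return False
            state = self.transitions[(state, event)]
        return state in self.accepting


# Hypothetical automata for the statement y=x+1 and its subexpression x+1.
assign = ActionAutomaton(
    {(0, "B y=x+1"): 1, (1, "B x+1"): 2, (2, "E x+1"): 3, (3, "E y=x+1"): 4},
    start=0, accepting={4})
expr = ActionAutomaton({(0, "B x+1"): 1, (1, "E x+1"): 2}, start=0, accepting={2})

# One process, so each vector of the execution holds a single event.
trace = ["B y=x+1", "N", "B x+1", "N", "N", "E x+1", "E y=x+1"]
legal = all(automaton.accepts(trace) for automaton in (assign, expr))
```

Feeding the same trace to both automata is what forces a shared event such as B x+1 to be taken in both at the same step.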


Action Automata

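  • y=x+1;

[Figure: action automata for the statement y=x+1 and its subexpression x+1.

Labels on the automaton for y=x+1: B y=x+1, B x+1, E x+1, E y=x+1 with y:=Vx+1, and E y=x+1 with y:=any. * = write y.

Labels on the automaton for x+1: B x+1, E x+1 with Vx+1 := x+1, write x, and E x+1 with Vx+1 := any.

Example execution: B y=x+1, N, B x+1, N, N, E x+1, E y=x+1. The figure also tabulates the values of Vx+1, y and x along the execution.]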


Semantics summary

  • Processes run sequential code concurrently, each at its own arbitrary pace.

  • Read-Write and Write-Write hazards may cause unpredictable results

    • atomicity has to be explicitly specified.

  • Progress may block at synchronization points

    • awaits

    • function calls and labels to which awaits or constraints refer.

  • The legal behavior of a netlist is given by a set of sequences of event vectors.

    • multiple sequences reflect the non-determinism of the semantics:

      concurrency, synchronization (awaits and constraints)



Architecture components

  • An architecture component specifies services, i.e.

    • what it can do

    • how much it costs


Meta-Model : Functional Netlist

MyFncNetlist

P1

Y

X

M

P2

Y

X

Env2

Env1

process P{

port reader X;

port writer Y;

thread(){

while(true){

...

z = f(X.read());

Y.write(z);

}}}

interface reader extends Port{

update int read();

eval int n();

}

interface writer extends Port{

update void write(int i);

eval int space();

}

medium M implements reader, writer{

int storage;

int n, space;

void write(int z){

await(space>0; this.writer ; this.writer)

n=1; space=0; storage=z;

}

int read(){ ... }

}
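The blocking behavior of the medium M can be sketched in Python as a one-place buffer. This is an illustration only, not the meta-model's await semantics: threading.Condition stands in for await(space>0; ...), the loops are finite where the slide uses while(true), and all names (OnePlaceMedium, run_pipeline) are hypothetical.

```python
import threading


class OnePlaceMedium:
    """Single-slot buffer: write blocks until there is space, read until data exists."""

    def __init__(self):
        self._cond = threading.Condition()
        self._storage = None
        self._n = 0       # number of stored items (0 or 1)
        self._space = 1   # number of free slots (1 or 0)

    def write(self, z):
        with self._cond:
            self._cond.wait_for(lambda: self._space > 0)  # role of await(space>0; ...)
            self._storage, self._n, self._space = z, 1, 0
            self._cond.notify_all()

    def read(self):
        with self._cond:
            self._cond.wait_for(lambda: self._n > 0)
            z, self._n, self._space = self._storage, 0, 1
            self._cond.notify_all()
            return z


def run_pipeline(values, f):
    """Env1 -> medium -> process P (z = f(X.read()); Y.write(z)) -> medium -> Env2."""
    m_in, m_out, received = OnePlaceMedium(), OnePlaceMedium(), []

    def env1():
        for value in values:
            m_in.write(value)

    def process_p():
        for _ in values:
            m_out.write(f(m_in.read()))

    def env2():
        for _ in values:
            received.append(m_out.read())

    threads = [threading.Thread(target=body) for body in (env1, process_p, env2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received
```

With a single writer and a single reader per medium, the values arrive in the order they were written, whatever pace each thread runs at.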


Meta-Model: Architecture Components

interface BusArbiterService extends Port {

update void request(event e);

update void resolve();

}

interface BusMasterService extends Port {

update void busRead(String dest, int size);

update void busWrite(String dest, int size);

}

medium Bus implements BusMasterService…{

port BusArbiterService Arb;

port MemService Mem; …

update void busRead(String dest, int size) {

if(dest== … ) Mem.memRead(size);

[[Arb.request(B(thisthread, this.busRead));

GTime.request(B(thisthread, this.memRead),

BUSCLKCYCLE +

GTime.A(B(thisthread, this.busRead)));

]]

}

scheduler BusArbiter extends Quantity

implements BusArbiterService {

update void request(event e){ … }

update void resolve() { /* schedule */ }

}

Bus

BusArbiter

  • An architecture component specifies services, i.e.

    • what it can do

    • how much it costs

: interfaces

: quantities, annotation, logic of constraints


Meta-model: quantities

  • The domain D of the quantity, e.g. real for the global time,

  • The operations and relations on D, e.g. subtraction, <, =,

  • The function from an event instance to an element of D,

  • Axioms on the quantity, e.g.

  • the global time is non-decreasing in a sequence of vectors of any

  • feasible execution.

class GTime extends Quantity {

double t;

double sub(double t2, double t1){...}

double add(double t1, double t2){…}

boolean equal(double t1, double t2){ ... }

boolean less(double t1, double t2){ ... }

double A(event e, int i){ ... }

constraints{

forall(event e1, event e2, int i, int j):

GXI.A(e1, i) == GXI.A(e2, j) -> equal(A(e1, i), A(e2, j)) &&

GXI.A(e1, i) < GXI.A(e2, j) -> (less(A(e1, i), A(e2, j)) || equal(A(e1, i), A(e2, j)));

}}
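The two constraints can be read as a check over an execution. The Python sketch below assumes that GXI.A gives the position of an event's vector in the execution, so that events in the same vector must carry equal global time and later vectors must not carry smaller time. The function name and data layout are hypothetical.

```python
def satisfies_gtime_axioms(execution):
    """execution: list of vectors; each vector is a list of (event, time)
    pairs for the non-null events taken at that step."""
    last = float("-inf")
    for vector in execution:
        times = {time for (_event, time) in vector}
        if len(times) > 1:        # same index: annotated times must be equal
            return False
        if times:
            (time,) = times
            if time < last:       # later index: time must not decrease
                return False
            last = time
    return True
```

For example, an execution whose vectors carry times 0.0, then 1.5 and 1.5, then nothing passes the check, while one that goes from 2.0 back to 1.0 does not.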


Meta-model: architecture components

  • This modeling mechanism is generic, independent of services and cost specified.

  • Which levels of abstraction, what kind of quantities, what kind of cost constraints should be used to capture architecture components?

    • depends on applications: on-going research

Transaction:

Services:

- fuzzy instruction set for SW, execute() for HW

- bounded FIFO (point-to-point)

Quantities:

- #reads, #writes, token size, context switches

CPU

ASIC1

ASIC2

Sw1

Hw

Sw2

Hw

C-Ctl

C-Ctl

Sw I/F

Channel Ctl

Channel I/F

B-I/F

CPU-IOs

Bus I/F

Wrappers

Virtual BUS:

Services:

- data decomposition/composition

- address (internal v.s. external)

Quantities: same as above, different weights

Physical:

Services: full characterization

Quantities: time

RTOS

e.g. PIBus 32b

e.g. OtherBus 64b...


Quantity resolution

The 2-step approach to resolve quantities at each state of a netlist being executed:

1. quantity requests

for each process Pi, for each event e that Pi can take, find all the quantity constraints on e.

In the meta-model, this is done by explicitly requesting quantity annotations at the relevant events, i.e. Quantity.request(event, requested quantities).

2. quantity resolution

find a vector made of the candidate events and a set of quantities annotated with each of the events, such that the annotated quantities satisfy:

  • all the quantity requests, and

  • all the axioms of the Quantity types.

    In the meta-model, this is done by letting each Quantity type implement a resolve() method, and the methods of relevant Quantity types are iteratively called.

  • theory of fixed-point computation
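The request/resolve iteration can be sketched as follows. This is a toy illustration in Python, not the Metropolis implementation: the two manager classes (Arbiter, Deadline), their policies, and the numbers are invented for the example; only the request()/resolve() naming follows the slides.

```python
class Arbiter:
    """Hypothetical quantity manager: grants at most one of its requested events."""

    def __init__(self):
        self.requests = []

    def request(self, event):
        self.requests.append(event)

    def resolve(self, enabled):
        mine = [event for event in self.requests if event in enabled]
        return enabled - set(mine[1:])      # keep the first request only


class Deadline:
    """Hypothetical quantity manager: disables events requested past a time limit."""

    def __init__(self, limit):
        self.limit = limit
        self.requests = {}

    def request(self, event, time):
        self.requests[event] = time

    def resolve(self, enabled):
        return {e for e in enabled if self.requests.get(e, 0) <= self.limit}


def resolve_all(candidates, managers):
    """Call every manager's resolve() until the enabled set stops changing."""
    enabled = set(candidates)
    while True:
        previous = set(enabled)
        for manager in managers:
            enabled = manager.resolve(enabled)
        if enabled == previous:
            return enabled


arbiter, deadline = Arbiter(), Deadline(limit=5)
arbiter.request("e1")
arbiter.request("e2")
deadline.request("e1", 10)
deadline.request("e2", 3)
deadline.request("e3", 1)
granted = resolve_all({"e1", "e2", "e3"}, [deadline, arbiter])
```

Each resolve() here only removes events, so the loop terminates; the result can still depend on the order in which the managers are called.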


Quantity resolution

  • The 2-step approach is the same as how schedulers work, e.g. OS schedulers, BUS schedulers, BUS bridge controllers.

  • Semantically, a scheduler can be considered as one that resolves a quantity called execution index.

  • Two ways to model schedulers:

    1. As processes:

    • explicitly model the scheduling protocols using the meta-model building blocks

    • a good reflection of actual implementations

      2. As quantities:

    • use the built-in request/resolve approach for modeling the scheduling protocols

    • more focus on resolution (scheduling) algorithms, than protocols: suitable for higher level abstraction models


Quantity Request – Service

Task.Read(){

CpuRtos.cpuRead();

}

CS.Resolve(){

//Task scheduling algorithm;

}

Ti

setMustDo(e)

CpuRtos.cpuRead()

CS.Resolve()

CpuScheduler

GTime

CpuRtos

CS.Request(beg(Ti, this.cpuRead),csr)

Bus.busRead()

CpuRtos.Read(){

CS.Request(beg(Ti, this.cpuRead), csr);

Bus.busRead();

CS.Request(end(Ti, this.cpuRead), csr);

}

SchedulingNetlist

ScheduledNetlist


Meta-Model: Mapping Netlist

MyFncNetlist

P1

P2

M

Env2

Env1

MyArchNetlist

MyArchNetlist

mP1

mP1

mP2

mP2

Bus

Arbiter

Bus

Arbiter

Bus

Bus

Mem

Mem

Cpu

Cpu

OsSched

OsSched

MyMapNetlist

B(P1, M.write) <=> B(mP1, mP1.writeCpu); E(P1, M.write) <=> E(mP1, mP1.writeCpu);

B(P1, P1.f) <=> B(mP1, mP1.mapf); E(P1, P1.f) <=> E(mP1, mP1.mapf);

B(P2, M.read) <=> B(mP2, mP2.readCpu); E(P2, M.read) <=> E(mP2, mP2.readCpu);

B(P2, P2.f) <=> B(mP2, mP2.mapf); E(P2, P2.f) <=> E(mP2, mP2.mapf);
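One way to read the <=> constraints: in every vector of an execution, a functional event and the architectural event it is mapped to occur together or not at all. The Python sketch below checks that reading on a hand-written execution; the string encoding of events and the function name are assumptions made for the example.

```python
def first_violation(execution, constraints):
    """Return (index, left, right) for the first vector in which exactly one
    of two synchronized events occurs, or None if every constraint holds."""
    for index, vector in enumerate(execution):
        for left, right in constraints:
            if (left in vector) != (right in vector):
                return index, left, right
    return None


constraints = [
    ("B(P1, M.write)", "B(mP1, mP1.writeCpu)"),
    ("E(P1, M.write)", "E(mP1, mP1.writeCpu)"),
]

# Each vector is the set of non-null events taken at that step.
mapped = [
    {"B(P1, M.write)", "B(mP1, mP1.writeCpu)"},
    set(),
    {"E(P1, M.write)", "E(mP1, mP1.writeCpu)"},
]
unmapped = [{"B(P1, M.write)"}, {"B(mP1, mP1.writeCpu)"}]
```

Here first_violation(mapped, constraints) returns None, while the second execution is rejected at index 0 because the functional begin event occurs without its architectural counterpart.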


Architecture Modeling Related Work

  • David C. Luckham and James Vera, An Event-Based Architecture Definition Language , IEEE Transactions on Software Engineering, Vol. 21, No 9, pg. 717-734, Sep. 1995.

  • Ingo Sander and Axel Jantsch, System Modeling and Transformational Design Refinement in ForSyDe, IEEE Transactions on CAD, Vol. 23, No 1, pg. 17-32, Jan. 2004.

  • Paul Lieverse, Pieter van der Wolf, Ed Deprettere, and Kees Vissers, A Methodology for Architecture Exploration of Heterogeneous Signal Processing Systems, IEEE Workshop in Signal Processing Systems, Taipei, Taiwan, 1999.

Naïve Approach

System Level Design does not guarantee accuracy or efficiency!!

Estimated

Performance

Data

Abstract

Modular

SLD

Manual

“C” Model

Disconnected

Inaccurate!

Lengthy Feedback

Inefficient

Miss Time to Market!

Manual

RTL “Golden Model”

Implementation

Implementation Gap!


Improved Approach

Functional level blocks of programmable components

Technique 1: Modeling style and characterization for programmable platforms

Estimated

Performance

Data

Real

Performance

Data

Abstract

Modular

SLD

From characterization flow

Narrow the Gap

Actual

Programmable

Platform Description

New approach has improved accuracy and efficiency by relating programmable devices and their tool flow with SLD (Metropolis). Retains modularity and abstraction.


Programmable Focus

P. Schaumont, et al, A Quick Safari Through the Reconfigurable Jungle,

DAC, June 2001.

K. Bondalapati, V. Prasanna, Reconfigurable Computing Systems, USC


Programmable Arch. Modeling

  • Computation Services

  • Communication Services

  • Other Services

PPC405

MicroBlaze

SynthMaster

SynthSlave

Computation Services

Read (addr, offset, cnt, size), Write(addr, offset, cnt, size),

Execute (operation, complexity)

Processor

Local

Bus

(PLB)

On-Chip

Peripheral

Bus

(OPB)

Communication Services

addrTransfer(target, master)

addrReq(base, offset, transType, device)

addrAck(device)

dataTransfer(device, readSeq, writeSeq)

dataAck(device)

BRAM

Mapping

Process

OPB/PLB Bridge

Task Before Mapping

Read (addr, offset, cnt, size)

Task After Mapping

Read (0x34, 8, 10, 4)


Programmable Arch. Modeling

  • Coordination Services

PPC Sched

MicroBlaze

Sched

PLB Sched

OPB Sched

BRAM Sched

General Sched

  • PostCond(): Augment event with information (annotation). This is typically the interaction with the quantity manager.

  • Request(event e): Adds event to pending queue of requested events.

  • Resolve(): Uses algorithm to select an event from the pending queue.
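The three scheduler operations can be sketched in Python as below. This is a simplified stand-in, not the Metropolis scheduler classes: the first-come-first-served selection, the fixed cost per event, and the class name are assumptions chosen to keep the example short.

```python
from collections import deque


class SchedulerSketch:
    def __init__(self, cycles_per_event):
        self.pending = deque()            # queue of requested events
        self.cycles_per_event = cycles_per_event
        self.now = 0                      # stand-in for the global time quantity
        self.annotations = {}

    def request(self, event):
        """Add the event to the pending queue."""
        self.pending.append(event)

    def resolve(self):
        """Select one pending event (here: the oldest), or None if idle."""
        return self.pending.popleft() if self.pending else None

    def post_cond(self, event):
        """Annotate the granted event with the time at which it completes."""
        self.now += self.cycles_per_event
        self.annotations[event] = self.now


scheduler = SchedulerSketch(cycles_per_event=2)
scheduler.request("busRead(P1)")
scheduler.request("busRead(P2)")
granted = scheduler.resolve()
scheduler.post_cond(granted)
```

After this run the first request has been granted and annotated with time 2, and the second is still pending.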

GTime


Programmable Arch. Modeling

Structure

Extractor

  • Compose scheduling and scheduled netlists in top level netlist.

  • Extract structure for programmable platform tool flow.

Top Level Netlist

public netlist XilinxCCArch {

XilinxCCArchSched schedNetlist;

XilinxCCArchScheduling schedulingNetlist;

SchedToQuantity[] _stateMedia;

}

Mapping

Process

File for Programmable Platform Tool Flow

MicroBlaze

Sched

MicroBlaze

OPB

OPB Sched

  • Type

  • Parameters

  • Etc

Connections

Scheduled Netlist

Scheduling Netlist

Topology

Modular Modeling Style

Accurate & Efficient


Prog. Platform Characterization

Need to tie the model to actual implementation data!

1. Create template system description.

2. Generate many permutations of the architecture using this template and run them through programmable platform tool flow.

3. Extract the desired performance information from the tool reports for database population.
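The three steps can be sketched in Python. Everything specific here is hypothetical: the template parameters and their values are made up, and run_tool_flow is a caller-supplied placeholder for running the platform tools and parsing their reports, not a real tool interface.

```python
from itertools import product


def generate_permutations(template):
    """template maps a parameter name to the list of values to sweep."""
    names = sorted(template)
    combos = product(*(template[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def populate_database(permutations, run_tool_flow):
    """Key each permutation by its sorted parameters and store the extracted data."""
    return {tuple(sorted(p.items())): run_tool_flow(p) for p in permutations}


# Step 1: hypothetical template system description.
template = {"num_microblaze": [1, 2], "bus": ["OPB", "PLB"], "bram_kb": [16, 32]}

# Step 2: every combination of the template parameters.
permutations = generate_permutations(template)

# Step 3: placeholder standing in for the tool flow; it returns no real data.
database = populate_database(permutations, lambda p: {"report": None})
```

Keying the database by the parameter tuple lets a simulation look up characterization data for exactly the configuration it models.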


Prog. Platform Characterization

Create database ONCE prior to simulation and populate with independent (modular) information.

1. Data detailing performance based on physical implementation.

2. Data detailing the composition of communication transactions.

3. Data detailing the processing elements' computation.

From Char Flow Shown

From Metro Model Design

From ISS for PPC

Work with Xilinx Research Labs

  • Douglas Densmore, Adam Donlin, A. Sangiovanni-Vincentelli, FPGA Architecture Characterization in System Level Design, Submitted to CODES 2005.

  • Adam Donlin and Douglas Densmore, Method and Apparatus for Precharacterizing Systems for Use in System Level Design of Integrated Circuits, Patent Pending.


Architecture Extensions for Mapping

  • Programmable platforms allow for both SW and HW implementations of a function.

  • Need to express which architecture components can provide which services and with what affinity.

Dedicated HW DCT

General Purpose

uProc

Mapping

Process

Mapping

Process

Potential Mapping Strategies

Greedy

Best Average

Task Specific


Architecture Extensions for Preemption

  • Some Services are naturally preempted

    • CPU context switch, Bus transactions

  • Notion of Atomic Transactions

    • Prior to dispatching events to a quantity manager via the request() method, decompose events in the scheduled netlist into non-preemptable chunks.

    • Maintain status with an FSM object (counter) and controller.
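A small Python sketch of the decomposition, assuming unit-sized chunks and a round-robin choice between events. The class name, the chunk encoding, and the interleaving policy are hypothetical; the slides only state that events are split into non-preemptable chunks tracked by a counter.

```python
class ChunkCounter:
    """Counter FSM tracking how many non-preemptable chunks of an event are done."""

    def __init__(self, event, size):
        self.event, self.size, self.done = event, size, 0

    def next_chunk(self):
        """Return the next chunk to pass to request(), or None when finished."""
        if self.done == self.size:
            return None
        self.done += 1
        return (self.event, self.done, self.size)


def interleave(counters):
    """Round-robin over events: a chunk is never split, but an event may be
    preempted between two of its chunks."""
    order, active = [], list(counters)
    while active:
        for counter in list(active):
            chunk = counter.next_chunk()
            if chunk is None:
                active.remove(counter)
            else:
                order.append(chunk)
    return order


schedule = interleave([ChunkCounter("cpuExe", 3), ChunkCounter("busRead", 1)])
```

In the resulting order, busRead runs between the first and second chunk of the size-3 cpuExe event, which is the preemption point the decomposition creates.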

Event Size 3

Decoder

1

1

1

Initial State

S1

S2

S3


Modeling & Char. Review

Scheduling Netlist

Task1

Task2

Task3

Task4

DEDICATED HW

DedHW Sched

PPC

PPC Sched

Global

Time

PLB

PLB Sched

BRAM Sched

BRAM

Scheduled Netlist

Characterizer

Media (scheduled)

Process

Enabled Event

Quantity

Quantity Manager

Disabled Event


Arch. Refinement Verification

Architectures often involve hierarchy and multiple abstraction levels.

These techniques are limited if it is not possible to check if elements in hierarchy or less abstract components are implementations of their counterparts.

Asks “Can I substitute M1 for M2?”

Representing the internal structure of a component.

Recasting an architectural description in a new style.

Applying tools developed for one style to another style.

D. Garlan, Style-Based Refinement for Software Architectures, SIGSOFT 96, San Francisco, CA, pg. 72-75.


Metropolis Refinement

  • Internal Refinement

    • Depth Refinement1

      • Interface Based

      • Focus on introducing new behaviors (Reason 1)

  • Structural Refinement

    • Vertical Refinement1

    • Horizontal Refinement1

      • Event Based

        • Refinement Properties

      • Focus on abstraction & synthesis (Reasons 2 & 3)

  • Douglas Densmore, Metropolis Architecture Refinement Styles and Methodology, University of California, Berkeley, UCB/ERL M04/36, 14 September 2004.


Depth Refinement

This Observable Behavior can be captured for Metropolis processes in the model via the creation of a control flow graph (CFG)1.

Trace containment check for single threaded processes
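A bounded trace containment check over two control flow graphs might look like the Python sketch below. It is an illustration under assumptions, not the tool flow used in the cited work: each CFG is a hypothetical dictionary of labeled edges, a label of None marks an unobservable step, and traces are enumerated only up to a fixed number of edges.

```python
def traces(cfg, node, end, limit):
    """All observable label sequences on paths from node to end that use at
    most `limit` edges. cfg maps a node to a list of (label, successor)."""
    if node == end:
        return {()}
    if limit == 0:
        return set()
    result = set()
    for label, successor in cfg.get(node, []):
        for rest in traces(cfg, successor, end, limit - 1):
            result.add(rest if label is None else (label,) + rest)
    return result


def refines(refined, abstract, start, end, limit=10):
    """True if every bounded trace of `refined` is also a trace of `abstract`."""
    return traces(refined, start, end, limit) <= traces(abstract, start, end, limit)


# Hypothetical single-threaded processes: the abstract one may read repeatedly
# before writing; the refined one reads once, does internal work, then writes.
abstract = {"s": [("read", "m")], "m": [("read", "m"), ("write", "e")]}
refined = {"s": [("read", "a")], "a": [(None, "m")], "m": [("write", "e")]}
```

The refined process is contained in the abstract one, but not the other way round, because the abstract process also allows read, read, write.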

Work with Cypress Semiconductor

  • Douglas Densmore, Sanjay Rekhi, A. Sangiovanni-Vincentelli, MicroArchitecture Development via Successive Platform Refinement, Design Automation and Test Europe (DATE), Paris France, 2004.


Vertical Refinement

Mapping

Process

Rtos

PPC405

Cache

PLB

BRAM

Rtos Sched

Mapping

Process

  • Definition: A manipulation to the scheduled netlist structure to introduce/remove the number or origin of events as seen by the scheduling netlist.

Cache Sched

Sequential

New origins and amounts of events scheduled and annotated

PPC Sched

PLB Sched

Concurrent

BRAM Sched

Scheduling Netlist

Scheduled Netlist


Horizontal Refinement

Mapping

Process

Control

Thread

Rtos

PPC405

Arb

Cache

PLB

BRAM

Rtos Sched

Mapping

Process

PPC Sched

  • Definition: A manipulation of both the scheduled and scheduling netlist which changes the possible ordering of events as seen by the scheduling netlist.

PPC405

Cache Sched

Ordering of event requests changed

PPC Sched

PLB Sched

BRAM Sched

Scheduling Netlist

Scheduled Netlist

*Contains all possible orderings if abstract enough


Refinement Properties

Properties expressed as event sequences as seen by the scheduling netlist.

E1 (CPUExe)

E2 (CPUExe)

E3 (CPUExe)

E4(CPURead)

Resource Utilization

Latency


Example Design

JPEG Encoder Function Model (Block Level)

Structure

Extractor

Preprocessing

DCT

Mapping

Process

Quantization

Huffman

Mapping

Process

Mapping

Process

Mapping

Process

MicroBlaze

SynthMaster

On-Chip

Peripheral

Bus

(OPB)

SynthSlave

BRAM

BRAM

1. Select an application and understand its behavior.

2. Create a Metropolis functional model which models this behavior.

3. Assemble an architecture from library services or create your own services.

4. Map the functionality to the architecture.

5. Extract a structural file from the top level netlist of the architecture created.

Top Level Netlist

File for Xilinx EDK Tool Flow

IP Library


Example Design Cont.

Permutation 1

Permutation 2

Permutation N

Platform Characterization Tool (Xilinx EDK/ISE Tools)

Manual

Software Routines

int DCT (data){

Begin

calculate …

}

Hardware Routines

DCT1 = 10 Cycles

DCT2 =5 Cycles

FFT = 5 Cycles

Automatic

Manual

32 Bit Read = Ack, Addr, Data, Trans, Ack

1. Feed the captured structural file to the permutation generator.

File for Xilinx EDK Tool Flow

2. Feed the permutations to the Xilinx tools and extract the data.

3. Capture execution info for software and hardware services.

4. Provide transaction info for communication services.

Permutation Generator

ISS Info

Char

Data

Transaction

Info

Characterizer Database


Example Design Cont.

JPEG Encoder Function Model (Block Level)

Execution time 100ms

Bus Cycles 4000

Ave Memory Occupancy 500KB

Preprocessing

DCT

Mapping

Process

Quantization

Huffman

Mapping

Process

Mapping

Process

Mapping

Process

MicroBlaze

On-Chip

Peripheral

Bus

(OPB)

BRAM

Concurrent

Vertical

Refinement

1. Simulate the design and observe the performance.

2. Refine design to meet performance requirements.

3. Use Refinement Verification to check validity of design changes.

    • Depth, Vertical, or Horizontal

    • Refinement properties

4. Re-simulate to see if your goals are met.

Verification Tool

Execution time 200ms

Bus Cycles 1000

Ave Memory Occupancy 100KB

Yes?

No?

  • Backend Tool Process:

    1. Abstract Syntax Tree (AST) retrieves structure.

    2. Control Data Flow Graph - Depth

      • FORTE – Intel Tool

      • Reactive Models – UC Berkeley

    3. Event Traces – Refinement Properties.

      • Vertical Refinement

      • Horizontal Refinement

SynthMaster

New Algorithm

SynthSlave

BRAM

BRAM

Depth

ISS Info

Char

Data

Transaction

Info


Goals for Metro II

Import heterogeneous IP

Different languages

Different models of computation

Key Platform-based Design Activities

Behavior-Performance Separation

Quickly change performance characteristics of models

Design Space Exploration

Relate functionality and architecture

Verify relationships between different abstraction levels

Coordination

Framework

3-Phase Execution

Event-oriented

Framework

Components, Ports, and Connections

required

port

Component

provided

port

Wrapper

view port

IP

  • Ports

    • Coordination: provided, required

    • View ports

  • Connections

    • Each method in the interface of a provided-required connection is associated with begin and end events

  • IP is wrapped to expose framework-compatible interface

  • Components encapsulate wrapped IP


Mappers

  • Mappers are objects that help specify the mapping

    • Bridge syntactic gaps only

    • E.g. Missing method parameters

  • Mapping occurs at the component level

    • Between components with compatible interfaces

    • Possibly many functional components mapped to a single architectural component
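As a rough illustration of bridging a syntactic gap, the Python sketch below binds the parameters that a functional call does not supply. The names arch_read and mapper_read are hypothetical, and which parameters the functional side leaves out is an assumption; this is not the Metro II mapper API.

```python
from functools import partial


def arch_read(addr, offset, cnt, size):
    """Stand-in for an architectural Read service; it just returns its arguments."""
    return (addr, offset, cnt, size)


# The mapper fixes the parameters missing on the functional side.
mapper_read = partial(arch_read, addr=0x34, offset=8, size=4)

# The functional component only knows how many items it wants to read.
result = mapper_read(cnt=10)
```

Several functional components could each get their own partial binding onto the same arch_read, which matches the many-to-one mapping described above.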

Func. Comp

Arch. Comp

Mapper


Adaptor

Events

?

Events

Component1

(MOC1)

Component2

(MOC2)

Adaptor

How to communicate

with different MoC?

  • Adaptor transforms the tags of the events to make different MoCs compatible

    • Values are not changed

    • Will not produce/discard events
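A minimal Python sketch of the stated contract: tags change, values and the number of events do not. The event encoding as (tag, value) pairs, the adapt function, and the PERIOD constant are assumptions for illustration only.

```python
def adapt(events, retag):
    """events: list of (tag, value). Returns the same values, in the same
    order and number, with each tag transformed by retag."""
    return [(retag(tag), value) for tag, value in events]


PERIOD = 5  # hypothetical time step between consecutive dataflow tokens

# Dataflow events are tagged by sequence index; the target MoC expects time tags.
dataflow_events = [(0, "a"), (1, "b"), (2, "c")]
timed_events = adapt(dataflow_events, lambda index: index * PERIOD)
```

The list comprehension makes the two guarantees easy to see: every input event yields exactly one output event, and the value is passed through untouched.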


Bridge different models of computation (MoCs)


Implementation of Adaptor


Adaptor contains internal channels for storing the information of events, and a process that transforms the tags of events

Adaptor will be executed during the base model execution phase (phase 1)

Test case with an adaptor between dataflow and FSM semantics

Will be further tested in the cruise control and heating and cooling project


Metro II System Architecture Status

m2_manager

Mapper

Adaptor

Annotator

Scheduler

m2_port

m2_interface

m2_component

m2_method

Constraints

m2_event

Metro II Core

sc_module

sc_event

Implementation Platform: SystemC 2.2

Implementation started

Behavior-Performance Separation in Metropolis

3. Granting of requests

Phase 1

Phase 2

Global Time

P2

P1

Resource

Scheduler

2. Quantity

Resolution

R

1. Explicit quantity requests

  • Processes make explicit requests for annotation

  • Annotation/scheduling are intertwined

    • Iteration between multiple quantity managers

  • Challenges in GM case study

    • Vehicle stability application on distributed CAN architecture

    • Interactions between global time QM and resource QM difficult to debug


Execution Semantics in Metro II

Not Blocked

Blocked

  • Metro II components (imperative code) are run by processes (sequential thread of execution).

start

Propose Event or Wait

Event Enabled or Notified

Metro II Process States

Execution Semantics in Metro II

start

Phase 3

Phase 1

Phase 2

Logical Time

P2

P1

Physical Time

3. Sched.

Resolution

Annotated

Inactive

Proposed

R

Resource Scheduler

1. Block processes at interfaces

Event proposed by Process

Event enabled by CS then process continues execution

Event Disabled by CS must be reannotated

4. Enable some processes

Event Annotated

2. Annotations

Event Disabled by CS, but keep the same annotations
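The three phases can be sketched as a loop in Python. This is a toy model, not the Metro II kernel: processes are plain iterators of event names, annotate and schedule are caller-supplied functions, and a disabled event keeps its annotation (one of the two options listed above). All names are hypothetical.

```python
def run_three_phase(processes, annotate, schedule, max_rounds=100):
    """processes: dict name -> iterator of proposed event names.
    Returns the log of enabled events as (process, event, annotation)."""
    proposed = {}   # process name -> {"event": ..., "annotation": ...}
    log = []
    for _ in range(max_rounds):
        # Phase 1: each unblocked process runs until it proposes an event.
        for name, process in processes.items():
            if name not in proposed:
                event = next(process, None)
                if event is not None:
                    proposed[name] = {"event": event}
        if not proposed:
            break
        # Phase 2: annotate newly proposed events; events disabled in an
        # earlier round keep their annotation in this sketch.
        for entry in proposed.values():
            if "annotation" not in entry:
                entry["annotation"] = annotate(entry["event"])
        # Phase 3: the scheduler enables some processes; the rest stay blocked.
        for name in list(schedule(proposed)):
            entry = proposed.pop(name)
            log.append((name, entry["event"], entry["annotation"]))
    return log


cost = {"read": 2, "exec": 5, "write": 3}   # hypothetical annotations
log = run_three_phase(
    {"P1": iter(["read", "exec"]), "P2": iter(["write"])},
    annotate=cost.get,
    schedule=lambda proposed: [min(proposed)],  # one resource: lowest name wins
)
```

With this scheduler P2 stays blocked on its proposed write until P1 has nothing left to propose, which shows how disabled events carry over from round to round.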

Phases and Events

  • Each phase is allowed to interact with events in a limited way

    • Keep responsibilities separate


Assumptions

  • “Blocking”

    • Both the architectural and functional models should be allowed to block

  • Scheduling

    • Functional model execution is valid (i.e. doesn’t deadlock)

  • Mapping

    • The enabling of events in one model corresponds directly to the enabling of other events

Mapping

  • Mapping in Metro II requires:

    • Assigning functional operations to architecture services. Many-to-one relationship.

      • This is done through events.

  • Issues to resolve:

    • Which types and in what order should events be related between function and architecture?

    • How do processes present in the functional model trigger architectural components? How does simulation execution originate?

Proposal 1

Functional model initiates execution and is followed by the architecture model.

Function

F

R

P

G

P

Architecture

A

B

  • Port Mapping Conventions

    • Required to Provided

FR.e

FR.b

AP.b

A.body

AP.e

  • Call graph Example

Synchronized Events - - -

Direct Event Ordering __

GP.b

G.body

GP.e

Key

Proposal 2

Architectural model initiates execution and is followed by the functional model.

Function

F

R

P

G

P

Architecture

A

B

  • Port Mapping Conventions

    • Required to Provided

FR.e

FR.b

GP.b

G.body

GP.e

  • Call graph Example

Synchronized Events - - -

Direct Event Ordering __

AP.b

A.body

AP.e

Key

Proposal 3

Functional and architectural model execute concurrently.

Function

F

R

P

G

P

Architecture

A

B

  • Port Mapping Conventions

    • Provided to Provided

A.body

AP.b

AP.e

FR.e

FR.b

  • Call graph Example

G.body

GP.b

GP.e

Synchronized Events - - -

Direct Event Ordering __

Key

Key Points of Proposals

  • Proposal 1 – Functional model execution cannot be determined by architectural state.

  • Proposal 2 – Architecture model must block if the functionality blocks.

  • Proposal 3 – Requires that the component’s execution be granular enough to support explicit synchronization opportunities (i.e. protocols).

Mapping Granularity Tradeoff

  • Granularity changes may be needed to support proposal 3.

  • The functional and architectural models need not have the same level of granularity.

  1. Grab bus access

  2. Read fifo status

  3. If it can proceed to read/write

    • Read/Write; release bus

  4. Else

    • Release bus; wait a random number of cycles; goto 1
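The fine-grained FIFO read protocol can be sketched in Python as a retry loop. The sketch makes several assumptions: the bus is modeled by a threading.Lock, the FIFO by a deque, the back-off range of 1 to 8 cycles is arbitrary, and on_wait is a hypothetical hook through which the caller models the passing of cycles.

```python
import random
import threading
from collections import deque


def fifo_read(bus, fifo, rng, on_wait, max_attempts=100):
    """Return (value, attempts) once a read succeeds."""
    for attempt in range(1, max_attempts + 1):
        with bus:                              # grab bus access
            if fifo:                           # read FIFO status
                return fifo.popleft(), attempt  # read; bus is released on exit
        # Bus already released here: wait a random number of cycles, then retry.
        on_wait(rng.randint(1, 8))
    raise RuntimeError("FIFO stayed empty")


bus, fifo, waits = threading.Lock(), deque(), []


def on_wait(cycles):
    """Test hook: record the back-off and let a producer fill the FIFO meanwhile."""
    waits.append(cycles)
    fifo.append("token")


value, attempts = fifo_read(bus, fifo, random.Random(0), on_wait)
```

The first attempt finds the FIFO empty and backs off; the second succeeds. At the coarser granularity, the whole loop would collapse into one begin and one end event for the FIFO read.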

FIFO READ Begin

FIFO READ END

Example Design Scenario

Functional Model

ExS

ExD

ExQ

ExH

Ex

Ex

Ex

Ex

Source

DCT

Quant

Huffman

FIFO

1

FIFO

2

FIFO

3

W

R

R

W

W

R

Ex

Ex

Arch

1

Bus

Architecture Model

MJPEG

Ex

Ex

Arch

2

Arch

3

Arch

4

Shared FIFO is another design scenario

Hand Trace for Proposal 3

EXD

DCT

Arch2

FIFO

Bus

Arch1

EXS

SRC

PRET Functional Model

PREcision Timed machine work at UC Berkeley is going to require support for a variety of timing models.

Expose Execution Events

TLP1

TLP2

TLP3

Dummy Events Added

TLE1

TLE2

TLE3

Task: 1, 2, 3

Period: 5, 3, 11

Start: 8, 3, 10

Exe Time: 1, 2, 3

[Figure: timeline of the three tasks; tick values omitted.]

Metro II Mapping Conclusions

  • Metro II mapping uses events to synchronize execution between the functional and architectural model.

  • Potential tradeoffs in granularity and expressiveness depend on the mapping style (Metro II supports various).

  • Established a style to describe Metro II execution and started a set of design scenarios to discuss the tradeoffs.

Design Activity: UMTS Case Study

UMTS is a mobile communication protocol standard

Universal Mobile Telecommunications System

3G cell phone technology

Often used in Software Defined Radio (SDR)

Started with C and SystemC models as baseline

Source of Metro II functional models

Profiling to use in architecture models

Comparisons for Metro II simulation results

Have both DLL and PHY level SystemC models

Converted only data link layer to Metro II

UMTS DLL Function Model

RLC

Tr Buffer

Segment.

RLC Header

Add

Ciphering

fifo

MAC

TrCH Type

Switch

C_T_Mux

Tr_format

sel

PHY

Transmitter

13 Computational Components

12 FIFOs

Receiver

MAC

RLC

CT

DEMUX

Rx_TrCH

Type Switch

Deciphering

RLC Header

Rem

Reassembly

Metro II UMTS Models

Focused on the DLL layer

Initial SystemC model was converted to Metro II

  • Two Models:

  • Pure functional model with blocking read and write semantics.

  • Timed model with a scheduler and preemption.

Synchronization Mechanisms

UMTS example exposed two approaches to synchronization in Metro II:

Explicit Synchronization:

Use the underlying simulation framework directly

i.e. SystemC “or/and” waits

Constraints:

Move synchronization from phase 1 to phase 3 completely.

Metro II: Service Modeling

Two basic architecture modeling styles: cycle accurate runtime analysis vs. off line, pre-profiled approach

Task

Mapper

Architecture

Component

Cycle Accurate

Pipeline

IMEM

SPARC Runtime Processing Element

A runtime processing based element was created to model the Leon 3 SPARC processor

Architecture Model Overview

Tasks for mapping 1-to-1 with functional components

  • RTOS for scheduling events from N tasks to M processing elements

  • Three scheduling policies:

  • Round Robin

  • Fixed Priority

  • FCFS
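The three policies differ only in how they pick the next task from the pending set. The Python sketch below shows one plausible selection rule for each; the function signatures, the tie-breaking, and the example priorities are assumptions, not the RTOS model from the case study.

```python
def fcfs(pending):
    """pending: list of task names in arrival order; the oldest request wins."""
    return pending[0]


def fixed_priority(pending, priority):
    """Lower number means higher priority; ties go to the earlier arrival."""
    return min(pending, key=lambda task: priority[task])


def round_robin(pending, ring, last):
    """Pick the first pending task that follows `last` in the fixed ring order."""
    start = ring.index(last) + 1 if last in ring else 0
    for offset in range(len(ring)):
        task = ring[(start + offset) % len(ring)]
        if task in pending:
            return task
    return None


ring = ["T1", "T2", "T3"]
pending = ["T3", "T1"]                       # T3 requested before T1
priority = {"T1": 0, "T2": 1, "T3": 2}       # hypothetical priorities
```

On this pending set, FCFS picks T3 and fixed priority picks T1, while round robin picks T3 if T1 ran last and T1 if T3 ran last.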

Numerous configurations of processing elements (48 chosen)



Metro II Complete System

[Figure: Metro II complete system. Labels: Phase 1, Phase 2, Phase 3; FC1 … FCN; M1, M2, … MN; T1, T2, … TN; Annotator; Mapping Constraint Solver; OS; Sparc1, ARM7, uB; Logical Time Scheduler]

UMTS Case Study Outcomes

Processing elements include

ARM7, ARM9: GPP profiling approach

Microblaze: Programmable platform flow

SPARC: runtime profiling with C code snippets of core routines

48 different mappings explored

11 PEs, 1 PE, combinations of 4 PEs broken down by RLC tx/rx and MAC tx/rx

9 classes within the 48 mappings

Estimated Execution Time and Utilization

[Chart. Label: "Measured exe. time:". Callouts: 1 uB, +2%; 11 uB, +3.1%; 4 uB, +16%]

Execution Time and Utilization Analysis

Round Robin

Mapping #1 (fastest, 11 SPARCs) and #46 (slowest, 1 uBlaze) had a 2,167% difference

Priority

Avg. execution time reduced by 13% over round robin

Avg. utilization decreases by 2%

FCFS

Avg. execution time reduced by 7%

Avg. utilization increases by 27%

Runtime analysis across phases and mapping classes

An average of 61% of the time is spent in Phase 1, 5% in Phase 2 and 17% in Phase 3 (third section). For most models using runtime processing (RTP) the averages are 93%, 0.9%, and 3% respectively.

For pure profiled (PP) mappings they are 21%, 7% and 26%.

For mixed classes the numbers are 82%, 2.6% and 7.6%.

Key message: runtime processing elements dominate.

Despite all of this, the average runtime to process 7000 bytes of data was 54 seconds.

SystemC vs. Metro II

Metro II timed functional model has a 7.4% increase in runtime over SystemC timed functional model

Mapped Metro II model is 54.8% faster than timed SystemC model

Metro II phases 2 and 3 have significantly less overhead than the timer-and-scheduler based system required by the SystemC timed functional model

In a comparison of the Metro II timed model running without constraints and one running with them, the average runtime decrease was 25%

Design Effort

  • Entire design

    • 85 files

    • 8,300 LOC

  • Mapping change affects only 2 files

  • Metro II conversion affects 1% of lines in each file

    • 58% of these lines relate to constraint registration

  • SystemC SPARC model conversion adds only 3.4% to code size (92 lines)


Future Directions

Complete another design using an architecture similar to the UMTS one, but with an emphasis on the communication structure

H.264 decoding

Produce documentation of design activities

Recently put tech report on “Metro II Semantics for Mapping” on GSRC website

Metro II journal paper in progress

Design Activity: Heating and Cooling in Building Automation

[Figure labels: Source (Temp and Pressure constant), Room 1 (P,T), Room 2 (P,T), Environment (P,T), Door (Crack Model), Sink]

DOP Center Modeling

  • Sensor to controller

    • Latency: 0.3 s

    • Message length: 8 bits

    • Period: 1 s

  • Controller to actuators

    • Latency: 0.4 s

    • Message length: 16 bits

    • Period: 1 s

  • 8 networks (2.5 Mb/s) plus a high-speed, second-level network

  • Estimated cost: $21385

  • Bus load: 96 kb/s (min), 237 kb/s (max), 139 kb/s (avg)

  • Networks are distance and degree limited, not bandwidth limited

  • Network library

    • Field bus 78 kb/s (ARCNET)

    • Field bus 2.5 Mb/s (ARCNET)

    • Constraints: topology, degree, length

    • Two-level hierarchical network
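The "not bandwidth limited" remark can be checked with simple arithmetic. The sketch below assumes the reported bus loads apply to a single 2.5 Mb/s network and that 1 Mb/s equals 1000 kb/s; the function and variable names are hypothetical.

```python
CAPACITY_KBPS = 2500.0  # 2.5 Mb/s field bus, assuming 1 Mb/s = 1000 kb/s

# Bus loads reported on the slide, in kb/s.
BUS_LOAD_KBPS = {"min": 96.0, "avg": 139.0, "max": 237.0}


def utilization(load_kbps, capacity_kbps=CAPACITY_KBPS):
    """Fraction of the bus capacity consumed by the given load."""
    return load_kbps / capacity_kbps


if __name__ == "__main__":
    for label, load in BUS_LOAD_KBPS.items():
        print(f"{label}: {utilization(load):.1%} of capacity")
```

Under these assumptions even the maximum load stays below 10% of the bus capacity, which is consistent with distance and degree, rather than bandwidth, being the binding constraints.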


COSI and Metro II: Design Flow

[Figure: design flow. Labels: Step 1, Step 2, Step 3; COSI; COSI Synthesis results; MetroII Architecture Model; Refined Metro II Architecture Model; Metro II Functional Model; Mapping; Metro II Simulation Results]

Heating and Cooling Functional Model

[Figure: Modelica Model (OpenModelica) linked by CORBA communication to the MetroII model. Labels: Room Temperatures; A1, S1, A2, S2; Controller1 with FIFO_a1c and FIFO_s1c; Controller2 with FIFO_a2c and FIFO_s2c]

GSRC Quarterly Workshop

Current Status

Completed and tested function model

Interacts with Modelica model

Number of rooms is configurable (both full and mini-DOP center models created).

Created architecture model with mapping

Processor, sensor, actuator numbers are configurable

Supports multiple controller tasks on the same processor (with an orthogonalized scheduling policy model)
