Notes on an actor language, Jörn W. Janneck, Xilinx Inc., 13 February 2007, 7th Ptolemy Miniconference
Notes on an actor language

Jörn W. Janneck, Xilinx Inc.

13 February 2007, 7th Ptolemy Miniconference


Outline

  • CAL @ Ptolemy

    • the language

    • domain-dependent interpretation

  • CAL @ Xilinx

    • overview

    • application

CAL Actor Language

  • scripting actor specifications

    • make it easier to write atomic actors

  • experimenting with domain polymorphism

  • (code generation)


Actors in CAL

  • Actions: guarded atomic actions

  • State: encapsulated state


Simple actors

actor Sum () Input ==> Output:
  sum := 0;

  action [a] ==> [sum]
  do
    sum := sum + a;
  end
end

actor SumAbs () Input ==> Output:
  sum := 0;

  action [a] ==> [sum]
  guard a >= 0
  do
    sum := sum + a;
  end

  action [a] ==> [sum]
  guard a < 0
  do
    sum := sum - a;
  end
end

Figure: the actors Sum and SumAbs, each with one input port (Input) and one output port (Output).
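To make the firing behavior of SumAbs concrete, here is a minimal Python sketch, not CAL and not generated by any CAL tool. The class name, the `fire` method, and the `run` helper are hypothetical names chosen for illustration, and the sketch assumes that the output expression `[sum]` is evaluated after the action body has updated the state.

```python
from dataclasses import dataclass


@dataclass
class SumAbs:
    """Hypothetical Python stand-in for the CAL actor SumAbs."""

    total: int = 0  # encapsulated state; called `sum` in the CAL source

    def fire(self, a):
        """Fire the one action whose guard holds for token `a`."""
        if a >= 0:
            self.total += a  # first action: guard a >= 0
        else:
            self.total -= a  # second action: guard a < 0
        # Assumption: the output token [sum] reflects the updated state.
        return self.total


def run(actor, tokens):
    """Fire the actor once per input token and collect the output tokens."""
    return [actor.fire(token) for token in tokens]


outputs = run(SumAbs(), [3, -4, 5])
print(outputs)  # [3, 7, 12]
```

Because the two guards are mutually exclusive and together cover every number, exactly one action is enabled for each token, so this actor stays deterministic.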


Nondeterminism

actor NDMerge () Input1, Input2 ==> Output:
  action Input1: [x] ==> [x] end
  action Input2: [x] ==> [x] end
end

Figure: the actor NDMerge, with input ports Input1 and Input2 and output port Output.
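The nondeterminism comes from both actions being enabled whenever both inputs hold a token. A rough Python sketch of that choice follows; the function name `nd_merge` and the use of a seeded random generator to stand in for the scheduler's free choice are assumptions for illustration only.

```python
import random
from collections import deque


def nd_merge(input1, input2, rng):
    """Merge two token sequences, picking any enabled action at each step."""
    queue1, queue2 = deque(input1), deque(input2)
    output = []
    while queue1 or queue2:
        # An action is enabled when its input port has a token available.
        enabled = [queue for queue in (queue1, queue2) if queue]
        # Assumption: the free choice between enabled actions is modeled as random.
        output.append(rng.choice(enabled).popleft())
    return output


merged = nd_merge([1, 2, 3], ["a", "b"], random.Random(0))
print(merged)
```

Whatever interleaving is chosen, each input's tokens keep their relative order and every token appears exactly once in the output.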


Data-dependent token flow

actor Select () S, A, B ==> Output:
  action S: [sel], A: [v] ==> [v]
  guard sel end

  action S: [sel], B: [v] ==> [v]
  guard not sel end
end

Figure: the actor Select, with input ports S, A, and B and output port Output.
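A small Python sketch of how Select might consume tokens, assuming the guard can inspect the pending control token before anything is consumed. The function name `select` and the returned leftover lists are hypothetical and only there to make the data-dependent consumption visible.

```python
from collections import deque


def select(s_tokens, a_tokens, b_tokens):
    """Route tokens from A or B to the output, steered by the control port S."""
    s, a, b = deque(s_tokens), deque(a_tokens), deque(b_tokens)
    output = []
    while s:
        sel = s[0]  # the guard looks at the control token without consuming it
        source = a if sel else b
        if not source:
            break  # the enabled action is missing its data token, so nothing fires
        s.popleft()
        output.append(source.popleft())
    return output, list(a), list(b)


routed, left_a, left_b = select([True, False, False, True], [1, 2], [10, 20, 30])
print(routed)  # [1, 10, 20, 2]
print(left_b)  # [30]
```

Which input port is read on a given firing depends on the value of the control token, so the consumption rates on A and B cannot be fixed in advance.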


CAL and domain polymorphism

  • two fundamental questions:

    • Can an actor be interpreted/used in a given MoC?

    • What is its interpretation?

  • domain-specific interpretation


Example: SDF

actor Add () Input1, Input2 ==> Output:
  action [a], [b] ==> [a + b] end
end

Figure: the actor Add, with rate 1 on Input1, rate 1 on Input2, and rate 1 on Output.

actor AddSeq () Input ==> Output:
  action [a, b] ==> [a + b] end
end

Figure: the actor AddSeq, with rate 2 on Input and rate 1 on Output.
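The difference between the two rate signatures can be seen by firing both actors on plain Python lists. This is an illustrative sketch only; `fire_add` and `fire_add_seq` are hypothetical helper names, and token queues are modeled as lists.

```python
def fire_add(input1, input2):
    """Add: each firing consumes 1 token per input port and produces 1 token."""
    return [a + b for a, b in zip(input1, input2)]


def fire_add_seq(tokens):
    """AddSeq: each firing consumes 2 tokens from one port and produces 1 token."""
    complete = len(tokens) - len(tokens) % 2  # a trailing odd token cannot fire
    return [tokens[i] + tokens[i + 1] for i in range(0, complete, 2)]


add_out = fire_add([1, 2, 3], [10, 20, 30])
add_seq_out = fire_add_seq([1, 2, 3, 4, 5])
print(add_out)      # [11, 22, 33]
print(add_seq_out)  # [3, 7]
```

Both actors compute a sum of two tokens; only the port pattern, and with it the SDF rate annotation, differs.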


Example: SDF (cont’d)

actor NDMerge () Input1, Input2 ==> Output:
  action Input1: [x] ==> [x] end
  action Input2: [x] ==> [x] end
end

Figure: the actor NDMerge, with ports Input1, Input2, and Output and no rate annotations.

actor Merge () Input1, Input2 ==> Output:
  action [x1], [x2] ==> [x1, x2] end
end

Figure: the actor Merge, with rate 1 on Input1, rate 1 on Input2, and rate 2 on Output.
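One plausible way to answer "can this actor be used in SDF?" is to check that every action consumes and produces the same number of tokens on the same ports. The sketch below assumes exactly that criterion, which is a simplification and not necessarily the check used in Ptolemy; it also ignores guards. The dictionary encoding of actions and the name `sdf_rates` are hypothetical.

```python
def sdf_rates(actions):
    """Return the shared (consumed, produced) rates, or None if actions differ."""
    first = actions[0]
    if all(action == first for action in actions):
        return first
    return None


# Each action is (tokens consumed per input port, tokens produced per output port).
ADD = [({"Input1": 1, "Input2": 1}, {"Output": 1})]
ADD_SEQ = [({"Input": 2}, {"Output": 1})]
MERGE = [({"Input1": 1, "Input2": 1}, {"Output": 2})]
ND_MERGE = [
    ({"Input1": 1}, {"Output": 1}),
    ({"Input2": 1}, {"Output": 1}),
]

for name, actions in [("Add", ADD), ("AddSeq", ADD_SEQ),
                      ("Merge", MERGE), ("NDMerge", ND_MERGE)]:
    print(name, sdf_rates(actions))
```

Under this criterion Add, AddSeq, and Merge have fixed rates that match the figures, while NDMerge does not, because its two actions read from different ports.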


Some kind of “synchronous”...

Figure: a network combining the actors Merge, F, NDMerge, and A, annotated with token rates of 1 and 2 on the connections.


Example: CSP

actor NDMerge () Input1, Input2 ==> Output:
  action Input1: [x] ==> [x] end
  action Input2: [x] ==> [x] end
end

NDMerge as a CSP process:

[
  Input1 ? x -> Output ! x
||
  Input2 ? x -> Output ! x
]

actor Add () Input1, Input2 ==> Output:
  action [a], [b] ==> [a + b] end
end

Add as a CSP process, reading the inputs in a fixed order:

Input1 ? a ->
Input2 ? b ->
Output ! a + b

Add as a CSP process, reading the inputs in either order:

[
  Input1 ? a -> Input2 ? b
||
  Input2 ? b -> Input1 ? a
] ; Output ! a + b


Example: CSP (cont’d)

actor Select () S, A, B ==> Output:
  action S: [sel], A: [v] ==> [v]
  guard sel end

  action S: [sel], B: [v] ==> [v]
  guard not sel end
end

S ? sel;
[
  sel -> A ? v -> Output ! v
||
  not sel -> B ? v -> Output ! v
]

actor A () X, Y ==> Z:
  action X: [x1, x2] ==> [f(x1, x2)]
  guard P(x1, x2) end

  action Y: [y1, y2] ==> [f(y1, y2)]
  guard P(y1, y2) end
end

?


CAL and dataflow at Xilinx

Figure: tool flow from actor source + network to simulation, software, and hardware (via high-level synthesis). The software target is shown as:

class MyActor
{ schedule(); readPort( portNum ); writePort( portNum ); }

  • new FPGA programming model & tools

    • hardware code generation

    • software (& mixed) code generation

  • driver application

    • MPEG4 Simple Profile Decoder

  • MPEG standardization effort

    • ISO/IEC 23001-4 (working draft): Codec Configuration Representation

    • ISO/IEC 23002-4 (working draft): Video Tool Library


FPGA programming in practice: networked MPEG-4 viewer

Figure: block diagram of the viewer, with these labeled components:

  • Remote Video Stream Server

  • UDP over Ethernet

  • XUP Board (2VP30)

  • Ethernet

  • UDP

  • Microblaze running LWIP protocol stack

  • Memory Controller

  • Decoder Actor Network

  • Raster Scan Actor

  • VGA Display IP

  • Local VGA Monitor


MPEG-4 SP decoder: quality of compiled code

Table footnotes:

1 http://www.xilinx.com/bvdocs/ipcenter/data_sheet/ds520_prod_brf.pdf

2 BRAM-limited to 4-CIF image size.

3 Supports HD image size. Reduces to 16 BRAMs for 4-CIF image size.


Comparing decoder solutions

Figure: relative area efficiency (axis ticks at 1, 2, 5, 10) versus throughput in macroblocks/sec x1000 (axis ticks at 10, 100, 1000, with the CIF, SD, and HD rates marked). Four data points are plotted:

  • a: TI64xx MPEG-4 (CPU + L1 cache only)

  • b: ISSCC’06 H.264 capable (includes periphery)

  • c: FPGA MPEG-4 using traditional HDL flow (12 MM effort)

  • d: FPGA MPEG-4 using actor/dataflow synthesis (3 MM effort)


Thank You.

Credits:

Dave B. Parlour, Ian D. Miller, Johan Eker, Edward A. Lee, and many others.

CAL actor language: embedded.eecs.berkeley.edu/caltrop



Programming language adoption

Name         TPCI      TPCI cum.   Year
C            17.66%    17.66%      1973
C++          11.06%    28.73%      1985
Perl          5.48%    34.20%      1987
Python        3.47%    37.67%      1990
VB            9.73%    47.40%      1991
Delphi        2.15%    49.54%      1994
Java         21.17%    70.72%      1995
PHP           9.86%    80.58%      1995
JavaScript    2.20%    82.78%      1995
C#            3.07%    85.85%      2002

Figure: cumulative TPCI by language creation date (for top 10 languages), plotted from 0 to 100 over the years 1970 to 2005.

source: TIOBE Programming Community Index, TPCI, October 2006, http://www.tiobe.com/tpci.htm


Smaller, faster, easier: too good to be true?

  • This is what happens when design effort is constrained.

  • The key is enabling architectural exploration with rapid turn-around time.

  • New decoder architecture incorporates many improvements over original design in motion compensation, AC/DC reconstruction, parser, 2-d IDCT.

  • Approximate effort, in man-months (MM):

    • VHDL decoder: 12 MM

    • Dataflow decoder: 3 MM


Architectural exploration: MPEG-4 motion compensator

video stream feedback

PROBLEM! Memory latency for random access reads and writes prevents real-world operation at HD rates.

video frame buffer (off-chip DRAM)


First step: try on-chip cache

  • Break the address and data streams, insert a cache placeholder.

  • Insert different policies, see what happens.

    • policy1: pass-through, just to make sure the model is OK.

    • policy2: insert a cache actor in the read path and monitor statistics.


Simulation result with policy2

  • Memory controller performance: 133 MHz clock, 32-pixel cache line fill in ~18 cycles

  • Worst case compensation is 81 reads for an 8x8 block.

  • 8.3% miss rate implies average read is ~2.4 cycles

  • Rate limit is 44 Mpixel/s

  • HD (1920x1080, 4:2:0, 30 fps) rate target is 93.3 Mpixel/s

  • Options for improvement:

    • more expensive controller

    • much better cache policy

    • application-aware prefetch

Monitor console:

Frame 1 OK time: 28111ms
Frame 2 OK time: 23834ms
Requests: 49456, Hits: 45360
Miss rate: 8.28%
Frame 3 OK time: 27369ms
Requests: 98704, Hits: 90512
Miss rate: 8.30%
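The slide's numbers can be reproduced with a short back-of-the-envelope calculation. The Python sketch below is one reading that matches them, under two assumptions that are not stated on the slide: a cache hit costs 1 cycle, and a miss costs the full 18-cycle line fill. The function names are hypothetical.

```python
def average_read_cycles(miss_rate, hit_cycles, miss_cycles):
    """Expected cycles per read for a given miss rate."""
    return (1 - miss_rate) * hit_cycles + miss_rate * miss_cycles


def pixel_rate(clock_hz, cycles_per_read, reads_per_block, pixels_per_block):
    """Worst-case pixels per second when every block needs `reads_per_block` reads."""
    blocks_per_second = clock_hz / (cycles_per_read * reads_per_block)
    return blocks_per_second * pixels_per_block


# Assumption: 1 cycle per hit, 18 cycles per miss (the line fill time).
avg_cycles = average_read_cycles(miss_rate=0.083, hit_cycles=1, miss_cycles=18)

# 81 reads are needed in the worst case to compensate one 8x8 block (64 pixels).
rate_limit = pixel_rate(133e6, avg_cycles, reads_per_block=81, pixels_per_block=64)

# 4:2:0 sampling carries 1.5 samples per pixel (luma plus two quarter-size chroma planes).
hd_target = 1920 * 1080 * 1.5 * 30

print(f"average read: {avg_cycles:.2f} cycles")
print(f"rate limit:   {rate_limit / 1e6:.1f} Mpixel/s")
print(f"HD target:    {hd_target / 1e6:.1f} Mpixel/s")
```

With these assumptions the average read comes out at roughly 2.4 cycles and the rate limit at roughly 44 Mpixel/s, less than half of the 93.3 Mpixel/s HD target.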


Step 2: application-aware prefetch

prefetch requests to frame buffer

prefetch data

replace cache with “search window”

compensation addresses now relative to search window

search window senses block type


Results of prefetch strategy

  • Better performance

    • prefetch needs to operate at 3x pixel rate

    • exploits longer burst read with application-awareness (longer cache line did not help policy2 significantly)

    • 64 pixels in 26 cycles → average read is ~ 0.4 cycles

    • peak theoretical performance is 111 Mpixel/s

    • exceeds HD rate target with cheap DRAM

  • Substantial change to overall model behavior, but

    • impact limited to two actors

    • no refactoring of control in other actors needed
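The prefetch figures above can be checked with similar arithmetic. The sketch below rests on a guess: that the peak rate is the burst bandwidth divided by the 3x factor at which prefetch has to run. That reading reproduces the slide's 111 Mpixel/s when the rounded 0.4 cycles per pixel is used, but the slide does not spell out the derivation. The 133 MHz clock is taken from the policy2 results, and the function names are hypothetical.

```python
def cycles_per_pixel(pixels, cycles):
    """Average cycles spent per pixel for one burst read."""
    return cycles / pixels


def peak_pixel_rate(clock_hz, cycles_per_px, prefetch_factor):
    """Assumed model: burst bandwidth divided by the prefetch overhead factor."""
    return clock_hz / (cycles_per_px * prefetch_factor)


exact_cpp = cycles_per_pixel(pixels=64, cycles=26)  # 0.40625, shown as ~0.4

# Using the rounded value from the slide and the 3x prefetch factor.
peak_rounded = peak_pixel_rate(133e6, 0.4, prefetch_factor=3)

# Using the unrounded value gives a slightly lower figure.
peak_exact = peak_pixel_rate(133e6, exact_cpp, prefetch_factor=3)

hd_target = 93.3e6  # HD rate target quoted in the policy2 results

print(f"cycles per pixel: {exact_cpp:.3f}")
print(f"peak (rounded):   {peak_rounded / 1e6:.1f} Mpixel/s")
print(f"peak (exact):     {peak_exact / 1e6:.1f} Mpixel/s")
print("exceeds HD target:", peak_exact > hd_target)
```

Either way the estimate stays above the HD target, which is consistent with the claim that cheap DRAM is now sufficient.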


The FPGA programming problem

  • Big, heterogeneous chips

  • circuit-design programming (+ C, Simulink, ...)

  • 1985:

    • 128 4-LUTs

  • 2006: [V5-LX]

    • 207360 6-LUTs

    • 10 Mbit BRAM

    • 192 ALUs

