Embedded Computer Architecture 5KK73 MPSoC

Controlling the Parallel Resources

[Figure: design space trading flexibility against efficiency, spanning programmable CPU, programmable DSP, application-specific instruction set processor (ASIP), and application-specific processor]

Contents

GPUs revisited

PicoChip

Real-Time Scheduling basics

Resource Management

GPU basics

Synthetic objects are represented as a set of 3D triangles plus textures, described through a library such as OpenGL or DirectX

Triangles are represented by 3 vertices

A vertex is represented by 4 floating-point coordinates

Objects are transformed between coordinate representations

Transformations are matrix-vector multiplications
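
As a minimal sketch of the statement above, the following Python code transforms the vertices of one triangle with a 4x4 matrix. It assumes that the fourth coordinate is the homogeneous coordinate w (set to 1 for points), as is conventional in OpenGL; the helper names `mat_vec` and `translation` are illustrative and not taken from the slides.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vertex."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def translation(tx, ty, tz):
    """Build a 4x4 translation matrix (hypothetical helper for this sketch)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

# One triangle: 3 vertices, each with 4 coordinates (x, y, z, w).
triangle = [[0.0, 0.0, 0.0, 1.0],
            [1.0, 0.0, 0.0, 1.0],
            [0.0, 1.0, 0.0, 1.0]]

# Moving the object to another coordinate representation is one
# matrix-vector multiplication per vertex.
moved = [mat_vec(translation(2.0, 0.0, -5.0), v) for v in triangle]
```

Every vertex is processed independently with the same matrix, which is why this workload maps well onto many parallel SIMD processors.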

GeForce 8800 GPU

330 Gflops, 128 processors with 4-way SIMD

GPU: Why more general-purpose programmable?

All transformations are shading

Shading is all matrix-vector multiplications

Computational load varies heavily between different sorts of shading

Programmable shaders allow dynamic resource allocation between shaders

Result:

Modern GPUs are serious competitors to general-purpose processors!

Real-time systems (Reinder Bril)
  • Correct result at the right time: timeliness
    • Many products contain embedded computers, e.g. cars, planes, medical and consumer electronics equipment, industrial control.
    • In such systems, it’s important to deliver correct functionality on time.
  • Example: inflation of an air bag
Example: Multimedia Consumer Terminals

[Figure: block diagram of a multimedia consumer terminal with cable modem, DVB tuner, IEEE 1394 interface, RF tuner, CVBS interface, YC interface, VGA, and DVD/CDx front end]

(by courtesy of Maria Gabrani)

Example: High quality video & real time

TV companies invest heavily in video enhancement, e.g. temporal up-scaling

Input stream: 24 Hz (movie)

Rendered stream: 60 Hz (TV screen)

[Figure: original frames and up-scaled frames]

Example: High quality video & real time

[Figure: original, up-scaled, and displayed frames]

  • A deadline miss leads to a “wrong” picture.
  • Deadline misses tend to come in bursts (heavy load).
  • Valuable work may be lost.
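
To make the timing constraint concrete, the sketch below works out which input frames surround each output frame when a 24 Hz stream is rendered at 60 Hz, and how long the system has to produce each output frame. The linear blend weight is an assumption chosen for illustration only; real products typically use more elaborate, motion-aware interpolation. The function name `upscale_plan` is hypothetical.

```python
from fractions import Fraction

def upscale_plan(in_hz, out_hz, n_out):
    """For each output frame, return (index of the previous input frame,
    blend weight of the next input frame)."""
    plan = []
    for k in range(n_out):
        # Position of output frame k, measured in input-frame units.
        pos = Fraction(k * in_hz, out_hz)
        prev = pos.numerator // pos.denominator
        plan.append((prev, pos - prev))
    return plan

plan = upscale_plan(24, 60, 5)

# Each output frame has to be ready within one display period.
deadline_ms = 1000 / 60
```

With these rates the pattern repeats every 5 output frames (2 input frames), and every output frame has a deadline of roughly 16.7 ms.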
Real-time systems
  • Guaranteeing timeliness requirements:
    • real-time tasks with timing constraints
    • scheduling of tasks
  • Fixed-priority scheduling (FPS) is the de-facto standard for scheduling in real-time systems.
  • FPS: supported by
    • commercially available RTOS;
    • analytic and synthetic methods.
Recap of FPS
  • Fixed Priority Pre-emptive Scheduling (FPPS)
    • A basic scheduling model
    • Analysis
    • Example
    • Optimality of RMS and DMS
FPPS: A basic scheduling model
  • Single processor
  • Set of n independent, periodic tasks $\tau_1, \dots, \tau_n$
  • Tasks are assigned fixed priorities, and can be pre-empted instantaneously.
  • Scheduling: At any moment in time, the processor is used to execute the highest priority task that has work pending.
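
The scheduling rule can be sketched as a small unit-time simulation. The task parameters used in the usage example (periods 10, 19, 56 and computation times 3, 11, 5) are an assumption: the transcript does not contain the task table, and these values were chosen because they reproduce the response times 3, 17, and 56 quoted on the example slides. The function name `simulate_fpps` is illustrative.

```python
def simulate_fpps(tasks, horizon):
    """Simulate fixed-priority pre-emptive scheduling in unit time steps.

    tasks: list of (period, computation_time), highest priority first.
    All tasks are released together at time 0.
    Returns the largest observed response time per task, counting only
    jobs that complete within the horizon.
    """
    pending = [[] for _ in tasks]   # per task: FIFO of [release, remaining]
    worst = [0] * len(tasks)
    for t in range(horizon):
        # Release new jobs at multiples of the period.
        for i, (period, cost) in enumerate(tasks):
            if t % period == 0:
                pending[i].append([t, cost])
        # Run the highest-priority task that has work pending.
        for i, queue in enumerate(pending):
            if queue:
                job = queue[0]
                job[1] -= 1
                if job[1] == 0:
                    worst[i] = max(worst[i], t + 1 - job[0])
                    queue.pop(0)
                break
    return worst

# Assumed task set (not in the transcript), highest priority first.
observed = simulate_fpps([(10, 3), (19, 11), (56, 5)], horizon=60)
```

Because the loop re-evaluates priorities at every time step, a newly released high-priority job pre-empts a running lower-priority job immediately, and overhead is ignored, as in the model.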
FPPS: A basic scheduling model
  • Task characteristics:
    • period T,
    • (worst-case) computation time C,
    • (relative) deadline D,
  • Assumptions:
    • non-idling;
    • context switching and scheduling overhead is ignored;
    • execution of releases in order of arrival;
    • deadlines are hard, and $D \le T$;
    • $\tau_1$ has the highest and $\tau_n$ the lowest priority;
    • no data dependencies between tasks.
FPPS: Example
  • Worst-case response time $WR_3$ of task $\tau_3$: first point in time at which $\tau_1$, $\tau_2$, and $\tau_3$ have all finished

[Figure: timeline from 0 to 60 showing the releases and executions of Task 1, Task 2, and Task 3, with $WR_1 = 3$, $WR_2 = 17$, $WR_3 = 56$]


FPPS: Analysis
  • Schedulable iff: $WR_i \le D_i$ for $1 \le i \le n$
  • Necessary condition: $U = \sum_{i=1}^{n} C_i / T_i \le 1$
  • Sufficient condition for RMS: $U \le LL(n) = n (2^{1/n} - 1)$, where RMS means $r_i > r_j$ iff $T_i < T_j$, with $D_i = T_i$.
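
The two utilization-based tests can be written down in a few lines. This is a sketch that assumes rate-monotonic priorities and $D_i = T_i$; the function names and the returned strings are illustrative, and the task set in the usage line is an assumed one (the task table is not in the transcript).

```python
def utilization(tasks):
    """Total utilization U of a list of (period, computation_time) pairs."""
    return sum(c / t for t, c in tasks)

def ll_bound(n):
    """Liu and Layland bound LL(n) = n * (2^(1/n) - 1)."""
    return n * (2 ** (1 / n) - 1)

def quick_test(tasks):
    """Classify a task set using only the utilization-based conditions."""
    u = utilization(tasks)
    if u > 1:
        return "not schedulable"          # necessary condition violated
    if u <= ll_bound(len(tasks)):
        return "schedulable under RMS"    # sufficient condition holds
    return "inconclusive"                 # an exact test is needed

# Assumed task set (period, computation time), highest priority first.
verdict = quick_test([(10, 3), (19, 11), (56, 5)])
```

For this assumed task set the utilization is about 0.97, which lies between LL(3) and 1, so the quick test cannot decide.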

FPPS: Analysis
  • Otherwise,
    • i.e. $U \le 1$ and not RMS, or
    • $n (2^{1/n} - 1) < U < 1$ and RMS
  • exact condition:
    • Critical instant: simultaneous release of $\tau_i$ with all higher priority tasks
    • $WR_i$ is the smallest positive solution of $x = C_i + \sum_{j < i} \lceil x / T_j \rceil C_j$
FPPS: Example
  • Task set Γ consisting of 3 tasks:
  • Notes:
    • RM priority assignment and $D_i = T_i$ (RMS);
    • $U_1 + U_2 + U_3 = 0.97 \le 1$, hence Γ could be schedulable;
    • Utilization bound: $U(n) \le LL(n) = n (2^{1/n} - 1)$:
      • $U_1 + U_2 = 0.88 > LL(2) \approx 0.83$,
      • therefore $U(3) > LL(3)$, hence another test is required.
FPPS: Example
  • Time line

[Figure: timeline from 0 to 60 showing the releases and executions of Task 1, Task 2, and Task 3, with $WR_1 = 3$, $WR_2 = 17$, $WR_3 = 56$]


FPPS: Optimality of RMS and DMS
  • Priority assignment policies:
    • Rate Monotonic (RM): $r_i > r_j$ iff $T_i < T_j$
    • Deadline Monotonic (DM): $r_i > r_j$ iff $D_i < D_j$
  • Under arbitrary phasing:
    • RMS is optimal among FPS when $D_i = T_i$;
    • DMS is optimal among FPS when $D_i \le T_i$,
    • where optimal means: if an FPS algorithm can schedule the task set, so can RMS/DMS.
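
Both policies amount to sorting the task set on a single attribute. The sketch below uses a hypothetical two-task set in which the two policies disagree; the dictionary layout and the tie-breaking by input order are assumptions of this example.

```python
def rate_monotonic(tasks):
    """Order tasks from highest to lowest priority: shorter period first."""
    return sorted(tasks, key=lambda task: task["T"])

def deadline_monotonic(tasks):
    """Order tasks from highest to lowest priority: shorter deadline first."""
    return sorted(tasks, key=lambda task: task["D"])

# Hypothetical task set: b has the longer period but the tighter deadline.
tasks = [
    {"name": "a", "T": 20, "D": 20},
    {"name": "b", "T": 50, "D": 10},
]

rm_order = [task["name"] for task in rate_monotonic(tasks)]
dm_order = [task["name"] for task in deadline_monotonic(tasks)]
```

When $D_i = T_i$ for every task the two orderings coincide, which is consistent with RMS being the special case of DMS.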
FPPS not suitable for multimedia multiprocessor!!

Assumptions:

  • context switching and scheduling overhead is ignored: no longer true
  • deadlines are hard, and $D \le T$: no longer true
  • $\tau_1$ has the highest and $\tau_n$ the lowest priority: no priorities
  • no data dependencies between tasks: not true
  • single processor: not true
Non-Preemptive Systems (Akash Kumar)
  • State-space needed is smaller
  • Lower implementation cost
  • Less overhead at run-time
  • Cache pollution, memory size
Why FPS doesn’t work for “future” high-performance platforms
  • Heavy-duty DSPs: Preemption not supported
    • If it were: context-switching overhead is significant
  • Data-dependencies not taken into account
  • Multi-processor
Related Research – Feasibility Analysis

[Figure: overview of feasibility-analysis work. Preemptive: [Liu, Layland, 1973]. Non-preemptive: [Jeffay, 1991]. Homogeneous MPSoC: [Baruah, 2006]. Heterogeneous MPSoC: [ , 2020??]. The drawing shows tasks A, B, C, D and processors P1 to P6.]


Problem

No good techniques exist to analyze and schedule applications on non-preemptive heterogeneous systems

A Resource Manager is proposed to schedule applications so that they meet their performance requirements on non-preemptive heterogeneous systems

Our Assumptions

[Figure labels: A2, B2, C2, D2; Task; Job]
  • Heterogeneous MPSoC
  • Applications modeled as SDF
    • Non-preemptive system: tasks cannot be stopped
    • Jobs can be suspended
  • A lot of dynamism in the system
    • Jobs arriving and leaving at run-time
    • Variation in execution time
  • Very simple arbiter at cores
Resource Manager

[Figure: control hierarchy. Application QoS Manager at application level (every few seconds); Resource Manager, which reconfigures to meet the above quality (milliseconds); Local Processor Arbiter at task level on each core (microseconds), shown with tasks A and B.]
Architecture Description

[Figure: Resource Manager connected to the local arbiters of processors P1, P2, and P3]
  • The available computation resources are described
  • Each processor can have a different arbiter
    • In this model a First Come First Serve (FCFS) mechanism is used
  • The resource manager can configure and control the local arbiters to regulate the progress of an application if needed
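
A First Come First Serve arbiter on a non-preemptive processor can be sketched as follows. The request format `(arrival_time, execution_time, name)` and the function name are assumptions made for this example; the slides do not specify an interface.

```python
def fcfs_finish_times(requests):
    """Serve requests on one non-preemptive processor in arrival order.

    requests: list of (arrival_time, execution_time, name).
    Returns a dict mapping each name to its finish time.
    """
    now = 0
    finish = {}
    for arrival, exec_time, name in sorted(requests):
        # A task starts when it has arrived and the processor is free,
        # and then runs to completion without being stopped.
        now = max(now, arrival) + exec_time
        finish[name] = now
    return finish

finish = fcfs_finish_times([(0, 5, "A"), (1, 2, "B"), (10, 1, "C")])
```

Task B arrives at time 1 but has to wait until A completes at time 5, which illustrates why the resource manager may need to regulate applications from outside the arbiter.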
Resource Manager
  • Responsible for two main things
  • Admission control
    • An incoming application specifies its throughput requirement
    • It also specifies the execution time and mapping of each actor
    • The repetition vector is used to compute the expected utilization
    • The resource manager (RM) checks whether enough resources are available
    • It allocates resources to the application if it is admitted
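
The admission steps above can be sketched as a utilization check per processor. The formula used here (required throughput times repetition count times execution time, summed per processor and compared against a capacity of 1.0) is an assumption about how the expected utilization is computed; the data layout and function names are hypothetical.

```python
def expected_utilization(app):
    """Return {processor: utilization} demanded by one application."""
    load = {}
    for actor in app["actors"]:
        # Firings per iteration * time per firing * iterations per time unit.
        work = actor["repetitions"] * actor["exec_time"] * app["throughput"]
        load[actor["processor"]] = load.get(actor["processor"], 0.0) + work
    return load

def admit(app, allocated, capacity=1.0):
    """Admit app only if no processor would exceed its capacity.

    allocated maps processor -> utilization already granted and is
    updated in place when the application is admitted.
    """
    demand = expected_utilization(app)
    if any(allocated.get(p, 0.0) + u > capacity for p, u in demand.items()):
        return False
    for p, u in demand.items():
        allocated[p] = allocated.get(p, 0.0) + u
    return True

# Hypothetical applications; throughput is in iterations per time unit.
video = {"throughput": 0.01, "actors": [
    {"processor": "P1", "repetitions": 2, "exec_time": 30},
    {"processor": "P2", "repetitions": 1, "exec_time": 20},
]}
audio = {"throughput": 0.02, "actors": [
    {"processor": "P1", "repetitions": 1, "exec_time": 25},
]}

allocated = {}
video_admitted = admit(video, allocated)
audio_admitted = admit(audio, allocated)
```

In this example the second application is rejected because it would push the load on P1 above the assumed capacity, and the allocation is left unchanged.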
Admission Control

[Figure: applications Video Conf, Play MP3, and Typing SMS mapped onto processors P1, P2, and P3; admission control reports "Resource requirement exceeded!"]


Resource Manager
  • Admission control
  • Budget enforcement
    • When running, each application signals RM when it completes an iteration
    • RM keeps track of each application’s progress
    • Operation modes
      • ‘Polling’ mode
      • ‘Interrupt’ mode
    • Suspends application if needed
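
A minimal sketch of the polling mode is given below. The policy shown (suspend an application while it has completed more iterations than its required throughput calls for, resume it otherwise) is an assumption for illustration, and the class and method names are hypothetical.

```python
class PollingBudgetEnforcer:
    """Tracks application progress and suspends applications that run ahead."""

    def __init__(self, required):
        # required: {application: required iterations per time unit}
        self.required = dict(required)
        self.completed = {app: 0 for app in required}
        self.suspended = set()

    def iteration_done(self, app):
        """Called by an application when it completes one iteration."""
        self.completed[app] += 1

    def poll(self, now):
        """Called once per polling interval with the current time."""
        for app, rate in self.required.items():
            if self.completed[app] > rate * now:
                self.suspended.add(app)       # better than required
            else:
                self.suspended.discard(app)   # back within budget
        return set(self.suspended)

enforcer = PollingBudgetEnforcer({"h263": 0.5, "jpeg": 0.25})
for _ in range(8):
    enforcer.iteration_done("h263")
for _ in range(2):
    enforcer.iteration_done("jpeg")

first_poll = enforcer.poll(now=10)    # h263 is ahead of 0.5 * 10 iterations
second_poll = enforcer.poll(now=20)   # h263 made no progress while suspended
```

A longer polling interval means less overhead for the resource manager but lets an application run further ahead of its budget before it is suspended.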
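Budget Enforcement (Polling)

[Figure: application performance over time under polling. Annotations: "New job enters!", "Performance goes down!", "Better than required!", "job suspended!", "job resumed!", with the Resource Manager issuing the suspend and resume decisions.]
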

Experiments
  • A high-level simulation model was developed
    • POOSL, a parallel simulation language, was used
  • A protocol for communication was defined
  • The system was verified with a number of application SDF models
  • A case study was done with H263 and JPEG application models
    • The impact of varying the ‘polling’ interval was studied