Scheduling

EE249

- Have some work to do
- know subtasks

- Have limited resources
- Have some constraints to meet
- Want to optimize quality


- Overview
- shop scheduling
- data-flow scheduling
- real-time scheduling
- OS scheduling

- Real-time scheduling
- RTOS generation
- scheduling
- Communication

- Data-flow scheduling
- pure
- Petri nets


Single job, one time

- finite and known amount of work
- multiple resources of different kind
- often minimize lateness
- could add release, precedence, deadlines, ...
SOLUTION: compute the schedule

APPLICATION: manufacturing


Single-job, repeatedly

- known amount of work
- simple subtasks

- multi-processor
- max. throughput, min. latency
SOLUTION: code generation

APPLICATION: signal processing


- Work
- data dependent (BDF, FCPN)

- Resources
- many different execution units (HLS)

- Goal
- min. code, min. buffers, min. resources


Fixed number of repeating jobs

- each job has fixed work
- job is a sub-task

- processor(s)
- meet individual deadlines
SOLUTION: choose policy, let RTOS implement it

APPLICATION: real-time control


- Work
- sporadic or event-driven tasks,
- variable (data dependent) work
- coordination between tasks:
- mutual exclusion, precedence, …

- Goal
- event loss, input or output correlation, freshness, soft deadlines, ...


Variable number of random tasks

- know nothing about sub-tasks
- processor + other computer resources
- progress of all tasks, average service time
SOLUTION: OS implements time-slicing

APPLICATION: computer systems


- Overview
- shop scheduling
- data-flow scheduling
- real-time scheduling
- OS scheduling

- Real-time scheduling
- RTOS generation
- Scheduling
- Communication

- Data-flow scheduling
- pure
- Petri nets


- Enable communication between software tasks, hardware and other system resources
- Coordinate software tasks
- keep track which tasks are ready to execute
- decide which one to execute: scheduling


- Implementing communication through events
- Coordination:
- classic scheduling results
- reactive model of real-time systems
- conservative scheduling analysis
- priority assignment


- Given:
- estimates on execution times of each task
- timing constraints

- Find:
- an execution ordering of tasks that satisfies constraints

- A schedule needs to be:
- constructed
- validated


- Plus side:
- simpler
- lower overhead
- highly predictable

- Minus side
- bad service to urgent tasks
- independent of actual requests


- off-line (pre-run-time, static)
- round-robin, e.g.
- C1 C2 C3 C4 C1 C2 C3 C4 C1 C2 C3 C4 …

- static cyclic, e.g.
- C1 C2 C3 C2 C4 C1 C2 C3 C2 C4 C1 C2 …

- on-line (run-time, dynamic)
- static priority
- dynamic priority
- preemptive or not


- synthesis:
- priority assignment
- RMS [LL73]

- analysis
- Audsley 91


- Liu and Layland [LL73] consider systems consisting of tasks:
- enabled periodically
- with fixed run time
- that should be executed before enabled again
- scheduled preemptively with statically assigned priorities


- giving higher priority to tasks with shorter period (rate-monotonic scheduling, RMS) is optimal
- if any other static priority assignment can schedule a task set, then RMS can schedule it too

- define utilization as the sum of $E_i/T_i$ (run time over period)
- any set of n tasks with utilization of less than $n(2^{1/n}-1)$ is schedulable
- for n = 2, 3, …: $n(2^{1/n}-1)$ = 0.83, 0.78, …, tending to ln(2) = 0.69
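The utilization test above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the task set are hypothetical, and each task is assumed to be an (E_i, T_i) pair of run time and period.

```python
def ll_bound(n):
    """Liu-Layland utilization bound n(2^(1/n) - 1) for n tasks."""
    return n * (2 ** (1.0 / n) - 1)


def utilization(tasks):
    """Sum of E_i / T_i over (run time, period) pairs."""
    return sum(e / t for e, t in tasks)


def rms_guaranteed(tasks):
    """Sufficient (not necessary) test: utilization below the bound."""
    return utilization(tasks) < ll_bound(len(tasks))


# Hypothetical task set: (E_i, T_i) pairs chosen only for illustration.
tasks = [(1, 4), (1, 5), (2, 10)]
print(round(ll_bound(2), 2), round(ll_bound(3), 2))  # 0.83 0.78
print(rms_guaranteed(tasks))                         # True
```

A task set that fails this test may still be schedulable under RMS; the bound only gives a guarantee in one direction.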


Audsley [91]:

- for a task in Liu-Layland’s model, find its worst-case execution time (WCET)

[Figure: timeline for tasks k, n and i, marking run time i, WCET i and period i on the time axis]


- let $E_i$ be the run times and $T_i$ the periods
- how much can i be delayed by a higher-priority task k:
- each execution of k delays it by $E_k$
- while i is executing, k will be executed $\lceil WCET_i / T_k \rceil$ times

- $WCET_i = E_i + \sum_{k>i} \lceil WCET_i / T_k \rceil \, E_k$ (sum over tasks k of higher priority than i)


- iteration
- $WCET_{i,0} = E_i$
- $WCET_{i,n+1} = E_i + \sum_{k>i} \lceil WCET_{i,n} / T_k \rceil \, E_k$
will converge if processor utilization is less than 1
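The iteration can be sketched as follows. This is an illustrative Python version under stated assumptions: times are positive integers, the task list is ordered from highest to lowest priority (so the higher-priority tasks k are those before index i), and the function names and task values are hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def wcet_fixed_point(i, tasks, max_iter=1000):
    """Iterate WCET_{i,n+1} = E_i + sum_k ceil(WCET_{i,n} / T_k) * E_k.

    tasks: (E, T) pairs ordered from highest to lowest priority, so the
    tasks before index i are the higher-priority ones (an assumption of
    this sketch). Returns the fixed point, or None if it is not reached
    within max_iter steps.
    """
    e_i = tasks[i][0]
    w = e_i                                   # WCET_{i,0} = E_i
    for _ in range(max_iter):
        nxt = e_i + sum(ceil_div(w, t_k) * e_k for e_k, t_k in tasks[:i])
        if nxt == w:                          # fixed point reached
            return w
        w = nxt
    return None


# Hypothetical task set, highest priority first.
tasks = [(1, 4), (2, 6), (3, 12)]
print([wcet_fixed_point(i, tasks) for i in range(len(tasks))])  # [1, 3, 10]
```

For the lowest-priority task the iterates are 3, 6, 7, 9, 10, 10, so the fixed point 10 lies within its period of 12.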


- Earliest deadline first:
- at each moment schedule a task with the least time before next occurrence

- LL have shown that for their model, EDF schedules any feasible set of tasks
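A small unit-step simulation shows the EDF rule at work. It is only a sketch under assumptions that go beyond the slide: integer run times and periods, all tasks released together at time 0, each job due at its task's next release, and ties broken by task index. The function name and task values are hypothetical.

```python
from functools import reduce
from math import gcd


def edf_schedule(tasks):
    """Simulate EDF for one hyperperiod in unit time steps.

    tasks: (E, T) integer pairs. Returns the task index run in each time
    unit (None when idle); raises RuntimeError on a missed deadline.
    """
    horizon = reduce(lambda a, b: a * b // gcd(a, b), (t for _, t in tasks))
    remaining = [0] * len(tasks)   # work left in the current job
    deadline = [0] * len(tasks)    # next occurrence of each task
    trace = []
    for now in range(horizon):
        for i, (e, t) in enumerate(tasks):
            if now % t == 0:       # task i is enabled again
                if remaining[i] > 0:
                    raise RuntimeError(f"task {i} missed its deadline at t={now}")
                remaining[i], deadline[i] = e, now + t
        ready = [i for i in range(len(tasks)) if remaining[i] > 0]
        if ready:
            run = min(ready, key=lambda i: deadline[i])  # earliest deadline
            remaining[run] -= 1
            trace.append(run)
        else:
            trace.append(None)
    if any(remaining):
        raise RuntimeError("work left over at the end of the hyperperiod")
    return trace


# Hypothetical task set with utilization 2/4 + 3/6 = 1.
print(edf_schedule([(2, 4), (3, 6)]))
# [0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1]
```

The same task set misses a deadline under rate-monotonic priorities (the second task has only two of its three units done by time 6), which illustrates why a dynamic-priority policy can schedule more task sets than a static one.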


- Liu-Layland model yields strong results but does not model reactivity well
- Our model:
- models reactivity directly
- abstracts functionality
- allows efficient conservative schedule validation


- System is a network of internal and external tasks
- External tasks have minimum times between execution
- Internal tasks have priorities and run times

[Figure: example task network with node labels 20 and 10 (single values) and 1,2; 5,2; 3,2; 2,1; 4,1 (pairs)]


- External tasks execute at random, respecting the lower bound between executions
- Execution of a task enables all its successors
- Correct if no events are lost


- To check correctness:
- check whether internal events can be lost
- priority analysis

- check whether external events can be lost
- bound WCET


- More general: if the fan-ins of i form a tree such that the leaves have lower priority than the non-leaves and k, then event (i,k) cannot be lost
- Simple: if the priority of i is lower than that of k, then event (i,k) cannot be lost

[Figure: for each case, an edge from task i to task k]


- Compute a bound on the period of time a processor executes tasks of priority i or higher (i-busy period)

[Figure: timeline of an i-busy period: a run of tasks of priority > i and i, with tasks of priority < i before and after it]

- (i-busy period) > (WCETi)


- i-busy period is bounded by:
- initial workload at priority level i or higher caused by execution of some task < i
- workload at priority level i or higher caused by execution of external tasks during the i-busy period

- can find (by simulation) workload at priority level i or higher caused by execution of a single task
- can bound the number of occurrences of external tasks in a given period
- need to solve a fix-point equation


[Figure: network of three CFSMs (CFSM1, CFSM2, CFSM3) exchanging events A, B, C, F and G; transition labels include B=>C, C=>F, C=>G, C=>A, C=>B, (A==0)=>B and F^(G==1)]


- CFSMs can be implemented:
- in hardware: HW-CFSMs
- in software: SW-CFSMs
- by built-in peripherals (e.g. timer): MP-CFSMs


- for every event, RTOS maintains
- global values
- local flags

[Figure: CFSM1 executes emit x(3); CFSM2 and CFSM3 detect x, whose value is 3]


- without atomic detects, TASK 1 can observe y AND NOT x, a condition that is never true (y is emitted only after x is detected)
- to avoid this, detects need to be atomic

TASK 1: detect x; detect y
TASK 2: emit x
TASK 3: if detect x then emit y


- for atomicity:
- always read from frozen
- others always write to live
- at the beginning of execution, switch

[Figure: a CFSM with two copies of its event flags, live and frozen]
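The live/frozen scheme can be sketched as a double buffer. This is a minimal Python illustration; the class and method names are hypothetical and not taken from the RTOS described in the slides.

```python
class EventBuffer:
    """Double-buffered event flags for one task (hypothetical sketch)."""

    def __init__(self):
        self.live = set()     # other tasks always write here
        self.frozen = set()   # the owning task always reads from here

    def emit(self, event):
        """Called by other tasks at any time."""
        self.live.add(event)

    def switch(self):
        """Called at the beginning of the owning task's execution."""
        self.frozen, self.live = self.live, set()

    def detect(self, event):
        """Called by the owning task during its execution."""
        return event in self.frozen


task1 = EventBuffer()
task1.switch()                 # TASK 1 starts executing
saw_x = task1.detect("x")      # x has not been emitted yet
task1.emit("x")                # TASK 2 emits x
task1.emit("y")                # TASK 3 detected x and emits y
saw_y = task1.detect("y")      # y is only in the live copy
print(saw_x, saw_y)            # False False
task1.switch()                 # next execution of TASK 1
print(task1.detect("x"), task1.detect("y"))  # True True
```

Within one execution TASK 1 sees a consistent snapshot, so it never observes y without x; both events become visible together at its next execution.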


- event can be polled or driving an interrupt
- for polled events:
- allocate I/O port bits for value, occurrence and acknowledge flags
- generate the polling task that acknowledges and emits all polled events that have occurred

- for events driving an interrupt:
- allocate I/O port bits for value,
- allocate an interrupt vector,
- create an interrupt service routine that emits an event


- interrupt service routine: { emit x }
- optional interrupt service routine: { emit x; execute SW-CFSM }

[Figure: event x passing from the HW-CFSM through an IRQ and the RTOS to the SW-CFSM]


- allocate I/O port bits for value and occurrence flag
- use existing ports or memory-mapped ports
- write value to I/O port
- create a pulse on occurrence flag


- every peripheral must have a library with
- init function (to be called at initialization time)
- deliver function for each input (to be called by emit)
- detect function for each output (to be called by poll-taker)
- interrupt service routine (containing emit)


- consider a SW-CFSM ready to run whenever it has unconsumed input events
- choose the next ready SW-CFSM to run:
- scheduling problem


- dashboard
- 6 tasks, 13 events
- 0.1s (8.6s to estimate run times)

- shock absorber controller
- 48 tasks, 11 events
- 0.3s (880s to estimate run times)

- PATHO RTOS
- orders of magnitude faster than timed automata
- scales linearly


- Propagation of constraints from external I/O behavior to each CFSM
- probabilistic: Markov chains
- exact: FSM state traversal

- Satisfaction of constraints within a single transition
(e.g., software-driven bus interface protocol)

- Automatic choice of scheduling algorithm, based on performance estimation and constraints
- Scheduling for verifiability


- Overview
- shop scheduling
- data-flow scheduling
- real-time scheduling
- OS scheduling

- Real-time scheduling
- RTOS generation
- scheduling
- Communication

- Data-flow scheduling
- pure
- Petri nets


- Functionality usually represented with a data-flow graph
- Kahn’s conditions allow scheduling freedom
- if a computation is specified with actors (operators) and data dependency, and
- every actor waits for data on all inputs before firing, and
- no data is lost
- then the firing order doesn’t matter


- Schedule: a firing order that respects data-flow constraints and returns the graph to initial state

[Figure: data-flow graph with actors labeled name, execution time: A, 1; B, 2; C, 3; D, 1]


Static scheduling (cyclic executive, round robin)

- A, B, C, D are processes
- RTOS schedules them repeatedly in order A D B C
- simple, but context-switching overhead large

[Figure: the data-flow graph (A, 1; B, 2; C, 3; D, 1) and a schedule: A D B C]


Code synthesis (OS generation)

- A, B, C, D are subroutines
- generate: forever{ call A; call D; call B; call C; }
- less robust, better overhead


In-lined code synthesis

- A, B, C, D are code fragments
- generate: forever{A; D; B; C; }
- even less robust, even better overhead


Resources

- fixed or arbitrary number of processors
Goal:

- max. throughput given a fixed number of processors
- min. processors to achieve required throughput


Max. throughput given a fixed number of processors

- it is NP-hard to determine max. achievable throughput
Min. processors to achieve required throughput

- if there are loops, then there is a fundamental upper bound on throughput
- easy to compute


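Upper bound on throughput: $1 / \max_{\text{loops}} (\text{Time}/\text{Delay})$

[Figure: data-flow graph with a loop through A, 1; B, 2; C, 3; D, 1]

The (N+2)nd output of A can be computed no earlier than 7 time units after the Nth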
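The bound can be computed by enumerating the loops of the graph. The sketch below is illustrative only: the edge list is an assumption (the slide's arcs are not recoverable from the text), chosen as a single loop A -> B -> C -> D -> A with two delays on the arc back to A, which matches the execution times summing to 7 and the two-iteration distance quoted above. The function name is hypothetical.

```python
def iteration_period_bound(exec_time, edges):
    """Return max over loops of (total execution time / total delays).

    exec_time: {actor: time}; edges: (src, dst, delays) triples.
    Throughput is bounded by the reciprocal of the returned value.
    """
    succ = {}
    for src, dst, delays in edges:
        succ.setdefault(src, []).append((dst, delays))
    best = 0.0

    def walk(start, node, time, delays, seen):
        nonlocal best
        for nxt, d in succ.get(node, []):
            if nxt == start:                      # closed a loop
                if delays + d == 0:
                    raise ValueError("loop without delays cannot be scheduled")
                best = max(best, time / (delays + d))
            elif nxt not in seen:
                walk(start, nxt, time + exec_time[nxt], delays + d, seen | {nxt})

    for actor in exec_time:                       # loops are found more than once;
        walk(actor, actor, exec_time[actor], 0, {actor})  # harmless for a max
    return best


exec_time = {"A": 1, "B": 2, "C": 3, "D": 1}
# Assumed arcs: one loop with two delays on the arc D -> A.
edges = [("A", "B", 0), ("B", "C", 0), ("C", "D", 0), ("D", "A", 2)]
print(iteration_period_bound(exec_time, edges))   # 3.5
```

A result of 3.5 time units per iteration corresponds to a throughput of at most 2 iterations every 7 time units. Enumerating simple loops is exponential in the worst case, so this brute-force form is only suitable for small graphs.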


Non-overlapped scheduling

- Look at one iteration
- Use list scheduling algorithm (developed for shop scheduling)
Overlapped scheduling

- less developed


- Remove delayed edges
- List scheduling:
- maintain list of tasks that could be scheduled
- schedule one with longest path


- Assume two processors

[Figure: the data-flow graph (A, 1; B, 2; C, 3; D, 1) with the value 3 annotated at two nodes]


[Figure: list schedule built step by step on two processors. Step 1: P1 = C, P2 = A. Step 2: P1 = C, P2 = A D. Step 3: P1 = C, P2 = A D B]
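A compact version of list scheduling with longest-path priority might look like the sketch below. The precedence arcs are an assumption (A -> D -> B with C independent, after delayed edges are removed), since the slide's arcs are not recoverable from the text, and the function name is hypothetical. Ties between tasks are broken by dictionary order and ties between processors by lowest index.

```python
def list_schedule(exec_time, edges, n_procs):
    """Greedy list scheduling of an acyclic task graph.

    exec_time: {task: time}; edges: (pred, succ) pairs.
    Returns {task: (processor, start_time)}.
    """
    succs = {t: [] for t in exec_time}
    preds = {t: [] for t in exec_time}
    for p, s in edges:
        succs[p].append(s)
        preds[s].append(p)

    level = {}

    def longest_path(t):
        """Length of the longest path from t to the end of the graph."""
        if t not in level:
            level[t] = exec_time[t] + max(
                (longest_path(s) for s in succs[t]), default=0)
        return level[t]

    finish, schedule = {}, {}
    free_at = [0] * n_procs
    while len(schedule) < len(exec_time):
        # Ready list: unscheduled tasks whose predecessors are all scheduled.
        ready = [t for t in exec_time
                 if t not in schedule and all(p in finish for p in preds[t])]
        task = max(ready, key=longest_path)       # longest path first
        data_ready = max((finish[p] for p in preds[task]), default=0)
        proc = min(range(n_procs), key=lambda p: max(free_at[p], data_ready))
        start = max(free_at[proc], data_ready)
        schedule[task] = (proc, start)
        finish[task] = free_at[proc] = start + exec_time[task]
    return schedule


exec_time = {"A": 1, "B": 2, "C": 3, "D": 1}
edges = [("A", "D"), ("D", "B")]                  # assumed precedence arcs
print(list_schedule(exec_time, edges, n_procs=2))
# {'A': (0, 0), 'C': (1, 0), 'D': (0, 1), 'B': (0, 2)}
```

With these assumed arcs one processor runs A, D, B and the other runs C, the same grouping as in the figure (processor numbering aside), for a schedule length of 4.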


- Unfold k iterations (e.g. k=2)
- Do list scheduling

[Figure: the graph unfolded twice: A1, 1; B1, 2; C1, 3; D1, 1; A2, 1; B2, 2; C2, 3; D2, 1]


- Rate optimal (not true in general)

[Figure: schedule of the unfolded graph on two processors. P1: C1 A2 C2; P2: A1 D1 B1 D2 B2]


- Loop scheduling
- Code size
- Buffer size

[Figure: chain A -> B -> C; each arc is labeled 20 and 10]


ABCBCCC

A (2 B (2 C))

A (2 B) (4 C)

A (2 B C) (2 C)


A (2 B (2 C))

A;
for i = 1 … 2 {
    B;
    for j = 1 … 2 {
        C;
    }
}

- single appearance schedules minimize in-lined code size


Schedule           Buffer A->B   Buffer B->C
ABCBCCC            20            30
A (2 B (2 C))      20            20
A (2 B) (4 C)      20            40
A (2 B C) (2 C)    20            30

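The buffer sizes listed for the four schedules can be checked with a short simulation. The sketch assumes, as the figure labels and firing counts suggest, that on each arc the producer writes 20 tokens per firing and the consumer reads 10; the nested-list encoding of looped schedules and the function names are hypothetical.

```python
def expand(schedule):
    """Flatten a looped schedule into a firing sequence.

    A schedule is a list of actor names and (count, sub_schedule) pairs,
    e.g. ["A", (2, ["B", (2, ["C"])])] for A (2 B (2 C)).
    """
    flat = []
    for item in schedule:
        if isinstance(item, tuple):
            count, body = item
            flat.extend(expand(body) * count)
        else:
            flat.append(item)
    return flat


# Assumed rates: (producer, consumer, tokens produced, tokens consumed).
EDGES = [("A", "B", 20, 10), ("B", "C", 20, 10)]


def max_buffers(schedule):
    """Peak token count on each arc while running one schedule period."""
    tokens = [0] * len(EDGES)
    peak = [0] * len(EDGES)
    for actor in expand(schedule):
        for k, (_, dst, _, cons) in enumerate(EDGES):
            if actor == dst:
                if tokens[k] < cons:
                    raise ValueError(f"{actor} fired without enough input")
                tokens[k] -= cons
        for k, (src, _, prod, _) in enumerate(EDGES):
            if actor == src:
                tokens[k] += prod
                peak[k] = max(peak[k], tokens[k])
    if any(tokens):
        raise ValueError("schedule does not return the graph to its initial state")
    return peak


print(max_buffers(list("ABCBCCC")))                        # [20, 30]
print(max_buffers(["A", (2, ["B", (2, ["C"])])]))          # [20, 20]
print(max_buffers(["A", (2, ["B"]), (4, ["C"])]))          # [20, 40]
print(max_buffers(["A", (2, ["B", "C"]), (2, ["C"])]))     # [20, 30]
```

The single-appearance schedule A (2 B (2 C)) gives the smallest buffers here, while A (2 B) (4 C), also single-appearance, gives the largest, so minimizing code size does not by itself fix the buffer requirement.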


Perfect design-time information

- Fixed amount of repeating work
- data-independent

- Input streams from the environment always available
- Simple global constraints
data dependency => Petri nets

timing constraints => real-time scheduling


- Problem: computation result may depend on dynamic schedule
- Synchronous languages: no scheduler needed
(but inapplicable to HW/SW heterogeneous systems)

- Data Flow networks: deterministic computation
(but blocking read is unsuitable for reactive systems)

- Can we obtain determinism without losing efficiency?