
BTeV-RTES Project

Very Lightweight Agents: VLAs

Daniel Mossé, Jae Oh, Madhura Tamhankar, John Gross

Computer Science Department

University of Pittsburgh


Shameless plug

LARTES

IEEE Workshop on Large Scale Real-Time and Embedded Systems

In conjunction with

IEEE Real-Time Systems Symposium (RTSS 2002 is on Dec 3-5, 2002)

December 2, 2002

Austin, TX, USA

http://www.rtss.org/LARTES.html


BTeV Test Station

Collider detectors are about the size of a small apartment building. Fermilab's two detectors, CDF and DZero, are about four stories high, weighing some 5,000 tons (10 million pounds) each. Particle collisions occur in the middle of the detectors, which are crammed with electronic instrumentation.

Each detector has about 800,000 individual pathways for recording electronic data generated by the particle collisions. Signals are carried over nearly a thousand miles of wire and cable.

Information from Fermi National Accelerator Laboratory


L1/L2/L3 Trigger Overview

Information from Fermi National Accelerator Laboratory


System Characteristics

Software Perspective

  • Reconfigurable node allocation

  • L1 runs one physics application, severely time constrained

  • L2/L3 runs several physics applications with loose time constraints

  • Multiple operating systems and differing processors (see the portability sketch after this list)

    • TI DSP BIOS, Linux, Windows?

  • Communication among system sections via fast network

  • Fault tolerance is essentially absent in embedded and RT systems
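Given the mix of operating systems and processors listed above, the VLA needs a thin portability layer. Below is a minimal sketch of what that layer could look like, assuming one build per target; only the Linux/POSIX branch is filled in, the DSP BIOS branch is merely indicated, and all macro and function names are illustrative assumptions rather than the project's actual code.

/* vla_port.c: sketch of a per-target portability layer for the VLA. */
#include <stdio.h>

#ifdef VLA_TARGET_DSPBIOS
/* On the DSP boards this would wrap DSP BIOS task/clock services instead
 * of POSIX calls; omitted here. */
#else
#include <time.h>

/* Sleep between monitoring passes without busy-waiting the CPU. */
static void vla_sleep_ms(unsigned ms)
{
    struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Monotonic timestamp in milliseconds, for rate and deadline checks. */
static long vla_now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}
#endif

int main(void)
{
    long t0 = vla_now_ms();
    vla_sleep_ms(50);                        /* one idle gap between checks */
    printf("slept for about %ld ms\n", vla_now_ms() - t0);
    return 0;
}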


L1/L2/L3 Trigger Hierarchy

Diagram: the trigger hierarchy, connected by Gigabit Ethernet, with a VLA running at every level.

  • Global Manager: TimeSys RT Linux, Global Manager VLA
  • Regional L1 Manager (1): TimeSys RT Linux, Regional Manager VLA
    • Crate Managers (20): TimeSys RT Linux, Crate Manager VLA
    • Farmlet Managers (16): TimeSys RT Linux, Farmlet Manager VLA
    • DSPs (8): TI DSP BIOS, Low-Level VLA
  • Regional L2/L3 Manager (1): TimeSys RT Linux, Regional Manager VLA
    • Section Managers (8): RH 8.x Linux, Section Manager VLA
    • Linux Nodes (320): RH 8.x Linux, Low-Level VLA
  • Data Archive and External Level


Very Lightweight Agents (VLAs)

Proposed Solution: Very Lightweight Agent

  • Minimize footprint

  • Platform independence

  • Monitor hardware

  • Monitor software

  • Comprehensible source code

  • Communication with high-level software entity

  • Error prediction

  • Error logging and messaging

  • Schedule and priorities of test events
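A minimal sketch of the agent loop these goals imply, assuming a small table of monitoring tests run in priority order each cycle; the test names, the stubs, and the local-log reporting path are illustrative assumptions, not the project's code.

/* vla_main.c: sketch of one VLA monitoring cycle. */
#include <stdio.h>

typedef struct {
    const char *name;
    int       (*run)(void);        /* returns nonzero if a problem is found */
} vla_test;

/* Stub checks; real versions are sketched with the error-prediction slide. */
static int check_cpu(void)     { return 0; }
static int check_buffers(void) { return 0; }

/* In the real system this would travel through the network API to the
 * manager node; here it only goes to the local log. */
static void vla_report(const char *test)
{
    fprintf(stderr, "VLA: problem flagged by test '%s'\n", test);
}

int main(void)
{
    /* Tests listed in priority order; between cycles the agent would sleep
     * so the physics application keeps almost all of the CPU. */
    const vla_test tests[] = {
        { "cpu",     check_cpu     },
        { "buffers", check_buffers },
    };
    const int ntests = (int)(sizeof tests / sizeof tests[0]);

    for (int i = 0; i < ntests; i++)
        if (tests[i].run())
            vla_report(tests[i].name);

    return 0;
}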


VLAs on L1 and L2/3 nodes

Diagram: placement of the VLA on the two kinds of farm nodes.

  • Level 2/3 Farm Nodes: hardware running an OS kernel (Linux); the physics applications and a VLA sit on the kernel and reach the L2/L3 manager nodes through a network API.

  • Level 1 Farm Nodes: hardware running an OS kernel (DSP BIOS); the physics application and a VLA sit on the kernel and reach the L1 manager nodes through a network API.


VLA Error Reporting

Diagram: a VLA on each DSP reports upward to the Level 1/2/3 manager nodes. Each manager node runs a Linux kernel on its hardware; ARMOR, a VLA, and the manager application sit on the kernel and pass error reports through the network API out to the network.
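A sketch of the reporting path in the diagram, assuming plain TCP sockets on a Linux node; the manager address, port, and message text are illustrative assumptions, and the real system would route the report through its own communication API rather than raw sockets.

/* vla_report.c: sketch of pushing one error record toward ARMOR. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int vla_send_report(const char *manager_ip, int port, const char *msg)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port   = htons((unsigned short)port);
    inet_pton(AF_INET, manager_ip, &addr.sin_addr);

    /* Connect to the ARMOR/manager endpoint and hand over the record. */
    int rc = connect(fd, (struct sockaddr *)&addr, sizeof addr);
    if (rc == 0)
        rc = (write(fd, msg, strlen(msg)) < 0) ? -1 : 0;

    close(fd);
    return rc;
}

int main(void)
{
    /* Hypothetical manager address and message. */
    return vla_send_report("192.168.1.10", 5000,
                           "VLA: buffer overflow predicted on node 42\n");
}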


VLA Error Prediction

Buffer overflow:

1. VLA message or application data input buffers may overflow

2. Messages or data are lost in either case

3. Detection through monitoring fill rate and overflow condition

4. High fill rate indicative of

* high error rate, producing messages

* undersized data buffers

Throttled CPU:

1. Throttled from high temperature

2. Throttled by an erroneous power-saving feature

3. Causes missed deadlines due to low CPU speed

4. Potentially critical failure if L1 data is not processed fast enough

Note that the CPU may also be throttled on purpose
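A sketch of the two predictors above for a Linux node: a fill-rate threshold on a buffer, and a throttling check that compares the current CPU frequency from sysfs against a nominal value. The sysfs path is standard, but the threshold and nominal frequency are illustrative assumptions.

/* vla_predict.c: sketch of buffer-overflow and CPU-throttle prediction. */
#include <stdio.h>

#define FILL_WARN   0.80        /* warn when a buffer is 80% full          */
#define NOMINAL_KHZ 2400000L    /* assumed nominal clock of the node       */

/* A persistently high fill level means either a high error rate (too many
 * messages) or undersized data buffers, so warn before the overflow. */
static int buffer_at_risk(unsigned used, unsigned capacity)
{
    return capacity != 0 && (double)used / capacity >= FILL_WARN;
}

/* A throttled CPU can cause missed L1 deadlines; throttling may also be
 * intentional (power saving), so this is a warning, not always a fault. */
static int cpu_throttled(void)
{
    FILE *f = fopen("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq", "r");
    long khz = NOMINAL_KHZ;
    if (!f)
        return 0;                /* no cpufreq support: assume full speed  */
    if (fscanf(f, "%ld", &khz) != 1)
        khz = NOMINAL_KHZ;
    fclose(f);
    return khz < NOMINAL_KHZ;
}

int main(void)
{
    if (buffer_at_risk(850, 1000))
        puts("VLA: input buffer close to overflow");
    if (cpu_throttled())
        puts("VLA: CPU running below nominal frequency");
    return 0;
}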


VLA Error Logging

Diagram: hardware and software failures are picked up by the VLA, queued in a message buffer ("15" entries in the figure), and handed through the communication API to ARMOR, which forwards them over TCP/IP (Ethernet) to the data archive.

The VLA packages info:

1. Message time

2. Operational data

3. Environmental data

4. Sensor values

5. App & OS error codes

6. Beam crossing ID

ARMOR:

1. Reads messages

2. Stores/uses them for error prediction

3. Appends appropriate info

4. Sends to the archive
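A sketch of the record a VLA might package before handing it to the communication API, with one field per item in the list above; the field names, types, and sizes are illustrative assumptions.

/* vla_record.c: sketch of the packaged error-log record. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

typedef struct {
    time_t   msg_time;          /* 1. message time                         */
    uint32_t op_state;          /* 2. operational data (encoded)           */
    float    board_temp_c;      /* 3. environmental data                   */
    float    sensor[4];         /* 4. sensor values                        */
    int32_t  app_error;         /* 5. application error code               */
    int32_t  os_error;          /*    OS error code                        */
    uint64_t beam_crossing_id;  /* 6. beam crossing ID                     */
} vla_log_record;

int main(void)
{
    /* Fill a record; in the real flow it would be queued in the message
     * buffer, read by ARMOR, augmented, and forwarded to the data archive. */
    vla_log_record rec = {0};
    rec.msg_time = time(NULL);
    rec.beam_crossing_id = 123456789ULL;
    printf("record size: %zu bytes\n", sizeof rec);
    return 0;
}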


VLA Scheduling Issues

L1 trigger application has highest priority

VLA must run often enough to remain effective

VLA must internally prioritize error tests

VLA must preempt the L1 trigger app on critical errors

Task priorities must be alterable during run-time
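A sketch of run-time priority control on the Linux/TimeSys nodes using the standard POSIX real-time scheduling call; the policy and priority values are illustrative assumptions, and on the DSP boards the equivalent change would go through DSP BIOS task services instead.

/* vla_priority.c: sketch of raising or lowering a real-time priority. */
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/* A VLA could raise its own priority when it detects a critical error that
 * must preempt the L1 trigger application, then drop back down afterwards.
 * Passing pid 0 applies the change to the calling process. */
static int set_rt_priority(pid_t pid, int prio)
{
    struct sched_param sp = { .sched_priority = prio };
    return sched_setscheduler(pid, SCHED_FIFO, &sp);
}

int main(void)
{
    /* Requires the appropriate privileges (e.g. root or CAP_SYS_NICE). */
    if (set_rt_priority(0, 10) != 0)
        perror("sched_setscheduler");
    return 0;
}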


VLA Scheduling Issues

Diagrams: three scheduling scenarios on a node, each showing the kernel, the physics application, and one or more VLAs.

  • Normal Scheduling: the kernel schedules the physics application together with the VLAs.

  • Adaptive Resource Scheduling: when the physics application unexpectedly ends, more VLAs can be scheduled in its place.

  • Alternative Scheduling Concept: the VLA has the ability to control its own priority and that of other applications, based on internal decision making.


VLA Scheduling Issues

Diagram: an external message source (an FPGA) acts as a VLA inhibitor toward the kernel; while the inhibit signal is present, no VLA is scheduled and the physics application keeps the node.


VLA Status

Current Status

VLA skeleton and timing implemented in Syracuse (poster)

Hardware platform from Vandy

Software (muon application) from Fermi and UIUC

Linux drivers to use GME and Vandy devkit

Near term

Muon application to run on the DSP board

Muon application timing

Instantiate VLAs with Vandy hardware and Muon application



VLA and Network Usage

Network usage influences the amount of data dropped by triggers and other filters

Network usage is typically not considered in load-balancing algorithms (they assume the network is fast enough)

VLAs monitor and report network usage

Agents use this information to re-distribute loads

Network architecture to control flows on a per-process basis (http://www.netnice.org)
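A sketch of how a VLA on a Linux node could sample network usage for reporting: read the per-interface byte counters in /proc/net/dev twice and derive a rate. The interface name and one-second sampling window are illustrative assumptions.

/* vla_net.c: sketch of sampling the receive rate of one interface. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Return the received-bytes counter for one interface, or 0 if not found. */
static unsigned long long rx_bytes(const char *ifname)
{
    FILE *f = fopen("/proc/net/dev", "r");
    char line[512];
    unsigned long long rx = 0;

    if (!f)
        return 0;
    while (fgets(line, sizeof line, f)) {
        char *colon = strchr(line, ':');
        char *name  = strstr(line, ifname);
        if (colon && name && name < colon)      /* e.g. "  eth0: 12345 ..." */
            sscanf(colon + 1, "%llu", &rx);
    }
    fclose(f);
    return rx;
}

int main(void)
{
    unsigned long long before = rx_bytes("eth0");
    sleep(1);                                   /* 1 s sampling window      */
    unsigned long long after = rx_bytes("eth0");
    printf("eth0 receive rate: %llu bytes/s\n", after - before);
    return 0;
}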


