
Sandbox Learning: Try without error?

Prof. Dr.-Ing. C. Müller-Schloer

Universität Hannover

Institut für Systems Engineering –

System- und Rechnerarchitektur

Appelstraße 4

30159 Hannover

[email protected]

+49 (0)511 762 19730

Based on joint work with Hartmut Schmeck, University of Karlsruhe,

and Theo Ungerer, University of Augsburg

Team: Jörg Hähner, Holger Prothmann, Fabian Rochner, Sven Tomforde


Outline

  • Online learning and errors

  • A first solution

    • Organic Traffic Control

    • Organic Network Control

  • Open questions


Learning

  • Observation of the world, update of a world model

  • Acting in the world: Try & error

    • Reinforcement learning: Reward/penalty assigned to an action → influences future decisions

    • But: Immediate real-world effects

Nature

  • Collective level (genotype)

    • 4 bn. years

    • Huge populations

    • Redundancy (neglect of the individual)

  • Individual level (phenotype)

    • Modification of behavior or preferences based on experience (try) …

    • … as long as the individual survives.


Learning in technical systems

Requirements

  • Immediate reaction (even if sub-optimal)

  • Guaranteed prevention of illegal actions (4-way green)

  • Adaptation and long-term improvement

  • How long is long-term? Learning speed!!

Example

  • Learning traffic light controller

    • Genetic algorithm with selection based on real-world evaluation

    • # tries until reasonable solution: 1000

    • Assessment time constant (traffic): 15 minutes

    • → 999 unsuitable tries

    • → roughly 10 days (1000 tries × 15 minutes each)
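The estimate above is simple arithmetic. A minimal sketch that reproduces it, using the two figures from the slide (the variable names are illustrative only):

```python
# Back-of-the-envelope estimate of the on-line learning time
# for the learning traffic light controller example.
tries_until_reasonable = 1000   # tries until a reasonable solution (from the slide)
assessment_minutes = 15         # time needed to assess one try in real traffic

# Every try before the first reasonable one is exposed to real traffic.
unsuitable_tries = tries_until_reasonable - 1

total_minutes = tries_until_reasonable * assessment_minutes
total_days = total_minutes / (60 * 24)

print(unsuitable_tries)         # 999
print(round(total_days, 1))     # 10.4
```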


Generic 3-level architecture

User

  • Definition of system objectives (LoS, …)

Layer 2 (Simulator, EA)

  • Sandbox: Off-line parameter optimization

  • Evolutionary Algorithm (EA)

  • Simulation-based evaluation

  • Only legal parameter sets sent to level 1

Layer 1 (Observer, LCS)

  • Immediate reaction

  • Observer: Situation classification

  • Selection from legal parameter sets

    • might be suboptimal → level 2

SuOC: System under Observation/Control (productive system)

  • Real world

  • Sensors (detector data)

  • Actuators (actuator settings)
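The layer 2 idea (try in simulation, pass on only legal results) might be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: `is_legal`, `simulate` and `sandbox_optimise` are hypothetical names, the bounds of 5 to 60 seconds are invented, and the cost function is a stand-in for a real traffic simulator.

```python
import random

def is_legal(params):
    """Hypothetical legality check: every green time stays within fixed bounds."""
    return all(5 <= g <= 60 for g in params)

def simulate(params, demand):
    """Stand-in for the simulator; lower cost is better.
    Toy model: mismatch between green time shares and demand shares."""
    total_g, total_d = sum(params), sum(demand)
    return sum(abs(g / total_g - d / total_d) for g, d in zip(params, demand))

def sandbox_optimise(demand, generations=50, pop_size=20, seed=0):
    """Layer 2 sketch: evolve parameter sets off-line, return only a legal set."""
    rng = random.Random(seed)
    n = len(demand)
    population = [[rng.randint(5, 60) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half, as judged by the simulation.
        population.sort(key=lambda p: simulate(p, demand))
        parents = population[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            child = [g + rng.randint(-5, 5) for g in rng.choice(parents)]
            if is_legal(child):   # illegal candidates never leave the sandbox
                children.append(child)
        population = parents + children
    return min(population, key=lambda p: simulate(p, demand))

demand = [300, 100, 300, 100]     # assumed vehicles/hour per approach
best = sandbox_optimise(demand)
print(best, round(simulate(best, demand), 3))
```

All unsuitable tries happen inside `simulate`, so the productive system only ever sees parameter sets that passed `is_legal`.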


Example 1: Organic Traffic Control

Goals

  • Network of adaptive learning traffic light controllers (TLCs).

  • TLCs learn with some limited sensory horizon.

  • TLCs cooperate to achieve a global goal (e.g. reduced avg. travel time).

  • Explore possibilities/limitations of decentralized control systems.

Phase 1

  • Single, isolated junction

Phase 2

  • Collaborating TLCs

  • Progressive signals (green wave, German: Grüne Welle)


Traffic Control Architecture

User

  • Definition of system objectives (LoS, …)

Layer 2 (Simulator, EA)

  • Off-line parameter optimisation

  • Evolutionary Algorithm (EA) evolves TLC parameters

  • Simulation-based evaluation (AIMSUN)

Layer 1 (Observer, LCS)

  • On-line parameter selection

  • Observer monitors traffic

  • Learning Classifier System (LCS) selects TLC parameters and learns rule quality

SuOC: System under Observation/Control, here the Traffic Light Controller (TLC)

  • Control of traffic signals

  • Industry-standard TLC

    • Fixed-time

    • Traffic-responsive

    • Parameters determine performance

  • Interface to layer 1: detector data, signal settings
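How layer 1 might select TLC parameters and learn rule quality can be sketched roughly as below. This is a strongly simplified, stateless rule learner and not a full LCS; the `Rule` class, the flow ranges, the parameter set names and the reward values are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass

@dataclass
class Rule:
    condition: tuple      # (low, high) range of observed flow, vehicles/hour (assumed)
    action: str           # name of a TLC parameter set, e.g. supplied by layer 2
    quality: float = 0.5  # learned estimate of how well the action works

    def matches(self, flow):
        low, high = self.condition
        return low <= flow < high

def select_rule(rules, flow, rng, explore=0.1):
    """Mostly pick the best matching rule, occasionally a random matching one."""
    matching = [r for r in rules if r.matches(flow)]
    if not matching:
        return None       # assumption: a gap like this could be reported to layer 2
    if rng.random() < explore:
        return rng.choice(matching)
    return max(matching, key=lambda r: r.quality)

def update_quality(rule, reward, learning_rate=0.2):
    """Move the rule's quality a step towards the observed reward."""
    rule.quality += learning_rate * (reward - rule.quality)

rules = [
    Rule((0, 500), "short_cycle"),
    Rule((0, 500), "long_cycle"),
    Rule((500, 2000), "long_cycle"),
]
rng = random.Random(1)
for _ in range(200):
    rule = select_rule(rules, flow=300, rng=rng)
    reward = 1.0 if rule.action == "short_cycle" else 0.2   # toy reward signal
    update_quality(rule, reward)

print(max(rules[:2], key=lambda r: r.quality).action)       # short_cycle
```

Only rules already present in the rule set can be chosen, so the exploration step never produces a parameter set that layer 2 has not released.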


Hamburg


OTC: Performance

[Figure: OTC performance during three consecutive days, compared with a manually designed reference]


Example 2: Organic Network Control

  • Organic Control of Data Communication Networks

  • Control and management of network protocol clients in data communication networks

  • Autonomous control system for each network entity

  • Collaboration between neighbouring network entities


ONC: Motivation

  • Network protocol configuration is static

    • Goal: dynamic adaptation of network protocol parameter settings to a changing environment

  • Client acts within large computer networks

    • Current network status influences the performance of the network protocol.

  • Computer is used for different tasks simultaneously

    • Current usage of system resources influences the performance of the network protocol.


ONC: BitTorrent

  • Current focus: BitTorrent (1)

    • Tracker is responsible for bringing peers together

    • Fairness-based distribution

    • Files are split into smaller parts ("chunks")

  • Variable parameters (most important ones):

    • Delays

    • Intervals (choking, …)

    • Number of peers (minimum, maximum, initially from tracker, etc.)

    • Number of open connections

    • Chunk size

(1) Bram Cohen, "Incentives Build Robustness in BitTorrent", Proc. 1st Workshop on Economics of Peer-to-Peer Systems, Berkeley, 2003.
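The variable parameters listed above could be grouped into one parameter set that layer 2 optimises and layer 1 selects. A minimal sketch, assuming hypothetical field names, units and legality rules (none of them are taken from a real BitTorrent client):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BitTorrentParams:
    """Hypothetical container for the tunable parameters named on the slide."""
    choking_interval_s: float
    min_peers: int
    max_peers: int
    initial_peers_from_tracker: int
    max_open_connections: int
    chunk_size_kib: int

    def is_legal(self):
        """Assumed sanity rules; a real client would define its own limits."""
        return (
            self.choking_interval_s > 0
            and 0 < self.min_peers <= self.max_peers
            and 0 < self.initial_peers_from_tracker <= self.max_peers
            and self.max_open_connections > 0
            and self.chunk_size_kib > 0
        )

candidate = BitTorrentParams(
    choking_interval_s=10.0,
    min_peers=20,
    max_peers=55,
    initial_peers_from_tracker=50,
    max_open_connections=40,
    chunk_size_kib=256,
)
print(candidate.is_legal())   # True
```

Making the class frozen means a parameter set cannot be altered after it has been checked.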


ONC architecture

User interface

  • User defines system objectives (download-rate, etc.)

    • E.g. download-rate for BitTorrent or coverage-rate for MANETs

Level 2 (Simulator, EA)

  • Extend behavioral repertoire of level 1

  • Off-line learning (protocol parameter sets)

Level 1 (Observer, LCS)

  • Adapt SuOC-parameters (rules)

  • On-line learning (rule fitness)

SuOC: System under Observation and Control

  • Network protocol client

  • E.g. BitTorrent client

  • Interface to level 1: network data, protocol configuration


Evaluation: Off-line (1)

  • Off-line optimisation: influence of number of peers


ONC Evaluation: On-line (2)

  • Adaptation to background client usage profile


Open questions, future work (1/2)

Incongruent model

  • Model adjustment

Abstraction of non-local environment

  • Influence of neighboring nodes?

Verification

  • Optimized parameter sets could be verified before being deployed to layer 1

State-less behavior → Multi-step LCS

  • LCSs are stateless (stimulus – response)

  • Learning of action sequences?



Open questions, future work (2/2)

  • So far: Simulation of local neighborhood with assumptions about the behavior of other nodes.

Communication between nodes

  • Level 1: Increase learning performance by exchange of learnt rule sets: Rule generalization?

  • Level 2: Exchange of populations → distributed EA

Parallel “sandbox” world on layer 2

  • Network-wide distributed simulation: Synchronization? Convergence?

  • Influence on real world?

  • Analogy from human society: social discourse
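One common way to exchange populations between nodes is ring migration, as used in island-model EAs. The slides do not say which scheme is intended, so the following is only an illustrative sketch with a hypothetical `migrate` function:

```python
def migrate(populations, fitness, n_migrants=1):
    """Ring migration: each node sends copies of its best individuals to the
    next node, which drops its worst individuals to make room.
    Higher fitness is better."""
    best = [sorted(pop, key=fitness, reverse=True)[:n_migrants] for pop in populations]
    result = []
    for i, pop in enumerate(populations):
        incoming = best[(i - 1) % len(populations)]   # from the previous node
        survivors = sorted(pop, key=fitness, reverse=True)[: len(pop) - n_migrants]
        result.append(survivors + list(incoming))
    return result

# Toy example: individuals are plain numbers and fitness is the value itself.
islands = [[1, 4, 2], [9, 3, 5], [7, 8, 6]]
print(migrate(islands, fitness=lambda x: x))
# [[4, 2, 8], [9, 5, 4], [8, 7, 9]]
```

In a network-wide setting each list would live on a different node, which is where the synchronization and convergence questions above come in.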



Thank you for your attention!

