Yingping huang and gregory madey
Download
1 / 26

Yingping Huang and Gregory Madey - PowerPoint PPT Presentation


  • 92 Views
  • Uploaded on

A. utonomic. eb-based. W. S. imulation. Yingping Huang and Gregory Madey. University of Notre Dame. Presented by Tariq M. King. Published by the IEEE Computer Society in the 38 th Annual Simulation Symposium (April 2005). A utonomic W eb-based S imulation. Web-based Simulation +

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Yingping Huang and Gregory Madey' - tamas


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Yingping huang and gregory madey

A

utonomic

eb-based

W

S

imulation

Yingping Huang and Gregory Madey

University of Notre Dame

Presented by Tariq M. King

Published by the IEEE Computer Society in the 38th Annual Simulation Symposium (April 2005)


A utonomic w eb based s imulation
Autonomic Web-based Simulation

  • Web-based Simulation +

  • Autonomic Computing

    Motivations:

  • Scientific simulations are large programs that will probably contain errors when deployed to web

  • Increased complexity in large-scale web-based simulations due to integration of different pieces of services

    Goal:

  • Self-manageable Web-based Simulations

A

W

S


Human Nervous System = CNS + PNS

  • Brain controls higher order conscious activities (thought, reasoning, abstraction)

  • Brain also controls lower level involuntary activities called autonomic functions

  • ANS monitors and regulates in such a way that there is no conscious human involvement

  • ANS was the basis for IBM’s Autonomic Computing initiative for system self-management


Autonomic Computing Vision

Adapt to dynamically changing environments

Discover, diagnose & react to disruptions

I

B

Monitor and tune resources automatically

Anticipate, detect, identify and protect against attacks

M

4


Aws requirements
AWS Requirements

  • Simulation checkpointing and restarting

  • Simulation self-awareness and proactive failure detection

  • Self-manageable computing infrastructure to host simulations

A

W

S


Checkpointing self healing optimizing
Checkpointing(Self-healing/Optimizing)

  • Checkpointing is often used in simulations, databases, systems and operations research

  • Determining optimal checkpoint is not trivial

    • Excessive => performance degradation

    • Deficient => expensive redo

    • Both yields a longer execution time

  • An optimization problem is formed

R

Q

1




Proactive failure detection
Proactive Failure Detection

  • Major cause of simulation crashes is low memory

  • API’s in J2SE 5.0 can be used for:

    • External monitoring using external monitoring software

    • Internal monitoring by adding logic inside the simulation

  • E.g. MXBeans Low Mem Notification =>

    checkpoint and terminate gracefully

R

Q

2


Autonomic infrastructure for aws
Autonomic Infrastructure for AWS

Autonomic Agent

on each server

Autonomic Manager

on DB server

Autonomic IP forwarding switch

with Standby

DB

R

Firewall/Router

Q

with

Standby

DW

3

10


Self configuring under aws
Self-Configuring under AWS

  • Autonomic discovery of new servers

  • Autonomic resize of server pool

  • Autonomic configuration of firewall/router, application servers and simulation servers

  • Autonomic configuration of the database server and the data warehouse

A

W

S


Self healing under aws
Self-Healing under AWS

  • Some degree of redundancy is required to achieve self-healing in AWS

  • Hot standby data warehouse and hot standby database

  • Database and data warehouse are designed on two physical hosts

  • Server pool ensures that when an application server is down, other servers can pick up its tasks

A

W

S


Self healing under aws contd
Self-Healing under AWS (contd)

  • Application Servers

    • autonomic agent monitors execution status

    • untimely response => failed app server

    • New server started and IP forwarding is changed by the autonomic agent on the firewall

  • Simulation Servers

    • Autonomic agents upload operating system metrics (load avg, free memory)

    • This also serves as the “heart-beat”, if the autonomic manager doesn’t receive the heart-beat => failed simulation server

A

W

S


Self healing under aws contd1
Self-Healing under AWS (contd)

  • Database Servers

    • The autonomic manager resides on the DBS.

    • Vital to keep server running 24/7

    • Whenever primary database is down, database connections can be failed over to the standby database.

  • Simulations

    • Checkpointing

    • Dispatcher redistributes crashed simulations to appropriate simulation servers.

A

W

S


Self optimizing under aws
Self-Optimizing under AWS

  • Load balancing the server pool

    • Achieved by the Dispatcher and the Autonomic Agents

    • New simulation is assigned to the simulation server with the lowest OS load

    • Agents check Dispatcher table periodically to start any unassigned simulations

    • At each checkpoint, Agents check with the Autonomic Manager to see if migration is necessary

    • Simulations on heavily loaded servers are checkpointed and restarted on light servers.

A

W

S


Self protecting under aws
Self-Protecting under AWS

  • Careful configuration of the firewall

  • Security configuration on the grid

    • Users of the grid must register and be verified by the system administrator

    • System administrator must assign appropriate user roles

    • Use of data model tables USERS, USER_ROLES, VERIFIER

      Is this self-protecting/autonomic?

A

W

S


Conclusions and future work
Conclusions and Future Work

  • Paper presents a prototype of autonomic web-based simulation

  • Implementation of an autonomic infrastructure to support AWS is discussed

  • Future work focuses on implementing more autonomic features into AWS

A

W

S


Agnostic question 1
Agnostic Question #1

The authors describe one possible implementation of autonomic web-based simulations. One example for a project that uses such an implementation is the NOM project.

Do you know of any other projects that have been proposed or developed? How do they compare to each other in terms of efficiency, technique and architecture used?

A

W

S


Agnostic question 2
Agnostic Question #2

The paper states that web-based simulations need to be deployed through computing systems (i.e. storage devices, database, web servers and simulation servers).

Can you think of any component(s) involved that would increase the level of complexity more than the other?

A

W

S


Agnostic question 3
Agnostic Question #3

One method the authors provide for handling faults after they have occurred is through the use of checkpointing and restarting.

Which approach is better:

Using static checkpointing (fixed time intervals)

Using dynamic checkpointing

(context-specific, amount of computation, etc)

A

W

S


Agnostic question 4
Agnostic Question #4

The authors suggest that for a system to achieve autonomic features, that system must become even more complex by embedding the complexity into the system infrastructure itself.

Is there any approach that involves less complexity in achieving autonomic features? If yes, give examples.

A

W

S


Agnostic question 5
Agnostic Question #5

One method given by the paper for handling simulation servers that have not uploaded the OS metrics in a timely fashion would be to mark the simulations on that server as crashed and restart the simulations from the last checkpoint on another server.

What action would be taken if the former server starts responding optimally.

A

W

S


Agnostic question 6
Agnostic Question #6

The authors stated two major requirements (proactive failure detection and, checkpointing and restarting) for AWS.

Can you think of any other requirement that would be necessary for AWS?

A

W

S


Agnostic question 7
Agnostic Question #7

The paper suggests some techniques that could be used to implement the autonomic infrastructure for AWS such as autonomic discovery of new servers, autonomic failure detection etc.

Can you think of any other techniques that could be considered useful?

A

W

S


Agnostic question 8
Agnostic Question #8

What challenges would be faced when trying to validate and test an autonomic web-based simulation?

How important is test to autonomic web-based simulation?

A

W

S


Agnostic question 9
Agnostic Question #9

Compare and contrast the difference between autonomic grid computing and autonomic web-based simulations?

How would the challenges in validating and testing an autonomic web-based simulation application differ from what is required to validate and test an autonomic grid computing application?

A

W

S


ad