a framework for flexible programming in complex grid environments n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
A Framework for Flexible Programming in Complex Grid Environments PowerPoint Presentation
Download Presentation
A Framework for Flexible Programming in Complex Grid Environments

Loading in 2 Seconds...

play fullscreen
1 / 41

A Framework for Flexible Programming in Complex Grid Environments - PowerPoint PPT Presentation


  • 138 Views
  • Uploaded on

A Framework for Flexible Programming in Complex Grid Environments. 04/24/08 Taura Lab. 2 nd Year 76426 Ken Hironaka. New Context for Grid Computing. Grid Computing: Computation across multiple clusters over WAN Conventionally, high performance computing Parallel programming experts

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'A Framework for Flexible Programming in Complex Grid Environments' - reba


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
a framework for flexible programming in complex grid environments

A Framework for Flexible Programming in Complex Grid Environments

04/24/08

Taura Lab. 2nd Year

76426 Ken Hironaka

new context for grid computing
New Context for Grid Computing
  • Grid Computing:
    • Computation across multiple clusters over WAN
  • Conventionally, high performance computing
    • Parallel programming experts
  • Broadening of demands and needs
    • Natural Language Processing
    • Genetic Sequence Analysis
    • The users are extending to

Non-parallel programming experts

more applications for grid computing
More Applications for Grid Computing
  • Just computing ⇒ computing is only 1 part
  • “Cloud Computing”
    • Applications with Grid computing backend
      • Handle intensive computation
      • load-balancing
    • e.g.: Web-Applications

backend

Publicly accessible

Application

Simple Job Submitter is not enough

problems with conventional frameworks
Problems with conventional Frameworks

Task

File

  • Conventional Grid Computing
    • Distributed Task computation frameworks
    • No interaction
  • Answer to broader Application and Demands
    • Complex interaction
  • Need for flexible workflow coordination without loss of simplicity
    • Programming support for Grid

Fine Grain

Interaction

problems with grid computing
Problems with Grid Computing
  • Deployment Complexity on Grid Environments
    • Dynamically joining nodes
    • Node/Network failures
    • Network environment
      • Prevalence of NAT/firewall
      • Unreliable WAN connections

Fire Wall

leave

Faulty link

Configuration?Communication (sockets)?

Error Handling?

Need for simple deployment on complex environments

join

our contribution
Our Contribution
  • A distributed object-oriented Programming Framework that alleviates the burden of Grid environments
    • Flexibility of programming without loss of simplicity
    • Simplicity of deployment
      • Run on the Grid with minimal configuration
    • Real – Life Applications
      • Deployed an application on over900 cores across 9 clusters
      • “trouble-shooting” search engine on the Grid
        • As example of Cloud Computing
agenda
Agenda
  • Introduction
  • Related Work
  • Proposal
  • Preliminary Experiments
  • Conclusionand Future Work
distributed objects and rmi
Distributed Objects and RMI

foo.doJob(args)

  • ProActive [Huet ‘04]
  • Distributed Object Oriented
    • Objects on remote nodes
  • Work Delegation
    • RMI (Remote Method Invocation)
  • Parallel Computation via

asynchronous RMI

    • Possible race-conditions
  • Active Objects
    • 1 object = 1 thread
    • induces deadlocks

compute

RMI

foo

Async. RMI

  • Need for synchronization ⇒
  • cluttered with locks/synchronization
  • Coding becomes complex

deadlock

b.f()

b

a

a.g()

handling joins and failures
Handling Joins and Failures
  • JoJo [Nakada ‘04]
    • Master – Worker framework
    • Event driven coding
    • A handler is invoked for each event
      • Task completion
      • Node Joins
      • Node Failures

Failure Handler

Join Handler

Join

  • Synchronization issues
  • Event driven programming
  • For more complex problems, coding easily becomes unreadable
resolving connectivity on the grid
Resolving Connectivity on the Grid

NAT

  • ProActive [Huet ‘04]
    • overlay Network for communication
    • Resorts to manual network configuration files
      • Specify each connection

Configure each link

Firewall

  • configuration overhead becomes enormous on Grid scale

Connection

Configuration File

agenda1
Agenda
  • Introduction
  • Related Work
  • Proposal
  • Preliminary Experiments
  • Conclusionand Future Work
our proposal
Our Proposal
  • A distributed object oriented framework for the Grid
    • Distributed Objects with Grid programming support
      • deadlock-free synchronization
      • Additional constructs to cope with join/failure of node
    • Automatic and Adaptive Overlay Construction for Grid Runtime
  • object oriented with support for race-condition/join/failure : flexibility and simplicity
  • deployment requires minimal configuration : simplicity
object synchronization model
Object Synchronization Model

waiting threads

owner

thread

object

  • parallel programming with minimal use of explicit locks
  • Distributed objects with ownership
    • Its method can only be executed by 1 thread at a time : the owner thread
    • Eliminates data races
  • Owner gives up ownership for blocking operations
    • Other threads may contest for ownership
    • Eliminates deadlocks for common cases

Th

Th

Th

Th

new

owner

thread

object

Th

Th

Th

block

Give-up

Owner

ship

Th

re-contest

for ownership

object

Th

Th

Th

Th

unblock

adaptation to dynamic resources
Adaptation to Dynamic Resources

Objects in

computation

  • programming support for joining/leaving nodes
  • Decentralized object lookup
    • Allow joining nodes to access other objects and join the computation
  • Node Failure ⇒ RMI Failure
    • Failure returned as exception to method invocation
    • The user can catch the exception, and perform rollback procedures if necessary

lookup

New object on joining node

Exception!

Object on failed node

automatic overlay construction 1
Automatic Overlay Construction (1)
  • Automatic/Transparent communication
  • Configuration ONLY for firewalled clusters
  • Adapts to dynamic joins/leaves
  • Nodes create a TCP overlay network cooperatively
    • Each node picks a small number of nodes to connect
    • Created connected graph

NAT

Global IP

Firewall

Attempt connection

established connections

automatic overlay construction 2
Automatic Overlay Construction (2)
  • NAT Clusters
    • NAT nodes can connect to global nodes
  • Firewalled Clusters
    • Automatic SSH port-forwarding
      • User specifies points
  • Transparent Communication
    • Point-Point communication is routedover the network
    • Ad-hoc routing Protocol
      • AODV [Perkins ‘97]
    • Adapts to node joins/leaves

Firewall

traversal

SSH

P-to-P

communication

failure detection on overlay
Failure Detection on Overlay

RMI handler

  • How do we detect failures on the overlay?
  • RMI Failure
    • Intermediate/end node failure

⇒ link failure

  • Path Pointers
    • Forwarding nodes remember the nexthop
    • RMI reply is returned the same way
  • For link failure along pointer, back-propagate the failure to the invoker

Path pointer

Backpropagate

failure

agenda2
Agenda
  • Introduction
  • Related Work
  • Proposal
  • Preliminary Experiments
  • Conclusionand Future Work
experiment cluster settings
Experiment Cluster Settings

900 cores over 9 clusters

Global IPs

istbs

tsubame

hongo

okubo

chiba

All packets dropped

suzuk

kyoto

kototoi

imade

Private IPs

Firewall

overlay construction simulation
Overlay Construction Simulation
  • Evaluate the overlay construction scheme
  • For different cluster configurations, modified number of attempted connections per peer
  • 1000 trials per each cluster/attempted connection configuration

Even for pathological case,20 connections per peer is enough

dynamic master worker
Dynamic Master-Worker
  • A master object distributes work to worker objects
    • 10,000 tasks all together
    • Task as RMIs
  • Worker nodes join/leave at runtime
    • New task for new node
    • Reassignment for tasks on failed nodes
    • No tasks were lost during computation
dynamic master worker1
Dynamic Master-Worker

As the number of workers change, the number of assigned tasks change accordingly.

The Master adaptively distributes, rolls back, and redistribute tasks.

a real life application
A Real-Life Application
  • Solving a combination optimization problem
    • Permutation Flow Shop Problem
    • Parallel Branch-and-Bound
      • Master-Workerstyle
      • Periodic updates
    • Work distribution
      • Divide search space evenly as subtasks
      • Load-balancing
        • Unfinished tasks are sub-divided and redistributed
        • Wasteful computation is quite possible
master worker coordination
Master-Worker Coordination
  • Master does RMI to Worker
    • Worker: periodic bound exchange with master
    • Not a straightforward Master-Worker application
    • Requires flexible framework like ours

Master

exchange_bound()

doJob()

Worker

runtime speedup
Runtime Speedup

Lacks scalability with

over 900 cores

cumulative computation time
Cumulative Computation Time

Growth in Cum. Comp. time is attributed to increased re-execution of task

If the Cum. Comp. time is taken into account, the speed up from 169 cores to 948 cores (5.64 times) is 4.94

troubleshoot search engine
Troubleshoot Search Engine
  • Ever stuck debugging, or troubleshooting?
  • Re-rank google queries and give weight to pages for web-forums and solutions
    • Natural language processing and machine learning
  • Parallel computation on Grid backend
    • Real time response

Compute!!

Compute!!

backend

Query:

“vmware kernel panic”

Search Engine

Compute!!

agenda3
Agenda
  • Introduction
  • Related Work
  • Proposal
  • Preliminary Experiments
  • Conclusionand Future Work
conclusion
Conclusion
  • A distributed object oriented programming framework for Grid environments
    • A novel distributed object oriented programming model
    • Grid-enabled via automatic overlay construction
  • Showed that real-life Grid application needs can be addressed by our framework
    • Deployed actual parallel applications on over 900 cores over 9 clusters with NAT/Firewalls, joins, and failures
    • Implemented a Grid computing backend for troubleshooting search engine
future work
Future Work
  • Reliable WAN communication for the Grid overlays
    • Node failure
    • Connection failure
  • Weakness of WAN connections
    • Router Policies
      • close connections after given period
    • Obscure kernel bugs with NAT
      • Connection resets

Faulty link

WAN links are more vulnerable, and failures will occur

some related work
Some Related Work

Fewest Hop:

High Reliability

High Power Usage

  • Robust Tree Topologies for Sensor Networks [ England ‘06]
    • Create spanning tree for data reduction
    • Flat tree for high reliability
      • Fewest Hops
    • Tree with short distance for low power consumption
      • Shortest Path

⇒ Spanning Tree that merges the two metrics for the best of two worlds

Shortest Path:

Low Reliability

Low Power Usage

possible future direction
Possible Future Direction
  • Our context: Grid computing
    • communication latency

= metric for link reliability

  • Fewest Hops
    • Reliability for node failure
  • Shortest Distance
    • Reliability for link failure

Short reliable links

Long faulty links

Can we construct an overlay connection topology that take the best of two worlds?

publications
Publications
  • 1. Ken Hironaka, Hideo Saito, Kei Takahashi, KenjiroTaura. A Flexible Programming Framework for Complex Grid Environments. In 8th IEEE International Symposium on Cluster Computing and the Grid, May 2008 (Poster Paper. To Appear).
  • 2. Ken Hironaka, Hideo Saito, Kei Takahashi, KenjiroTaura. A Flexible Programming Framework for Complex Grid Environments. In IPSJ Transactions on Advanced Computing Systems. (Conditional Accept)
  • 3. Ken Hironaka, Hideo Saito, Kei Takahashi, KenjiroTaura. A Flexible Programming Framework for Complex Grid Environments. In Proceedings of 2008 Symposium on Advanced Computing Systems and Infrastructures. (To Appear)
  • 4. Ken Hironaka, Shogo Sawai, KenjiroTaura. A Distributed Object-Oriented Library for Computation Across Volatile Resources. In Summer United Workshops on Parallel, Distributed and Cooperative Processing. August 2007
  • 5. Ken Hironaka, KenjiroTaura, Takashi Chikayama. A Low-Stretch Object Migration Scheme for Wide-Area Environments. In IPSJ Transactions on Programming. Vol 48 No.SIG 12 (PRO 34), pp.28-40, August 2007
problems with grid computing 2
Problems with Grid Computing (2)
  • Complexity of Programming on the Grid
    • Low Level Computing (sockets)
      • Communication
      • Multi-threaded Computing (Synchronization)
      • Heavy Burden on Non-experts
    • Flexibility and Integration
      • Grid Frameworks for task distribution
      • Independent parallel programming languages
      • Computing is not execution of many independent tasks
        • Need finer grained communication
      • Bad interface with user application
        • Java, Ruby, Python, PHP
related work
Related Work
  • Discussed with respect to criteria necessary for modern Grid computing
    • Workflow Coordination
      • Flexibility without putting the burden on the user
    • Joining Nodes / Failure of resources
      • Handling these events should not dominate the programming overhead
    • Connectivity in Wide-Area Networks
      • Adaptation to networks with NAT/firewall with little manual settings
workflow coordination 1
Workflow Coordination (1)

Central Manager

  • Condor / DAGMan [Thain ‘05]
    • “Tasks” are expressed as script files and distributed on idle nodes
    • Dependency between tasks can be expressed in DAG (Directed Acyclic Graph)
  • Ibis / Satin [Wrzesinska ‘06]
    • framework for divide-and-conquer problems
    • Tasks can be broken into smallersub-tasks, on which it depends

Assign

Busy Nodes

Cluster

Task

  • Many computation cannot be expressed as “Tasks” with dependencies
  • A task’s communication is limited to others to which it has dependencies

DAG Dependency

Relationship

object synchronization example
Object Synchronization Example

In method f(), instance a invokes blocking method g() on object b

class A:

def __init__(self, x):

self.x = x

def f(self, b):

self.x += 1

#blocking RMI

b.g()

self.x -= 1

return

a

b

only 1 thread at a time

b.g()

block

give-up

ownership

during RMI

Atomic section

Value x stays

consistent

Atomic section

adaptation to dynamic resources1
Adaptation to Dynamic Resources

signal

  • Signal delivery to objects
    • Unblocks any thread that is blocking in the object’s context
      • Can be used to notify asynchronous events
        • A joining node
  • Node Failure ⇒ RMI Failure
    • Failure returned as exception to method invocation
    • The user can catch the exception, and perform rollback procedures if necessary

object

unblock

Th

block

exception

preliminary experiments
Preliminary Experiments
  • Overlay Construction Simulation
  • A Simple Master-Worker Applicationwith dynamically joining/leaving nodes
  • A Real-life Parallel Application on the Grid
  • A Troubleshoot-Search Engine