gluepy : A Simple Distributed Python Programming Framework for Complex Grid Environments

gluepy : A Simple Distributed Python Programming Framework for Complex Grid Environments. 8/1/08 Ken Hironaka, Hideo Saito, Kei Takahashi, Kenjiro Taura The University of Tokyo. Barriers of Grid Environments. Grid = Multiple Clusters (LAN/WAN) Complex environment Dynamic node joins


gluepy: A Simple Distributed Python Programming Framework for Complex Grid Environments

8/1/08

Ken Hironaka, Hideo Saito, Kei Takahashi, Kenjiro Taura

The University of Tokyo

www.logos.ic.i.u-tokyo.ac.jp/~kenny/gluepy

Barriers of Grid Environments
  • Grid = Multiple Clusters (LAN/WAN)
  • Complex environment
    • Dynamic node joins
    • Resource removal/failure
      • Network and nodes
    • Connectivity
      • NAT/firewall

[Figure: clusters separated by a firewall, with nodes joining and leaving]

Grid-enabled frameworks are crucial to facilitate computing in these environments.

What type of applications?
  • Typical Usage
    • Standalone jobs
    • No interaction among nodes
  • Parallel and distributed Applications
    • Orchestrate nodes for a single application
      • Map an existing application on the Grid
    • Requires complex interaction

⇒ Frameworks must make it simple and manageable

Common Approaches (1)

  • Programming-less
    • Batch Scheduler
      • Task placement (inter-cluster)
      • Transparent retries on failure
    • Enables minimal interaction
      • Pass data via files/raw sockets
      • Embarrassingly parallel tasks
      • Very limited range of applications

[Figure: jobs are submitted to a batch scheduler, executed, and redone on failure]

Common Approaches (2)
  • Incorporate some user programming
    • e.g., Master-Worker framework
      • Program the master/worker(s)
        • Job distribution
        • Handling worker join/leave
        • Error handling
  • Enables simple interaction
    • Still limited in application

For more complex interaction (a larger problem set), a framework must allow more flexible, general programming.

The most flexible approach
  • Parallel Programming Languages
    • Extend existing languages: retains flexibility
    • Countless past examples
      • (MultiLisp[Halstead ‘85], JavaRMI, ProActive[Huet et al. ‘04], …)
    • Problem: not designed in the context of the Grid
      • Node joins/leaves?
      • Resolve connectivity with NAT/firewall?
    • Coding becomes complex/overwhelming

Can we not complement this?

Our Contribution
  • Grid-enabled distributed object-oriented framework
    • A focus on coping with a complex environment
      • Joins, failures, connectivity
    • Simple programming and minimal configuration
      • Simple tool to act as a glue for the Grid
    • Implemented parallel applications on a Grid environment with 900 cores (9 clusters)

Agenda
  • Introduction
  • Related Work
  • Proposal
  • Evaluation
  • Conclusion

Programming-less frameworks
  • Condor/DAGMan [Thain et al. ‘05]
    • Batch scheduler
    • Transparent retries; handles multiple clusters
    • Extremely limited interaction among nodes
      • Tasks with DAG dependencies
      • Pass on data using intermediate/scratch files

[Figure: a central manager assigns tasks to nodes in a cluster, skipping busy nodes; tasks interact using files]

“Restricted” Programming frameworks
  • Master-Worker Model: Jojo2 [Aoki et al. ‘06], OmniRPC [Sato et al. ‘01], Ninf-C [Nakata et al. ‘04], NetSolve [Casanova et al. ‘96]
    • Event-driven master code: handle join/leave
  • Map-Reduce [Dean et al. ‘05]
    • Define 2 functions: map(), reduce()
    • Partial retries when nodes fail
  • Ibis – Satin [Wrzesinska et al. ‘06]
    • Distributed divide-and-conquer
    • Random work stealing: accommodates join/leave
  • Effective for specialized problem sets
    • Specializing on a problem/model makes mapping/programming easy
    • For “unexpected models”, users have to resort to out-of-band/ad-hoc means

[Figure: Master-Worker with join and failure handlers; Map-Reduce applying Map() and Reduce() to input data; divide-and-conquer dividing fib(n) into fib(n-1)]

Distributed Object Oriented frameworks

  • ABCL [Yonezawa ‘90], JavaRMI, Manta [Maassen et al. ‘99], ProActive [Huet et al. ‘04]

  • Distributed Object oriented
    • Disperse objects among resources
  • Load delegation/distribution
    • Method invocations
    • RMI (Remote Method Invocation)
    • Async. RMIs for parallelism
  • RMI:
    • good abstraction
  • Extension of general language:
    • Allow flexible coding

[Figure: foo.doJob(args) issued as an RMI or an async. RMI to object foo, which computes]

Hurdles for DOO on the Grid
  • Race conditions
    • Simultaneous RMIs on 1 object
    • Active Objects
      • 1 object = 1 thread
      • Deadlocks, e.g., recursive calls
  • Handling asynchronous events
    • e.g., handling node joins
    • Why not event-driven?
      • The flow of the program is segmented and hard to follow
  • Handling joins/failures
    • Difficult to handle them transparently in a reasonable manner

[Figure: deadlock between objects a and b through calls b.f() and a.g(); event handling (if join: add, if done: give more); on failure: checkpoint? automatic retry?]

Hurdles for Implementation

  • Connectivity with NAT/firewall
    • Solution: Build an overlay
  • Existing implementations
    • ProActive [Huet et al. ‘04]
      • Tree topology overlay
      • User must write connectable points by hand
    • Jojo2 [Aoki et al. ‘06]
      • 2-level hierarchical topology
        • SSH / UDP broadcast
      • Assumes a network topology/setting
        • Out of user control
  • Requirements
    • Minimal user burden

[Figure: the user configures each connection across the firewall in a configuration file]

Summarization of the Problems
  • Distributed Object-Oriented on the Grid
    • Thread race conditions
    • Event handling
    • Node join/leave
    • Underlying connectivity

Proposal: gluepy
  • Grid-enabled distributed object-oriented framework
    • As a Python library
    • Glue together Grid resources via simple and flexible coding
  • Resolve the issues in an object-oriented paradigm
    • SerialObjects
      • Define “ownership” for objects
      • Blocking operations unblock on events
    • Constructs for handling node join/leave
      • Resolve the “first reference” problem
      • Failures are abstracted as exceptions
    • Connectivity (NAT/firewall)
      • Peers automatically construct an overlay

The Basic Programming Model


  • RemoteObjects
    • Created/mapped to a process
    • Accessible from other processes (RMI)
    • Passive Objects
      • Threads are not bound to objects
  • Thread
    • Simply to gain parallelism
    • RMIs and async. invocations implicitly spawn a thread
  • Future
    • Returned for an async. invocation
    • Placeholder for the result
    • An uncaught exception is stored and re-raised at collection
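
The future semantics above (a placeholder for the result, with an uncaught exception stored and re-raised at collection) can be tried locally with Python's standard `concurrent.futures`. This is only a single-process analogue and not gluepy's API; the function names `run` and `collect` are made up for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run(arg):
    """Stand-in for a remote method; fails on negative input."""
    if arg < 0:
        raise ValueError("negative input: %d" % arg)
    return arg * 2


def collect(args):
    """Invoke run() asynchronously on every argument, then gather outcomes."""
    outcomes = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(run, a) for a in args]  # one future per call
        for f in futures:
            try:
                # result() blocks until done and re-raises a stored exception
                outcomes.append(("ok", f.result()))
            except ValueError as e:
                outcomes.append(("error", str(e)))
    return outcomes


if __name__ == "__main__":
    print(collect([1, -2, 3]))
```

The exception raised inside the worker thread surfaces only when the caller collects the future, which is the same point at which the slides say gluepy re-raises it.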

[Figure: an RMI a.f() spawns a thread in the process holding a to run f(); an async. invocation F = a.f() spawns a thread and stores the result in the future F]

Programming in gluepy

  • Basics: RemoteObject
    • Inherit the base class
    • Externally referenceable
  • Async. invocation with futures
    • No explicit threads
    • Easier to maintain a sequential flow
  • Mutual exclusion? Events? ⇒ SerialObjects

class Peer(RemoteObject):           # inherit RemoteObject
    def run(self, arg):
        # work here…
        return result

futures = []
for p in peers:
    f = p.run.future(arg)           # async. RMI: run() on all peers
    futures.append(f)
waitall(futures)                    # wait for all results
for f in futures:
    print f.get()                   # read all results

“ownership” with SerialObjects

  • SerialObjects
    • Objects with mutual exclusion
    • RemoteObject sub-class
  • No explicit locks
  • Ownership for each object
    • Call ⇒ acquire
    • Return ⇒ release
    • Method execution by only 1 thread
      • The “owner thread”
  • Owner releases ownership on blocking operations
    • e.g., waitall(), RMI to another SerialObject
    • Pending threads contest for ownership
    • An arbitrary thread is scheduled
    • Eliminates deadlocks for recursive calls

[Figure: waiting threads queue on an object while the owner thread runs; when the owner blocks it gives up ownership and a new owner thread is chosen; once unblocked, it re-contests for ownership]

Signals to SerialObjects
  • We don’t want event-driven loops!
  • Events → “signals”
    • Blocking operations unblock on a signal
  • Signals to objects
    • Unblock a thread blocking in the object’s context
      • If none, unblock the next thread that blocks
    • The unblocked thread can handle the signal (event)

[Figure: a SIGNAL sent to an object unblocks a thread, which then handles it]

SerialObjects in gluepy

class DistQueue(SerialObject):
    def __init__(self):
        self.queue = []

    def add(self, x):
        self.queue.append(x)
        if len(self.queue) == 1:
            self.signal()            # signal and wake a waiter

    def pop(self):
        while len(self.queue) == 0:
            wait([])                 # block until signal
        x = self.queue.pop(0)
        return x

  • e.g., a queue
    • pop()
      • Blocks on an empty queue
    • add()
      • Calls signal() to unblock a waiter
  • Atomic section:
    • Between blocking operations in a method
    • Can update object attributes and invoke methods on non-serial objects
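
For readers without gluepy at hand, the blocking-queue behaviour above can be approximated in a single process with `threading.Condition`. This is a rough analogue under the assumption that holding the condition's lock stands in for ownership and `notify()` for `signal()`; `LocalQueue` and `demo` are hypothetical names, not part of gluepy.

```python
import threading


class LocalQueue:
    """Single-process analogue of DistQueue built on a condition variable."""

    def __init__(self):
        self._cond = threading.Condition()
        self._items = []

    def add(self, x):
        with self._cond:               # comparable to acquiring ownership
            self._items.append(x)
            self._cond.notify()        # plays the role of self.signal()

    def pop(self):
        with self._cond:
            while not self._items:     # re-check the queue after every wake-up
                self._cond.wait()      # releases the lock while blocked
            return self._items.pop(0)


def demo():
    q = LocalQueue()
    results = []
    consumer = threading.Thread(target=lambda: results.append(q.pop()))
    consumer.start()                   # blocks if the queue is still empty
    q.add("job-1")                     # wakes the consumer
    consumer.join(timeout=5)
    return results
```

As in the slide, the code between two blocking points runs without interference from other threads, and the waiter re-tests its condition in a loop after being woken.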

Managing dynamic resources

  • Node join:
    • Python process starts
  • Node leave:
    • Process termination
  • Constructs for node joins/leaves
    • Node join ⇒ “first reference” problem
      • Object lookup: obtain a reference to existing objects in the computation
    • Node leave ⇒ RMI exception
      • Catch to handle the failure

[Figure: a joining node looks up objects in the computation; an RMI to an object on a failed node raises an exception]

e.g., Master-worker in gluepy (1/3)

class Master(SerialObject):
    ...
    def nodeJoin(self, node):
        self.nodes.append(node)
        self.signal()                # signal for join

    def run(self):
        assigned = {}
        while True:
            while len(self.nodes) > 0 and len(self.jobs) > 0:
                # ASYNC. RMIS TO IDLE WORKERS (elided on the slide)
            readys = wait(futures)   # block and handle join
            if readys == None: continue
            for f in readys:
                # HANDLE RESULTS (elided on the slide)

  • Handles join/leave
  • Code for join:
    • join will invoke signal
    • signal will unblock the main master thread

e.g., Master-worker in gluepy (2/3)

for f in readys:
    node, job = assigned.pop(f)
    try:
        print "done:", f.get()
        self.nodes.append(node)
    except RemoteException, e:       # failure handling
        self.jobs.append(job)

  • Failure handling
    • Exception on collection
    • Handle the exception to resubmit the task

e.g., Master-worker in gluepy (3/3)
  • Deployment
    • Master exports its object
    • Workers get a reference and do an RMI to join

Master init

master = Master()
master.register("master")
master.run()

Worker init

worker = Worker()
master = RemoteRef("master")         # lookup on join
master.nodeJoin(worker)
while True:
    sleep(1)

Automatic Overlay Construction(1)
  • Solution for Connectivity
    • Automatically construct an overlay
  • TCP overlay
    • On boot, acquire information about other peers
    • Each node connects to a small number of peers
    • Establish a connected connection graph

[Figure: peers behind NAT and firewalls attempt connections to peers with global IPs; the established connections form the overlay]

Automatic Overlay Construction(2)
  • Firewalled clusters
    • Automatic port-forwarding
    • User configures SSH information
  • Transparent routing
    • Peer-to-peer communication is routed
    • (AODV [Perkins ‘97])

# config file
use src_pat dst_pat, prot=ssh, user=kenny

[Figure: firewall traversal over SSH; peer-to-peer communication routed across the overlay]

RMI failure detection on Overlay

  • Problem with overlay
    • A route consists of a number of connections
  • RMI failure ⇒ failure of any intermediate connection
  • Path pointers
    • Recorded on each forwarding node
    • The RMI reply returns along the path it came
  • Failure of an intermediate connection
    • The preceding forwarding node back-propagates the failure
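
A toy model may help picture path pointers. The sketch assumes a fixed route and a set of broken links, and it records every node a message visits; `invoke_over_route` is a hypothetical helper, and gluepy's real routing (AODV-based, per the earlier slide) is not modelled.

```python
def invoke_over_route(route, broken_links=frozenset()):
    """Forward an RMI along route; return (status, list of nodes visited)."""
    trace = [route[0]]
    path_pointers = {}                   # node -> node the request came from
    for prev, nxt in zip(route, route[1:]):
        if (prev, nxt) in broken_links:
            # prev detects the broken connection and back-propagates the
            # failure along the recorded path pointers to the invoker.
            node = prev
            while node in path_pointers:
                node = path_pointers[node]
                trace.append(node)
            return "failure", trace
        path_pointers[nxt] = prev        # recorded on each forwarding node
        trace.append(nxt)
    # The reply follows the path pointers back to the invoker.
    node = route[-1]
    while node in path_pointers:
        node = path_pointers[node]
        trace.append(node)
    return "reply", trace
```

With the route `["A", "B", "C", "D"]` and the link `("C", "D")` broken, node C is the preceding forwarding node, so the failure travels C → B → A and the invoker A learns of it without waiting for a timeout.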

[Figure: path pointers lead from the RMI handler back to the RMI invoker; a failure is back-propagated along them]

Experimental Environment

  • InTrigger Grid platform in Japan
  • Max. scale: 9 clusters, over 900 cores

[Figure: clusters and core counts: istbs 316, tsubame 64, mirai 48, okubo 28, hongo 98, hiro 88, chiba 186, kyoto 70, suzuk 72, imade 60, kototoi 88. Labels: global IPs, private IPs, firewall, requires SSH forwarding, all packets dropped]

Necessary Configuration
  • Configuration necessary for the overlay
    • 2 clusters (tsubame, istbs) require SSH port forwarding to other clusters
    • ⇒ 2 lines of configuration
    • Connection instructions are added as regular expressions

# istbs cluster uses SSH for inter-cluster conn.

use 133\.11\.23\. (?!133\.11\.23\.), prot=ssh, user=kenny

#tsubame cluster gateway uses SSH for inter-cluster conn.

use 131.112.3.1 (?!172\.17\.), prot=ssh, user=kenny
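
The slides do not say how gluepy evaluates these rules. One plausible reading, assumed here, is that the first pattern is matched against the source address and the second against the destination address, both anchored at the start of the string. The parser below is a hypothetical illustration of that reading, not gluepy's implementation.

```python
import re


def parse_rule(line):
    """Split 'use SRC DST, key=value, ...' into (src_re, dst_re, options)."""
    head, *opts = [part.strip() for part in line.split(",")]
    _, src_pat, dst_pat = head.split()
    options = dict(opt.split("=", 1) for opt in opts)
    return re.compile(src_pat), re.compile(dst_pat), options


def method_for(rules, src_ip, dst_ip):
    """Return the options of the first rule matching the (src, dst) pair."""
    for src_re, dst_re, options in rules:
        # re.match anchors at the start, so the patterns act as prefixes
        if src_re.match(src_ip) and dst_re.match(dst_ip):
            return options
    return None


RULES = [parse_rule(r"use 133\.11\.23\. (?!133\.11\.23\.), prot=ssh, user=kenny")]
```

Under this reading, a connection from 133.11.23.40 to an address outside 133.11.23.* selects SSH, while a connection inside the same subnet matches no rule because of the negative lookahead `(?!...)`.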

Overlay Construction Simulation
  • Evaluate the overlay construction scheme
  • For different cluster configurations, vary the number of attempted connections per peer
  • 1000 trials for each cluster/attempted-connection configuration

28 Global/ 238 Private Peers Case: 95 %
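
A simulation of this kind can be sketched in a few lines. The model below is a deliberate simplification and an assumption, not the authors' simulator: every peer attempts connections to randomly chosen peers with global IPs, peers with private IPs never accept connections, and a trial counts as a success when the resulting graph is connected. It is not expected to reproduce the figure quoted on the slide.

```python
import random


def overlay_connected(n_global, n_private, attempts, rng):
    """Build one random overlay and report whether it is connected."""
    n = n_global + n_private
    if n <= 1:
        return True
    neighbors = {i: set() for i in range(n)}
    for peer in range(n):
        # Assumption: only peers with global IPs (ids < n_global) accept.
        targets = [g for g in range(n_global) if g != peer]
        for target in rng.sample(targets, min(attempts, len(targets))):
            neighbors[peer].add(target)
            neighbors[target].add(peer)
    seen, stack = {0}, [0]               # depth-first search from peer 0
    while stack:
        for nb in neighbors[stack.pop()]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) == n


def connectivity_rate(n_global, n_private, attempts, trials=1000, seed=0):
    """Fraction of trials in which the overlay came out connected."""
    rng = random.Random(seed)
    hits = sum(overlay_connected(n_global, n_private, attempts, rng)
               for _ in range(trials))
    return hits / trials
```

For instance, `connectivity_rate(28, 238, 3)` estimates the success rate for 28 global and 238 private peers with 3 attempted connections each under these simplified rules; raising `attempts` shows how quickly connectivity improves.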

Dynamic Master-Worker
  • Master object distributes work to Worker objects
    • 10,000 tasks as RMIs
  • Workers repeat join/leave
    • Tasks for failed nodes are redistributed
    • No tasks were lost during the experiment

A Real-life Application
  • A combinatorial optimization problem
    • Permutation Flow Shop Problem
    • Parallel branch-and-bound
      • Master-Worker like
      • Requires periodic exchange of bounds
    • Code
      • 250 lines of Python code as glue code
      • Worker node starts up sequential C++ code
        • Communicate with local Python through pipes

Master-Worker interaction
  • Master does RMI to worker
    • Worker: periodical RMI to master
    • Not your typical master-worker
    • requires a flexible framework like ours

[Figure: the master calls doJob() on a worker; the worker calls exchange_bound() on the master]

Performance
  • Work Rate
    • ci: total comp. time per core
    • N: num. of cores
    • T: completion time
  • Slight drop with 950 cores
    • due to master node becoming overloaded
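
The work-rate formula itself did not survive in the transcript. A natural reading of the three symbols, offered here only as an assumption, is the fraction of total core time spent on computation: the sum of the per-core computation times ci divided by N × T. The helper `work_rate` is hypothetical.

```python
def work_rate(comp_times, completion_time):
    """Assumed definition: sum(c_i) / (N * T), with N = len(comp_times)."""
    n_cores = len(comp_times)
    return sum(comp_times) / (n_cores * completion_time)


if __name__ == "__main__":
    # Four cores busy for 90, 80, 100 and 90 seconds of a 100-second run.
    print(work_rate([90.0, 80.0, 100.0, 90.0], 100.0))
```

Under this definition a value of 1.0 means every core computed for the whole run, and an overloaded master that leaves workers waiting shows up as a lower value.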

Troubleshoot Search Engine
  • Ever stuck debugging, or troubleshooting?
  • Re-rank query results obtained from Google
    • Use results from machine learning web-forums
    • Perform natural language processing on page contents at query time
  • Use a Grid backend
    • Computationally intensive
    • Requires good response time
      • Within tens of seconds

[Figure: the search engine forwards the query “vmware kernel panic” to compute nodes in the backend]

Troubleshoot Search Engine Overview

[Figure: a Python CGI calls doSearch(); doQuery() and doWork() are invoked asynchronously for graph extraction, parsing, and rescoring]

Leveraged sync/async RMIs to seamlessly integrate parallelism into a sequential program.

Merged CGIs with the Grid backend.

Conclusion
  • gluepy: Grid enabled distributed object oriented framework
    • Supports simple and flexible coding for complex Grid
      • SerialObjects
      • Signal semantics
      • Object lookup / exception on RMI failure
      • Automatic overlay construction
    • as a tool to glue together Grid resources simply and flexibly
  • Implemented and evaluated applications on the Grid
    • Max. scale: 900 cores (9 clusters)
      • NAT/Firewall, with runtime joins/leaves
    • Parallelized real-life applications
      • Take full advantage of gluepy constructs for seamless programming

Questions?
  • gluepy is available from its homepage

www.logos.ic.i.u-tokyo.ac.jp/~kenny/gluepy

Troubleshoot Search Engine Overview

[Figure: an RMI from the CGI starts the query; graph extraction, parsing, and rescoring are dispatched as async. RMIs to workers, and results return to the CGI asynchronously]

Leverage async. RMIs from the CGI script for work distribution on the Grid.

All coding is done seamlessly in Python, using gluepy.

Utilization of the Grid
  • Grid = multiple clusters (LAN/WAN)
  • Typical usage
    • Many stand-alone jobs in parallel
    • Little or no interaction among nodes
  • Parallel and distributed computing
    • Utilize nodes for a single application
      • Parallelize an existing application
    • Requires complex interaction

⇒ Utilize Grid-enabled frameworks

The demands on the Grid
  • A framework that realizes flexible/complex interaction on the Grid
  • Can we learn anything from parallel languages?

[Figure: an application applied to a Grid with a firewall and nodes joining and leaving]

Speedup

Scaling stops at 900 cores.

Cumulative Execution Time

The growth in cumulative computation time shows that re-execution wastes work.

Taking cumulative computation time into account, the speedup from 169 cores to 948 cores (5.64 times) is 4.94.

Micro-Benchmarks for the Overlay
  • Issue RMIs from 1 node
    • Almost all nodes are reached within 3 hops
  • Latency
    • No-op RMI on the overlay: ping()
  • Bandwidth
    • RMI with a large argument on the overlay: send_data()

Latency on the Overlay
  • RMI from 1 node to objects on nodes in 5 clusters: ping()
    • Compared with the RTT measured by ping
  • Overlay, 1 hop = ~1.5 [ms]

Bandwidth on the Overlay
  • Argument (de)serialization overhead is large
    • Ideal maximum on full 1 Gbit Ethernet: 40 [MB/sec]
    • Compared with the maximum calculated from iperf measurements
    • Bandwidth decreases with each hop on the overlay
      • Store-and-forward

Bandwidth varies with the number of overlay hops, even within a cluster.

Master-worker in gluepy

class Master(SerialObject):
    def __init__(self, jobs):
        self.nodes = []
        self.jobs = jobs

    def nodeJoin(self, node):
        self.nodes.append(node)
        self.signal()                        # signal for join

    def run(self):
        assigned = {}
        while True:
            while len(self.nodes) > 0 and len(self.jobs) > 0:
                node = self.nodes.pop()
                job = self.jobs.pop()
                f = node.doJob.future(job)
                assigned[f] = (node, job)
            readys = wait(assigned.keys())   # block and handle join
            if readys == None: continue
            for f in readys:
                node, job = assigned.pop(f)
                try:
                    print "done:", f.get()
                    self.nodes.append(node)
                except RemoteException, e:   # failure handling
                    self.jobs.append(job)

  • Full Master-Worker
    • Parallel RMIs
      • New RMI for idle workers
    • Node joins/leaves
      • Non-event-driven

Master init

master = Master(jobs)                        # jobs: the list of pending jobs
master.register("master")
master.run()

Worker init

worker = Worker()
master = RemoteRef("master")                 # lookup on join
master.nodeJoin(worker)                      # matches Master.nodeJoin above
while True:
    sleep(1)

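Demand for Parallel and Distributed Computing on the Grid
  • Example: Web applications
      • Interaction with CGI
      • Task division and load balancing
      • Aggregating results and formatting them for the CGI

Requires complex coordination

[Figure: a publicly accessible application with a Grid backend]
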
Programming in the Proposed Model

# Execute asynchronous RMIs
f1 = foo1.doJob.future(args)
f2 = foo2.doJob.future(args)

# Other computation…

# Get the return values (waits)
value1 = f1.get()
value2 = f2.get()

  • Asynchronous RMI with futures
    • A placeholder for the invocation
    • The result is stored eventually
  • No explicit threads are needed
    • No callbacks from other threads
  • What about the side that handles the RMI?

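What the Programming Model Provides
  • Flexibility: extends an existing object-oriented language as a library
  • Solves the problems of parallel and distributed computing
    • Locks are hidden from the programmer
      • Implicit mutual exclusion using object ownership
      • Ownership is implicitly handed over when blocking: prevents deadlocks
    • Event handling that does not fall into the event-driven style
      • Signal semantics that cooperate with the synchronization model
    • Handling joins and leaves
      • Object lookup: the “first reference” problem
      • Exception semantics for failures
  • A model that handles a distributed, dynamic environment (the Grid) concisely
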
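Realizing Inter-Resource Communication and Coordination
  • Condor/DAGMan [Thain et al. ‘05]
    • Batch scheduler
    • “Tasks” are expressed as scripts
    • Dependencies between tasks form a DAG
  • Ibis / Satin [Wrzesinska et al. ‘06]
    • Divide-and-conquer problems in distributed environments
    • Tasks are divided into child tasks
    • Tasks have parent-child relationships
  • Coordination between tasks goes through intermediate files
  • Communication is limited to tasks with dependencies

[Figure: a central manager assigns tasks to nodes in a cluster, skipping busy nodes; tasks are related by DAG dependencies]
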
Handling Joins and Leaves
  • JoJo2 [Aoki et al. ‘06]
    • Master – Worker
    • Event driven
    • A handler is defined for each event
      • Task completion
      • Join
      • Leave/failure
  • Synchronization problems
  • Event-driven code is hard to follow
  • Introduce an asynchronous event notification mechanism into the proposed synchronization semantics
  • Enables non-event-driven code

[Figure: a join handler and a failure handler are invoked on join and failure events]

Future Work
  • Reliable WAN communication for the Grid overlays
    • Node failure
    • Connection failure
  • Weakness of WAN connections
    • Router Policies
      • close connections after given period
    • Obscure kernel bugs with NAT
      • Connection resets

WAN links are more vulnerable, and failures will occur.

Some Related Work

  • Robust Tree Topologies for Sensor Networks [England ‘06]
    • Create a spanning tree for data reduction
    • Flat tree for high reliability
      • Fewest hops: high reliability, high power usage
    • Tree with short distances for low power consumption
      • Shortest path: low reliability, low power usage

⇒ A spanning tree that merges the two metrics for the best of both worlds

Possible Future Direction
  • Our context: Grid computing
    • Communication latency = metric for link reliability
  • Fewest hops
    • Reliability against node failure
  • Shortest distance
    • Reliability against link failure

[Figure: short reliable links and long faulty links]

Can we construct an overlay connection topology that takes the best of both worlds?

Publications
  • 1. Ken Hironaka, Hideo Saito, Kei Takahashi, Kenjiro Taura. A Flexible Programming Framework for Complex Grid Environments. In 8th IEEE International Symposium on Cluster Computing and the Grid, May 2008 (Poster Paper. To Appear).
  • 2. Ken Hironaka, Hideo Saito, Kei Takahashi, Kenjiro Taura. A Flexible Programming Framework for Complex Grid Environments. In IPSJ Transactions on Advanced Computing Systems. (Conditional Accept)
  • 3. Ken Hironaka, Hideo Saito, Kei Takahashi, Kenjiro Taura. A Flexible Programming Framework for Complex Grid Environments. In Proceedings of 2008 Symposium on Advanced Computing Systems and Infrastructures. (To Appear)
  • 4. Ken Hironaka, Shogo Sawai, Kenjiro Taura. A Distributed Object-Oriented Library for Computation Across Volatile Resources. In Summer United Workshops on Parallel, Distributed and Cooperative Processing. August 2007
  • 5. Ken Hironaka, Kenjiro Taura, Takashi Chikayama. A Low-Stretch Object Migration Scheme for Wide-Area Environments. In IPSJ Transactions on Programming. Vol 48 No.SIG 12 (PRO 34), pp.28-40, August 2007

Problems with Grid Computing (2)
  • Complexity of Programming on the Grid
    • Low Level Computing (sockets)
      • Communication
      • Multi-threaded Computing (Synchronization)
      • Heavy Burden on Non-experts
    • Flexibility and Integration
      • Grid Frameworks for task distribution
      • Independent parallel programming languages
      • Computing is not execution of many independent tasks
        • Need finer grained communication
      • Bad interface with user application
        • Java, Ruby, Python, PHP

Related Work
  • Discussed with respect to criteria necessary for modern Grid computing
    • Workflow Coordination
      • Flexibility without putting the burden on the user
    • Joining Nodes / Failure of resources
      • Handling these events should not dominate the programming overhead
    • Connectivity in Wide-Area Networks
      • Adaptation to networks with NAT/firewall with little manual settings

Workflow Coordination (1)

  • Condor / DAGMan [Thain ‘05]
    • “Tasks” are expressed as script files and distributed to idle nodes
    • Dependencies between tasks can be expressed as a DAG (Directed Acyclic Graph)
  • Ibis / Satin [Wrzesinska ‘06]
    • Framework for divide-and-conquer problems
    • Tasks can be broken into smaller sub-tasks, on which they depend
  • Many computations cannot be expressed as “Tasks” with dependencies
  • A task’s communication is limited to the tasks on which it depends

[Figure: a central manager assigns tasks to nodes in a cluster, skipping busy nodes; tasks are related by DAG dependencies]

Object Synchronization Example

In method f(), instance a invokes the blocking method g() on object b.

class A:
    def __init__(self, x):
        self.x = x

    def f(self, b):
        self.x += 1      # atomic section
        b.g()            # blocking RMI: gives up ownership during the RMI
        self.x -= 1      # atomic section
        return

[Figure: only 1 thread at a time runs in a; the thread blocks in b.g() and gives up ownership during the RMI; the value x stays consistent within each atomic section]

Adaptation to Dynamic Resources


  • Signal delivery to objects
    • Unblocks any thread that is blocking in the object’s context
      • Can be used to notify asynchronous events
        • A joining node
  • Node Failure ⇒ RMI Failure
    • Failure returned as exception to method invocation
    • The user can catch the exception, and perform rollback procedures if necessary

[Figure: a signal unblocks a thread blocked in an object; an RMI to a failed node returns an exception]

Preliminary Experiments
  • Overlay Construction Simulation
  • A Simple Master-Worker Application with dynamically joining/leaving nodes
  • A Real-life Parallel Application on the Grid
  • A Troubleshoot-Search Engine
