Cougaar Design Case Study: Mandelbrot GUI Application

Todd Wright, Feb 5th, 2007

Overview
• These slides present ten alternate designs for an example Cougaar application, a “Mandelbrot” fractal GUI
• Each design explores various tradeoffs:
  • Design complexity
  • Modularity and how the modules (plugins/agents) interact with one another
  • Parallel / distributed processing support
• As we go, we’ll summarize what we’ve learned and outline basic design patterns

Mandelbrot Application
• Basic idea:
  • The user submits an image calculation request
  • Cougaar application code (plugins & agents) computes the image data
  • The image is displayed to the user in a GUI
• The example image is a “Mandelbrot” fractal
  • Given an (x, y) range and image size, e.g.
    • Range: (-1.5, -1.0) to (1.5, 1.0)
    • Image: 1024 x 768
  • Compute the image using the “Mandelbrot” algorithm
    • Simple math
    • Entirely compute-bound (possibly network-bound if we make it distributed)
[Diagram labels: GUI, I/O, Nodes, Agents, Plugins]
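The “simple math” above can be sketched as the standard escape-time iteration z = z^2 + c, applied once per pixel. This is a minimal illustration, not code from the slides: the class name, method names, the iteration cap, and the pixel-center mapping are all assumptions.

```java
// Illustrative sketch only; names and the maxIter parameter are assumptions,
// not part of the original slides or of any Cougaar API.
final class MandelbrotSketch {

    /** Returns the iteration at which c = (cx, cy) escapes, or maxIter if it never does. */
    static int escapeTime(double cx, double cy, int maxIter) {
        double zx = 0.0;
        double zy = 0.0;
        int i = 0;
        // Iterate z = z^2 + c until |z|^2 > 4 or the iteration cap is reached.
        while (i < maxIter && zx * zx + zy * zy <= 4.0) {
            double next = zx * zx - zy * zy + cx;
            zy = 2.0 * zx * zy + cy;
            zx = next;
            i++;
        }
        return i;
    }

    /** Computes one iteration count per pixel (row-major) for an (x, y) range and image size. */
    static int[] compute(double xMin, double yMin, double xMax, double yMax,
                         int width, int height, int maxIter) {
        int[] data = new int[width * height];
        for (int row = 0; row < height; row++) {
            // Sample at pixel centers so a 1-pixel-high image is still well defined.
            double cy = yMin + (yMax - yMin) * (row + 0.5) / height;
            for (int col = 0; col < width; col++) {
                double cx = xMin + (xMax - xMin) * (col + 0.5) / width;
                data[row * width + col] = escapeTime(cx, cy, maxIter);
            }
        }
        return data;
    }

    public static void main(String[] args) {
        // The range from the slide, at a reduced image size to keep the run short.
        int[] data = compute(-1.5, -1.0, 1.5, 1.0, 64, 48, 100);
        System.out.println("pixels computed: " + data.length);
    }
}
```

Every pixel is independent of every other pixel, which is why the job is compute-bound and why the later designs can split it freely.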

Design Comparison Matrix
• We’ll rate each of the following designs on these 1-to-10 scales, with 10 being ideal:
  • Simplicity
    • How easy is the code to understand?
  • Modularity
    • Can we easily replace parts of our solution with alternative implementations?
  • Scalability
    • Can we distribute our solution across multiple hosts?
  • Inter-job Parallelism
    • Can separate jobs run in parallel?
  • Intra-job Parallelism
    • Can a single job be subdivided and run in parallel?
  • Adaptability
    • Can we customize the behavior, e.g. using policies or runtime metrics?
• This will allow us to better see the tradeoffs between the designs.

Design 1: Just a Servlet
• Design: Do everything in a self-contained Servlet:
  • Listens for browser HTTP requests
  • Computes image data in the servlet “doGet” thread
  • Writes the image result as a JPG
• Characteristics:
  • Easy to implement and configure
  • Few Cougaar dependencies (no need for a blackboard or other plugins)
  • No synchronization or threading issues (runs in the Servlet request thread)
[Diagram: http request into a Node containing the Servlet]
Servlet pseudocode:
  public void doGet(..) {
    read params
    compute image data
    write image as JPG
  }

Analysis: Design 1

Design 2: Servlet UI + Calculator Service
• Design: Move the “compute()” code out of the Servlet and into a separate Component
  • Primarily a refactor of the prior design
  • Use a service to advertise the “compute()” method
  • This is a typical solution for wrapping library code
• Characteristics:
  • Still fairly easy to implement and configure
  • Improved modularity:
    • Can replace UI code while keeping calculator code (e.g. make popup Swing UI)
    • Can replace calculator code while keeping UI code (e.g. compute different fractal design)
  • No threading or synchronization issues (runs in the Servlet request thread)
[Diagram: http request into a Node containing the Servlet and the Calculator]
Servlet pseudocode:
  public void load() {
    calc = getService(Calc..);
  }
  public void doGet(..) {
    read params
    calc.compute(..);
    write image as JPG
  }
Calculator pseudocode:
  public void load() {
    advertise(Calc, this);
  }
  public byte[] compute(..) {
    compute image data
  }

Design Point: Inlined Code vs. Services
• Key design points:
  • Design 1:
    • Summary: The plugin directly calls the inlined / library code
    • Benefit: Easy to implement, self-contained
    • Downside: Difficult to switch between alternate library implementations, awkward to share non-static library instances between plugins
  • Design 2:
    • Summary: One plugin advertises a service, other plugin(s) obtain and use it
    • Benefit: Supports shared, pluggable services, cleans up the code
    • Downside: Must refactor / wrap library code into service API(s), plus add new plugins to advertise these services
• Example of interest:
  • Plugin “A” advertises a “WindowManagerService” and pops up an empty Swing Panel
  • Subsequent plugins obtain this service and add their Swing “JComponent” panels to the service by calling an “add(..)” method (instead of popping up their own windows)
  • The “window manager” plugin decides where to place the sub-frames
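The advertise / obtain pattern of Design 2 can be sketched with a toy in-memory registry. This is an assumption-laden illustration: `ToyServiceRegistry`, `CalculatorService`, and their methods are hypothetical stand-ins and do not reproduce Cougaar's actual service API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical service API that the calculator component would advertise. */
interface CalculatorService {
    byte[] compute(int width, int height);
}

/** Toy stand-in for a service registry; NOT Cougaar's real service mechanism. */
final class ToyServiceRegistry {
    private final Map<Class<?>, Object> services = new HashMap<>();

    /** Called by the providing component, typically at load time. */
    <T> void advertise(Class<T> type, T impl) {
        services.put(type, impl);
    }

    /** Called by client components; returns null if nothing was advertised. */
    <T> T getService(Class<T> type) {
        return type.cast(services.get(type));
    }
}

final class ServiceSketch {
    public static void main(String[] args) {
        ToyServiceRegistry registry = new ToyServiceRegistry();

        // "Calculator" component: advertises an implementation (a dummy one here).
        registry.advertise(CalculatorService.class, (w, h) -> new byte[w * h]);

        // "Servlet" component: obtains the service and makes a blocking call.
        CalculatorService calc = registry.getService(CalculatorService.class);
        byte[] image = calc.compute(4, 3);
        System.out.println("image bytes: " + image.length);
    }
}
```

Swapping in a different fractal only requires advertising a different `CalculatorService` implementation; the client code stays unchanged, which is the modularity gain the slide describes.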

Design 3: Servlet UI & Blackboard-based Calculator Plugin
• Design: Instead of a service, publish the request on the blackboard
  • Use non-blocking blackboard operations (pub/sub) instead of a blocking method call
• Characteristics:
  • The “calculate()” method runs in a separate plugin thread
    • We’re using the blackboard as both a communication and thread-switch layer
  • We no longer have a simple, blocking “calculate()” service API
  • We now have a blackboard representation of the Job
    • Defines our data-oriented “API” between our plugins
    • Other plugins can observe this interaction (e.g. for debugging, management, etc)
[Diagram: http request into a Node containing one Agent; the Servlet and Calculator share the Job via the Blackboard]
Servlet pseudocode:
  public void doGet(..) {
    Job job = new Job(params);
    publishAdd(job);
    job.waitForCompletion();
    write image as JPG
  }
Calculator pseudocode:
  public void setupSubs() {
    subscribe to Jobs
  }
  public void execute() {
    for all added Jobs {
      compute job
      notify of completion
    }
  }

Design Point: Services vs. Blackboards
• Key design points:
  • Design 2:
    • Summary: Plugins interact through blocking service method calls
    • Benefit: Easy blocking method APIs
    • Downside: Method calls run in caller’s thread and are blocking. Use of callbacks to support non-blocking APIs requires awkward thread switching.
  • Design 3:
    • Summary: Plugins interact through asynchronous blackboard pub/sub operations
    • Benefit: Non-blocking and parallelized, plugin “execute()” methods are single-threaded, “Job” state is visible on the blackboard
    • Downside: Must reorganize code to fit the pub/sub “execute()” pattern. This can introduce “bookkeeping” state, where a service-based design would keep this state “for free” on the method-call stack.
• The prior example shows an awkward mixed design:
  • Servlet “doGet()” callbacks are blocking and must complete in that thread
  • The blackboard is an asynchronous pub/sub interaction
  • Hence the odd “job.waitForCompletion()” solution.
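The “job.waitForCompletion()” bridge between a blocking caller and an asynchronous worker might look like the sketch below. It assumes a `CountDownLatch` inside the job and uses a plain `BlockingQueue` as a stand-in for the blackboard; `BlockingJob` and `submitAndWait` are hypothetical names, and none of this is Cougaar code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical Job object that a blocking caller can wait on. */
final class BlockingJob {
    final int width;
    final int height;
    private volatile byte[] result;
    private final CountDownLatch done = new CountDownLatch(1);

    BlockingJob(int width, int height) {
        this.width = width;
        this.height = height;
    }

    /** Called by the calculator thread: store the result, then wake the waiter. */
    void complete(byte[] data) {
        result = data;
        done.countDown();
    }

    /** Called by the blocking request thread (the servlet's doGet in the slides). */
    byte[] waitForCompletion() throws InterruptedException {
        done.await();
        return result;
    }
}

final class WaitForCompletionSketch {

    /** Publishes a job to a toy "blackboard" queue and blocks until it is computed. */
    static byte[] submitAndWait(int width, int height) {
        BlockingQueue<BlockingJob> blackboard = new LinkedBlockingQueue<>();

        // Stand-in for the Calculator plugin: runs in its own thread.
        Thread calculator = new Thread(() -> {
            try {
                BlockingJob job = blackboard.take();
                job.complete(new byte[job.width * job.height]); // dummy image data
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        calculator.setDaemon(true);
        calculator.start();

        try {
            BlockingJob job = new BlockingJob(width, height);
            blackboard.put(job);            // analogous to publishAdd(job)
            return job.waitForCompletion(); // the awkward blocking step
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for job", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes: " + submitAndWait(8, 6).length);
    }
}
```

The request thread sits idle for the whole computation, which is exactly the mixed blocking/asynchronous awkwardness the slide points out.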

Design 4: Non-Blocking UI
• Design: Replace Servlet UI with Plugin “Screensaver” UI
  • The servlet case is odd, in that the “doGet(..)” request method is a blocking, external Thread call
  • As a point of comparison, create a Plugin-based UI client that uses a non-blocking Cougaar thread and standard blackboard pub/sub operations
• Characteristics:
  • The UI plugin listens for a subscription change instead of a lock “notify()”
    • This is a more standard Cougaar interaction pattern
  • This approach isn’t applicable for our Servlet-based UI (but might fit a Swing UI)
[Diagram: Node containing one Agent; the Requestor and Calculator share the Job via the Blackboard; the Requestor writes /tmp/out.jpg]
Requestor pseudocode:
  public void setupSubs() {
    subscribe to Jobs
    publishAdd(new Job)
  }
  public void execute(..) {
    for all changed Jobs {
      write image as JPG
    }
  }
Calculator pseudocode:
  public void setupSubs() {
    subscribe to Jobs
  }
  public void execute() {
    for all added Jobs {
      compute image data
      publishChange(job)
    }
  }

Design Point: Mixed Services/BB vs. All BB
• Key design points:
  • Design 3:
    • Summary: Servlet “doGet()” callback uses awkward lock wait/notify to detect blackboard work completion instead of an asynchronous subscription
    • Benefit: Required due to limitations of blocking Servlet callback API
    • Downside: Awkward mixed-metaphor of wait/notify + subscription changes
  • Design 4:
    • Summary: All interaction is through blackboard pub/sub operations
    • Benefit: Easy integration via subscriptions, completely asynchronous
    • Downside: Not applicable in the Servlet case.
• Most applications fit entirely into the blackboard-friendly pub/sub pattern
• The design often gets awkward when plugins must interact with both blocking/callback services plus blackboard pub/sub operations
  • Typically results in awkward “todo” lists to switch threads
  • Ideally this can be avoided

Design 5: Separate Job/Result Objects
• Design: Instead of changing the Job, publish a separate Result object
  • This makes it clear that the result is a separate data structure
  • We’ll assume that we’re using the non-servlet “Requestor”, as in design 4
• Characteristics:
  • The subscriptions now look for different data structures (notice that arrows are “one-way”)
  • The Result object should have a pointer to the Job, or have a shared “unique job identifier”
[Diagram: Node containing one Agent; the Requestor publishes the Job and the Calculator publishes the Result on the Blackboard; the Requestor writes /tmp/out.jpg]
Requestor pseudocode:
  public void setupSubs() {
    subscribe to Results
    publishAdd(new Job)
  }
  public void execute(..) {
    for all added Results {
      write image as JPG
    }
  }
Calculator pseudocode:
  public void setupSubs() {
    subscribe to Jobs
  }
  public void execute() {
    for all added Jobs {
      compute image data
      publishAdd(new Result)
    }
  }

Design Point: Separate Results Object
• Key design points:
  • Design 4:
    • Summary: Job has field for results data
    • Benefit: Fewer blackboard objects
    • Downside: Multiple writers to the same object, to fill in result slot
  • Design 5:
    • Summary: Calculator publishes a separate Results object
    • Benefit: Finer-grain subscriptions, “publishAdd” driven
    • Downside: More blackboard objects
• This is more a matter of style

Design 6: Remote Processing
• Design: Transfer the job to a remote agent
  • Wrap the job in a relay
  • We’ll assume that the “master” knows a priori about the single “slave”
• Characteristics:
  • Can run the slave on a remote host (supports remote processing)
  • Adds layer of Relay “wrapping” and processing code to do our data transfer
  • Must transfer both the Job and its result-data (two-way comms instead of shared memory)
[Diagram: Node 1 hosts the Master Agent with the Servlet (http); Node 2 hosts the Slave Agent with the Calculator. Each agent has its own Blackboard holding the Job; the master’s Relay and the slave’s Relay copy are exchanged via Messages]

Design Point: Centralized vs. Distributed
• Key design points:
  • Design 5:
    • Summary: Single agent with shared blackboard
    • Benefit: Plugins can assume that everything is on their local blackboard
    • Downside: Limited to single host
  • Design 6:
    • Summary: Wrap job in Relay, transfer to remote agent for processing
    • Benefit: Distributed, partitions work and memory across hosts
    • Downside: Clutters plugin code with Relay “wrapping” and “addressing”. No longer a shared memory, so Relays must transfer data back & forth.
• Relays (or similar mechanism) are used to transfer data between blackboards
  • Required because agents don’t support shared-memory blackboards
• Anytime you make something distributed you run into well-known distributed processing limitations (latency, robustness, etc)
• The next design separates the Relay wrapping/addressing from the non-transfer-related plugin work

Design 7: Remote Processing with Dispatcher
• Design: Introduce concept of “Dispatcher” Plugin
  • Separates Servlet/Calculator code from remote transfer code
  • Still use relays to transfer jobs (an equivalent option is to use task/allocation)
• Characteristics:
  • Can implement different kinds of dispatch policies as pluggable “Dispatcher”s
  • Adds job management control in the Dispatcher code
  • One more layer of thread switching & indirection (but that’s often a good thing)
[Diagram: Node 1 hosts the Master Agent with the Servlet (http) and Dispatcher; Node 2 hosts the Slave Agent with the Receiver and Calculator. Each agent has its own Blackboard holding the Job; the master’s Relay and the slave’s Relay copy are exchanged via Messages]

Design Point: Use of “Dispatcher” Plugins
• Key design points:
  • Design 6:
    • Summary: Domain plugins do Relay wrapping and addressing
    • Benefit: Fewer plugins
    • Downside: Clutters domain code, difficult to enhance
  • Design 7:
    • Summary: Introduce “Dispatcher / Receiver” plugins to handle Relay details
    • Benefit: Cleans up design, supports pluggable dispatch options
    • Downside: Adds more indirection, more objects on blackboard.
• The “Dispatcher” design is often a good idea, except in trivial cases where the added flexibility would be overkill.

Design 8: Load-balancing
• Design: Support multiple worker agents
  • Dispatcher can choose between slaves
    • A job can be sent to any slave
    • This allows us to balance work between our slaves
  • Allow multiple, dynamic slaves
    • Slaves “register” with the master agent via a Relay
    • Slave “pulls” down job, replies with results, and pulls next job
  • Add concept of separate relays for slave-to-master vs. master-to-slave comms
    • Slave sends registration & results via its relay
    • Master sends new jobs via its relay
    • Creates more of a unified “comms channel”, for better error processing
• Characteristics:
  • Can balance jobs between slaves (if we have more jobs than slaves)
    • Ideally one agent per CPU, distributed across hosts according to per-host CPU count
    • If we only have one job then this doesn’t help, since (in this design) we can’t reduce jobs into smaller tasks
  • Simple configuration via slave “register”, instead of hard-coding slave names in the master
  • More adaptive: we can dynamically support added/removed slaves
Illustration on next slide.

Design 8: Load-balancing (2)
[Diagram]
• Node 0, Master Agent: Servlet (http) and Dispatcher; the Blackboard holds Job A, Job B, and the relays “to: Slave1”, “to: Slave2”, “from: Slave1”, “from: Slave2”
• Node 1, Slave1 Agent: Calculator and Receiver; the Blackboard holds a Job and the relays “to: Master” and “from: Master”
• Node 2, Slave2 Agent: Calculator and Receiver; the Blackboard holds a Job and the relays “to: Master” and “from: Master”

Design Point: Single vs. Load-balanced
• Key design points:
  • Design 7:
    • Summary: Work is offloaded to a single remote worker
    • Benefit: Offloads work, relatively simple design
    • Downside: Only computes one job at a time, only supports a single worker
  • Design 8:
    • Summary: Work is dispatched to one of many workers
    • Benefit: Load-balances work, supports an arbitrary number of workers
    • Downside: More complex design, must choose which slave to send work to. Parallelism is limited to our job backlog.
• The load-balanced solution is a general-purpose, parallelized “grid” computer
  • However, we’re still limited by the granularity of our Jobs.
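The “pull” style of load balancing in Design 8 can be approximated in a single JVM with worker threads draining a shared backlog. This sketch is only an analogy: threads stand in for remote worker agents, a queue stands in for the relay traffic, and `LoadBalanceSketch` and `dispatch` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

final class LoadBalanceSketch {

    /**
     * Runs jobCount whole jobs on workerCount workers and returns, for each
     * job id, the name of the worker that completed it.
     */
    static Map<Integer, String> dispatch(int jobCount, int workerCount) {
        Queue<Integer> backlog = new ConcurrentLinkedQueue<>();
        for (int job = 0; job < jobCount; job++) {
            backlog.add(job);
        }

        Map<Integer, String> completedBy = new ConcurrentHashMap<>();
        List<Thread> workers = new ArrayList<>();
        for (int w = 0; w < workerCount; w++) {
            String name = "Worker" + (w + 1);
            Thread worker = new Thread(() -> {
                Integer job;
                // Pull a job, "compute" it, report the result, then pull the next one.
                while ((job = backlog.poll()) != null) {
                    completedBy.put(job, name);
                }
            });
            workers.add(worker);
            worker.start();
        }

        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return completedBy;
    }

    public static void main(String[] args) {
        Map<Integer, String> completedBy = dispatch(10, 3);
        System.out.println("jobs completed: " + completedBy.size());
    }
}
```

With a single job in the backlog only one worker ever does anything, which illustrates the slide's point that parallelism here is limited by the job backlog.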

Design 9: Fine-Grained Parallel Processing
• Design: Divide the job into subtasks, allocate tasks to remote agents
  • Add concept of Job-to-Task decomposition
    • New Expander plugin decomposes Job into Tasks
    • These Tasks are published on the blackboard
    • Expander detects when all tasks have been completed, aggregates the result, and completes the job
  • Can divide our Job into an arbitrary number of Tasks, but ideally this is guided by the Dispatcher’s knowledge of how many slaves we have
• Characteristics:
  • Maximum parallelism.
    • We can split a single Job across an arbitrary number of slaves
    • We are no longer limited by our Job backlog
  • Note that a complex Job representation is required to support Task decomposition & incremental result updates
Illustration on next slide.

Design 9: Fine-Grained Parallel Processing (2)
[Diagram]
• Node 0, Master Agent: Servlet (http), Expander, and Dispatcher; the Blackboard holds the Job, Task 0 through Task N, and the relays “to: Slave1”, “to: Slave2”, “from: Slave1”, “from: Slave2”
• Node 1, Slave1 Agent: Calculator and Receiver; the Blackboard holds a Task and the relays “to: Master” and “from: Master”
• Node 2, Slave2 Agent: Calculator and Receiver; the Blackboard holds a Task and the relays “to: Master” and “from: Master”

Design Point: Load-balanced vs. Parallel
• Key design points:
  • Design 8:
    • Summary: Entire jobs are load-balanced between workers
    • Benefit: Offloads work, relatively simple design
    • Downside: No intra-job parallelism (but separate jobs may run in parallel)
  • Design 9:
    • Summary: Uses task decomposition and balanced remote task allocation
    • Benefit: Highly parallelized, can parallelize a single job across multiple workers
    • Downside: More complex design, must track and re-assemble subtask results. Only works if the job can be decomposed into independent, parallelizable subtasks.
• The primary tradeoff in this case is design complexity.
• This also assumes that we can decompose our Jobs into arbitrarily small subtasks, which is not true for all applications.
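The Expander's decompose / aggregate cycle can be sketched by splitting one image into row bands, computing the bands in parallel, and copying each band back into place. This is an in-process analogy under stated assumptions: a thread pool stands in for the remote agents, the range is fixed to the one on the earlier slide, and `ExpanderSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ExpanderSketch {

    static int escapeTime(double cx, double cy, int maxIter) {
        double zx = 0.0, zy = 0.0;
        int i = 0;
        while (i < maxIter && zx * zx + zy * zy <= 4.0) {
            double next = zx * zx - zy * zy + cx;
            zy = 2.0 * zx * zy + cy;
            zx = next;
            i++;
        }
        return i;
    }

    /** One "Task": rows [rowStart, rowEnd) of the range (-1.5, -1.0) to (1.5, 1.0). */
    static int[] computeRows(int rowStart, int rowEnd, int width, int height, int maxIter) {
        int[] band = new int[(rowEnd - rowStart) * width];
        for (int row = rowStart; row < rowEnd; row++) {
            double cy = -1.0 + 2.0 * (row + 0.5) / height;
            for (int col = 0; col < width; col++) {
                double cx = -1.5 + 3.0 * (col + 0.5) / width;
                band[(row - rowStart) * width + col] = escapeTime(cx, cy, maxIter);
            }
        }
        return band;
    }

    /** Decomposes the job into taskCount row bands, runs them in parallel, and reassembles. */
    static int[] computeParallel(int width, int height, int maxIter, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(taskCount);
        try {
            List<Future<int[]>> pending = new ArrayList<>();
            for (int t = 0; t < taskCount; t++) {
                final int rowStart = height * t / taskCount;
                final int rowEnd = height * (t + 1) / taskCount;
                pending.add(pool.submit(() -> computeRows(rowStart, rowEnd, width, height, maxIter)));
            }
            // Aggregation: the job is complete only once every task has reported back.
            int[] image = new int[width * height];
            for (int t = 0; t < taskCount; t++) {
                int[] band = pending.get(t).get();
                System.arraycopy(band, 0, image, (height * t / taskCount) * width, band.length);
            }
            return image;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("task failed or was interrupted", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("pixels: " + computeParallel(64, 48, 100, 4).length);
    }
}
```

The row-band split works here because every pixel is independent; a job with dependencies between regions could not be decomposed this simply, which is the caveat in the last bullet above.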

Design 10: Support Dispatch Policies
• All the prior designs featured hard-coded behaviors:
  • Hard-coded or parameterized list of “slave” agents
  • Simple allocation rule: allocate to next available slave
• As an enhancement, we could modify our plugins to support more complex, policy-based behaviors.
• Example Policies:
  • Time out calculations and re-allocate to an alternate slave
  • Send same task to multiple slaves, to reduce latency and add fail-over
  • Send multiple outstanding subtasks per slave, to reduce network latency effects (i.e. keep working while the results are being sent on the wire)
  • Allocate according to slave host metadata (e.g. CPU speed, network latency, scheduling relative to other work)
• All of the above examples illustrate QoS adaptation

Design Point: Hard-coded Behavior vs. Policies
• Key design points:
  • Design 9 (and prior):
    • Summary: Behavior is hard-coded or only supports trivial parameterization.
    • Benefit: Good enough for most applications.
    • Downside: Inflexible behavior.
  • Design 10:
    • Summary: Add plugin behavior options controlled through policies
    • Benefit: Pluggable / adaptive behavior
    • Downside: More complex to implement.
• The introduction of policies and behavior options allows for a “smarter” application.
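One way to make dispatch behavior pluggable is to hide the “which worker next?” decision behind a small interface. The sketch below is hypothetical throughout: `DispatchPolicy`, `WorkerInfo`, and the two example policies are illustrative names, and the metadata fields are assumed rather than taken from Cougaar.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical per-worker metadata; the fields are assumptions for illustration. */
final class WorkerInfo {
    final String name;
    final double cpuGhz;
    final int outstandingTasks;

    WorkerInfo(String name, double cpuGhz, int outstandingTasks) {
        this.name = name;
        this.cpuGhz = cpuGhz;
        this.outstandingTasks = outstandingTasks;
    }
}

/** Hypothetical policy API: given the registered workers, pick one for the next task. */
interface DispatchPolicy {
    WorkerInfo choose(List<WorkerInfo> workers);
}

final class DispatchPolicySketch {

    /** Prefers the worker with the fewest outstanding tasks. */
    static final DispatchPolicy LEAST_LOADED = workers -> workers.stream()
            .min(Comparator.comparingInt((WorkerInfo w) -> w.outstandingTasks))
            .orElseThrow(() -> new IllegalArgumentException("no workers registered"));

    /** Prefers the worker whose host reports the fastest CPU. */
    static final DispatchPolicy FASTEST_CPU = workers -> workers.stream()
            .max(Comparator.comparingDouble((WorkerInfo w) -> w.cpuGhz))
            .orElseThrow(() -> new IllegalArgumentException("no workers registered"));

    public static void main(String[] args) {
        List<WorkerInfo> workers = Arrays.asList(
                new WorkerInfo("Slave1", 2.0, 3),
                new WorkerInfo("Slave2", 3.0, 5));
        System.out.println("least loaded: " + LEAST_LOADED.choose(workers).name);
        System.out.println("fastest CPU:  " + FASTEST_CPU.choose(workers).name);
    }
}
```

A dispatcher written against `DispatchPolicy` would not need to change when a new policy (timeouts, redundant sends, and so on) is added; only the policy object is swapped.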

Conclusions

Design Analysis Summary

Conclusions
• The prior slides showed many different ways to build the same application but with different system properties
  • The first couple of designs are relatively simple
  • Subsequent designs support parallelism but are more complex
  • Each design is valid and ideal in certain environments
• Each “split” of code/data introduces design complexity:
  • Splitting a plugin into multiple plugins requires data coordination between the plugins, requiring:
    • Coordination API (either a service or blackboard pub/sub)
    • Data structures (must be internally synchronized)
  • Splitting data across agents requires data partitioning and transfer code
    • Must decide which data resides on which agent(s)
    • Must transfer the data, typically via the blackboard (e.g. Relays)

Conclusions (2)
• Service-based APIs are useful in limited cases
  • Ideal for wrapping simple libraries (e.g. log4j)
  • Should be non-blocking and not require blackboard access
    • Don’t block pooled threads
    • Requires a thread switch, otherwise you’ll get blackboard “nested transaction” problems
    • See the “todo” pattern and other (awkward) workarounds
• In contrast, blackboard interactions are non-blocking
  • This is good in that it switches threads and avoids blocking the plugin when performing remote I/O, which increases parallelism
  • It’s bad in that the plugin code must support an asynchronous call and subsequent “execute()” method resume when the result is published
    • The result is sometimes added “bookkeeping” state in the plugin, to remember where prior async calls left off. This is effectively a “continuation”.
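The “bookkeeping” state mentioned above can be pictured as a map from job identifier to whatever the plugin needs in order to resume later. The sketch below is a plain-Java illustration: `AsyncRequestorSketch` and `ToyResult` are hypothetical names, and the `execute(..)` method only imitates the shape of a subscription callback rather than using Cougaar's plugin API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical Result object carrying the shared "unique job identifier". */
final class ToyResult {
    final int jobId;
    final byte[] data;

    ToyResult(int jobId, byte[] data) {
        this.jobId = jobId;
        this.data = data;
    }
}

final class AsyncRequestorSketch {
    // Bookkeeping: what to do with each outstanding job once its result arrives.
    // A blocking design would keep this "for free" in a local variable on the stack.
    private final Map<Integer, String> pendingOutputPaths = new HashMap<>();
    private int nextJobId = 0;

    /** Asynchronous "call": record the continuation state and return immediately. */
    int submit(String outputPath) {
        int jobId = nextJobId++;
        pendingOutputPaths.put(jobId, outputPath);
        return jobId;
    }

    /** Resume point: invoked later with newly added results. Returns a log of actions. */
    List<String> execute(List<ToyResult> addedResults) {
        List<String> log = new ArrayList<>();
        for (ToyResult result : addedResults) {
            String path = pendingOutputPaths.remove(result.jobId);
            if (path == null) {
                continue; // not one of ours, or already handled
            }
            log.add("write " + result.data.length + " bytes to " + path);
        }
        return log;
    }

    int pendingCount() {
        return pendingOutputPaths.size();
    }

    public static void main(String[] args) {
        AsyncRequestorSketch requestor = new AsyncRequestorSketch();
        int jobId = requestor.submit("/tmp/out.jpg");
        List<ToyResult> added = Collections.singletonList(new ToyResult(jobId, new byte[12]));
        System.out.println(requestor.execute(added));
    }
}
```

The map plays the role of the continuation: it is the only place that remembers what the plugin intended to do when it issued the request.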
