Grids Challenged by a Web 2.0 and Multicore Sandwich. CCGrid 2007 Windsor Barra Hotel Rio de Janeiro Brazil May 15 2007 Geoffrey Fox Computer Science, Informatics, Physics Pervasive Technology Laboratories Indiana University Bloomington IN 47401 firstname.lastname@example.org http://www.infomall.org.
‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’ – John Taylor, Director General of Research Councils UK, Office of Science and Technology, who coined the term
e-Science is about developing tools and technologies that allow scientists to do ‘faster, better or different’ research
Similarly e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world.
This generalizes to e-moreorlessanything
A deluge of data of unprecedented and inevitable size must be managed and understood.
People (see Web 2.0), computers, data and instruments must be linked.
On demand assignment of experts, computers, networks and storage resources must be supported
Supports distributed science – data, people, computers
Exploits Internet technology (Web2.0) adding (via Grid technology) management, security, supercomputers etc.
Computing has two aspects: parallel – low latency (microseconds) between nodes – and distributed – highish latency (milliseconds) between nodes
Parallel needed to get high performance on individual 3D simulations, data analysis etc.; must decompose problem
Distributed aspect integrates already distinct components
Cyberinfrastructure is in general a distributed collection of parallel systems
Cyberinfrastructure is made of services (often Web services) that are “just” programs or data sources packaged for distributed access
RMS: Recognition Mining Synthesis – Intel has probably the most sophisticated analysis of future “killer” multicore applications:
Recognition (“What is …?”): find a model
Mining (“Is it …?”): match the model against real-time streaming and static structured datasets, moving toward real-time analytics on …
Synthesis (“What if …?”): create a model, today with very limited realism
What is a tumor?
Is there a tumor here?
What if the tumor progresses?
It is all about dealing efficiently with complex multimodal datasets
Images courtesy: http://splweb.bwh.harvard.edu:8000/pages/images_movies.html
Totally independent or nearly so (B C E F) – This used to be called embarrassingly parallel and is now pleasingly so
This is the preserve of the job-scheduling community, and one gets efficiency by statistical mechanisms with (fair) assignment of jobs to cores
“Parameter searches” generate this class, but they are often not the optimal way to find the “best” parameters
“Multiple users” of a server form an important class of this type
No significant synchronization and/or communication latency constraints
Loosely coupled (D) is a “metaproblem” with several components orchestrated by pipeline, dataflow, or other not-very-tight constraints
This is the preserve of Grid workflow or mashups
Synchronization and/or communication latencies in millisecond to second or more range
Tightly coupled (A) is the classic parallel computing program, with components synchronizing often and under tight timing constraints
Synchronization and/or communication latencies around a microsecond
At a very high level, there are three broad classes of parallelism
Coarse grain functional parallelism typified by workflow and often used to build composite “metaproblems” whose parts are also parallel
This area has several good solutions getting better
Pleasingly parallel applications can be considered special cases of functional parallelism
Large Scale loosely synchronous data parallelism where dynamic irregular work has clear synchronization points as in most large scale scientific and engineering problems
Fine grain thread parallelism as used in search algorithms which are often data parallel (over choices) but don’t have universal synchronization points
Discrete Event Simulations are either a fourth class or a variant of thread parallelism
This is a dataflow model between services where services can do useful document oriented data parallel applications including reductions
The decomposition of services onto cluster engines is automated
The large I/O requirements of datasets change the efficiency analysis in favor of dataflow
Services (count words in the example) can obviously be extended to general parallel applications
There are many language alternatives for expressing dataflow and/or parallel operations, and indeed one should support multiple languages in the spirit of services
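The count-words service mentioned above is a classic data-parallel map followed by a reduction; a minimal sketch of that pattern in Java (class and method names are hypothetical, not the deck's actual service code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the count-words dataflow: each "service" counts
// words in its own document partition (the data-parallel map), and the
// partial counts are then merged (the reduction).
public class WordCountDataflow {

    // One service instance: count words in a single document partition.
    static Map<String, Integer> countPartition(String partition) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : partition.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Reduction: merge the partial counts from all partitions.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, n) -> total.merge(word, n, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> partials = new ArrayList<>();
        partials.add(countPartition("the grid and the web"));
        partials.add(countPartition("the multicore web"));
        Map<String, Integer> total = reduce(partials);
        System.out.println(total.get("the")); // 3
        System.out.println(total.get("web")); // 2
    }
}
```

In a real deployment each countPartition call would run as a separate service on a cluster node, with the reduction expressed as a dataflow edge between services.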
Workflow Tools are reviewed by Gannon and Fox http://grids.ucs.indiana.edu/ptliupages/publications/Workflow-overview.pdf
Both include scripting in PHP, Python, sh etc., as both implement distributed programming at the level of services
Mashups use all types of service interfaces and do not have the potential robustness (security) of Grid service approach
Mashups v. Workflow?
Typically “pure” HTTP (REST)
[Chart: APIs and Mashups per protocol distribution]
Display too large to be a Gadget
Searched on Transit/Transportation
Google Maps Server
Marion County Map Server
Hamilton County Map Server
Cass County Map Server
(OGC Web Map Server)
Must provide adapters for each Map Server type.
Browser client fetches image tiles for the bounding box using Google Map API.
Tile Server requests map tiles at all zoom levels with all layers. These are converted to uniform projection, indexed, and stored. Overlapping images are combined.
The cache server fulfills Google Map calls with cached tiles that fill the requested bounding box.
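Serving cached tiles for a bounding box requires mapping coordinates to tile indices at each zoom level; a minimal sketch assuming the standard Web Mercator tile scheme used by Google Maps (the actual Tile/Cache Server indexing in this Grid may differ):

```java
// Sketch of Web Mercator tile indexing (the scheme Google Maps uses);
// assumed here for illustration only.
public class TileIndex {

    // Tile column for a longitude at a zoom level: 2^zoom tiles span 360 degrees.
    static int tileX(double lonDeg, int zoom) {
        return (int) Math.floor((lonDeg + 180.0) / 360.0 * (1 << zoom));
    }

    // Tile row for a latitude at a zoom level (Mercator projection).
    static int tileY(double latDeg, int zoom) {
        double latRad = Math.toRadians(latDeg);
        double y = (1.0 - Math.log(Math.tan(latRad) + 1.0 / Math.cos(latRad)) / Math.PI) / 2.0;
        return (int) Math.floor(y * (1 << zoom));
    }

    public static void main(String[] args) {
        // Tiles covering a bounding box: enumerate columns tileX(west)..tileX(east)
        // and rows tileY(north)..tileY(south) at the requested zoom.
        int zoom = 10;
        // Indianapolis-area coordinates (illustrative).
        System.out.println(tileX(-86.16, zoom) + "," + tileY(39.77, zoom));
    }
}
```

The cache server can then key stored tiles by (zoom, x, y) and answer any bounding-box request by enumerating the covered index range.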
Google Map API
A “Grid” Workflow
(built in Java!)
Uses Google Maps clients and server and non Google map APIs
GIS Grid of the “Indiana Map” and ~10 Indiana counties with accessible Map (Feature) Servers from different vendors. Grids federate different data repositories (cf. the Astronomy VO federating different observatory collections)
The Portal is built from portlets – providing user interface fragments for each service that are composed into the full interface – and uses OGCE technology, as does the planetary science VLAB portal with the University of Minnesota
Now to Portals
Start Page technology – see http://blogs.zdnet.com/Hinchcliffe/?p=8
Typical Google Gadget Structure
Portlets build User Interfaces by combining fragments in a standalone Java Server
Supports exchange of messages between threads using named ports
FromHandler: Spawn threads without reading ports
Receive: Each handler reads one item from a single port
MultipleItemReceive: Each handler reads a prescribed number of items of a given type from a given port. Note items in a port can be general structures but all must have same type.
MultiplePortReceive: Each handler reads one item of a given type from multiple ports.
JoinedReceive: Each handler reads one item from each of two ports. The items can be of different type.
Choice: Execute a choice of two or more port-handler pairings
Interleave: Consists of a set of arbiters (port–handler pairs) of 3 types: Concurrent, Exclusive, or Teardown (called at the end for clean-up). Concurrent arbiters are run concurrently, but Exclusive handlers are serialized so that only one runs at a time
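These port–handler primitives come from a CCR-style messaging runtime; a rough Java analogue of the basic Receive pattern, using a blocking queue as the “port” (an illustrative approximation, not the actual CCR API):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Rough Java analogue of a CCR-style port: handlers are attached to a
// named port, and each Receive consumes exactly one posted item.
public class Port<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

    // Post an item to the port (non-blocking).
    public void post(T item) {
        queue.add(item);
    }

    // Receive: spawn a thread whose handler reads one item from this port,
    // approximating CCR's one-shot Receive arbiter.
    public Thread receive(Consumer<T> handler) {
        Thread t = new Thread(() -> {
            try {
                handler.accept(queue.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Port<String> port = new Port<>();
        Thread t = port.receive(msg -> System.out.println("handled: " + msg));
        port.post("hello");
        t.join(); // prints "handled: hello"
    }
}
```

JoinedReceive and MultipleItemReceive would similarly take() from two ports, or take() a prescribed number of items, before invoking the handler.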
Rendezvous exchange customized for MPI
Overhead (latency) of an AMD 4-core PC with 4 execution threads on MPI-style rendezvous messaging, for Shift and Exchange implemented either as two shifts or as a custom CCR pattern. Compute time is 10 seconds divided by the number of stages
Rendezvous exchange customized for MPI
Overhead (latency) of an Intel 8-core PC with 8 execution threads on MPI-style rendezvous messaging, for Shift and Exchange implemented either as two shifts or as a custom CCR pattern. Compute time is 15 seconds divided by the number of stages
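In such measurements the per-stage messaging overhead can be inferred by subtracting the fixed compute time from the measured total and dividing by the stage count; a minimal sketch of that arithmetic (the values below are illustrative, not the deck's results):

```java
// Sketch of the overhead (latency) calculation behind these plots:
// total compute is fixed (e.g. 15 seconds) and split evenly across stages,
// so any time beyond it, divided by the stage count, is the per-stage
// messaging overhead.
public class RendezvousOverhead {

    // Returns the messaging overhead per stage, in seconds.
    static double overheadPerStage(double measuredSeconds,
                                   double computeSeconds,
                                   int stages) {
        return (measuredSeconds - computeSeconds) / stages;
    }

    public static void main(String[] args) {
        // Illustrative: 15 s of compute spread over 15000 stages, 15.3 s
        // measured, implies 20 microseconds of overhead per stage.
        System.out.println(overheadPerStage(15.3, 15.0, 15000));
    }
}
```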
CGL measurements of Axis 2 show about 500 microseconds – DSS is 10 times better
DSS Service Measurements