
TechXHome.com

Interactive Data Analysis on the Grid with JAS and Globus

David Alexander, Brian Miller, & John Exby

Tech-X Corporation (www.techxhome.com)

Boulder, Colorado

Tony Johnson, Massimiliano Turri, & Booker Bense

Stanford Linear Accelerator Center

Menlo Park, California

Supported by U.S. Department of Energy

Small Business Innovation Research Grant DE-FG03-02ER83556

and Stanford Linear Accelerator Center


Project Overview

  • Started with Java Analysis Studio (JAS)

    • Has distributed analysis system based on RMI

  • Set up test grids on Linux clusters

    • Used Globus Toolkit 2.0

    • Each node had GRAM & GridFTP servers and Java Runtime Environment

  • Wrote a JAS grid plug-in

    • Used Java CoG Kit 0.9

  • Demonstrated at SC2002

    • Hit remote and on-site cluster


Java Analysis Studio (JAS), jas.freehep.org

  • Open source application

    • Built for interactive data analysis, but flexible & modularized

  • Publication quality plotting facilities

  • User writes Java code to analyze data


Java Analysis Studio (JAS), jas.freehep.org

  • Abstracted data source interface

    • Modules are written to work with a variety of file formats (PAW, HIPPO, AIDA, Root, ODBC, flat files, SIO, HEP)

  • Distributed System Available

  • Versatile & well used in high-energy physics

    • Pure Java (Portable, Web Start installation & upgrade)

    • Flexible topology (stand-alone, client/server, cluster)

    • Integration w/ BaBar, Geant4, Wired

Design Ideas & Added Features

Goal: clustered deployment, launch, & federation

Special JAS Job use

Minimal prerequisites:

Bare grid: Globus, Java, nothing else

Heterogeneous cluster

Off-grid (or not) client, data, codebase

Clients don’t need to be superusers

Optional background deployment

Single sign-on

About Resource Discovery

  • Resource discovery

    • Software needs location of data files

    • Software needs location of Java-enabled hosts

    • Pluggable LDIF source (MDS, URL of text file)

  • Community Authorization Service

    • Fine-grained access control

    • Is, in a way, a form of resource discovery
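
A minimal sketch of what a pluggable LDIF source could reduce to on the client side: read LDIF text (from an MDS query or from a text file at a URL), split it into entries, and pick out hosts that advertise Java. The attribute names `hn` and `javaHome` and the sample records are hypothetical placeholders, not the schema used by the plug-in. The parser handles only plain `attr: value` lines.

```python
def parse_ldif(text):
    """Parse simplified LDIF text into a list of {attribute: [values]} dicts.

    Only plain "attr: value" lines are handled; base64 values ("attr:: ...")
    and folded continuation lines are not supported in this sketch.
    """
    entries = []
    current = {}
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:                      # a blank line ends the current entry
            if current:
                entries.append(current)
                current = {}
            continue
        if line.startswith("#"):          # LDIF comment line
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue                      # not an "attr: value" line; skip it
        current.setdefault(name.strip(), []).append(value.strip())
    if current:
        entries.append(current)
    return entries


def java_hosts(entries, host_attr="hn", java_attr="javaHome"):
    """Return host names of entries that advertise a Java installation.

    Both attribute names are hypothetical placeholders.
    """
    return [e[host_attr][0] for e in entries
            if host_attr in e and java_attr in e]


SAMPLE = """\
# hypothetical records; the attribute names are made up for this sketch
dn: hn=node1.example.org, o=grid
hn: node1.example.org
javaHome: /usr/java

dn: hn=node2.example.org, o=grid
hn: node2.example.org
"""

print(java_hosts(parse_ldif(SAMPLE)))
```

Because the parser takes plain text, the same function could sit behind either source; only the code that fetches the text would differ.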


Move code to data with GridFTP

Location transparency

User sees data sets

Could also have user choice

Automatic deployment of JAS

Multi-threaded task set

Verify the code version; GridFTP the codebase to the node only if it is new

GridFTP or link the data into the user sandbox

Deploy control and catalog servers only on cluster head node

Worker nodes wait for catalog server to run
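
The version check in the deployment step can be sketched as a manifest comparison: the client knows the version of each codebase file, asks what a node already holds, and transfers only what differs. This is a hedged illustration, not the plug-in's actual logic. Using a SHA-256 digest as the "version", the manifest dictionaries, and the file names are all assumptions, and the GridFTP transfer itself is left out.

```python
import hashlib


def digest(data: bytes) -> str:
    """Checksum used here as the 'code version' (an assumption of this sketch)."""
    return hashlib.sha256(data).hexdigest()


def files_to_deploy(local_manifest, node_manifest):
    """Return, sorted, the files that are missing or different on the node."""
    return sorted(name for name, version in local_manifest.items()
                  if node_manifest.get(name) != version)


# Hypothetical file names and contents.
local = {"jas-server.jar": digest(b"build 2"), "plugin.jar": digest(b"build 1")}
node = {"jas-server.jar": digest(b"build 1"), "plugin.jar": digest(b"build 1")}

print(files_to_deploy(local, node))   # only the changed jar needs a transfer
```

A node with an empty manifest would receive every file, which covers the first deployment to a bare grid node.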

Launch Application with GlobusRun

Automatic launch of Java servers

Java Data Servers are run on specified JRE-enabled nodes

Special Grid Job is now started (exit the Wizard)

Code loaded into client or written in editor

    • Compiled

    • Automatically distributed to Java Data Servers

    • Results (stdout, stderr, & histograms) sent back
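
Launching a Java server through GRAM comes down to handing `globusrun` a job description written in RSL. The sketch below only builds such a string in the Globus Toolkit 2 style; it does not submit anything. The Java path, jar name, class name, and sandbox directory are hypothetical, and the quoting rule (double an embedded double quote) is assumed from the RSL 1.0 syntax rather than taken from the slides.

```python
def rsl_quote(value):
    """Quote an RSL value, doubling any embedded double quotes."""
    return '"' + str(value).replace('"', '""') + '"'


def build_rsl(executable, arguments=(), directory=None):
    """Build a GT2-style RSL job description as a single string."""
    relations = [("executable", [executable]), ("arguments", list(arguments))]
    if directory is not None:
        relations.append(("directory", [directory]))
    body = "".join(
        "({}={})".format(name, " ".join(rsl_quote(v) for v in values))
        for name, values in relations
        if values                         # omit relations that have no values
    )
    return "&" + body


# Hypothetical paths and class name, for illustration only.
rsl = build_rsl("/usr/bin/java",
                ["-cp", "jas-server.jar", "DataServer"],
                directory="/tmp/sandbox")
print(rsl)
```

The resulting string is what would be passed to `globusrun` (or to the equivalent GRAM call in the Java CoG Kit) together with the contact string of the target node.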

A Few More Impressive Features

User can stop analysis, change code, & restart.

Distributed debugging can catch individual node failures.

Histogram re-bin slider surprisingly responsive
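
The slides do not say how the re-bin slider is implemented. One common way to keep re-binning cheap, shown here purely as an assumption, is to fill the histogram once with fine bins and merge adjacent bins on demand, so that moving the slider never re-reads the data.

```python
def rebin(counts, factor):
    """Merge every `factor` adjacent bins; a short final group is kept as-is."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]


fine = [1, 4, 2, 0, 3, 5, 7, 2]   # hypothetical fine-binned counts
print(rebin(fine, 2))             # [5, 2, 8, 9]
print(rebin(fine, 4))             # [7, 17]
```

The total number of entries is preserved at every bin width, since each fine bin contributes to exactly one merged bin.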

Headaches and Issues

Versions of Globus vs. Java CoG Kit

CoG properties configuration

Client & server clocks disagree

MS-Windows text line breaks

Abandoned jobs

Firewalls

Future Ideas

Upgrade to Globus Toolkit 3

Pre-install code on cluster head or portal machine and deploy from there

Use more grid services (Condor, Replica)

Implement interfaces or service descriptions from PPDG CS-11 group.

Further Information on JAS

For the latest on JAS, see the 3 pm Category 9 paper:

JAS3 - A general purpose data analysis framework for HENP and beyond.

CONTACTS

David Alexander, [email protected]

Brian Miller, [email protected]

Tony Johnson, [email protected]

Massimiliano Turri, [email protected]

Java Analysis Studio, http://jas.freehep.org

