JAS – Distributed Data Analysis

Presentation Transcript

  1. Grid Enabled Analysis Workshop, Caltech, June 23-25, 2003. JAS – Distributed Data Analysis

  2. Contents
     • JAS2
       • History
       • Client-server mode
       • JAS2 and the Grid
     • JAS3
       • What’s new
       • JAS3 and AIDA
       • Plans for Gridification

  3. JAS History
     • First version of JAS2 released in 2000.
     • Incremental improvements released over time.

  4. JAS2 History – Use Cases
     • With WIRED event display
     • Online Monitoring

  5. JAS2 History – Use Cases
     • Custom Applications
     • Web Servlets

  6. JAS Client-Server Mode
     Diagram labels: Data Analysis Engine; User’s Java Code; DATA; Padded Cell; GUI; Experiment Extensions (Event Display); Java Compiler + Debugger

  7. Distributed Analysis System: Goals
     • Prototype for GRID enabled JAS analysis
     • Run analysis on a farm of machines
       • Use multiple CPUs in parallel for CPU-intensive analysis
       • Access multiple I/O channels for data-intensive analysis
     • Use standard JAS (Client) as if we are running a local job
       • Get interactive feedback
       • Create analysis modules (code)
       • Control job execution
       • View results (Plots/Histograms)
     • Access distributed datasets as if they were local datasets

  8. Distributed Analysis System: Architecture
     Diagram labels, in extraction order: JAS DataServer (several); CatalogServer; Network; ControlServer (several); Network; JAS Client (several); Users
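
The goal of running one analysis across a farm of machines and viewing a single result can be illustrated with a small sketch. This is not JAS code: `DistributedFillSketch`, `PartialHistogram`, and the fixed 4-bin range are hypothetical names and values chosen for illustration, and threads in a local pool stand in for data servers. Each worker fills a histogram from its own partition of the data, and the client merges the partial histograms bin by bin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DistributedFillSketch {
    // Each partition stands in for the data held by one data server (assumption).
    // The list must not be empty, because the pool gets one thread per partition.
    static PartialHistogram analyze(List<double[]> partitions) {
        ExecutorService farm = Executors.newFixedThreadPool(partitions.size());
        try {
            List<Future<PartialHistogram>> jobs = new ArrayList<>();
            for (double[] part : partitions) {
                // The "analysis module": fill a local histogram from local data.
                jobs.add(farm.submit(() -> {
                    PartialHistogram h = new PartialHistogram(4, 0.0, 4.0);
                    for (double x : part) h.fill(x);
                    return h;
                }));
            }
            // The client merges the partial results into one histogram.
            PartialHistogram merged = new PartialHistogram(4, 0.0, 4.0);
            for (Future<PartialHistogram> job : jobs) merged.add(job.get());
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            farm.shutdown();
        }
    }

    public static void main(String[] args) {
        PartialHistogram h = analyze(List.of(
            new double[] {0.5, 1.5, 1.7},
            new double[] {2.2, 3.9},
            new double[] {0.1, 5.0}));       // 5.0 is out of range and ignored
        System.out.println(Arrays.toString(h.counts)); // prints [2, 2, 1, 1]
    }
}

// Hypothetical fixed-binning histogram; not the JAS or AIDA histogram class.
class PartialHistogram {
    final int[] counts;
    final double min, max;

    PartialHistogram(int bins, double min, double max) {
        this.counts = new int[bins];
        this.min = min;
        this.max = max;
    }

    // Count one value; values outside [min, max) are ignored in this sketch.
    void fill(double x) {
        if (x < min || x >= max) return;
        int bin = (int) ((x - min) / (max - min) * counts.length);
        counts[Math.min(bin, counts.length - 1)]++;
    }

    // Add another histogram with identical binning, bin by bin.
    void add(PartialHistogram other) {
        for (int i = 0; i < counts.length; i++) counts[i] += other.counts[i];
    }

    int entries() {
        int n = 0;
        for (int c : counts) n += c;
        return n;
    }
}
```

Merging works here because every worker uses the same binning, so the result does not depend on how the data was partitioned.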

  9. JAS 2 – GRID interface (Tech-X)

  10. JAS3 Overview
      • A completely new version of JAS
      • Design based on Application Shell, into which many (optional) modules can be plugged
      • Highly customizable for different application domains
        • HEP/Astrophysics/Other
        • DST analysis/Online Monitoring/GRID analysis
        • Experiment/User specific modules
      • Modules can be updated independently of shell
        • Possible to release bug fixes fast
      • Includes support for programming in many languages
        • Scripting: Python, Pnuts, Dynamic Java, …
        • Command prompt
        • Java (compiled)
      • Analysis (histograms, tuples, fitting) based on AIDA standard
      • Not technically backwards compatible with JAS2
        • But migration is straightforward.
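
The idea of a shell whose modules can be updated independently can be sketched in a few lines. The slides do not show JAS3's real module API, so everything here is an assumption made for illustration: `ShellPlugin`, `SimplePlugin`, `ApplicationShell`, and the plug-in names and version strings are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ShellSketch {
    public static void main(String[] args) {
        ApplicationShell shell = new ApplicationShell();
        shell.install(new SimplePlugin("aida", "1.0"));
        shell.install(new SimplePlugin("pnuts-scripting", "1.0"));
        // Shipping a bug fix: replace one plug-in; the shell and the
        // other plug-ins are left untouched.
        shell.install(new SimplePlugin("aida", "1.0.1"));
        System.out.print(shell.describe());
    }
}

// Hypothetical plug-in contract.
interface ShellPlugin {
    String name();
    String version();
}

class SimplePlugin implements ShellPlugin {
    private final String name;
    private final String version;

    SimplePlugin(String name, String version) {
        this.name = name;
        this.version = version;
    }

    public String name() { return name; }
    public String version() { return version; }
}

// The shell only knows the plug-in contract, never a concrete plug-in class.
class ApplicationShell {
    private final Map<String, ShellPlugin> plugins = new LinkedHashMap<>();

    // Installing under an existing name replaces only that plug-in.
    void install(ShellPlugin p) { plugins.put(p.name(), p); }

    // Plug-ins are optional, so they can also be removed again.
    boolean remove(String name) { return plugins.remove(name) != null; }

    String versionOf(String name) {
        ShellPlugin p = plugins.get(name);
        return p == null ? null : p.version();
    }

    int pluginCount() { return plugins.size(); }

    String describe() {
        StringBuilder sb = new StringBuilder();
        for (ShellPlugin p : plugins.values()) {
            sb.append(p.name()).append(' ').append(p.version()).append('\n');
        }
        return sb.toString();
    }
}
```

Keying the registry by plug-in name is one simple way to make an update a replacement of a single entry.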

  11. AIDA Overview
      • AIDA = Abstract Interfaces for Data Analysis
      • Covers key areas for data analysis
        • Histograms, Tuples, Fitting, Data Points, Plotting, Management
      • Developed collaboratively at a series of workshops by groups at CERN, LAL, SLAC.
        • Next workshop June 30-July 4, at CERN
      • Interfaces developed for C++ and Java (and maybe Python?)
      • Several implementations/tools available
        • Anaphe/Lizard/LCG PI – CERN
        • Open Scientist – LAL
        • JAIDA/JAS/AIDAJNI – SLAC

  12. JAS3 and AIDA
      • JAS3 has adopted AIDA for analysis
        • AIDA allows us to leverage experience and skill of other developers
        • AIDA is functionally more complete than JAS2 analysis package
        • AIDA allows JAS to exchange data with other AIDA tools
        • AIDA provides bridge to C++ programs (e.g. Geant4)
        • AIDA encourages creativity and innovation
      • JAS3 HEP Analysis tools based on JAIDA
        • JAIDA = Java implementation of AIDA
        • JAIDA is part of FreeHEP library
        • Usable as standalone library for any Java Application
      • AIDAJNI = Interface between C++ and Java AIDA
        • Allows C++ programs to use JAIDA, JAS3

  13. JAS3, AIDA and C++
      Diagram labels: C++ program; AIDA-JNI; Java program; AIDA; C++ AIDA Implementation; .aida file (XML); JAIDA; JAS3

  14. JAS3 and AIDA
      • JAS3 supports all AIDA functionality, including
        • Histograms (includes arithmetic, projections, etc.)
        • Clouds (unbinned histograms, scatterplots)
        • Plotter
        • Tuples
        • Fitting – AIDA interfaces allow for multiple fitters
          • Uncmin: pure Java minimizer
          • Minuit: Fortran, called via the Java Native Interface (JNI)
        • IO
          • AIDA XML, PAW, ROOT
      • JAS3 supports user interaction with AIDA in three ways
        • Scripting (Pnuts, Python etc.)
        • Compiled (Java) code
        • GUI – Plotting, Fitting, Cuts etc.
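
The difference between a cloud (unbinned) and a histogram (binned) can be shown with a small sketch. It does not use the AIDA interfaces: `Cloud1DSketch` and its methods are hypothetical, plain-Java stand-ins. The cloud keeps every filled value, so the binning can be chosen after the data has been seen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CloudDemo {
    public static void main(String[] args) {
        Cloud1DSketch cloud = new Cloud1DSketch();
        for (double x : new double[] {1, 2, 3, 4, 10}) cloud.fill(x);
        System.out.println(cloud.mean());                          // prints 4.0
        System.out.println(Arrays.toString(cloud.toHistogram(3))); // prints [3, 1, 1]
    }
}

// Hypothetical stand-in for an unbinned "cloud"; not the AIDA cloud interface.
class Cloud1DSketch {
    private final List<Double> values = new ArrayList<>();

    // Store the value itself instead of incrementing a bin.
    void fill(double x) { values.add(x); }

    int entries() { return values.size(); }

    // Statistics are computed from the raw values, so they carry no binning error.
    double mean() {
        double sum = 0;
        for (double x : values) sum += x;
        return values.isEmpty() ? Double.NaN : sum / values.size();
    }

    // Bin the stored values after the fact, taking the range from the data itself.
    int[] toHistogram(int bins) {
        int[] counts = new int[bins];
        if (values.isEmpty()) return counts;
        double lo = Collections.min(values);
        double hi = Collections.max(values);
        double width = (hi - lo) / bins;
        for (double x : values) {
            int bin = width == 0 ? 0 : (int) ((x - lo) / width);
            counts[Math.min(bin, bins - 1)]++; // the maximum lands in the last bin
        }
        return counts;
    }
}
```

The cost of this flexibility is memory: the sketch stores one value per entry, whereas a histogram stores one counter per bin.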

  15. JAS3 Scripting
      • JAS3 has multi-language OO scripting support
        • Command line, Console, Editor
        • Major components (e.g. AIDA) have scripting interfaces
      • Currently have plugins to support
        • Pnuts – syntax almost identical to Java, fast, well documented and feature complete
        • Python (using Jython)
      • More scripting languages can be added
        • Not restricted to Java implementations (e.g. could use C-Python, JPE)

  16. JAS3 Lightning Tour
      • Tour designed to give you an overview of the capabilities of JAS3; you can try them out for yourself this afternoon.
      • Welcome Page gives initial info and links to example scripts and programs
      • Memory monitor

  17. Opening Files
      • Use file menu
      • Drag from explorer

  18. Graphical Interface to AIDA
      • Histograms, Clouds, Tuples all presented in AIDA tree
      • .aida files, .hbook files, .root files all presented as AIDA objects
      • Drag items onto page, or use (popup) menus

  19. Printing
      • Can send individual plots or full page direct to printer
      • Or save as PS, EPS, PDF, SWF, SVG, PNG, GIF…
      • Or copy/paste into Word, PowerPoint etc.

  20. Java Editor, Compiler and Loader
      • Tree shows loaded programs
      • Built-in editor for writing analysis code
      • Built-in Java compiler
      • Unlike JAS2, which only supported “event analyzers”, JAS3 allows any Java program to be loaded. This example “main routine” is taken directly from the AIDA manual.

  21. Scripting
      • Can also write and run scripts
      • Console allows direct interaction with scripting language

  22. Pnuts Language
      • Currently support Pnuts scripting language
        • Complete and well documented
          • http://javacenter.sun.co.jp/pnuts/doc/guide.html
        • Fast (although not as fast as compiled Java)
        • Syntax very similar to Java
        • Can easily call compiled Java classes from scripts – best of both worlds
      • Plan to support other languages in future
        • In particular Python

  23. Record Sources
      • Opening record (or event) based files causes the run control toolbar to appear
      • Works similarly to JAS2 Job control, but now also supports random access and “tagged” data sets (mainly for event displays)

  24. Tuple Explorer - Plots
      • Plot types: Histogram, Profile, ScatterPlot, XY Data (more appropriate for smaller data sets)
      • Works with any tuple, read from file or dynamically created

  25. Tuple Explorer – Define Columns

  26. Tuple Explorer - Cuts

  27. Tuple Explorer - Tabulate

  28. Tuple Explorer – Record Source
      • To be used with record loop

  29. JAS3 Spreadsheet
      • Simple spreadsheet plugin for
        • Displaying results
        • Calculations
        • Simple Plots
      • Supports reading/writing
        • .csv files
        • Excel files
        • Cut/Paste with Excel etc.
      • Coming Soon…
        • Scripting interface
        • GUI for building plots
        • User defined functions
          • Java, scripting

  30. Miscellaneous Features
      • Save/Restore configuration
      • User Preferences
      • Plugin Manager

  31. Status
      • Current release: JAS3 version 0.7.1
        • AIDA functionality is quite solid
        • Compiler, Loader, Record Loop all quite recently added
        • Certainly still some rough edges
      • Documentation limited but available
        • Built-in example scripts and programs
        • Tutorial on web
      • If you are used to JAS2 you will find some functionality not yet ported to JAS3
        • Remote (client/server) access to data
        • 3D Lego/Surface plots

  32. JAS3 and the GRID
      • We plan to add client-server/distributed capabilities to JAS3 similar to those in JAS2
        • Will be based on (distributed) AIDA
          • Next AIDA workshop (at CERN next week) will discuss this
        • Want to use Grid standards where they exist
          • Work with others (PPDG-CS11,???) to define standards where they do not exist
        • Want to be compatible with C++ servers
      • Tech-X have submitted a phase II SBIR proposal; if it is approved, we will work closely together

  33. JAS3 Links, More Info
      • JAS – Java Analysis Studio – http://jas.freehep.org
      • JAS3 – http://jas.freehep.org/jas3
      • JAIDA – http://java.freehep.org/jaida/
      • AIDA – http://aida.freehep.org
      • FreeHEP – http://www.freehep.org
      • FreeHEP Java Libraries – http://java.freehep.org
      • WIRED – http://wired.freehep.org