
Java Analysis Studio & Object Oriented Data Analysis (in Java)

Java Analysis Studio & Object Oriented Data Analysis (in Java). KEK, 25th May 2000. Tony Johnson - SLAC, tony_johnson@slac.stanford.edu. Contents: Overview of Java; Why Java for Data Analysis; Java Analysis Studio; Recently added features; Using Java for Reconstruction



Presentation Transcript


  1. Java Analysis Studio & Object Oriented Data Analysis (in Java) • KEK, 25th May 2000 • Tony Johnson - SLAC • tony_johnson@slac.stanford.edu

  2. Contents • Overview of Java • Why Java for Data Analysis • Java Analysis Studio • Recently added features • Using Java for Reconstruction • Linear Collider Simulation Framework • Is Java fast enough for Data Analysis? • HEP-wide java libraries • Conclusions • Demo

  3. History of Java • 1991: James Gosling at Sun creates Java language (née Oak) • Targeted at consumer electronics - cable top boxes, VCR, TV etc. • Goal was reliability not speed • 1994: HotJava Web browser written (in Java) • Supports Applets - downloadable programs that run inside web browser • Java licensed by Netscape, Oracle, Microsoft and many others • Huge hype surrounding “Web Programming language” • 1997: Java 1.1 released with many standard libraries • Sun’s mantra becomes “Write Once Run Anywhere” • Enthusiastically supported by all major hardware and many software vendors • Microsoft begins to have second thoughts • 1998: Java 2 released, even more standard libraries • Now truly general purpose language • Sun (and DOJ) sue Microsoft

  4. Java Architecture • [Diagram: Java Source code → Compiler → Java “Bytecodes” → JIT Compiler / Bytecode Interpreter → Machine Code on Mac, Unix, PC] • More than just a Web Tool • Java is a fully functional, platform independent, object-oriented language • Powerful set of machine independent libraries, including GUI library • Totally Buzzword Compliant • Simple, Object Oriented, Distributed, Dynamic, Robust, Secure, Architecture Neutral, Portable, High Performance, Multithreaded • Interpreted? • Compiled + Interpreted • Dynamic Optimization may make Java faster than statically compiled languages (in principle)

  5. Java Features • Simple • But not trivial…you need to read a book • Syntax very close to C++ • No backwards compatibility issues • Some features of C++ which add undue complexity dropped • Good stepping stone to (or from) C++ • Clean and Efficient Object-Oriented Language • Language features guide programmer toward reliable programming habits • Robust • Extensive Compile-Time checking of code • Second level of run-time checking of code • Memory management done by system, not by programmer • No pointers to mess up (Java uses references rather than pointers) • Chances of program running as designed without the need for time-consuming debugging are greatly increased

  6. Java Features (continued) • Highly Portable • Java works today on NT, Win95/98, Unix (including Linux), Mac, VMS • Personal Java - Windows CE, Palm Pilot • Programs written in Java are very portable • Move to another platform and it just works • Care needed with AWT GUI components (obsolete) and web browsers • Lifetime of HEP experiments > OS lifetime. • Lifetime of Java > Lifetime of HEP experiment?? • Encourages true modularity • Build entire framework for HEP experiment in Java • Abstract away underlying systems (batch system, IO system etc.)

  7. Java Features (continued) • Distributed • Built in support for Internet protocols, URL’s, HTTP, Remote Method Invocation, Corba, Database access etc. • Secure • Bytecode “verifier”, padded cell (c.f. Web Browser) • Multithreaded • Language has direct support for multithreading • Dynamic • Libraries can change without recompiling programs that use them • Can dynamically load and unload code during program execution • Can move objects across the network (agents), or store them in databases and retrieve them later.
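The "Dynamic" bullet above can be illustrated with a minimal sketch of loading code by name at run time. The `Processor` interface and `UpperCaseProcessor` class are hypothetical names invented for this example; only `Class.forName` and the reflective constructor call are standard Java, and the syntax is that of current Java rather than the JDK available in 2000.

```java
/** Hypothetical plug-in contract; the name is illustrative only. */
interface Processor {
    String process(String event);
}

/** An implementation that is selected by class name at run time. */
class UpperCaseProcessor implements Processor {
    public String process(String event) {
        return event.toUpperCase();
    }
}

class DynamicLoadingDemo {
    /** Loads a Processor by class name and instantiates it via its no-arg constructor. */
    static Processor load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return cls.asSubclass(Processor.class).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load processor " + className, e);
        }
    }

    public static void main(String[] args) {
        // The class name would normally come from a config file or the command line;
        // the default below only keeps the sketch self-contained.
        String name = args.length > 0 ? args[0] : UpperCaseProcessor.class.getName();
        System.out.println(load(name).process("event 42"));
    }
}
```

Because the host program only depends on the interface, a new implementation can be dropped onto the classpath and selected by name without recompiling the host.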

  8. Java Libraries and API’s • Standard Libraries and API’s • 2D + 3D graphics + GUI (Swing) + Imaging + Printing • Database connectivity (JDBC) + ODMG • Collections, IO (Serialization), Data Compression • Networking, Sockets, SSL, Corba, RMI • Java Beans (components), Help • Multimedia, Sound, Speech • Security, Code Signing, Cryptography • Math, Arbitrary Precision Math • Shared Data (Collaborative Applications) • Huge “Community-Ware” software archive • IBM alone has hundreds of Java resources on its Alphaworks site

  9. Java Tools • Popularity of Java = many tools • And they are cheap (or even free) • Development Environments (IDE’s) • Editor, Compiler, Debugger, WYSIWYG GUI designer, Source control • Automatic Documentation generators • Memory and CPU Optimizers • Since debugging time is minimal you might actually have time to use them • Object Modelers • Many commercial sets of components

  10. Java Limitations? • No operator overloading • Annoying for complex numbers, matrices, 3/4-vectors • Perhaps more often abused than sensibly used • Lightweight Objects (value semantics) may overcome this • Bugs sometimes slow to be fixed • Printing, Imaging bugs existed for >1 year • Perhaps “Community Source License” will help • Little control over Memory Allocation • Integration with C++ could be better • Standardization lacking • Sun had promised to submit Java to ISO for standardization, but has so far failed to deliver
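To make the operator-overloading point concrete, here is a small sketch of what 4-vector arithmetic looks like with named methods instead of operators. `FourVector` is an illustrative class written for this example, not a class from JAS or any HEP library.

```java
/** Minimal immutable four-vector (E, px, py, pz); illustrative only. */
final class FourVector {
    final double e, px, py, pz;

    FourVector(double e, double px, double py, double pz) {
        this.e = e;
        this.px = px;
        this.py = py;
        this.pz = pz;
    }

    /** Component-wise sum; Java has no operator+, so this is a named method. */
    FourVector plus(FourVector o) {
        return new FourVector(e + o.e, px + o.px, py + o.py, pz + o.pz);
    }

    /** Invariant mass squared, m^2 = E^2 - |p|^2 (natural units). */
    double mass2() {
        return e * e - (px * px + py * py + pz * pz);
    }

    /** Invariant mass; clamps to 0 if rounding drives m^2 slightly negative. */
    double mass() {
        return Math.sqrt(Math.max(0.0, mass2()));
    }
}

class FourVectorDemo {
    public static void main(String[] args) {
        // Two massless back-to-back particles of energy 5 each
        FourVector a = new FourVector(5, 0, 0, 5);
        FourVector b = new FourVector(5, 0, 0, -5);
        // With operator overloading this could read (a + b).mass();
        // in Java it becomes a method chain.
        System.out.println(a.plus(b).mass());
    }
}
```

For a single sum the difference is cosmetic; it becomes more noticeable in longer expressions such as `a.plus(b).plus(c).minus(d)`.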

  11. Why Java for HEP Computing? • Previous generation of experiments used Fortran + Data Management System (== Jazelle, Zebra, BOS) • Solves Three Problems • Ability to Represent Complex Data Structures • Persistence (i.e. read in and write out complex structures) • Run time access to named data in structures (for analysis) • Now time has marched on and modern experiments use C++ • Represent Complex Data • Persistence • Run time access to data • Still need to build (or buy and deploy) data management system (e.g. Root, Objectivity) • Java • Represent Complex Data • Persistence (serialization) • Run time access to data (reflection) • Support built in to language
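The two built-in facilities named on this slide, serialization and reflection, can be sketched as follows. `TrackRecord` and its fields are hypothetical and stand in for whatever event structure an experiment would define; the `java.io` and `java.lang.reflect` calls are standard, and the sketch uses current Java syntax.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;

/** Hypothetical event record, used only for illustration. */
class TrackRecord implements Serializable {
    private static final long serialVersionUID = 1L;
    int charge;
    double momentum;

    TrackRecord(int charge, double momentum) {
        this.charge = charge;
        this.momentum = momentum;
    }
}

class PersistenceDemo {
    /** Persistence: write the object to a byte buffer and read it back. */
    static TrackRecord roundTrip(TrackRecord in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream is =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TrackRecord) is.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    /** Run-time access to named data: look a field up by its name. */
    static Object get(Object record, String fieldName) {
        try {
            Field f = record.getClass().getDeclaredField(fieldName);
            f.setAccessible(true);
            return f.get(record);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("no readable field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        TrackRecord copy = roundTrip(new TrackRecord(-1, 45.6));
        System.out.println("charge   = " + get(copy, "charge"));
        System.out.println("momentum = " + get(copy, "momentum"));
    }
}
```

An interactive tool could use the same field lookup to let a user type a name such as `momentum` and plot it, without the tool knowing the record's class at compile time.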

  12. Where would HEP use Java? • GUI systems online + control (not really any alternative) • Event Display • Reconstruction+Simulation packages? • Data Analysis tasks • Offline • Online • Event Generators

  13. Java Analysis Studio Experiment independent analysis tools for High Energy Physics data

  14. Introduction to JAS • JAS starts from experience with SLD interactive data analysis • IDA (Toby Burnett) + SLD extensions • Integrates ideas from • Reason, Hippodraw, LHC++, Histoscope, … • Exploit advantages of Java • Cross platform, dynamic loading, GUI, many standard API’s – networking, HTML, etc. • Aim is to solve real life physicist problems • Want to get input from as many people as possible. • System is flexible enough to change.

  15. JAS Overview • Modular Java Toolkit for Analysis of HEP data • Data Format Independent • Experiment Independent • Supports arbitrarily complex analysis modules written in Java • Rich Graphical User Interface (GUI) with: • Data Explorer • Flexible Histogram + Scatterplot display • Histogram manipulation+fitting • Built-in Editor/Compiler (for writing analysis modules) • Extensible via plugins • User extensible via Object Orientated API's • Written entirely in Java so will run on any platform with a Java VM (JDK 1.1 or better) • Support: Windows 95/98/NT/2000 + Linux + Solaris • Works on: DEC + SGI + Mac

  16. JAS Components • [Diagram; labels: Histo/Plot Adaptor, Analysis Framework, Plugin, GUI Framework, Network Adapter, 3-4 Vector Utilities, Histogram Accumulation, JASHist (Plot Bean), Fitting Framework, Particle Properties, Data Interface, Jet Finder, Functions, Fitters, PAW, SQL, stdHEP]

  17. [Diagram; labels: Desktop Client, Local Data, Remote Data, Network Data Server, DIM, Data Access Classes; data sources: Oracle, Paw, Root, Jazelle, Flat File, Objectivity, Hippo] • Analyze local or remote data • User interface independent of Data Location • Does not assume fast network (works well at 28.8 kbps) • Analysis code moves (transparently) to data

  18. Remote Data Analysis • [Diagram; labels: Users Java Code, Java Compiler + Debugger, GUI, Experiment Extensions (Event Display), TCP/IP Network, Data Analysis Engine, Padded Cell, Experiment Interface, C++ Code; Data: Zebra, Jazelle, Paw, Root, Objectivity]

  19. Distributed Data Analysis • [Diagram; labels: Desktop Client, Network Data Server, Network Data Controller, Distributed Data, several Data Servers each with a DIM; data sources: Oracle, Paw, Root, Jazelle, Flat File, Objectivity, Hippo]

  20. Plot Display Package • 1-d/2-d Histogram/ScatterPlot Display • multiple axes, direct user interaction, overlays, fitting

  21. Java Analysis Studio GUI

  22. Example Analysis Code (TrackRecon)

  23. Demo

  24. New Features • Modular Plot Component • Can be used in other applications • GUI, servlets • Model-view-controller design • Supports many display styles: 1d, 2d, scatterplot, fitting, slices, user interaction • XML for data interchange with other apps • jEdit Editor • Full featured program editor • Syntax highlighting, indenting, bracket matching • Expect to be able to integrate advanced features • Debugging, auto-completion

  25. New Features – HTML support

  26. New Features – WIRED Plugin

  27. New Features – AIDA support • [Diagram; labels: Java, AIDA, AIDA, C++ Program, JNI, JAS] • AIDA is an attempt to standardize the HEP histogram interface • Abstract interface • C++ and Java supported • Multiple implementations • JAS now supports AIDA interface • Now possible to create JAS histograms from C++
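The idea of an abstract histogram interface with multiple implementations can be sketched as below. This is explicitly not the AIDA API: `Histogram1D`, `FixedBinHistogram` and their method names are hypothetical and chosen only to show how client code can depend on an interface rather than on a concrete histogram class.

```java
/** Hypothetical abstract histogram interface; NOT the real AIDA API. */
interface Histogram1D {
    void fill(double x);
    int bins();
    int binEntries(int index);
    int entries();
}

/** One possible implementation: fixed-width bins over [min, max). */
class FixedBinHistogram implements Histogram1D {
    private final int[] counts;
    private final double min;
    private final double max;
    private int inRange;

    FixedBinHistogram(int nBins, double min, double max) {
        if (nBins <= 0 || !(max > min)) {
            throw new IllegalArgumentException("need nBins > 0 and max > min");
        }
        this.counts = new int[nBins];
        this.min = min;
        this.max = max;
    }

    @Override
    public void fill(double x) {
        // Values outside [min, max) and NaN are ignored in this sketch
        if (!(x >= min && x < max)) {
            return;
        }
        int i = (int) ((x - min) / (max - min) * counts.length);
        counts[Math.min(i, counts.length - 1)]++; // guard against rounding at the top edge
        inRange++;
    }

    @Override public int bins() { return counts.length; }
    @Override public int binEntries(int index) { return counts[index]; }
    @Override public int entries() { return inRange; }
}

class HistogramDemo {
    /** Client code sees only the interface, so implementations can be swapped. */
    static void fillAll(Histogram1D h, double[] values) {
        for (double v : values) {
            h.fill(v);
        }
    }

    public static void main(String[] args) {
        Histogram1D h = new FixedBinHistogram(4, 0.0, 4.0);
        fillAll(h, new double[] {0.5, 1.5, 1.7, 3.9, 4.0, -1.0});
        for (int i = 0; i < h.bins(); i++) {
            System.out.println("bin " + i + ": " + h.binEntries(i));
        }
        System.out.println("in-range entries: " + h.entries());
    }
}
```

A second implementation, for example one that forwards `fill` calls to a plotting package, could be passed to `fillAll` unchanged, which is the property an interface standard of this kind aims for.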

  28. New Features – G4 interface

  29. Future Features - 3D Support

  30. Usage • Babar using for Online Monitoring • Using Online Monitoring API • HTML Pages with embedded plots • Custom Overlays • US Linear Collider Studies • Have an entire recon+analysis package written in Java • Using JAS as analysis interface • Making use of remote data access using repository at University of Pennsylvania • CLEO • Using plot bean for online displays • Other smaller scale users • All giving very valuable feedback • Helping to produce more reliable solution

  31. Open Source – Anyone can Contribute! • All source code now stored in CVS • Use any CVS client for anonymous (read-only) access • We recommend jCVS (pure Java CVS client) • Source code all web browsable • Implemented using jCVS servlet • Write access can be given to interested developers • Intend to put entire code under LGPL • Platform independent build system • Uses jmk - pure java make-like tool • To build entire system on any platform with CVS and Java: cvs co jas; cd jas; java -jar jmk.jar

  32. Documentation • LCD Tutorial exists • Nice step by step tutorial for beginners • Examples are all based on LCD but can be used by anyone • Starts from very beginning • Slowly adding information to Users Guide • Still nowhere near complete • How To being created to cover specific topics • Servlets How To • HTML How To • XML How To • Online API How To • Working on Fitting How To • JavaDoc generated API documentation available • Documentation remains weak link • We are aware of this and are working on producing more documentation • Also need more design specs/internals documentation to make open source model more effective

  33. Java for Reconstruction/Simulation Dual Goals: Contribute to Linear Collider Detector/Physics Studies Experiment with using Java for full offline reconstruction and analysis package

  34. LC Detector studies in US • Goals: • Detailed Study of physics processes in a variety of possible LC Detectors • Reference Small and Large detectors • Full simulation with GISMO • Switch to Geant4, when ready • Analysis using • Paw • C++ & Root • Java & JAS • Software Requirements • Flexibly handle different detector geometries and technologies • Rapid development of variety of reconstruction and analysis algorithms

  35. Java package hep.lcd • Framework • Driver framework • interactively control • calling of processors • debugging/histogramming • Parameter (Constant) access • driven by detector geometry • MC event input (StdHEP format) • IO system based on Java IO • random access files • Can be run inside JAS or standalone • Reconstruction Processors • Track finder+fitter written • Interface to Fortran fitter in progress • Several clustering algorithms • Parameterized MC Processors • Can read generator input or Gismo output • Track and Cluster smearing • Analysis Utilities • Event Shape + Thrust utilities • Jet finder [Jade, Durham] • Histogramming • Event Displays • Simple 2D Event display • Full 3D WIRED event display
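The slide names the JADE and Durham jet finders without showing code. As a rough illustration, the sketch below computes the usual textbook pair measures for the two schemes, y_ij = 2 E_i E_j (1 - cos θ_ij) / E_vis² for JADE and y_ij = 2 min(E_i², E_j²)(1 - cos θ_ij) / E_vis² for Durham. The class and method names are hypothetical, and this is not taken from the hep.lcd source.

```java
/** Pair measures for two common e+e- jet clustering schemes; illustrative only. */
final class JetMeasures {
    private JetMeasures() {}

    /** JADE measure: y_ij = 2 E_i E_j (1 - cos theta_ij) / E_vis^2. */
    static double jade(double ei, double ej, double cosTheta, double eVis) {
        return 2.0 * ei * ej * (1.0 - cosTheta) / (eVis * eVis);
    }

    /** Durham measure: y_ij = 2 min(E_i^2, E_j^2) (1 - cos theta_ij) / E_vis^2. */
    static double durham(double ei, double ej, double cosTheta, double eVis) {
        double eMin = Math.min(ei, ej);
        return 2.0 * eMin * eMin * (1.0 - cosTheta) / (eVis * eVis);
    }

    /** Cosine of the opening angle between two 3-momenta (assumes both are non-zero). */
    static double cosTheta(double[] p, double[] q) {
        double dot = p[0] * q[0] + p[1] * q[1] + p[2] * q[2];
        double pMag = Math.sqrt(p[0] * p[0] + p[1] * p[1] + p[2] * p[2]);
        double qMag = Math.sqrt(q[0] * q[0] + q[1] * q[1] + q[2] * q[2]);
        return dot / (pMag * qMag);
    }
}

class JetMeasuresDemo {
    public static void main(String[] args) {
        // Two massless particles at right angles, energies 1 and 3, visible energy 4
        double[] p = {1, 0, 0};
        double[] q = {0, 3, 0};
        double c = JetMeasures.cosTheta(p, q);
        System.out.println("JADE   y_ij = " + JetMeasures.jade(1, 3, c, 4));
        System.out.println("Durham y_ij = " + JetMeasures.durham(1, 3, c, 4));
    }
}
```

A full jet finder would repeatedly merge the pair with the smallest y_ij until every remaining pair exceeds a chosen cut y_cut; only the measure itself is sketched here.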

  36. Event Display

  37. Event Display

  38. Event Display

  39. Event Display

  40. Java for Reconstruction/Simulation • Looks very promising • Have been able to develop framework very fast • People have no problem learning and using it • Performance looks good • Future • Java interface to Geant4?

  41. Reconstruction Performance

  42. Java Performance Summary • Is Java Fast Enough for Physics Analysis? • Yes • Time gained in development well worth runtime overhead • Good design has more effect on final speed than language • Many tools available to help optimize code • Java will continue to get faster • More information - • ACM 1999 Java Grande Conference • http://www.cs.ucsb.edu/conferences/java99/ • THE JAVA PERFORMANCE REPORT • http://www.javalobby.org/features/jpr/

  43. HEP-wide Java libraries • FreeHEP Java library • Extract common code from JAS+WIRED • Add other utilities (not highly HEP specific) • Encapsulated PostScript generator • JACO – Java to C++ interface • Encourage others to look at what is there • We welcome contributions from others • HEP library – more physics specific • 3 and 4 vectors, jet finders, MC generators • Histogramming package (AIDA)

  44. HEP-wide Java libraries • FreeHEP library already has useful stuff in it, HEP library just getting started • Both libraries in CVS • Read access available to anyone • Write access to qualified developers • Web Site • http://java.freehep.org • Contributions welcome

  45. Conclusions • Java is a very useful language+environment that could be very beneficial to HEP in many areas. • Could Java be used for entire offline for major experiment? • Technically - Yes • Will Java Survive long enough? • Need ISO standard • Need to see how market forces play out. • Programming in Java is Fun!! • Spend time architecting an elegant solution to problem to be solved • Not • Reinventing the wheel, • Debugging someone else’s problem • Porting to different platforms

  46. More Information… • Java Analysis Studio • http://jas.freehep.org • FreeHEP library • http://java.freehep.org • US Linear Collider Reconstruction • http://www-sldnt.slac.stanford.edu/nld • WIRED • http://wired.cern.ch • AIDA • http://wwwinfo.cern.ch/asd/lhc++/AIDA/index.html
