  1. PHENIX Offline Computing David Morrison Brookhaven National Laboratory What we’re doing Why we’re doing it What we’ve learned by doing it

  2. a word from our sponsors ... • large collaboration (>400 physicists) • large, complex detector • ~300,000 channels • 11 different detector subsystems • large volume of data, large number of events • 20 MB/sec for 9 months each year • 10^9 Au+Au events each year • broad physics program • partly because RHIC itself is very flexible • Au+Au at 100+100 GeV/A, spin polarized p+p, and everything in-between • muons, electrons, hadrons, photons
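(Back-of-the-envelope check of that scale: 20 MB/sec sustained over roughly nine months is about 2.3 × 10^7 s × 20 MB/s ≈ 4.7 × 10^8 MB, i.e. several hundred TB of raw data per year, the "100's of TB" referred to on later slides.)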

  3. from the PHENIX photo album DPM, in hardhat

  4. the eightfold way of PHENIX offline computing • know your physics program: for PHENIX, event processing rather than event selection • know your constraints: money, manpower ... and tape mounts • avoid “not invented here” syndrome: beg, borrow, collaborate; doesn’t automatically imply use of commercial products • focus on modularity, interfaces, abstract base classes • viciously curtail variety of architecture/OS: Linux, Solaris • data management and data access are really hard problems: don’t rely on fine-grained random access to 100’s of TB of data • everyone has their favorite reference works ... Design Patterns (Gamma et al.): run-time aggregation, shallow inheritance trees; The Mythical Man-Month (Brooks): avoid implementation by committee
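As an illustration of the "run-time aggregation, shallow inheritance trees" guideline, here is a minimal C++ sketch with hypothetical class names (TrackModel, TrackReconstructor and friends are not actual PHENIX classes): behaviour is assembled by handing a strategy object to its user at run time, and the only inheritance is a single interface level.

```cpp
// Hypothetical illustration (not actual PHENIX code) of "run-time aggregation,
// shallow inheritance trees": behaviour is assembled by handing a strategy
// object to its user at run time; the only inheritance is one interface level.
#include <iostream>
#include <memory>

class TrackModel {
public:
  virtual ~TrackModel() = default;
  virtual double momentum(double bend_angle) const = 0;
};

class StraightLineModel : public TrackModel {
public:
  double momentum(double) const override { return 0.0; }  // e.g. field-off running
};

class BendPlaneModel : public TrackModel {
public:
  explicit BendPlaneModel(double field) : field_(field) {}
  double momentum(double bend_angle) const override {
    return field_ / bend_angle;  // toy formula, for illustration only
  }
private:
  double field_;
};

// The reconstructor aggregates a model at run time; using a different model
// means constructing a different object, not deepening the class hierarchy.
class TrackReconstructor {
public:
  explicit TrackReconstructor(std::unique_ptr<TrackModel> model)
      : model_(std::move(model)) {}
  void reconstruct(double bend_angle) const {
    std::cout << "p = " << model_->momentum(bend_angle) << "\n";
  }
private:
  std::unique_ptr<TrackModel> model_;
};

int main() {
  TrackReconstructor reco(std::make_unique<BendPlaneModel>(0.9));
  reco.reconstruct(0.03);  // prints p = 30
}
```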

  5. building blocks • small group of “core” offline developers: M. Messer, K. Pope, M. Velkovsky, M. Purschke, D. Morrison, (M. Pollack) • large number of computer-savvy subsystem physicists; recruitment via “help wanted” list of projects that need people • PHENIX object-oriented library, PHOOL (see talk by M. Messer) • object-oriented analysis framework: analysis modules all share common interface • type-safe, flexible data manager: extensive use of RTTI, avoids (void *) casts by users • ROOT I/O used for persistency • “STL” operations on collection of modules or data nodes • varied OO views on analysis framework design, ranging from passive data to “event, reconstruct thyself”; PHOOL follows a hybrid approach • migrated to PHOOL from STAF in early 1999; no user code modified (~120,000 LOC)
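A minimal sketch of two of the framework ideas on this slide: a common module interface and a type-safe data manager. The names below (DataManager, AnalysisModule, HitList) are hypothetical, not the actual PHOOL API; the sketch only shows how dynamic_cast/RTTI spares user code from (void *) casts and how a shared interface lets the framework drive modules from an STL container.

```cpp
// Sketch of a PHOOL-like common module interface and type-safe data manager;
// hypothetical names, not the real PHOOL classes.
#include <iostream>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Every object placed on the data "node tree" derives from a common base.
struct DataObject {
  virtual ~DataObject() = default;
};

struct HitList : public DataObject {
  std::vector<double> energies;
};

// The data manager hands back typed pointers via dynamic_cast (RTTI),
// so user code never performs a raw (void *) cast.
class DataManager {
public:
  void put(const std::string& name, std::unique_ptr<DataObject> obj) {
    nodes_[name] = std::move(obj);
  }
  template <typename T>
  T* get(const std::string& name) {
    auto it = nodes_.find(name);
    return it == nodes_.end() ? nullptr : dynamic_cast<T*>(it->second.get());
  }
private:
  std::map<std::string, std::unique_ptr<DataObject>> nodes_;
};

// All analysis modules share one interface, so the framework can keep them
// in an STL container and drive them uniformly event by event.
class AnalysisModule {
public:
  virtual ~AnalysisModule() = default;
  virtual int process_event(DataManager& dm) = 0;
};

class EnergySum : public AnalysisModule {
public:
  int process_event(DataManager& dm) override {
    if (auto* hits = dm.get<HitList>("EMCAL_HITS")) {  // typed lookup, no casts
      double sum = 0;
      for (double e : hits->energies) sum += e;
      std::cout << "energy sum: " << sum << "\n";
    }
    return 0;
  }
};

int main() {
  DataManager dm;
  auto hits = std::make_unique<HitList>();
  hits->energies = {1.2, 0.7, 3.4};
  dm.put("EMCAL_HITS", std::move(hits));

  std::vector<std::unique_ptr<AnalysisModule>> modules;
  modules.push_back(std::make_unique<EnergySum>());
  for (auto& m : modules) m->process_event(dm);  // "STL" operations on modules
}
```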

  6. more blocks • lots of physics-oriented objects in PHENIX code • geometry, address/index objects, track models, reconstruction • file catalog • metadata management, tracks related files, tied in with run info DB • “data carousel” for retrieving files from HPSS • retrieval seen as group-level activity (subsystems, physics working groups) • carousel optimizes file retrieval, mediates resource usage between groups • scripts on top of IBM-written batch system • event display(s) • very much subsystem-centered efforts; all are ROOT-based • clearly valuable for algorithm development and debugging • value for PHENIX physics analysis much less clear • GNU build system, Mozilla-derived recompilation (poster M. Velkovsky) • autoconf, automake, libtool, Bonsai, Tinderbox, etc. • capable, robust, widely used by large audience on variety of platforms • feedback loop for code development

  7. databases in PHENIX • Objectivity used for “archival” database needs • Objy used in fairly “mainstream” manner: all Objy DBs are resident online (not storing event data) • autonomous partitions, data replicated between counting house, RCF • RCF (D. Stampf) ported Objy to Linux • PdbCal class library aimed at calibration DB application: insulates typical user from Objectivity; objects stored with validity period, versioning; usable interactively from within ROOT • mySQL used for other database applications: Bonsai, Tinderbox system uses mySQL heavily; used in “data carousel”
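A conceptual sketch of the calibration-interface idea: calibration objects carry a validity period, newer commits supersede older ones, and the caller never touches the underlying storage technology. Class and method names (CalibrationStore, CalibBank, fetch) are hypothetical, not the real PdbCal classes, and the in-memory vector merely stands in for the Objectivity back end.

```cpp
// Sketch of a calibration store with validity periods and versioning;
// hypothetical names, the real PdbCal library and its Objectivity back end differ.
#include <iostream>
#include <vector>

using RunTime = long;  // stand-in for a real timestamp class

struct CalibBank {
  RunTime valid_from = 0;
  RunTime valid_to = 0;
  std::vector<double> gains;  // example payload: per-channel gains
};

class CalibrationStore {
public:
  void commit(const CalibBank& bank) { banks_.push_back(bank); }

  // Return the most recently committed bank whose validity interval
  // contains 'when'; callers never see the storage technology.
  const CalibBank* fetch(RunTime when) const {
    const CalibBank* best = nullptr;
    for (const auto& b : banks_)
      if (b.valid_from <= when && when < b.valid_to) best = &b;  // last commit wins
    return best;
  }
private:
  std::vector<CalibBank> banks_;  // a real store would live in a database
};

int main() {
  CalibrationStore store;
  store.commit({1000, 2000, {1.00, 1.02, 0.98}});
  store.commit({1500, 2000, {1.01, 1.03, 0.97}});  // newer version, overlapping validity
  if (const CalibBank* b = store.fetch(1600))
    std::cout << "gain[0] = " << b->gains[0] << "\n";  // picks the newer version
}
```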

  8. simplified data flow [diagram: counting house, buffer disk, HPSS, NFS disk, analysis farm; calibrations & conditions held in the Objectivity federated DB]

  9. OO ubiquitous, mainstream in PHENIX • subclasses of abstract “Eventiterator” class used to read raw data • from online pool, file, or fake test events - user code unchanged • online control architecture based on CORBA “publish-subscribe” • Java used in counting house for GUIs, CORBA • subsystem reconstruction code uses STL, design patterns • not unusual to hear “singleton”, “iterator” at computing meetings • OO emerging out of subsystems faster than from core offline crew
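A sketch of the Eventiterator idea with hypothetical class and method signatures (the real PHENIX interfaces differ): analysis code pulls events from an abstract iterator and is unchanged whether the concrete source is a raw-data file, the online pool, or fake test events, as the slide describes.

```cpp
// Sketch of the abstract Eventiterator pattern; hypothetical signatures,
// not the actual PHENIX event library.
#include <iostream>
#include <optional>

struct Event {
  int number = 0;
};

class Eventiterator {
public:
  virtual ~Eventiterator() = default;
  virtual std::optional<Event> next_event() = 0;  // empty when the source is exhausted
};

// Fake test-event source: hands out a fixed number of synthetic events.
class TestEventiterator : public Eventiterator {
public:
  explicit TestEventiterator(int n) : remaining_(n) {}
  std::optional<Event> next_event() override {
    if (remaining_-- <= 0) return std::nullopt;
    return Event{++count_};
  }
private:
  int remaining_;
  int count_ = 0;
};

// A file- or pool-based source would subclass the same interface;
// the user code below stays identical for any of them.
void analyze(Eventiterator& it) {
  while (auto evt = it.next_event())
    std::cout << "processing event " << evt->number << "\n";
}

int main() {
  TestEventiterator source(3);
  analyze(source);  // swap in a file/pool iterator without touching analyze()
}
```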

  10. OO experiences • no Fortran in new post-simulation code: sidestepped many awkward F77/C++ issues, allowed OO to permeate • loosely coupled, short hierarchy design working well • information localization on top of information encapsulation allows decoupled, independent development • no formal design tools, but lots of cloudy chalkboard diagrams; usually just a few interacting classes • social engineering as important as software engineering • OO not science-fiction, not difficult ... and it’s here to stay • lots of hands-on examples, people are usually pleasantly surprised

  11. more OO experiences • OO was oversold (not by us!) as a computing panacea • does make big computing problem tractable, not trivial • occasional need for internal “public-relations” • cognizance of “distance” between concepts advocated by developers and those held by users • e.g., CORBA IDL a great thing; tough to sell to collaboration at-large • takes time and effort to “get it”, to move beyond “F77++” • general audience OO and C++ tutorials have helped • also work closely with someone from each subsystem - helps the OO “meme” take hold

  12. summary • PHENIX computing is essentially ready for physics data • use of PHOOL proven very successful during “mock data challenge” • ObjectivityDB is primary database technology used throughout PHENIX • reasonably conventional file-oriented data processing model • loosely coupled, shallow hierarchy OO design • common approach across online and offline computing • several approaches to recruiting, stretching scarce manpower • deliberate, explicit choice by collaboration to move to OO • recruit manpower from detector subsystems • loosely coupled OO design aids loosely coupled development • OO has slowed implementation, but has been indispensable for design • PHENIX will analyze physics data because of OO, not in spite of it
