The evolution of hep software

The Evolution of HEP software. 12 September 2013, NEC2013/Varna René Brun/CERN*. plan. In this talk I present the views of somebody involved in some aspects of scientific computing as seen from a major lab in HEP.

The Evolution of HEP software

12 September 2013, NEC2013/Varna

René Brun/CERN*


plan

  • In this talk I present the views of somebody involved in some aspects of scientific computing as seen from a major lab in HEP.

  • Having been involved in the design and implementation of many systems, my views are necessarily biased by my path in several experiments and the development of some general tools.

  • I plan to describe the creation and evolution of the main systems that have shaped the current HEP software, with some views for the near future.

R.Brun : Evolution of HEP software


Machines

From Mainframes to Clusters, Walls of cores, GRIDs & Clouds

Machine Units (bits)

Word sizes (bits): 16, 32, 36, 48, 56, 60, 64

Machines: PDP-11, Univac, CDC, Nord-50, BESM-6 and many others

With even more combinations of exponent/mantissa size or byte ordering.

A strong push to develop portable, machine-independent I/O systems.
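
As an illustration of what machine-independent I/O means in practice, here is a minimal sketch in modern C++. It is not code from any of the packages named in this talk. The function names (`put_u32`, `get_u32`, `put_f32`, `get_f32`) and the choice of big-endian as the file byte order are assumptions made for the example, and the float helpers assume the host uses 32-bit IEEE 754 floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Append a 32-bit integer in big-endian order, whatever the host byte order is.
inline void put_u32(std::vector<std::uint8_t>& buf, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        buf.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFFu));
}

// Read back a 32-bit integer written by put_u32, starting at offset pos.
inline std::uint32_t get_u32(const std::vector<std::uint8_t>& buf, std::size_t pos) {
    assert(pos + 4 <= buf.size());
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < 4; ++i)
        v = (v << 8) | buf[pos + i];
    return v;
}

// Floats travel as their 32-bit pattern (assumption: IEEE 754 single precision).
inline void put_f32(std::vector<std::uint8_t>& buf, float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    put_u32(buf, bits);
}

inline float get_f32(const std::vector<std::uint8_t>& buf, std::size_t pos) {
    const std::uint32_t bits = get_u32(buf, pos);
    float x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}
```

Because every value is written byte by byte in a fixed order, a buffer produced on one machine can be read on another with a different byte ordering. Machines with a different exponent/mantissa layout would need an explicit conversion step in `put_f32` and `get_f32`.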

User machine interface

General Software in 1973

  • Software for bubble chambers: Thresh, Grind, Hydra

  • Histogram tool: SUMX from Berkeley

  • Simulation with EGS3 (SLAC), MCNP (Oak Ridge)

  • Small Fortran IV programs (1000 LOC, 50 kbytes)

  • Punched cards, line printers, pen plotters (GD3)

  • Small archive libraries (cernlib), lib.a

Software in 1974

  • First “Large Electronic Experiments”

  • Data Handling Division == Track Chambers

  • Well organized software in TC with HYDRA, Thresh, Grind, anarchy elsewhere

  • HBOOK: from 3 routines to 100, from 3 users to many

  • First software group in DD

GEANT1 in 1975

  • Very basic framework to drive a simulation program, reading data cards with FFREAD, step actions with GUSTEP, GUNEXT, apply mag-field (GUFLD).

  • Output (Hist/Digits) was user defined

  • Histograms with HBOOK

  • About 2,000 LOC

ZBOOK in 1975

  • Extraction of the HBOOK memory manager in an independent package.

  • Creation of banks and data structures anywhere in common blocks

  • Machine independent I/O, sequential and random

  • About 5,000 LOC
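
The idea of creating banks inside one large common block can be sketched as follows. This is only an illustration in C++, not the ZBOOK interface: the class `BankStore`, its methods and the one-word header holding the bank length are all hypothetical choices made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bank store: one flat array of words, playing the role of a
// Fortran common block. A bank is a header word (its length) plus n data words.
class BankStore {
public:
    explicit BankStore(std::size_t capacity) : words_(capacity, 0) {}

    // Reserve a bank of n data words. Returns the index of its header word,
    // or -1 when the store has no room left.
    long create(std::size_t n) {
        if (used_ + n + 1 > words_.size()) return -1;
        const long bank = static_cast<long>(used_);
        words_[used_] = static_cast<int>(n);
        used_ += n + 1;
        return bank;
    }

    // Number of data words in a bank, read from its header.
    std::size_t length(long bank) const {
        assert(bank >= 0 && static_cast<std::size_t>(bank) < used_);
        return static_cast<std::size_t>(words_[static_cast<std::size_t>(bank)]);
    }

    // Access data word i of a bank.
    int& word(long bank, std::size_t i) {
        assert(i < length(bank));
        return words_[static_cast<std::size_t>(bank) + 1 + i];
    }

    std::size_t used() const { return used_; }

private:
    std::vector<int> words_;
    std::size_t used_ = 0;
};
```

Banks are referenced by index rather than by address, so the whole store can be written to a file or copied as one block of words.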

GEANT2 in 1976

  • Extension of GEANT1 with more physics (e-showers based on a subset of EGS, multiple scattering, decays, energy loss)

  • Kinematics, hits/digits data structures in ZBOOK

  • Used by several SPS experiments (NA3, NA4, NA10, Omega)

  • About 10,000 LOC

Problems with GEANT2

  • Very successful small framework.

  • However, the detector description was user written and defined via “if” statements at tracking time.

  • This was becoming a hard task for large and always evolving detectors (case with NA4 and C.Rubbia)

  • Many attempts to describe a detector geometry via data cards (a bit like XML), but the main problem was the poor and inefficient detector description in memory.

GEANT3 in 1980

  • A data structure (ZBOOK tree) describing complex geometries was introduced, then gradually the geometry routines computing distances, etc.

  • This was a huge step forward implemented first in OPAL, then L3 and ALEPH.

  • Full electromagnetic showers (first based on EGS, then own developments)
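
To make the contrast with "if" statements at tracking time concrete, here is a minimal sketch of a geometry held as a tree of volumes. It is written in C++ for readability and is not the GEANT3 data structure. The `Volume` type, the box-only shapes, the use of global coordinates and the functions `make_box` and `locate` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical volume: an axis-aligned box that may contain daughter volumes.
struct Volume {
    std::string name;
    double cx, cy, cz;   // box centre (global coordinates, for simplicity)
    double hx, hy, hz;   // box half-lengths
    std::vector<std::unique_ptr<Volume>> daughters;

    bool contains(double x, double y, double z) const {
        return std::fabs(x - cx) <= hx && std::fabs(y - cy) <= hy &&
               std::fabs(z - cz) <= hz;
    }
};

// Build a box volume with no daughters.
std::unique_ptr<Volume> make_box(std::string name, double cx, double cy, double cz,
                                 double hx, double hy, double hz) {
    assert(hx > 0 && hy > 0 && hz > 0);
    return std::unique_ptr<Volume>(
        new Volume{std::move(name), cx, cy, cz, hx, hy, hz, {}});
}

// Return the deepest volume containing the point, or nullptr if it is outside v.
const Volume* locate(const Volume& v, double x, double y, double z) {
    if (!v.contains(x, y, z)) return nullptr;
    for (const auto& d : v.daughters)
        if (const Volume* hit = locate(*d, x, y, z)) return hit;
    return &v;
}
```

The tracking code only calls `locate`. Changing the detector means changing the tree, not the code.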

Systems in 1980

  • End user analysis software: 10 KLOC

  • Experiment software: 100 KLOC

  • Libraries (HBOOK, Naglib, cernlib): 500 KLOC

  • OS & Fortran: 1000 KLOC

  • Hardware: RAM 1 MB, tapes, CDC, IBM, Vax780

GEANT3 with ZEBRA

  • ZEBRA was very rapidly implemented in 1983.

  • We introduced ZEBRA in GEANT3 in 1984.

  • From 1984 to 1993 we introduced plenty of new features in GEANT3: extensions of the geometry, hadronic models with TATINA, GHEISHA and FLUKA, graphics tools.

  • In 1998, GEANT3 was interfaced with ROOT via the VMC (Virtual Monte Carlo).

  • GEANT3 has been used, and is still in use, by many experiments.

PAW

  • First minimal version in 1984

  • Attempt to merge with GEP (DESY) in 1985, but in the end we only took the idea of ntuples for storage and analysis. GEP was written in PL/1.

  • Package growing until 1994 with more and more functions. Column-wise ntuples in 1990.

  • Users liked it, mainly once the system was frozen in 1994.
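
The column-wise ntuple idea can be sketched as follows. This is a hypothetical C++ illustration, not the PAW or HBOOK interface: the class `ColumnNtuple` and the helper `mean` are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical column-wise ntuple: one contiguous array per variable.
class ColumnNtuple {
public:
    explicit ColumnNtuple(std::vector<std::string> names)
        : names_(std::move(names)), columns_(names_.size()) {}

    // Append one row; values are given in the order of the column names.
    void fill(const std::vector<double>& row) {
        assert(row.size() == columns_.size());
        for (std::size_t i = 0; i < row.size(); ++i) columns_[i].push_back(row[i]);
    }

    // Access a single variable without touching the others.
    const std::vector<double>& column(const std::string& name) const {
        for (std::size_t i = 0; i < names_.size(); ++i)
            if (names_[i] == name) return columns_[i];
        throw std::out_of_range("unknown column: " + name);
    }

private:
    std::vector<std::string> names_;
    std::vector<std::vector<double>> columns_;
};

// Average of one column; returns 0 for an empty column.
inline double mean(const std::vector<double>& v) {
    if (v.empty()) return 0.0;
    double sum = 0.0;
    for (double x : v) sum += x;
    return sum / static_cast<double>(v.size());
}
```

An analysis that histograms one variable out of hundreds reads only that variable's array, which is the main attraction of column-wise storage compared with row-wise storage.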

Vectorization attempts

  • From 1985 to 1990 a big effort was invested in vectorizing GEANT3 (work in collaboration with Florida State University) on CRAY/YMP, CYBER205 and ETA10.

  • The minor gains obtained did not justify the big manpower investment. GEANT3 transport was still essentially sequential and we had a big overhead from vector creation and gather/scatter.

  • However, this experience and its failure were very important for us, and gave many lessons useful for the design of GEANT5 many years later.

Parallelism in the 80s & early 90s

  • Many attempts (all failing) with parallel architectures

  • Transputers and OCCAM

  • MPP (CM2, CM5, ELXI,..) with OpenMP-like software

  • Too many GLOBAL variables/structures with Fortran common blocks.

  • RISC architectures or emulators perceived as a cheaper solution in the early 90s.

  • Then MPPs died with the advent of the Pentium Pro (1994) and farms of PCs or workstations.

1992: CHEP Annecy

  • Web, web, web, web…………

  • Attempts to replace/upgrade ZEBRA to support/use F90 modules and structures, but modules parsing and analysis was thought to be too difficult.

  • With ZEBRA the bank description was within the bank itself (just a few bits). A bank was typically a few integers followed by a dynamic array of floats/doubles.

  • We did not realize at the time that parsing user data structures was going to be a big challenge!!

Consequences

  • In 1993/1994 performance was no longer the main problem.

  • Our field was invaded by computer scientists.

  • Program design, object-oriented programming and the move to more sexy languages were becoming priorities.

  • The “goal” was thought less important than the “how”.

  • This situation deteriorated even more with the death of the SSC.

1993: Warning Danger

  • 3 “clans” in my group

    • 1/3 pro F90

    • 1/3 pro C++

    • 1/3 pro commercial products (any language) for graphics, User Interfaces, I/O and data bases

  • My proposal to continue with PAW, develop ZOO (ZEBRA Object-Oriented) and GEANT3 geometry in C++ was not accepted.

  • Evolution vs Revolution

1995: roads for ROOT

  • The official line was with GEANT4 and Objectivity. There is not much room left for success with an alternative product when you are alone.

  • The best tactic had to be a mixture of sociology, technicalities and very hard work.

    • Strong support from PAW and GEANT3 users

    • Strong support from HP (workstations + manpower)

  • In November we were ready for a first ROOT show

  • Java is announced (problem?)

1998: work & smile

  • RUN II projects at FNAL

    • Data Analysis and Visualization

    • Data Formats and storage

  • ROOT competing with HistoScope, JAS, LHC++

  • CHEP98 (September) Chicago

  • ROOT selected by FNAL, followed by RHIC

    • Vital decision for ROOT

  • But official support at CERN only in 2002

ROOT evolution

  • No time to discuss the creation/evolution of the 110 ROOT shared libs/packages.

  • ROOT has gradually evolved from a data storage, analysis and visualization system to a more general software environment replacing totally what was known before as CERNLIB.

  • This has been possible thanks to MANY contributors from experiments, labs or people working on other fields.

  • ROOT6, coming soon, includes a new interpreter, CLING, and supports all the C++11 features.

Input/Output: Major Steps

  • User-written streamers filling TBuffer

  • Streamers generated by rootcint

  • Automatic streamers from dictionary with StreamerInfos in self-describing files

  • Member-wise streaming for TClonesArray

  • Member-wise streaming for STL collections<T*>

  • TreeCache

  • Parallel merge

GEANT4 Evolution

  • GEANT4 is an important software tool for current experiments with more and more physics improvements and validation procedures.

  • However, the GEANT4 transport system is no longer suitable for parallel architectures. Too many changes are required.

  • GEANT5: keep the Geant4 physics and a radically new transport system.

Tools & Libs

geant1, geant2, geant3, geant4, Geant4+5, hbook, minuit, paw, Root 1,2,3,4,5,6, hydra, bos, zbook, zebra

R.Brun : Computing in HEP


Systems today

  • End user analysis software: 0.1 MLOC

  • Experiment software: 4 MLOC

  • Frameworks like ROOT, Geant4: 5 MLOC

  • OS & compilers: 20 MLOC

  • Hardware: clusters of multi-core machines (10000x8), RAM 16 GB, disks 10 PB, networks 10 Gbit/s, GRIDS, CLOUDS

Systems in 2025 ?

  • End user analysis software: 0.2 MLOC

  • Experiment software: 10 MLOC

  • Frameworks like ROOT, Geant5: 10 MLOC

  • OS & compilers: 40 MLOC

  • Hardware: multi-level parallel machines (10000x1000x1000), RAM 10 TB, disks 1000 PB, networks 100 Gbit/s and 10 Tbit/s, GRIDS, CLOUDS on demand

BUT !!!!!

  • It looks like the amount of money devoted to computing is not going to increase with the same slope as it used to increase in the past few years.

  • Moore’s law no longer applies to a single processor.

  • However, Moore’s law still looks OK when looking at the amount of computing delivered per $ or €, when REALLY using parallel architectures.

  • Using these architectures is going to be a big challenge, but we have no choice!!!!

Software and Hardware

  • GRIDs/Clouds are inherently parallel. However, because the hardware has been relatively cheap, GRIDs have pushed towards job-level parallelism at the expense of parallelism within one job.

  • It is not clear today what will be the winning hardware systems: supercomputer?, walls of cores with accelerators?, zillions of ARM-like systems?,..

  • Our software must be upgraded keeping in mind all these possible solutions. A big challenge!

Expected Directions

  • Parallelism: Today we do not exploit the existing hardware well (0.6 instructions/cycle on average) because our code was designed to be “sequential”. Important gains are foreseen (a factor of 10?), e.g. in detector simulation.

  • Automatic Data Caches: Many improvements are required to speed up and simplify skimming procedures and data analysis.

Data caches

  • More effort is required to simplify the analysis of large data sets (typically ROOT Trees).

  • When zillions of files are distributed in Tiers1/2, automatic, transparent, efficient and safe caches are becoming mandatory on Tiers2/3 or even laptops.

  • This must be taken into account in the dilemma: sending jobs to data or vice-versa.

  • This will require changes in ROOT itself and in the various data handling or parallel file systems.

Parallelism: key points

  • Minimize the sequential/synchronization parts (Amdahl's law): very difficult.

  • Run the same code (processes) on all cores to optimize the memory use (code and read-only data sharing).

  • Job-level is better than event-level parallelism for offline systems.

  • Use the good old principle of data locality to minimize the cache misses.

  • Exploit the vector capabilities but be careful with the new/delete/gather/scatter problem.

  • Reorganize your code to reduce tails.
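
The first point, Amdahl's law, can be written down in a few lines. The sketch below uses the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of cores. The function name `amdahl_speedup` is chosen for this example.

```cpp
#include <cassert>

// Amdahl's law: speedup on n cores when a fraction p of the work is parallel
// and the remaining (1 - p) is sequential or spent in synchronization.
inline double amdahl_speedup(double p, int n) {
    assert(p >= 0.0 && p <= 1.0 && n >= 1);
    return 1.0 / ((1.0 - p) + p / n);
}
```

With 95% of the work parallel, the speedup stays below 20 however many cores are used, which is why shrinking the sequential part matters more than adding cores.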

Data Structures & parallelism

  • C++ pointers are specific to a process.

  • Copying the structure implies a relocation of all pointers.

  • I/O is a nightmare.

  • Update of the structure from a different thread implies a lock/mutex.

[Figure: an event linked by pointers to its vertices and tracks]
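
One common way around the pointer relocation problem, shown here only as a sketch and not as something the talk prescribes, is to link objects by index into their container instead of by address. The types `Vertex`, `Track` and `Event` below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex { double x, y, z; };

struct Track {
    double px, py, pz;
    std::size_t vertex;   // index into Event::vertices, not a Vertex*
};

struct Event {
    std::vector<Vertex> vertices;
    std::vector<Track> tracks;

    // Resolve the index link at the point of use.
    const Vertex& origin(const Track& t) const {
        assert(t.vertex < vertices.size());
        return vertices[t.vertex];
    }
};
```

An `Event` built this way can be copied with a plain assignment, sent to another process or written to a file, and every link is still valid because an index means the same thing everywhere. The lock/mutex issue for concurrent updates is not addressed by this sketch.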

Data Structures & Locality

Sparse data structures defeat the system memory caches.

For example: group the cross-sections for all processes per material instead of all materials per process.

Group object elements/collections such that the storage matches the traversal processes.
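
The cross-section example can be sketched as a table in which all processes of one material are stored next to each other. The class `CrossSectionTable` and its layout are assumptions made for this illustration, not code from GEANT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical table laid out material by material: the cross-sections of all
// processes for one material are contiguous in memory.
class CrossSectionTable {
public:
    CrossSectionTable(std::size_t n_materials, std::size_t n_processes)
        : n_processes_(n_processes), xs_(n_materials * n_processes, 0.0) {}

    double& at(std::size_t material, std::size_t process) {
        assert(process < n_processes_);
        assert(material * n_processes_ + process < xs_.size());
        return xs_[material * n_processes_ + process];
    }

    // Total cross-section in one material: a single contiguous scan.
    double total(std::size_t material) const {
        assert((material + 1) * n_processes_ <= xs_.size());
        const double* row = xs_.data() + material * n_processes_;
        double sum = 0.0;
        for (std::size_t p = 0; p < n_processes_; ++p) sum += row[p];
        return sum;
    }

private:
    std::size_t n_processes_;
    std::vector<double> xs_;
};
```

A particle stepping inside one material needs the cross-sections of every process for that material, so with this layout one step touches one short contiguous row instead of one entry in each of many per-process tables.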

Create Vectors & exploit Locality

  • By making vectors, you optimize the instruction cache (gain >2) and data cache (gain >2)

  • By making vectors, you can use the built-in pipeline instructions of existing processors (gain >2)

  • But, there is no point in making vectors if your algorithm is still sequential or badly designed for parallelism, e.g.:

    • Too many threads synchronization points (Amdahl)

    • Vectors gather/scatter
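
The gather/scatter cost can be seen in a small sketch. The `Tracks` structure of arrays and the two step functions are hypothetical, and the one-dimensional "x += px * dt" update is only a toy stand-in for a real transport step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Structure of arrays: one contiguous array per track attribute.
struct Tracks {
    std::vector<double> x, px;
};

// Contiguous loop over all tracks: simple enough for a compiler to vectorize.
inline void step_all(Tracks& t, double dt) {
    assert(t.x.size() == t.px.size());
    for (std::size_t i = 0; i < t.x.size(); ++i) t.x[i] += t.px[i] * dt;
}

// Same update for a subset of tracks, done as gather -> compute -> scatter.
// The gather and scatter loops are pure overhead around the useful compute loop.
inline void step_selected(Tracks& t, const std::vector<std::size_t>& sel, double dt) {
    std::vector<double> x(sel.size()), px(sel.size());
    for (std::size_t k = 0; k < sel.size(); ++k) {   // gather
        assert(sel[k] < t.x.size());
        x[k] = t.x[sel[k]];
        px[k] = t.px[sel[k]];
    }
    for (std::size_t k = 0; k < sel.size(); ++k) x[k] += px[k] * dt;   // compute
    for (std::size_t k = 0; k < sel.size(); ++k) t.x[sel[k]] = x[k];   // scatter
}
```

When the work per element is as small as it is here, the two copy loops and the temporary allocations can cost more than the computation they enable, so vectors pay off only when the data are already grouped the way the algorithm traverses them.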

Conventional Transport

Each particle is tracked step by step through hundreds of volumes.

When all hits for all tracks are in memory, summable digits are computed.

[Figure: tracks T1, T2, T3 and T4 drawn as sequences of steps]

LPCC workshop Rene Brun


Analogy with car traffic

New Transport Scheme

All particles in the same volume type are transported in parallel.

Particles entering new volumes or generated are accumulated in the volume basket.

Events for which all hits are available are digitized in parallel.

[Figure: tracks T1, T2, T3 and T4 drawn as sequences of steps]
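
The basket idea can be sketched with a toy scheduler. Nothing below is taken from the actual GEANT4+5 prototype: the `Particle` type, the `transport` function, the rule "a particle always moves to the next volume" and the choice to process the fullest basket first are all assumptions made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

struct Particle {
    int id;
    int volume;       // volume the particle is currently in
    int steps_left;   // toy stand-in for the physics deciding when it stops
};

// Hypothetical basket scheduler. Particles are grouped per volume, and a whole
// basket is transported at once. Returns the size of each basket processed.
inline std::vector<std::size_t> transport(const std::vector<Particle>& input,
                                          int n_volumes) {
    assert(n_volumes > 0);
    std::map<int, std::vector<Particle>> baskets;   // one basket per volume
    for (const auto& p : input) baskets[p.volume].push_back(p);

    std::vector<std::size_t> processed;
    while (!baskets.empty()) {
        // Pick the fullest basket (ties go to the lowest volume number).
        auto best = baskets.begin();
        for (auto it = baskets.begin(); it != baskets.end(); ++it)
            if (it->second.size() > best->second.size()) best = it;
        std::vector<Particle> batch = std::move(best->second);
        baskets.erase(best);
        processed.push_back(batch.size());

        // In a real system this loop is the part that would run in parallel.
        for (auto& p : batch) {
            if (--p.steps_left <= 0) continue;        // particle stops here
            p.volume = (p.volume + 1) % n_volumes;    // toy rule: enter next volume
            baskets[p.volume].push_back(p);           // accumulate in that basket
        }
    }
    return processed;
}
```

The point of the sketch is the data flow: work is organized around volumes rather than around individual tracks, so each batch shares the same geometry and material data.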

Towards Parallel Software

  • A long way to go!!

  • There is no point in just making your code thread-safe. Use of parallel architectures requires a deep rethinking of the algorithms and dataflow.

  • One such project is GEANT4+5, launched 2 years ago. We are starting to get very nice results. But there is still a long way to go to adapt (or write radically new software) for the emerging parallel systems.

A global effort

  • Software development is nowadays a world-wide effort with people scattered in many labs developing simulation, production or analysis code.

  • It remains a very interesting area for new people not scared by big challenges.

  • I had the fantastic opportunity to work for many decades in the development of many general tools in close cooperation with many people to whom I am very grateful.
