Disciplinary Case Study 1 Physical Sciences - Particle Physics at CERN and elsewhere

Disciplinary Case Study 1 Physical Sciences - Particle Physics at CERN and elsewhere. Jürgen Knobloch CERN/IT ERPANET/CODATA International Archiving Workshop on the Selection, Appraisal, and Retention of Digital Scientific Data Biblioteca Nacional, Lisbon, Portugal 15-17 December 2003.


Presentation Transcript


  1. Disciplinary Case Study 1 Physical Sciences - Particle Physics at CERN and elsewhere Jürgen Knobloch CERN/IT ERPANET/CODATA International Archiving Workshop on the Selection, Appraisal, and Retention of Digital Scientific Data Biblioteca Nacional, Lisbon, Portugal 15-17 December 2003

  2. Overview • Physics data retention past and future • Particle physics laboratory – CERN • Particle physics – data flow – data types – data volume • Keeping digital data available • In the longer term • For the public use • Some examples • LEP data long-term availability • QUAERO – opening data to the public • Particle Data Group • Limits of re-using data • Conclusion – Points for discussion

  3. Physics data … “Laboratory”, Karnak, Egypt > 3000 years

  4. … are not always cast in stone Law of motion Galileo’s notebook ~ 1638

  5. Enrico Fermi - 1942 Notebook recording the first controlled, self-sustaining nuclear chain reaction, December 2, 1942; Records of the Atomic Energy Commission; Record Group 326; National Archives.

  6. … or on film … Bubble chamber, CERN, 1973

  7. … now it is all electronic! CERN – UA1 1983

  8. CERN (founded 1954) = “Conseil Européen pour la Recherche Nucléaire”, “European Organisation for Nuclear Research” • Particle Physics • Annual budget: ~1000 MSFr (~700 M€) • Staff members: 2650 + 225 Fellows + 270 Associates + 6000 CERN users • Member states: 20 • 27 km circumference tunnel

  9. CERN – where the Web was born Tim Berners-Lee First Web Server WSIS, Geneva, October 10-12, 2003

  10. CERN’s 20 member states CERN Convention: … shall provide … research of pure scientific and fundamental character… … shall have no concern with work for military requirements and the results of its experimental and theoretical work shall be published or otherwise made generally available.

  11. … and scientists from the rest of the world: • OBSERVERS: UNESCO, EU, Israel, Turkey • SPECIAL OBSERVERS (for LHC): USA, Japan, Russia

  12. Other accelerators around the world

  13. Particle Physics Establish a periodic system of the fundamental building blocks and understand forces

  14. Methods of Particle Physics The most powerful microscope Creating conditions similar to the Big Bang

  15. Particle physics data: From raw data to physics results [Diagram: e+ e- → Z0 → f f̄. Simulation (Monte-Carlo): Basic physics; Fragmentation, Decay; Interaction with detector material; Detector response. Reconstruction: Raw data; Convert to physics quantities (apply calibration, alignment); Pattern recognition, Particle identification. Analysis: Physics analysis; Results.]

  16. HEP Data Analysis [Diagram labels: Detector description, Detector alignment, Detector calibration, Reconstruction parameters; Physics, Generate Events; Build Simulation Geometry, Simulation geometry, Simulate Events; Build Reconstruction Geometry, Reconstruction geometry, Reconstruct Events; Raw Data, ESD, AOD; Analyze Events]

  17. CERN data archiving policy • "CERN is not just another laboratory. It is an institution that has been entrusted with a noble mission which it must fulfil not just for tomorrow but for the eternal history of human thought." (Albert Picot, 3rd Session of CERN Council, Geneva, 1955) • "Rules applicable to archival material and archiving at CERN" – CERN Operational circular No. 3 (1997) • Historical and scientific archives • Implemented by the CERN archivist – Anita Hollier • Does not specifically cover digital physics data

  18. Technical issues – shelf life - technology cycle - metadata

  19. Data archiving - LEP • The Large Electron-Positron collider (LEP) ran from 1989 to 2000 (80 – 200 GeV) • Accelerator and experiments dismantled to make room for the LHC • Four experiments: ALEPH, DELPHI, L3, OPAL • Officially “terminating” in the near future • The experiments request that the analysis of data be kept alive (as long as possible/reasonable), in case other experiments (e.g. at the LHC) see new phenomena that are within the reach of LEP • They also require the ability to re-run simulation (Monte Carlo).

  20. LEP data agreement • Keep the (FORTRAN) software running “as is”. • No further development or maintenance to central software such as CERNLIB • Have the required data available in the standard CERN hierarchical mass storage system CASTOR. • Carry the data forward in case of MSS evolution. • Have the software and data access running on a special cluster of LINUX computers. • Carry software with the operating system and compiler forward as much as possible without major effort • Run the whole as a “Museum system”
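Carrying data forward when the mass storage system evolves amounts to re-copying every file and confirming that nothing changed on the way. The following is a minimal sketch of that idea, not a description of how CASTOR or the LEP experiments actually do it: the function names `sha256_of` and `migrate_file`, the file name `run0001.dat`, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_file(source: Path, target_dir: Path) -> Path:
    """Copy one file to a new store and verify the copy by checksum.

    Hypothetical helper: a real migration would also record the
    checksum and metadata in a catalogue.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    checksum_before = sha256_of(source)
    shutil.copy2(source, target)  # copies content and timestamps
    if sha256_of(target) != checksum_before:
        target.unlink()
        raise IOError(f"checksum mismatch while copying {source}")
    return target


# Small demonstration with a throwaway directory and made-up content.
with tempfile.TemporaryDirectory() as workdir:
    old_store = Path(workdir) / "old_store"
    old_store.mkdir()
    sample = old_store / "run0001.dat"
    sample.write_bytes(b"example event data")
    copied = migrate_file(sample, Path(workdir) / "new_store")
    copied_ok = copied.read_bytes() == b"example event data"
    print(copied.name, copied_ok)
```

The checksum step is the point of the sketch: a copy that is never verified cannot be trusted after several generations of storage media.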

  21. Issues and concerns • Event displays of some experiments depend on external commercial software. • Security updates for a given OS version are available for a limited time only. • The system requires – hopefully limited – manpower for central support. • The experiments need to keep it alive and perform regular tests. • Major changes in hardware technology cannot be accommodated (what about emulators?). • The data cannot be analyzed in a meaningful way by people who were not involved in the original collaboration.

  22. QUAERO – Making HEP data publicly available • Developed by Bruce Knuteson

  23. Data – Monte Carlo Signal - Background

  24. Quaero - choices

  25. Quaero policy

  26. HEP-wide data compilation: Particle Data Group (PDG)

  27. PDG: “Rosenfeld tables” • First review of particle properties & data: • “Hyperons and Heavy Mesons (Systematics and Decay)” by M. Gell-Mann and A. H. Rosenfeld, Ann.Rev.Nucl.Sci. 7 (1957) 407 • Separate efforts at LBL and CERN joined in 1964 • Rev.Mod.Phys. 36 (1964) 977 • Particle Data Group: • Maintaining a data-base of experimental results. • More than 20,000 measurements from 6000 papers. • Review and the Booklet are published in even-numbered years. • Web version updated between printed editions

  28. PDG: Particle properties

  29. PDG: Compilations Cross-sections Structure functions

  30. PDG: Following history … … can be revealing!

  31. The next step: LHC (Large Hadron Collider) 2000 M€ cost 27 km circumference 100 m underground

  32. Challenge 1: Large, distributed community • ~ 5000 physicists around the world - around the clock • “Offline” software effort: 1000 person-years per experiment • Software life span: 20 years • CMS, ATLAS, LHCb

  33. Challenge 2: Data Volume • Annual data storage: 12-14 PetaBytes/year • 50 CD-ROM = 35 GB = 6 cm stack • CD stack with 1 year of LHC data: ~ 20 km • For comparison: Balloon (30 km), Concorde (15 km), Mt. Blanc (4.8 km)
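The height of the CD stack follows directly from the figures on the slide (50 CD-ROMs hold 35 GB and stack 6 cm high; 12-14 PB are stored per year). The short calculation below reproduces it as a sketch; decimal units (1 PB = 1,000,000 GB) and the function name `cd_stack_height_km` are assumptions.

```python
# Figures from the slide; decimal prefixes are an assumption.
GB_PER_STACK = 35.0        # 50 CD-ROMs hold 35 GB
STACK_HEIGHT_CM = 6.0      # 50 CD-ROMs stack 6 cm high
GB_PER_PB = 1_000_000      # 1 PB = 10^6 GB (decimal)
CM_PER_KM = 100_000


def cd_stack_height_km(petabytes_per_year: float) -> float:
    """Height in km of the CD stack needed for one year of data."""
    stacks = petabytes_per_year * GB_PER_PB / GB_PER_STACK
    return stacks * STACK_HEIGHT_CM / CM_PER_KM


for pb in (12, 14):
    print(f"{pb} PB/year -> {cd_stack_height_km(pb):.1f} km of CDs")
```

With these inputs the stack comes out between roughly 20 and 24 km, consistent with the "~ 20 km" quoted on the slide.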

  34. Challenge 3a: Find the Needle in a Haystack

  35. Challenge 3a: Find the Needle in a Haystack • From all interactions to the HIGGS: 9 orders of magnitude! • Rare phenomena - Huge background • Complex events

  36. Challenge 3b: Provide mountains of CPU • Calibration • Reconstruction • Simulation • Analysis • For LHC computing, some 100 Million SPECint2000 are needed! • 1 SPECint2000 = 0.1 SPECint95 = 1 CERN-unit = 4 MIPS • A 3 GHz Pentium 4 has ~ 1000 SPECint2000
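To give the CPU requirement a concrete scale, the conversion factors quoted on the slide can be applied directly. This is only a back-of-the-envelope sketch using the slide's own numbers; the variable and function names are illustrative.

```python
# Conversion factors as quoted on the slide.
REQUIRED_SI2K = 100_000_000   # SPECint2000 needed for LHC computing
SI2K_PER_PENTIUM4 = 1_000     # a 3 GHz Pentium 4 has ~1000 SPECint2000
MIPS_PER_SI2K = 4             # 1 SPECint2000 = 4 MIPS
SI95_PER_SI2K = 0.1           # 1 SPECint2000 = 0.1 SPECint95


def pentium4_equivalents(required_si2k: int = REQUIRED_SI2K) -> int:
    """Number of 3 GHz Pentium 4 processors matching the requirement."""
    return required_si2k // SI2K_PER_PENTIUM4


def total_mips(required_si2k: int = REQUIRED_SI2K) -> int:
    """The same requirement expressed in MIPS."""
    return required_si2k * MIPS_PER_SI2K


print(f"{pentium4_equivalents():,} Pentium 4 processors (3 GHz)")
print(f"{total_mips():,} MIPS")
print(f"{REQUIRED_SI2K * SI95_PER_SI2K:,.0f} SPECint95")
```

In other words, the requirement corresponds to about 100,000 processors of the 2003 generation, which is why the talk turns to the Grid on the next slide.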

  37. We count on the Grid - what about archiving?

  38. Limits of re-using data • Example: • Fischbach et al. re-analysed, after 100 years, the data from Eötvös’ classic experiment on the proportionality of inertial and gravitational mass. • Eötvös had an accuracy of 1/100 000 000 • Fischbach et al. used the data to claim the discovery of the fifth force! • The influence of the architecture of the building, the geology and people moving about makes the conclusion rather difficult – to say the least!

  39. Points for discussion • Experimental physics is publicly funded and expensive • LHC costs 10^10 € • Not easy to repeat • Experimental data are useless without documentation, metadata, and software • Technology evolution is a serious burden • Have to keep data alive (re-copying) • What is the right level of archiving? • A la Particle Data Group – reliable, long-term • Four vectors – limited possibilities – not always useful • (Mini)-DST – requires more expertise • Making raw and intermediate data publicly available • Education – a clear case that is pursued by several groups • Scientists are rather hesitant to do it in general
