
9th Topical Seminar on Innovative Particle and Radiation Detectors, 23 - 26 May 2004, Siena, Italy

G.A.P. Cirrone, S. Donadio, S. Guatelli, A. Mantero, B. Mascialino, S. Parlati, A. Pfeiffer, M.G. Pia, A. Ribon, P. Viarengo. A Statistical Toolkit for Data Analysis. 9th Topical Seminar on Innovative Particle and Radiation Detectors, 23 - 26 May 2004, Siena, Italy. Data analysis in HEP.


Presentation Transcript


  1. G.A.P. Cirrone, S. Donadio, S. Guatelli, A. Mantero, B. Mascialino, S. Parlati, A. Pfeiffer, M.G. Pia, A. Ribon, P. Viarengo. A Statistical Toolkit for Data Analysis. 9th Topical Seminar on Innovative Particle and Radiation Detectors, 23 - 26 May 2004, Siena, Italy

  2. Data analysis in HEP Provide tools for the statistical comparison of distributions in terms of: • Equivalent reference distributions; • Experimental measurements; • Data from reference sources; • Functions deriving from theoretical calculations or fits;

  3. Applications • Validation of Geant4 electromagnetic physics models • Attenuation coefficients, CSDA ranges, Stopping Power, distributions of physics quantities • Quantitative comparisons to experimental data and recognised standard references • Detector monitoring; • Simulation validation; • Reconstruction vs. Expectation; • Regression testing; • Physics analysis;

  4. Example of Applications I: Photon mass attenuation coefficient. [Figure: photon beam (I0) incident on an absorber, transmitted photons (I); data sets: NIST, G4 Standard, G4 LowE.] Absorber materials: Be, Al, Si, Ge, Fe, Cs, Au, Pb, U

  5. Example of Applications II: Electron stopping power and CSDA range. Absorber materials: Be, Al, Si, Ge, Fe, Cs, Au, Pb, U

  6. GoF statistical toolkit: a project to develop a statistical comparison system • Qualitative evaluation • Quantitative evaluation • Comparison of distributions • Goodness of fit testing

  7. Software Process guidelines • Unified Software Development Process, specifically tailored to the project • practical guidance and tools from the RUP • both rigorous and lightweight • mapping onto ISO 15504 • Guidance from ISO 15504 • Incremental and iterative life cycle model with a spiral approach

  8. Architectural guidelines • The project adopts a solid architectural approach • to offer the functionality and the quality needed by the users • to be maintainable over a large time scale • to be extensible, to accommodate future evolutions of the requirements • Component-based approach • to facilitate re-use and integration in different frameworks • AIDA • adopt a (HEP) standard • no dependence on any specific analysis tool

  9. The algorithms are specialised on the kind of distribution (binned/unbinned). Every algorithm has been rigorously tested. Documentation available: http://www.ge.infn.it/geant4/analysis/HEPstatistics/

  10. Chi-Squared test • Applies to binned distributions • It can also be useful for unbinned distributions, but the data must first be grouped into classes • Cannot be applied if the theoretical (expected) frequency in any class is < 5 • In that case, one could try to merge contiguous classes until the minimum theoretical frequency is reached • Otherwise one could use Yates' formula
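
The class-merging rule on this slide can be sketched in a few lines. This is a minimal illustration, not the toolkit's implementation: the function names `merge_small_classes` and `chi_squared`, the left-to-right merging order, and the choice to fold a too-small tail into the last merged class are all assumptions made for the example.

```python
def merge_small_classes(observed, expected, min_expected=5.0):
    """Merge contiguous classes until each expected count is >= min_expected.

    Hypothetical helper: classes are accumulated left to right, and any
    leftover tail that stays below the threshold is folded into the last
    merged class.
    """
    merged_obs, merged_exp = [], []
    acc_obs = acc_exp = 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs = acc_exp = 0.0
    if acc_obs > 0 or acc_exp > 0:
        if merged_exp:
            # Tail too small on its own: join it to the previous class.
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            # Everything together is still below the threshold.
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def chi_squared(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    obs, exp = merge_small_classes([1, 3, 10, 12, 2, 1], [2, 4, 9, 11, 3, 1])
    print(obs, exp, chi_squared(obs, exp))
```

Merging reduces the number of classes, so the degrees of freedom used to look up the p-value must be reduced accordingly.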

  11. More sophisticated algorithms: supremum statistics ($D_{mn}$) for unbinned distributions • Kolmogorov-Smirnov test • Goodman approximation of KS test • Kuiper test [Figure panels: original distributions; empirical distribution function.]
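
As an illustration of what a supremum statistic measures, the sketch below computes the two-sample Kolmogorov-Smirnov distance and the Kuiper statistic directly from the empirical distribution functions of two unbinned samples. It is a minimal sketch, not the toolkit's code; the function names are hypothetical, and p-value computation is left out.

```python
from bisect import bisect_right


def edf_differences(sample_a, sample_b):
    """Return F_a(x) - F_b(x) evaluated at every pooled data point x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    points = sorted(set(a) | set(b))
    diffs = []
    for x in points:
        f_a = bisect_right(a, x) / len(a)  # fraction of a with value <= x
        f_b = bisect_right(b, x) / len(b)  # fraction of b with value <= x
        diffs.append(f_a - f_b)
    return diffs


def ks_distance(sample_a, sample_b):
    """Kolmogorov-Smirnov distance: largest absolute EDF difference."""
    return max(abs(d) for d in edf_differences(sample_a, sample_b))


def kuiper_distance(sample_a, sample_b):
    """Kuiper statistic: largest positive plus largest negative deviation."""
    diffs = edf_differences(sample_a, sample_b)
    d_plus = max(max(diffs), 0.0)
    d_minus = max(-min(diffs), 0.0)
    return d_plus + d_minus


if __name__ == "__main__":
    print(ks_distance([1, 4], [2, 3]), kuiper_distance([1, 4], [2, 3]))
```

Both EDFs are step functions that only change at data points, so evaluating the difference at the pooled points is enough to find the supremum.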

  12. More powerful algorithms: tests containing a weighting function. Unbinned distributions: • Cramér-von Mises test • (Tiku test) • Anderson-Darling test. These algorithms are so powerful that we decided to implement their equivalents for binned distributions: • Fisz-Cramér-von Mises test • (Tiku test) • k-sample Anderson-Darling test
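
The role of the weighting function can be made concrete with a small sketch of the two-sample quadratic EDF statistic, (mn/N^2) * sum over pooled points of w(H) * (F_m - G_n)^2, where H is the pooled EDF and N = m + n. A constant weight gives the Cramér-von Mises form; the weight 1/(H(1 - H)) gives the Anderson-Darling form, which emphasises the tails. This is an illustration under the assumption of no tied values, with hypothetical function names; it is not the toolkit's implementation and does not cover the binned variants.

```python
def weighted_edf_statistic(sample_a, sample_b, weight):
    """Quadratic EDF statistic with weight function w(H), assuming no ties."""
    m, n = len(sample_a), len(sample_b)
    total = m + n
    # Tag each value with its sample of origin, then sort the pooled sample.
    pooled = sorted([(x, 0) for x in sample_a] + [(y, 1) for y in sample_b])
    count_a = count_b = 0
    acc = 0.0
    for rank, (_, origin) in enumerate(pooled, start=1):
        if origin == 0:
            count_a += 1
        else:
            count_b += 1
        if rank == total:
            break  # both EDFs equal 1 here, so this point contributes nothing
        h = rank / total                   # pooled EDF at this point
        diff = count_a / m - count_b / n   # F_m - G_n at this point
        acc += weight(h) * diff * diff
    return m * n / total ** 2 * acc


def cramer_von_mises(sample_a, sample_b):
    """Uniform weight: every part of the range counts equally."""
    return weighted_edf_statistic(sample_a, sample_b, lambda h: 1.0)


def anderson_darling(sample_a, sample_b):
    """Weight 1 / (H (1 - H)): differences in the tails count more."""
    return weighted_edf_statistic(
        sample_a, sample_b, lambda h: 1.0 / (h * (1.0 - h))
    )


if __name__ == "__main__":
    print(cramer_von_mises([1, 3], [2, 4]), anderson_darling([1, 3], [2, 4]))
```

Unlike the supremum statistics, every pooled point contributes to the sum, which is why these tests respond to differences spread along the whole range of x.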

  13. How to decide the power of an algorithm? A test is considered powerful if the probability of accepting the null hypothesis when the null hypothesis is wrong is low. Expected ordering of power: $\chi^2$ < supremum statistics tests < tests containing a weight function • $\chi^2$ loses information in a test for unbinned distributions by grouping the data into cells (Kac, Kiefer and Wolfowitz (1955) showed that the Kolmogorov-Smirnov test requires $n^{4/5}$ observations compared to $n$ observations for $\chi^2$ to attain the same power) • Cramér-von Mises and Anderson-Darling statistics are expected to be superior to Kolmogorov-Smirnov's, since they compare the two distributions all along the range of x, rather than looking for a marked difference at one point. This is now work in progress.

  14. User's point of view • Simple user layer • Only deal with AIDA objects and choice of comparison algorithm. The user is completely shielded from both statistical and computing complexity. [Diagram: the user extracts the algorithm by writing one line of code; the toolkit returns the statistical result.]

  15. Results and practical applications Collaborations with:

  16. Microscopic validation of physics. Chi-squared test. Geant4 simulations are statistically comparable with reference data (NIST database, http://www.nist.gov). [Figure: NIST compared with Geant4 Standard (N-S) and Geant4 LowE (N-L). Reported values, all with $\nu = 28$ and $p = 1$: $\chi^2_{N-S}$ = 0.532, 0.373, 0.267; $\chi^2_{N-L}$ = 1.928, 5.882, 1.315.]

  17. Test beam at Bessy (Bepi-Colombo Mission). X-ray fluorescence spectrum in Iceland basalt ($E_{IN}$ = 6.5 keV). [Figure: counts vs. energy (keV).] Very complex distributions. $\chi^2$ not appropriate (< 5 entries in some bins; physical information would be lost if rebinned). Anderson-Darling: $A_c$ (95%) = 0.752. Experimental measurements are comparable with Geant4 simulations. A. Mantero, M. Bavdaz, A. Owens, A. Peacock, M.G. Pia, "Simulation of X-ray Fluorescence and Application to Planetary Astrophysics"

  18. Medical applications in hadron therapy. Experimental measurements are comparable with Geant4 simulations. Kolmogorov-Smirnov: $D_{EXP-GEANT4}$ = 0.11, p = n.s. Goodman approximation of Kolmogorov-Smirnov: $\chi^2_{EXP-GEANT4}$ = 3.8, $\nu$ = 2, p = n.s. G.A.P. Cirrone, G. Cuttone, S. Donadio, S. Guatelli, S. Lo Nigro, B. Mascialino, M.G. Pia, L. Raffaele, G.M. Sabini, "Implementation of a new Monte Carlo Simulation Tool for the Development of a proton Therapy Beam Line and Verification of the Related Dose Distributions"
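
The Goodman approximation quoted on this slide converts a Kolmogorov-Smirnov distance into a chi-squared value with 2 degrees of freedom. The sketch below uses the commonly quoted form, chi2 = 4 D^2 mn / (m + n), and the fact that the chi-squared survival function for 2 degrees of freedom is exactly exp(-chi2 / 2). The function names are hypothetical, and the sample sizes behind the slide's numbers are not given in the source, so only the last step (chi-squared to p-value) is checked against the slide.

```python
import math


def goodman_chi2(d_mn, m, n):
    """Goodman approximation: map a two-sample KS distance to a chi-squared
    value with 2 degrees of freedom (m and n are the two sample sizes)."""
    return 4.0 * d_mn ** 2 * m * n / (m + n)


def chi2_pvalue_2dof(chi2):
    """Upper-tail probability of a chi-squared variable with 2 dof."""
    return math.exp(-chi2 / 2.0)


if __name__ == "__main__":
    # Value reported on the slide: chi-squared = 3.8 with 2 degrees of freedom.
    print(chi2_pvalue_2dof(3.8))
```

For the reported value of 3.8 this gives a p-value of about 0.15, which is consistent with the slide's "n.s." (not significant) at the usual 5% level.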

  19. Conclusions • This is a new, up-to-date, easy to handle and powerful tool for statistical comparison in particle physics. • It is the first tool supplying such a variety of sophisticated and powerful statistical tests in HEP. • AIDA interfaces allow its integration in any other data analysis tool. Applications in: HEP, astrophysics, medical physics
