
Platform stability and track-fit problems

M. Moulson, T. Spadaro, P. Valente. Tracking Meeting, 18 Jul 2001.


Presentation Transcript


  1. Platform stability and track-fit problems
     M. Moulson, T. Spadaro, P. Valente
     Tracking Meeting, 18 Jul 2001

  2. Warning sign: platform dependence
     • Test of DBV-10 on SunOS and AIX:
       • Input: 1000 raw events
       • Output in ksl stream:
         • AIX: 11 events, incl. 3 not found on SunOS
         • SunOS: 10 events, incl. 2 not found on AIX
         • AIX or SunOS: 13 events
         • Mostly KSTAG (one INTERTAG)
         • 2 events found on both platforms with different length
     • General caveat (?'s):
       • Parameter space is huge; this is a quick survey
       • Most tests done on very small (single-event) samples
       • Tests should be done methodically once direction is clear
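
The event counts on this slide are consistent with each other: 11 − 3 = 10 − 2 = 8 events are common to both platforms, and 8 + 3 + 2 = 13 appear on at least one. A minimal sketch of such a stream comparison follows; the event identifiers are hypothetical placeholders, and only the counts come from the slide.

```python
# Hypothetical event identifiers; only the counts are taken from the slide.
common = {f"evt{i}" for i in range(8)}       # found on both platforms
aix_only = {"evt_a1", "evt_a2", "evt_a3"}    # 3 events found only on AIX
sunos_only = {"evt_s1", "evt_s2"}            # 2 events found only on SunOS

aix = common | aix_only
sunos = common | sunos_only


def compare_streams(a, b):
    """Return counts: (only in a, only in b, in both, in either)."""
    return len(a - b), len(b - a), len(a & b), len(a | b)


only_aix, only_sunos, both, either = compare_streams(aix, sunos)
print(len(aix), len(sunos), only_aix, only_sunos, both, either)
# prints: 11 10 3 2 8 13
```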

  3. Where do the differences arise?
     • Differences appear at track-fit level
       • Reconstruction through PR identical
         • including DTCE(1), DHRE(1) banks
     • Differences appear in DBV-9
       • Test DBV's 7, 8, 9, 10 on a single event (?)
       • Reconstruction in general different in each version
       • Same on AIX and SunOS platforms for DBV-7, 8
     • CVS history: changes in DBV-9
       • dconvr: Spatial resolutions from data
       • vtxfin: Various small bug fixes
     • Suspect effect from new parameterization of hit resolution

  4. First-crack diagnostics
     • Cannot eliminate effect just by switching off algorithms
       • Kink finding, track joining, M.S., hit add./rej., etc.
       • Hit flipping not switchable
       • Fine t-s relations a possible exception
     • Known changes correspond to onset of differences (?)
       • Provide a plausible mechanism for effect
     • Cannot eliminate effect by disabling code-optimization

  5. Summary of first-crack diagnostics
     • Input: 1000 raw events from run 18805
     • Table summarizes differences in ksl stream
     [Table not preserved in transcript]

  6. DFITER: Fundamental track-fit routine
     [Flow chart; box and decision labels in reading order]
     • Get space points from track pars. (q)
     • DFTRAC
     • Time-space conversion
     • i < 1
     • DFBCOR
     • DFDRV
     • Get residuals, χ², V = dχ²/dq
     • χ² > χ²(old)
     • LEQU64
     • V dq = q; q = q + dq
     • Δχ² < cut → CONVERGED
     • Max iter? → FAILED
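
The loop structure in this flow chart (compute residuals and χ², solve a linear system for the parameter step, update, test Δχ² against a cut, fail after a maximum number of iterations) can be sketched with a toy straight-line fit. This is an illustration of the general technique only, not the DFITER code: the function names, the two-parameter model and the cut value are assumptions, and the χ² > χ²(old) branch of the chart is not modelled. The limit of 8 iterations is the one quoted on the next slide.

```python
def chi2_and_derivs(params, xs, ys, sigma):
    """Return chi2, the vector g = sum(d * r) and the matrix V = sum(d d^T)."""
    a, b = params
    chi2 = 0.0
    g = [0.0, 0.0]
    V = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in zip(xs, ys):
        r = (y - (a + b * x)) / sigma      # normalised residual
        d = (1.0 / sigma, x / sigma)       # d(model)/d(a, b), normalised
        chi2 += r * r
        for i in range(2):
            g[i] += d[i] * r
            for j in range(2):
                V[i][j] += d[i] * d[j]
    return chi2, g, V


def solve2(V, g):
    """Solve the 2x2 linear system V * step = g by Cramer's rule."""
    det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
    return [(V[1][1] * g[0] - V[0][1] * g[1]) / det,
            (V[0][0] * g[1] - V[1][0] * g[0]) / det]


def fit(xs, ys, sigma=1.0, start=(0.0, 0.0), max_iter=8, cut=1e-6):
    """Iterate until the change in chi2 falls below `cut` or max_iter is hit."""
    params = list(start)
    chi2_old, g, V = chi2_and_derivs(params, xs, ys, sigma)
    chi2 = chi2_old
    for iteration in range(1, max_iter + 1):
        step = solve2(V, g)
        params = [p + s for p, s in zip(params, step)]
        chi2, g, V = chi2_and_derivs(params, xs, ys, sigma)
        if abs(chi2_old - chi2) < cut:
            return params, chi2, iteration, True    # converged
        chi2_old = chi2
    return params, chi2, max_iter, False            # failed


params, chi2, iterations, converged = fit([0.0, 1.0, 2.0, 3.0],
                                          [1.0, 3.0, 5.0, 7.0])
print(params, chi2, iterations, converged)
```

Because the toy model is linear, the first step already lands on the minimum and the next iteration only confirms that Δχ² is below the cut. A real track fit is nonlinear, which is where slow or failed convergence comes from.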

  7. Issues with DFITER and call limits
     • DFITER called at various points
       • at start of event
       • after each hit flipped in DFLIP
       • after DFMUSC, DFDEDX, etc.
       • at end of event
     • Max iterations in a single call: 8
     • On failure, convergence criterion relaxed and called again
       • up to 15 times per track from most places
       • up to 15 times more for dE/dx and at end of event
     • Most tracks reconstructed differently show convergence problems
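
The retry policy described on this slide (relax the convergence criterion after each failed call, up to 15 calls) might be sketched as below. The relaxation factor, the starting cut and the stand-in `toy_fit` function are assumptions for illustration; only the limits of 8 iterations per call and 15 calls come from the slide.

```python
MAX_ITER_PER_CALL = 8     # from the slide
MAX_CALLS = 15            # from the slide


def fit_with_retries(fit_once, cut, max_calls=MAX_CALLS, relax=10.0):
    """Call fit_once(cut) until it converges, relaxing `cut` after each failure.

    `relax` is a hypothetical relaxation factor, not a value from the slides.
    Returns (converged, number of calls used, cut in effect at the last call).
    """
    for call in range(1, max_calls + 1):
        if fit_once(cut):
            return True, call, cut
        cut *= relax
    return False, max_calls, cut


def toy_fit(cut):
    """Stand-in for a fit that only converges once the cut is loose enough."""
    return cut > 5e-4


converged, calls, final_cut = fit_with_retries(toy_fit, cut=1e-6)
print(converged, calls, final_cut)

# Worst case for a single call site: every call uses all its iterations.
print(MAX_ITER_PER_CALL * MAX_CALLS)   # prints: 120
```

In this toy run the fit succeeds on the fourth call, with a cut three orders of magnitude looser than the one requested, which shows how a track with convergence problems can end up accepted under a much weaker criterion.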

  8. Beginnings of an explanation
     • At first call to DFITER, parameters different by 10⁻⁵ to 10⁻⁶
     • Inside DFITER, after LEQU64, difference increases to 10⁻⁵ to 10⁻⁴
     • Differences accumulate with each call to DFITER
       • Eventually jump bins in fine t-s
     • Problem exacerbated when convergence difficult
     • Most critical track parameter: z
       • Can diverge by tens of cm, esp. in DFLIP
     [Figure: differences in % for a track; axis and panel labels: Tries, Hits, DFITER, DFLIP, Other Alg, End]
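
The mechanism on this slide has two parts: small rounding differences accumulate over repeated operations, and a small relative difference is enough to select a different bin of a finely binned lookup table. A minimal sketch follows, assuming single precision can be emulated by rounding a double through `struct`; the step size, loop count and bin width are hypothetical and unrelated to the actual t-s tables.

```python
import struct


def f32(x):
    """Round a double to the nearest single-precision value (emulation)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, n, single):
    """Sum `step` n times, optionally rounding to single after each add."""
    total = 0.0
    for _ in range(n):
        total += step
        if single:
            total = f32(total)
    return total


def bin_index(value, width):
    """Index of the lookup-table bin that contains `value`."""
    return int(value // width)


# Rounding at every step leaves a small relative difference in the sum.
t_double = accumulate(0.1, 1000, single=False)
t_single = accumulate(0.1, 1000, single=True)
rel_diff = abs(t_single - t_double) / t_double

# Near a bin edge, a relative shift of 1e-5 changes the selected bin.
width = 0.5                       # hypothetical bin width
t_a = 100.0
t_b = 100.0 * (1.0 - 1e-5)
print(rel_diff, bin_index(t_a, width), bin_index(t_b, width))
```

Once two platforms read different bins, the difference is no longer at rounding level: it is the full step between neighbouring table entries.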

  9. Notes on machine precision
     • Why do we see differences at 10⁻⁵ level at input to DFITER?
       • In principle possible to have exact agreement between platforms for single calculation
       • In practice depends on optimization, autopromotion, rounding modes
       • E.g., AIX: our standard compiler flags do not round in single precision but autopromote single to double
       • Fair amount of code before this point; numerical errors accumulate rapidly
     • Part of solution will involve:
       • Tuning compilation parameters
       • Promoting key parts of track fit to double precision
     • NB: Matrix inversion already in double, looks OK
       • Worst case: V⁻¹V = 1 to within < 10⁻¹² (diag), 10⁻⁹ (off diag)
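
The effect of autopromotion can be shown with a single expression evaluated two ways: rounding to single precision after every operation, or keeping intermediates in double and rounding once at the end. The sketch below emulates single precision with `struct`; the input values are chosen by hand so that the two results differ, and nothing here reproduces the behaviour of any particular compiler or flag.

```python
import struct


def f32(x):
    """Round a double to the nearest single-precision value (emulation)."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Both inputs are exactly representable in single precision.
a = 1.0 + 2.0 ** -12
b = 1.0 + 2.0 ** -11

# a * a = 1 + 2**-11 + 2**-24 exactly; the last term is lost in single.
single = f32(f32(a * a) - b)      # round after every operation
promoted = f32(a * a - b)         # double intermediates, round once

print(single, promoted)
```

Here the strictly single-precision evaluation gives exactly 0, while the promoted one gives 2⁻²⁴ (about 6 × 10⁻⁸). Both are legitimate results for the same source code, which is why agreement between platforms depends on compiler settings.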

  10. Problems with DFLIP
     1. Residuals and drift distances not updated if hit not flipped
        • Basis for choice of next hit to flip
        • Always assume previous hit was flipped
     2. Failed DFITER calls count against max. retries
        • Looser convergence if lots of hits to flip (lots of failures)
        • Fewer calls to DFITER allowed
        • If a hit won't flip, do we want to retry?
     3. Criterion for keeping flip: χ² < input χ²
        • Flip is kept even if χ² worse than best so far
     4. More subjective issues
        • No use of information on z-progression?
     [Flow chart of DFLIP; labels in reading order: Get hits to flip; Sort, pick worst; Store track pars.; Flip hit; DFITER (15 times); χ² < input; Restore track pars.; Return]
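
Point 3 can be made concrete with a toy comparison of the two acceptance criteria: keep a flip if χ² is below the input χ², or keep it only if χ² is below the best value seen so far. The function and the χ² values are hypothetical; in the real routine the χ² after each flip depends on which earlier flips were kept.

```python
def kept_flips(chi2_input, chi2_after_flip, compare_to_best):
    """Return the indices of the flips that would be kept.

    compare_to_best=False: keep a flip if chi2 < input chi2 (as on the slide).
    compare_to_best=True:  keep a flip only if chi2 < best chi2 so far.
    """
    reference = chi2_input
    kept = []
    for i, chi2 in enumerate(chi2_after_flip):
        if chi2 < reference:
            kept.append(i)
            if compare_to_best:
                reference = chi2   # tighten the reference to the new best
    return kept


chi2_values = [6.0, 8.0, 12.0]     # hypothetical chi2 after each trial flip
print(kept_flips(10.0, chi2_values, compare_to_best=False))  # prints: [0, 1]
print(kept_flips(10.0, chi2_values, compare_to_best=True))   # prints: [0]
```

With the input criterion the second flip is kept although it makes the fit worse than the first flip did; comparing against the best χ² so far rejects it.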

  11. A possible wish list
     • More study of the problem
     • More sensible calling strategy for DFITER
     • Small fixes to DFLIP (for now)
     • Use l/2 instead of sampling for small drift distances
     • Double precision at key points
     • Compiler flags to consistently handle numerical inaccuracy
     • Smoothing of t-s relations, resolution curves
     • Evaluate efficacy of changes based on:
       • parameter resolution, track χ², L/R resolution accuracy, track splitting, machine stability
     • While monitoring traditional quantities:
       • efficiency, hit efficiency, purity, CPU time

  12. A final note: Time is critical!
     • August downtime will be only point in near future when large-scale reprocessing possible
