This document discusses the optimization strategies and challenges faced in enhancing LHCb software compilation using various compilers, including GCC and ICC. Key areas of focus include optimizing compiler flags, utilizing vectorization techniques, and improving performance testing methodologies. The need for reference jobs for performance comparisons is highlighted, as well as the integration of static code analysis tools for further optimization. The results should lead to an incremental improvement in code quality and efficiency, addressing licensing issues and enhancing the overall performance of the LHCb software stack.
Software Optimization
Stefan Roiser
Compilers
• Compiler candidates in addition to gcc
  • icc
    • Is the licensing issue sorted out with openlab?
    • Status: AA nightlies OK, LHCb nightlies almost OK
    • Performance testing needed
  • llvm
    • Performance should now be in the same range as gcc
    • AA nightlies are currently broken
• We need one or several “reference jobs” for performance comparisons
S. Roiser - LHCb SW Programme of Work
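A "reference job" comparison could be driven by a small timing harness like the sketch below. It is not part of the LHCb tooling: the names `median_runtime_ms` and `relative_change` are hypothetical, the job is any callable standing in for a real reference job, and C++11 is assumed. The idea is to build the same harness with each compiler candidate and compare the medians.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: runs `job` `repetitions` times and returns the median
// wall-clock time in milliseconds (upper median for an even number of runs).
// The median is less sensitive to occasional slow runs than the mean.
double median_runtime_ms(const std::function<void()>& job, std::size_t repetitions) {
    assert(repetitions > 0);
    std::vector<double> samples;
    samples.reserve(repetitions);
    for (std::size_t i = 0; i < repetitions; ++i) {
        const auto start = std::chrono::steady_clock::now();
        job();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}

// Hypothetical helper: relative change of a candidate build against a
// baseline build. Negative values mean the candidate is faster.
double relative_change(double baseline_ms, double candidate_ms) {
    assert(baseline_ms > 0.0);
    return (candidate_ms - baseline_ms) / baseline_ms;
}
```

In use, the median from a gcc build would serve as `baseline_ms` and the median from an icc or llvm build as `candidate_ms`; a result of -0.05 would then indicate a 5% faster candidate.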
Compiler options
• Currently -O3 is being tested for the LHCb software stack
  • Again we need a “reference job”
• If we optimize, how deep into the stack should we go?
  • Only LHCb projects
  • “Major impact” AA packages (e.g. Boost, GSL, …)
  • All AA packages
Optimization
• Vectorization: compile with SSE enabled
  • How much could we possibly gain by using it?
  • Is the code prepared for it? Probably not…
• Many more compiler flags are available
  • What is the proper combination?
  • Optimize at the CPU architecture level?
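To make the "is the code prepared for it?" question concrete, the sketch below shows the kind of loop a compiler can usually auto-vectorize: contiguous arrays, no loop-carried dependency, and no aliasing between input and output. The function name `scale_add` is hypothetical, and `__restrict` is a compiler extension (accepted by gcc, clang and icc), not standard C++. With gcc, the vectorizer is enabled at `-O3` (via `-ftree-vectorize`); `-msse2` selects the SSE2 instruction set and `-march=native` targets the build machine's CPU.

```cpp
#include <cassert>
#include <cstddef>

// Computes out[i] = k * a[i] + b[i] for i in [0, n).
// `__restrict` promises the compiler that the three arrays do not overlap,
// which removes one common obstacle to auto-vectorization.
void scale_add(float* __restrict out, const float* __restrict a,
               const float* __restrict b, float k, std::size_t n) {
    assert(n == 0 || (out && a && b));
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = k * a[i] + b[i];
    }
}
```

Loops over pointer-linked structures, loops with early exits, or loops calling non-inlined functions are typically not vectorized, which is one reason existing code may gain little from the flag alone.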
User code vectorization
• “Bottom up” approach
  • Find patterns in user code that can be vectorized
  • Provide an abstraction that uses intrinsics
  • Optimized for the CPU architecture (== SIMD instruction set)
    • MP boxes appear ~1 yr after a new microarchitecture
• Advantage: it’s not a “big bang” but rather an incremental improvement of the code base
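An abstraction over intrinsics, as suggested above, might look like the following minimal sketch. The type `Float4` and the function `multiply_add` are hypothetical names, not an existing LHCb or Gaudi interface. On x86 builds with SSE enabled it uses the SSE intrinsics from `<xmmintrin.h>`; elsewhere it falls back to plain scalar code, so user code written against the wrapper does not change when the instruction set does.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE__)
#include <xmmintrin.h>  // SSE intrinsics: _mm_loadu_ps, _mm_add_ps, ...
#endif

// Hypothetical wrapper around four packed single-precision floats.
struct Float4 {
#if defined(__SSE__)
    __m128 v;
    static Float4 load(const float* p) { Float4 r; r.v = _mm_loadu_ps(p); return r; }
    void store(float* p) const { _mm_storeu_ps(p, v); }
    friend Float4 operator+(Float4 a, Float4 b) { Float4 r; r.v = _mm_add_ps(a.v, b.v); return r; }
    friend Float4 operator*(Float4 a, Float4 b) { Float4 r; r.v = _mm_mul_ps(a.v, b.v); return r; }
#else
    // Scalar fallback with the same interface for non-SSE targets.
    float v[4];
    static Float4 load(const float* p) { Float4 r; for (int i = 0; i < 4; ++i) r.v[i] = p[i]; return r; }
    void store(float* p) const { for (int i = 0; i < 4; ++i) p[i] = v[i]; }
    friend Float4 operator+(Float4 a, Float4 b) { Float4 r; for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i]; return r; }
    friend Float4 operator*(Float4 a, Float4 b) { Float4 r; for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] * b.v[i]; return r; }
#endif
};

// Computes out[i] = a[i] * b[i] + c[i], four elements at a time,
// with a scalar loop for the remaining (n % 4) elements.
void multiply_add(const float* a, const float* b, const float* c,
                  float* out, std::size_t n) {
    assert(n == 0 || (a && b && c && out));
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        (Float4::load(a + i) * Float4::load(b + i) + Float4::load(c + i)).store(out + i);
    }
    for (; i < n; ++i) {
        out[i] = a[i] * b[i] + c[i];
    }
}
```

The unaligned load and store intrinsics are used so the sketch works on any pointer; an implementation that controls data alignment could use the aligned variants instead.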
Dictionary generation
• Currently we use gccxml to generate dictionary information for persistency and interactive usage. The gcc version internal to gccxml is 4.2 and will not be upgraded
• Closely connected to the ROOT upgrade scenario for production releases
  • June ’12: cling will be enabled
  • Dec ’12: persistency will work, maybe with a genreflex replacement?
  • June ’13: genreflex replacement available
Possible scenarios for reflection
• Stay as we are with gccxml
  • We need to carry new compiler options forward
  • Already problematic for gcc 4.6 (disable warnings)
• Use the possibility of gcc plugins
  • Extract “gccxml” code to produce reflection XML
  • Prototype exists, could be a stop-gap solution
• Do nothing and wait for ROOT/clang
  • Possibility to test the full chain in June ’13 at the latest
Profiling
• Little effort so far within LHCb
• What do we want to profile?
  • CPU, memory, disk I/O, network, …
• Areas?
  • HLT, offline productions, user analysis
• Probably can be generalized for all Gaudi projects
Possible Tools
• Static code analyzer
  • E.g. Coverity, can check for special patterns
  • Probably very good as a first approximation
  • Is available for LHCb software
• Profiler
  • Flat – valgrind
  • Sampling – gprof, VTune
Existing Tool
• CPU profiling based on Intel VTune Amplifier
• Used for HLT profiling so far
• Extension for Gaudi algorithms (via Gaudi Auditor)
[Figures: CPU / Gaudi algorithm; CPU / line of code]
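The per-algorithm accounting that an auditor-style hook provides can be illustrated with the standalone sketch below. It deliberately does not use the real Gaudi auditor interface: `AlgorithmTimer`, `before` and `after` are hypothetical names that only mimic the idea of a callback before and after each algorithm execution. It also measures wall-clock time with `std::chrono::steady_clock` as a stand-in; a profiler such as VTune attributes CPU samples instead.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for an auditor: accumulates elapsed time and call
// counts per algorithm name between matching before()/after() calls.
class AlgorithmTimer {
public:
    void before(const std::string& algorithm) {
        started_[algorithm] = Clock::now();
    }

    void after(const std::string& algorithm) {
        const auto it = started_.find(algorithm);
        assert(it != started_.end());  // after() requires a matching before()
        Entry& entry = totals_[algorithm];
        entry.seconds +=
            std::chrono::duration<double>(Clock::now() - it->second).count();
        ++entry.calls;
        started_.erase(it);
    }

    // Number of completed executions recorded for `algorithm`.
    std::size_t calls(const std::string& algorithm) const {
        const auto it = totals_.find(algorithm);
        return it == totals_.end() ? 0 : it->second.calls;
    }

    // Total elapsed seconds recorded for `algorithm`.
    double seconds(const std::string& algorithm) const {
        const auto it = totals_.find(algorithm);
        return it == totals_.end() ? 0.0 : it->second.seconds;
    }

private:
    typedef std::chrono::steady_clock Clock;
    struct Entry {
        double seconds = 0.0;
        std::size_t calls = 0;
    };
    std::map<std::string, Clock::time_point> started_;
    std::map<std::string, Entry> totals_;
};
```

Dividing `seconds(name)` by `calls(name)` gives a mean time per execution, which corresponds to the "CPU / Gaudi algorithm" view; the "CPU / line of code" view needs a sampling profiler and cannot be reproduced with hooks like these.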
Final thoughts
• Most areas of software optimization need much more investigation
• Altogether, a lot of work is needed in this area
• All ranges of effort are available
  • Few weeks -> medium-size projects
• Dedicated manpower is definitely necessary