Wahid Bhimji reports on xrootd testing for High Energy Physics (HEP) storage: DPM support for xrootd has improved, ECDF now uses xrootd for file access, and the redirection aspects are being tested for ATLAS. Federation traffic is modest now and is expected to grow in production. Further developments include making ECDF a non-SRM site for ATLAS as part of the WLCG “Storage Interfaces” Working Group. Additionally, HEPDOOP, a proposal bridging Big Data and HEP, aims to provide advanced data processing tools for academia and industry, starting with an ATLAS Higgs analysis using non-HEP tools.
Future developments: storage
Wahid Bhimji
Xrootd testing
• Xrootd is a file access protocol used by HEP that offers both file-access performance and failover / redirection. DPM support for it has recently improved.
• ECDF (and Glasgow) now use xrootd copying instead of DPM’s legacy protocol, rfio.
• We are also involved in testing the redirection aspects for ATLAS (“FAX”).
• http / WebDAV offers a more widely used alternative.
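The failover / redirection idea can be sketched in plain Python. This is a minimal illustration only, not the xrootd client API: the opener functions, the `ReplicaUnavailable` exception, and the file path are all hypothetical stand-ins for real storage clients.

```python
class ReplicaUnavailable(Exception):
    """Raised by an opener when it cannot serve the requested file."""


def open_with_failover(path, endpoints):
    """Try each (name, opener) pair in order.

    Returns (name, handle) from the first opener that succeeds.
    Raises ReplicaUnavailable if every endpoint fails.
    """
    errors = []
    for name, opener in endpoints:
        try:
            return name, opener(path)
        except ReplicaUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise ReplicaUnavailable("; ".join(errors))


# Hypothetical stand-ins: local storage misses the file,
# the federation redirector finds a remote replica.
def local_storage(path):
    raise ReplicaUnavailable("file not on local storage")


def federation_redirector(path):
    return f"remote handle for {path}"


source, handle = open_with_failover(
    "/atlas/example.root",  # hypothetical path
    [("local", local_storage), ("redirector", federation_redirector)],
)
print(source, handle)
```

Ordering the endpoints from local to remote reflects the assumption that local reads are preferred and the redirector is only a fallback.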
Federation traffic
Modest levels now; will grow when in production
• In fact, including local traffic, UK sites dominate
• Oxford and ECDF switched to xrootd for local traffic
Systematic FDR load tests in progress
EU cloud results (slide stolen from I. Vukotic)
Absolute values are not important (affected by CPU / HT etc. and setup)
The point is that remote read can be good but varies
Other stuff
• Puppet: testing DPM modules for ECDF storage
  • But: we don’t use it for WNs or anything else …
• S3: (with Imperial Swift instance, not ECDF)
  • DPM integration: some problems with accessing Swift storage; new development version to test…
  • Access of files from cluster via ROOT: not done yet
• SRM: making ECDF a non-SRM site for ATLAS
  • As part of WLCG “Storage Interfaces” Working Group
  • Stage-out; FTS3 copies; space reporting: all in progress
“HEPDOOP”: a proposal
• “Big data” is not just a buzzword: plenty of industry activity
• HEP uses few of the same tools
• HEPDOOP bridges the divide
1st Phase (1 year): Technical review via demonstrators
• Workshops with interspersed development activities
• Use-case focused: deliver ATLAS Higgs analysis with non-HEP tools
• Milestones:
  • BigData Workshop, Imperial, 28th June
  • CHEP2013 (poster + possible birds-of-a-feather session)
2nd Phase: Possible ongoing activity providing a technical-level bridge between GridPP and wider Big Data communities:
• Continuing interoperability in the case of common aims
• Delivering advanced data processing and management tools for HEP, wider academia, and industry
Initial development areas
Principle: focus on ease-of-use and access to wide community, not (just) performance
Typical HEP analysis flow (diagram labels): Ntuple making; Data Filtering; Skimming / Slimming; Data Mining; Cuts; Multivariate Analyses; Statistical Analysis; Visualisation
Starting with skimming and mining:
• Python (scikit) version of H->bb analysis implemented
• Next step: map / reduce skimming code on local Hadoop cluster (or cloud resources)
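To make the map / reduce skimming step concrete, here is a minimal pure-Python sketch of the pattern. It is an illustration under stated assumptions, not the HEPDOOP code: the toy events, the field names (`run`, `n_bjets`, `m_bb`), and the cut of at least two b-jets are all hypothetical, and no Hadoop API is used.

```python
from collections import defaultdict

# Hypothetical toy events; a real job would read these from ntuples.
EVENTS = [
    {"run": 1, "n_bjets": 2, "m_bb": 118.0},
    {"run": 1, "n_bjets": 1, "m_bb": 95.0},
    {"run": 2, "n_bjets": 2, "m_bb": 131.5},
    {"run": 2, "n_bjets": 3, "m_bb": 124.0},
    {"run": 2, "n_bjets": 0, "m_bb": 0.0},
]


def map_skim(event):
    """Map step: emit (run, 1) for each event passing the skim cut."""
    if event["n_bjets"] >= 2:  # hypothetical selection
        yield event["run"], 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def run_job(events):
    """Chain the map and reduce steps on an in-memory event list."""
    mapped = (pair for event in events for pair in map_skim(event))
    return reduce_counts(mapped)


selected_per_run = run_job(EVENTS)
print(selected_per_run)
```

Because the map step looks at one event at a time, the same function could in principle be run in parallel over file splits, with only the per-key sums needing to be combined.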