These slides report on the distribution and validation of EWK datasets. Deprecated Summer08 datasets have been deleted from all T2 sites, and further datasets have been transferred to CIEMAT and Wisconsin. Validation jobs currently run locally at Wisconsin, and the validation code needs to be duplicated in PAT for cross-validation. A large number of duplicate events has been found in the Zjets-Madgraph sample, possibly caused by merging problems, but the log files are lost, so a brute-force search for duplicates over Madgraph (and possibly other) datasets is almost certainly required. Generator validation of the VV->4l Madgraph gridpack also shows W and Z mass distributions correlated with Pt, unphysical final-state particle masses, and a possible event-selection bias tied to the numbers of jobs and events per job used in central production.
EWK Dataset Distribution/Validation
• Deprecated Summer08 datasets have been deleted from all T2
• https://hypernews.cern.ch/HyperNews/CMS/get/ewk/201.html
• Datasets transferred to CIEMAT and Wisconsin since last EWK meeting
• https://hypernews.cern.ch/HyperNews/CMS/get/ewk/199/1.html
• https://hypernews.cern.ch/HyperNews/CMS/get/ewk/202.html
• Standard validation jobs run over GEN-SIM-RECO, AOD-SIM
• Validation jobs run locally at Wisconsin
• Action item: Need to duplicate code in PAT for cross-validation
20 September 2014
Duplicate Madgraph Events
• “Large number of duplicates in Zjets-madgraph sample” (found by Dmytro Kovalskyi)
• https://hypernews.cern.ch/HyperNews/CMS/get/generators/477.html
• Possibly problems in merging but log files are lost
• https://hypernews.cern.ch/HyperNews/CMS/get/generators/477/1.html
• Brute force search over Madgraph (and other?) datasets looking for duplicate sets of events almost certainly required
• Solution for future needed in central production
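A brute-force duplicate search of the kind mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: each event is taken to be a list of generator-level particles given as `(pdg_id, px, py, pz, energy)` tuples, and two events count as duplicates when their particle content agrees after rounding. The record layout, the function names, and the rounding precision are all hypothetical and are not taken from the CMS tools or the hypernews threads.

```python
from collections import defaultdict


def event_fingerprint(particles, digits=6):
    """Build an order-independent key from an event's particle list.

    particles: iterable of (pdg_id, px, py, pz, energy) tuples
               (hypothetical record layout, assumed for this sketch).
    digits:    decimal places kept when comparing momenta (assumed value).
    """
    rounded = [
        (pdg_id, round(px, digits), round(py, digits),
         round(pz, digits), round(energy, digits))
        for pdg_id, px, py, pz, energy in particles
    ]
    # Sorting makes the key independent of the particle ordering.
    return tuple(sorted(rounded))


def find_duplicate_events(events):
    """Return lists of event indices that share the same fingerprint."""
    groups = defaultdict(list)
    for index, particles in enumerate(events):
        groups[event_fingerprint(particles)].append(index)
    return [indices for indices in groups.values() if len(indices) > 1]


# Toy input: the third event repeats the first with its particles reordered.
event_a = [(13, 10.0, 0.0, 5.0, 11.18), (-13, -10.0, 0.0, 7.0, 12.21)]
event_b = [(11, 3.0, 4.0, 0.0, 5.0), (-11, -3.0, -4.0, 1.0, 5.1)]
events = [event_a, event_b, list(reversed(event_a))]

duplicates = find_duplicate_events(events)
```

Hashing a fingerprint keeps the search linear in the number of events, at the cost of holding one key per event in memory. Rounding catches exact copies only; values that differ slightly can still land on opposite sides of a rounding boundary.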
Another Madgraph Problem ...
• Generator validation of VV->4l Madgraph gridpack shows several undesirable features which may (or may not) be common to all Madgraph production
• https://hypernews.cern.ch/HyperNews/CMS/get/generators/443/2/2.html
• W and Z mass distributions correlated with Pt
• Some final state particles have extremely unphysical masses (e.g. tau mass > 100 GeV)
• Very tight (+/- ~10 GeV) W and Z mass windows
• Tight windows may be intrinsically necessary for Madgraph?
• Problem which may be present in all Madgraph datasets is a bias in event selection based on the absolute numbers of jobs and events per job used in central production
• Will produce high-statistics generator runs for all current Madgraph gridpacks to test this
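The unphysical-mass symptom above can be checked at generator level by recomputing each particle's invariant mass, m = sqrt(E^2 - |p|^2), and comparing it with the nominal value. The following is a minimal sketch, not the validation code used for these slides: the `(pdg_id, px, py, pz, energy)` layout, the function names, and the 0.1 GeV tolerance are assumptions made for illustration, and the nominal lepton masses are the standard values in GeV.

```python
import math

# Nominal charged-lepton masses in GeV, keyed by the absolute PDG id.
NOMINAL_MASS_GEV = {11: 0.000511, 13: 0.10566, 15: 1.77686}


def invariant_mass(px, py, pz, energy):
    """Return sqrt(E^2 - |p|^2), clamped at zero against rounding noise."""
    mass_squared = energy**2 - (px**2 + py**2 + pz**2)
    return math.sqrt(max(mass_squared, 0.0))


def flag_unphysical_masses(particles, tolerance_gev=0.1):
    """List (index, pdg_id, mass) for particles far from their nominal mass.

    particles:     iterable of (pdg_id, px, py, pz, energy) tuples
                   (hypothetical record layout, assumed for this sketch).
    tolerance_gev: allowed deviation from the nominal mass (assumed value).
    """
    flagged = []
    for index, (pdg_id, px, py, pz, energy) in enumerate(particles):
        nominal = NOMINAL_MASS_GEV.get(abs(pdg_id))
        if nominal is None:
            continue  # no reference mass listed for this particle type
        mass = invariant_mass(px, py, pz, energy)
        if abs(mass - nominal) > tolerance_gev:
            flagged.append((index, pdg_id, mass))
    return flagged


# Toy input: one on-shell tau and one tau carrying a 120 GeV mass.
good_tau = (15, 0.0, 0.0, 30.0, math.hypot(30.0, 1.77686))
bad_tau = (-15, 0.0, 0.0, 30.0, math.hypot(30.0, 120.0))

flagged = flag_unphysical_masses([good_tau, bad_tau])
```

A fixed absolute tolerance is the simplest choice here. For very energetic light particles the subtraction E^2 - |p|^2 loses precision, so a real check would probably need a tolerance that scales with energy.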