Analysis Framework - status

Andrei Gheata, Mihaela Gheata, Andreas Morsch

ALICE Offline Week – 16 Nov 2010

Few words of introduction
  • The AF provides handy tools and the freedom to do analysis at large scale
    • … but the resources that make this possible are stretched to the limit while being operated in emergency and chaotic mode.
  • Tolerance limits had to be introduced, but these cannot prevent all kinds of misuse or abuse
    • … so everybody is kindly asked to follow a few basic rules:
      • Read the documentation: it is better to understand than to copy code with bugs
      • Develop tasks locally, not on CAF/GRID
      • Check not only whether your task works, but also how many resources it needs (memory, CPU time)
      • Do NOT process entire productions for a single task. If you need to do that, it may be time to move the code into AliRoot so that it can run in a central train; talk to your PWG software coordinators
      • Use par files only if you have to, and DO NOT use the framework par files (ESD, AOD, ANALYSIS*, CORRFW). The framework supports compiling your task against the core libraries.
AliEn handler – staged merging
  • Previously, merging was done by accessing remote files from AliEn
    • In groups defined via SetMaxMergeFiles, with resuming supported, but scaling as Nfiles
  • New SetMergeViaJDL, supporting merging in stages in AliEn
    • The number of files per chunk is set using the same SetMaxMergeFiles
    • Nfiles/Nper_chunk jobs are sent for each merging stage, scaling as log(Nfiles)
    • When the plugin is run locally in “terminate” mode, the output of the merging jobs for the current stage is checked and the merging jobs for the missing outputs are resubmitted
      • NOTE: check in aliensh that all merging jobs are in a final state! It is preferable to resubmit failed jobs from aliensh.
    • If using a list of runs or run ranges, the final merging step will merge the outputs per run on the local client and run Terminate for the connected tasks
  • Possible improvement: create a collection of the output files, then merge according to the splitting
    • Requires the splitting to be uniform
    • Could be made dynamic if we had:
      • InputCollection = “FIND: basedir wildcard”
    • Would be very useful for automatic merging of AODs
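The chunked, staged scheme described above can be illustrated with a small standalone sketch. It is a toy model, not code from the AliEn plugin: the function name MergeJobsPerStage is hypothetical, and it assumes that every stage merges the outputs of the previous one in chunks until a single file is left (in practice the last step may run on the client instead).

```cpp
#include <cassert>
#include <vector>

// Toy model (not AliEn plugin code): number of merging jobs submitted at each
// stage when nFiles outputs are merged in chunks of at most perChunk files.
std::vector<int> MergeJobsPerStage(int nFiles, int perChunk) {
    assert(nFiles > 0 && perChunk > 1);  // perChunk == 1 would never reduce the file count
    std::vector<int> jobs;
    int remaining = nFiles;
    while (remaining > 1) {
        int nJobs = (remaining + perChunk - 1) / perChunk;  // ceil(remaining / perChunk)
        jobs.push_back(nJobs);
        remaining = nJobs;  // each job leaves one merged file for the next stage
    }
    return jobs;
}
```

Under these assumptions, 1000 outputs merged 100 at a time need two stages with 10 and 1 jobs, which is the logarithmic behaviour quoted on the slide.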
Staged merging workflow (diagram)
  • Plugin settings shown: plugin->SetMergeViaJDL() and plugin->SetMaxMergeFiles(3)
  • Client steps shown, in order: plugin->SetRunMode("full") with StartAnalysis("grid"); plugin->SetRunMode("terminate") with StartAnalysis("grid"), run twice; then mgr->Terminate() (per run)
  • Jobs are shown with MyAnalysis.root and an XML collection (wn.xml or 123456.xml)
  • Analysis job outputs: Output/001/AnalysisResults.root … Output/00n/AnalysisResults.root
  • Merging jobs MyAnalysis_merge(“…/Output”, stage=1, chunk): Output/AnalysisResults_Stage01_000.root, Output/AnalysisResults_Stage01_001.root, Output/AnalysisResults_Stage01_002.root
  • Merging jobs MyAnalysis_merge(“…/Output”, stage=2, chunk): Output/AnalysisResults.root
  • Running again with SetMergeViaJDL(kFALSE) will merge all runs on the client.

Single task optimization
  • Manually loading only the requested branches at task level (C. Loizides)
  • Calls branch->GetEntry() only for the selected events, instead of tree->GetEntry()
  • Can greatly reduce I/O for single-task (or few-task) analysis
    • Especially when looking for rare events without tags and/or needing well-localized information
  • This is an expert setting: when misused, it will silently make data belonging to branches that were not loaded unavailable!

STEERING MACRO:

analysisManager->SetAutoBranchLoading(kFALSE);

TASK:

//______________________________________________________________
void AliAnalysisTaskPt::UserExec(Option_t *)
{
  ...
  AliAnalysisManager *am = AliAnalysisManager::GetAnalysisManager();
  // Load only the small branches needed for the event selection
  am->LoadBranch("AliESDHeader.");
  am->LoadBranch("AliESDRun.");
  /* A meaningful check should go here; a dummy condition is used just to illustrate the example */
  if (some_condition_on_event_header_or_ESDRun_object) {
    return;
  }
  // The event is selected, so we can load the interesting branches:
  am->LoadBranch("Tracks");
  // Track loop to fill a pT spectrum
  printf("There are %d tracks in this event\n", fESD->GetNumberOfTracks());
  ... track loop
}
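To give a feeling for why selective branch loading helps with rare events, here is a rough standalone estimate. It is a toy model with hypothetical names (ToyEventSizes, BytesReadFull, BytesReadSelective) and made-up per-event sizes; it ignores ROOT basket granularity, compression and caching, so it only shows the trend, not real numbers.

```cpp
#include <cstddef>
#include <vector>

// Toy model: bytes read per event for the header branches and for the track branch.
struct ToyEventSizes {
    std::size_t headerBytes;
    std::size_t tracksBytes;
};

// Bytes read when every branch is loaded for every event (tree->GetEntry() style).
std::size_t BytesReadFull(const std::vector<bool>& selected, ToyEventSizes s) {
    return selected.size() * (s.headerBytes + s.tracksBytes);
}

// Bytes read when the header is always loaded but tracks only for selected events.
std::size_t BytesReadSelective(const std::vector<bool>& selected, ToyEventSizes s) {
    std::size_t total = 0;
    for (bool keep : selected) {
        total += s.headerBytes;            // needed to take the selection decision
        if (keep) total += s.tracksBytes;  // loaded only after the event passes
    }
    return total;
}
```

In this model the saving grows with the fraction of rejected events and with the size of the branches that are skipped.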

PROOF analysis via the AliEn handler
  • New API to add configuration related to the PROOF cluster
  • Completely transparent for the user task and for the steering macro
    • mgr->StartAnalysis("proof")
  • Most AF features available
    • Plugin “test” mode will run the analysis in PROOF-Lite mode on a local chain described by the file used in SetFileForTestMode

/*********************************************************
*** PROOF MODE SPECIFIC SETTINGS ************
*********************************************************/
// Proof cluster
plugin->SetProofCluster("alice-caf");
// plugin->SetProofCluster("skaf.saske.sk");
// Dataset to be used
// plugin->SetProofDataSet("/alice/data/LHC10e_000128175_p1#esdTree");
plugin->SetProofDataSet("/alice/data/LHC10e_000128452_p1#esdTree");
// May need to reset proof. Supported modes: 0-no reset, 1-soft, 2-hard
plugin->SetProofReset(0);
// May limit number of workers
plugin->SetNproofWorkers(0);
// May limit the number of workers per slave
plugin->SetNproofWorkersPerSlave(1);
// May use a specific version of root installed in proof
plugin->SetRootVersionForProof("current");
// May set the aliroot mode. Check http://aaf.cern.ch/node/83
plugin->SetAliRootMode("default"); // Loads AF libs by default
// May request ClearPackages (individual ClearPackage not supported)
plugin->SetClearPackages(kFALSE);
// Plugin test mode works only when provided with a file containing test file locations
plugin->SetFileForTestMode("files.txt");
// Request connection to alien upon connection to grid
plugin->SetProofConnectGrid(kFALSE);
return plugin;

Analysis statistics information
  • EventStat_temp.root available in AOD analysis
    • Added method AliInputEventHandler::GetStatistics(Option_t *option) to retrieve the physics selection histograms in ESD and AOD analysis.
    • This method returns the statistics TH2F histograms filled by AliPhysicsSelection if the task AliPhysicsSelectionTask is used in the ESD train (or was used during AOD production).
    • To use it, the user task must call this method during FinishTaskOutput (executed on the worker after all events are processed)

AliAnalysisManager *am = AliAnalysisManager::GetAnalysisManager();
AliInputEventHandler *inputH = dynamic_cast<AliInputEventHandler*>(am->GetInputEventHandler());
if (!inputH) return;
TH2F *histStat = dynamic_cast<TH2F*>(inputH->GetStatistics());
TH2F *histBin0 = dynamic_cast<TH2F*>(inputH->GetStatistics("BIN0"));

  • AliAnalysisManager::AddStatisticsMsg() adds user messages related to the processed statistics.
    • The analysis manager dumps all messages in a file called <nevents>.stat (nevents in format %09d) that is written after processing on the slave but also during Terminate (client)
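As a small illustration of the file naming described above, the sketch below formats an event count with %09d and appends the .stat extension. The helper name StatFileName is hypothetical; the framework writes this file itself, so user code would only need something like this to locate the file afterwards.

```cpp
#include <cstdio>
#include <string>

// Build "<nevents>.stat" with nevents zero-padded to 9 digits (%09d).
// Hypothetical helper, not part of the analysis framework.
std::string StatFileName(int nevents) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%09d.stat", nevents);
    return std::string(buf);
}
```

For 12345 processed events this yields 000012345.stat.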
CDB access and run number
  • New task created for CDB access in the QA train
    • PWG1/AliTaskCDBconnect.h/.cxx
    • To be used by tasks in central trains
  • Generally, the run number is not accessible in UserCreateOutputObjects
    • Added a new static method AliAnalysisManager::GetRunFromAlienPath() that extracts the run number from the path to the data (must be an AliEn path to data or MC). The plugin uses it to set the new data member AliAnalysisManager::fRunFromPath, which is available in UserCreateOutputObjects of any task when running via the plugin in grid mode.
  • Use mgr->GetRunFromPath() in your task to get it
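The idea of reading the run number from the data path can be sketched in plain C++. This is not the AliRoot implementation of GetRunFromAlienPath(): the function name RunFromPathSketch is hypothetical, and the sketch assumes that the run number is the first directory name made only of digits and 6 to 9 characters long (zero-padded for real data, unpadded for MC).

```cpp
#include <cctype>
#include <cstdlib>
#include <sstream>
#include <string>

// Return the first path component made only of digits with 6 to 9 characters,
// converted to an integer; return 0 if no such component is found.
// Assumed path layout, for illustration only.
int RunFromPathSketch(const std::string& path) {
    std::istringstream in(path);
    std::string part;
    while (std::getline(in, part, '/')) {
        if (part.size() < 6 || part.size() > 9) continue;  // skips e.g. the year "2010"
        bool allDigits = true;
        for (unsigned char c : part) {
            if (!std::isdigit(c)) { allDigits = false; break; }
        }
        if (allDigits) return std::atoi(part.c_str());  // leading zeros are dropped
    }
    return 0;
}
```

With an assumed data path such as /alice/data/2010/LHC10e/000128175/ESDs/pass1/AliESDs.root the sketch returns 128175, and it returns 0 for a local path without a run directory.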