
A first look at CitusDB & in-database physics analysis


Presentation Transcript


  1. A first look at CitusDB & in-database physics analysis M. Limper 19/06/2014

  2. Introduction • Physics Analysis is currently file-based • Scanning through large datasets can be cumbersome • The idea: send jobs to the computing grid • In practice: bored of waiting for grid-jobs to finish, scientists filter datasets, throwing away data until it fits on the physicist’s laptop • What if we could provide access to large datasets via a database?

  3. Introduction In-database physics analysis: • SQL goes in, results come out! [Plot: invariant-mass peaks labelled J/ψ and ψ(3686)]
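
For illustration, a minimal sketch of what "SQL goes in, results come out" could look like: an opposite-charge di-muon invariant-mass query of the kind that would populate such a peak plot. The table name muon203779_c, the per-muon columns (mu_E, mu_px, mu_py, mu_pz, mu_charge, ObjectNumber) and the MeV units are assumptions here, by analogy with the branches shown on the following slides:

-- hypothetical per-muon table keyed by (RunNumber, EventNumber, ObjectNumber)
SELECT sqrt( power(m1.mu_E  + m2.mu_E , 2)
           - power(m1.mu_px + m2.mu_px, 2)
           - power(m1.mu_py + m2.mu_py, 2)
           - power(m1.mu_pz + m2.mu_pz, 2) ) / 1000.0 AS m_mumu_gev
FROM muon203779_c m1
JOIN muon203779_c m2
  ON  m1.RunNumber    = m2.RunNumber
  AND m1.EventNumber  = m2.EventNumber
  AND m1.ObjectNumber < m2.ObjectNumber   -- count each muon pair only once
WHERE m1.mu_charge * m2.mu_charge < 0;    -- opposite-charge pairs only

A histogram of m_mumu_gev would then show the resonance peaks directly, with no intermediate files.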

  4. My data Test sample of 127 ntuple-files of collision-data recorded by the ATLAS experiment => subset representing 3 fat ‘LHC runs’, ~0.5% of the total dataset • 7.1 million events total • 6022 “branches” per event • 2053 “scalar”-type branches • 3527 “vector”-type branches • 379 “vector-of-vector”-type branches • 63 “vector-of-vector-of-vector”-type branches • ~200 GB of data

  5. Ntuple branch examples • 2053 scalar-type variables MissingEnergy: one value per branch per event Float_t MET_RefFinal_em_etx; Float_t MET_RefFinal_em_ety; Float_t MET_RefFinal_em_phi; Float_t MET_RefFinal_em_et; Float_t MET_RefFinal_em_sumet; Float_t MET_RefFinal_etx; Float_t MET_RefFinal_ety; Float_t MET_RefFinal_phi; Event Filter: one value per branch per event Bool_t EF_2b55_loose_j145_j55_a4tchad; Bool_t EF_2e12Tvh_loose1; Bool_t EF_2e5_tight1_Jpsi; Bool_t EF_2e7T_loose1_mu6; Bool_t EF_2e7T_medium1_mu6; Bool_t EF_2g15vh_medium_g10_medium; Bool_t EF_2g20vh_medium; Lots of variables, but a relatively small fraction of the total dataset

  6. Ntuple branch examples • 3527 vector-type variables One value per electron per event vector<float> *el_E; vector<float> *el_Et; vector<float> *el_pt; vector<float> *el_m; vector<float> *el_eta; vector<float> *el_phi; vector<float> *el_px; vector<float> *el_py; vector<float> *el_pz; vector<float> *el_charge; vector<int> *el_author; One value per muon per event vector<unsigned short> *mu_allauthor; vector<int> *mu_author; vector<float> *mu_beta; vector<float> *mu_isMuonLikelihood; vector<float> *mu_matchchi2; vector<int> *mu_matchndof; vector<float> *mu_etcone20; vector<float> *mu_etcone30; vector<float> *mu_etcone40; vector<float> *mu_nucone20; vector<float> *mu_nucone30; vector<float> *mu_nucone40; One value per photon per event vector<float> *ph_CaloPointing_eta; vector<float> *ph_CaloPointing_sigma_eta; vector<float> *ph_CaloPointing_zvertex; vector<float> *ph_CaloPointing_sigma_zvertex; vector<float> *ph_HPV_eta; vector<float> *ph_HPV_sigma_eta; vector<float> *ph_HPV_zvertex; vector<float> *ph_HPV_sigma_zvertex; vector<int> *ph_NN_passes; vector<float> *ph_NN_discriminant; Representing the bulk of the data (many particles per event!). Analysis relies heavily on filtering events by selecting particles with certain properties
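
As an illustration of such a filter, a minimal sketch selecting events that contain at least one central, high-pT electron; the table name electron203779_c and the MeV units are assumptions, while el_pt and el_eta are among the branches listed above:

-- events with at least one electron above 25 GeV in the central region
SELECT RunNumber, EventNumber, count(*) AS n_good_electrons
FROM electron203779_c
WHERE el_pt > 25000.0 AND abs(el_eta) < 2.5
GROUP BY RunNumber, EventNumber;

Only the two columns used in the cut need to be scanned, which is what makes column storage attractive for this kind of selection.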

  7. Ntuple branch examples • 379 vector-of-vector type variables One value per ‘SpaceTime’-measurement on each muon per event: vector<vector<int> > *mu_SpaceTime_detID; vector<vector<float> > *mu_SpaceTime_t; vector<vector<float> > *mu_SpaceTime_tError; vector<vector<float> > *mu_SpaceTime_weight; One value per vertex per photon per event: vector<vector<float> > *ph_vx_px; vector<vector<float> > *ph_vx_py; vector<vector<float> > *ph_vx_pz; vector<vector<float> > *ph_vx_E; vector<vector<float> > *ph_vx_m; vector<vector<int> > *ph_vx_nTracks; Used for certain reconstruction-performance studies. To be stored in a CLOB or a separate table (sketched below)…
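
One possible layout for the separate-table option, flattening the photon-vertex branches above into one row per (photon, vertex): the event key plus two index columns. The table name and the index-column names are hypothetical; the ph_vx_* columns are the ones listed above:

-- one row per vertex per photon per event (names hypothetical)
CREATE TABLE photon_vx203779 (
  RunNumber       INTEGER NOT NULL,
  EventNumber     INTEGER NOT NULL,
  ph_number       INT NOT NULL,   -- index of the photon within the event
  vx_number       INT NOT NULL,   -- index of the vertex within the photon
  ph_vx_px        FLOAT,
  ph_vx_py        FLOAT,
  ph_vx_pz        FLOAT,
  "ph_vx_E"       FLOAT,
  ph_vx_m         FLOAT,
  "ph_vx_nTracks" INT
);

A query would then join back to the photon table on (RunNumber, EventNumber, ph_number) only when these branches are actually needed.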

  8. Ntuple branch examples • 63 vector-of-vector-of-vector type variables One value per track per vertex per photon per event: vector<vector<vector<int> > > *ph_vx_convTrk_nSiHits; vector<vector<vector<float> > > *ph_vx_convTrk_TRTHighTHitsRatio; vector<vector<vector<float> > > *ph_vx_convTrk_TRTHighTOutliersRatio; vector<vector<vector<float> > > *ph_vx_convTrk_eProbabilityComb; Not using any of these in my queries; typically used for final corrections or certain in-depth studies of reconstruction performance. To be stored in a CLOB or a separate table…

  9. Converting ntuples to tables • Self-made program to convert ntuples into database tables • One physics-object is represented by one table • Each table still has hundreds of columns!
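
A minimal sketch of such a per-object table for electrons, using a few of the branches from slide 6. The table name and the ObjectNumber index column are assumptions, and only the relational layout is shown (the actual tables are created as column-store foreign tables, see the create-table statement on slide 12):

-- one row per electron per event (table name hypothetical)
CREATE TABLE electron203779_c (
  RunNumber    INTEGER NOT NULL,
  EventNumber  INTEGER NOT NULL,
  ObjectNumber INT NOT NULL,   -- index of the electron within the event
  el_E         FLOAT,
  el_pt        FLOAT,
  el_eta       FLOAT,
  el_phi       FLOAT,
  el_charge    FLOAT,
  el_author    INT
  -- ... plus hundreds of further el_* columns
);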

  10. SQL analysis SQL analysis involves predicate filtering to select good objects and JOINs to put information from different tables together. CitusDB + column-store extension looks interesting: • Object selection involves only a few out of many columns => would benefit from column storage • When preselection passes many objects, JOINs can potentially become huge => would benefit from sharding, with shard-distribution based on EventNumber to reduce JOIN-size (example query below)
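
A minimal sketch of such a query: predicate filters on two object tables plus a JOIN on the event key, which is also the key the shards are distributed on. The table names, cut values and MeV units are assumptions:

-- events that contain both a high-pT electron and an isolated muon
SELECT DISTINCT e.RunNumber, e.EventNumber
FROM electron203779_c e
JOIN muon203779_c     m
  ON  e.RunNumber   = m.RunNumber
  AND e.EventNumber = m.EventNumber   -- join key matches the shard-distribution key
WHERE e.el_pt       > 25000.0         -- electron pT above 25 GeV
  AND m.mu_etcone20 < 4000.0;         -- muon calorimeter isolation below 4 GeV

Only the few columns appearing in the cuts and in the join key have to be read, and distributing both tables' shards by EventNumber is what should keep the join local and small.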

  11. Storing ntuple-data into CitusDB • Re-wrote my program to store data in CitusDB • Read data from all branches with a specific prefix • Write the data as comma-delimited values to a temporary csv-file • After the csv-file passes 5000 lines of data, store the data into CitusDB • The program triggers a command-line call to psql to execute a psql-macro • The psql-macro uses the \STAGE command to load the data

  12. Create table statement
CREATE FOREIGN TABLE eventdata203779_c (
  RunNumber INTEGER NOT NULL, EventNumber INTEGER NOT NULL, lbn INTEGER NOT NULL,
  "bunch_configID" INT, "timestamp" INT, "timestamp_ns" INT, "bcid" INT, "detmask0" INT, "detmask1" INT,
  "actualIntPerXing" FLOAT, "averageIntPerXing" FLOAT,
  "pixelFlags" INT, "sctFlags" INT, "trtFlags" INT, "larFlags" INT, "tileFlags" INT, "fwdFlags" INT, "coreFlags" INT,
  "pixelError" INT, "sctError" INT, "trtError" INT, "larError" INT, "tileError" INT, "fwdError" INT, "coreError" INT,
  "streamDecision_Egamma" BOOLEAN, "streamDecision_Muons" BOOLEAN, "streamDecision_JetTauEtmiss" BOOLEAN,
  "isSimulation" BOOLEAN, "isCalibration" BOOLEAN, "isTestBeam" BOOLEAN,
  "el_n" INT, "v0_n" INT, "ph_n" INT, "mu_n" INT, "tau_n" INT, "trk_n" INT, "jet_n" INT, "vxp_n" INT, "top_hfor_type" INT,
  "Muon_Total_Staco_STVF_etx" FLOAT, "Muon_Total_Staco_STVF_ety" FLOAT, "Muon_Total_Staco_STVF_phi" FLOAT,
  "Muon_Total_Staco_STVF_et" FLOAT, "Muon_Total_Staco_STVF_sumet" FLOAT,
  "Muon_Total_Staco_STVF_top_etx" FLOAT, "Muon_Total_Staco_STVF_top_ety" FLOAT, "Muon_Total_Staco_STVF_top_phi" FLOAT,
  "Muon_Total_Staco_STVF_top_et" FLOAT, "Muon_Total_Staco_STVF_top_sumet" FLOAT,
  "mb_n" INT, "collcand_passCaloTime" BOOLEAN, "collcand_passMBTSTime" BOOLEAN, "collcand_passTrigger" BOOLEAN, "collcand_pass" BOOLEAN
) DISTRIBUTE BY APPEND (EventNumber) SERVER cstore_server OPTIONS(filename '', compression 'pglz');
• Create foreign tables stored using the column-store extension
• Distribute shards by EventNumber
• Keep data from the same event together
• Facilitate joins between different tables
• One table per RunNumber: distributing shards by (RunNumber, EventNumber) is not possible

  13. \STAGE statement
\STAGE eventdata203779_c (RunNumber,EventNumber,lbn,"bunch_configID","timestamp","timestamp_ns","bcid","detmask0","detmask1","actualIntPerXing","averageIntPerXing","pixelFlags","sctFlags","trtFlags","larFlags","tileFlags","fwdFlags","coreFlags","pixelError","sctError","trtError","larError","tileError","fwdError","coreError","streamDecision_Egamma","streamDecision_Muons","streamDecision_JetTauEtmiss","isSimulation","isCalibration","isTestBeam","el_n","v0_n","ph_n","mu_n","tau_n","trk_n","jet_n","vxp_n","top_hfor_type","Muon_Total_Staco_STVF_etx","Muon_Total_Staco_STVF_ety","Muon_Total_Staco_STVF_phi","Muon_Total_Staco_STVF_et","Muon_Total_Staco_STVF_sumet","Muon_Total_Staco_STVF_top_etx","Muon_Total_Staco_STVF_top_ety","Muon_Total_Staco_STVF_top_phi","Muon_Total_Staco_STVF_top_et","Muon_Total_Staco_STVF_top_sumet","mb_n","collcand_passCaloTime","collcand_passMBTSTime","collcand_passTrigger","collcand_pass") FROM '/data1/citus_db/csv/NTUP_TOPEL.00872780.NTUP_TOPEL.00872780._000001.root.1.eventdata.csv' (FORMAT CSV)
• Gives: \copy: ERROR: copy column list is not supported
• I can’t define columns when inserting into foreign tables using \STAGE
• Too bad; I found it useful to specify the columns, as different ntuples can contain different branches: if a column is not specified in the csv, it should insert NULL
• Similarly, I’d like an option to add columns (is this possible? Didn’t look at it yet)
• Instead I’ll use the simple \STAGE command:
\STAGE eventdata203779_c FROM '/data_citusdb/csv/NTUP_TOPEL.00872780.NTUP_TOPEL.00872780._000001.root.1.eventdata.csv' (FORMAT CSV)

  14. Primary Key issues • I’d like to set a primary key on (RunNumber, EventNumber, ObjectNumber) • Ntuple-files occasionally store the same event twice • Due to the way the experiment records data from ‘streams’, there is some overlap between different streams • Sorting out doubles is yet another hassle for physicists to deal with; using a database with a primary key would ensure unique events are stored… but: • Currently, a \STAGE insert of all data in an entire .csv-file fails when it finds 1 double among the 50000 lines • Work-around: no primary-key constraint for now…
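
While the constraint is missing, duplicates could still be suppressed at query time; a minimal sketch using standard-PostgreSQL DISTINCT ON (whether CitusDB's distributed planner accepts this on a sharded foreign table would need to be verified):

-- keep only one copy of each (RunNumber, EventNumber) when reading back
SELECT DISTINCT ON (RunNumber, EventNumber) *
FROM eventdata203779_c
ORDER BY RunNumber, EventNumber;

This is not a real uniqueness guarantee, but it keeps duplicate events out of the results for now.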

  15. Other issues While testing I frequently decide to recreate some tables, but how do I drop a FOREIGN TABLE including its shards?

  16. Storing ntuple-data into CitusDB Example of the insert-program churning through the data…

  17. Query test 15:34 Do I still have time to test something before the call??

  18. To-do • Insert all my data: • I need to find a good way to use my 18 disks per node (mystery errors were coming from my RAID setup) • Maybe I just mount each disk separately and run one worker per disk? • Get some queries going!
