1 / 30

Belle II Data Handling System

May 23-25, 2011 4 th Belle II Computing Workshop Ljubljana, Slovenia. Belle II Data Handling System. Kihyeon Cho High Energy Physics Team KISTI (Korea Institute of Science and Technology Information). Contents. Belle II Data Handling Group Belle II Data Handling Scenario

ham
Download Presentation

Belle II Data Handling System

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. May 23-25, 2011 4th Belle II Computing Workshop Ljubljana, Slovenia Belle II Data Handling System Kihyeon Cho High Energy Physics Team KISTI (Korea Institute of Science and Technology Information)

  2. Contents • Belle II Data Handling Group • Belle II Data Handling Scenario • Embedment metadata system @ grid farm • Standalone metadata registration tool kit • Snapshot vs. query • Products • To do list • Summary

  3. Belle II Data Handling Group • KISTI • HEP Team – JungHyun Kim, Kihyeon Cho, YoungJin Kim, Taegil Bae, … • AMGA Team – Soonwook Hwang, Sunil Ahn, Taesang Huh, Geunchul Park,… • Melbourne • Tom Fifield, Martin Sevior, … • Krokow • Maciej Sitarz, Mitosz Zdybat, Rafat Grzymkowski, … • KEK, Karlsruhe, etc…

  4. Belle II Data Handling Group meeting • First and Third Thursday 5:00 PM (KEK Time) • Latest meeting • - May 12, 5:00 PM • Upcoming meeting • - June 2, 5:00 PM

  5. The 3rd Belle II Computing Workshop • Date: November 22-24 (Mon-Wed.) 2010 • 1st day (Monday): Off-line and HLT software • 2nd day (Tuesday): Distributed Computing and Data Handling • 3rd day (Wednesday) • Morning – Overflow • Afternoon- Excursion • Place: KISTI, Daejeon, Korea • Participants: More than 30 persons from 9 countries • Just after Belle II General Meeting at KEK

  6. News @ Belle II DH group • Manpower • AMGA team - Sunil Ahn, Taesang Huh, Geunchul Park, Soonwook Hwang (P.I)… • Sunil Ahn will be at USC, USA from June for one year. He will work there. • However, Taesang Huh is mainly in charge of AMGA stuffs. => To give a talk about his activities on Wednesday

  7. Meetings • KISTI DH meeting • Date: April 27, 2011 11:00-13:00 • Participants: Kihyeon Cho, Junghyun Kim, Sunil Ahn, Taesang Huh, Soonwook Hwang, etc. • Contents: The status and plan • Korea Belle/Belle II Collaboration Meeting • Date: April 30, 2011 (Saturday) • Place: Hanyang University, Seoul, Korea • Report on Belle II DH Group – Kihyeon Cho

  8. Meetings • Bell II DH meeting • Date: May 12, 2011 17:00-18:00 • Talks • Status of Belle II Data Handling group – Kihyeon Cho • Activities for data handling system – Taesang Huh • The 4th Belle II computing workshop • Date: May 23-25, 2011 • Place: Ljubljana, Slovenia ⇒ Three talks by KISTI • Kihyeon Cho (Monday) – status report • Junghyun Kim (Wednesday) – metadata system • Taesang Huh (Wednesday) – user-created metadata set

  9. Upcoming meetings (cont’d) • Korea Belle/Belle II Collaboration meeting • Date: End of August, 2011 • Place: KISTI, Daejeon, Korea • BAM/BAS • Date: Sep. 26-27, 2011 (BAM) Sep. 28-30, 2011 (BAS) • Place: Lotte Hotel, Jeju Island, Korea • Host: KISTI, Korea U., Yonsei U. SNU, etc.

  10. Belle II Data Handling scenario Raw Data Storage And Processing Tape Raw Data KEK Grid Site Grid Site CPU mDST Data Disk mDST MC Ntuples AMGA Client gbast2 UI Data Tools DIRAC Data Tools Data Tools MC Production (optional) Data Tools MC Production And Ntuple Production Cloud Ntuple Analysis Local Resources Local Resources Local Resources Local Resources Belle II computing model

  11. USER ① KEK Detector ②,③ AMGA system data Tape Metadata Storage Analysis File Location LFC Computing Farm ④ File Replication GRID Site Storage File Location ① Metadata Query ⑥ ② List of Files & Events ③ List of Grid Sites ④Request Job Execution ⑤ ⑤ LFN to PFN Computing Farm LFC ⑥File Read Belle II data handling scenario

  12. Status =>done • Constructed Belle II Data Handling System • Data Handling and Job Management for Belle II Grid => done • Test of Large Scale Data Handling • Belle Data at GSDC farm at KISTI => done • Around 125 TB for Belle data • Belle II Data (Random data) => • Realistic test (file level) => done Based on TDR    Raw data: 100 M files    Real : 4.3 M files     MC:     12.5 M files • Test of High Frequency Test => done • Embedment metadata system @ grid farms

  13. Embedment of metadata system @ grid farms UI Grid Farm Belle VO Belle II VO @ KISTI @ Melbourne @ Beijing @ KISTI @ Melbourne @ Ljubljana (not yet) AMGA Server • Sunil has written documentation of how to submit grid jobs. • We had tested AMGA Sever at HEP team(150.183.246.196). • AMGA Server at Melbourne is in Grid. • => To write draft for JKPS (Journal of the Korean Physical Society)

  14. Scheme of test of metadata system @ grid farms KEK WN Tape Disk Melbourne KISTI UI UI AMGA System AMGA System Disk WN Disk WN Beijing UI

  15. Test Results

  16. Output files

  17. Log files => To write the draft for JKPS (Journal of the Korean Physical Society)

  18. Standalone metadata registration tool kit Martin’s mail (April 14) => This is requested by the distributed computing group Data Now that many Belle skim datasets have been created and distributed around the world, I think it is very important we implement a dataset registration tool for our distributed computing solution. This is defined in redmine feature 196: http://ekpbelle2.physik.uni-karlsruhe.de/redmine/issues/196The feature would place a dataset on a grid enabled storage element, register it with the LFC and place the appropriate metadata associated with the skim in the AMGA meta-database. With this tool we can begin to use our distributed data analysis system to analyse Belle data. This project is particularly vital given the situation with B-computer at KEK. We have access to large amounts of computing power on the grid but without this feature we can't really use it. I thought that you might be interested working together in developing this feature since it involves data handling, AMGA and the python interface to AMGA. It would also give you a chance to become familiar with gbasf2.

  19. Stand-alone metadata registration tool kit • Junghyun and Taesang Huh • C++ modulefor basf2 1st step. To install basf2 @ KISTI • The difference between basf2 and gbasf2? • The concept of basf2 compared to roabasf @Belle • To check log file message and metadata => Working on progress

  20. Stand-alone metadata registration tool kit • Junghyun and Taesang Huh • C++ modulefor basf2 2nd step. C++ Module • Open metadata system • New metadata system with registration • Close metadata system • => To do list

  21. Issues (Snap shot vs. Query) • Methods • Snap shot • Query • Snap-shot like method • Snapshot method • Official data → 1.5TB • User created data • User file (signal) → 70 MB • Subset of official data → up to1.5 TB/user =>huge => Talk by Junghyun on Wednesday

  22. Snap shot vs. Query Metadata A 1. Snap shot B Duplicated Metadata Duplicated Metadata C … 2. Query Metadata A Query Query B C …

  23. 3. Snapshot-like Method Delete Delete OK? yes Metadata A No Keep metadata (Change Permission) Query Query B C … • Snap-shot style ⇒ snap-shot like style for user created data • To store queries which are not duplicated • To keep the metadata for old data

  24. Snap shot vs. Query Can users create official data set? 3. Snapshot-like Method Yes Up to 1.5TB/user ⇒ We will work this if necessary. No Partially allowed? 2. Query Yes ※ We need Belle II policy. • 1.Can users create official data set? • (ex. Grand reprocessing data) • 2. Can users create data set partially? A few GB No 1. Metadata Snapshot Only for official data → 1.5 TB only

  25. Products 2011 Products • International Conference talk • Meta-information system for Belle II experiment • J.H. Kim, YongPyong 2011 Conference, Korea, Feb. 2011 • Drafts for JKPS (Journal of the Korean Physical Society) • The embedment of metadata system at grid farms at the Belle II Experiment • Working on progress • by this month • User created metadata set • This summer

  26. Newsletter(2011.3.9)

  27. To do list To do list • Working on • Stand-alone metadata registration tool kit • Snap-shot like style • To restart event level R&D • Delayed • Data Transfer from HLT at Experiment hall to Computing center at KEK ⇒ Need a network expert • To move master node to KEK for making meta system for Belle II • To extract Metadata from Belle Data at KEK

  28. Summary Summary • In order to handle 50 times more data of Belle, Belle II Data Handling Group works on: • Metadata system and data cache system based on grids • The scalability of Large Scale Data Handling • using Belle and Belle II data • Full service for user friendly • Keep going on

  29. Plan 2014 • To do metadata system on grid ⇒ Full service • To do realization facility for production • To continue supporting and upgrade

  30. Thank you.

More Related