Update on ATLAS storage layout and performance

Peter van Gemmeren, David Malon

WAN data access and caching Meeting

Outline
  • Introduction
  • Release 17 ATLAS storage layout
  • ROOT TTreeCache developments
  • Event server design ideas
  • Outlook

Peter van Gemmeren: Update on ATLAS storage layout and performance

Release 17 ATLAS storage layout
  • In release 17 we introduced a new ROOT storage layout for RDO, ESD and AOD data to better match the transient event store and to support single-event reads more efficiently.

[Figure: old vs. new layout of a POOL/ROOT file filled from StoreGate]

OLD:

  • Separate baskets for each member of a container (m_a1, m_a2, m_b, m_c, m_d)
  • Separate baskets for all members in Container 2…

NEW:

  • member-wise streaming
  • Auto-flush 1/5/10 events
  • One basket holds all members (m_a, m_b, m_c, m_d) of a container for a few events, streamed member-wise
  • Further baskets hold subsequent events (Event 2, Event 3, Event 4 … last-1, Event last)
  • Single baskets for all members in Container 2…
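
The effect of the layout change on single-event reads can be illustrated with a toy cost model. This is only a sketch under stated assumptions: the `Layout` class, the figure of 100 events per basket for the old layout, and the choice of 10 events for the auto-flush setting are illustrative and not taken from the slides; the four members correspond to m_a … m_d in the diagram.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Layout:
    """Toy description of how one container is laid out in the file."""
    name: str
    baskets_per_event_read: int  # baskets that must be fetched to read one event
    events_per_basket: int       # events that are decompressed together

    def single_event_cost(self) -> Tuple[int, int]:
        """Return (baskets fetched, event-equivalents decompressed)."""
        return self.baskets_per_event_read, self.events_per_basket


N_MEMBERS = 4  # m_a, m_b, m_c, m_d

# Old layout: one branch (and basket) per member; baskets span many events.
# The value 100 is a hypothetical basket occupancy, chosen for illustration.
old_layout = Layout("split", baskets_per_event_read=N_MEMBERS, events_per_basket=100)

# New layout: all members streamed member-wise into one basket,
# flushed every few events (10 is one of the 1/5/10 settings).
new_layout = Layout("member-wise", baskets_per_event_read=1, events_per_basket=10)

for layout in (old_layout, new_layout):
    baskets, events = layout.single_event_cost()
    print(f"{layout.name}: {baskets} basket(s), {events} event(s) decompressed")
```

In this model a selective read of one event fetches fewer baskets and decompresses fewer unwanted events in the new layout, which is the motivation given above for supporting single-event reads.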

Release 17 AOD read performance

[Plots: reading all events sequentially; selective reading of 1% of events via TAGs]

Further Performance Results:

  • File sizes approximately unchanged
  • Write speed increased by 50%, due to relaxed compression
  • Memory savings of more than 50 MB each for reading and for writing.

ROOT TTreeCache
  • “TTreeCache shown to be of substantial benefit to direct I/O performance, essential for WAN”
  • ROOT’s TTreeCache optimizes the read performance of a TTree by minimizing the number of disk reads and ordering them:
    • In a ‘learning phase’ it discovers the TBranches that are needed.
    • After that, a read on any TBranch will also read the rest of the TBranches in the cache with a single read.
  • ROOT TTreeCache can have a huge impact on read performance:
    • It can reduce the number of disk reads by several orders of magnitude.
      • Impact depends on system setup and use case.
  • However, there are restrictions in the usability of TTreeCache:
    • Only one automatic TTreeCache per TFile.
    • Slow learning phase.
      • No caching while learning.
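
The behaviour described above (a learning phase without caching, followed by grouped reads of all learned branches) can be mimicked with a small counting model. This is a minimal sketch, not ROOT code: `ToyTreeReader`, its parameters, and the numbers of entries, branches, and entries per basket are all hypothetical, and the toy cache holds only one basket range at a time.

```python
from typing import Dict, Optional, Set


class ToyTreeReader:
    """Counts disk reads for a toy tree, with or without a toy branch cache."""

    def __init__(self, entries_per_basket: int, learn_entries: Optional[int] = None):
        self.entries_per_basket = entries_per_basket
        self.learn_entries = learn_entries      # None disables the cache entirely
        self.learned: Set[int] = set()          # branches seen while learning
        self.loaded: Dict[int, int] = {}        # branch -> basket held in memory
        self.cached_basket: Optional[int] = None
        self.disk_reads = 0

    def read(self, entry: int, branch: int) -> None:
        basket = entry // self.entries_per_basket
        caching = self.learn_entries is not None and entry >= self.learn_entries
        if caching and branch in self.learned:
            # After learning: one grouped read fetches all learned branches.
            if self.cached_basket != basket:
                self.disk_reads += 1
                self.cached_basket = basket
            return
        if self.learn_entries is not None and not caching:
            self.learned.add(branch)            # learning phase: just remember
        # No caching here: one disk read per branch per basket.
        if self.loaded.get(branch) != basket:
            self.disk_reads += 1
            self.loaded[branch] = basket


def count_reads(learn_entries: Optional[int], n_entries: int = 1000,
                n_branches: int = 50, entries_per_basket: int = 10) -> int:
    reader = ToyTreeReader(entries_per_basket, learn_entries)
    for entry in range(n_entries):
        for branch in range(n_branches):
            reader.read(entry, branch)
    return reader.disk_reads


print("without cache:", count_reads(None))
print("with cache:   ", count_reads(10))
```

With these made-up parameters, reads during the learning phase cost as much as uncached reads, and everything afterwards needs one read per basket range instead of one per branch. The size of the gain depends on the numbers chosen, in line with the remark that the impact depends on system setup and use case.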

TTreeCache developments
  • Support multiple TTreeCaches per TFile automatically.
    • ATLAS (and other experiments) use several trees per file.
      • Event data, references, auxiliary T/P converter extensions.
      • Partly due to the ROOT restriction that all baskets need to be in sync.
    • Extends the benefits of read caching to all trees in the file.
  • Investigate and optimize TTreeCache start-up and learning phase.
    • Largely done by ANL summer student.
    • Work resulted in several small improvements to TTreeCache start-up time.
    • Learning phase to be optimized by pre-reading all branches (if possible), rather than individual branch reads.
      • Still under development

[Figure: StoreGate, with one TTreeCache in front of the main event TTree and a second TTreeCache in front of the auxiliary TTree. New feature: multiple TTreeCache per file.]
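
Why one cache per tree matters when a file holds several trees can be shown with a rough count of disk reads. This is a sketch under simplifying assumptions: the learning phase is ignored, an uncached tree costs one read per branch per basket range, a cached tree costs one grouped read per basket range, and the tree names and branch counts are hypothetical.

```python
from typing import Dict, Set


def disk_reads(trees: Dict[str, int], cached_trees: Set[str],
               n_entries: int, entries_per_basket: int) -> int:
    """Count disk reads for reading every entry of every tree.

    trees maps a tree name to the number of branches read per entry;
    cached_trees names the trees that have their own cache.
    """
    n_baskets = -(-n_entries // entries_per_basket)  # ceiling division
    total = 0
    for name, n_branches in trees.items():
        if name in cached_trees:
            total += n_baskets               # one grouped read per basket range
        else:
            total += n_baskets * n_branches  # one read per branch per basket range
    return total


# Hypothetical file content: a main event tree and a smaller auxiliary tree.
trees = {"main_event": 50, "auxiliary": 5}

single_cache = disk_reads(trees, {"main_event"}, n_entries=1000, entries_per_basket=10)
cache_per_tree = disk_reads(trees, {"main_event", "auxiliary"}, n_entries=1000,
                            entries_per_basket=10)
print("one cache per file:", single_cache)
print("one cache per tree:", cache_per_tree)
```

In this model the uncached auxiliary tree dominates the read count once the main tree is cached, so extending caching to all trees removes most of the remaining reads.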

Design idea: Persistent event delivery service
  • Everybody knows that ATLAS uses Transient-Persistent separation and that this has been a great success.
    • Simplifies the data model so ROOT can reliably store it in files.
    • Enables schema changes beyond the capabilities of ROOT (or any storage backend).
    • Significantly improves I/O performance: read and write speed, storage footprint…
  • But the existence of a persistent event data model has even more potential:
    • Exchanging event data between processes is easier using the simpler persistent representation of the objects.
  • However, currently, the T/P conversion layer is tightly intertwined with the Athena-POOL conversion service and needs to be overhauled:
    • Possible approach: Create a separate Gaudi ConversionSvc to trigger T-P conversion (at the end of an event, before the output streams) and store representations in a ‘persistent’ StoreGate (just another instance). Output streams then can write persistent objects from the new store.
      • And other clients could use this feature to send event data to other processes.
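
The proposed flow (convert at the end of the event, record the persistent representations in a separate store, then let output streams or other clients consume them) might look roughly like the following. This is an illustrative sketch only: `Track`, `Track_p1`, `PersistentStore`, and `end_of_event_conversion` are hypothetical names, JSON stands in for whatever wire format would really be used, and none of this is Athena or Gaudi code.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class Track:                     # hypothetical transient class
    pt: float
    eta: float
    phi: float
    hit_ids: List[int]


@dataclass
class Track_p1:                  # hypothetical, simpler persistent representation
    kinematics: List[float]
    hit_ids: List[int]


def trans_to_pers(track: Track) -> Track_p1:
    return Track_p1([track.pt, track.eta, track.phi], list(track.hit_ids))


def pers_to_trans(pers: Track_p1) -> Track:
    pt, eta, phi = pers.kinematics
    return Track(pt, eta, phi, list(pers.hit_ids))


class PersistentStore:
    """Toy stand-in for a second, 'persistent' store instance."""

    def __init__(self) -> None:
        self._objects: Dict[str, Track_p1] = {}

    def record(self, key: str, obj: Track_p1) -> None:
        self._objects[key] = obj

    def retrieve(self, key: str) -> Track_p1:
        return self._objects[key]

    def to_bytes(self) -> bytes:
        # The persistent objects are plain data, so serializing them is simple.
        return json.dumps({k: asdict(v) for k, v in self._objects.items()}).encode("utf-8")

    @classmethod
    def from_bytes(cls, payload: bytes) -> "PersistentStore":
        store = cls()
        for key, fields in json.loads(payload.decode("utf-8")).items():
            store.record(key, Track_p1(**fields))
        return store


def end_of_event_conversion(transient: Dict[str, Track]) -> PersistentStore:
    """Convert every transient object and record its persistent form."""
    store = PersistentStore()
    for key, obj in transient.items():
        store.record(key, trans_to_pers(obj))
    return store


# Producer side: convert once, then hand the bytes to an output stream or a peer.
transient_event = {"LeadingTrack": Track(25.5, -1.2, 0.3, [101, 102, 103])}
payload = end_of_event_conversion(transient_event).to_bytes()

# Consumer side (another process): rebuild the store and the transient object.
received = PersistentStore.from_bytes(payload)
restored = pers_to_trans(received.retrieve("LeadingTrack"))
```

The point of the sketch is that the conversion happens once, independently of the output stream, so writing to a file and sending to another process become two clients of the same persistent store.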

Outlook
  • Event selection reading, multi-core event processing, and WAN data access and caching share common or similar requirements on the I/O framework and persistency:
    • Event data retrieval granularity:
      • Single processes do not need all the events
    • Sending event data between processes
      • E.g.: Event Scatter/Gather for AthenaMP
  • But they may also have conflicts
    • …?
  • Therefore it is most important to be aware of all use cases during the ongoing redesign of our I/O framework and persistency.
    • https://indico.cern.ch/conferenceDisplay.py?confId=171724
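
The shared requirement on retrieval granularity, where each process handles only some of the events and results are collected afterwards, can be sketched as a simple scatter/gather over event numbers. This is an assumed illustration rather than the AthenaMP design: the round-robin assignment, the function names, and the stand-in per-event computation are all hypothetical, and the workers run in-process here for simplicity.

```python
from typing import Callable, List, Sequence, Tuple


def scatter(event_ids: Sequence[int], n_workers: int) -> List[List[int]]:
    """Assign events to workers round-robin; each worker sees only its share."""
    return [list(event_ids[w::n_workers]) for w in range(n_workers)]


def run_worker(event_ids: Sequence[int],
               process: Callable[[int], int]) -> List[Tuple[int, int]]:
    """Process one worker's share and tag each result with its event number."""
    return [(event_id, process(event_id)) for event_id in event_ids]


def gather(per_worker: Sequence[List[Tuple[int, int]]]) -> List[Tuple[int, int]]:
    """Merge the workers' results back into event order."""
    merged = [item for results in per_worker for item in results]
    return sorted(merged, key=lambda item: item[0])


selected_events = list(range(10))
shares = scatter(selected_events, n_workers=3)
results = gather([run_worker(share, lambda event_id: event_id * event_id)
                  for share in shares])
print(shares)
print(results)
```

Because each worker touches only its own events, the I/O layer has to deliver individual events efficiently, which is the same property that selective reading and WAN access rely on.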
