
PhEDEx Monitoring Nicolò Magini CERN IT-ES-VOS For the PhEDEx development team

Computing and Offline Monitoring Workshop, 11/05/2011


Presentation Transcript


  1. PhEDEx Monitoring
     - Nicolò Magini, CERN IT-ES-VOS
     - For the PhEDEx development team
     - Computing and Offline Monitoring Workshop, 11/05/2011

  2. Outline
     - PhEDEx monitoring
     - PhEDEx webpage & datasvc
     - PhEDEx plots
     - PhEDEx shift monitoring
     - PhEDEx latency monitoring
     - PhEDEx agent monitoring
     - PhEDEx storage monitoring
     - NOTE: This will be a quick summary; for more details see the talk at O&C week: https://indico.cern.ch/materialDisplay.py?contribId=9&sessionId=21&materialId=slides&confId=132001

  3. PhEDEx Datasvc
     - PhEDEx Datasvc will be the single service for accessing information in the PhEDEx DB
       - For PhEDEx web
       - For external monitoring tools
       - For your own PhEDEx monitoring tool!
     - https://cmsweb.cern.ch/phedex/datasvc/doc
     - Main areas of work in 2011
       - Performance
       - Validation of writable APIs
       - Consistency of arguments and output
       - Adding new APIs as requested/needed
         - SlowFiles, SlowSubscriptions, DataTypeUsage…
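As a rough illustration of building "your own PhEDEx monitoring tool" on the data service, here is a minimal Python sketch. Only the base address comes from the slide. The `<format>/<instance>/<api>` path layout, the `nodes` and `blockreplicas` API names, the top-level `phedex` envelope and the sample payload are all assumptions made for illustration; check them against the documentation page linked above before relying on them. The sketch builds URLs and parses a canned response, so it runs without network access.

```python
import json
from urllib.parse import urlencode

# Base address taken from the slide (documentation URL minus the trailing /doc).
BASE_URL = "https://cmsweb.cern.ch/phedex/datasvc"


def build_datasvc_url(api, params=None, fmt="json", instance="prod"):
    """Build a data service URL.

    ASSUMPTION: the <format>/<instance>/<api> layout and the default
    values are illustrative; verify them in the datasvc documentation.
    """
    url = f"{BASE_URL}/{fmt}/{instance}/{api}"
    if params:
        url += "?" + urlencode(params, doseq=True)
    return url


def extract_payload(raw_json):
    """Return the content under the top-level 'phedex' key (assumed envelope)."""
    return json.loads(raw_json).get("phedex", {})


# Hypothetical response body, made up for this example.
sample_response = '{"phedex": {"node": [{"name": "T1_Example_Site"}]}}'

nodes_url = build_datasvc_url("nodes")
replicas_url = build_datasvc_url("blockreplicas", {"node": "T1_Example_Site"})
node_names = [n["name"] for n in extract_payload(sample_response).get("node", [])]
```

A real tool would fetch `nodes_url` with an HTTP client and pass the body to `extract_payload`; keeping the fetch separate makes the parsing easy to test offline.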

  4. PhEDEx webpage
     - Existing webpage not maintainable
       - Single file of 10000 lines of Perl code
     - Next-gen prototype not widely used
       - Unfamiliar, missing functionality
     - → Gradually replace pages in the old webpage with next-gen modules using datasvc as backend
       - First example: new request panel
       - Upcoming: subscriptions page
         - Eventually with shopping cart
       - Rest over the course of 2011
     - https://cmsweb.cern.ch/phedex

  5. PhEDEx monitoring plots
     - Porting to Overview/Plotfairy framework
       - Note: Plotfairy backend support is independent of maintenance of the Overview page
       - Working to complete by this summer
     - Other options explored, e.g. protovis

  6. PhEDEx shift monitoring
     - First next-gen monitoring panel for shifters has been available for a few months
       - Others will be added in the coming months
     - Other specialized monitoring panels are already provided in the next-gen prototype, but not widely used

  7. Block latency monitoring
     - Latency monitor schema/agents
       - Debugging/understanding content of current table
       - Will extend schema to record more events, e.g. the 25%/50%/75%/95% block completion marks
       - In progress, should be on Testbed by end of the month
     - Latency visualisation, in summer
       - Datasvc API
       - Latency plots
     - To explore: publish per-file latency stats from FilePump logs
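To make the 25%/50%/75%/95% completion marks concrete, the following Python sketch computes, for one block, the time at which each fraction of its files had arrived. It is only an illustration of the idea: the function name, the input format (a list of per-file arrival times) and the choice of "first moment at which at least that percentage of files is present" are assumptions, not the schema or agent logic used in PhEDEx.

```python
def completion_marks(arrival_times, total_files, marks=(25, 50, 75, 95)):
    """Map each percentage mark to the arrival time at which it was reached.

    arrival_times: times (any sortable unit) of the files received so far.
    total_files:   number of files in the block.
    A mark maps to None if the block has not reached it yet.
    """
    times = sorted(arrival_times)
    result = {}
    for pct in marks:
        # Smallest file count covering pct percent (integer ceiling division).
        needed = max((total_files * pct + 99) // 100, 1)
        result[pct] = times[needed - 1] if needed <= len(times) else None
    return result


# Hypothetical block of 20 files, 18 of which arrived at t = 10, 20, ..., 180.
arrivals = [10 * i for i in range(1, 19)]
marks = completion_marks(arrivals, total_files=20)
```

In this example the 95% mark needs 19 files, so it stays unset until one more file arrives. This is the kind of long tail that per-block latency statistics are meant to expose.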

  8. PhEDEx agent monitoring
     - PHEDEX_4_0_0 includes improvements for site agent health monitoring
     - Information from all site agents is collected by the local watchdog agent
     - Watchdog now produces a daily report on agent activity
       - Agent alerts, agent CPU/mem usage, etc.
       - Report content can be customized with a site-specific plugin
     - Watchdog report can then be sent to site admins by various methods
     - Could also be collected centrally for shifter monitoring, complementing the Agent Status webpage
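The idea of a daily report with site-specific plugins can be sketched as follows. This is not the watchdog's actual interface: the `AgentStatus` record, the plugin signature (a callable that takes the agent list and returns extra report lines), the memory threshold and the agent named `ExampleAgent` are all hypothetical, chosen only to show how a site could add its own section to a common report.

```python
from dataclasses import dataclass, field


@dataclass
class AgentStatus:
    """Hypothetical per-agent record collected by a watchdog."""
    name: str
    cpu_percent: float
    mem_mb: float
    alerts: list = field(default_factory=list)


def default_section(agents):
    """Standard part of the report: one summary line per agent."""
    return [
        f"{a.name}: cpu={a.cpu_percent:.1f}% mem={a.mem_mb:.0f}MB alerts={len(a.alerts)}"
        for a in agents
    ]


def high_memory_plugin(agents, threshold_mb=500):
    """Example site-specific plugin: flag agents above a memory threshold."""
    return [f"WARNING: {a.name} uses {a.mem_mb:.0f}MB" for a in agents if a.mem_mb > threshold_mb]


def build_report(agents, plugins=()):
    """Assemble the report from the default section plus any plugin output."""
    lines = ["Daily agent report"] + default_section(agents)
    for plugin in plugins:
        lines.extend(plugin(agents))
    return "\n".join(lines)


agents = [
    AgentStatus("BlockDownloadVerify", 2.5, 120.0),
    AgentStatus("ExampleAgent", 40.0, 800.0, ["restarted"]),  # made-up agent
]
report = build_report(agents, plugins=[high_memory_plugin])
```

The resulting text could then be mailed to site admins or collected centrally, as the slide suggests. The delivery step is left out here.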

  9. PhEDEx storage monitoring
     - PhEDEx Namespace
       - Framework for efficient interaction with local storage
         - Caching, storage dumps, directories…
       - Currently used by the BlockDownloadVerify agent
       - Could also be used more widely by other scripts or local tools, e.g. FileDownloadVerify scripts
     - Also evaluating use of Namespace to generate space accounting reports of local storage
       - Including storage areas not in PhEDEx, e.g. /store/user
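A space accounting report of the kind mentioned above could, in its simplest form, sum file sizes per storage area from a storage dump. The Python sketch below assumes a dump with one `<path> <size in bytes>` pair per line and groups files by the first two directory levels. Both the dump format and the grouping depth are assumptions for illustration, and the code does not use the Namespace framework itself.

```python
from collections import defaultdict


def usage_by_area(dump_lines, depth=2):
    """Sum file sizes per storage area.

    dump_lines: iterable of '<path> <size_bytes>' strings (assumed format).
    depth:      number of leading directory levels that define an area.
    """
    totals = defaultdict(int)
    for line in dump_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        path, size = line.rsplit(None, 1)
        directories = [p for p in path.split("/") if p][:-1]  # drop the file name
        area = "/" + "/".join(directories[:depth])
        totals[area] += int(size)
    return dict(totals)


# Hypothetical dump content, including an area outside PhEDEx (/store/user).
sample_dump = [
    "# path size",
    "/store/user/alice/file1.root 100",
    "/store/user/bob/file2.root 250",
    "/store/data/Run2011A/file3.root 1000",
]
usage = usage_by_area(sample_dump)
```

Because the input is just a list of lines, the same function works on a dump file opened with `open(...)`, which keeps the accounting independent of how the dump was produced.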

  10. Summary
     - Datasvc: framework for providing information from TMDB
       - Any operator SQL can (should!) become an API
     - Website: take the good from the next-gen prototype; lesson: richer navigation, filtering and presentation
     - Watchdog agent
       - Report summaries and alerts, as desired by the user
     - Namespace framework
       - Generic, lightweight framework for SE interaction, can be used by sites for all sorts of SE tools
