1 / 21

Real-time Data Access Monitoring in Distributed, Multi Petabyte Systems

Real-time Data Access Monitoring in Distributed, Multi Petabyte Systems. Tofigh Azemoon Jacek Becla Andrew Hanushevsky Massimiliano Turri SLAC National Accelerator Laboratory. Soon a typical running HEP experiment will look like this. 100s of users    10s of thousands of batch nodes

christmas
Download Presentation

Real-time Data Access Monitoring in Distributed, Multi Petabyte Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Real-time Data Access Monitoring in Distributed, Multi Petabyte Systems Tofigh Azemoon Jacek Becla Andrew Hanushevsky Massimiliano Turri SLAC National Accelerator Laboratory Tofigh Azemoon

  2. Soon a typical running HEP experiment will look like this 100s of users  10s of thousands of batch nodes  1000s of data servers  10s of millions of files Containing 10s of PB of data + geographically distributed crossing many time zones. Tofigh Azemoon

  3. Mission Statement • Provide real time overall view of system performance • Respond to detailed queries  to identify bottle necks  to optimize the system  to aid in planning system expansion Tofigh Azemoon

  4. The SLAC ¼PB “kan” Cluster Clients Managers bbr-rdr03 bbr-rdr04 Data Servers ( 320 TB) kan059 kan001 kan002 kan003 kan004 Tofigh Azemoon

  5. Monitored Objects user type_2 client type_1 job file session server Tofigh Azemoon

  6. File classes to monitor aggregate values for groups of files BaBar Examples: type_1 ( dataType )  /store/PR/R22/AllEvents/0006/70/22.0.3/AllEvents_00067045_22.0.3V03.02E.root /store/SP/R22/000998/200406/22.0.3/SP_000998_068468.01.root /store/PRskims/R22/22.1.1c/IsrIncExc/79/IsrIncExc_57978.01.root /store/SPskims/R22/22.1.1c/Tau1N/001235/200212/Tau1N_001235_49553.01.root  type_2 (skims) File path  getFileType  ( type_1 value, type_2 value) Tofigh Azemoon

  7. Xrootd Server • Highly scalable server • Posix like access to files • Load balancing • Transparent recovery from server crashes • Fault tolerant • Very low latency Tofigh Azemoon

  8. Minimal impact on client requests Robustness in multimode failure Precision & specificity of collected data Real time scalability Use UDP datagrams Data servers insulated from monitoring. But Packets can get lost Outsource client serialization Low bounded resource usage Use of time buckets Monitoring Implementation in xrootd  Tofigh Azemoon

  9. Start Session • sessionId, user, PId, client, server, start T • Staging • stageId, user, PId,client, file path, stage T, duration, server • Open File • fileId, user, PId, client, server, file path, open T • Close File • fileId, bytes read, bytes written, close T • End Session • sessionId, duration, end T • + Xrootd restart time for each server R T d a t a Tofigh Azemoon

  10. Single Site Monitoring Tofigh Azemoon

  11. Multi-site Monitoring Tofigh Azemoon

  12. Collector/Decoder (C++) MySQL database (5.0) Database Applications (Perl, Perl DBI) Create Load Prepare Upgrade Reload Backup Web application ( JSP3) DB access via JDBC Data servers xrootd enabled Database Server Hosting DB & running DB application Web Server Tomcat For security reasons DB & Web servers on different hosts COMPONENTS Tofigh Azemoon

  13. Configuration File • dbName: xrdmon_kan_v005 • MySQLUser: xrdmon • webUser: reader • MySQLSocket: /tmp/mysql.sock • baseDir: /u1/xrdmon/allSites • ctrPort: 9931 • thisSite: SLAC • fileType: dataType 100 • fileType: skim 500 Tofigh Azemoon

  14. site: 1 SLAC PST8PDT 2005-06-13 00:00:00 • site: 2 RAL WET 2005-08-08 10:14:00 • site: 2 CCIN2P3 CET 2006-10-16 00:00:00 • site: 3 CNAF CET 2006-12-18 00:00:00 • site: 3 GRIDKA CET 2006-10-16 00:00:00 • site: 3 UVIC PST8PDT 2007-05-04 21:00:00 • backupInt: SLAC 1 DAY • backupIntDef: 1 DAY Tofigh Azemoon

  15. fileCloseWaitTime: 10 MINUTE • maxJobIdleTime: 15 MINUTE • maxSessionIdleTime: 12 HOUR • maxConnectTime: 70 DAY • closeFileInt: 15 MINUTE • closeIdleSessionInt: 1 HOUR • closeLongSessionInt: 1 DAY • nTopPerfRows: 20 • yearlyStats: ON • allYearsStats: OFF Tofigh Azemoon

  16. Basic View users unique files jobs all files Tofigh Azemoon

  17. Top Performers Table Tofigh Azemoon

  18. User Information Tofigh Azemoon

  19. Skim Information Tofigh Azemoon

  20. Statistics (BaBar, September 2007) • DB size: > 40 GB • # tables: > 200 • many with 10’s of millions rows • Largest table > 132,000,000 rows • # jobs recorded > 30,000,000 • At peak times • over 100 concurrent users • running 10’s of thousands of jobs Tofigh Azemoon

  21. Future Developments & Expansions • DB and RT Data Backup • File and User Filtering • Staging Monitoring • Never enough disk to hold entire data sample • Disk uses power even when files are not accessed • Fraction of Data Accessed • For each file type • In specific time intervals • … • Multi Experiment Monitoring • Many experiments sharing computing resources Tofigh Azemoon

More Related