
Version 1.0 (meeting edition)

06 November 2008

Rob Kennedy and Adam Lyon

Attending: …

D0 Grid Data Production Initiative: Coordination Mtg 9

D0 Grid Data Production



Outline

  • Summary and News

  • Open Action Items: none to call out

  • Deployment “Feature List”: drives what is critical

    • No change since last week

  • Task Status (4 slides): Most of our time.

  • Deployment 1 Planning: start today



Summary and News

  • Summary

    • Umbrella Packages, Installation Manuals: good

    • Post-Install Tests: not good yet, seems more like “add user” issues than “FWD Node” issues

    • A day or so behind, deployment next week.

  • News:

    • A job did run via FWD4 in the first test!



Open Action Items (Green = effectively done, Yellow = added notes, Blue = coming week)

  • <none to call out>



Current Deployment “Feature” Lists

  • Deployment 1: Split Data/MC Production Services (NO CHANGE)

    • Time frame: November 13-17, with 1 week+ observation before holidays

    • 1. Config: Basic Splitting of Fwd, Que Services between Data and MC Production with 2 Fwd nodes assigned to each, plus 1 Fwd dedicated to all Merging

    • 2. Fwd4 deployed (w/o virtualization)

    • 3. Fwd5 deployed

    • 4. Que2 deployed, with client software to enable parallel use of 2 QUE nodes

    • 5. New SAM Station (moved off of FWD1)

    • 6. Condor 7 via “new” 1.10.1m official release from UWisc

    • 7. FileMax increase on all Fwd nodes to handle large nJob actions

    • 8. D0Runjob Upgrade for Data Production: Prerequisite for deploying new SAM-Grid release

  • Deployment 2: Optimize Data and MC Production Configurations (NO CHANGE)

    • Time frame: December 8-10, with 1 week+ observation before holidays

    • 1. Config: Optimize Configurations separately for Data and MC Production, especially to increase Data Production “queue” length

    • 2. New SAM-Grid Release with support for new Job status value at Queuing node
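The node split in Deployment 1, item 1 can be written down as a small allocation check. The sketch below is illustrative only: the slides give the counts per role (two Fwd nodes each for Data and MC Production, one for Merging) but do not say which node takes which role, so the node-to-role mapping and every name in the code are assumptions.

```python
from collections import Counter

# Hypothetical assignment: the slides give only the counts per role,
# not which Fwd node serves which role.
FWD_ROLES = {
    "fwd1": "data",
    "fwd2": "data",
    "fwd3": "mc",
    "fwd4": "mc",
    "fwd5": "merge",
}

# Counts required by Deployment 1, item 1.
REQUIRED = {"data": 2, "mc": 2, "merge": 1}


def check_allocation(roles, required):
    """Return a list of human-readable problems; empty means the split is valid."""
    counts = Counter(roles.values())
    problems = []
    for role in sorted(set(required) | set(counts)):
        have, want = counts.get(role, 0), required.get(role, 0)
        if have != want:
            problems.append(f"{role}: have {have}, want {want}")
    return problems


problems = check_allocation(FWD_ROLES, REQUIRED)
print(problems or "allocation matches the Deployment 1 split")
```

A check of this kind could be run against whatever assignment the planning meeting settles on, before any node is reconfigured.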



Task Status (1 of 4) (Red = critical tasks, Green = done, Blue = in progress, Yellow = added notes)

  • 1.1.1 Forwarding Node 4 (Fwd4)

    • <Snip some completed tasks>

    • 1.1.1.13 Fwd4: Install with Version-Based FWD Umbrella Product AL JB Thu 10/30/08 Thu 10/30/08 1d

    • 1.1.1.9 Fwd4: Few Jobs FileMax=As-Is Test AL JB Mon 11/3/08 Wed 11/5/08 3d

    • 1.1.1.10 Fwd4: Pre-Deployment FileMax=16k Test AL JS Thu 11/6/08 Mon 11/10/08 3d

    • 1.1.1.11 Milestone: Fwd4 Ready to Deploy AL Mon 11/10/08 Mon 11/10/08 0d

  • 1.1.2 Forwarding Node 5 (Fwd5)

    • <Snip some completed tasks>

    • 1.1.2.10 Fwd5: Install with Version-Based FWD Umbrella Product AL JB Thu 10/30/08 Thu 10/30/08 1d

    • 1.1.2.7 Fwd5: Few Jobs FileMax=As-Is Test AL JB Mon 11/3/08 Wed 11/5/08 3d

    • 1.1.2.8 Fwd5: Pre-Deployment FileMax=16k Test AL JS Thu 11/6/08 Mon 11/10/08 3d

    • 1.1.2.9 Milestone: Fwd5 Ready to Deploy AL Mon 11/10/08 Mon 11/10/08 0d

  • 1.1.8 FWD and QUE Packaging with Version-Based Umbrella Product

    • <Snip some completed tasks>

    • Milestone: FWD Umbrella Product ready to use "GG,AL" Wed 10/29/08 Wed 10/29/08 0d

    • 1.1.8.6 Umbrella Product: Update FWD Installation Procedure AL JB Fri 11/7/08 Mon 11/10/08 2d

  • Change in scheme: Red = ALL critical tasks for deployment 1 completion.

  • Notes…



Task Status (2 of 4) (Red = critical tasks, Green = done, Blue = in progress, Yellow = added notes)

  • 1.1.8 FWD and QUE Packaging with Version-Based Umbrella Product

    • 1.1.8.7 Umbrella Product: Initial QUE Umbrella Product Release GG PM Thu 10/30/08 Thu 10/30/08 1d

    • 1.1.8.8 Umbrella Product: Rework QUE Installation Procedure AL PM Fri 10/31/08 Fri 10/31/08 1d

    • 1.1.8.9 Milestone: QUE Umbrella Products ready to use GG PM Fri 10/31/08 Fri 10/31/08 0d

    • 1.1.8.12 Umbrella Product: Update QUE Umbrella… AL PM Mon 11/3/08 Mon 11/3/08 0.5d

    • 1.1.8.10 Umbrella Product: Update QUE Installation Procedure AL JB Mon 11/10/08 Tue 11/11/08 2d

    • 1.1.8.13 Umbrella Product: FWD, QUE Installation Procedures archive AL REX Wed 11/12/08 Thu 11/13/08 2d

    • 1.1.8.11 Milestone: FWD and QUE Packaging … done "GG,AL" Thu 11/13/08 Thu 11/13/08 0d

  • 1.1.3 Queuing Node 2 (Que2)

    • <Snip some completed tasks>

    • 1.1.3.12 Que2: Install with Version-Based FWD Umbrella Product AL JB Tue 11/4/08 Tue 11/4/08 1d

    • 1.1.3.10 Que2: Jim_Client 2-QUE Support: Client Deployment AL REX Wed 11/5/08 Wed 11/5/08 1d

    • 1.1.3.8 Que2: Regression Test w/1-QUE Client (skipped by ABa) AL JB Thu 11/6/08 Fri 11/7/08 2d

    • 1.1.3.9 Que2: Integration Test w/2-QUE Client AL JB Mon 11/10/08 Mon 11/10/08 1d

    • 1.1.3.11 Milestone: Que2 Ready to Deploy AL Mon 11/10/08 Mon 11/10/08 0d

  • Notes…



Task Status (3 of 4) (Red = critical tasks, Green = done, Blue = in progress, Yellow = added notes)

  • 1.1.5 New Distinct Sam Station

    • <Snip some completed tasks>

    • 1.1.5.4 SAM Station: Install and Setup Station AL RI? Thu 11/6/08 Fri 11/7/08 2d

    • 1.1.5.5 SAM Station: Pre-Deployment Test AL RI? Mon 11/10/08 Mon 11/10/08 1d

    • 1.1.5.6 SAM Station: Deployment Plan (Deactivate old/Activate new) AL AL Tue 11/11/08 Tue 11/11/08 1d

    • 1.1.5.7 Milestone: SAM Station Ready to Deploy AL Tue 11/11/08 Tue 11/11/08 0d

    • 1.1.5.8 SAM Station: Setup Context Server AL AL Thu 11/13/08 Fri 11/14/08 2d

  • Not done; the original resource was busy. This is now late and at risk.

  • Notes…

  • 1.1.6 Deployment Stage 1

    • 1.1.6.1 Dep 1: Plan: Split Data/MC Prod Services AL ALL Mon 11/10/08 Wed 11/12/08 3d

    • 1.1.6.2 Deployment 1: Execute AL REX Thu 11/13/08 Mon 11/17/08 3d

    • 1.1.6.3 Deployment 1: Monitor AL REX Tue 11/18/08 Mon 11/24/08 5d

    • 1.1.6.4 Deployment 1: Sign-off AL REX Tue 11/25/08 Tue 11/25/08 1d

    • 1.1.6.5 MILE 1: Deployment 1 Completed AL Tue 11/25/08 Tue 11/25/08 0d

  • Bootstrap this today to work ahead: rough work list and known order/priorities

  • Meeting on Monday (RDK to arrange, I propose 9-10:30am) to work out the details

  • 17 November 2008 is the drop-dead date for deployment: what is deployed by then is what we run for one week before sign-off.
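The durations in the schedule rows appear to count working days rather than calendar days (for example, Thu 11/13/08 to Mon 11/17/08 is listed as 3d). The sketch below checks that reading against the Deployment Stage 1 rows. It assumes a Monday to Friday working week with no holidays, and the function name `working_days` is made up for illustration.

```python
from datetime import date, timedelta


def working_days(start, finish):
    """Count Monday-Friday days from start to finish, inclusive (no holidays)."""
    days = 0
    current = start
    while current <= finish:
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            days += 1
        current += timedelta(days=1)
    return days


# (task, start, finish, duration in days as listed on the slide)
ROWS = [
    ("1.1.6.1 Plan", date(2008, 11, 10), date(2008, 11, 12), 3),
    ("1.1.6.2 Execute", date(2008, 11, 13), date(2008, 11, 17), 3),
    ("1.1.6.3 Monitor", date(2008, 11, 18), date(2008, 11, 24), 5),
    ("1.1.6.4 Sign-off", date(2008, 11, 25), date(2008, 11, 25), 1),
]

for task, start, finish, listed in ROWS:
    computed = working_days(start, finish)
    status = "ok" if computed == listed else "MISMATCH"
    print(f"{task}: listed {listed}d, computed {computed}d ({status})")
```

Under that assumption all four rows agree with the listed durations, which fits the one-week observation window between the 17 November deployment date and the 25 November sign-off.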



Task Status (4 of 4) (Red = critical tasks, Green = done, Blue = in progress, Yellow = added notes)

  • 1.3.1 SAM-Grid Job Status Info

    • <snip some tasks>

    • 1.3.1.4 Upgrade D0Runjob version used by Data Production AL "MD,AL" Thu 10/23/08 Fri 10/24/08 2d

  • 1.3.2 Slow Fwd-CAB Job Transition

    • Note: FileMax change requires a schedd restart (ST). Work into deployment plans.

  • 1.3.3 Improved H/w Uptime

  • 1.4 Metrics

    • nSubmissions plot for Sep ’08 Mike?

    • Ganglia-based D0Farm plot from Keith

    • Notes…



Deployment 1 Work

  • Rough Work List

    • Verify FWD4-5, QUE2 installed; FWD4-5 FileMax increased

    • FWD1-3 install/upgrade via umbrella package; Increase FileMax

    • QUE1 install/upgrade via umbrella package

    • Deactivate SAM station on FWD1

    • Activate new SAM station

    • Configure FWD1-5

    • Configure QUE1-2

    • Test system: Data Production, MC Production, Reco/MC Merge Jobs

    • Work on Client Side: adapt to use new jim client package

    • Post-Deployment Work: move context server?

