
D0 Grid Data Production Initiative: Coordination Mtg 9



Version 1.0 (meeting edition)

06 November 2008

Rob Kennedy and Adam Lyon

Attending: …

D0 Grid Data Production Initiative: Coordination Mtg 9

D0 Grid Data Production


Outline

  • Summary and News

  • Open Action Items: none to call out

  • Deployment “Feature List”: drives what is critical

    • No change since last week

  • Task Status (4 slides): Most of our time.

  • Deployment 1 Planning: start today


Summary and News

  • Summary

    • Umbrella Packages, Installation Manuals: good

    • Post-Install Tests: not good yet, seems more like “add user” issues than “FWD Node” issues

    • A day or so behind, deployment next week.

  • News:

    • A job did run via FWD4 in the first test!


Open Action Items (Green = effectively done, Yellow = added notes, Blue = coming week)

  • <none to call out>


Current Deployment “Feature” Lists

  • Deployment 1: Split Data/MC Production Services (NO CHANGE)

    • Time frame: November 13-17, with 1 week+ observation before holidays

    • 1. Config: Basic Splitting of Fwd, Que Services between Data and MC Production with 2 Fwd nodes assigned to each, plus 1 Fwd dedicated to all Merging

    • 2. Fwd4 deployed (w/o virtualization)

    • 3. Fwd5 deployed

    • 4. Que2 deployed, with client software to enable parallel use of 2 QUE nodes

    • 5. New SAM Station (moved off of FWD1)

    • 6. Condor 7 via “new” 1.10.1m official release from UWisc

    • 7. FileMax increase on all Fwd nodes to handle large nJob actions

    • 8. D0Runjob Upgrade for Data Production: Prerequisite for deploying new SAM-Grid release

  • Deployment 2: Optimize Data and MC Production Configurations (NO CHANGE)

    • Time frame: December 8-10, with 1 week+ observation before holidays

    • 1. Config: Optimize Configurations separately for Data and MC Production, especially to increase Data Production “queue” length

    • 2. New SAM-Grid Release with support for new Job status value at Queuing node
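
The Deployment 1 split in item 1 (2 Fwd nodes each for Data and MC Production, plus 1 Fwd for Merging) can be expressed as a small consistency check. This is only a sketch: the role names, the `check_fwd_split` helper, and the example node-to-role mapping are hypothetical, since the slides do not say which node takes which role.

```python
from collections import Counter

# Role counts as stated on the slide: 2 Fwd nodes for Data, 2 for MC, 1 for Merging.
REQUIRED_FWD_ROLES = {"data": 2, "mc": 2, "merge": 1}


def check_fwd_split(assignment):
    """Return a list of problems; an empty list means the split matches."""
    counts = Counter(assignment.values())
    problems = []
    for role, needed in REQUIRED_FWD_ROLES.items():
        found = counts.get(role, 0)
        if found != needed:
            problems.append(f"{role}: expected {needed} node(s), found {found}")
    for role in counts:
        if role not in REQUIRED_FWD_ROLES:
            problems.append(f"unknown role: {role}")
    return problems


# Illustrative assignment only (assumption, not taken from the slides).
example = {"fwd1": "data", "fwd2": "data", "fwd3": "mc", "fwd4": "mc", "fwd5": "merge"}
print(check_fwd_split(example))
```

A check like this could be run against the planned configuration before the deployment window, so that a miscounted role shows up as a listed problem rather than during the observation week.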


Task Status (1 of 4) (Red = critical tasks, Green = done, Blue = in progress, Yellow = added notes)

  • 1.1.1 Forwarding Node 4 (Fwd4)

    • <Snip some completed tasks>

    • 1.1.1.13 Fwd4: Install with Version-Based FWD Umbrella Product | AL JB | Thu 10/30/08 | Thu 10/30/08 | 1d

    • 1.1.1.9 Fwd4: Few Jobs FileMax=As-Is Test | AL JB | Mon 11/3/08 | Wed 11/5/08 | 3d

    • 1.1.1.10 Fwd4: Pre-Deployment FileMax=16k Test | AL JS | Thu 11/6/08 | Mon 11/10/08 | 3d

    • 1.1.1.11 Milestone: Fwd4 Ready to Deploy | AL | Mon 11/10/08 | Mon 11/10/08 | 0d

  • 1.1.2 Forwarding Node 5 (Fwd5)

    • <Snip some completed tasks>

    • 1.1.2.10 Fwd5: Install with Version-Based FWD Umbrella Product | AL JB | Thu 10/30/08 | Thu 10/30/08 | 1d

    • 1.1.2.7 Fwd5: Few Jobs FileMax=As-Is Test | AL JB | Mon 11/3/08 | Wed 11/5/08 | 3d

    • 1.1.2.8 Fwd5: Pre-Deployment FileMax=16k Test | AL JS | Thu 11/6/08 | Mon 11/10/08 | 3d

    • 1.1.2.9 Milestone: Fwd5 Ready to Deploy | AL | Mon 11/10/08 | Mon 11/10/08 | 0d

  • 1.1.8 FWD and QUE Packaging with Version-Based Umbrella Product

    • <Snip some completed tasks>

    • Milestone: FWD Umbrella Product ready to use | GG,AL | Wed 10/29/08 | Wed 10/29/08 | 0d

    • 1.1.8.6 Umbrella Product: Update FWD Installation Procedure | AL JB | Fri 11/7/08 | Mon 11/10/08 | 2d

  • Change in scheme: Red = ALL critical tasks for deployment 1 completion.

  • Notes…


Task Status (2 of 4) (Red = critical tasks, Green = done, Blue = in progress, Yellow = added notes)

  • 1.1.8 FWD and QUE Packaging with Version-Based Umbrella Product

    • 1.1.8.7 Umbrella Product: Initial QUE Umbrella Product Release | GG PM | Thu 10/30/08 | Thu 10/30/08 | 1d

    • 1.1.8.8 Umbrella Product: Rework QUE Installation Procedure | AL PM | Fri 10/31/08 | Fri 10/31/08 | 1d

    • 1.1.8.9 Milestone: QUE Umbrella Products ready to use | GG PM | Fri 10/31/08 | Fri 10/31/08 | 0d

    • 1.1.8.12 Umbrella Product: Update QUE Umbrella… | AL PM | Mon 11/3/08 | Mon 11/3/08 | 0.5d

    • 1.1.8.10 Umbrella Product: Update QUE Installation Procedure | AL JB | Mon 11/10/08 | Tue 11/11/08 | 2d

    • 1.1.8.13 Umbrella Product: FWD, QUE Installation Procedures archive | AL REX | Wed 11/12/08 | Thu 11/13/08 | 2d

    • 1.1.8.11 Milestone: FWD and QUE Packaging … done | GG,AL | Thu 11/13/08 | Thu 11/13/08 | 0d

  • 1.1.3 Queuing Node 2 (Que2)

    • <Snip some completed tasks>

    • 1.1.3.12 Que2: Install with Version-Based FWD Umbrella Product | AL JB | Tue 11/4/08 | Tue 11/4/08 | 1d

    • 1.1.3.10 Que2: Jim_Client 2-QUE Support: Client Deployment | AL REX | Wed 11/5/08 | Wed 11/5/08 | 1d

    • 1.1.3.8 Que2: Regression Test w/1-QUE Client (skipped by ABa) | AL JB | Thu 11/6/08 | Fri 11/7/08 | 2d

    • 1.1.3.9 Que2: Integration Test w/2-QUE Client | AL JB | Mon 11/10/08 | Mon 11/10/08 | 1d

    • 1.1.3.11 Milestone: Que2 Ready to Deploy | AL | Mon 11/10/08 | Mon 11/10/08 | 0d

  • Notes…


Task Status (3 of 4) (Red = critical tasks, Green = done, Blue = in progress, Yellow = added notes)

  • 1.1.5 New Distinct Sam Station

    • <Snip some completed tasks>

    • 1.1.5.4 SAM Station: Install and Setup Station | AL RI? | Thu 11/6/08 | Fri 11/7/08 | 2d

    • 1.1.5.5 SAM Station: Pre-Deployment Test | AL RI? | Mon 11/10/08 | Mon 11/10/08 | 1d

    • 1.1.5.6 SAM Station: Deployment Plan (Deactivate old/Activate new) | AL AL | Tue 11/11/08 | Tue 11/11/08 | 1d

    • 1.1.5.7 Milestone: SAM Station Ready to Deploy | AL | Tue 11/11/08 | Tue 11/11/08 | 0d

    • 1.1.5.8 SAM Station: Setup Context Server | AL AL | Thu 11/13/08 | Fri 11/14/08 | 2d

  • Not done, original resource busy. Now, this is late and at risk

  • Notes…

  • 1.1.6 Deployment Stage 1

    • 1.1.6.1 Dep 1: Plan: Split Data/MC Prod Services | AL ALL | Mon 11/10/08 | Wed 11/12/08 | 3d

    • 1.1.6.2 Deployment 1: Execute | AL REX | Thu 11/13/08 | Mon 11/17/08 | 3d

    • 1.1.6.3 Deployment 1: Monitor | AL REX | Tue 11/18/08 | Mon 11/24/08 | 5d

    • 1.1.6.4 Deployment 1: Sign-off | AL REX | Tue 11/25/08 | Tue 11/25/08 | 1d

    • 1.1.6.5 MILE 1: Deployment 1 Completed | AL | Tue 11/25/08 | Tue 11/25/08 | 0d

  • Bootstrap this today to work ahead: rough work list and known order/priorities

  • Meeting on Monday (RDK to arrange, I propose 9-10:30am) to work out the details

  • 17 November 2008 is the drop-dead date for deployment; whatever is deployed by then is what we run for one week before sign-off.
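
The durations listed for Deployment Stage 1 (3d, 5d, 1d) are consistent with counting working days between the start and finish dates. A minimal sketch of that arithmetic follows; the `working_days` helper is illustrative, and it assumes a Monday to Friday work week with no holidays, which the slides do not state explicitly.

```python
from datetime import date, timedelta


def working_days(start, finish):
    """Count Monday-Friday days from start to finish, inclusive."""
    days = 0
    current = start
    while current <= finish:
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            days += 1
        current += timedelta(days=1)
    return days


# Start and finish dates copied from tasks 1.1.6.2 to 1.1.6.4.
schedule = {
    "Execute": (date(2008, 11, 13), date(2008, 11, 17)),
    "Monitor": (date(2008, 11, 18), date(2008, 11, 24)),
    "Sign-off": (date(2008, 11, 25), date(2008, 11, 25)),
}
for task, (start, finish) in schedule.items():
    print(task, working_days(start, finish))
```

Under these assumptions, the Execute window from Thursday 13 November to Monday 17 November contains three working days, so a slip of even one day pushes against the 17 November deadline.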


Task Status (4 of 4) (Red = critical tasks, Green = done, Blue = in progress, Yellow = added notes)

  • 1.3.1 SAM-Grid Job Status Info

    • <snip some tasks>

    • 1.3.1.4 Upgrade D0Runjob version used by Data Production | AL MD,AL | Thu 10/23/08 | Fri 10/24/08 | 2d

  • 1.3.2 Slow Fwd-CAB Job Transition

    • Note: FileMax change requires a schedd restart (ST). Work into deployment plans.

  • 1.3.3 Improved H/w Uptime

  • 1.4 Metrics

    • nSubmissions plot for Sep ’08 Mike?

    • Ganglia-based D0Farm plot from Keith

    • Notes…
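
The FileMax note under 1.3.2 suggests a pre-deployment check that each Fwd node already has the raised limit. The sketch below assumes that "FileMax" refers to the Linux system-wide open-file limit exposed at `/proc/sys/fs/file-max` and that "16k" means 16384; neither is confirmed by the slides. The helper names are hypothetical, and the schedd restart mentioned in the note is not covered here.

```python
from pathlib import Path

# Assumption: "FileMax=16k" on the slides means 16 * 1024 open files.
TARGET_FILE_MAX = 16 * 1024


def read_file_max(path="/proc/sys/fs/file-max"):
    """Return the integer limit stored in the given file."""
    return int(Path(path).read_text().strip())


def meets_target(current, target=TARGET_FILE_MAX):
    """True if the current limit is at least the target."""
    return current >= target


# Only attempt the real check where the Linux proc file is present.
proc_path = Path("/proc/sys/fs/file-max")
if proc_path.exists():
    current = read_file_max(proc_path)
    print(f"file-max = {current}, meets target: {meets_target(current)}")
```

Running a check of this kind on every Fwd node before the deployment window would show which nodes still need the change, and therefore which ones need a restart scheduled into the plan.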


Deployment 1 Work

  • Rough Work List

    • Verify FWD4-5, QUE2 installed; FWD4-5 FileMax increased

    • FWD1-3 install/upgrade via umbrella package; Increase FileMax

    • QUE1 install/upgrade via umbrella package

    • Deactivate SAM station on FWD1

    • Activate new SAM station

    • Configure FWD1-5

    • Configure QUE1-2

    • Test system: Data Production, MC Production, Reco/MC Merge Jobs

    • Work on Client Side: adapt to use new jim client package

    • Post-Deployment Work: move context server?

