
Network Move & Upgrade 2008/2009: October 2008

Les Cottrell (SLAC), for the SCCS core services network group (Antonio Ceseracciu, Jared Greeno, Yee Ting Li, Gary Buhrmaster). Presented at the OU Admin Group Meeting, October 16, 2008. www.slac.stanford.edu/grp/scs/net/racks/netmove-oct08.ppt


Presentation Transcript


  1. Network Move & Upgrade 2008/2009: October 2008
     • Les Cottrell (SLAC), for the SCCS core services network group (Antonio Ceseracciu, Jared Greeno, Yee Ting Li, Gary Buhrmaster)
     • Presented at the OU Admin Group Meeting, October 16, 2008
     • www.slac.stanford.edu/grp/scs/net/racks/netmove-oct08.ppt

  2. Why move
     • ~70 building switches are connected to the old core switch, which has to "move" to the seismically retrofitted area
     • While at it, replace old, beyond-end-of-life, limited-capability switches to provide better service

  3. Move Types
     • Already done: Kavli, MCC, LCLS, SSRL (70 switches, 17 need replacing, will probably need to re-address later, SSRL decision)
     • Migrate: switch beyond end of life, features missing (auto-negotiation, higher speeds) = replace switch, connect to new core, re-address hosts
       • (CGB1), TL1, (WHS), CLA1, CLA2, 280, CL1..2, B267, CGB3
     • Move-1: switch OK = use same switch but connect to temporary core switch, re-address later (after April 15th 2009)
       • B214, B031, B210, B005, B275, B279, CLR113, CLR224, CLR343, HFB1, HFB2, MCC-CORE1..2, MCC-WAPCORE1..2, ROB, Research Yard: SWH-RY, B062, B104A, B113, B121, B124, B128, B211, B225, B231, B420
     • Move-2: switch beyond end of life etc., but not central responsibility to upgrade = connect to temporary core switch
       • Guest House has 2, PEP ring has 4 but ring is de-commissioned at the moment
     • Move-3: switch shares a trunk cable; requires a long (2-day) workday outage, or overtime (cost depends on which cables have to move etc.; estimated cost probably $5K for 2 technicians for 2 days)
       • Guest House 1 & 2, ESA, CRYO, IR12, CGB2 (1 day)
     • Will send an email to OU Admins with a heads-up so they can contact and warn users, get an account if non-working hours are needed, and schedule

  4. Long Outage Switches
     • Contact users and group leaders to see if they can take the outage in normal work hours or get an account for overtime (could be $5K); schedule outage
     • ESA (21 hosts): Tyler Adams (11), Nicholas Arias (2), Rafael Gomez (5), Zen Szalata (3)
     • Cryo (7 hosts): Agustin Burgos (5), Tom Galeto (2)
     • IR12 (4 hosts): Tala Cadorna (1), Raymond Lo (3)
     • www.slac.stanford.edu/grp/scs/net/racks/slaconly/switches/ gives details of hosts on switches

  5. Experience with Moves
     • Moves are easy:
       • Each building switch has two (for redundancy) fibre pairs to two old core routers on B050 floor 2
       • Prepare a port in each of 2 temporary (probably ~1 year) switches in the seismically retrofitted area
       • Identify pairs and prepare jumpers
       • Move backup pair to backup temporary switch
       • Move primary pair to primary temporary switch
     • Two ~5 second outages, users unlikely to notice
     • No need for detailed coordination with OU admins or users; can do whenever we get to it
     • Could publish a schedule to all OU admins in future, but that requires more effort and scheduling; easier to notify when done, or 5 minutes before we do it

  6. Migrations
     • Require re-addressing & close coordination
     • Identify admins (can be many), switch ports etc.; create a web page documenting what has to be done and the addresses; set up tracking tickets etc.
     • Email admins, requesting them to validate CANDO info and read the web page:
       • Three types of hosts: printers, SLAC only, open access to world
     • Meet with admins, explain, schedule time
     • Install replacement switch when appropriate, configure
     • With each admin, a network tech and a network engineer move cables one by one from old switch to replacement, re-address host, check things work etc.
     • During or shortly after migration, network engineer will update CANDO with the new IP address
     • To date, have been migrating all of one OU Admin's machines at a time
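The batching described on this slide (one OU Admin's machines at a time, cables moved one by one) can be sketched roughly as follows. This is only an illustration, not the group's actual tooling: the `Host` record, the host names, admin names, and port labels are all hypothetical, and the three host types are taken from the slide.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class HostType(Enum):
    # The three host types named on the slide.
    PRINTER = "printer"
    SLAC_ONLY = "SLAC only"
    OPEN = "open access to world"


@dataclass
class Host:
    name: str           # hypothetical host name
    admin: str          # hypothetical OU admin responsible for the host
    host_type: HostType
    switch_port: str    # hypothetical port label on the old switch


def plan_by_admin(hosts):
    """Group hosts by admin so each admin's machines are migrated together."""
    batches = defaultdict(list)
    for host in hosts:
        batches[host.admin].append(host)
    # Sort admins and ports so the cable-by-cable order is predictable.
    return {
        admin: sorted(batch, key=lambda h: h.switch_port)
        for admin, batch in sorted(batches.items())
    }


# Made-up inventory, for illustration only.
hosts = [
    Host("printer-a", "admin1", HostType.PRINTER, "p03"),
    Host("desktop-b", "admin2", HostType.SLAC_ONLY, "p01"),
    Host("webserver-c", "admin1", HostType.OPEN, "p02"),
]

plan = plan_by_admin(hosts)
for admin, batch in plan.items():
    print(admin, [(h.switch_port, h.name, h.host_type.value) for h in batch])
```

Grouping by admin first keeps the number of people who must be present for each session small; the per-port sort is just one possible ordering.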

  7. Migration Experience
     • Two switches almost done (CGB1, TL1), elapsed > 1 week
     • Difficult, labor intensive, requires lots of coordination and availability, impacts users
     • Problems with devices not being in the documented place, patch panel labeling being wrong, patch cables not being long enough
     • Be wary of old, non-standard devices
     • Devices that have been turned off do not show up on our spreadsheets
     • Takes time to get print queues changed on Windows, but can be requested in advance
     • Will be setting a hard deadline depending on # devices etc.

  8. Lessons learned
     • New networks require a different subnet mask and default gateway; make sure this is clear.
     • Make sure all devices have an IP assigned in advance to reduce confusion.
     • Confirm in advance which devices should be SLAC Only (IFZ) vs Public.
     • When replacing the switch, it can take up to 15 minutes per device (walk to machine, log in, change IP, request cable change, test), so be prepared and patient.
     • Use ipconfig /registerdns on Windows computers to make sure Windows DNS gets updated, then test and inform windows-admin if the IP is still wrong.
     • Still working on developing automation to change Windows system IPs.
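The first and fourth lessons lend themselves to a quick pre-migration check. The sketch below uses Python's standard `ipaddress` module to confirm that a device's planned IP, mask, and gateway agree with the new subnet, and it turns the "up to 15 minutes per device" figure into a worst-case time budget. All addresses are made up (taken from the 192.0.2.0/24 documentation range), and `check_device` and `worst_case_minutes` are hypothetical helper names, not part of any SLAC tool.

```python
import ipaddress

MINUTES_PER_DEVICE = 15  # upper bound quoted in the lessons learned
NEW_SUBNET = "192.0.2.0/25"  # made-up example subnet, not a real allocation


def check_device(ip, netmask, gateway, new_subnet):
    """Return a list of problems with one device's planned settings."""
    subnet = ipaddress.ip_network(new_subnet)
    problems = []
    if ipaddress.ip_address(ip) not in subnet:
        problems.append(f"IP {ip} is not in {subnet}")
    if ipaddress.ip_address(netmask) != subnet.netmask:
        problems.append(f"netmask {netmask} should be {subnet.netmask}")
    if ipaddress.ip_address(gateway) not in subnet:
        problems.append(f"gateway {gateway} is not in {subnet}")
    return problems


def worst_case_minutes(device_count):
    """Worst-case hands-on time if every device takes the full 15 minutes."""
    return device_count * MINUTES_PER_DEVICE


# A device still carrying its old mask and gateway is flagged on both counts.
for problem in check_device("192.0.2.10", "255.255.255.0", "192.0.2.254", NEW_SUBNET):
    print(problem)

print(worst_case_minutes(12), "minutes for 12 devices")
```

Running a check like this against the planned address list before the visit is one way to catch a stale mask or gateway without having to walk back to the machine.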

  9. Progress
     • Temporary switches
     • CORE3OLD in seismically retrofitted area (Sep 08): need reconfig and connect up & CORE4OLD
     • CORE4OLD in seismically retrofitted area (Oct 08): CORE4OLD in place too

  10. Documentation
     • See "Seismic Retrofitting Rack Move 2008" site
       • https://confluence.slac.stanford.edu/display/NetMan/Seismic+Retrofitting+Rack+Move+2008
     • Contains background information, overview of procedures, milestones, and drill-down to lots more detail (tickets, spreadsheets, subnet allocations, hosts on individual switches etc.)
     • This is where to go to get detailed information. It is very dynamic.
     • If you need more, let us know and we will add as appropriate
       • Email to core-neteng
     • There is an FAQ at https://confluence.slac.stanford.edu/display/NetMan/Frequently+Asked+Questions

  11. New Area
     • New area circa Aug 21 '08
     • New area circa Oct 15 '08
     • New area circa Sep 18 '08

  12. Central Routers
     • CORE3OLD in seismically retrofitted area (Sep 08): need reconfig and connect up & CORE4OLD
     • SWH-CORE1&2-OLD in old racks
