
Implementing and Maintaining an ISP Backbone



  1. Implementing and Maintaining an ISP Backbone Kevin Butler

  2. Tier 1 ISP Backbones • Comprise some of the world’s largest IP networks • Tier 1 companies include Sprint, AT&T, PSINet • UUNET has the world’s largest IP data network (by number of POPs), presence on five continents (North and South America, Europe, Asia, Australia)

  3. Service Level Agreements • SLAs are an important and prestigious tool for attracting and retaining customers • consist of uptime guarantees and bounds on latency through various geographic regions • most ISPs currently guarantee a monthly average latency of < 65 ms between regional hubs in the US

  4. Current SLA latency times • Looking at the North American Backbone over past 24 hours (ICMP tests) • UUNET: 64.9 ms • SprintLink: 69.3 ms • AT&T: 68.7 ms • Cable & Wireless: 60.8 ms • PSINet: 80 ms source: http://ratings.miq.net

  5. Supporting the Customer • Quality and expertise of first-line customer support varies wildly between companies • depending on size, geographic location and company focus, some front-line support teams outsourced to third parties • some in-house high level support teams have skills equivalent or superior to NOCs

  6. Network Operations Centres • Generally the teams concerned with backbone maintenance and support • trend towards consolidation into “Super-NOCs” (eg. one for Americas, one for Europe) • specialisation within NOC for product support (eg. dial, VPN, backbone NOCs)

  7. NOC Tools • NOCOL - Network Operations Centre On Line (freeware UNIX) • Mediahouse monitoring (mainly web) • Micromuse Netcool - used by WorldCom, PSINet, BT

  8. Some Circuit Terminology • DS-1 = 1.544 Mbps, refers to “digital signal”, the actual physical layer component • Often used interchangeably with “T1”, referring to the carrier on the line • DS-3 (T3) = 44.736 Mbps or 28 DS-1s • PRI: “primary rate interface”, equivalent to a DS-1 • BRI: basic rate interface, made up of 2 B (bearer) channels and 1 D channel: a B channel is 56/64 kbps (depending on switching limitations), 23 B + 1 64 kbps D channel make a PRI (each B channel is a DS-0 circuit) • Note: 24 DS-0 = 1.536 Mbps – the remaining 8 kbps comes from a synchronising framing bit sent after one byte from each of the 24 channels (bit 193 of each frame, at 8,000 frames per second)

  9. Optical Carrier • OC-x rates based on multiplexing SONET streams • SONET – synchronous optical network: defines a standard optical TDM system with common standards and compatibility across continents (devised at Bellcore) – Europe uses SDH, which is very similar to SONET • OC-3 = 155.52 Mbps, commonly goes up in multiples of four in North America and Europe (OC-12 = 622 Mbps, OC-48 ~ 2.5 Gbps, OC-192 ~ 10 Gbps)

  10. Dial Access • Dial is a major selling point, especially with customers who travel a lot or are their own ISPs • connections made through a dial concentrating unit eg. Ascend (Lucent) MAX TNT, which can support up to 720 concurrent callers • back-end is a DS-3 into a backbone router, routers advertised by an IGP (eg. RIP)
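A minimal Cisco IOS-style sketch of advertising the dial pool into an IGP such as RIP on the backbone router (the 172.16.20.0/24 pool and interface name are hypothetical, not from the original deck):
     interface Serial1/0
      description DS-3 from the dial concentrator
      ip address 172.16.20.1 255.255.255.0
     !
     router rip
      version 2
      ! enables RIP on connected 172.16.x.x subnets, including the dial pool
      network 172.16.0.0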

  11. Dial-Related Technologies • COBRA (Central Office Based Remote Access) allows building virtual POPs by backhauling PRIs • RADIUS (Remote Authentication Dial In User Service) – authenticates and can provide some routing and netblock information about the customer logging in

  12. Integrated Services Digital Network • ISDN customers authenticate by RADIUS similarly to dial users • Most customers use BRI (2 B channels for a 128 kbps data rate) • underlying architecture is similar, but dial equipment is often administered differently • ISDN is maintained within the same AS as the backbone whereas dial is often in its own AS

  13. DS-1 and high-speed access • Customer connections usually multiplexed, come into DSU (data service unit) as a channelised DS-3 • gateway routers on ISP side usually Cisco 7500 series, increasingly using Cisco 12000 • customers connect using Cisco 1604, 2621, some 3600 series, very large customers use 7500 series routers

  14. Gateway Routers • obtain routes from customers usually statically, but sometimes by BGP • usually run a link-state IGP within the AS (eg. OSPF, IS-IS) • the Cisco 7513 has a 1.8 Gbps backplane while the 12008 offers 40 Gbps
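A minimal sketch, using hypothetical addresses, of a gateway router holding a static customer route and redistributing it into a link-state IGP such as OSPF (actual redistribution policy varies by ISP):
     ! static route toward the customer's router (netblock hypothetical)
     ip route 203.0.113.0 255.255.255.0 Serial0/1
     !
     router ospf 1
      network 10.1.1.0 0.0.0.255 area 0
      ! carry the static customer route in the IGP
      redistribute static subnets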

  15. Where does traffic go from here? • Most ISPs have two levels of networks above the access router • Metropolitan networks aggregate gateway traffic, generally city-wide if multiple points of presence (POPs) in city • transit networks aggregate metro network’s traffic, responsible for inter-city transport

  16. The Big Picture [network hierarchy diagram: TRANSIT layer (TR, TA routers), METRO layer (XR, HA routers), EDGE layer (GW, DR routers)]

  17. POPs and NAPs as real estate • Often located in the centre of cities (eg. the Ameritech NAP in Chicago) • 60 Hudson St, NYC is a “telco hotel”: a large number of telecoms companies have equipment there • Industrial buildings (because of high HVAC use) and often nondescript (both for cost and security reasons)

  18. ATM Switches • Terminate long-haul OC-12, OC-48 circuits and metro rings • Choice of vendor contingent on ISP, commonly Newbridge, Fore Systems (ASX-1000 and ASX-4000)

  19. Example of an ATM interface on TR1.EG1:
     interface ATM2/0
      description To HA13.BLAH1 3C1
      atm vc-per-vp 512
      atm pvc 16 0 16 ilmi
     !
     interface ATM2/0.195 point-to-point
      description To XR1.BLAH1 ATM6/0
      ip address 146.188.200.98 255.255.255.252
      ip router isis Net-Backbone
      atm pvc 195 0 195 aal5snap
      clns router isis Net-Backbone

  20. Tying it all Together • ATM devices perform switching functions at layer two • Within regional areas, routers use intra-domain routing protocols • To communicate with other regions and across peering points, an inter-domain routing protocol is used

  21. Slash Notation • Subnet masks can be unwieldy to write out, eg. 255.255.255.240 • Slash notation simplifies this: the number after the slash is the number of leading one bits in the mask, i.e. the bits ANDed with the address to form the network identifier • 192.168.1.0 255.255.255.0 = 192.168.1.0/24 • Nifty trick: the number of hosts in a netblock is easy to determine with slash notation - # usable hosts in a /x = 2^(32-x) - 2 • Therefore, there are 256 addresses in a /24, 254 usable (similarly, a /28 gives 2^4 - 2 = 14 usable hosts)

  22. Routing Protocols • Intra-domain (IGPs) • Distance-vector (RIP, IGRP) • Link-state (OSPF, IS-IS) • EIGRP is Cisco's advanced distance-vector (hybrid) IGP • Inter-domain (EGPs) • Path-vector: BGP • routes are chosen largely by the sequence of autonomous systems traversed, so the protocol carries a vector of AS numbers (the AS path) rather than router hop counts

  23. Autonomous Systems • An autonomous system (AS) is a group of routers with a single routing policy, running under a single administration • Different ISPs, and large companies, can have their own AS number • Where to get a number? In North America, ARIN (American Registry for Internet Numbers), in Europe, RIPE (Réseaux IP Européens), in Asia APNIC (Asia-Pacific Network Information Centre) – also the places for getting IP addresses

  24. Implementation of BGP • BGP runs between autonomous systems and peers, as well as to multi-homed customers • the monolithic AS is broken up into BGP confederations for manageability • Why BGP? Policies can be defined and routes controlled to a highly customisable degree using access lists and route maps – one can choose which routes to distribute to which neighbours (see the sketch below) • BGP can also run inside an AS – internal BGP (IBGP) carries transit traffic through the AS (like an Interstate through a county)
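As a sketch of the kind of policy control described above (the AS numbers, prefix and neighbour address are hypothetical), a prefix list and route map can restrict which routes are announced to a given neighbour:
     ip prefix-list CUST-ROUTES seq 5 permit 203.0.113.0/24
     !
     route-map TO-PEER permit 10
      match ip address prefix-list CUST-ROUTES
     !
     router bgp 64500
      neighbor 192.0.2.1 remote-as 64511
      ! announce only the prefixes matched by the route map
      neighbor 192.0.2.1 route-map TO-PEER out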

  25. BGP • Communities are destinations that share common attributes (eg. through access-list filters)
     BGP table version is 23718690, local router ID is 205.150.242.2
     Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
     Origin codes: i - IGP, e - EGP, ? - incomplete
        Network            Next Hop        Metric LocPrf Weight Path
     *>i24.64.0.0/19       198.133.49.7              100      0 6327 6172 i
     *>i24.64.0.0/14       198.133.49.7              100      0 6327 i
     *>i24.64.32.0/19      198.133.49.7              100      0 6327 6172 i
     *>i24.64.64.0/19      198.133.49.7              100      0 6327 6172 i
     *>i24.64.96.0/19      198.133.49.7              100      0 6327 6172 i
     *>i24.64.192.0/19     198.133.49.7              100      0 6327 6172 i
     *>i24.64.224.0/19     198.133.49.7              100      0 6327 6172 i
     *>i24.65.0.0/19       198.133.49.7              100      0 6327 6172 i
     *>i24.65.96.0/19      198.133.49.7              100      0 6327 6172 i
     *>i24.65.128.0/19     198.133.49.7              100      0 6327 6172 i

  26. Advantages of BGP for User • Allows for load-sharing and redundancy • routes can be biased through AS path prepending (adding one's own AS number to the path multiple times to make that route less favourable – see the sketch below) • the requirement is a high-quality router with close to 100% uptime to avoid connection flaps and subsequent route dampening (BGP penalises prefixes that go up and down frequently and will suppress routes from the offending network for a time)
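A hedged sketch of AS path prepending on a Cisco router (the AS numbers and neighbour address are hypothetical): repeating one's own AS makes the path look longer, so other networks prefer a different exit:
     route-map PREPEND-OUT permit 10
      ! lengthen the AS path as seen by this neighbour
      set as-path prepend 64496 64496 64496
     !
     router bgp 64496
      neighbor 198.51.100.1 remote-as 64499
      neighbor 198.51.100.1 route-map PREPEND-OUT out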

  27. Common Customer Issues • Static routes on backbone - often difficult to spot, can cause very strange routing results (very conducive to routing loops) • pull-up routes for netblocks smaller than /24, required to avoid BGP dampening (smaller customers tend to reset their equipment more often) • BGP recalculations - if done on a transit router, entire backbone segments can experience outages (tables are huge, currently over 103,000 prefixes in table)
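A pull-up route is typically a static route to Null0 that keeps the customer aggregate announced in BGP even when the more-specific route flaps; a minimal sketch with a hypothetical netblock:
     ! anchor the aggregate so it stays in the table when the customer resets
     ip route 203.0.112.0 255.255.252.0 Null0 250
     !
     router bgp 64500
      network 203.0.112.0 mask 255.255.252.0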

  28. Customer Requirements of the Backbone • Redundancy - networks are redundant but card failures can take down whole routers • the customer's physical connection to the POP is a single point of failure • low latency - massive increases in demand on the backbone make this difficult • over $2 million a day is spent on global backbone upgrades

  29. DSL: low cost, high speed • DSL might phase out ISDN connections • difficult to troubleshoot from a network standpoint • connections pass through the telco's frame relay or ATM cloud between the DSLAM (DSL access multiplexor – separates voice and data traffic by frequency) and the VR • the RedBack SMS (Subscriber Management System) 1000 is commonly used as the VR, though currently the SMS 10000 is the largest “carrier-class” routing switch (it can take in 24 OC-12s)

  30. RedBack SMS 1000 • Supports up to 4000 sessions • OC-3 out to metro network • traffic-shaping accomplished with profiles:
     atm profile samplecust
      counters
      shaping vbr-nrt pcr 1000 cdvt 100 scr 100 bt 10

  31. Increasing Capacity • Backbone capacity is increasing at a huge rate • Traffic engineering combined with high backplane capacity is becoming increasingly important • many ISPs are turning to Juniper routers • UUNET rolled out production OC-192c with Juniper M160s running MPLS

  32. Juniper Routers • Specialises in very large routers (the M160 has a 160 Gbps backplane) • JUNOS supports MPLS and RSVP; sample IGP configuration:
     isis {
         interface all;
     }
     ospf {
         area 0.0.0.0 {
             interface so-0/0/0 {
                 metric 15;
                 retransmit-interval 10;
                 hello-interval 5;
             }
         }
     }
     [edit]

  33. Network Abuse • Spam-killing – looking at the SMTP header for the source IP address and null-routing it • Open relay detection – ORBS et al. • DDoS attacks can be very detrimental to the backbone (even causing switch crashes) • Combated by rate-limiting ICMP on routers • Most effective defence is community-wide egress filtering; requires co-operation throughout the Internet
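Hedged Cisco IOS-style sketches of these countermeasures (addresses, interfaces and rates are hypothetical): a null route for an abusive source, CAR rate-limiting of ICMP, and an RFC 2827-style filter that only accepts packets sourced from the customer's own netblock:
     ! black-hole a single abusive host
     ip route 192.0.2.55 255.255.255.255 Null0
     !
     ! rate-limit ICMP on a backbone interface
     access-list 150 permit icmp any any
     interface POS3/0
      rate-limit input access-group 150 512000 8000 8000 conform-action transmit exceed-action drop
     !
     ! ingress filter on a customer edge: accept only their own sources
     access-list 160 permit ip 203.0.113.0 0.0.0.255 any
     access-list 160 deny ip any any
     interface Serial0/2
      ip access-group 160 in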

  34. Network Challenges eg. Canada • Geographically, population resides in virtually a straight line across the south • major focus is on southbound capacity to the US • CRTC regulations on telcos create different arrangements • heterogeneous network to the US, integration a big issue

  35. Costs • Network equipment is not cheap: a Cisco GSR can cost upwards of a quarter million dollars • Fibre and its transceivers are expensive to deploy ($100K/mile near rail, over $300K/mile in the city) • Interesting note: Sprint grew its all-fibre network quickly because it was laid on railway right-of-way (the SPR in Sprint initially stood for Southern Pacific Railroad) • Costs for backbone access? Currently ~ $1300 CDN + local loop cost for a burstable 128k T1, up to ~ $50K CDN for a full T3, much more for OC-3+ (USD costs similar)

  36. Questions? • Anything I can clarify or expand on... • Thank you!
