
Structure Preserving Anonymization of Router Configuration Data

David A. Maltz, Jibin Zhan, Geoffrey Xie, Hui Zhang (Carnegie Mellon University); Gisli Hjalmtysson, Albert Greenberg, Jennifer Rexford (AT&T Labs Research).


Presentation Transcript


  1. Structure Preserving Anonymization of Router Configuration Data • David A. Maltz, Jibin Zhan, Geoffrey Xie, Hui Zhang (Carnegie Mellon University) • Gisli Hjalmtysson, Albert Greenberg, Jennifer Rexford (AT&T Labs Research)

  2. Why Configuration Files are Valuable • Configuration file = program loaded on each router • Controls operation of router • Controls interactions between routers • Configuration files allow researchers to study the details of real networks • The problem is getting access to them • We have developed a technique for anonymizing configuration files • We have a proposal for how configs could be made accessible to the research community

  3. Why Configuration Files are Valuable - 2 • The set of configurations defines the network • Captures many of the network’s properties • Topology (node degree, interconnectivity) • Policies (CoS, QoS, packet filters, reachability) • Routing (neighbors, OSPF weights, BGP policies) • Security (vulnerabilities, mitigations) • Only source of insight for Enterprise networks • 10K+ networks that are currently a mystery • Interesting! 10 – 1200 routers, global scale • Configs are the only way to look at them • Networks firewalled, external probes dropped

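  4. Topology • [Figure: Router 1, Router 2, and the Internet] • Router 1 config: interface Serial1/0.5, ip address 1.1.1.1/30 • Router 2 config: interface Serial2/1.5, ip address 1.1.1.2/30 • Both addresses lie in 1.1.1.0/30, so the two interfaces are the two ends of one link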
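
The topology slide relies on matching interface subnets across config files. A minimal sketch of that inference follows, assuming the configs have already been reduced to (router, interface, address/prefix) tuples. The tuple layout, the router names, and the function name `infer_links` are illustrative and not from the talk.

```python
import ipaddress
from collections import defaultdict

def infer_links(interfaces):
    """Group interfaces by subnet; a subnet seen on 2+ interfaces is a candidate link."""
    by_subnet = defaultdict(list)
    for router, ifname, cidr in interfaces:
        # ip_interface("1.1.1.1/30").network is the enclosing subnet 1.1.1.0/30
        net = ipaddress.ip_interface(cidr).network
        by_subnet[net].append((router, ifname))
    return {net: ends for net, ends in by_subnet.items() if len(ends) > 1}

# The two interfaces shown on the slide (router names are hypothetical labels).
interfaces = [
    ("Router1", "Serial1/0.5", "1.1.1.1/30"),
    ("Router2", "Serial2/1.5", "1.1.1.2/30"),
]
links = infer_links(interfaces)
```

Because the grouping uses only the subnet relationship, it would keep working on anonymized configs as long as the anonymizer preserves prefixes.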

  5. Quality of Service • class-map GoodCustomer • match access-group 136 • policy-map GoldService • class GoodCustomer • bandwidth 2000 • queue-limit 40 • class class-default • fair-queue 16 • queue-limit 20 • interface Serial0/0 • service-policy output GoldService • Slide callouts: class definition; CB-WFQ parameters; CB-WFQ policy name

  6. Routing • router bgp 65501 • neighbor EdgeSwitch peer-group • neighbor EdgeSwitch remote-as 64740 • neighbor EdgeSwitch distribute-list 11 in • neighbor EdgeSwitch route-map exportRoutes out • neighbor 192.168.96.8 peer-group EdgeSwitch • neighbor 192.168.96.9 peer-group EdgeSwitch • neighbor 10.217.248.14 remote-as 65500 • neighbor 10.217.248.14 ebgp-multihop 5 • Slide callouts: AS numbers; policies; peers

  7. Security Issues • Access list 143 drops packets that can attack Cisco interfaces • access-list 143 deny 53 any any • access-list 143 deny 55 any any • access-list 143 deny 77 any any • access-list 143 permit ip any any • interface Serial0.2 multipoint • ip access-group 143 in • ip address 66.248.162.13 255.255.255.224 • interface Ethernet0 • ip address 144.201.41.59 255.255.255.0 • Serial0.2 applies access list 143 inbound, so this interface is safe; Ethernet0 applies no access list, so it is not
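
The check illustrated on this slide can be automated. The sketch below assumes IOS-style text in which sub-commands of an interface stanza are indented by at least one space. The function name `interfaces_without_inbound_acl` is hypothetical, and the parsing is deliberately naive (no published grammar exists, as a later slide notes).

```python
def interfaces_without_inbound_acl(config: str):
    """Return interface names that have no 'ip access-group <n> in' line."""
    protected = {}          # interface name -> True if an inbound ACL is applied
    current = None          # interface stanza we are currently inside, if any
    for raw in config.splitlines():
        if not raw.strip():
            continue
        if raw.startswith("interface "):
            current = raw.split()[1]
            protected[current] = False
        elif raw[0].isspace():
            words = raw.split()
            if current and words[:2] == ["ip", "access-group"] and words[-1] == "in":
                protected[current] = True
        else:
            current = None  # any other top-level command ends the stanza
    return [name for name, ok in protected.items() if not ok]

SLIDE_CONFIG = """\
access-list 143 deny 53 any any
access-list 143 permit ip any any
interface Serial0.2 multipoint
 ip access-group 143 in
 ip address 66.248.162.13 255.255.255.224
interface Ethernet0
 ip address 144.201.41.59 255.255.255.0
"""
exposed = interfaces_without_inbound_acl(SLIDE_CONFIG)
```

The sketch only checks that some inbound access list is applied. It does not verify that the list actually denies the harmful protocols.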

  8. How to Get Configuration Files? • Considered proprietary secrets of network owners • Discloses business strategy • Discloses vulnerabilities • Anonymization breaks tie between data and owner • Anonymized configs will show some network is vulnerable, but which/where to attack? • We developed a method for anonymizing configuration files • Approach convinced some customers of AT&T to disclose their configs to CMU researchers

  9. Anonymization Challenges • We don’t know the intended use of the data • Must anonymize entire configuration file • A customized data set is easier to anonymize • Must preserve structure of information in files • Relationships of identifiers inside/between files • IP address subnet relationships • Traditional parsing tools are of no use • No published grammar for Cisco IOS • 200+ different versions seen in 31 networks

  10. Anonymize Non-numeric Tokens • Created “pass list” of words by string-scraping Cisco’s web pages • Contains most IOS commands • Other words are generic networking terms (“IETF”) • All tokens not in pass list are hashed with salted SHA1
      Before: router bgp 64780; redistribute ospf 64 match route-map NYOffice; neighbor 1.2.3.4 remote-as 701; route-map NYOffice deny 10; match ip address 4
      After: router bgp 64780; redistribute ospf 64 match route-map 8aTzlvBrbaW; neighbor 66.253.160.68 remote-as 701; route-map 8aTzlvBrbaW deny 10; match ip address 4
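
The pass-list idea can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' tool: `PASS_LIST` is a tiny hand-picked stand-in for the scraped word list, `SALT` is a made-up secret, and the digest is truncated to 11 hex characters, so the output will not look like the `8aTzlvBrbaW` shown on the slide. Numeric tokens and addresses are passed through here because separate rules handle them.

```python
import hashlib
import re

# Hypothetical stand-ins: the real pass list was scraped from Cisco's web pages.
PASS_LIST = {"router", "bgp", "redistribute", "ospf", "match", "route-map",
             "neighbor", "remote-as", "deny", "ip", "address"}
SALT = b"example-secret-salt"

def hash_token(token: str) -> str:
    """Salted SHA-1 of a token, truncated for readability."""
    return hashlib.sha1(SALT + token.encode()).hexdigest()[:11]

def anonymize_line(line: str) -> str:
    out = []
    for tok in line.split():
        if tok in PASS_LIST or re.fullmatch(r"[\d./]+", tok):
            out.append(tok)              # known-safe word, number, or address
        else:
            out.append(hash_token(tok))  # anything else is hashed
    return " ".join(out)

before = "route-map NYOffice deny 10"
after = anonymize_line(before)
```

Because the hash is deterministic for a given salt, every occurrence of `NYOffice` maps to the same string, which is what keeps references between route-map definitions and their uses intact.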

  11. Anonymize Specific Numbers • Most numbers are harmless, some reveal identity • Public AS numbers • Phone numbers (NOCs, backup modems) • 26 rules used to find and anonymize context-dependent items • "neighbor\s+$ipAddrPatt\s+remote-as" • "neighbor\s+\w+\s+remote-as"
      Before: router bgp 64780; redistribute ospf 64 match route-map NYOffice; neighbor 1.2.3.4 remote-as 701; route-map NYOffice deny 10; match ip address 4
      After: router bgp 64780; redistribute ospf 64 match route-map 8aTzlvBrbaW; neighbor 66.253.160.68 remote-as 1237; route-map 8aTzlvBrbaW deny 10; match ip address 4
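
One of the context-dependent rules might look like the sketch below. It merges the two `remote-as` patterns from the slide into a single regular expression. Assumptions: `IP_ADDR_PATT` is a simplified stand-in for the slide's `$ipAddrPatt`, `asn_map` is a hypothetical lookup table supplied by the caller, and the private range 64512 to 65534 follows RFC 6996.

```python
import re

IP_ADDR_PATT = r"\d{1,3}(?:\.\d{1,3}){3}"   # simplified dotted-quad pattern
REMOTE_AS_RULE = re.compile(
    rf"(neighbor\s+(?:{IP_ADDR_PATT}|\w+)\s+remote-as\s+)(\d+)"
)

def is_private_asn(asn: int) -> bool:
    return 64512 <= asn <= 65534

def anonymize_remote_as(line: str, asn_map: dict) -> str:
    """Replace a public AS number after 'remote-as'; leave private ones alone."""
    def repl(match):
        asn = int(match.group(2))
        if is_private_asn(asn):
            return match.group(0)
        return match.group(1) + str(asn_map[asn])
    return REMOTE_AS_RULE.sub(repl, line)

# Mapping taken from the slide's example: 701 becomes 1237.
result = anonymize_remote_as("neighbor 1.2.3.4 remote-as 701", {701: 1237})
```

Anchoring the rule on the surrounding keywords is what lets it change `701` here without touching harmless numbers such as the `10` in `deny 10`.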

  12. Limits of Anonymization • Anonymization is a lossy process • Comments & meaningful identifiers removed • (Were they right anyway???) • Anonymizer preserves relationships it knows about • Doesn’t know about IP addr <-> ASN mapping • A packet filter, based on IP address, and route policy, based on ASN, could target same AS • Post-anonymization: both mechanisms preserved, but won’t show them targeting same AS • (Router didn’t have that external information either)

  13. Potential Vulnerabilities: Textual Attacks • Identifying information left in configs • Heuristics used as double-check • Rules that anonymize public AS numbers record the public AS numbers they find • Search post-anonymization file for any remaining occurrences
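
The double-check described on this slide could be sketched as follows. The function name `find_leaks` and the sample text are hypothetical. The only assumption is that the anonymization rules hand over the set of public AS numbers they recorded.

```python
import re

def find_leaks(anonymized_text: str, recorded_asns):
    """Return {asn: [lines]} for recorded AS numbers still present in the output."""
    leaks = {}
    for asn in recorded_asns:
        # Match the number only when it is not embedded in a longer digit run.
        pattern = re.compile(rf"(?<!\d){asn}(?!\d)")
        hits = [ln for ln in anonymized_text.splitlines() if pattern.search(ln)]
        if hits:
            leaks[asn] = hits
    return leaks

sample = (
    "router bgp 64780\n"
    " neighbor 66.253.160.68 remote-as 1237\n"
    " neighbor 66.253.160.69 password AS701key\n"
)
leaks = find_leaks(sample, {701})
```

A hit is only a flag for manual review: the same digits can legitimately appear as, say, an access-list number, which is why the slide calls this a heuristic.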

  14. Potential Vulnerabilities: Fingerprinting Attacks • Network characteristics (fingerprint) extracted from anonymized configs matched against public data • Potential fingerprints • BGP community strings • Number of POPs, number of BGP peers • Structure of address space utilization • Others… • Evaluation still in progress • Seems like backbone networks are identifiable • Seems like enterprise networks are not

  15. A Clearinghouse for Configuration Data • [Diagram: network owners and researchers interact through a website enforcing a single-blind methodology. Labels: retrieve anonymizer; anonymize & test configs; upload configs; register with site; retrieve configs; analyze data; run tools on site (scalable, pictures); questions and results exchanged by blinded email] • Boot-strap with configs from academic/research institutions?

  16. Questions?

  17. Fingerprinting Attacks • 1. For each anonymized network, compute fingerprint from anonymized config files • Will be 100% accurate • 2. Experimentally measure real networks • [Plot: BGP peers per POP vs. POPs (sorted by peers/POP); data from networks in repository of anonymized configs]

  18. Fingerprinting Attacks • Evaluation still in progress • Seems like backbone networks are identifiable • Seems like enterprise networks are not • [Plot: measured network characteristics; BGP peers per POP vs. POPs (sorted by peers/POP)]

  19. Anonymize Regular Expressions • Some AS numbers appear in regular expressions • Expressions w/ only private AS numbers → no change • Example: ip as-path access-list 99 permit _6451[2-9]_ matches 64512, 64513, … 64519 and stays ip as-path access-list 99 permit _6451[2-9]_ • Expressions w/ public AS numbers → expand and anonymize • Example: ip as-path access-list 101 permit _70[1-3]_ expands to 701, 702, 703, which anonymize to 1234, 543, 21, giving ip as-path access-list 101 permit _(1234|543|21)_
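
The expand-and-anonymize step can be sketched by brute force: test every 16-bit AS number against the expression, then rebuild it as an alternation of the mapped values. This is an illustration, not the authors' implementation. It assumes the expression is a single AS pattern wrapped in `_` delimiters whose body is also valid Python regex syntax, and `asn_map` is a hypothetical lookup table.

```python
import re

def is_private_asn(asn: int) -> bool:
    return 64512 <= asn <= 65534

def expand_asn_regex(pattern: str, max_asn: int = 65535):
    """List every AS number matched by an expression such as '_70[1-3]_'."""
    body = re.compile(pattern.strip("_"))
    return [asn for asn in range(1, max_asn + 1) if body.fullmatch(str(asn))]

def rewrite_asn_regex(pattern: str, asn_map: dict) -> str:
    asns = expand_asn_regex(pattern)
    if all(is_private_asn(a) for a in asns):
        return pattern                      # only private AS numbers: no change
    mapped = [str(a) if is_private_asn(a) else str(asn_map[a]) for a in asns]
    return "_(" + "|".join(mapped) + ")_"

# The mapping below reuses the slide's example values.
rewritten = rewrite_asn_regex("_70[1-3]_", {701: 1234, 702: 543, 703: 21})
```

Enumerating roughly 65,000 candidates per expression is cheap, and it avoids having to parse the character classes by hand.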

  20. Anonymize IP Addresses • Extended Minshall’s prefix-preserving algorithm • Made it class preserving • Class A to Class A, etc. • RIP and older protocols are classful • Made it “subnet address” preserving • Assume 128.2.0.0/16 is a subnet • We want 128.2.0.0 → 150.7.0.0 • Before extension, 128.2.0.0 → 150.7.43.66
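
To make "prefix preserving" and "class preserving" concrete, here is a minimal sketch. It is not Minshall's table-based algorithm and not the authors' extension: it swaps in a keyed-hash, bit-by-bit construction, where each output bit is the input bit XORed with a pseudorandom bit derived from the preceding input bits. The leading bits that determine the classful class are copied unchanged. The subnet-address extension from the slide is not implemented, and the key is a made-up value.

```python
import hashlib
import ipaddress

def class_bits(bits: str) -> int:
    """Count the leading bits that fix the class: 0, 10, 110, 1110 or 1111."""
    n = 0
    for b in bits[:4]:
        n += 1
        if b == "0":
            break
    return n

def anonymize_ip(addr: str, key: bytes) -> str:
    bits = format(int(ipaddress.IPv4Address(addr)), "032b")
    keep = class_bits(bits)
    out = list(bits[:keep])                       # class bits pass through
    for i in range(keep, 32):
        # The flip bit depends only on the preceding input bits, so two
        # addresses sharing a k-bit prefix get the same first k output bits.
        flip = hashlib.sha256(key + bits[:i].encode()).digest()[0] & 1
        out.append(str(int(bits[i]) ^ flip))
    return str(ipaddress.IPv4Address(int("".join(out), 2)))

def common_prefix_len(a: str, b: str) -> int:
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()

KEY = b"example-key"   # hypothetical secret
x, y = anonymize_ip("128.2.1.5", KEY), anonymize_ip("128.2.200.9", KEY)
```

With this construction `common_prefix_len(x, y)` equals the prefix length shared by the two original addresses, which is the property that keeps subnet relationships visible after anonymization.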

  21. Anonymize IP Addresses - 2 • Made it “special address” preserving • Multicast, private address space • Must fix collisions in mapping function • [Flowchart labels: IP Addr; Special? (Y/N); Anonymize; Special? (Y/N)]
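
One reading of the flowchart is: pass special addresses through untouched, anonymize everything else, and retry whenever the result lands in special space or on an address already handed out. The sketch below follows that reading, which is an assumption. `toy_map` is a plain salted hash standing in for the prefix-preserving function, and `SPECIAL_NETS` lists only the RFC 1918 private blocks and the multicast block.

```python
import hashlib
import ipaddress

SPECIAL_NETS = [ipaddress.ip_network(n) for n in
                ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "224.0.0.0/4")]

def is_special(addr: ipaddress.IPv4Address) -> bool:
    return any(addr in net for net in SPECIAL_NETS)

def toy_map(addr, salt: bytes, attempt: int) -> ipaddress.IPv4Address:
    """Stand-in mapping (NOT prefix preserving): first 4 bytes of a salted hash."""
    data = salt + addr.packed + attempt.to_bytes(4, "big")
    return ipaddress.IPv4Address(hashlib.sha1(data).digest()[:4])

class SpecialPreservingMapper:
    def __init__(self, salt: bytes):
        self.salt = salt
        self.forward = {}    # original address -> anonymized address
        self.used = set()    # anonymized addresses already assigned

    def map(self, text: str) -> str:
        addr = ipaddress.IPv4Address(text)
        if is_special(addr):
            return str(addr)                      # special addresses unchanged
        if addr not in self.forward:
            attempt = 0
            candidate = toy_map(addr, self.salt, attempt)
            # Collision fix: retry if the output is special or already taken.
            while is_special(candidate) or candidate in self.used:
                attempt += 1
                candidate = toy_map(addr, self.salt, attempt)
            self.forward[addr] = candidate
            self.used.add(candidate)
        return str(self.forward[addr])

mapper = SpecialPreservingMapper(b"example-salt")
anonymized = mapper.map("66.248.162.13")
```

Retrying with a counter is simple, but it would break prefix preservation in a real tool, so it only shows where the collision check sits in the flow.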

  22. Anonymization Overview • Minimize dependence on context • If in doubt, hash it out • Remove all comments • Find all IP addresses and hash using specialized prefix-preserving anonymization • Hash all non-numeric tokens not known to be safe • Anonymize specific numeric tokens using regular expressions • Anonymize regular expressions appearing in configs
