Secure Network Provenance



  1. Secure Network Provenance • Wenchao Zhou*, Qiong Fei*, Arjun Narayan*, Andreas Haeberlen*, Boon Thau Loo*, Micah Sherr+ • *University of Pennsylvania, +Georgetown University • http://snp.cis.upenn.edu/

  2. Motivation • An example scenario: network routing • System administrator observes strange behavior • Example: the route to foo.com has suddenly changed • What exactly happened (innocent reason or malicious attack)? [Figure: Alice, routers A, B, C, D, and E, and foo.com, with routes r1 and r2. Alice asks: "Why did my route to foo.com change?!" Innocent reason? Malicious attack?]

  3. We Need Secure Forensics • For network routing … • Example: incident in March 2010 • Traffic from Capitol Hill got redirected • … but also for other application scenarios • Distributed hash table: Eclipse attack • Cloud computing: misbehaving machines • Online multi-player gaming: cheating • Goal: secure forensics in adversarial scenarios

  4. Ideal Solution • Q: Explain why the route to foo.com changed to r2. • A: Because someone accessed Router D and changed the configuration from X to Y. • Not realistic: adversary can tell lies [Figure: Alice ("Why did my route to foo.com change?!") queries the network of routers A, B, C, D, and E on the path to foo.com; the network returns the answer]

  5. Challenge: Adversaries Can Lie • Q: Explain why the route to foo.com changed to r2. • Problem: adversary can … • … fabricate plausible (yet incorrect) response • … point accusation towards innocent nodes [Figure: Alice queries the network of routers A, B, C, D, and E. The adversary thinks "I should cover up the intrusion." and answers "Everything is fine. Router E advertised a new route."]

  6. Existing Solutions • Existing systems assume trusted components • Trusted OS kernel, monitor, or hardware • E.g. Backtracker [OSDI 06], PASS [USENIX ATC 06], ReVirt [OSDI 02], A2M [SOSP 07] • These components may have bugs or be compromised • Are there alternatives that do not require such trust? • Our solution: • We assume no trusted components • Adversary has full control over an arbitrary subset of the network (Byzantine faults)

  7. Ideal Guarantees • Ideally: explanation is always complete and accurate (fundamentally impossible) • Fundamental limitations • E.g. faulty nodes secretly exchange messages • E.g. faulty nodes communicate outside the system • What guarantees can we provide?

  8. Realistic Guarantees • Q: Why did my route to foo.com change to r2? • A: Because someone accessed Router D and changed its router configuration from X to Y. • No faults: explanation is complete and accurate • Byzantine fault: explanation identifies at least one faulty node • Formal definitions and proofs in the paper [Figure: Alice queries the network of routers A, B, C, D, and E on the path to foo.com and concludes: "Aha, at least I know which node is compromised."]

  9. Outline • Goal: A secure forensics system that works in an adversarial environment • Explains unexpected behavior • No faults: explanation is complete and accurate • Byzantine fault: exposes at least one faulty node with evidence • Model: Secure Network Provenance • Tamper-evident Maintenance and Processing • Evaluation • Conclusion

  10. Provenance as Explanations • Origin: data provenance in databases • Explains the derivation of tuples (ExSPAN [SIGMOD 10]) • Captures the dependencies between tuples as a graph • “Explanation” of a tuple is a tree rooted at the tuple [Figure: provenance graph laid over the network of Alice, routers A, B, C, D, and E, and foo.com; vertices include route(A, foo.com), route(B, foo.com), route(C, foo.com), route(D, foo.com), route(E, foo.com), route(A, B), route(A, D), link(A, B), link(B, C), link(C, foo.com), link(D, E), and link(E, B)]
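
The idea that the explanation of a tuple is a tree rooted at that tuple can be sketched in a few lines. This is an illustration only, not the ExSPAN or SNooPy data model: the `DERIVED_FROM` map and the `explain` helper are hypothetical names, and the dependencies are a guess at the A, B, C branch of the slide's figure.

```python
# Hypothetical dependency map: each derived tuple lists the tuples it was
# derived from. Base tuples (the links) have no entry of their own.
DERIVED_FROM = {
    "route(A, foo.com)": ["link(A, B)", "route(B, foo.com)"],
    "route(B, foo.com)": ["link(B, C)", "route(C, foo.com)"],
    "route(C, foo.com)": ["link(C, foo.com)"],
}


def explain(tuple_name, depth=0):
    """Return the provenance tree of tuple_name as a list of indented lines."""
    lines = ["  " * depth + tuple_name]
    for child in DERIVED_FROM.get(tuple_name, []):
        lines.extend(explain(child, depth + 1))
    return lines


tree = explain("route(A, foo.com)")
print("\n".join(tree))
```

The root of the printed tree is the tuple being explained, and the leaves are base tuples such as link(C, foo.com) that need no further explanation.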

  11. Secure Network Provenance • Challenge #1. Handle past and transient behavior • Traditional data provenance targets current, stable state • What if the system never converges? • What if the state no longer exists? [Figure: provenance tree for route(A, foo.com) over link(A, B), route(B, foo.com), link(B, C), route(C, foo.com), and link(C, foo.com)]

  12. Secure Network Provenance • Challenge #1. Handle past and transient behavior • Traditional data provenance targets current, stable state • What if the system never converges? • What if the state no longer exists? • Solution: Add a temporal dimension [Figure: timeline showing the provenance graph at Time = t1, t2, and t3, over the tuples route(A, foo.com), route(B, foo.com), route(C, foo.com), link(A, B), link(B, C), and link(C, foo.com)]

  13. Secure Network Provenance • Challenge #2. Explain changes, not just state • Traditional data provenance targets system state • Often more useful to ask why a tuple (dis)appeared • Solution: Include “deltas” in provenance [Figure: the same timeline at Time = t1, t2, and t3, now with delta vertices +route(A, foo.com), +route(B, foo.com), +route(C, foo.com), and +link(C, foo.com) alongside the state vertices]
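
A minimal sketch of how a temporal dimension and deltas fit together, assuming a log of timestamped "+" (appeared) and "-" (disappeared) events. The `EVENTS` timeline, `state_at`, and `appeared_at` are hypothetical and only loosely follow the slide's figure; they are not the paper's representation.

```python
# Hypothetical event log: (time, sign, tuple).
# "+" means the tuple appeared, "-" means it disappeared.
EVENTS = [
    (1, "+", "link(C, foo.com)"),
    (1, "+", "route(C, foo.com)"),
    (2, "+", "link(B, C)"),
    (2, "+", "route(B, foo.com)"),
    (3, "+", "link(A, B)"),
    (3, "+", "route(A, foo.com)"),
]


def state_at(events, t):
    """Replay all deltas up to and including time t; return the live tuples."""
    live = set()
    for time, sign, tup in sorted(events, key=lambda e: e[0]):
        if time > t:
            break
        if sign == "+":
            live.add(tup)
        else:
            live.discard(tup)
    return live


def appeared_at(events, tup):
    """Return every time at which tup appeared (a '+' delta)."""
    return [time for time, sign, name in events if sign == "+" and name == tup]


print(sorted(state_at(EVENTS, 2)))
print(appeared_at(EVENTS, "route(A, foo.com)"))
```

Keeping the deltas rather than only the latest state is what lets a query ask about a past moment (`state_at`) or about the change itself (`appeared_at`), even after the state has been replaced.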

  14. Secure Network Provenance • Challenge #3. Partition and secure provenance • A trusted node would be ideal, but we don’t have one • Need to partition the graph among the nodes themselves • Prevent nodes from altering the graph [Figure: provenance graph with vertices route(A, foo.com), route(B, foo.com), route(C, foo.com), route(D, foo.com), route(E, foo.com), link(A, B), link(B, C), link(C, foo.com), link(D, E), and link(E, B)]

  15. Partitioning the Provenance Graph • Step 1: Each node keeps vertices about local actions • Split cross-node communications • Step 2: Make the graph tamper-evident [Figure: the same provenance graph partitioned by node, with a cross-node edge split into a SEND vertex and a RECV vertex]

  16. Securing Cross-Node Edges • Step 1: Each node keeps vertices about local actions • Split cross-node communications • Step 2: Make the graph tamper-evident • Secure cross-node edges (evidence of omissions) [Figure: Router A and Router B with SEND and RECV vertices; the cross-node edge carries a signed commitment from B and a signed ACK from A]
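
The exchange of a signed commitment and a signed ACK across a cross-node edge can be sketched as follows. This is a simplified stand-in, not the SNP protocol: it uses HMAC tags from Python's standard library in place of real public-key signatures (HMAC does not give non-repudiation), and the `KEYS` table, the `SEND|`/`RECV|` message framing, and the helper names are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical per-node keys. A real deployment would use public-key
# signatures so that third parties can check the evidence.
KEYS = {"A": b"key-of-A", "B": b"key-of-B"}


def sign(node, payload):
    """Stand-in for a signature: an HMAC-SHA256 tag under the node's key."""
    return hmac.new(KEYS[node], payload, hashlib.sha256).digest()


def verify(node, payload, tag):
    """Check that tag is the node's tag over payload."""
    return hmac.compare_digest(sign(node, payload), tag)


# Router B sends an update to Router A together with a commitment to the SEND.
msg = b"+route(B, foo.com)"
commitment = sign("B", b"SEND|" + msg)

# Router A checks the commitment, records the RECV, and returns an ACK.
commitment_ok = verify("B", b"SEND|" + msg, commitment)
ack = sign("A", b"RECV|" + msg)

# Router B checks the ACK and keeps it as evidence that A received the message.
ack_ok = verify("A", b"RECV|" + msg, ack)
print(commitment_ok, ack_ok)
```

Each side ends up holding something signed by the other: if B later omits the SEND from its log, A can present B's commitment, and if A denies the RECV, B can present A's ACK.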

  17. Outline • Goal: A secure forensics system that works in an adversarial environment • Explains unexpected behavior • No faults: explanation is complete and accurate • Byzantine fault: exposes at least one faulty node with evidence • Model: Secure Network Provenance • Tamper-evident Maintenance and Processing • Evaluation • Conclusion

  18. System Overview • Extract provenance • Maintain provenance • Query provenance • Stand-alone provenance system • On-demand provenance reconstruction • Provenance graph can be huge (with temporal dimension) • Rebuild only the parts needed to answer a query [Figure: users and an operator; the primary system (application) runs alongside the provenance system (query engines and maintenance engines) over the network]

  19. Extracting Dependencies • Option 1: Inferred provenance • Declarative specifications explicitly capture provenance • E.g. Declarative networking, SQL queries, etc. • Option 2: Reported provenance • Modified source code reports provenance • Option 3: External specification • Defined on observed I/Os of a black-box system

  20. Secure Provenance Maintenance • Maintain sufficient information for reconstruction • I/O and non-deterministic events are sufficient • Logs are maintained using tamper-evident logging • Based on ideas from PeerReview [SOSP 07] [Figure: Alice, routers A, B, C, D, and E, and foo.com; a node's log contains entries such as RECV, ACK, SEND, and RCV-ACK]
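
One common way to make a log tamper-evident is a hash chain, in which each entry's hash covers the previous hash. The sketch below shows that general technique only; it is not PeerReview's or SNooPy's log format, and `GENESIS`, `entry_hash`, `append`, `verify`, and the sample entries are hypothetical.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value for the chain


def entry_hash(prev_hash, entry):
    """Hash of an entry, chained to the hash of the previous entry."""
    return hashlib.sha256((prev_hash + "|" + entry).encode()).hexdigest()


def append(log, entry):
    """Append entry to log together with its chained hash."""
    prev = log[-1][1] if log else GENESIS
    log.append((entry, entry_hash(prev, entry)))


def verify(log, committed_head=None):
    """Recompute the chain; optionally compare against a previously seen head."""
    prev = GENESIS
    for entry, digest in log:
        if entry_hash(prev, entry) != digest:
            return False
        prev = digest
    return committed_head is None or prev == committed_head


log = []
for e in ["RECV +route(C, foo.com) from C", "SEND +route(B, foo.com) to A", "ACK from A"]:
    append(log, e)
head = log[-1][1]  # the value a node would commit to when talking to others
print(verify(log, head))
```

The chain alone only detects edits that leave old hashes in place. Detecting a fully rewritten log requires that some other party already holds an earlier head hash, which is why the sketch lets `verify` compare against `committed_head`.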

  21. Secure Provenance Querying • Recursively construct the provenance graph • Retrieve secure logs from remote nodes • Check for tampering, omission, and equivocation • Replay the log to regenerate the provenance graph [Figure: Alice asks "Explain the route from A to foo.com."; graph so far: route(A, foo.com), link(A, B), RECV (from B)]

  22. Secure Provenance Querying • Recursively construct the provenance graph • Retrieve secure logs from remote nodes • Check for tampering, omission, and equivocation • Replay the log to regenerate the provenance graph [Figure: graph so far: route(A, foo.com), link(A, B), route(B, foo.com), link(B, C), RECV (from C)]

  23. Secure Provenance Querying • Recursively construct the provenance graph • Retrieve secure logs from remote nodes • Check for tampering, omission, and equivocation • Replay the log to regenerate the provenance graph [Figure: completed graph: route(A, foo.com), link(A, B), route(B, foo.com), link(B, C), route(C, foo.com), link(C, foo.com). Alice: "OK. Now I know how the route was derived."]
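
The recursive construction on slides 21 to 23 can be sketched as a query that follows RECV vertices to the node that sent the tuple. This is a toy model under stated assumptions: `NODE_LOGS` stands in for the already-replayed local provenance of each node, the `("RECV", node, tuple)` markers and the `query` function are hypothetical, and the checks for tampering, omission, and equivocation are left out.

```python
# Hypothetical local provenance per node. A dependency is either a local
# base tuple (a string) or a ("RECV", sender, tuple) marker that points
# at the node the tuple was received from.
NODE_LOGS = {
    "A": {"route(A, foo.com)": ["link(A, B)", ("RECV", "B", "route(B, foo.com)")]},
    "B": {"route(B, foo.com)": ["link(B, C)", ("RECV", "C", "route(C, foo.com)")]},
    "C": {"route(C, foo.com)": ["link(C, foo.com)"]},
}


def query(node, tup, fetched=None):
    """Build the provenance tree of tup, fetching remote logs only when needed.

    Returns ((tuple, children), fetched) where fetched lists the nodes whose
    logs were retrieved, in order of first use.
    """
    if fetched is None:
        fetched = []
    if node not in fetched:
        fetched.append(node)  # stands in for downloading and checking the log
    children = []
    for dep in NODE_LOGS[node].get(tup, []):
        if isinstance(dep, tuple):
            _, remote, remote_tup = dep
            subtree, _ = query(remote, remote_tup, fetched)
            children.append(subtree)
        else:
            children.append((dep, []))
    return (tup, children), fetched


tree, fetched = query("A", "route(A, foo.com)")
print(fetched)
```

Only the logs of nodes that actually appear in the explanation are touched, which mirrors the on-demand reconstruction described earlier in the deck.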

  24. Outline • Goal: A secure forensics system that works in an adversarial environment • Explains unexpected behavior • No faults: explanation is complete and accurate • Byzantine fault: exposes at least one faulty node with evidence • Model: Secure Network Provenance • Tamper-evident Maintenance and Processing • Evaluation • Conclusion

  25. Evaluation Results • Prototype implementation (SNooPy) • How useful is SNP? Is it applicable to different systems? • How expensive is SNP at runtime? • Traffic overhead, storage cost, additional CPU overhead? • Does SNP affect scalability? • What is the querying performance? • Per-query traffic overhead? • Turnaround time for each query?

  26. Usability: Applications • We evaluated SNooPy with • Quagga BGP: RouteView (external specification) • Explains oscillation caused by router misconfiguration • Hadoop MapReduce (reported provenance) • Detects a tampered Mapper that returns inaccurate results • Declarative Chord DHT (inferred provenance) • Detects an Eclipse attacker that always returns its own ID • SNooPy solves problems reported in existing work

  27. Runtime Overhead: Storage • Manageable storage overhead • One week of data: e.g. Quagga: 7.3 GB; Chord: 665 MB • Over 50% of the overhead was due to signatures and acks; batching messages would help

  28. Query Latency • Query latency varies from application to application • Reasonable overhead [Figure: per-query latency broken down into verification, replay, and download, with callouts "largely due to replaying logs" and "dominated by verifying logs and snapshots"]

  29. Summary • Questions? • Project website: http://snp.cis.upenn.edu/ • Secure network provenance in untrusted environments • Requires no trusted components • Strong guarantees even in the presence of Byzantine faults • Formal proof in a technical report • Significantly extends traditional provenance model • Past and transient state, provenance of change, … • Efficient storage: reconstructs provenance graph on demand • Application-independent (Quagga, Hadoop, and Chord)
