
Logic-based, data-driven enterprise network security analysis

Xinming (Simon) Ou, Assistant Professor, CIS Department, Kansas State University. COS 598D: Formal Methods in Networking, Princeton University, March 08, 2010.


Presentation Transcript


  1. Logic-based, data-driven enterprise network security analysis Xinming (Simon) Ou Assistant Professor CIS Department Kansas State University COS 598D: Formal Methods in Networking Princeton University March 08, 2010

  2. Self Introduction • Brief Bio • PhD, Princeton University, 2005 • Post-doc, Purdue CERIAS, Idaho National Laboratory, 2006 • Assistant Professor, Kansas State University, 2006-now • Research Interests • Computer and network security, especially on formal and quantitative analysis • Programming languages, formal methods • Research Group • Argus: http://people.cis.ksu.edu/~xou/argus/

  3. Overview of the two lectures • Lecture One • Datalog model for network attacks • SLG resolution for Datalog evaluation • Exhaustive proof generation for Datalog • Lecture Two • Formulating the security hardening problem as a SAT-solving problem • Applying MinCostSAT to achieve optimal security configuration • Open research problems

  4. Cyber Defender’s Life [Diagram: a reasoning system provides automated situation awareness from users and data assets, IDS alerts, network configuration, vulnerability reports, and security advisories (e.g. "Apache 1.3.4 bug!").]

  5. Multi-step Attacks [Diagram: Internet, Firewall 1, demilitarized zone (DMZ) containing webServer, Firewall 2, and the corporation network containing workStation and fileServer (webPages, sharedBinary). Attack labels: buffer overrun, NFS shell, Trojan horse.]

  6. Two Questions • Are there potential attack paths in the system? • How can they happen? • How can they be addressed in an optimal way? • Are there attacks that are going on/have succeeded in the system? • How do you know? • How to counter the attack? What we are going to focus on

  7. MulVAL (Ou, Govindavajhala, and Appel. Usenix Security 2005) • Question: Could root be compromised on any of the machines? • Analyzer inputs: Datalog rules from security experts; vulnerability information (e.g. NIST NVD); user information; vulnerability scanner results, driven by vulnerability definitions (e.g. OVAL, Nessus Scripting Language); network reachability information from a network analyzer • Analyzer output: answers

  8. Host access-control lists reachable(internet, webServer, tcp, 80) reachable(webServer, fileServer, nfs, -) . . . Network config (firewall analyzer)

  9. File permissions fileOwner(webServer, /bin/apache, root) fileAttr(webServer, /bin/apache, r,w,x,r,0,0,r,0,0) Host config scanner

  10. Installed software … … vulExists(dbServer, 'CVE-2009-2446', mySQL). vulExists(webServer, 'CVE-2006-3747', httpd). Host-based vulnerability scanner

  11. US-CERT NVD Apache 1.3.4 bug! Security advisories … … vulProperty('CVE-2009-2446', remote, privEscalation). vulProperty('CVE-2006-3747', remote, privEscalation).

  12. Datalog Rules Linux security behavior; Windows security behavior; Common attack techniques execCode(Host, PrivilegeLevel) :- vulExists(Host, Program, remote, privilegeEscalation), serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel), networkAccess(Host, Protocol, Port). Security expert The rules are completely independent of any site-specific settings.

  13. Rule for NFS accessFile(Server, Access, Path) :- nfsExport(Server, Path, Access, Client), reachable(Client, Server, nfs, -), execCode(Client, _Perm). [Diagram labels: dmz, webServer, NFS shell, sharedBinary, corp, webPages, fileServer.]

  14. Rule for Trojan Horse execCode(H, User) :- accessFile(H, write, Path), fileOwner(H, Path, User). [Diagram labels: projectPlan, sharedBinary, Trojan horse, corp, webPages, fileServer, workStation.]

  15. Deducing new facts • Rule: execCode(Host, PrivilegeLevel) :- vulExists(Host, Program, remote, privilegeEscalation), serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel), networkAccess(Host, Protocol, Port). • networkAccess(webServer, tcp, 80). (derived) • serviceRunning(webServer, httpd, tcp, 80, apache). (from vulnerability scanner) • vulExists(webServer, httpd, remote, privilegeEscalation). (from vulnerability scanner & NVD) • Oops! New fact: execCode(webServer, apache). [Diagram labels: internet, Firewall 1, webServer, dmz.]
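The deduction on this slide can be reproduced with a small join over the three input relations. The following is a minimal Python sketch, not MulVAL's implementation: the set names and the function `derive_exec_code` are made up for illustration, and the derived fact follows the two-argument form of the rule head.

```python
# Input tuples copied from the slide; the Python names are illustrative only.
vul_exists = {("webServer", "httpd", "remote", "privilegeEscalation")}
service_running = {("webServer", "httpd", "tcp", 80, "apache")}
network_access = {("webServer", "tcp", 80)}


def derive_exec_code():
    """Apply the execCode rule once: join the three body predicates."""
    derived = set()
    for host, program, vul_range, consequence in vul_exists:
        if (vul_range, consequence) != ("remote", "privilegeEscalation"):
            continue
        for svc_host, svc_program, protocol, port, privilege in service_running:
            if (svc_host, svc_program) != (host, program):
                continue
            if (host, protocol, port) in network_access:
                derived.add((host, privilege))
    return derived


print(sorted(derive_exec_code()))
```

Removing any one of the three input tuples leaves the result empty, which mirrors the conjunction in the rule body.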

  16. Advantages of using Prolog • Prolog’s goal-oriented evaluation is potentially more efficient. • Prolog provides more programming flexibility. Can we evaluate Datalog programs in Prolog?

  17. However… • Prolog as a programming language cannot be directly used to evaluate Datalog ancestor(X,Y) :- parent(X,Y). ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y). parent(bill,mary). parent(mary,john). ?- ancestor(X,Y).

  18. However… • Prolog as a programming language cannot be directly used to evaluate Datalog ancestor(X,Y) :- parent(X,Y). ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z). parent(bill,mary). parent(mary,john). ?- ancestor(X,Y).

  19. However… • Prolog as a programming language cannot be directly used to evaluate Datalog ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z). ancestor(X,Y) :- parent(X,Y). parent(bill,mary). parent(mary,john). ?- ancestor(X,Y).

  20. Problem of SLD resolution ancestor(X,Y) :- parent(X,Y). ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y). parent(bill,mary). parent(mary,john). • SLD tree for ← ancestor(X,Y). • First clause: ← parent(X,Y). • X=bill, Y=mary: □ Success • X=mary, Y=john: □ Success • Second clause: ← parent(X,Z), ancestor(Z,Y). • X=bill, Z=mary: ← ancestor(mary,Y). • ← parent(mary,Y). Y=john: □ Success • ← parent(mary,Z2), ancestor(Z2,Y). Z2=john: ← ancestor(john,Y). … Failure • X=mary, Z=john: ← ancestor(john,Y). … Failure

  21. Problem of SLD resolution ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z). ancestor(X,Y) :- parent(X,Y). parent(bill,mary). parent(mary,john). • ← ancestor(X, Y). • ← ancestor(Z, Y), parent(X, Z). • ← ancestor(Z1, Y), parent(Z, Z1), parent(X, Z). • ← ancestor(Z2, Y), parent(Z1, Z2), parent(Z, Z1), parent(X, Z). • …

  22. Problem of SLD resolution • Under SLD resolution, termination of cyclic Datalog programs depends not only on the logical semantics but also on the order of the clauses and subgoals. • This creates problems because such cyclic rules are commonplace in network security analysis. • e.g. after compromising one machine, the attacker can use it as a stepping stone to compromise another. • Datalog is a declarative language; thus order should not matter. • A pure Datalog program shall always terminate due to the bound on the number of tuples.
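The order sensitivity can be imitated outside Prolog. The sketch below is a rough Python analogy, assuming generators stand in for depth-first, clause-by-clause SLD search with backtracking; it is not a Prolog engine, and the function names are invented for illustration. `None` plays the role of an unbound variable.

```python
PARENT = [("bill", "mary"), ("mary", "john")]


def parent(x, y):
    """Yield parent facts matching the pattern (None means unbound)."""
    for p, c in PARENT:
        if (x is None or x == p) and (y is None or y == c):
            yield (p, c)


def ancestor_right(x, y):
    # ancestor(X,Y) :- parent(X,Y).
    yield from parent(x, y)
    # ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
    for p, z in parent(x, None):
        for _, d in ancestor_right(z, y):
            yield (p, d)


def ancestor_left(x, y):
    # ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).  (recursive subgoal first)
    for z, d in ancestor_left(None, y):
        for p, _ in parent(x, z):
            yield (p, d)
    # ancestor(X,Y) :- parent(X,Y).
    yield from parent(x, y)


print(list(ancestor_right(None, None)))
try:
    next(ancestor_left(None, None))
except RecursionError:
    print("left-recursive ordering never reaches a fact")
```

Both functions encode the same logical program; only the ordering differs. The left-recursive version keeps expanding the same subgoal, as in the tree on slide 21, until Python's recursion limit stops it.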

  23. Bottom-up Evaluation ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z). ancestor(X,Y) :- parent(X,Y). parent(bill,mary). parent(mary,john). Semi-naïve Evaluation: • Step (1) (base case): ancestor(bill,mary), ancestor(mary,john) • Step (2) • Iteration 1: ancestor(bill,john) • Iteration 2: no new tuples (“fixpoint”)
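The two steps above can be traced with a short Python sketch of semi-naïve evaluation specialised to this one program. It assumes the recursive rule only needs to be joined against the tuples that were new in the previous iteration; the function name and return shape are illustrative, not part of any Datalog system.

```python
PARENT = {("bill", "mary"), ("mary", "john")}


def semi_naive_ancestor(parent):
    """Return all ancestor tuples plus the new tuples found per iteration."""
    total = set(parent)   # Step (1): ancestor(X,Y) :- parent(X,Y).
    delta = set(parent)   # tuples that were new in the previous round
    per_iteration = []
    while delta:
        # Step (2): ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z),
        # joining only the newly derived ancestor tuples with parent.
        derived = {(x, y) for (z, y) in delta for (x, child) in parent if child == z}
        delta = derived - total
        total |= delta
        per_iteration.append(delta)
    return total, per_iteration


total, per_iteration = semi_naive_ancestor(PARENT)
print(sorted(total))
print(per_iteration)
```

The loop stops when an iteration produces no new tuples, which is the fixpoint on the slide. Clause order plays no role here.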

  24. SLG Resolution • Goal-oriented evaluation • Predicates can be “tabled” • A table stores the evaluation results of a goal. • The results can be re-used later, i.e. dynamic programming. • Entering an active table indicates a cycle. • Fixpoint operation is taken at such tables. • The XSB system implements SLG resolution • Developed by Stony Brook (http://xsb.sourceforge.net/ ). • Provides full ISO Prolog compatibility.

  25. SLG resolution example ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z). ancestor(X,Y) :- parent(X,Y). parent(bill,mary). parent(mary,john). • ← ancestor(X, Y). (generator node: new table created for ancestor(X,Y)) • ← parent(X,Y). • X=bill, Y=mary: □ Success • X=mary, Y=john: □ Success • ← ancestor(Z, Y), parent(X, Z). (active node: resolve ancestor(Z,Y) against the results in the table for ancestor(X,Y)) • Z=bill, Y=mary: ← parent(X, bill). Failure • Z=mary, Y=john: ← parent(X, mary). X=bill: □ Success • Z=bill, Y=john: ← parent(X, bill). Failure
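The role of the table can be approximated in Python. This is a deliberately simplified sketch under stated assumptions: it tables the goal ancestor(X, y) for a given y, returns the answers found so far when an active table is re-entered, and re-runs the clauses until the table stops growing. Real SLG engines such as XSB suspend and resume consumers rather than re-iterating, and the function name here is hypothetical.

```python
PARENT = {("bill", "mary"), ("mary", "john")}


def ancestors_of(y, table, active):
    """Answers X for the goal ancestor(X, y), with one table per goal."""
    if y in active:
        # Entering an active table indicates a cycle: consume current answers.
        return set(table[y])
    if y in table:
        # Completed table: reuse the stored answers.
        return table[y]
    table[y] = set()
    active.add(y)
    while True:  # fixpoint operation at this table
        found = set()
        # ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
        for z in ancestors_of(y, table, active):
            found |= {x for (x, child) in PARENT if child == z}
        # ancestor(X,Y) :- parent(X,Y).
        found |= {x for (x, child) in PARENT if child == y}
        if found <= table[y]:
            break
        table[y] |= found
    active.discard(y)
    return table[y]


print(sorted(ancestors_of("john", {}, set())))
```

Only the table for the queried goal is built, which reflects the goal-oriented character of the evaluation, and the left-recursive clause order no longer prevents termination.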

  26. SLG in MulVAL netAccess(H2, Protocol, Port) :- execCode(H1, User), reachable(H1, H2, Protocol, Port). [Diagram: the table for the goal netAccess(…) and the table for the first subgoal execCode(…), each with its possible instantiations; reachable comes from input tuples.]

  27. SLG complexity for Datalog • Total time dominated by the rule that has the maximum number of instantiations • Time for computing one table = computation of the subgoals + retrieving information from input tuples + matching results in the rules’ bodies • Time for computing all tables = retrieving information from input tuples + matching results in the rules’ bodies • See “On the Complexity of Tabled Datalog Programs” http://www.cs.sunysb.edu/~warren/xsbbook/node21.html

  28. MulVAL complexity in SLG execCode(Attacker, Host, User) :- vulExists(Host, _, Program, remote, privilegeEscalation), networkService(Host, Program, Protocol, Port, User), netAccess(Attacker, Host, Protocol, Port). Scale with network size O(N) different instantiations

  29. MulVAL complexity in SLG netAccess(Attacker, H2, Protocol, Port) :- execCode(Attacker, H1, _), reachable(H1, H2, Protocol, Port). • Scale with network size • O(N²) different instantiations • Complexity of MulVAL: O(N²)
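The quadratic count can be checked on a synthetic worst case. The sketch below assumes a hypothetical network in which every host is compromised and every host can reach every other host on a single protocol and port; the host names, the port, and the function are invented for illustration.

```python
def count_net_access_instantiations(n_hosts):
    """Count (H1, H2) instantiations of the netAccess rule body."""
    hosts = [f"host{i}" for i in range(n_hosts)]
    # Worst-case assumption: the attacker has execCode on every host.
    exec_code = {("attacker", h, "user") for h in hosts}
    # Worst-case assumption: all ordered pairs of distinct hosts are reachable.
    reachable = {(h1, h2, "tcp", 80) for h1 in hosts for h2 in hosts if h1 != h2}
    count = 0
    for _, h1, _ in exec_code:
        for src, _, _, _ in reachable:
            if src == h1:
                count += 1
    return count


for n in (10, 20, 40):
    print(n, count_net_access_instantiations(n))
```

Under these assumptions the count is N(N-1), so doubling the number of hosts roughly quadruples the number of instantiations.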

  30. Datalog proof generation • In security analysis, we want to know not only what attacks could happen but also how they can happen. • Thus, we need more than a yes/no answer for queries. • We need the proofs for the true queries, which in the case of security analysis will be attack paths. • We also want to know all possible attack paths; thus we need exhaustive proof generation.

  31. An obvious approach • Original rule: execCode(Host, PrivilegeLevel) :- vulExists(Host, Program, remote, privilegeEscalation), serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel), networkAccess(Host, Protocol, Port). • Rule with a proof argument added: execCode(Host, PrivilegeLevel, Pf) :- vulExists(Host, Program, remote, privilegeEscalation, Pf1), serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel, Pf2), networkAccess(Host, Protocol, Port, Pf3), Pf=(execCode(Host, PrivilegeLevel), [Pf1, Pf2, Pf3]). • This will break the bounded-term property and result in non-termination for cyclic Datalog programs.
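Why proof arguments break the bound can be seen on a two-host cycle. The Python sketch below uses a made-up relation in which the attacker moves between hosts `a` and `b`; each derived tuple carries a nested proof term, in the spirit of the `Pf` argument above. The names and the round-based loop are assumptions for illustration only.

```python
EDGES = {("a", "b"), ("b", "a")}   # hypothetical cyclic reachability
START = "a"


def one_round(tuples):
    """Derive new (host, proof) tuples; each proof nests its sub-proof."""
    new = set(tuples)
    for host, proof in tuples:
        for src, dst in EDGES:
            if src == host:
                new.add((dst, (dst, proof)))
    return new


def run(rounds):
    tuples = {(START, (START, None))}
    for _ in range(rounds):
        tuples = one_round(tuples)
    return tuples


for rounds in (1, 5, 10):
    result = run(rounds)
    print(rounds, len(result), sorted({host for host, _ in result}))
```

The set of hosts stays at two, but every extra round adds a tuple with a deeper proof term, so no fixpoint is reached.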

  32. MulVAL Attack-Graph Toolkit (Ou, Boyer, and McQueen. ACM CCS 2006; joint work with Idaho National Laboratory) [Diagram: Datalog rules are turned into translated rules; security advisories, network configuration, and machine configuration are given a Datalog representation; the XSB reasoning engine produces Datalog proof steps; the graph builder produces the Datalog proof graph.]

  33. Stage 1: Record Proof Steps netAccess(H2, Protocol, Port, ProofStep) :- execCode(H1, User), reachable(H1, H2, Protocol, Port), ProofStep = because('multi-hop network access', netAccess(H2, Protocol, Port), [execCode(H1, User), reachable(H1, H2, Protocol, Port)]). (The because(…) term is the proof step.)

  34. Stage 2: Build the Exhaustive Proof because('multi-hop network access', netAccess(fileServer, rpc, 100003), [execCode(webServer, apache), reachable(webServer, fileServer, rpc, 100003)]) [Graph with nodes numbered 0 to 3: the rule node "multi-hop network access" connects the conclusion netAccess(fileServer, rpc, 100003) with the premises execCode(webServer, apache) and reachable(webServer, fileServer, rpc, 100003).]
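Turning recorded proof steps into graph components can be sketched as follows. This is a minimal Python illustration, not the toolkit's graph builder: proof steps are assumed to be (rule, conclusion, premises) triples with facts written as strings, and a dictionary provides the constant-time table lookup.

```python
def build_proof_graph(proof_steps):
    """Return (node_ids, edges) for a list of (rule, conclusion, premises)."""
    node_ids = {}
    edges = set()

    def node(label):
        # Dictionary lookup: reuse the node if this label was seen before.
        return node_ids.setdefault(label, len(node_ids))

    for rule, conclusion, premises in proof_steps:
        fact = node(("fact", conclusion))
        step = node(("rule", rule, conclusion, tuple(premises)))
        edges.add((fact, step))                 # conclusion depends on the step
        for premise in premises:
            edges.add((step, node(("fact", premise))))  # step needs each premise
    return node_ids, edges


steps = [(
    "multi-hop network access",
    "netAccess(fileServer, rpc, 100003)",
    ["execCode(webServer, apache)",
     "reachable(webServer, fileServer, rpc, 100003)"],
)]
node_ids, edges = build_proof_graph(steps)
print(len(node_ids), len(edges))
```

A fact that appears in several proof steps maps to a single node, so alternative derivations of the same fact end up as several rule nodes under one fact node.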

  35. Complexity of Proof Building • O(N²) to complete Datalog evaluation • With proof steps generated • O(N²) to build a proof graph from proof steps • Need to build O(N²) graph components • Building of one component • Find the predecessor: table lookup • Find the successors: table lookup • Total time: O(N²), if table lookup is constant time

  36. Logical Attack Graphs [Graph with nodes numbered 0 to 6. Fact labels: execCode(attacker,workStation,root); accessFile(attacker,workStation,write,/usr/local/share); accessFile(attacker,fileServer,write,/export); execCode(attacker,webServer,apache); netAccess(attacker,webServer,tcp,80); vulExists(webServer, CAN-2002-0392, httpd, remoteExploit, privEscalation); networkService(webServer,httpd,tcp,80,apache). Rule labels: Trojan horse installation; NFS semantics; NFS shell; Remote exploit. The legend distinguishes OR nodes, AND nodes, and ground facts.]

  37. Performance and Scalability

  38. Related Work • Sheyner’s attack graph tool (CMU) • Based on model-checking • Cauldron attack graph tool (GMU) • Based on graph-search algorithms • NetSPA attack graph tool (MIT LL) • Graph-search based on a simple attack model

  39. Advantages of the Logic-programming Approach • Publishing and incorporation of knowledge/information through well-understood logical semantics • Efficient and sound analysis by leveraging the reasoning power of well-developed logic-deduction systems

  40. Next Lecture • How to make use of the proof graph • Optimizing mitigation measures through SAT solving • Open problems • Uncertainty in reasoning
