Published by Merilyn Wade. Modified over 9 years ago.
1
Logic-based, data-driven enterprise network security analysis
Xinming (Simon) Ou, Assistant Professor, CIS Department, Kansas State University
COS 598D: Formal Methods in Networking, Princeton University
March 08, 2010
2
Self Introduction
Brief Bio
– PhD, Princeton University, 2005
– Post-doc, Purdue CERIAS, Idaho National Laboratory, 2006
– Assistant Professor, Kansas State University, 2006-now
Research Interests
– Computer and network security, especially formal and quantitative analysis
– Programming languages, formal methods
Research Group
– Argus: http://people.cis.ksu.edu/~xou/argus/
3
Overview of the two lectures
Lecture One
– Datalog model for network attacks
– SLG resolution for Datalog evaluation
– Exhaustive proof generation for Datalog
Lecture Two
– Formulating security hardening problem as a SAT solving problem
– Applying MinCostSAT to achieve optimal security configuration
– Open research problems
4
Cyber Defender’s Life
[Figure labels: security advisories ("Apache 1.3.4 bug!"); vulnerability reports; network configuration; IDS alerts; users and data assets; reasoning system; automated situation awareness]
5
Multi-step Attacks
[Figure labels: Internet; Firewall 1; demilitarized zone (DMZ); webServer; Firewall 2; Corporation; workStation; fileServer; webPages; sharedBinary; attack steps: buffer overrun, NFS shell, Trojan horse]
6
Two Questions
Are there potential attack paths in the system?
– How can they happen?
– How can they be addressed in an optimal way?
Are there attacks that are going on or have succeeded in the system?
– How do you know?
– How to counter the attack?
[Callout: what we are going to focus on]
7
MulVAL
Ou, Govindavajhala, and Appel. Usenix Security 2005
[Figure labels: Datalog rules from security experts; vulnerability information (e.g. NIST NVD); vulnerability definition (e.g. OVAL, Nessus Scripting Language); vulnerability scanner (shown twice); network analyzer; network reachability information; user information; analyzer; query "Could root be compromised on any of the machines?"; answers]
8
Network config (firewall analyzer)
Host access-control lists
  reachable(internet, webServer, tcp, 80).
  reachable(webServer, fileServer, nfs, -).
9
Host config scanner
File permissions
  fileOwner(webServer, '/bin/apache', root).
  fileAttr(webServer, '/bin/apache', r,w,x,r,0,0,r,0,0).
10
Host-based vulnerability scanner
Installed software
  vulExists(webServer, 'CVE-2006-3747', httpd).
  vulExists(dbServer, 'CVE-2009-2446', mySQL).
  …
11
US-CERT, NVD
Security advisories ("Apache 1.3.4 bug!")
  vulProperty('CVE-2006-3747', remote, privEscalation).
  vulProperty('CVE-2009-2446', remote, privEscalation).
  …
12
Security expert
Datalog Rules
  execCode(Host, PrivilegeLevel) :-
      vulExists(Host, Program, remote, privilegeEscalation),
      serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel),
      networkAccess(Host, Protocol, Port).
Linux security behavior; Windows security behavior; common attack techniques
The rules are completely independent of any site-specific settings.
13
Rule for NFS
[Figure labels: dmz; corp; webServer; webPages; fileServer; sharedBinary; NFS shell]
  accessFile(Server, Access, Path) :-
      nfsExport(Server, Path, Access, Client),
      reachable(Client, Server, nfs, -),
      execCode(Client, _Perm).
14
Rule for Trojan Horse
[Figure labels: corp; workStation; webPages; fileServer; Trojan horse; projectPlan; sharedBinary]
  execCode(H, User) :-
      accessFile(H, write, Path),
      fileOwner(H, Path, User).
15
Deducing new facts
[Figure labels: internet; Firewall 1; dmz; webServer]
  execCode(Host, PrivilegeLevel) :-
      vulExists(Host, Program, remote, privilegeEscalation),
      serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel),
      networkAccess(Host, Protocol, Port).
Facts (slide labels: from vulnerability scanner & NVD; from vulnerability scanner; derived):
  vulExists(webServer, httpd, remote, privilegeEscalation).
  serviceRunning(webServer, httpd, tcp, 80, apache).
  networkAccess(webServer, tcp, 80).
Derived by the rule, with Host = webServer and PrivilegeLevel = apache:
  execCode(webServer, apache).   Oops!
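The deduction on this slide amounts to a join over the three body predicates. The following is a minimal Python sketch of that single rule application; the tuple encoding and the function name `derive_exec_code` are illustrative assumptions and are not how MulVAL or XSB represent facts.

```python
# Facts from the slide, encoded as plain tuples (an assumed, illustrative encoding).
vul_exists = {("webServer", "httpd", "remote", "privilegeEscalation")}
service_running = {("webServer", "httpd", "tcp", 80, "apache")}
network_access = {("webServer", "tcp", 80)}


def derive_exec_code(vul_exists, service_running, network_access):
    """Apply the execCode rule once by joining its three body predicates."""
    derived = set()
    for host, program, access, effect in vul_exists:
        # The rule only fires for remotely exploitable privilege escalation.
        if (access, effect) != ("remote", "privilegeEscalation"):
            continue
        for s_host, s_program, protocol, port, privilege in service_running:
            # Shared variables Host and Program must agree across subgoals.
            if (s_host, s_program) != (host, program):
                continue
            if (host, protocol, port) in network_access:
                derived.add((host, privilege))
    return derived


exec_code = derive_exec_code(vul_exists, service_running, network_access)
print(exec_code)
```

Removing any one of the three input facts leaves the derived set empty, which is the sense in which the conclusion depends on all three subgoals.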
16
Advantages of using Prolog
Prolog’s goal-oriented evaluation is potentially more efficient.
Prolog provides more programming flexibility.
Can we evaluate Datalog programs in Prolog?
17
However…
Prolog as a programming language cannot be directly used to evaluate Datalog
  ancestor(X,Y) :- parent(X,Y).
  ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
  parent(bill,mary).
  parent(mary,john).
  ?- ancestor(X,Y).
(With this clause and subgoal order, Prolog's depth-first search terminates on this query.)
18
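However…
Prolog as a programming language cannot be directly used to evaluate Datalog
  ancestor(X,Y) :- parent(X,Y).
  ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
  parent(bill,mary).
  parent(mary,john).
  ?- ancestor(X,Y).
(With the recursive subgoal placed first in the second clause, Prolog returns the answers and then loops on backtracking.)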
19
However…
Prolog as a programming language cannot be directly used to evaluate Datalog
  ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
  ancestor(X,Y) :- parent(X,Y).
  parent(bill,mary).
  parent(mary,john).
  ?- ancestor(X,Y).
(With the left-recursive clause first, Prolog loops before returning any answer.)
20
Problem of SLD resolution
  ancestor(X,Y) :- parent(X,Y).
  ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
  parent(bill,mary).
  parent(mary,john).
[Figure: SLD resolution tree, shown here as an indented outline]
ancestor(X, Y).
  parent(X,Y).
    X=bill Y=mary: Success
    X=mary Y=john: Success
  parent(X,Z), ancestor(Z,Y).
    X=bill Z=mary: ancestor(mary,Y).
      parent(mary,Y).
        Success
      parent(mary,Z2), ancestor(Z2,Y).
        Z2=john: ancestor(john,Y).
          … Failure
    X=mary Z=john: ancestor(john,Y).
      … Failure
21
Problem of SLD resolution
  ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
  ancestor(X,Y) :- parent(X,Y).
  parent(bill,mary).
  parent(mary,john).
Goal sequence under SLD resolution:
  ancestor(X, Y).
  ancestor(Z, Y), parent(X, Z).
  ancestor(Z1, Y), parent(Z, Z1), parent(X, Z).
  ancestor(Z2, Y), parent(Z1, Z2), parent(Z, Z1), parent(X, Z).
  …
22
Problem of SLD resolution
Termination of cyclic Datalog programs depends not only on the logical semantics but also on the order of the clauses and subgoals.
– This creates problems because such cyclic rules are commonplace in network security analysis; e.g. after compromising one machine, the attacker can use it as a stepping stone to compromise another.
– Datalog is a declarative language; thus order should not matter.
– A pure Datalog program shall always terminate due to the bound on the number of tuples.
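The order dependence can be imitated with Python generators that try clauses and subgoals depth-first and left to right, as SLD resolution does. This is only a sketch of the behaviour, not a Prolog interpreter: the function names, the `DepthExceeded` exception and the depth limit of 100 are assumptions made so that the non-terminating variant stops deterministically.

```python
PARENT = [("bill", "mary"), ("mary", "john")]


class DepthExceeded(Exception):
    """Raised when the simulated goal stack grows past the chosen limit."""


def ancestor_right(x=None):
    """ancestor(X,Y) :- parent(X,Y).  ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y)."""
    for p, c in PARENT:                      # clause 1: parent(X,Y)
        if x is None or p == x:
            yield p, c
    for p, z in PARENT:                      # clause 2: parent(X,Z) runs first...
        if x is None or p == x:
            for _, y in ancestor_right(z):   # ...so ancestor(Z,Y) is called with Z bound
                yield p, y


def ancestor_left(x=None, depth=0, limit=100):
    """ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).  ancestor(X,Y) :- parent(X,Y)."""
    if depth > limit:
        raise DepthExceeded("goal stack keeps growing")
    # Clause 1 calls ancestor(Z,Y) with Z unbound, i.e. the same goal again.
    for z, y in ancestor_left(None, depth + 1, limit):
        for p, c in PARENT:
            if c == z and (x is None or p == x):
                yield p, y
    for p, c in PARENT:                      # clause 2 is never reached
        if x is None or p == x:
            yield p, c


right_answers = list(ancestor_right())
try:
    list(ancestor_left())
    left_terminated = True
except DepthExceeded:
    left_terminated = False
print(right_answers, left_terminated)
```

Both functions encode the same logical program; only the clause and subgoal order differs, yet one returns three answers and the other never produces any.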
23
Bottom-up Evaluation
  ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
  ancestor(X,Y) :- parent(X,Y).
  parent(bill,mary).
  parent(mary,john).
Semi-naïve Evaluation:
Step (1) (base case): ancestor(bill,mary), ancestor(mary,john)
Step (2)
– Iteration 1: ancestor(bill,john)
– Iteration 2: no new tuples (“fixpoint”)
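The two steps above can be sketched in Python as follows. This is a minimal illustration for this one program, assuming facts are stored as tuples; the function name `semi_naive_ancestor` is hypothetical.

```python
def semi_naive_ancestor(parent):
    """Semi-naive bottom-up evaluation of the ancestor program."""
    # Step 1 (base case): ancestor(X,Y) :- parent(X,Y).
    ancestor = set(parent)
    delta = set(parent)          # tuples that were new in the previous round
    rounds = []
    # Step 2: apply the recursive rule, joining only against the new tuples.
    while delta:
        # ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
        new = {(x, y) for (z, y) in delta for (x, z2) in parent if z2 == z}
        new -= ancestor          # keep only tuples not derived before
        rounds.append(new)
        ancestor |= new
        delta = new
    return ancestor, rounds


parent = {("bill", "mary"), ("mary", "john")}
ancestor, rounds = semi_naive_ancestor(parent)
print(rounds)
```

The loop always stops because each round must add at least one new tuple and the number of possible tuples is bounded, regardless of how the rule's subgoals are ordered.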
24
SLG Resolution
Goal-oriented evaluation
Predicates can be “tabled”
– A table stores the evaluation results of a goal.
– The results can be re-used later, i.e. dynamic programming.
– Entering an active table indicates a cycle.
– Fixpoint operation is taken at such tables.
The XSB system implements SLG resolution
– Developed by Stony Brook (http://xsb.sourceforge.net/).
– Provides full ISO Prolog compatibility.
25
SLG resolution example
  ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
  ancestor(X,Y) :- parent(X,Y).
  parent(bill,mary).
  parent(mary,john).
[Figure: SLG evaluation, shown here as an indented outline]
ancestor(X, Y).   (generator node: new table created for ancestor(X,Y))
  ancestor(Z, Y), parent(X, Z).   (active node: resolve ancestor(Z,Y) against the results in the table for ancestor(X,Y))
    Z=bill Y=mary: parent(X, bill). Failure
    Z=mary Y=john: parent(X, mary). X=bill: Success
    Z=bill Y=john: parent(X, bill). Failure
  parent(X,Y).
    X=bill Y=mary: Success
    X=mary Y=john: Success
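The trace in this example can be imitated with a single answer table that the recursive clause consumes while it is still being filled. The sketch below is a heavy simplification for one tabled goal and is not XSB's SLG algorithm; the function name `tabled_ancestor` and the list-based table are assumptions.

```python
def tabled_ancestor(parent):
    """Evaluate the left-recursive ancestor program with one answer table."""
    table = []      # answers for the goal ancestor(X, Y), in discovery order
    seen = set()

    def add_answer(answer):
        if answer not in seen:
            seen.add(answer)
            table.append(answer)

    # Non-recursive clause: ancestor(X,Y) :- parent(X,Y).
    for fact in parent:
        add_answer(fact)

    # Recursive clause: ancestor(X,Y) :- ancestor(Z,Y), parent(X,Z).
    # The active node reads answers from the table instead of re-calling the
    # goal, and keeps going until no unread answer remains (the fixpoint).
    index = 0
    while index < len(table):
        z, y = table[index]
        for x, z2 in parent:
            if z2 == z:
                add_answer((x, y))
        index += 1
    return table


answers = tabled_ancestor([("bill", "mary"), ("mary", "john")])
print(answers)
```

The table is read in the same order as the slide's active node: (bill, mary) fails, (mary, john) yields the new answer (bill, john), and that answer in turn fails, after which nothing is left to consume.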
26
SLG in MulVAL
  netAccess(H2, Protocol, Port) :-
      execCode(H1, User),
      reachable(H1, H2, Protocol, Port).
[Figure labels: netAccess(…): table for goal, possible instantiations; execCode(…): table for first subgoal, possible instantiations; reachable: from input tuples]
27
SLG complexity for Datalog
Total time is dominated by the rule that has the maximum number of instantiations
– Time for computing one table = computation of the subgoals + retrieving information from input tuples + matching results in the rules’ bodies
– Time for computing all tables = retrieving information from input tuples + matching results in the rules’ bodies
See “On the Complexity of Tabled Datalog Programs”: http://www.cs.sunysb.edu/~warren/xsbbook/node21.html
28
MulVAL complexity in SLG
  execCode(Attacker, Host, User) :-
      vulExists(Host, _, Program, remote, privilegeEscalation),
      networkService(Host, Program, Protocol, Port, User),
      netAccess(Attacker, Host, Protocol, Port).
Scale with network size: O(N) different instantiations
29
MulVAL complexity in SLG
  netAccess(Attacker, H2, Protocol, Port) :-
      execCode(Attacker, H1, _),
      reachable(H1, H2, Protocol, Port).
Scale with network size: O(N^2) different instantiations (complexity of MulVAL)
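The two instantiation counts can be checked by brute force on a synthetic network. The sketch below assumes a worst case in which every one of the N hosts runs one vulnerable service, is already compromised, and can reach every host on one port; the host names, the placeholder `vulnId` and the function name are hypothetical.

```python
def count_instantiations(n_hosts):
    """Count satisfying body instantiations of the two MulVAL rules."""
    hosts = [f"h{i}" for i in range(n_hosts)]
    # Assumed worst case: one vulnerable service per host, full reachability.
    vul_exists = {(h, "vulnId", "httpd", "remote", "privilegeEscalation") for h in hosts}
    network_service = {(h, "httpd", "tcp", 80, "apache") for h in hosts}
    net_access = {("attacker", h, "tcp", 80) for h in hosts}
    exec_code = {("attacker", h, "apache") for h in hosts}
    reachable = {(h1, h2, "tcp", 80) for h1 in hosts for h2 in hosts}

    # execCode rule: Host is the only variable that ranges over the network.
    exec_rule = sum(
        1
        for (vh, _, vprog, _, _) in vul_exists
        for (sh, sprog, proto, port, _) in network_service
        if (sh, sprog) == (vh, vprog)
        for (_, ah, aproto, aport) in net_access
        if (ah, aproto, aport) == (vh, proto, port)
    )
    # netAccess rule: both H1 and H2 range over the network.
    net_rule = sum(
        1
        for (_, h1, _) in exec_code
        for (r1, _, _, _) in reachable
        if r1 == h1
    )
    return exec_rule, net_rule


print(count_instantiations(10))
print(count_instantiations(20))
```

Doubling the number of hosts doubles the first count and quadruples the second, which is why the rule with two host variables dominates the evaluation time.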
30
Datalog proof generation
In security analysis, we want to know not only what attacks could happen but also how they can happen.
– Thus, we need more than a yes/no answer for queries.
– We need the proofs for the true queries, which in the case of security analysis will be attack paths.
– We also want to know all possible attack paths; thus we need exhaustive proof generation.
31
An obvious approach
Original rule:
  execCode(Host, PrivilegeLevel) :-
      vulExists(Host, Program, remote, privilegeEscalation),
      serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel),
      networkAccess(Host, Protocol, Port).
Rule with an added proof argument:
  execCode(Host, PrivilegeLevel, Pf) :-
      vulExists(Host, Program, remote, privilegeEscalation, Pf1),
      serviceRunning(Host, Program, Protocol, Port, PrivilegeLevel, Pf2),
      networkAccess(Host, Protocol, Port, Pf3),
      Pf = (execCode(Host, PrivilegeLevel), [Pf1, Pf2, Pf3]).
This will break the bounded-term property and result in non-termination for cyclic Datalog programs.
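Why proof arguments break termination can be seen on a two-host cycle: the set of derivable facts stays fixed, but the set of distinct proof terms grows with every extra trip around the cycle. The sketch below uses a hypothetical reachability program (`START`, `EDGES` and `proofs` are illustrative names, not MulVAL predicates) and a depth bound so that the enumeration stops.

```python
START = "a"
EDGES = [("a", "b"), ("b", "a")]   # a cycle: each host can reach the other


def proofs(node, max_depth):
    """Enumerate proof terms for 'node is reachable' using at most max_depth hops."""
    result = []
    if node == START:
        result.append(("start", node))                 # base fact
    if max_depth > 0:
        for src, dst in EDGES:
            if dst == node:
                # Every proof of the predecessor extends to a new, larger proof.
                for sub in proofs(src, max_depth - 1):
                    result.append(("hop", node, [sub, ("edge", src, dst)]))
    return result


counts = [len(proofs("a", depth)) for depth in (0, 2, 4, 6)]
print(counts)
```

There is only one fact about host a, yet each increase of the bound admits one more proof term for it, so a table keyed on the proof argument never reaches a fixpoint.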
32
MulVAL Attack-Graph Toolkit
Ou, Boyer, and McQueen. ACM CCS 2006. Joint work with Idaho National Laboratory
[Figure labels: machine configuration; network configuration; security advisories; Datalog representation; Datalog rules; translated rules; XSB reasoning engine; Datalog proof steps; graph builder; Datalog proof graph]
33
Stage 1: Record Proof Steps
  netAccess(H2, Protocol, Port, ProofStep) :-
      execCode(H1, User),
      reachable(H1, H2, Protocol, Port),
      ProofStep = because('multi-hop network access',
                          netAccess(H2, Protocol, Port),
                          [execCode(H1, User), reachable(H1, H2, Protocol, Port)]).
[Callout: proof step]
34
Stage 2: Build the Exhaustive Proof
  because('multi-hop network access',
          netAccess(fileServer, rpc, 100003),
          [execCode(webServer, apache), reachable(webServer, fileServer, rpc, 100003)])
[Figure: proof graph with nodes 0: netAccess(fileServer, rpc, 100003); 1: multi-hop network access; 2: execCode(webServer, apache); 3: reachable(webServer, fileServer, rpc, 100003)]
35
Complexity of Proof Building
O(N^2) to complete Datalog evaluation
– With proof steps generated
O(N^2) to build a proof graph from proof steps
– Need to build O(N^2) graph components
– Building of one component
  Find the predecessor: table lookup
  Find the successors: table lookup
Total time: O(N^2), if table lookup is constant time
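Building the graph from recorded proof steps can be sketched with a dictionary standing in for the table lookup. This is an assumed, simplified encoding: each proof step is a `(rule name, head, body)` tuple of strings, and `build_proof_graph` is a hypothetical name rather than the toolkit's graph builder.

```python
def build_proof_graph(proof_steps):
    """Turn because(Rule, Head, Body) records into numbered nodes and edges."""
    node_ids = {}    # (kind, label) -> id; a dict lookup stands in for the table lookup
    edges = set()

    def node_id(kind, label):
        key = (kind, label)
        if key not in node_ids:
            node_ids[key] = len(node_ids)
        return node_ids[key]

    for rule_name, head, body in proof_steps:
        head_id = node_id("fact", head)
        # One rule node per recorded proof step.
        step_id = node_id("rule", (rule_name, head, tuple(body)))
        edges.add((head_id, step_id))
        for subgoal in body:
            edges.add((step_id, node_id("fact", subgoal)))
    return node_ids, edges


steps = [(
    "multi-hop network access",
    "netAccess(fileServer, rpc, 100003)",
    ["execCode(webServer, apache)", "reachable(webServer, fileServer, rpc, 100003)"],
)]
node_ids, edges = build_proof_graph(steps)
print(sorted(edges))
```

Each proof step is visited once and every node lookup is an expected constant-time dictionary access, so the work is proportional to the number of proof steps, in line with the O(N^2) bound stated above.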
36
Logical Attack Graphs
[Figure: logical attack graph with nodes numbered 0 to 6. Legend: OR node; AND node; ground fact]
Node labels:
– execCode(attacker,workStation,root)
– Trojan horse installation
– accessFile(attacker,workStation,write,/usr/local/share)
– NFS semantics
– accessFile(attacker,fileServer,write,/export)
– NFS shell
– execCode(attacker,webServer,apache)
– Remote exploit
– netAccess(attacker,webServer,tcp,80)
– networkService(webServer,httpd,tcp,80,apache)
– vulExists(webServer, CAN-2002-0392, httpd, remoteExploit, privEscalation)
37
Performance and Scalability
38
Related Work
Sheyner’s attack graph tool (CMU)
– Based on model-checking
Cauldron attack graph tool (GMU)
– Based on graph-search algorithms
NetSPA attack graph tool (MIT LL)
– Graph-search based on a simple attack model
39
Advantages of the Logic-programming Approach
Publishing and incorporation of knowledge/information through well-understood logical semantics
Efficient and sound analysis by leveraging the reasoning power of well-developed logic-deduction systems
40
Next Lecture
How to make use of the proof graph
– Optimizing mitigation measures through SAT solving
Open problems
– Uncertainty in reasoning