Towards High Performance Network Defense Zhichun Li EECS Department Northwestern University.

1 Towards High Performance Network Defense Zhichun Li EECS Department Northwestern University

2 Motivation
Professional attackers exploit networks for profit ($$$)
[Figure: attackers, botnets, worms]

3 Network Level Defense
Network gateways/routers are the vantage points for detecting large-scale attacks
Host-based detection/prevention alone is not enough
–Some users do not apply host-based schemes because of reliability, overhead, and conflicts
–Many users do not update or patch their systems on time
–E.g., the Conficker worm at the end of 2008 infected 9 to 15 million hosts
–We cannot rely only on end users for security protection

4 Challenges
Scalable to high-speed networks with a large number of users
Highly accurate
Adapts fast to emerging threats
Good attack coverage

5 Network-based Intrusion Detection, Prevention, and Forensics System
Framework (input: packet streams)
(I) Sketch based monitoring & detection
(II) Polymorphic worm signature generation
(III) Signature matching engines
(IV) Network situational awareness
[Figure: components annotated with their goals: scalability; accuracy & adapt fast; accuracy & scalability & coverage]

6 High-speed Network Monitoring and Anomaly Detection
Online traffic monitoring and recording [SIGCOMM IMC 2004, INFOCOM 2006, ToN 2007] [INFOCOM 2008]
–Reversible sketch for data streaming computation
–Record millions of flows (GB of traffic) in a few hundred KB
–Small number of memory accesses per packet
–Scalable to large key space sizes (2^32 or 2^64)
Online sketch-based flow-level anomaly detection [IEEE ICDCS 2006] [Journal of Computer Networks 2010] [IEEE CG&A, Security Visualization 2006]
Online stealthy botnet scan detection [IEEE IWQoS 2007]
[Figure: a sketch made of H hash tables (rows 1, …, j, …, H), each with K buckets (0, 1, …, K-1); a key k is hashed to one bucket per row by h_1(k), …, h_j(k), …, h_H(k)]
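The figure on this slide (H hash tables of K buckets, one hash function per row) can be illustrated with a small count-min style structure. This is only a sketch under stated assumptions: it shows the fixed-memory update and estimate steps, not the reversible sketch of the cited papers (recovering the heavy keys is not shown), and the class name `Sketch`, its parameters, and the choice of BLAKE2 as the hash are all hypothetical.

```python
import hashlib


class Sketch:
    """H rows of K counters; each key touches exactly one counter per row."""

    def __init__(self, num_rows=4, num_buckets=1024):
        self.num_rows = num_rows          # H in the figure (kept below 256 here)
        self.num_buckets = num_buckets    # K in the figure
        self.rows = [[0] * num_buckets for _ in range(num_rows)]

    def _bucket(self, row, key):
        # Salting with the row index gives H different hash functions h_1..h_H.
        digest = hashlib.blake2b(key, digest_size=8, salt=bytes([row])).digest()
        return int.from_bytes(digest, "big") % self.num_buckets

    def update(self, key, value=1):
        # One memory access per row, independent of how many flows were seen.
        for row in range(self.num_rows):
            self.rows[row][self._bucket(row, key)] += value

    def estimate(self, key):
        # With non-negative updates, the minimum over rows never underestimates.
        return min(self.rows[row][self._bucket(row, key)]
                   for row in range(self.num_rows))
```

Memory stays at H × K counters no matter how many distinct keys are recorded, which is the property the slide relies on when it records millions of flows in a few hundred KB.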

7 Network and Distributed System Diagnosis
Overlay network monitoring and diagnosis [SIGCOMM IMC 2003, SIGCOMM 2004, ToN 2007] [SIGCOMM 2006]
End-user network diagnosis [INFOCOM 2007 (2)]
Internet-scale Virtual Private Network (VPN) and backbone monitoring and diagnosis [INFOCOM 2009]
Internet-scale data center and distributed system profiling and diagnosis [NSDI 2010]

8 Polymorphic Worm Signature Generation
Exploit invariant signature generation [IEEE Symposium on Security and Privacy 2006] (cited by ~100; code and test cases released to Columbia U., UT Austin, Purdue, Georgia Tech, UC Davis, etc.)
Vulnerability signature generation [IEEE ICNP 2007, ToN 2010] [NSF CyberTrust 06 Award]
[Figure: bit streams (1010101, 10111101, 11111100, 00010111) from the Internet passing through the network gateway into our network]

9 Online Protocol Parsing and Signature Matching
NetShield vulnerability signature based NIDS/NIPS [NSF CyberTrust 08 Award] [under submission] [patent filed]
–Interest from Cisco (IPS ruleset & site visit)
–Code release has been used by researchers at the University of Toronto
Using failure information to detect enterprise zombies [SecureCom09]
Spamming botnet detection [NSDI09]

10 Network Situational Awareness
Large-scale botnet and P2P misconfiguration event situational-aware forensics
–Botnet attack target/strategy inference [ASIACCS09]
–Root cause analysis of the P2P misconfiguration/poisoning traffic [INFOCOM10]
Analysis of 2TB of data across 4 years over five /8 IP address blocks

11 Current Work
Data center management and configuration
Internet emergency response
–AS topology study [CoNEXT09]
–Recovery via IXP [Infocom10]
Network based web dynamic vulnerability defense
Social network security

12 12 NetShield: Matching a Large Vulnerability Signature Ruleset for High Performance Network Defense

13 13 Outline Motivation High Speed Matching for Large Rulesets High Speed Parsing Evaluation Research Contributions

14 NetShield Overview
NIDS/NIPS (Network Intrusion Detection/Prevention System) operation
[Figure: packets and a signature DB feed the NIDS/NIPS, which outputs security alerts]
Goals: accuracy, speed, attack coverage

15 State Of The Art
Regular expression (regex) based approaches
Used by: Cisco IPS, Juniper IPS, open source Bro
Example: .*Abc.*\x90+de[^\r\n]{30}
Pros
–Can efficiently match multiple sigs simultaneously, through DFA
–Can describe the syntactic context
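To make the example signature on this slide concrete, the snippet below applies it to raw payload bytes with Python's `re` module. This is a minimal sketch: Python's engine is a backtracking matcher, not the DFA that the slide credits for matching many signatures at once, and the function name `matches` is a hypothetical helper.

```python
import re

# The slide's example signature, compiled over bytes. DOTALL lets ".*" span
# line breaks, since a payload is a byte stream rather than lines of text.
SIGNATURE = re.compile(rb".*Abc.*\x90+de[^\r\n]{30}", re.DOTALL)


def matches(payload: bytes) -> bool:
    """Return True if the payload satisfies the regex signature."""
    return SIGNATURE.match(payload) is not None
```

For instance, `matches(b"GET /Abc" + b"\x90" * 4 + b"de" + b"A" * 30)` is true, while the same payload with only 29 trailing bytes is not. The signature only sees byte patterns; it has no notion of which protocol field those bytes belong to.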

16 Cons of Regex
Limited expressive power: cannot describe semantic context, thus inaccurate
Theoretical perspective
[Figure: regex, context free, and context sensitive language classes, with the protocol grammar marked]
Practical perspective
–HTTP chunk encoding
–DNS label pointers

17 State Of The Art
Vulnerability Signature [Wang et al. 04]
Vulnerability: design flaws that let bad inputs lead the program to a bad state
[Figure: good state, bad state, bad input, vulnerability signature]
Pros
–Directly describe semantic context
–Very expressive, can express the vulnerability condition exactly
–Accurate
Cons
–Slow! Existing approaches all use sequential matching
–Require protocol parsing
Blaster Worm (WINRPC) Example:
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && opnum==0x00 && stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)

18 Motivation of NetShield

19 Motivation
Desired Features for Signature-based NIDS/NIPS
–Accuracy (especially for IPS)
–Speed
–Coverage: Large ruleset
Comparison (Regular Expression | Vulnerability)
Accuracy | Relatively Poor | Much Better
Speed | Good | ??
Memory | OK | ??
Coverage | Good | ??
Regular expression: cannot capture the vulnerability condition well!
Vulnerability: Shield [sigcomm’04]; the ?? entries are the focus of this work

20 Research Challenges and Solutions
Challenges
–Matching thousands of vulnerability signatures simultaneously: sequential matching → match multiple sigs. simultaneously
–High speed protocol parsing
Solutions
–An efficient algorithm which matches multiple sigs simultaneously
–A tailored parsing design for high-speed signature matching

21 Background
Vulnerability signature basics
–Use protocol semantics to express vulnerabilities
–Defined on a sequence of PDUs, with one predicate for each PDU
–Example: ver==1 && method==“put” && len(buf)>300
Data representations
–For all the vulnerability signatures we studied, we only need numbers and strings
–Number operators: ==, >, <, >=, <=
–String operators: ==, match_re(.,.), len(.)
Blaster Worm (WINRPC) Example:
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && opnum==0x00 && stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)
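The definition above (a sequence of PDUs with one predicate per PDU) can be sketched as follows. This is an illustration only: the signature is a simplified version of the Blaster example (the `packed_drep` and UUID checks are left out), the flattened field names such as `actual_length` and `stub_buffer` are hypothetical stand-ins for parsed WINRPC fields, and `sequential_match` shows the one-signature-at-a-time baseline that the talk sets out to improve on.

```python
import re

# One (PDU type, predicate) pair per step of the signature.
BLASTER_LIKE = [
    ("BIND", lambda f: f["rpc_vers"] == 5 and f["rpc_vers_minor"] == 1),
    ("BIND-ACK", lambda f: f["rpc_vers"] == 5 and f["rpc_vers_minor"] == 1),
    ("CALL", lambda f: f["rpc_vers"] == 5
        and f["rpc_vers_minor"] == 1
        and f["opnum"] == 0
        and f["actual_length"] >= 40
        and re.match(rb"^\x5c\x00\x5c\x00", f["stub_buffer"]) is not None),
]


def matches_signature(signature, pdus):
    """True if the PDU stream satisfies every step of the signature in order."""
    step = 0
    for pdu_type, fields in pdus:
        expected_type, predicate = signature[step]
        if pdu_type == expected_type and predicate(fields):
            step += 1
            if step == len(signature):
                return True
    return False


def sequential_match(ruleset, pdus):
    """Baseline: test each signature in turn, so cost grows with ruleset size."""
    return [name for name, sig in ruleset.items()
            if matches_signature(sig, pdus)]
```

Because each predicate reads parsed fields rather than raw bytes, the check depends on protocol parsing having already happened, which is the second cost the talk addresses.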

22 22 Outline Motivation High Speed Matching for Large Rulesets High Speed Parsing Evaluation Research Contributions

23 Matching Problem Formulation
Suppose we have n signatures, defined on k matching dimensions (matchers)
–A matcher is a two-tuple (field, operation), or a four-tuple for associative array elements
–Translate the n signatures into an n by k table
–This translation unlocks the potential of matching multiple signatures simultaneously
Example, Rule 4: URI.Filename==“fp40reg.dll” && len(Headers[“host”])>300
RuleID | Method == | Filename == | Header == LEN
1 | DELETE | * | *
2 | POST | Header.php | *
3 | * | awstats.pl | *
4 | * | fp40reg.dll | name==“host”; len(value)>300
5 | * | * | name==“User-Agent”; len(value)>544
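The translation into an n by k table can be sketched as below. The data layout is an assumption made for illustration: each signature is a dictionary from a matcher (field, operation) to its operand, the header matcher is collapsed into a single two-tuple whose operand is (name, length threshold) instead of the four-tuple the slide mentions, and `build_table` is a hypothetical helper.

```python
# The five example rules, keyed by rule ID; missing matchers are "don't care".
SIGNATURES = {
    1: {("Method", "=="): "DELETE"},
    2: {("Method", "=="): "POST", ("Filename", "=="): "Header.php"},
    3: {("Filename", "=="): "awstats.pl"},
    4: {("Filename", "=="): "fp40reg.dll", ("Header", "len >"): ("host", 300)},
    5: {("Header", "len >"): ("User-Agent", 544)},
}


def build_table(signatures):
    """Return (columns, rows): one column per matcher, '*' for don't care."""
    columns = []
    for conditions in signatures.values():
        for matcher in conditions:
            if matcher not in columns:
                columns.append(matcher)
    rows = {rule_id: [conditions.get(matcher, "*") for matcher in columns]
            for rule_id, conditions in signatures.items()}
    return columns, rows
```

Calling `build_table(SIGNATURES)` yields three columns (Method, Filename, Header) and, for rule 4, the row `["*", "fp40reg.dll", ("host", 300)]`, mirroring the table on the slide.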

24 Matching Problem Formulation
Challenges for the Single PDU Matching problem (SPM)
–Large number of signatures n
–Large number of matchers k
–Large number of “don’t cares”
–Cannot reorder matchers arbitrarily (buffering constraint)
–Field dependency: arrays, associative arrays, mutually exclusive fields

25 Difficulty of the SPM
Bad News
–A well-known computational geometry problem can be reduced to this problem
–That problem has a bad worst-case bound: O((log N)^(K-1)) time or O(N^K) space (worst-case ruleset)
Good News
–Measurement study on the Snort and Cisco rulesets
–The real-world rulesets are good: the matchers are selective
–With our design: O(K)

26 Matching Algorithms
Candidate Selection Algorithm
1. Pre-computation: decide the rule order and matcher order
2. Decomposition: match each matcher separately and iteratively combine the results efficiently
–Integer range checking → balanced binary search tree
–String exact matching → trie
–Regex → DFA (XFA)
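The decomposition step gives each matcher its own index that maps one observed field value to the set of rules it satisfies. The sketch below swaps in simpler standard-library structures for the ones named on the slide: a sorted list with `bisect` stands in for the balanced binary search tree, and a dictionary stands in for the trie. The class names are hypothetical.

```python
import bisect


class GreaterThanIndex:
    """Rules of the form `field > threshold`, answered with binary search."""

    def __init__(self, thresholds_by_rule):
        pairs = sorted((t, rule_id) for rule_id, t in thresholds_by_rule.items())
        self.thresholds = [t for t, _ in pairs]
        self.rule_ids = [rule_id for _, rule_id in pairs]

    def lookup(self, value):
        # Every rule whose threshold is strictly below `value` is satisfied.
        end = bisect.bisect_left(self.thresholds, value)
        return set(self.rule_ids[:end])


class ExactIndex:
    """Rules of the form `field == string`, answered with one hash lookup."""

    def __init__(self, strings_by_rule):
        self.table = {}
        for rule_id, text in strings_by_rule.items():
            self.table.setdefault(text, set()).add(rule_id)

    def lookup(self, value):
        return self.table.get(value, set())
```

With `GreaterThanIndex({4: 300, 5: 544})`, a header length of 450 returns `{4}`; with `ExactIndex({2: "Header.php", 3: "awstats.pl", 4: "fp40reg.dll"})`, the filename `fp40reg.dll` also returns `{4}`. Each lookup is performed once per field, however many rules share that matcher.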

27 Step 1: Pre-Computation
Optimize the matcher order based on the buffering constraint & field arrival order
Rule reorder
[Figure: rules 1 to n reordered into groups: require Matcher 1; don’t care Matcher 1, require Matcher 2; don’t care Matcher 1 & 2]

28 Step 2: Iterative Matching
RuleID | Method == | Filename == | Header == LEN
1 | DELETE | * | *
2 | POST | Header.php | *
3 | * | awstats.pl | *
4 | * | fp40reg.dll | name==“host”; len(value)>300
5 | * | * | name==“User-Agent”; len(value)>544
PDU = {Method=POST, Filename=fp40reg.dll, Header: name=“host”, len(value)=450}
Candidates after matching column 1 (Method ==): S_1 = {2}
S_2 = S_1 ∩ A_2 + B_2 = {2} ∩ {} + {4} = {} + {4} = {4}
S_3 = S_2 ∩ A_3 + B_3 = {4} ∩ {4} + {} = {4} + {} = {4}
[Figure: candidates in S_i that don’t care about matcher i+1 are kept; candidates that require matcher i+1 are kept only if they are in A_(i+1)]
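The worked example on this slide can be replayed with a short sketch of the merge loop. The reading of the sets is an assumption drawn from the example: A_(i+1) holds rules that match matcher i+1 and also required an earlier matcher, while B_(i+1) holds rules that match matcher i+1 as their first required matcher. The per-column scan below stands in for the per-matcher indexes, and all names are hypothetical.

```python
# One predicate per matcher column; None means "don't care".
RULES = {
    1: [lambda m: m == "DELETE", None, None],
    2: [lambda m: m == "POST", lambda f: f == "Header.php", None],
    3: [None, lambda f: f == "awstats.pl", None],
    4: [None, lambda f: f == "fp40reg.dll",
        lambda h: h[0] == "host" and h[1] > 300],
    5: [None, None, lambda h: h[0] == "User-Agent" and h[1] > 544],
}


def candidate_selection(rules, pdu_fields):
    """Return the rule IDs whose required matchers are all satisfied."""
    candidates = set()
    for i, value in enumerate(pdu_fields):
        matched = {rule_id for rule_id, preds in rules.items()
                   if preds[i] is not None and preds[i](value)}
        # A: matched rules that also required some earlier matcher.
        a = {rule_id for rule_id in matched
             if any(p is not None for p in rules[rule_id][:i])}
        # B: matched rules for which this is the first required matcher.
        b = matched - a
        # Keep a candidate if it does not care about this matcher or is in A.
        survivors = {rule_id for rule_id in candidates
                     if rules[rule_id][i] is None or rule_id in a}
        candidates = survivors | b
    return candidates
```

For the slide's PDU, `candidate_selection(RULES, ["POST", "fp40reg.dll", ("host", 450)])` passes through the candidate sets {2}, {4}, {4}, matching S_1, S_2, S_3 above. The work per step depends on the size of the candidate set rather than on the number of rules.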

29 Complexity Analysis
Merging complexity
–Need k-1 merging iterations
–For each iteration, the merge complexity is O(n) in the worst case, since S_i can have O(n) candidates for worst-case rulesets
–For real-world rulesets, the number of candidates is a small constant; therefore, each iteration is O(1)
–For real-world rulesets: O(k), which is the best we can get
Three HTTP traces: avg(|S_i|)<0.04
Two WINRPC traces: avg(|S_i|)<1.5
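Written out, the merging cost on this slide is a sum over the k-1 iterations. The constant c below is an assumed bound on the candidate set size for real-world rulesets, introduced only for illustration:

```latex
T_{\text{merge}} = \sum_{i=1}^{k-1} O\bigl(|S_i|\bigr)
\quad\Longrightarrow\quad
T_{\text{merge}} =
\begin{cases}
O(nk) & \text{worst-case rulesets, } |S_i| = O(n) \\
O(k)  & \text{real-world rulesets, } |S_i| \le c
\end{cases}
```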

30 Refinement and Extension
SPM improvement
–Allow negative conditions
–Handle array cases
–Handle associative array cases
–Handle mutually exclusive cases
Extend to Multiple PDU Matching (MPM)
–Allow checkpoints

31 31 Outline Motivation High Speed Matching for Large Rulesets. High Speed Parsing Evaluation Research Contribution

32 High Speed Parsing
Design a parsing state machine
Build an automated parsing state machine generator
General purpose vs. special purpose
–Keep the whole parse tree in memory vs. parsing and matching on the fly
–Parse all the nodes in the tree vs. parse only signature related fields (leaf nodes)
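The special-purpose idea (extract only the fields the signatures need, without building a parse tree) can be sketched for a single HTTP request. This is a hand-written illustration, not the generated parsing state machine the talk describes: the function name, the `wanted` parameter, and the choice to record only header lengths are all assumptions.

```python
def scan_http_request(data: bytes, wanted=("host", "user-agent")):
    """Scan a request once and keep only the fields the matchers use."""
    fields = {}
    pos = data.find(b"\r\n")
    if pos < 0:
        return fields  # request line not complete yet
    parts = data[:pos].split(b" ")
    if len(parts) >= 2:
        path = parts[1].split(b"?", 1)[0]
        fields["method"] = parts[0].decode("latin-1")
        fields["filename"] = path.rsplit(b"/", 1)[-1].decode("latin-1")
    pos += 2
    while pos < len(data):
        end = data.find(b"\r\n", pos)
        if end < 0 or end == pos:
            break  # incomplete line, or the blank line that ends the headers
        colon = data.find(b":", pos, end)
        if colon != -1:
            name = data[pos:colon].strip().lower().decode("latin-1")
            if name in wanted:
                # A len(value) > N matcher only needs the length, not the value.
                fields[name] = len(data[colon + 1:end].strip())
        pos = end + 2
    return fields
```

Headers that no signature refers to are skipped without being stored, and the body is never scanned, so per-connection state stays small.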

33 33 Outline Motivation High Speed Matching for Large Rulesets. High Speed Parsing Evaluation Research Contributions

34 Evaluation Methodology
Fully implemented prototype
–12,000 lines of C++ and 3,000 lines of Python
–Released at: www.nshield.org
–Deployed at a university DC with up to 106Mbps
26GB+ traces from Tsinghua Univ. (TH), Northwestern (NU) and DARPA
Run on a P4 3.8GHz single-core PC w/ 4GB memory
Measured after TCP reassembly, with the PDUs preloaded in memory
For HTTP we have 794 vulnerability signatures which cover 973 Snort rules
For WINRPC we have 45 vulnerability signatures which cover 3,519 Snort rules

35 Parsing Results
Trace | Avg flow len (B) | Binpac throughput (Gbps) | Our parser throughput (Gbps) | Speed up ratio
TH DNS | 77 | 0.31 | 3.43 | 11.2
TH WINRPC | 879 | 1.41 | 16.2 | 11.5
NU WINRPC | 596 | 1.11 | 12.9 | 11.6
TH HTTP | 6.6K | 2.10 | 7.46 | 3.6
NU HTTP | 55K | 14.2 | 44.4 | 3.1
DARPA HTTP | 2.1K | 1.69 | 6.67 | 3.9
Max. memory per connection (bytes): 15, 14

36 Matching Results
Trace | Avg flow length (B) | Sequential throughput (Gbps) | CS Matching throughput (Gbps) | Matching only time speed up ratio | Avg # of Candidates
TH WINRPC | 879 | 10.68 | 14.37 | 4 | 1.16
NU WINRPC | 596 | 9.23 | 10.61 | 1.8 | 1.48
TH HTTP | 6.6K | 0.34 | 2.63 | 11.3 | 0.033
NU HTTP | 55K | 2.37 | 17.63 | 11.7 | 0.038
DARPA HTTP | 2.1K | 0.28 | 1.85 | 8.8 | 0.0023
Max. memory per connection (bytes): 27, 20
Slide annotation: 11.0, 8-core

37 Scalability and Accuracy Results
Rule scaling results
–Performance decreases gracefully
Accuracy
–We created two polymorphic WINRPC exploits which bypass the original Snort rules but are detected accurately by our scheme
–For a 10-minute “clean” HTTP trace, Snort reported 42 alerts and NetShield reported 0 alerts
–We manually verified that the 42 alerts are false positives

38 Research Contribution
Make vulnerability signature a practical solution for NIDS/NIPS
Comparison (Regular Expression | Existing Vul. IDS | NetShield)
Accuracy | Poor | Good
Speed | Good | Poor | Good
Memory | Good | ?? | Good
Coverage | Good | ?? | Good
Build a better Snort alternative!
Multiple sig. matching → candidate selection algorithm
Parsing → parsing state machine

39 Future work
[Figure labels: Social network security; Client; Server; Network Security; Web/Web Security; WebProphet [NSDI10]; WebShield; Data Center Security]

40 40 Q & A Thanks!

41 Observations
PDU → parse tree (including arrays)
Leaf nodes are numbers or strings
General purpose vs. special purpose
–Keep the whole parse tree in memory vs. parsing and matching on the fly
–Parse all the nodes in the tree vs. parse only signature related fields (leaf nodes)

42 42 Efficient Parsing with State Machines Studied eight protocols: HTTP, FTP, SMTP, eMule, BitTorrent, WINRPC, SNMP and DNS as well as their vulnerability signatures Common relationships among leaf nodes Pre-construct parsing state machines based on parse trees and vulnerability signatures Automated parsing state machine generator: UltraPAC

43 Example for WINRPC
Rectangles are states
Parsing variables: R0..R4
0.61 instructions/byte for the BIND PDU

44 Experiences
Work in progress
–In collaboration with MSR, apply semantic-rich analysis to cloud Web service profiling, to understand why services are slow and how to improve them
Interdisciplinary research
Student mentoring (three undergraduates, six junior graduate students)

45 Future Work
Near term
–Web security (browser security, web server security)
–Data center security
–High speed network intrusion prevention system with hardware support
Long term research interests
–Combating professional profit-driven attackers will be a continuous arms race
–Online applications (including Web 2.0 applications) are becoming more complex and vulnerable
–Network speed keeps increasing, which demands highly scalable approaches

46 46 Research Contributions Demonstrate vulnerability signatures can be applied to NIDS/NIPS, which can significantly improve the accuracy of current NIDS/NIPS Propose the candidate selection algorithm for matching a large number of vulnerability signatures efficiently Propose parsing state machine for fast protocol parsing Implement the NetShield

47 Comparing With Regex
Memory for 973 Snort rules: DFA 5.29GB (XFA, 863 rules: 1.08MB), NetShield 2.3MB
Per-flow memory: XFA 36 bytes, NetShield 20 bytes
Throughput: XFA 756Mbps, NetShield 1.9+Gbps
(*XFA [SIGCOMM08][Oakland08])

48 Measure Snort Rules
Semi-manually classify the rules
1. Group by CVE-ID
2. Manually look at each vulnerability
Results
–86.7% of rules can be improved by protocol semantic vulnerability signatures
–Most of the remaining rules (9.9%) are related to web DHTML and scripts, which are not suitable for a signature based approach
–On average 4.5 Snort rules are reduced to one vulnerability signature
–For binary protocols the reduction ratio is much higher than for text based ones; for netbios.rules the ratio is 67.6

49 Matcher order
Reduce S_(i+1)
Enlarge S_(i+1): fixed, put the matcher later, reduce B_(i+1)
Merging overhead: |S_i| (use a hash table to calculate membership in A_(i+1), O(1))

50 Matcher order optimization
Worth buffering only if estmaxB(M_j)<=MaxB
For M_i in AllMatchers
–Try to clear all the M_j in the buffer for which estmaxB(M_j)<=MaxB
–Buffer M_i if estmaxB(M_i)>MaxB
–When len(Buf)>Buflen, remove the M_j with minimum estmaxB(M_j)

52 52 Backup Slides

53 Motivation
Network security has been recognized as the single most important attribute of their networks, according to a survey of 395 senior executives conducted by AT&T
Many newly emerging threats make the situation even worse

54 Candidate merge operation
[Figure: candidates in S_i that don’t care about matcher i+1 are kept; candidates that require matcher i+1 are kept only if they are in A_(i+1)]

55 A Vulnerability Signature Example
Data representations
–For all the vulnerability signatures we studied, we only need numbers and strings
–Number operators: ==, >, <, >=, <=
–String operators: ==, match_re(.,.), len(.)
Example signature for the Blaster worm:
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)

56 System Framework
[Figure: framework components annotated with their goals: scalability; accuracy & adapt fast; accuracy & scalability & coverage]

57 Example of Vulnerability Signatures
At least 75% of vulnerabilities are due to buffer overflow
Sample vulnerability signature
–Field length corresponding to the vulnerable buffer > certain threshold
–Intrinsic to the buffer overflow vulnerability and hard to evade
[Figure: a protocol message overflowing the vulnerable buffer]

58 58 Old Slides

59 Conclusions
A novel network-based vulnerability signature matching engine
–Through a measurement study on the Snort ruleset, show that vulnerability signatures can improve most of the signatures in NIDS/IPS
–Propose a parsing state machine for fast parsing
–Propose a candidate selection algorithm for matching a large number of vulnerability signatures simultaneously

60 60 Outline Motivation Feasibility Study: a measurement approach Problem Statement High Speed Parsing High Speed Matching for massive vulnerability Signatures. Evaluation Conclusions

65 Limitations of Regular Expression Signatures
Signature: 10.*01
Polymorphic attack (worm/botnet) might not have an exact regular expression based signature
[Figure: traffic filtering between the Internet and our network on bit streams 1010101, 10111101, 11111100, 00010111; X marks on filtered traffic; label: Polymorphism!]

66 What do we do?
Build a NIDS/NIPS with much better accuracy than, and similar speed to, regular expression based approaches
–Feasibility: of the Snort ruleset (6,735 signatures), 86.7% can be improved by vulnerability signatures
–High speed parsing: 2.7~12 Gbps
–High speed matching: efficient algorithm for matching massive vulnerability rules; HTTP, 791 vulnerability signatures at ~1Gbps

67 Problem Formulation
Parsing problem formulation
–Given a PDU and the protocol specification as input, output the set of fields required for matching

68 Publications
Zhichun Li, Lanjia Wang, Yan Chen and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE ICNP 2007.
Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible sketches: Enabling monitoring and analysis over high speed data streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007.
Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006.
Zhichun Li, Yan Chen and Aaron Beach, Towards Scalable and Robust Distributed Intrusion Alert Fusion with Good Load Balancing, in Proc. of ACM SIGCOMM LSAD 2006.
Yan Gao, Zhichun Li and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE ICDCS 2006.
Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM 2006.

69 Current Status
Part I: Sketch based monitoring & detection
–Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible sketches: Enabling monitoring and analysis over high speed data streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007
–Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM 2006 (252/1400=18%)
–Yan Gao, Zhichun Li and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE International Conference on Distributed Computing Systems (ICDCS) 2006 (75/536=14%) (Alphabetical order)
Part II: Polymorphic worm signature generation
–TOSG: Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006 (23/251=9%)
–LESG: Zhichun Li, Lanjia Wang, Yan Chen and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE International Conference on Network Protocols (ICNP) 2007 (32/220=14%)

70 Current Status
Part III: Signature matching engines
–Work in progress, will be the focus of this talk
–Zhichun Li, Gao Xia, Yi Tang, Jian Chen, Ying He, Yan Chen and Bin Liu, NetShield: Towards High Performance Network-based Semantic Signature Matching, in submission
Part IV: Network Situational Awareness
–Work in progress
–Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson, Towards Situational Awareness of Large-Scale Botnet Events using Honeynets, in preparation
–Zhichun Li, Anup Goyal, Yan Chen and Aleksandar Kuzmanovic, P2P Doctor: Measurement and Diagnosis of Misconfigured Peer-to-Peer Traffic, in submission

71 Current Status
Part I: Sketch based monitoring & detection
–Results in [Infocom06, ToN, ICDCS06]
Part II: Polymorphic worm signature generation
–Results in [Oakland06, ICNP07]
Part III: Signature matching engines
–Work in progress, will be the focus of this talk
Part IV: Network Situational Awareness
–Work in progress

72 Limitations of Exploit Based Signature
Signature: 10.*01
Polymorphic worm might not have an exact exploit based signature
[Figure: traffic filtering between the Internet and our network on bit streams 1010101, 10111101, 11111100, 00010111; X marks on filtered traffic; label: Polymorphism!]

73 Vulnerability Signature
Works for polymorphic worms
Works for all the worms which target the same vulnerability
[Figure: vulnerability signature traffic filtering between the Internet and our network; X marks on filtered traffic aimed at the vulnerability]

