BlindBox: Deep Packet Inspection over Encrypted Traffic


1 BlindBox: Deep Packet Inspection over Encrypted Traffic
SIGCOMM '15, August 2015. Justine Sherry (UC Berkeley), Chang Lan (UC Berkeley), Raluca Ada Popa (ETH Zürich and UC Berkeley), Sylvia Ratnasamy (UC Berkeley)

2 Outline
Introduction, Overview, Protocol I, Protocol II, Protocol III, System Implementation, Evaluation, Conclusion

3 Introduction

4 Introduction
Network intrusion detection/prevention (IDS/IPS) systems such as Snort and Bro perform deep packet inspection (DPI).
All existing DPI systems work on plaintext data.
With HTTPS, the middlebox mounts a man-in-the-middle attack on SSL and decrypts the traffic at the middlebox, which is insecure.
BlindBox is the first system that simultaneously provides both properties: the functionality of middleboxes and the privacy of encryption.
Many middleboxes in today's networks perform DPI to provide services to users and network operators. The problem is the tension between the functionality of middleboxes and the privacy of encryption: once the middlebox breaks the connection open, end-to-end encryption is no longer guaranteed, private data is exposed at the middlebox, and security cannot be promised. Today one has to choose between middlebox functionality and the privacy of encryption.

5 Challenges
Networks operate at very high rates, so cryptographic operations must complete in microseconds or nanoseconds.
Some middleboxes require support for rich operations.
Candidate general-purpose schemes: fully homomorphic encryption, functional encryption.

6 BlindBox
Performs deep packet inspection directly on the encrypted traffic.
Enables applications: IDS, exfiltration detection, parental filtering.
Supports real rulesets from both open-source and industrial DPI systems.
It is practical for settings with long-lived HTTPS connections.
The core encryption scheme is 3 to 6 orders of magnitude faster than existing relevant cryptographic schemes.
Exfiltration detection blocks private data from being leaked by accident: private data carries a watermark, and the middlebox searches for that watermark.

7 Privacy Model from BlindBox
Two privacy classes: exact match privacy and probable cause privacy.
The data is protected with strong randomized encryption schemes. Basic idea: AES.
Both privacy guarantees are stronger than the privacy of today's man-in-the-middle approach.
Exact match privacy: the middlebox performs exact string matching and learns nothing about traffic that does not match a rule keyword (the middlebox is honest but curious).
Probable cause privacy: only when a string matches does the middlebox decrypt the traffic and then run regular expressions or scripts on it, so the middlebox does not see the traffic if the stream contains no attack rule string.

8 Overview

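9 Simple transmit process
[Diagram: a rule generator supplies the middlebox with a rule list of encrypted attack keywords, E(Attack). One end user sends encrypted data, E(Data), through the middlebox to the other end user. The middlebox compares E(Data) with E(Attack) to detect an attack.]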

10 BlindBox's Requirements
The key used to encrypt tokens and rules must not be revealed to the middlebox; obfuscated rule encryption is used to encrypt the rules. If the middlebox knew the key, it could decrypt the traffic or the tokens.
The middlebox cannot read the user's traffic, except for the portions considered suspicious based on the attack rules.
The rules must not be revealed to the sender. If an endpoint knew the rules, it could use them to evade detection.

11 Threat Model
The original attacker considered by IDS
At least one of the endpoints must be honest (not malicious). Examples: parental filters, commercial exfiltration detection devices.
If both endpoints are malicious, they can agree on a key between themselves and use it to evade BlindBox detection, for example with strong encryption of their own. A parental filter requires that one endpoint be innocent.
The attacker at the middlebox
Honest but curious: the middlebox follows the protocol but tries to learn private data.
The goal is to hide content from the MB while still allowing the MB to do DPI.

12 Architecture

13 System architecture
Sender: encrypts the traffic with SSL; tokenizes the traffic into keywords, then encrypts the tokens.
Rule generator: provides rules to the middlebox; provides the key to the sender and receiver to perform AES.
Middlebox: blocks or alerts if encrypted tokens match encrypted rules.
Receiver: decrypts with SSL; validates tokens.

14 Protocol I : BASIC DETECTION

15 Challenge 1 ) Encrypt rule 2) Tokenization 3) Prevent same cipher-text
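[Diagram: the sender transmits the raw data to the receiver over SSL and, alongside it, the encrypted tokens AES_k(T0), AES_k(T1), AES_k(T2). The middlebox holds a rule list of encrypted rules AES_k(R0), AES_k(R1) and compares the tokens against it. The receiver validates the tokens.]
The rule generator produces the rules for the middlebox and a key for the sender and receiver, which serves as the symmetric encryption key. Each rule represents one attack and usually consists of one or more keywords; a match means the traffic is suspicious. The rule list resides in the middlebox.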

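16 How to Encrypt rule? AES Yao's Garbled Circuits
Yao's garbled circuits: two parties can jointly compute the correct output of a function without either party needing to reveal its input to the function. Here the function is AES, with the rule and the key as inputs and the ciphertext as output.
Oblivious Transfer (OT): A holds two values S_0 and S_1, and B holds a choice bit b. B obtains S_b without learning the other value, and A does not learn b.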

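17 Oblivious Transfer
A has S = {S_0 = 17, S_1 = 19}; B wants to select i = 0.
A and B agree on two numbers g and q such that g is a generator modulo q. Here q = 11, g = 7, C = 9.
1) Select C from Z_q with C = g^u mod q, where the discrete log u of C is hard to find.
2) B selects x_i with 0 < x_i < q-1; here x_0 = 3.
3) B sets B_0 = 2, B_1 = 10. These values are consistent with B_0 = g^(x_0) mod q and B_1 = C / B_0 mod q.
4) A checks the received values; the numbers satisfy B_0 * B_1 = C (mod q).
5) A selects y_0 = 2, y_1 = 3 with 0 <= y_i < q-1.
6) A calculates a_0 = 5, a_1 = 2, consistent with a_i = g^(y_i) mod q.
7) A calculates r_0 = 21, r_1 = 29, consistent with r_i = S_i + (B_i^(y_i) mod q).
8) Since B knows x_0, it can easily calculate a_0^(x_0) mod q = 4 and therefore S_0 = r_0 - 4 = 17.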
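The numbers on this slide can be replayed in a few lines of Python. This is only a sketch: the slide's formulas were lost with the images, so the relations used here (B_i = g^x mod q, B_0 * B_1 = C mod q, a_j = g^(y_j) mod q, r_j = S_j + B_j^(y_j) mod q) are reconstructed from the surviving values and are an assumption. The function names are hypothetical, and the parameters are toy-sized and offer no security.

```python
# Toy 1-out-of-2 oblivious transfer replaying the slide's numbers.
# The formulas are reconstructed from the slide's values (an assumption).
q, g, C = 11, 7, 9          # public parameters; log_g(C) is assumed unknown
S = {0: 17, 1: 19}          # A's two secrets


def receiver_choose(i, x):
    """B picks a secret exponent x for its choice i and derives both public values."""
    chosen = pow(g, x, q)                    # B knows the exponent of this one
    other = (C * pow(chosen, -1, q)) % q     # forced so that B_0 * B_1 = C (mod q)
    return {i: chosen, 1 - i: other}


def sender_respond(B, y):
    """A checks the public values, then masks each secret with B_j^(y_j) mod q."""
    if (B[0] * B[1]) % q != C:
        raise ValueError("receiver values are inconsistent with C")
    a = {j: pow(g, y[j], q) for j in (0, 1)}
    r = {j: S[j] + pow(B[j], y[j], q) for j in (0, 1)}
    return a, r


def receiver_recover(i, x, a, r):
    """B removes the mask, using a_i^x = g^(y_i * x) = B_i^(y_i) (mod q)."""
    return r[i] - pow(a[i], x, q)


B = receiver_choose(i=0, x=3)
a, r = sender_respond(B, y={0: 2, 1: 3})
print(B, a, r, receiver_recover(0, 3, a, r))
```

B cannot strip the mask from r_1, because it does not know the discrete log of B_1. A real protocol would use a large group and hash the mask instead of adding it directly.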

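18 Yao's garbled circuit
Parties A and B: A garbles the gate into 4 garbled values, B obtains its input key through OT, and the result is returned.
1) Garbling: A garbles the circuit.
2) A's input wire is W0, so if A wants to send a 0 bit, it sends the key that stands for 0 on W0.
3) B's input wire is W1, but B does not have the key for it, so it runs OT with A. Example: B wants input 0.
4) B gets the W1 key, calculates the garbled value, and then checks the output pair to obtain the result.
5) B sends the output result to A.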
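The five steps can be illustrated with a single garbled AND gate. The sketch below is a simplified textbook construction, not the circuit used by BlindBox: SHA-256 stands in for the row encryption, a block of zero bytes marks the correctly decrypted row, and the OT step is replaced by handing B its label directly. All function names are hypothetical.

```python
import hashlib
import random
import secrets

TAG = b"\x00" * 8  # trailing zeros let the evaluator recognise the right row


def pad(key_a, key_b):
    """Derive a 24-byte one-time pad from two wire labels (SHA-256 stand-in)."""
    return hashlib.sha256(key_a + key_b).digest()[:24]


def xor(x, y):
    return bytes(p ^ q for p, q in zip(x, y))


def garble_and_gate():
    """A's side: pick two labels per wire and encrypt the four output labels."""
    w0 = [secrets.token_bytes(16) for _ in range(2)]   # A's input wire W0
    w1 = [secrets.token_bytes(16) for _ in range(2)]   # B's input wire W1
    out = [secrets.token_bytes(16) for _ in range(2)]  # output wire
    table = [xor(pad(w0[a], w1[b]), out[a & b] + TAG)
             for a in (0, 1) for b in (0, 1)]
    random.shuffle(table)   # hide which row belongs to which input pair
    return w0, w1, out, table


def evaluate(table, key_a, key_b):
    """B's side: only the row built from both held labels decrypts to a tagged value."""
    for row in table:
        plain = xor(pad(key_a, key_b), row)
        if plain.endswith(TAG):
            return plain[:16]
    raise ValueError("no row decrypted")


w0, w1, out, table = garble_and_gate()
# A sends its label for bit 1; B would obtain its label for bit 0 through OT.
result = evaluate(table, w0[1], w1[0])
print(out.index(result))   # index of the output label, i.e. 1 AND 0
```

B learns one output label and nothing about A's bit; mapping the label back to a bit is the "check output pair" step.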

19 Yao's garbled circuit AES
The middlebox encrypts the rules without learning the end users' key.
The end users do not learn the rule table.
[Diagram: AES circuit with the rule and the key as inputs and the ciphertext as output.]

20 Challenge 1 ) Encrypt rule 2) Tokenization 3) Prevent same cipher-text

21 Tokenization
Example: Login.php?user=alice
Window-based: one token per byte offset. Overhead: window size ×. With base size 3: [Log][ogi][gin][in.][n.p][.ph][php]…
Delimiter-based: tokens are cut at punctuation, spacing, and special symbols, and redundant tokens are ignored: [Login][php][?user=][user=alice]
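A short sketch of both tokenizers on the slide's example string. The window-based function reproduces the slide's base-size-3 tokens. The delimiter-based function is a deliberate simplification that only splits on a hypothetical delimiter set, so it does not reproduce the slide's exact tokens, which keep some delimiters attached.

```python
import re


def window_tokens(data, size=3):
    """One token per byte offset: every substring of the given length."""
    return [data[i:i + size] for i in range(len(data) - size + 1)]


def delimiter_tokens(data, delimiters=r"[.?=&/\s]+"):
    """Simplified delimiter-based split; the delimiter set is an assumption."""
    return [tok for tok in re.split(delimiters, data) if tok]


url = "Login.php?user=alice"
print(window_tokens("Login.php"))
print(len(window_tokens(url)), "window tokens vs",
      len(delimiter_tokens(url)), "delimiter tokens")
```

The token counts show why the window-based method costs more bandwidth: it emits a token for almost every byte of input.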

22 Challenge 1 ) Encrypt rule 2) Tokenization 3) Prevent same cipher-text

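23 The DPIEnc Encryption Scheme
Once the traffic is tokenized, the tokens have to be encrypted, and so do the rules.
Simple deterministic encryption is weak when the attacker uses frequency analysis.
Consider a hash function H that is not invertible and is pseudorandom. Weakness: existing hash functions, for example SHA-1, are too slow.
Solution: use AES to implement H. AES must be keyed with a value that the MB does not know when there is no match to an attack rule.
The MB cannot compute AES_k(t) for a token that matches no rule, and this value serves as the key of H. It only knows the value for matching tokens, because AES_k(t) = AES_k(r) under the same key k. The hash function thus hides the AES_k(t) of tokens that do not match.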

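24 The DPIEnc Encryption Scheme (Cont'd)
RS: the ciphertext is reduced to 5 bytes to shrink its size (initially 8 bytes or more).
The sender sends two things: the salt and the encrypted value.
The intuitive detection compares every encrypted token with every encrypted rule to see whether they match. That is a linear scan, token by token and rule by rule, which is too slow. The next slides therefore introduce a detection function with logarithmic lookup. For example, if the ruleset has 10,000 rules, a logarithmic lookup is four orders of magnitude faster than a linear scan.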
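The structure of the scheme can be sketched as follows. The slide's formula did not survive extraction, so the shape used here, Enc_k(salt, t) = H_{AES_k(t)}(salt) mod RS, is an assumption based on the surrounding slides. HMAC-SHA256 is swapped in for AES so that the sketch runs with the standard library alone; the function names and RS = 2^40 (5 bytes) are likewise illustrative.

```python
import hashlib
import hmac

RS = 2 ** 40   # 5-byte ciphertext space, matching the "5 bytes" on the slide


def prf(key, msg):
    """HMAC-SHA256, used here as a stand-in for the AES calls in DPIEnc."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def dpi_enc_from_inner(inner, salt):
    """Outer step: keyed by the inner value, so it needs no knowledge of k."""
    return int.from_bytes(prf(inner, salt.to_bytes(8, "big")), "big") % RS


def dpi_enc(k, salt, token):
    """Sender side: inner value PRF_k(t), then the salted outer step."""
    return dpi_enc_from_inner(prf(k, token), salt)


k = b"hypothetical session key"
rule_inner = prf(k, b"attack")   # what the MB holds after obfuscated rule encryption

# The MB can reproduce the ciphertext of a matching token without knowing k.
print(dpi_enc(k, 7, b"attack") == dpi_enc_from_inner(rule_inner, 7))
# A token that matches no rule yields an unrelated value.
print(dpi_enc(k, 7, b"benign") == dpi_enc_from_inner(rule_inner, 7))
```

Because the outer step is keyed with the inner value, the MB can only test tokens against rules whose inner value it holds.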

25 BlindBox Detect Protocol
First idea: precompute the values Enc_k(salt, r) for every rule r and for every possible salt.
Recall: the MB can compute Enc_k(salt, r) based only on the salt and its knowledge of AES_k(r); the MB does not need to know the key k.
Arrange the values Enc_k(salt, r) in a search tree.
For each encrypted token t in the traffic stream, the MB simply looks up the search tree and checks whether an equal value exists.
Weakness: the MB has to prepare all possible salts for each rule r.
In practice one could use only a few salts, but then the same salt is reused for the same token, and an attacker at the MB can see which tokens are equal to each other and run frequency analysis.

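26 BlindBox Detect Protocol : Proposed scheme
To maintain the desired security, every encryption of a token t must contain a different salt.
The sender no longer sends a salt with each token; it only sends salt0 initially.
The sender keeps a counter table that maps each encrypted token t to the number of times ct_t it has been sent.
Instead of a fresh salt, the sender sends Enc_k(salt0 + ct_t, t).
Reset: the sender sends a new salt0 to prevent the counter table from growing too big. The reset also keeps the salt from growing too large.
Example: for the token sequence a, b, a the salts are salt0, salt0, salt0 + 1.
Without knowing the salt, the MB cannot tell whether any two tokens it receives are the same. If two ciphertexts differ and their salts are the same, the tokens differ; if the salts differ, the tokens may or may not be the same.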

27 BlindBox Detect Protocol (cont'd)
The MB creates a table mapping each keyword r to a counter ct_r.
The MB also stores each Enc_k(salt0 + ct_r, r) in the search tree nodes.
On a match, the MB increments ct_r, inserts the new Enc_k(salt0 + ct_r, r) into the tree, and deletes the old node.
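The counter mechanism on the last two slides can be sketched with two small classes. Several things here are assumptions: HMAC-SHA256 again replaces AES, a Python dict stands in for the search tree, and the class and method names are hypothetical.

```python
import hashlib
import hmac


def prf(key, msg):
    """HMAC-SHA256 as a stand-in for AES."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def enc_from_inner(inner, salt):
    """Salted outer encryption, truncated to 5 bytes."""
    return prf(inner, salt.to_bytes(8, "big"))[:5]


class Sender:
    def __init__(self, k, salt0):
        self.k, self.salt0, self.counts = k, salt0, {}

    def encrypt(self, token):
        ct = self.counts.get(token, 0)       # times this token was sent so far
        self.counts[token] = ct + 1
        return enc_from_inner(prf(self.k, token), self.salt0 + ct)


class Middlebox:
    def __init__(self, encrypted_rules, salt0):
        # encrypted_rules maps a rule name to its inner value; the MB never sees k.
        self.rules, self.salt0 = encrypted_rules, salt0
        self.counts = {name: 0 for name in encrypted_rules}
        self.index = {enc_from_inner(inner, salt0): name
                      for name, inner in encrypted_rules.items()}

    def inspect(self, ciphertext):
        name = self.index.pop(ciphertext, None)   # delete the old node on a match
        if name is not None:
            self.counts[name] += 1
            nxt = enc_from_inner(self.rules[name], self.salt0 + self.counts[name])
            self.index[nxt] = name                # insert the next expected value
        return name


k = b"hypothetical session key"
mb = Middlebox({"attack": prf(k, b"attack")}, salt0=100)
sender = Sender(k, salt0=100)
stream = [b"hello", b"attack", b"hello", b"attack"]
ciphertexts = [sender.encrypt(t) for t in stream]
print([mb.inspect(c) for c in ciphertexts])
print(len(set(ciphertexts)))   # repeated tokens still produce distinct ciphertexts
```

Both sides advance their counters in lockstep, so the MB always holds exactly one expected ciphertext per rule keyword.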

28 Validate Tokens
The validate tokens procedure runs at the receiver.
The receiver decrypts the SSL traffic and runs the same tokenization and encryption as the sender. The result is a set of encrypted tokens, and the receiver checks that these are the same as the encrypted tokens forwarded by the MB.
If not, there is a chance that the other endpoint is malicious, and the receiver flags the misbehavior.
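A minimal sketch of the receiver-side check, assuming window-based tokenization with a hypothetical window of 8 bytes, per-token counters as on the earlier slides, and HMAC-SHA256 in place of AES. The function names are illustrative.

```python
import hashlib
import hmac


def prf(key, msg):
    """HMAC-SHA256 as a stand-in for AES."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def encrypt_tokens(k, salt0, plaintext, window=8):
    """Tokenize with a sliding window and encrypt each token with its counter."""
    counts, out = {}, []
    for i in range(len(plaintext) - window + 1):
        token = plaintext[i:i + window]
        ct = counts.get(token, 0)
        counts[token] = ct + 1
        salt = (salt0 + ct).to_bytes(8, "big")
        out.append(prf(prf(k, token), salt)[:5])
    return out


def validate_tokens(k, salt0, decrypted_payload, forwarded_tokens):
    """Receiver check: recompute the tokens and compare with what the MB forwarded."""
    return encrypt_tokens(k, salt0, decrypted_payload) == forwarded_tokens


k = b"hypothetical session key"
payload = b"GET /login.php?user=alice"
honest = encrypt_tokens(k, 1, payload)
forged = encrypt_tokens(k, 1, b"GET /index.html harmless")
print(validate_tokens(k, 1, payload, honest))   # tokens match the payload
print(validate_tokens(k, 1, payload, forged))   # sender lied about the tokens
```

The check catches a sender that ships an attack in the SSL stream while sending innocent-looking tokens to the MB.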

29 Protocol II : LIMITED IDS

30 Protocol II
A rule can contain multiple keywords to be matched in the traffic, and absolute and relative offset information within the packet.
A rule is "matched" if all keywords are found within a flow.
Privacy model and security guarantee: same as Protocol I.
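The "all keywords within a flow" condition amounts to simple per-flow bookkeeping at the middlebox. The sketch below is an illustration with hypothetical names; it tracks only which keywords were seen and leaves out the offset constraints mentioned on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    keywords: frozenset


@dataclass
class FlowState:
    seen: set = field(default_factory=set)


def on_keyword_match(rules, flow, keyword):
    """Record a matched keyword; return the rules whose keywords are now all present."""
    flow.seen.add(keyword)
    return [r.name for r in rules
            if keyword in r.keywords and r.keywords <= flow.seen]


rules = [Rule("sqli", frozenset({"union", "select"}))]
flow = FlowState()
print(on_keyword_match(rules, flow, "union"))    # rule not complete yet
print(on_keyword_match(rules, flow, "select"))   # all keywords seen in this flow
```

Each keyword match would come from the encrypted-token lookup of Protocol I, so the bookkeeping itself needs no access to plaintext.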

31 Protocol III : FULL IDS WITH PROBABLE CAUSE PRIVACY

32 Protocol III
Privacy model: probable cause privacy.
The traffic flow needs to be decrypted if it matches any rule.
The SSL key is recovered on a match so that scripts and regular expressions can run on the decrypted traffic; current IDSs such as Snort and Bro detect attacks with scripts and regular expressions.
Security: same as Protocol I and Protocol II.
For each token the sender sends a pair: Enc_k(salt, t1), Enc_k(salt + 1, t1) xor k_SSL; Enc_k(salt + 2, t2), Enc_k(salt + 3, t2) xor k_SSL.
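The token pair on this slide can be sketched as follows. HMAC-SHA256 again stands in for AES, and the sketch assumes that the masked half uses the full, untruncated output so that it covers the whole key; the function names and the 32-byte example key are hypothetical.

```python
import hashlib
import hmac


def prf(key, msg):
    """HMAC-SHA256 as a stand-in for AES."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def xor(x, y):
    return bytes(a ^ b for a, b in zip(x, y))


def sender_pair(k, k_ssl, salt, token):
    """Return (Enc_k(salt, t), Enc_k(salt + 1, t) XOR k_SSL) for one token."""
    inner = prf(k, token)
    c1 = prf(inner, salt.to_bytes(8, "big"))[:5]
    c2 = xor(prf(inner, (salt + 1).to_bytes(8, "big")), k_ssl)
    return c1, c2


def middlebox_recover(rule_inner, salt, c1, c2):
    """On a match, strip the mask with the rule's inner value to obtain k_SSL."""
    if prf(rule_inner, salt.to_bytes(8, "big"))[:5] != c1:
        return None   # no match: the mask cannot be removed
    return xor(prf(rule_inner, (salt + 1).to_bytes(8, "big")), c2)


k = b"hypothetical session key"
k_ssl = bytes(range(32))
rule_inner = prf(k, b"attack")

c1, c2 = sender_pair(k, k_ssl, 10, b"attack")
print(middlebox_recover(rule_inner, 10, c1, c2) == k_ssl)   # probable cause: key released

c1, c2 = sender_pair(k, k_ssl, 12, b"benign")
print(middlebox_recover(rule_inner, 12, c1, c2))            # None: traffic stays private
```

The MB obtains the SSL key only when a token equals a rule keyword, which is the probable cause condition.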

33 System Implementation

34 System implementation
BlindBox library, which opens three connections: one normal SSL connection; one to transmit the searchable encrypted tokens; one to listen in case a middlebox on the path requests garbled circuits.
The middlebox: half of the threads perform detection over the data stream; half of the threads perform obfuscated rule encryption exchanges with clients.
When a rule is detected, the middlebox checks whether it belongs to Protocol I, II, or III; for Protocol III it recovers the SSL key and decrypts the traffic stream.

35 Evaluation : Functionality Evaluation

36 Can BlindBox implement the functionality required for each target system?

37 Table 1
This part covers the functionality BlindBox implements: parental filtering, document watermarking, and IDS rules.

38 Does BlindBox fail to detect any attacks/policy violations that these standard implementations would detect?

39 Different tokenization
Window-based: misses nothing, because there is a token at every byte offset.
Delimiter-based: an experiment tests whether this tokenization misses some possible attacks. It detected 97.1% of the attack keywords and 99% of the attack rules that would have been detected with Snort (an attack rule may consist of multiple keywords).
Only the second method needs testing, because the first cannot miss a rule.

40 Evaluation : Performance Evaluation

41 Some result over experiments
The experiments use Protocol II.
BlindBox is 3 to 6 orders of magnitude faster than relevant implementations using existing cryptography.
The primary overhead of BlindBox is setting up a connection, due to the obfuscated rule encryption.
BlindBox is not yet practical for systems with thousands of rules and short-lived connections that need to run setup frequently.
Protocol III is not tested here, because it differs from Protocol II only in the second middlebox and in the extra bandwidth for embedding the k_SSL key. The setup overhead does not matter for a small ruleset, but for a large ruleset (thousands of rules, each with several keywords) setup can take 1.5 minutes.

42 Strawmen
A searchable encryption scheme: can implement encryption and detection as in Protocols I and II, but does not enable obfuscated rule encryption or probable cause decryption.
Generic functional encryption (FE): could be used to create reusable garbled circuits and obfuscated rule encryption, but it nests fully homomorphic encryption twice and is too slow.
BlindBox is the only system we know of to enable DPI over encrypted data.
Comparison figures follow on the next slides.

43 Client Performance : How long does it take to encrypt a token?

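44 Encryption: compared with ordinary HTTPS, BlindBox additionally has to run the tokenization process and encrypt the tokens, so it takes 5 to 30 times longer. The FE strawman takes six orders of magnitude longer than BlindBox and is even less practical: a client that needs 15 seconds to encrypt each packet is not realistic.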

45 Client Performance : How long does the initial handshake take with the middlebox?

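46 Setup consists of exchanging the garbled circuits with the MB and performing the corresponding obfuscated rule encryption.
Neither the FE scheme nor the searchable encryption scheme can meet the basic requirement of setup. A double (nested) FE construction could in principle do it but takes far too long (10^10), so plain functional encryption is used instead, which therefore cannot meet the basic setup requirement either. Both strawmen make the rules visible to the endpoints.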

47 Client Performance : What is the computational overhead of BlindBox encryption, and how does this overhead impact page load times?

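48 At a client download speed of 20 Mbps, the figure shows the page-load times for sites such as YouTube and Airbnb, which carry many video and image files. Because the sender can produce encrypted tokens faster than the client can download data through the MB, the encryption cost is not noticeable as the CPU can continue producing data at around the link rate. YouTube and Airbnb contain a lot of binary data (video and images) that is not tokenized. At a download speed of 1 Gbps, however, the overhead becomes visible, as the BlindBox sender cannot encrypt fast enough to keep up with the line rate.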

49 Client Performance : What is the bandwidth overhead of transmitting encrypted tokens for a typical web page?

50 Key to enhance the performance
Minimizing bandwidth overhead is key to client performance: less data transmitted means less cost, faster transfer times, and faster detection times.
The bandwidth overhead in BlindBox depends on the number of tokens produced.
The number of encrypted tokens varies widely depending on three parameters of the page being loaded: what fraction of bytes are text/code, which must be tokenized; how "dense" the text/code is in number of delimiters; whether or not the web server and client support compression.

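51 These two figures break the transmitted data down into text bytes, binary bytes, and tokenized bytes,
using window-based and delimiter-based tokenization. The right-hand axis shows the overhead of adding tokens relative to transmitting the original page data. The overhead of the delimiter-based method is smaller than that of the window-based method.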

52 Middlebox Performance : What throughput can BlindBox sustain and how does this compare to standard IDS? A standard IDS operates in the plaintext domain.

53 We measured a throughput of 166Mbps when using BlindBox
When running Snort over the same traffic, we measured a throughput of 85Mbps.
BlindBox performed detection about twice as fast as Snort, for two reasons. First, BlindBox reduces all detection to exact matching and pushes regular expression parsing to a secondary middlebox, which is rarely used. Second, the packet-capture library used by BlindBox is faster than the one used by Snort, which also influences the result to some degree.

54 Middlebox Performance : How does BlindBox compare in detection time against other strawmen approaches?

55 The FE strawman needs a day to run detection for one packet over 3k rules (each rule has three keywords on average),
which is not practical. The searchable encryption scheme is also slow: it processes only six to seven packets per second over 3k rules. By comparison, BlindBox is three orders of magnitude faster than the searchable encryption scheme and six orders of magnitude faster than the FE scheme.

56 Conclusion

57 Conclusion
BlindBox is a system that resolves the tension between security and DPI middlebox functionality in networks.
BlindBox is the first system to enable deep packet inspection over encrypted traffic without requiring decryption of the underlying traffic.
BlindBox supports many real DPI applications.
In the measurements, BlindBox's detection was faster than Snort, a standard DPI system that operates on plaintext data.

