Estimating Closeness to the Uniform Distribution on RC4 Keystream Bytes using Property Testing
By Eliezer Yucht
Prepared under the supervision of Prof. Dana Ron
Project presentation, 8 February 2017
Tel Aviv University, Faculty of Engineering
Agenda
Introduction + Background
RC4, WPA-TKIP and the L1 measure
Estimating Closeness via Learning
Uniformity testing
Paninski test
The collision tester
Comparing the fingerprints
Conclusion
The RC4 cipher
RC4 is a stream cipher designed by Ron Rivest in 1987.
Very fast and simple in both hardware and software.
Used in many systems/protocols: WEP, WPA-TKIP (wireless networks), SSL and more.
The RC4 Algorithm
The algorithm consists of two parts:
Key Scheduling Algorithm (KSA)
Pseudo-Random Generation Algorithm (PRGA)
The PRGA
K is the next keystream byte; it is XORed with the next plaintext byte to produce the next ciphertext byte.
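As a minimal sketch, both stages can be written in a few lines of Python (illustrative only; function names are mine, and a real implementation would stream rather than buffer):

```python
def ksa(key: bytes) -> list:
    """Key Scheduling Algorithm: derive the initial permutation S from the key."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]              # swap
    return S

def prga(S: list, n: int) -> bytes:
    """Pseudo-Random Generation Algorithm: produce n keystream bytes Z_1..Z_n."""
    S = S.copy()
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]              # swap
        out.append(S[(S[i] + S[j]) % 256])   # next keystream byte K
    return bytes(out)

def rc4_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Ciphertext = plaintext XOR keystream, byte by byte."""
    ks = prga(ksa(key), len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, ks))
```

Because encryption is a pure XOR against the keystream, encrypting twice with the same key recovers the plaintext.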
Biases in the Keystream
Empirical keystream distributions, obtained from 2^44 random 16-byte keys [AlFardan et al].
(Figure: per-position byte distributions; the line at 1/256 marks the uniform probability.)
Pr[Z_2 = 0] ≈ 2 · 2^(−8)  [Mantin & Shamir]
Pr[Z_1 = Z_2 = 0] ≈ 3 · 2^(−16)  [Isobe et al]
(Figure: empirical distributions; the line at 1/256 marks the uniform probability.)
At positions further along the stream, the biases weaken…
(Figure: distributions approaching the uniform line at 1/256.)
WPA-TKIP
Interim solution to replace WEP.
The TKIP per-packet key (16 bytes) is derived by key mixing from:
Temporal shared key (16 bytes)
TSC (packet sequence counter, 6 bytes)
Transmitter MAC address (6 bytes)
The first bytes of the resulting per-packet RC4 key are:
K_0 = TSC_1
K_1 = (TSC_1 | 0x20) & 0x7F
K_2 = TSC_0
This weakens security: TSC-dependent (strong) biases in the keystream [Paterson et al].
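The TSC-dependent structure above is easy to reproduce. A sketch of just the first three per-packet key bytes (the remaining 13 bytes come from the TK-dependent key-mixing function, omitted here; the masking of K_1 was intended to avoid WEP's weak-key classes):

```python
def tkip_key_prefix(tsc0: int, tsc1: int) -> tuple:
    """First three bytes of the TKIP per-packet RC4 key, as functions of
    the two low TSC bytes (the other 13 key bytes are TK-dependent)."""
    k0 = tsc1
    k1 = (tsc1 | 0x20) & 0x7F   # masking fixes bit 5 and clears bit 7
    k2 = tsc0
    return (k0, k1, k2)
```

Since K_0, K_1, K_2 depend only on the public TSC, every keystream distribution in the slides that follow is conditioned on a fixed (TSC_0, TSC_1) pair.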
TKIP TSC-dependent biases
(Figure: keystream distribution at position 1.)
(Figure: keystream distributions at positions 17 and 33, for (TSC_0, TSC_1) = (0x00, 0x00).)
Motivation
Find which byte positions in the stream are "good" for encryption (i.e. relatively close to the uniform distribution), versus "bad" positions (i.e. farther than some threshold from the uniform distribution).
Using the L1 distance as the measure.
Working on pairs of consecutive keystream bytes.
How many samples do we need to distinguish between these two cases?
The L1 measure
Let p, q be two (discrete) probability functions over the domain D; then the L1 distance between them is:
‖p − q‖_1 = Σ_{x∈D} |p(x) − q(x)|
In our case, p is one of the following 4 (joint) distributions of consecutive byte pairs (Z_r, Z_{r+1}), each byte in 0x00–0xFF:
(Z_1, Z_2)
(Z_100, Z_101)
(TK_1, TK_2), where (TSC_0, TSC_1) = (0x00, 0xFF)
(TK_32, TK_33), where (TSC_0, TSC_1) = (0x00, 0x00)
Each pair is encoded as an integer in the range [0, 2^16 − 1 = 65,535]; thus the domain size is N = 2^16.
q is the uniform distribution over [N]. Therefore:
‖p − U_N‖_1 = Σ_{i=0}^{2^16−1} |p(i) − 2^(−16)|
Estimating Closeness via Learning
‖p − U_N‖_1 = Σ_{i=0}^{2^16−1} |p(i) − 2^(−16)|
How do we find p(i) for all i ∈ [0, 2^16 − 1] ≜ [2^16]? We need a sample.
Computing p(i) exactly would require enumerating all 2^128 16-byte keys — infeasible, so we have to use approximate methods:
Draw S samples (x_1, x_2, …, x_S) according to p.
For each domain element i ∈ [2^16], count how many times it appears in the sample (denote this value by y_i).
Define p̂(i) ≜ y_i / S.
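The plug-in estimate can be sketched directly from these steps (illustrative Python; the function name is mine):

```python
from collections import Counter

def plugin_l1_to_uniform(samples, N):
    """Estimate ||p - U_N||_1 by the plug-in rule p_hat(i) = y_i / S."""
    S = len(samples)
    y = Counter(samples)                       # y_i: occurrence counts
    seen = sum(abs(c / S - 1 / N) for c in y.values())
    unseen = (N - len(y)) / N                  # elements with y_i = 0 each contribute 1/N
    return seen + unseen
```

A point mass on one element, for instance, is at distance 2(1 − 1/N) from uniform, and the estimator reports exactly that once the sample contains only that element.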
Theorem: For a sample size of S = O(N/ε²), the following holds with high probability:
‖p − p̂‖_1 ≤ ε
Corollary (due to the triangle inequality): If S = O(N/ε²), then:
max(0, ‖p̂ − U_N‖_1 − ε) ≤ ‖p − U_N‖_1 ≤ ‖p̂ − U_N‖_1 + ε
In our case: N = 2^16 and ε ≤ 2^(−9) (from our initial tests); therefore S ≥ 2^37.
Simulation results
For S = 2^37 (ε = 2^(−9)):

Distribution learned | ‖p̂ − U_N‖_1
(Z_1, Z_2) | 0.00784
(Z_100, Z_101) | 0.00073
(TK_1, TK_2) | 0.06117
(TK_32, TK_33) | 0.00481

Recall: max(0, ‖p̂ − U_N‖_1 − ε) ≤ ‖p − U_N‖_1 ≤ ‖p̂ − U_N‖_1 + ε. Therefore:
0 ≤ ‖(Z_100, Z_101) − U_2^16‖_1 ≤ 0.00268
0.00286 ≤ ‖(TK_32, TK_33) − U_2^16‖_1 ≤ 0.00676
0.00589 ≤ ‖(Z_1, Z_2) − U_2^16‖_1 ≤ 0.00979
0.05922 ≤ ‖(TK_1, TK_2) − U_2^16‖_1 ≤ 0.06312
Simulation results
For S = 2^38 (ε = 2^(−9.5)):

Distribution learned | ‖p̂ − U_N‖_1
(Z_1, Z_2) | 0.00784
(Z_100, Z_101) | 0.00073
(TK_1, TK_2) | 0.06117
(TK_32, TK_33) | 0.00479

Recall: max(0, ‖p̂ − U_N‖_1 − ε) ≤ ‖p − U_N‖_1 ≤ ‖p̂ − U_N‖_1 + ε. Therefore:
0 ≤ ‖(Z_100, Z_101) − U_2^16‖_1 ≤ 0.00268
0.00341 ≤ ‖(TK_32, TK_33) − U_2^16‖_1 ≤ 0.00617
0.00646 ≤ ‖(Z_1, Z_2) − U_2^16‖_1 ≤ 0.00922
0.05922 ≤ ‖(TK_1, TK_2) − U_2^16‖_1 ≤ 0.06312
Execution time: about 10 days (on a single CPU)!
Addressing the execution time
Option 1 — Distribute the work (e.g. 128 processors + threads). Drawbacks:
Requires a relatively large amount of resources.
Eventually draws the same (total) sample size.
Option 2 — Tolerant testing:
"Accept" if the L1 distance between the tested distribution and the uniform distribution is less than some predefined threshold ε_1.
"Reject" if the L1 distance is greater than another predefined threshold ε_2, such that 0 < ε_1 < ε_2.
In the general case, for a constant ε, S = Ω(N / log N) [Gregory and Paul Valiant].
Option 3 — Uniformity testing:
"Accept" if the tested distribution is the uniform distribution.
"Reject" if its L1 distance from uniform is greater than ε.
It is known that S = O(√N/ε²) is sufficient, and also necessary (S = Ω(√N/ε²)) [Paninski].
Paninski test
Important observation: the further a distribution is from the uniform distribution, the greater the number of collisions in its sample (and hence the fewer singletons).
The algorithm:
Draw S = O(√N/ε²) < N samples from the tested distribution p.
Count how many bins contain exactly one sample (denote this value by K_1).
If K_1 < some_threshold, "reject" (the hypothesis that p is the uniform distribution); otherwise, "accept".
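The steps above can be sketched as follows (the threshold is a placeholder of my choosing; the actual threshold is a function of ε):

```python
import math
from collections import Counter

def paninski_test(samples, N, slack=3.0):
    """Reject uniformity when the number of singletons K_1 falls well
    below its expectation under the uniform distribution."""
    S = len(samples)
    K1 = sum(1 for c in Counter(samples).values() if c == 1)
    expected = S * ((N - 1) / N) ** (S - 1)       # E_U[K_1]
    threshold = expected - slack * math.sqrt(S)   # placeholder threshold
    return "accept" if K1 >= threshold else "reject"
```

Note how the expectation formula here is the same E_U[K_1] used in the results table on the next slide.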
Paninski test results
Using a sample size of 60,000 < 2^16; 500 simulations.
E_U[K_1] = S · ((N−1)/N)^(S−1) ≅ 24,019

Distribution | Avg(K_1) | Std(K_1)
(TK_1, TK_2) | 23,846 | 126
(Z_1, Z_2) | 23,989 | 128
(TK_32, TK_33) | 24,017 | 116
(Z_100, Z_101) | 24,019 | 129
The Collision Tester
Counts the number of colliding pairs in the sample:
C_p = |{(i, j) : 1 ≤ i < j ≤ S, x_i = x_j}|
Used for estimating the collision probability.
Based on a similar observation as before: if C_p / (S choose 2) < some_threshold, "accept"; otherwise "reject".
Works also in the general case.
The sample-size complexity: S = O(√N/ε⁴) [Goldreich and Ron]; recently improved to O(√N/ε²) by [Diakonikolas et al].
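Counting colliding pairs does not require enumerating all O(S²) pairs: grouping equal samples gives C_p = Σ_i C(y_i, 2). A sketch (the acceptance slack factor is again a placeholder of my choosing):

```python
from collections import Counter

def colliding_pairs(samples):
    """C_p = |{(i, j) : i < j, x_i = x_j}| via per-element counts y_i."""
    return sum(c * (c - 1) // 2 for c in Counter(samples).values())

def collision_test(samples, N, slack=1.1):
    """Accept iff the empirical collision rate C_p / C(S,2) is close to
    the uniform collision rate 1/N."""
    S = len(samples)
    rate = colliding_pairs(samples) / (S * (S - 1) // 2)
    return "accept" if rate < slack / N else "reject"
```

Under the uniform distribution the expected collision rate is exactly 1/N, which is why the rate is compared against that baseline.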
The collision tester results
For S = 2^18, 100 simulations.
(Figure: collision counts per distribution; a zoomed-in view follows.)
For S = 2^20
For S = 2^22 — less than 25 minutes.
The Fingerprint
A fingerprint is a vector whose i-th entry denotes the number of domain elements that appear exactly i times in the sample.
It can also be described as the histogram of the histogram.
For example, the results of rolling a die 10 times: (1, 2, 1, 1, 5, 5, 2, 6, 1, 3)
The histogram of the results (over {1, 2, …, 6}): (4, 2, 1, 0, 2, 1)
The fingerprint obtained: (2, 2, 0, 1)
The fingerprint (of a sample) contains all the information (collision statistics) that is required for testing symmetric properties (such as the L1 distance from the uniform distribution).
In particular, the number of colliding pairs can be retrieved from the fingerprint:
C_p = Σ_{j=2}^{S} F(j) · (j choose 2)
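Both the fingerprint and the recovery of C_p from it fit in a few lines (a sketch; a Counter-of-a-Counter is one convenient representation):

```python
from collections import Counter

def fingerprint(samples):
    """F, where F[i] = number of domain elements seen exactly i times."""
    histogram = Counter(samples)          # element -> occurrence count
    return Counter(histogram.values())    # the histogram of the histogram

def collisions_from_fingerprint(F):
    """Recover the number of colliding pairs: C_p = sum_{j>=2} F(j) * C(j, 2)."""
    return sum(f * j * (j - 1) // 2 for j, f in F.items())
```

On the die example above, the rolls (1, 2, 1, 1, 5, 5, 2, 6, 1, 3) give F(1)=2, F(2)=2, F(4)=1, hence C_p = 2·0 + 2·1 + 1·6 = 8.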
Comparing the fingerprints
Using a sample size of S = 2^21; 100 simulations.
(Figure: fingerprint of (Z_1, Z_2); Pr[Z_2 = 0] ≅ 2/256.)
(Figure: fingerprint of (TK_1, TK_2); Pr[TK_1 = 128] ≅ 7.5/256.)
Conclusion
Learning the L1 distance between our 4 tested distributions and the uniform distribution requires about 2^38 samples (about 10 days on a single CPU).
Using the collision tester, we managed to distinguish between all 4 distributions even with a sample size of 2^22 (less than 25 minutes).
The collision tester can be applied in other settings as well (not only in the RC4 context).
Questions?