Estimating Closeness to the Uniform Distribution on RC4 Keystream Bytes using Property Testing
By Eliezer Yucht
Prepared under the supervision of Prof. Dana Ron
Project presentation, 8 February 2017
Tel Aviv University, Faculty of Engineering

Agenda
- Introduction + Background: RC4, WPA-TKIP and the $L_1$ measure
- Estimating Closeness via Learning
- Uniformity testing
  - Paninski test
  - The collision tester
- Comparing the fingerprints
- Conclusion

The RC4 cipher
- RC4 is a stream cipher that was designed by Ron Rivest in 1987.
- It is very fast and simple in both hardware and software.
- It is used in many systems and protocols: WEP, WPA-TKIP (wireless networks), SSL and more.

The RC4 Algorithm
The algorithm consists of 2 parts:
- Key Scheduling Algorithm (KSA)
- Pseudo Random Generation Algorithm (PRGA)
[Figure: The KSA]

The PRGA
$K$ is the next keystream byte; it is XORed with the next plaintext byte to produce the next ciphertext byte.
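The two parts of RC4 named above can be sketched in a few lines of Python. This is the textbook form of the cipher, not code taken from the project; the function names are chosen here for illustration.

```python
def rc4_ksa(key: bytes) -> list:
    """Key Scheduling Algorithm: shuffle the identity permutation using the key."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    return state


def rc4_prga(state: list, n: int) -> bytes:
    """Pseudo Random Generation Algorithm: emit n keystream bytes."""
    state = state.copy()  # do not modify the caller's permutation
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        # This output byte is the keystream byte K from the slide.
        out.append(state[(state[i] + state[j]) % 256])
    return bytes(out)


def rc4_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """XOR each plaintext byte with the next keystream byte."""
    keystream = rc4_prga(rc4_ksa(key), len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, keystream))
```

Because encryption is a plain XOR with the keystream, calling `rc4_encrypt` again with the same key decrypts the ciphertext.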

Biases in the Keystream
[Figure: empirical keystream distributions (obtained with 16-byte keys) [AlFardan et al]; axis label: $1/256$]

$\Pr[Z_2 = 0] \approx 2 \cdot 2^{-8}$ [Mantin & Shamir]
$\Pr[Z_1 = Z_2 = 0] \approx 3 \cdot 2^{-16}$ [Isobe et al]
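The bias of the second keystream byte toward 0 is large enough to observe with a small experiment. The sketch below assumes random 16-byte keys and a fixed seed; the helper names and the number of keys are choices made here for illustration, not the setup used in the project.

```python
import random


def rc4_keystream(key: bytes, n: int) -> bytes:
    """Textbook RC4 (KSA followed by PRGA), returning the first n keystream bytes."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(state[(state[i] + state[j]) % 256])
    return bytes(out)


def estimate_z2_zero(num_keys: int, seed: int = 0) -> float:
    """Empirical frequency of Z_2 == 0 over num_keys random 16-byte keys."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_keys):
        key = bytes(rng.getrandbits(8) for _ in range(16))
        if rc4_keystream(key, 2)[1] == 0:  # index 1 is the second byte, Z_2
            hits += 1
    return hits / num_keys


if __name__ == "__main__":
    freq = estimate_z2_zero(40_000)
    print(f"Pr[Z_2 = 0] ~ {freq:.5f}  (uniform would give {1 / 256:.5f})")
```

With a few tens of thousands of keys the estimate should land near $2/256$ rather than $1/256$, which is the single-byte bias reported by Mantin & Shamir.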

At later keystream positions, the biases become weaker…
[Figure axis label: $1/256$]

WPA-TKIP
- Interim solution to replace WEP
- TKIP per-packet key: the Temp Shared Key (16 bytes), the TSC (6 bytes) and the Transmitter MAC address (6 bytes) are fed into the key mix, which outputs a 16-byte per-packet key
- The first three bytes of the per-packet key are set from the TSC: $K_0 = TSC_1$, $K_1 = (TSC_1 \,|\, \texttt{0x20}) \,\&\, \texttt{0x7F}$, $K_2 = TSC_0$
- Weakens security: TSC-dependent (strong) biases in the keystream [Paterson et al]
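The three TSC-derived key bytes can be written down directly. This sketch covers only $K_0$, $K_1$, $K_2$; the remaining 13 bytes come from the TKIP key mixing function, which is not reproduced here. The function name is hypothetical.

```python
def tkip_first_key_bytes(tsc0: int, tsc1: int) -> bytes:
    """Return (K0, K1, K2) of the TKIP per-packet RC4 key from the two low TSC bytes."""
    k0 = tsc1
    k1 = (tsc1 | 0x20) & 0x7F  # bit 5 forced to 1, bit 7 forced to 0
    k2 = tsc0
    return bytes([k0, k1, k2])


if __name__ == "__main__":
    # The two TSC settings used later in the presentation.
    print(tkip_first_key_bytes(0x00, 0x00).hex())
    print(tkip_first_key_bytes(0x00, 0xFF).hex())
```

Since these three bytes are public functions of the TSC, the RC4 key has a known, structured prefix for every packet, which is what makes the keystream biases TSC-dependent.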

TKIP TSC-dependent biases
[Figure: keystream distribution at position 1]

[Figure: keystream distribution at positions 17 and 33, for $(TSC_0, TSC_1) = (\texttt{0x00}, \texttt{0x00})$]

Motivation
- Find which byte locations in the stream are "good" for encryption (i.e. relatively "close" to the uniform distribution), versus "bad" bytes (i.e. farther than some threshold from the uniform distribution).
  - Using the $L_1$ distance as the measure
  - Working on pairs of consecutive keystream bytes
- How many samples do we need to distinguish between the above two cases?

The $L_1$ measure
Let $p, q$ be two (discrete) probability functions over the domain $D$; then the $L_1$ distance between them is:
$\|p, q\|_1 = \sum_{x \in D} |p(x) - q(x)|$
In our case, $p(x)$ is one of the following 4 (joint) distributions:
- $(Z_1, Z_2)$
- $(Z_{100}, Z_{101})$
- $(TK_1, TK_2)$ where $(TSC_0, TSC_1) = (\texttt{0x00}, \texttt{0xFF})$
- $(TK_{32}, TK_{33})$ where $(TSC_0, TSC_1) = (\texttt{0x00}, \texttt{0x00})$
Thus the domain size is $N = 2^{16}$, and $q(x)$ is the uniform distribution over the $N$ elements. Therefore:
$\|p, U_N\|_1 = \sum_{i=0}^{2^{16}-1} |p_i - 2^{-16}|$
Each sample is a pair of bytes $(Z_r, Z_{r+1})$, each in 0x00-0xFF, so the pair takes values in the range $[0, 2^{16} - 1 = 65,535]$.
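As a small illustration of the definition, the following sketch computes the $L_1$ distance of a probability vector from the uniform distribution. The pair encoding (first byte as the high byte) is an assumption made here; the slides only state that the pair is mapped to the range $[0, 2^{16} - 1]$.

```python
def pair_index(z_r: int, z_r_plus_1: int) -> int:
    """Map a pair of consecutive keystream bytes to an index in [0, 2**16 - 1].

    Assumption: the first byte is the high byte of the index.
    """
    return (z_r << 8) | z_r_plus_1


def l1_distance_from_uniform(p) -> float:
    """Sum over the domain of |p_i - 1/N|, where N = len(p)."""
    n = len(p)
    return sum(abs(p_i - 1.0 / n) for p_i in p)


if __name__ == "__main__":
    print(l1_distance_from_uniform([0.25, 0.25, 0.25, 0.25]))  # uniform input
    print(l1_distance_from_uniform([1.0, 0.0, 0.0, 0.0]))      # point mass
```

The distance is 0 only for the uniform distribution and approaches 2 for a point mass on a large domain.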

Estimating Closeness via Learning
$\|p, U_N\|_1 = \sum_{i=0}^{2^{16}-1} |p_i - 2^{-16}|$
How do we find $p_i$ for all $i \in \{0, \dots, 2^{16}-1\} \triangleq [2^{16}]$?
- Need a sample
- Computing $p_i$ accurately needs an infeasible number of samples
- Have to use approximate methods:
  - Draw $S$ samples $(x_1, x_2, \dots, x_S)$ according to $p$
  - For each domain element $i \in [2^{16}]$, count how many times it appeared in the sample (denote this value by $y_i$)
  - $\hat{p}_i \triangleq y_i / S$
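The learning step above amounts to building a normalized histogram of the sample. A minimal sketch, with a hypothetical function name and a toy domain of size 4 instead of $2^{16}$:

```python
from collections import Counter


def learn_distribution(samples, n: int) -> list:
    """Empirical distribution: p_hat[i] = y_i / S, where y_i counts occurrences of i."""
    counts = Counter(samples)
    s = len(samples)
    return [counts.get(i, 0) / s for i in range(n)]


if __name__ == "__main__":
    p_hat = learn_distribution([0, 0, 1, 3], n=4)
    print(p_hat)
```

For the real experiment the same counting would run over $S$ keystream pairs, each drawn from a fresh RC4 key, with `n = 2**16`.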

Theorem: For a sample size of $S = O(N / \epsilon^2)$, the following holds with high probability:
$\|p, \hat{p}\|_1 \le \epsilon$
Corollary (due to the triangle inequality): If $S = O(N / \epsilon^2)$, then:
$\max(0, \|\hat{p}, U_N\|_1 - \epsilon) \le \|p, U_N\|_1 \le \|\hat{p}, U_N\|_1 + \epsilon$
In our case: $N = 2^{16}$ and $\epsilon \le 2^{-9}$ (from our initial tests). Therefore, $S \ge 2^{37}$.
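The arithmetic behind the sample size, and the bounds given by the corollary, can be checked with a short sketch. Note that $N / \epsilon^2 = 2^{16} \cdot 2^{18} = 2^{34}$, so the stated $S \ge 2^{37}$ corresponds to a constant of $8$ inside the $O(\cdot)$; that constant is inferred here from the slide's numbers and is not stated in the source.

```python
def sample_size(n: int, eps: float, constant: float = 1.0) -> float:
    """constant * N / eps^2; the constant is an assumption, see the note above."""
    return constant * n / eps ** 2


def distance_bounds(d_hat: float, eps: float) -> tuple:
    """Corollary: max(0, d_hat - eps) <= ||p, U_N||_1 <= d_hat + eps."""
    return max(0.0, d_hat - eps), d_hat + eps


if __name__ == "__main__":
    n, eps = 2 ** 16, 2 ** -9
    print(sample_size(n, eps) == 2 ** 34)              # bare N / eps^2
    print(sample_size(n, eps, constant=8) == 2 ** 37)  # matches the slide
    print(distance_bounds(0.001, eps))                 # lower bound clips to 0
```

The same constant also reproduces the second run: with $\epsilon = 2^{-9.5}$, $8N/\epsilon^2 = 2^{38}$.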

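Simulation results
For $S = 2^{37}$ ($\epsilon = 2^{-9}$)
Recall: $\max(0, \|\hat{p}, U_N\|_1 - \epsilon) \le \|p, U_N\|_1 \le \|\hat{p}, U_N\|_1 + \epsilon$
Therefore, each learned distribution gets a lower and an upper bound on its distance from $U$; for $(Z_{100}, Z_{101})$ the lower bound is 0.
[Table: $\|\hat{p}, U_N\|_1$ for each distribution learned: $(Z_1, Z_2)$, $(Z_{100}, Z_{101})$, $(TK_1, TK_2)$, $(TK_{32}, TK_{33})$; the resulting bounds are listed in the order $(Z_{100}, Z_{101})$, $(TK_{32}, TK_{33})$, $(Z_1, Z_2)$, $(TK_1, TK_2)$]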

Simulation results
For $S = 2^{38}$ ($\epsilon = 2^{-9.5}$)
Recall: $\max(0, \|\hat{p}, U_N\|_1 - \epsilon) \le \|p, U_N\|_1 \le \|\hat{p}, U_N\|_1 + \epsilon$
Therefore, each learned distribution gets a lower and an upper bound on its distance from $U$; for $(Z_{100}, Z_{101})$ the lower bound is 0.
[Table: $\|\hat{p}, U_N\|_1$ for each distribution learned: $(Z_1, Z_2)$, $(Z_{100}, Z_{101})$, $(TK_1, TK_2)$, $(TK_{32}, TK_{33})$; the resulting bounds are listed in the order $(Z_{100}, Z_{101})$, $(TK_{32}, TK_{33})$, $(Z_1, Z_2)$, $(TK_1, TK_2)$]
Execution time of about 10 days! (on a single CPU)

Addressing execution time
- Distributed network
  - For example, 128 processors + threads
  - Drawbacks: requires a relatively large amount of resources; eventually the same (total) sample size
- Tolerant test
  - "Accept" if the $L_1$ distance between the tested distribution and the uniform distribution is less than some predefined threshold $\epsilon_1$.
  - "Reject" if the $L_1$ distance is greater than another predefined threshold $\epsilon_2$, such that $0 < \epsilon_1 < \epsilon_2$.
  - In the general case, for a constant $\epsilon$, $S = \Omega(N / \log N)$ [Gregory and Paul Valiant]
- Uniformity testing
  - "Accept" if the tested distribution is the uniform distribution.
  - "Reject" if its $L_1$ distance from the uniform distribution is greater than $\epsilon$.
  - It is known that $S = O(\sqrt{N} / \epsilon^2)$ is sufficient, but also required ($S = \Omega(\sqrt{N} / \epsilon^2)$) [Paninski]

Paninski test
Important observation: the further a distribution is from the uniform distribution, the greater the number of collisions that will occur in its sample.
The algorithm:
1. Draw $S = O(\sqrt{N} / \epsilon^2) < N$ samples from the tested distribution $p$.
2. Count how many bins have exactly one sample in them (denote this value by $K_1$).
3. If $K_1 <$ "some_threshold", "reject" (the hypothesis that $p$ is the uniform distribution); otherwise, "accept".
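The three steps of the test translate almost directly into code. In this sketch the threshold is passed in by the caller, since the slides leave it as "some_threshold"; the function names are chosen here for illustration.

```python
from collections import Counter


def count_singletons(samples) -> int:
    """K_1: the number of bins (domain elements) hit by exactly one sample."""
    counts = Counter(samples)
    return sum(1 for c in counts.values() if c == 1)


def paninski_test(samples, threshold: float) -> str:
    """Reject uniformity when too few bins contain exactly one sample."""
    return "reject" if count_singletons(samples) < threshold else "accept"


if __name__ == "__main__":
    sample = [1, 2, 1, 1, 5, 5, 2, 6, 1, 3]
    print(count_singletons(sample))
    print(paninski_test(sample, threshold=2))
```

More collisions mean fewer singleton bins, so a low $K_1$ is evidence against uniformity.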

Paninski test results
Using a sample size of $S = 60,000 < 2^{16}$, 500 simulations
$E_U[K_1] = S \cdot \left(\frac{N-1}{N}\right)^{S-1} \cong 24,019$

Distribution            Avg($K_1$)   Std($K_1$)
$(TK_1, TK_2)$          23,846       126
$(Z_1, Z_2)$            23,989       128
$(TK_{32}, TK_{33})$    24,017       116
$(Z_{100}, Z_{101})$    24,019       129
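The expected value of $K_1$ under the uniform distribution follows from linearity of expectation: each of the $S$ samples is alone in its bin exactly when the other $S - 1$ samples miss that bin, which happens with probability $((N-1)/N)^{S-1}$. The sketch below evaluates the formula and compares it with one simulated uniform sample; the seed and helper names are choices made here.

```python
import random
from collections import Counter


def expected_singletons_uniform(s: int, n: int) -> float:
    """E_U[K_1] = S * ((N - 1) / N) ** (S - 1)."""
    return s * ((n - 1) / n) ** (s - 1)


def simulate_singletons_uniform(s: int, n: int, seed: int = 0) -> int:
    """Draw s uniform samples over n bins and count bins hit exactly once."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(n) for _ in range(s))
    return sum(1 for c in counts.values() if c == 1)


if __name__ == "__main__":
    s, n = 60_000, 2 ** 16
    print(f"expected : {expected_singletons_uniform(s, n):.1f}")
    print(f"simulated: {simulate_singletons_uniform(s, n)}")
```

A single simulated run should fall within a few hundred of the expectation, in line with the standard deviations of roughly 120 to 130 reported in the table.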

The Collision Tester
- Counts the number of colliding pairs in the sample: $C_p = |\{(i, j) : 1 \le i < j \le S,\ x_i = x_j\}|$
- Used for estimating the collision probability.
- Based on a similar observation as before: if $C_p / \binom{S}{2} <$ "some_threshold", "accept"; otherwise "reject".
- Works also in the general case.
- The sample size complexity: $S = O(\sqrt{N} / \epsilon^4)$ [Goldreich and Ron]; recently $O(\sqrt{N} / \epsilon^2)$ by [Diakonikolas et al]
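A minimal sketch of the collision tester follows. The slides leave the threshold unspecified; the value $(1 + \epsilon^2/2)/N$ used here is an assumption, chosen as the midpoint between the collision probability of the uniform distribution ($1/N$) and the lower bound $(1 + \epsilon^2)/N$ that holds for any distribution at $L_1$ distance more than $\epsilon$ from uniform.

```python
from collections import Counter
from math import comb


def colliding_pairs(samples) -> int:
    """C_p: number of index pairs i < j with x_i == x_j."""
    return sum(comb(c, 2) for c in Counter(samples).values())


def collision_test(samples, n: int, eps: float) -> str:
    """Accept uniformity if the estimated collision probability is small enough."""
    s = len(samples)
    estimate = colliding_pairs(samples) / comb(s, 2)
    threshold = (1 + eps ** 2 / 2) / n  # assumed threshold, not from the slides
    return "accept" if estimate < threshold else "reject"


if __name__ == "__main__":
    sample = [1, 2, 1, 1, 5, 5, 2, 6, 1, 3]
    print(colliding_pairs(sample))
    print(collision_test(sample, n=6, eps=0.5))
```

Counting pairs per bin with `comb(c, 2)` avoids the quadratic loop over all sample pairs, which matters for sample sizes such as $2^{22}$.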

The collision tester results
[Figure: results for $S = 2^{18}$, 100 simulations, followed by a zoomed-in view]

[Figure: results for $S = 2^{20}$]

[Figure: results for $S = 2^{22}$]
Less than 25 minutes

The Fingerprint
- A fingerprint is a vector whose $i$th entry denotes the number of domain elements that appear exactly $i$ times in the sample.
- It can also be described as the histogram of the histogram.
- For example:
  - Results of rolling a die 10 times: (1,2,1,1,5,5,2,6,1,3)
  - The histogram that depicts the results (over {1,2,…,6}): (4,2,1,0,2,1)
  - The fingerprint obtained: (2,2,0,1)
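The "histogram of the histogram" description maps onto two nested counting passes. A short sketch, using the die example from the slide (the function name is chosen here):

```python
from collections import Counter


def fingerprint(samples) -> list:
    """Entry i (1-based) is the number of domain elements seen exactly i times."""
    histogram = Counter(samples)                 # element -> number of occurrences
    counts_of_counts = Counter(histogram.values())
    max_count = max(counts_of_counts, default=0)
    return [counts_of_counts.get(i, 0) for i in range(1, max_count + 1)]


if __name__ == "__main__":
    rolls = [1, 2, 1, 1, 5, 5, 2, 6, 1, 3]
    print(fingerprint(rolls))  # [2, 2, 0, 1]
```

Here 3 and 6 appear once, 2 and 5 appear twice, nothing appears three times, and 1 appears four times.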

The fingerprint (of a sample) contains all the information (collision statistics) that is required for testing symmetric properties (such as the $L_1$ distance from the uniform distribution).
In particular, the number of colliding pairs can be retrieved from the fingerprint:
$C_p = \sum_{j=2}^{S} F(j) \cdot \binom{j}{2}$
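The formula can be checked against a direct pair count on the die example from the previous slide. In this sketch the fingerprint is a list whose index 0 holds $F(1)$; the names are chosen here for illustration.

```python
from collections import Counter
from math import comb


def colliding_pairs_from_fingerprint(fp) -> int:
    """C_p = sum over j of F(j) * C(j, 2); the j = 1 term contributes 0."""
    return sum(f * comb(j, 2) for j, f in enumerate(fp, start=1))


def colliding_pairs_direct(samples) -> int:
    """Count pairs i < j with equal values, one bin at a time."""
    return sum(comb(c, 2) for c in Counter(samples).values())


if __name__ == "__main__":
    rolls = [1, 2, 1, 1, 5, 5, 2, 6, 1, 3]
    print(colliding_pairs_from_fingerprint([2, 2, 0, 1]))  # 2*1 + 0*3 + 1*6 = 8
    print(colliding_pairs_direct(rolls))                   # 8
```

Both routes give 8 colliding pairs: six from the four 1s, and one each from the two 2s and the two 5s.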

Comparing the fingerprints
Using a sample size of $S = 2^{21}$, 100 simulations


$\Pr[Z_2 = 0] \cong 2/256$

$\Pr[TK_1 = 128] \cong$

Conclusion
- Learning the $L_1$ distance between our 4 tested distributions and the uniform distribution requires about $2^{38}$ samples (about 10 days on a single CPU).
- Using the collision tester, we managed to distinguish between all 4 distributions even with a sample size of $2^{22}$ samples (less than 25 minutes).
- The collision tester can also be applied in other applications (not only in the RC4 context).

Questions?
