1 Introduction to Practical Cryptography Lectures 3/4 Stream Ciphers
2 Agenda Properties Building Blocks Competitions Examples
3 Uses Encryption of streaming data Random bit generation
4 Stream Ciphers Stream cipher outputs keystream, KS KS produced by a function, F, that is initialized with a key, k C = E k (P) = P KS P = C KS k can be used only once C1 = E k1 (P1); C2 = E k2 (P2) C1 C2 = P1 KS1 P2 KS2 = P1 P2 if KS1 = KS2 Will know when P1 and P2 have identical bits If know part of P1 (if packet headers, format information), then can obtain part of P2 Period – how long is KS before it starts repeating? repeating is equivalent to reusing a key
5 Stream Ciphers Speed Initialization Keystream generation Resources – memory, power, cpu Hardware, software suitability
6 Stream Ciphers Synchronous stream cipher Sender and receiver must be in-synch Lost bit garbles all subsequent bits unless synch up Flipped bit garbles only one bit Can precompute key stream Example: RC4, block cipher in OFB mode Self-synchronizing stream ciphers Use n previous ciphertext bits to compute keystream Lost bit: synch up after n bits Flipped bit : next n bits garbled Can’t precompute keystream Example: Block cipher in ciphertext feedback (CFB) mode
7 Stream Ciphers – General Concept State Updates FSR based (SOBER, LILI) Array Permutations (RC4) key state (data) output function p i (c i ) c i (p i ) ks i next state function synchronous
8 Stream Ciphers – General Concept key state (data) output function p i (c i ) cici ks i next state function subset of c i ’s error propagation block cipher in CFB mode self synchronizing
9 Keystream Properties Period Period of 2 32 repeats after ~ 8.5 minutes when encrypting 1MB/sec Random appearance: Runs of 1’s or 0’s: ½ with length 1, ¼ with length 2, 1 / 8 have length 3 … Test – little or no compression Dissipates statistics of plaintext Complexity: Low ability to define a bit as a linear expression (or algebraic expression) of bits < period bits away No discernable relation to key (seed/initial state) bits
10 Agenda Properties Building Blocks Competitions Examples
11 Stream Ciphers - Approaches Feedback Shift Register (FSR) based – useful in hardware Block cipher – CTR, CFB, OFB modes Components similar to those found in block ciphers
12 Feedback Shift Register b n-1 b n-2 b1b1 b0b0 …… b0b0 F(b n-1,…..b 0 ) …… new b n-1 Linear F: b n-1 = i b i for i {0,1} i=0,n-1 Nonlinear F Tap Sequence: bits used in F Feedback with Carry Shift (FCSR) F: s = ( i b i + c) for i {0,1} i=0,n-1 b n-1 = s mod 2 c = s/2 mod log 2 (# tap bits) State: b i values bits, same concept with bytes, words
13 Period LFSR of n bits: Maximum 2 n –1 FCSR: depends on initial state Non-linear FSR: depends on function, initial state Inefficient in Software Small # of bits in tap sequence, easier to break. Large # of bits in tap sequence, slow. Security Berlekamp-Massey Algorithm: 2n output bits needed to reproduce the LFSR in O(n 2 ) time. Non-linear FSR: avoid linear approximations Feedback Shift Registers
14 Variations Utilizing LFSR Combination generator Output bit = nonlinear function on output of multiple LFSRs. May clock each LFSR differently Various combinations of AND,OR,Thresholds LSFR1 LSFR2 LSFRn nonlinear function keystream
15 Variations Utilizing LFSR Clock controlled generator Move to next state only on some clock cycles. Move to next state on every cycle but only output bit on some clock cycles. 2 nd LFSR may control clock. Clock control that affects output is also called stuttering
16 Clock Control Examples Stop and Go: 2 LFSRs LFSR1’s output clocks LFSR2 Alternating Stop and Go: 3 LFSRs output of LFSR1 indicates whether to clock LFSR2 or LFSR3 output is of LFSR’s 2 and 3 Bilateral Stop and Go: 2 LFSRs output = of both outputs clock LSRFs depending on their output values Self-Decimated Generators: control their own clock – some function of their state bits controls clock
17 Clock Control Examples Shrinking Generator: 2 LFSRs always clock if LFSR1 outputs 1, use LFSR2’s output, else no output on that cycle called “shrinking” because fewer output bits than clock ticks Self Shrinking Generator: similar to shrinking generator but use 2 different bits from 1 LSFR instead of 2 LFSRs Cascade: output of 1 st level (may be single or combination of generators) controls clock of next level usually not secure due to some relationship between 1 st level output and final output.
18 Agenda Properties Building Blocks Competitions Examples
19 NESSIE Stream Cipher Submissions None recommended BMGL – too slow, small internal state – time/memory tradeoff attack Leviathan - distinguishing attack LILI-128 – attack O(2 71 ) SNOW – distinguishing attack SOBER-t16 – distinguishing attack SOBER-t32 – distinguishing attack Both Sober algorithms thought to be subject to side channel analysis
20 ECRYPT’s eStream Contest Just ended (3 rd round of evaluations finished, winners selected) 4 for software, 4 for hardware In third round of evaluations 16 candidates 3+ years from time of call for proposals to final report originally November 2004 to January 2008 Just ended ECRYPT: European Network of Excellence for Cryptology
21 eStream Overview Categories key length of 128 bits and an IV length of 64 and/or 128 bits key length of 80 bits and an IV length of 32 and/or 64 bits Separate software and hardware categories within each Evaluation Security Free of licensing requirements … Performance, range of environments Committee is only collecting submissions. Evaluations are done by the general cryptographic community.
22 eStream Evaluation Security Criteria Any key-recovery attack should be at least as difficult as exhaustive search. Distinguishing attacks Interest to the cryptographic community Relative importance of high complexity distinguishing attacks is an issue for wider discussion Clarity of design Implementation Criteria Software and hardware efficiency Execution code and memory sizes Performance Flexibility of use
23 eSTREAM Phase 3 Candidates Profile 1 (SW)Profile 2 (HW) CryptMT (CryptMT Version 3)DECIM (DECIM v2 and DECIM-128) DragonEdon80 HC (HC-128 and HC-256)F-FCSR (F-FCSR-H v2 and F-FCSR-16) LEX (LEX-128, LEX-192 and LEX-256)Grain (Grain v1 and Grain-128) NLS (NLSv2, encryption-only)MICKEY (MICKEY 2.0 and MICKEY ) RabbitMoustique Salsa20Pomaranch (Pomaranch Version 3) SOSEMANUKTrivium key lengths: 128 bits for SW and 80 bits for HW
24 eSTREAM Winners Profile 1 (SW)Profile 2 (HW) HC (HC-128 and HC-256)F-FCSR (F-FCSR-H v2 and F-FCSR-16) RabbitGrain (Grain v1 and Grain-128) Salsa20MICKEY (MICKEY 2.0 and MICKEY ) SOSEMANUKTrivium key lengths: 128 bits for SW and 80 bits for HW
25 Agenda Properties Building Blocks Competitions Examples
26 Stream Cipher Examples Lists RC4 A5/1 A5/3 LILI Sober Trivium Lex
27 RC4 S-Box Creation input key; if (key < 256 bytes) { repeat key until 256 bytes; } for (i=0; i < 256; ++i) { S[i] = i; // initialize S-Box K[i] = i th key byte; } j = 0; for (i = 0; i <256; ++i) { j = (j + S[i] + K[i]) mod 256; swap(S[i],S[j]); } Keystream Generator i = 0; j = 0; loop { i = (i+1) mod 256; j = (j+S[i]) mod 256; Swap(S[i],S[j]); t = (S[i] + S[j]) mod 256; ks_byte = S[t]; } 2 S-Box entries form index into S-Box Output S-Box entry (byte) S-Box: key dependent permutation of 0 to 255. (lookup table)
28 RC4 Cryptanalysis Initial keystream byte highly correlated with first few key bytes Recommendations to discard first 256 or 512 output bytes Distinguish from random: O( ) bytes needed Attempts to backtrack to initial state from keystream Keystream Generator i = 0; j = 0; loop { i = (i+1) mod 256; j = (j+S[i]) mod 256; Swap(S[i],S[j]); t = (S[i] + S[j]) mod 256; ks_byte = S[t]; }
29 A5/1 Used in Global System for Mobil Communications (GSM) Example of a cipher manufacturers tried to keep secret, it was leaked and also reversed engineered within 5 years A5/2 – weaker cipher used in some countries due to export rules GSM phone conversations are sent as sequences of frames. One 228 bit frame is sent every 4.6 milliseconds: 114 bits for the communication in each direction. A5/1 produces 228 bits to XOR with the frame Initialized using a 64-bit key combined with a publicly-known 22-bit frame number. In some GSM implementations, 10 key bits are fixed at zero - effective key length is 54 bits. A5/1 is based around a combination of three LFSRs with irregular clocking.
30 A5/1 Image from Wikipedia
31 A5/1 LSRFs 19 bits x 19 + x 5 + x 2 + x + 1 clock bit 8 tapped bits: 13, 16, 17, bits x 22 + x + 1 clock bit 10 tapped bits 20, bits x 23 + x 15 + x 2 + x + 1 clock bit 10 tapped bits 7, 20, 21, 22 Least significant bit numbered 0 Tapped bits of each LSRF are XORed to create value of next 0 bit. Output bits of the three LSRFs are XORed to form the keystream bit
32 A5/1 Each cycle, look at the three clock bits. The majority value, c m, is determined. In each LSRF, if the clock bit matches c m, the registers are clocked. In each cycle, 2 or 3 LSRFs will be clocked.
33 A5/1 Initialization Registers set to all 0’s Incorporate the key and frame number: For 64 cycles, the key is mixed in by XORing the ith key bit with the least significant bit of each register For 22 cycles, the 22 bit frame value is mixed in – same as with key value Normal clocking used 100 cycles are run using the majority clocking, the output is discarded End result is the initial state
34 A5/1 Three short LSFRs Not many tap bits to guess
35 A5/3 Core BLCNT is a 64 bit counter KM = 0x555….555 (128 bit key modifier) CK = key bits CBC XORed with counter and key A counter previous output defined on next slide
36 A5/3 GSM K c = key
37 LILI LFSR C LFSR D FcFc FDFD bit for keystream. c s1s1 s2s2 sksk … s1s1 s2s2 snsn … clocking function integer output Irregular: clocked c times Regularly clocked b non-linear function Family of keystream generators
38 SOBER - Original LFSR S 17 = 141 S 15 175 S 0 LFSR: 17 bytes, 128 bit key Nonlinear transformation V n = (S 0 + S 2 + S 5 + S 12 ) (S 12 S 13 ) Stutter Control Output function (sc) 00: No output 01: Vn : Vn 11: Vn output byte sc VnVn Clock Control VnVn sc = next 2 bits of byte need byte: clock, take V n Clock Control (sc) 00: 1 clock 01: clock, output, clock 10: 2 clocks 11: 1 clock S i ’s
39 Sober t{8,16,32} 8,16,32 = byte size of key
40 Sober-t LSFR for Sober-tw w = 8,16,32 Max period: 2 13w -1
41 Non-linear Function LFSR output input to non-linear function f w, output added to subset of LSFRs bits, XORed with key-dependent value, this result then added to subset of LSRF bits
42 Trivium Christophe De Canniere and Bart Preneel hardware oriented synchronous stream cipher Trivium generates up to 2 64 bits of key stream from an 80-bit secret key and an 80-bit initial value (IV). Internal state is 288 bits
43 Trivium IV and key used to initialize the state Iterate state –extract values of 15 specific state bits and use them to update 3 bits of the state and to compute 1 bit of the key stream zi. –state bits then rotated and process repeats
44 Trivium Key Stream Generation for i = 1 to N do t1 s66 s93 t2 s162 s177 t3 s243 s288 zi t1 t2 t3 t1 t1 s91 s92 s171 t2 t2 s175 s176 s264 t3 t3 s286 s287 s69 (s1; s2; : : : ; s93) (t3; s1; : : : ; s92) (s94; s95; : : : ; s177) (t1; s94; : : : ; s176) (s178; s279; : : : ; s288) (t2; s178; : : : ; s287) end for
45 Trivium Initialization load 80-bit key and 80-bit IV into 288-bit initial state set all remaining bits to 0, except for s286, s287, and s288, which are set to 1 state is rotated over 4 full cycles of the for look, but no bits are output (for i = 1 to 4 288)
46 Trivium state bit is not used for at least 64 iterations after it has been modified up to 64 iterations can be computed at once, provided that 3 AND gates and 11 XOR gates in the original scheme are duplicated a corresponding number of times
47 Estimated Gate Counts 1-bit to 64-bit hardware implementations Components1-bit8-bit16-bit32-bit64-bit Flip-ops288 AND gates XOR gates NAND gate count
48 Software Stream generation: 12 cycles/byte Key setup: 55 cycles IV setup: 2050 cycles on Intel XeonTM CPU 1.5 GHz
49 Trivium Security Linear correlations between key stream bits and internal state bits are easy to find because zi is simply defined to be equal to s66 s93 s162 s177 s243 s288. But, as opposed to LFSR based ciphers, Trivium's state evolves in a nonlinear way –not clear how an attacker should combine these equations in order to efficiently recover the state –Estimate: follow linear trails through the cipher and approximate the outputs of all encountered AND gates by 0. However, the positions of the taps in Trivium have been chosen in such a way that any trail of this specific type is forced to approximate at least 72 AND gate outputs –If assume that the correlation of linear combination is completely explained by a specific trail considered, then it would have a correlation coefficient of Detecting such a correlation would require at least bits of key stream Other more complicated types of linear trails with larger correlations might exist, estimate that no correlations will exceed 2 -40
50 Lex Alex Biryukov Leak EXtraction Software category –Uses AES – reuse if application has AES implementation
51 Lex Initialize: –Key AES: 128-bit key, K, run through standard AES key-schedule –State: 128-bit IV is encrypted by AES S = AES K (IV ) Generate key stream –Repeatedly encrypt (starting with S) Rekey every 500 AES applications
52 Lex takes 4 bytes from each round of AES (see next slide)
53 Lex
54 Lex order of bytes –not relevant for security –relevant for fast software implementation allows extraction of a 32-bit value from two 32-bit rows in four steps ((t0&0xFF00FF) << 8) (t2&0xFF00FF) t0 = 1st row, t2 = 3rd row in 4x4 matrix