On the (in)security of the random number generators of Linux and Windows Benny Pinkas, University of Haifa Zvi Gutterman, Leo Dorrendorf, Tzachy Reinman,

1 On the (in)security of the random number generators of Linux and Windows Benny Pinkas, University of Haifa Zvi Gutterman, Leo Dorrendorf, Tzachy Reinman, Hebrew University

2 In this talk  What are PRNGs (pseudo-random number generators)? Why are they important? Why should the generators of Windows/Linux be investigated?  The generator used by Windows Its algorithm, and its weaknesses.  A little on the generator used by Linux Its algorithm, and its weaknesses.  Security issues when using a generator in systems without a hard disk.

3 Why are random number generators important?

4 Usage of random bits  Many applications need random bits for their operation  This is particularly true for security applications  All cryptosystems are only secure if they use random keys “… Pick a key K at random …” In practice looks like CryptGenRandom(Key, 16) Relevant for SSL, SSH, etc.

5 Usage of random bits: session ids  HTTP is stateless. Session ids can make HTTP stateful (each request carries the session id). Knowledge of a session id lets an attacker impersonate the client. Session ids must therefore be random/unpredictable. [GM05] showed how to guess session ids in the Apache Java implementation for Servlet 2.4
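A minimal Python sketch of why predictable session ids are dangerous. It is an illustration only, not the [GM05] attack: the function names and the idea of a small, guessable seed range are assumptions made for this example. An id derived from a seeded non-cryptographic generator can be recovered by enumerating candidate seeds, whereas an id drawn from the operating system's generator gives the attacker nothing to enumerate.

```python
import random
import secrets
from typing import Iterable, Optional


def weak_session_id(seed: int) -> str:
    """Session id from a seeded, non-cryptographic generator (illustrative only)."""
    rng = random.Random(seed)
    return f"{rng.getrandbits(64):016x}"


def guess_seed(observed_id: str, candidate_seeds: Iterable[int]) -> Optional[int]:
    """Return the first candidate seed that reproduces the observed id, if any."""
    for seed in candidate_seeds:
        if weak_session_id(seed) == observed_id:
            return seed
    return None


def strong_session_id() -> str:
    """Session id drawn from the operating system's generator (32 hex characters)."""
    return secrets.token_hex(16)


if __name__ == "__main__":
    # Hypothetical scenario: the attacker knows the seed lies in a small range.
    victim_id = weak_session_id(1_000_123)
    print(guess_seed(victim_id, range(1_000_000, 1_001_000)))
    print(strong_session_id())
```

Once the seed is known, every later id produced from it can be predicted as well.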

6 Usage of random bits: preventing TCP spoofing  TCP sequence numbers should be unpredictable  to prevent “packet spoofing” I.e., prevent attackers from pretending to come from fake IP addresses "completing" a TCP handshake with a victim server without ever receiving any responses from the server.  Predictable TCP sequence numbers enable such attacks

7 Security of random number generators

8 Security  Applications are designed to be secure when using truly random bits  Random bits are hard to get  instead use pseudo-random bits (from a pseudo-random number generator - PRNG) Applications are now only secure if pseudo-random bits are indistinguishable from random Otherwise an attacker can, e.g.,  Guess cryptographic keys (SSL in Netscape [GW96])  Guess session ids (Apache session ids [GM05])

9 Pseudo-random generator  Pseudo-random number generator: a deterministic function mapping a short, random, secret seed to a long output which is indistinguishable from random. [Diagram: a short random seed s is fed to the generator G, producing the long output G(s); a distinguisher D tries to tell G(s) from a truly random string u.]
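The indistinguishability requirement can be written as a bound on the distinguisher's advantage. The notation below (seed length n, output length m, the symbol Adv) is introduced here for illustration and does not appear on the slide; it is the standard textbook formulation.

```latex
\[
  \mathrm{Adv}_D(G) \;=\;
  \Bigl|\, \Pr_{s \leftarrow \{0,1\}^{n}}\bigl[ D(G(s)) = 1 \bigr]
  \;-\; \Pr_{u \leftarrow \{0,1\}^{m}}\bigl[ D(u) = 1 \bigr] \Bigr|,
  \qquad m > n .
\]
```

G is considered pseudo-random if this advantage is negligible for every efficient distinguisher D.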

10 Possible Random Number Generators  Pure hardware generator (of true randomness) Cost / portability / interface issues  Application based PRNGs Too little noise available for seeding Implementer can make mistakes (The generators provided by most programming languages are insecure for security related applications)  Operating system based PRNGs Seeding can use system based noise/entropy (process scheduling, hard disk timing, etc.) PRNG can be implemented and hidden in the kernel Implementer is less likely to make mistakes…

11 Why investigate the PRNGs of major operating systems?  Implementers of applications use the pseudo-random number generators provided by the major operating systems But… The algorithms and code of these generators were never published ! We don’t know how they are initialized ! Yet their output is crucial for almost any security application !

12 Operating system based PRNGs  The PRNG keeps an internal state, which advances (in a deterministic way) when output is generated.  The state is periodically refreshed with entropy generated by the operating system.  Different from the theoretical model of a PRNG.

13 Operating system based PRNGs  When analyzing PRNG security, we assume that everything but the initial seed (system entropy) is known to the attacker  The OS manufacturer might try to hide the algorithm, but reverse engineering can find it…

14 Desired security properties

15 Desired property 1: Pseudo-randomness  Output is indistinguishable from random (Therefore, it can be used instead of truly random bits)

16 Desired property 2: Backward security (break-in recovery)  An attacker that learns the internal state cannot learn future outputs of the generator, assuming that sufficient entropy is used to refresh the state. [Diagram: a chain of states i-1 through i+4 and their outputs, with a compromised state and a system-entropy refresh marked.]

17 Desired property 3: Forward security  Given state i+1 it is hard to compute state i (i.e., an attacker who learns the internal state cannot learn previous outputs of the generator). A mandatory requirement of the German evaluation guidance for PRNGs. [Diagram: computing state i from a compromised state i+1 is marked HARD.]

18 Why is forward security important?  Security systems are secure as long as attackers cannot access secret keys Determined attackers might be able to access keys  How can we minimize the damage of key exposure?

19 Minimizing the damage of key exposure  Threshold crypto: (space dimension) Use n servers. Critical operations require participation of t (<n) servers. Attacker must break into t servers in order to break security. Example: secret sharing, threshold signatures.  Proactive crypto (space + time) At the end of every day the n servers exchange messages and change their state. Attacker must break into t servers on the same day to break security.  Disadvantages: At least t servers are needed for any operation (e.g., signatures).

20 Key evolution  Use time dimension.  The sensitive information (e.g. key) is frequently updated.  Adversary which learns the current key cannot break security of past operations.  “Forward security”: current users do not have to worry about attacks which might happen in the future.

21 Forward secure PRGs [K,BY,BH]  Also, the proof of the HILL construction of PRGs from one-way functions can be extended to show forward security.
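A minimal Python sketch of the forward-security idea: the state is pushed through a one-way function at every step and the old state is overwritten. This is a generic hash ratchet written for illustration, with SHA-256 as a stand-in primitive and the class name and domain-separation labels chosen here; it is not the specific construction of [K], [BY] or [BH].

```python
import hashlib


class ForwardSecureGeneratorSketch:
    """Toy hash ratchet: each output block is followed by a one-way state update."""

    def __init__(self, seed: bytes) -> None:
        self._state = hashlib.sha256(b"init" + seed).digest()

    def next_block(self) -> bytes:
        # Output and next state are derived with different labels, so the
        # output block does not reveal the next state directly.
        block = hashlib.sha256(b"out" + self._state).digest()
        # Overwrite the old state; recovering it would require inverting SHA-256.
        self._state = hashlib.sha256(b"next" + self._state).digest()
        return block


if __name__ == "__main__":
    gen = ForwardSecureGeneratorSketch(b"example seed")
    print(gen.next_block().hex())
    print(gen.next_block().hex())
```

An attacker who captures the state after step i can still compute later blocks until fresh entropy is mixed in, but has no efficient way back to the blocks produced before step i.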

22 Forward secure signatures  A single public verification key.  A different private signature key per day. Signatures with all private keys can be verified with the same public key.  At the end of the day the signature key is erased.  Forward security: An attacker which obtains today’s private signature key, cannot learn the keys of previous days.

23 Forward secure signatures  Basic scheme [Anderson] Private keys: a certification key + 365 day keys Public key: public verification key of certification key Initialization:  Use certification key to sign 365 certificates: “Key PK_i is the public verification key of day i”.  Erase certification key Day i:  Sign using K_i. Add to the signature the verification key PK_i, and the certificate of day i. Erase K_i at end of day.  Improvement [K]: O(1) storage. Use a forward-secure PRG to generate private day keys. Sign certificates using a hash tree.  Many more improvements (time vs. space).

24 Cryptanalysis of the Windows random number generator With Leo Dorrendorf, Zvi Gutterman Hebrew University ACM CCS 2007

25 CryptGenRandom  The only API provided by Windows OS for getting secure random numbers  The world’s most common PRNG  Used by Internet Explorer to generate SSL keys  Its exact design and code were unknown (until now) Security by obscurity?

26 Our research  Examined the binary code of Windows 2000 Windows 2000 is still the 2nd/3rd most popular OS PRNGs of all Windows systems are said to be similar  Identified the algorithm used by the PRNG Did not have access to the source code. Used static and dynamic reverse engineering. This was not easy. Verified the algorithm by writing a user-mode simulator which outputs the same values as the OS.  Showed attacks on forward and backward security

27 The main loop (never before published!)
CryptGenRandom (Buffer, Len) // output Len bytes to buffer
while (Len > 0) {
    R := R ⊕ get_next_20_rc4_bytes ()
    State := State ⊕ R
    T := SHA-1’( State )
    Buffer := Buffer | T // | denotes concatenation
    R[0..4] := T[0..4] // copy 5 least significant bytes
    State := State + R + 1
    Len := Len − 20
}

28 Two 20 byte long registers
CryptGenRandom (Buffer, Len) // output Len bytes to buffer
while (Len > 0) {
    R := R ⊕ get_next_20_rc4_bytes ()
    State := State ⊕ R
    T := SHA-1’( State )
    Buffer := Buffer | T // | denotes concatenation
    R[0..4] := T[0..4] // copy 5 least significant bytes
    State := State + R + 1
    Len := Len − 20
}
Slide callouts: "output"; "SHA-1 is a hash function".

29 Uses RC4 and SHA1
CryptGenRandom (Buffer, Len) // output Len bytes to buffer
while (Len > 0) {
    R := R ⊕ get_next_20_rc4_bytes ()
    State := State ⊕ R
    T := SHA-1’( State )
    Buffer := Buffer | T // | denotes concatenation
    R[0..4] := T[0..4] // copy 5 least significant bytes
    State := State + R + 1
    Len := Len − 20
}
Slide callouts: "Several instances of RC4 generate output used by the generator"; "RC4 is a stream cipher".

30 Odd usage of ⊕ and +
CryptGenRandom (Buffer, Len) // output Len bytes to buffer
while (Len > 0) {
    R := R ⊕ get_next_20_rc4_bytes ()
    State := State ⊕ R
    T := SHA-1’( State )
    Buffer := Buffer | T // | denotes concatenation
    R[0..4] := T[0..4] // copy 5 least significant bytes
    State := State + R + 1
    Len := Len − 20
}
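The main loop can be sketched as runnable Python to make the data flow concrete. Several things here are assumptions and not taken from the slides: standard SHA-1 stands in for the modified SHA-1’, a single RC4 instance stands in for the several instances the generator uses, the 20-byte registers are treated as little-endian integers for the addition, the initial values of State and R are passed in by the caller, and the output is truncated to the requested length. The sketch will therefore not reproduce real Windows output.

```python
import hashlib
from typing import Tuple

MASK160 = (1 << 160) - 1  # State and R are 20-byte (160-bit) registers


class RC4:
    """Plain RC4: key schedule followed by keystream generation."""

    def __init__(self, key: bytes) -> None:
        self.S = list(range(256))
        j = 0
        for i in range(256):
            j = (j + self.S[i] + key[i % len(key)]) % 256
            self.S[i], self.S[j] = self.S[j], self.S[i]
        self.i = 0
        self.j = 0

    def next_bytes(self, n: int) -> bytes:
        out = bytearray()
        for _ in range(n):
            self.i = (self.i + 1) % 256
            self.j = (self.j + self.S[self.i]) % 256
            self.S[self.i], self.S[self.j] = self.S[self.j], self.S[self.i]
            out.append(self.S[(self.S[self.i] + self.S[self.j]) % 256])
        return bytes(out)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def add160(a: bytes, b: bytes, c: int) -> bytes:
    """(a + b + c) mod 2^160, registers read as little-endian (an assumption)."""
    total = int.from_bytes(a, "little") + int.from_bytes(b, "little") + c
    return (total & MASK160).to_bytes(20, "little")


def crypt_gen_random_sketch(
    rc4: RC4, state: bytes, r: bytes, length: int
) -> Tuple[bytes, bytes, bytes]:
    """Follow the slide's loop; returns (output, new State, new R)."""
    buffer = bytearray()
    remaining = length
    while remaining > 0:
        r = xor_bytes(r, rc4.next_bytes(20))   # R := R xor 20 RC4 bytes
        state = xor_bytes(state, r)            # State := State xor R
        t = hashlib.sha1(state).digest()       # stand-in for SHA-1'
        buffer += t                            # Buffer := Buffer | T
        r = t[:5] + r[5:]                      # R[0..4] := T[0..4]
        state = add160(state, r, 1)            # State := State + R + 1
        remaining -= 20
    return bytes(buffer[:length]), state, r


if __name__ == "__main__":
    out, _, _ = crypt_gen_random_sketch(RC4(b"demo key"), bytes(20), bytes(20), 32)
    print(out.hex())
```

Because every step is deterministic, anyone holding the RC4 state, State and R can run the same loop and obtain the same future output.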

31 CryptGenRandom  Scoping: a different state is kept for every thread  RC4 states in static DLL space. R and State stored in the stack.  For an attacker, it is easier to learn this data compared to a system where this data is stored in the kernel.  Initialization gathers 3584 bytes of system data (most of this data is predictable). Internal states, OS and CPU queries, registry keys, etc. Applies SHA1 and RC4 to this data to compute the initial RC4 states. Initialization is crucial for security.  Reseeding: after a process reads 128 Kbytes of output from CryptGenRandom initialization is repeated. New system entropy is only collected at time of rekeying.

32 Attack on backward security (learning future outputs)  Since we know the algorithm, if we learn the state we can compute future states and outputs until the next entropy refresh. (This requires no cryptanalysis.)  This is not surprising, but since entropy is refreshed only every 128 Kbytes of output for each thread (e.g., never for IE), the attack is very severe.  The generator should have been refreshed more often. [Diagram: computing later states from a known state is marked EASY.]

33 Don’t know how to attack the pseudo-randomness of the generator  The main loop Uses RC4 and SHA1 to advance state Applies SHA1 to (part of) state to compute output  We don’t know how to distinguish the PRNG’s output from random, or compute state from output. [Diagram: RC4 and SHA1 advance the state; SHA1 produces the output.]

34 Attack on forward security (learning previous states)  RC4 is a good stream cipher, but it was not designed to provide forward security: given its state at time i+1 it is easy to compute its state at time i. This enables us to break the forward security of the generator  Main result (for CryptGenRandom): given State i+1 it is possible to compute State i with 2^23 work. Attack is based (among other things) on exploiting the relation between + and ⊕ [Diagram: RC4 and SHA1 advance the state, SHA1 produces the output; going backwards is marked "Also easy!!!"]
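The claim that RC4 can be run backwards is easy to check in code. The Python sketch below implements one step of the standard RC4 keystream generator and its exact inverse; the function names are chosen for this example. It covers only the RC4 part of the attack, not the 2^23 search over State and R.

```python
from typing import List, Tuple


def key_schedule(key: bytes) -> List[int]:
    """Standard RC4 key schedule; returns the initial permutation S."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    return s


def step(s: List[int], i: int, j: int) -> Tuple[int, int, int]:
    """Advance RC4 by one byte (mutates s); returns (i, j, output byte)."""
    i = (i + 1) % 256
    j = (j + s[i]) % 256
    s[i], s[j] = s[j], s[i]
    return i, j, s[(s[i] + s[j]) % 256]


def unstep(s: List[int], i: int, j: int) -> Tuple[int, int]:
    """Undo one RC4 step (mutates s); returns the previous (i, j)."""
    s[i], s[j] = s[j], s[i]   # undo the swap
    j = (j - s[i]) % 256      # s[i] is again the value that was added to j
    i = (i - 1) % 256
    return i, j


if __name__ == "__main__":
    s = key_schedule(b"example key")
    start = list(s)
    i = j = 0
    for _ in range(1000):
        i, j, _ = step(s, i, j)
    for _ in range(1000):
        i, j = unstep(s, i, j)
    print(s == start and (i, j) == (0, 0))
```

Each step only swaps two entries and updates two counters, so nothing is lost and the inverse costs the same as the forward step.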

35 An even simpler attack on forward security  Suppose we know the initial values of State and R These variables are never initialized, but rather take whatever value is on the stack location in which they are stored. These values are quite predictable.  Given current value of RC4 state(s) we can rewind them to the initial values  Now, given initial values of all registers we can simulate the RNG.

36 Implications  MSFT: “this is a local information disclosure vulnerability and has no possibility of code execution and cannot be accessed remotely.”  But, New remote execution attacks are found every week. Our attack can be used to amplify their effect.

37 Implications: possible attack scenario  Attacker learns state E.g., by using an attack based on a buffer overflow, or on physical access. PRNG is implemented in user space rather than kernel, so getting the state is easier.  Attacker can compute all previous and future states and outputs Combining the two attacks, attacker can compute all states until state is refreshed with system entropy. System entropy based refresh is very rare (occurs only after 128KB of output per process).

38 Implications: possible attack scenario  Attacker gets access to the machine Buffer overflow, temporary physical access (@ café).  Attacker learns a single state. Does not need to control the machine afterwards.

39 The new attack  Attacker can now compute all states and outputs from the previous to the next entropy refresh Does not need any more interaction with the system Can now, e.g., decrypt all SSL connections. 128KBytes of output (hundreds of SSL sessions)

40 Previously known attacks - key loggers  Attacker can only learn about the machine in the period of time it owns it Cannot learn about the past To learn about the future it needs a long-lived channel with the attacked machine

41 The Attack on Forward Security

42 The generator

43 The attack on forward security: what is known when the attack begins

44 The attack on forward security

45

46

47 Looking at the previous round (40 bits are missing in every register)

48 Looking at the previous round (one step earlier)

49 Completing the attack

50 False positives  The attack checks 2^40 options for the missing 5 bytes  True value always gives a match. Each other value gives a (false positive) match with probability 2^-40.  False positives are identified if they have no preimages. Not really a problem, since we expect O(k) false positives at time t-k (analysis using martingales). [Diagram: timeline t, t-1, t-2, t-3, …]

51 Overhead of our attack  A simple attack requires 2^40 invocations of SHA1  A more intricate attack (using the relation between + and ⊕) requires 2^23 work on average Our implementation of this attack runs in 19 seconds (no optimization) Dag Arne Osvik implements SHA1 on the PlayStation 3 Performs 83 Million SHA1 invocations per second, so the attack can be implemented in 1/10 of a second!  Details of improved attack on whiteboard.
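The figures above can be checked with a few lines of arithmetic. The Python sketch below assumes that one unit of "work" costs about one SHA-1 invocation, which the slides do not state explicitly; the constant and function names are chosen for this example.

```python
SHA1_PER_SECOND = 83_000_000  # PlayStation 3 rate quoted on the slide


def seconds_for(work_units: int) -> float:
    """Running time if each work unit costs one SHA-1 invocation (assumption)."""
    return work_units / SHA1_PER_SECOND


simple_attack_hours = seconds_for(2**40) / 3600   # simple 2^40 attack
improved_attack_seconds = seconds_for(2**23)      # improved 2^23 attack

# Expected false positives in one backward step: 2^40 - 1 wrong candidates,
# each matching with probability 2^-40.
expected_false_positives = (2**40 - 1) * 2.0**-40

if __name__ == "__main__":
    print(f"simple attack:   {simple_attack_hours:.2f} hours")
    print(f"improved attack: {improved_attack_seconds:.3f} seconds")
    print(f"expected false positives per step: {expected_false_positives:.6f}")
```

Under that assumption the improved attack lands at roughly a tenth of a second, matching the slide, while the simple attack would take a few hours at the same rate; about one false positive is expected per backward step.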

52 What about XP  The external layer of the PRNG in XP is identical
while (Len > 0) {
    R := R ⊕ SystemFunction036()
    State := State ⊕ R
    T := SHA-1’( State )
    Buffer := Buffer | T // | denotes concatenation
    R[0..4] := T[0..4] // copy 5 least significant bytes
    State := State + R + 1
    Len := Len − 20
}
Slide callouts: "More complex than Windows 2000"; "No forward security here means no forward security for the entire PRNG".

53 News Flash!  MSFT’s first answer: “…(later versions of Windows) contain various changes and enhancements to the random number generator.”  MSFT’s later answer: XP is vulnerable to the attack. Vista, Windows Server 2008 and Windows Server 2003 SP2 are not affected by the attack. The XP vulnerability will be fixed in SP3.

54 Future work  Investigate other Windows OS’s XP, Vista, Mobile. In particular, examine the PRNG-OS interaction.  Recommendations: Switch to a design which supports forward security (e.g., Barak-Halevi [ACM CCS 05], which provides theoretical guarantees). Perform entropy rekeys more often (but not too often).

55 Analysis of the Linux random number generator With Zvi Gutterman, Tzachy Reinman Hebrew University IEEE Symposium on Security and Privacy, 2006 Black Hat 2006

56 Linux PRNG (LRNG)  Development Started by Theodore Ts’o in 1994  Engineering: Implemented in the kernel. Complex structure, hundreds of patches to date, changes on a weekly basis.  Used by many applications TCP, PGP, SSL, S/MIME, …  Two interfaces Kernel interface – get_random_bytes (non-blocking) User interfaces – /dev/random (blocking) “extremely secure” /dev/urandom (non-blocking)

57 Entropy estimation  A counter estimates physical entropy in the LRNG Increased on entropy addition (from OS events) Decreased on output extraction  Blocking and non-blocking interfaces Blocking interface does not provide output when entropy estimation reaches zero, it is considered “more secure”. Non-blocking interface always provides output
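A toy Python model of the entropy counter and the two interfaces described above. It only models the bookkeeping, not the kernel's actual accounting code: the class and method names, the byte granularity, and the rule that the non-blocking interface also debits the counter are simplifying assumptions.

```python
class EntropyCounterModel:
    """Bookkeeping-only model of an entropy estimate (no real randomness)."""

    def __init__(self) -> None:
        self.estimate_bits = 0

    def add_event(self, credited_bits: int) -> None:
        """An OS event is credited with some number of entropy bits."""
        self.estimate_bits += credited_bits

    def read_blocking(self, nbytes: int) -> int:
        """Bytes delivered right now; 0 means the caller would block."""
        available = min(nbytes, self.estimate_bits // 8)
        self.estimate_bits -= 8 * available
        return available

    def read_nonblocking(self, nbytes: int) -> int:
        """Always delivers nbytes, draining the estimate down to zero."""
        self.estimate_bits = max(0, self.estimate_bits - 8 * nbytes)
        return nbytes


if __name__ == "__main__":
    model = EntropyCounterModel()
    model.add_event(20)
    print(model.read_blocking(4), model.estimate_bits)
    print(model.read_nonblocking(16), model.estimate_bits)
    print(model.read_blocking(1))
```

In this model a heavy reader of the non-blocking interface leaves nothing for blocking readers, which is the shape of the denial-of-service concern with a blocking interface.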

58 On reverse engineering  The Linux PRNG is part of the Linux kernel and hence its source is open  The entire code is 2500 lines written in C  The code is unclear, complex and constantly being patched. (Some major bugs took more than a year to be identified.)  Our tools Static analysis Kernel modification: required many kernel builds, while ensuring that kernel changes do not affect generator entropy usage.  We implemented and confirmed our findings with a user mode simulator.

59 LRNG structure C – entropy collection A – entropy addition E – data extraction

60 Entropy Collection  Asynchronous – entropy is constantly gathered and added to the pools (unlike Windows).  Events are represented by two 32-bit words: Event type  E.g., mouse press, keyboard value, HD id. Event time in milliseconds from up time  Bad news: Actual entropy in every event is very limited Most entropy comes from HD reads/writes. We conducted experiments which showed that each op contributes ~1 bit.  Good news: There are many events…

61 Entropy Addition  Cyclic pool, generalization of LFSR, 32-bit words.  Different polynomial for each pool size  A is a known matrix  Polynomial: X^32 + X^26 + X^20 + X^14 + X^7 + X + 1  Addition algorithm: g – input, j – current pool position
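The slide's diagram of the addition algorithm is not part of the transcript, so the Python sketch below is only a simplified guess at its shape: an input word g is XORed with the pool words at the tap positions given by the polynomial and written at position j. The pool size of 32 words, the direction in which j moves, and the omission of any rotation or of the matrix A are assumptions; this is not the kernel's code.

```python
from typing import List

POOL_WORDS = 32
TAPS = (26, 20, 14, 7, 1)   # exponents of the polynomial, without X^32 and 1
WORD_MASK = 0xFFFFFFFF      # pool entries are 32-bit words


def add_word(pool: List[int], j: int, g: int) -> int:
    """Mix input word g into the cyclic pool; returns the new position j."""
    j = (j - 1) % POOL_WORDS            # assumed direction of movement
    w = g & WORD_MASK
    for tap in TAPS:
        w ^= pool[(j + tap) % POOL_WORDS]
    w ^= pool[j]
    pool[j] = w                         # a fuller model would also apply A here
    return j


if __name__ == "__main__":
    pool = [0] * POOL_WORDS
    j = 0
    for word in (0xDEADBEEF, 0x00000001, 0x12345678):
        j = add_word(pool, j, word)
    print(j, [hex(x) for x in pool if x])
```

Every operation here is an XOR, so the update is linear in the input words, like the LFSR it generalizes.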

62 Output extraction After extraction: 1. Set: i → i-1 2. Update entropy-estimation

63 Comparison to Windows  We showed an attack on forward security in Linux, but it is less efficient (2^64 vs. 2^23)  The implications of the attack are less severe (frequent entropy updates in Linux vs. infrequent updates in Windows)  Also, in Linux Generator is in kernel space Same generator used by all processes Blocking interface (/dev/random)  Susceptible to DoS attacks (even by remote attackers).

64 Implications to disk-less systems

65  More and more systems are using solid state (flash) based storage instead of hard disks. The timing of HD r/w operations is unpredictable. Solid state operations always have the same timing. HD timing is the major source of entropy for the Linux PRNG.  Other sources of entropy are quite limited: user input, system interrupts. They might be guessed by an attacker.  Possible threat to the security of the Linux PRNG in many future systems.

66 What can be done to seed the generator with more entropy?  Known recommendation: Linux PRNG should simulate continuity between shutdown and reboot At shutdown 512 bytes are read from /dev/urandom. At reboot these bytes are written to the PRNG.  An attacker that wants to guess the PRNG state must now record/guess all inputs and interrupts since the first run of the machine. This is done by the Linux distribution (not the kernel).  But, Linux distributions on CD/DVD (Knoppix) cannot save the state. Many other systems (e.g., the OpenWRT Linux based router) do not save the state of their generator.
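The save-and-restore recommendation can be sketched in Python as two small functions. The seed file location and the function names are hypothetical, and real distributions do this in their init scripts, not in Python. Writing bytes to /dev/urandom mixes them into the pool; it does not by itself raise the kernel's entropy estimate.

```python
from pathlib import Path

SEED_BYTES = 512  # amount quoted on the slide


def save_seed(rng_device: str, seed_file: str) -> None:
    """At shutdown: read SEED_BYTES from the generator and store them."""
    with open(rng_device, "rb") as dev:
        data = dev.read(SEED_BYTES)
    Path(seed_file).write_bytes(data)


def restore_seed(seed_file: str, rng_device: str) -> None:
    """At boot: write the stored bytes back into the generator."""
    data = Path(seed_file).read_bytes()
    with open(rng_device, "wb") as dev:
        dev.write(data)


# Intended use on a Linux system (hypothetical seed path):
#   save_seed("/dev/urandom", "/var/lib/example/random-seed")     # shutdown
#   restore_seed("/var/lib/example/random-seed", "/dev/urandom")  # next boot
```

A system booted from read-only media has nowhere to keep the seed file, which is why this step is missing on live CDs and many embedded devices.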

67 Analysis of a certain Linux based device  A surprising finding: The device always boots with one of 6 possible values of the PRNG state. The device uses solid state storage and no HD. The PRNG does not save its state at shutdown.  During reboot, the PRNG Reads values from a hardware based noise generator. Copies the system clock onto its state.  But, at that time Hardware noise generator provides non-random output. Hardware clock is not loaded to the software clock, so the value read from the software clock is always fixed.

68 Preliminary results  Can predict keys used by this device in SSH sessions, and decrypt communication If these sessions are initiated before an input from user is received  Output of PRNG is a function of user’s input only If we observe the output of the PRNG, can we deduce user’s input? Yes, if the user enters his input slowly enough! Can this be done by an external attacker?

69 Take-home message Be careful when designing (or using) devices without a hard-disk

70 Conclusions  The security of the output of pseudo-random generators is crucial.  It is hard to examine OS based PRNGs The algorithms are not published. One must examine the code. This is not easy. Intricate dependencies with the OS – analyzing the algorithm alone is not enough.  The generators of both Windows and Linux do not provide forward security and have additional design issues

71 Conclusions  Most severe findings The PRNGs of Windows XP/2000 do not provide forward security  Affecting > 90% of all PCs Disk-less Linux systems

72 TODO  Windows Examine other Windows systems Understand the relation with the OS (initialization)  Linux Disk-less systems – many new attacks are possible  PRNG usage within a virtual machine?  Change the design of common PRNGs Use the Barak-Halevi construction
