Entropy of Keys and Password Generation

Introduction to entropy
Entropy and data compression
Predictability of random number generation
Entropy and system security
A weak and predictable method
Linux random bit generation devices
A programmed set of randomness tests
A program to generate random passwords
Introduction to entropy

Randomness is the property attributed to behaviour, activity or a sequence of numbers which lacks any appearance of order. A system is deterministic if its subsequent behaviour can be derived from a knowledge of its starting state and of the physical laws and programmed rules governing changes of state. Programmed computers are inherently deterministic, because a program is a sequence of instructions with an intended (i.e. prespecified) result from given input. Test planning generally assumes a deterministic program.
Some quotations

"God doesn't play dice with the universe." - Albert Einstein
"Random numbers should not be generated with a method chosen at random." - Donald Knuth
"The total entropy of any isolated thermodynamic system tends to increase over time, approaching a maximum value." - The second law of thermodynamics
Thermodynamic and Shannon entropy

Thermodynamic entropy is a measure of the amount of energy in a physical system that cannot be used to do work. The entropy rate of an information source (Shannon or information entropy) is the number of bits needed to encode a character from that source. If the information is very predictable it is also very compressible, so fewer bits are needed. One way to measure information entropy is to find out how much it is possible to compress a sequence of symbols into a smaller file.
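The link between predictability and bits per character can be made concrete with a small program. The sketch below is not from the original notes: it estimates the Shannon entropy of a file in bits per byte from the byte frequencies, using H = -sum(p * log2(p)). The file name argument is arbitrary, and the program must be linked with the maths library (-lm).

/* Sketch: estimate Shannon entropy in bits per byte of a file
 * by counting byte frequencies.  Illustrative only; this is not
 * the ent program described later in these notes. */
#include <stdio.h>
#include <math.h>

int main(int argc, char **argv)
{
    unsigned long count[256] = {0};
    unsigned long total = 0;
    double entropy = 0.0;
    FILE *f;
    int c;

    if (argc != 2 || (f = fopen(argv[1], "rb")) == NULL) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    while ((c = fgetc(f)) != EOF) {
        count[c]++;
        total++;
    }
    fclose(f);
    for (int i = 0; i < 256; i++) {
        if (count[i] > 0) {
            double p = (double)count[i] / (double)total;
            entropy -= p * log2(p);   /* -sum p log2(p) */
        }
    }
    printf("%.4f bits per byte\n", entropy);
    return 0;
}

Run against highly compressible input (e.g. English text) this reports well under 8 bits per byte; run against already-compressed or random data it reports close to 8.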
Entropy and data compression

compress]$ ls -l
total 36
-rwxr-xr-x 1 rich rich 9220 Apr  5 17:17 avltree
-rw-r--r-- 1 rich rich 6905 Apr  5 17:15 avltree.c
compress]$ gzip *
compress]$ ls -l
total 16
-rw-r--r-- 1 rich rich 2123 Apr  5 17:15 avltree.c.gz
-rwxr-xr-x 1 rich rich 4001 Apr  5 17:17 avltree.gz

In the above example:
1. A machine code executable was reduced from 9220 to 4001 bytes, i.e. to 43.39% of its original size.
2. The 'C' source file used to compile the executable was reduced from 6905 to 2123 bytes, i.e. to 30.74% of its original size.
Entropy and system security

One of the most interesting uses of entropy within computing systems is for security purposes, because it is important that passwords and encryption keys should be unpredictable. Some systems used to generate random numbers are themselves inherently deterministic. Before using a random number generator to generate keys or passwords, a security evaluation should consider:
a. Can an attacker predict anything about future states of the generator based upon knowledge of previous system states?
b. Does an attacker have any ability to influence these states?
Does true randomness exist?

We don't know. Some systems used to generate random numbers are inherently deterministic, yet are clearly good enough for security purposes, because minor changes in the input, which can't be controlled beyond a known precision, result in big enough changes in the output. Suppose enough were known about:
the exact starting position and velocity of a six-sided die,
its aerodynamics and the resistance of the air,
the weight distribution of the die and its other properties,
the inclination etc. of the surface on which the die lands.
Then it would be theoretically possible to compute which number would be on top when the die comes to rest.
How much entropy is needed?

The minimum amount will depend upon the kinds of attack expected against the system to be secured. Having much more entropy than is needed can improve system lifetime, but in some cases reduces usability. At one time IBM consultants recommended that master system keys and passwords be generated by the system manager using dice. The reason was that using such a simple system meant the manager could be as certain as possible about the means by which these keys were created and the conditions surrounding this event. However, the increased performance of computers now requires more entropy within cryptographic keys than can easily be generated using dice. Systems that lock out attackers after a few wrong tries, or which use multiple security factors, are thought to need less entropy.
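As a rough illustration (these figures are not from the original notes): each roll of a fair six-sided die yields log2(6), about 2.58 bits of entropy, so a 128-bit key would need around 50 rolls (50 x 2.58 is roughly 129 bits), whereas an 8-character password drawn uniformly from 62 alphanumeric characters carries only 8 x log2(62), about 47.6 bits.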
A weak and predictable method 1

Using the POSIX ANSI 'C' library, 2 functions and a constant are defined:
void srand(unsigned int seed); is used to seed the pseudorandom number generator.
int rand(void); generates a pseudorandom sequence of numbers in the range 0 to RAND_MAX.
RAND_MAX is a system constant, typically 2147483647 (2^31 - 1) with the GNU C library, and guaranteed to be at least 32767.
A weak and predictable method 2
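The program listing originally on this slide is not reproduced here. As a substitute, the following minimal sketch of my own shows the kind of weak usage being discussed: seeding rand() with the current time means that anyone who can guess roughly when a password was generated can regenerate it.

/* Sketch only: a weak and predictable password generator.
 * Seeding rand() with the current time means an attacker who can
 * guess the time of generation to within a few seconds can
 * reproduce the same "random" password. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    char password[9];

    srand((unsigned int)time(NULL));   /* low-entropy seed */
    for (int i = 0; i < 8; i++)
        password[i] = alphabet[rand() % 62];
    password[8] = '\0';
    printf("%s\n", password);
    return 0;
}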
A weak and predictable method 3

The POSIX 2003 standard gives these example implementations of rand() and srand(). From this simple implementation it is clear that the sequence generated will repeat whenever a value for next is repeated. As a 32-bit integer is used for next, the maximum possible sequence length will be 2^32.
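The example implementations themselves are missing from this copy of the notes. The listing below is reproduced from the well-known example given in the C and POSIX standards, and should be treated as illustrative rather than as the standard's exact text.

/* Example rand()/srand() implementation as given (approximately) in
 * the C/POSIX standards.  RAND_MAX is assumed to be 32767 here. */
static unsigned long next = 1;

int rand(void)
{
    next = next * 1103515245 + 12345;
    return (unsigned int)(next / 65536) % 32768;
}

void srand(unsigned int seed)
{
    next = seed;
}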
Linux random devices 1

To improve upon the obvious limitations of predictable pseudorandom number generators, the Linux operating system kernel provides two device files for the purpose of generating entropy. One of these, /dev/urandom, is fast and less cautious; the other, /dev/random, is slow and cautious. The faster device uses the slower device to reseed a pseudorandom sequence. The following description is from the Linux documentation.
Linux random devices 2

RANDOM(4)              Linux Programmer's Manual

NAME
       random, urandom - kernel random number source devices

DESCRIPTION
       The character special files /dev/random and /dev/urandom (present
       since Linux 1.3.30) provide an interface to the kernel's random
       number generator. File /dev/random has major device number 1 and
       minor device number 8. File /dev/urandom has major device number 1
       and minor device number 9.

       The random number generator gathers environmental noise from device
       drivers and other sources into an entropy pool. The generator also
       keeps an estimate of the number of bits of noise in the entropy
       pool. From this entropy pool random numbers are created.
Linux random devices 3

       When read, the /dev/random device will only return random bytes
       within the estimated number of bits of noise in the entropy pool.
       /dev/random should be suitable for uses that need very high quality
       randomness such as one-time pad or key generation. When the entropy
       pool is empty, reads from /dev/random will block until additional
       environmental noise is gathered.

       When read, the /dev/urandom device will return as many bytes as are
       requested. As a result, if there is not sufficient entropy in the
       entropy pool, the returned values are theoretically vulnerable to a
       cryptographic attack on the algorithms used by the driver. Knowledge
       of how to do this is not available in the current non-classified
       literature, but it is theoretically possible that such an attack may
       exist. If this is a concern in your application, use /dev/random
       instead.
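Reading from either device within a program is ordinary file I/O. The following minimal sketch, which is not part of the original notes, fills a small buffer from /dev/urandom and prints it in hexadecimal; error handling is kept to a minimum.

/* Sketch: fill a buffer with random bytes read from /dev/urandom. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    unsigned char buf[16];
    FILE *f = fopen("/dev/urandom", "rb");

    if (f == NULL || fread(buf, 1, sizeof buf, f) != sizeof buf) {
        fprintf(stderr, "cannot read /dev/urandom\n");
        return EXIT_FAILURE;
    }
    fclose(f);

    for (size_t i = 0; i < sizeof buf; i++)
        printf("%02x", buf[i]);    /* print the bytes as hex */
    printf("\n");
    return EXIT_SUCCESS;
}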
Linux random devices 4

random]$ cat /dev/random > random
(CTRL-C was pressed after counting to 30 seconds)
random]$ cat /dev/urandom > urandom
(CTRL-C was pressed after counting to 30 seconds)
pwgen]$ ls -l
total
-rw-r--r-- 1 rich rich  520 Apr  5 18:19 random
-rw-r--r-- 1 rich rich      Apr  5 18:19 urandom

520 bytes were generated by the cautious entropy device; in about the same time 94 MBytes were generated by the fast device.
A randomness test program 1

Programs can be downloaded from the Internet which will test a source of random numbers using various methods to determine their statistical properties. The following site provides information about some of these methods and a program called ent, which provides a set of these tests:
http://www.fourmilab.ch/random/
A randomness test program 2

This ent program is also available as an Ubuntu package, so it was installed using the command:
sudo aptitude install ent
3.7MB of random data was then generated using:
cat /dev/urandom > randata
and pressing CTRL-C after about 10 seconds.
A randomness test program 3

ent randata
Entropy = bits per byte.
Optimum compression would reduce the size of this byte file by 0 percent.
Chi square distribution for samples is , and randomly would exceed this value percent of the times.
Arithmetic mean value of data bytes is (127.5 = random).
Monte Carlo value for Pi is (error 0.01 percent).
Serial correlation coefficient is (totally uncorrelated = 0.0).
A randomness test program 4

ent /usr/share/dict/words
Entropy = bits per byte.
Optimum compression would reduce the size of this byte file by 44 percent.
Chi square distribution for samples is , and randomly would exceed this value 0.01 percent of the times.
Arithmetic mean value of data bytes is (127.5 = random).
Monte Carlo value for Pi is (error percent).
Serial correlation coefficient is (totally uncorrelated = 0.0).

Clearly, the spelling dictionary wasn't as random as the output of /dev/urandom. Source code for a program which computes the Monte Carlo value for Pi is available in the older HTML notes.
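That source code is not included here; the sketch below is my own illustration of the idea behind the Monte Carlo test, under the simplifying assumption that each pair of bytes from /dev/urandom is used as an (x, y) point in a unit square, so that the fraction of points falling inside the inscribed quarter circle estimates Pi/4.

/* Sketch: Monte Carlo estimate of Pi using bytes from /dev/urandom.
 * Consecutive pairs of bytes are scaled to (x, y) coordinates in the
 * unit square; the fraction falling within the quarter circle of
 * radius 1 approximates Pi/4. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/dev/urandom", "rb");
    unsigned char pair[2];
    long inside = 0, total = 0;

    if (f == NULL) {
        fprintf(stderr, "cannot open /dev/urandom\n");
        return 1;
    }
    while (total < 1000000 && fread(pair, 1, 2, f) == 2) {
        double x = pair[0] / 255.0;
        double y = pair[1] / 255.0;
        if (x * x + y * y <= 1.0)
            inside++;
        total++;
    }
    fclose(f);
    printf("Estimated Pi = %f\n", 4.0 * (double)inside / (double)total);
    return 0;
}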
Program generated passwords 1

Passwords chosen by humans often have too little entropy. When an individual is required to choose a password, one is often selected which an attacker would find very easy to guess. The advantage of getting the user to choose a password is that there is a better chance that the individual won't have to write it down. If the risk mitigated by use of a password is from a remote system attacker rather than a local one, it is better for a strong randomly-generated password to be written down than for a weak password to be used.
Program generated passwords 2
Program generated passwords 3
Program generated passwords 4
Program generated passwords 5
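The password generator listing spread over the preceding slides is missing from this copy of the notes. The sketch below is my own reconstruction of the general approach, written to be consistent with the sample run that follows: it reads entropy from a file such as random.bin, asks how many passwords are wanted, and prints 8-character alphanumeric passwords. The rejection step avoids the slight bias that a plain modulo would introduce.

/* Sketch of a pwgen-style generator: reads random bytes from a file
 * (e.g. one filled from /dev/urandom), asks how many passwords are
 * wanted, and prints 8-character alphanumeric passwords.
 * Illustrative only, not the original course program. */
#include <stdio.h>
#include <stdlib.h>

static const char alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

static int random_char(FILE *f)
{
    int c;
    do {
        c = fgetc(f);
        if (c == EOF) {
            fprintf(stderr, "ran out of random bytes\n");
            exit(1);
        }
    } while (c >= 248);          /* 248 = 4 * 62: reject to avoid modulo bias */
    return alphabet[c % 62];
}

int main(int argc, char **argv)
{
    FILE *f = fopen(argc > 1 ? argv[1] : "random.bin", "rb");
    int npasswords;

    if (f == NULL) {
        fprintf(stderr, "cannot open random byte file\n");
        return 1;
    }
    printf("How many passwords ? ");
    if (scanf("%d", &npasswords) != 1)
        return 1;
    for (int i = 0; i < npasswords; i++) {
        for (int j = 0; j < 8; j++)
            putchar(random_char(f));
        putchar('\n');
    }
    fclose(f);
    return 0;
}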
Running pwgen

cat /dev/urandom > random.bin
^C
How many passwords ? 8
C2MK4Z4z
EGJQ5yhy
WqHEytfh
CajfjaaC
KsvWFXTk
nSDV4Hm6
8sDbuHC4
vQPWb7gP