You need randomness. A lot of it. Good quality and fast.

If you run any servers which use SSL, you need somewhere around 108 bytes of randomness for each connection. If you don't have enough, or you use a biased source, your private RSA keys might be trivially factorable. Thousands of private RSA keys have been recovered by researchers. If you let others generate your RSA keys they can be easily backdoored and there is no way for you to know.

Where does this randomness come from? On Linux, it comes from a pseudo random number generator (PRNG) called /dev/urandom. PRNGs are just algorithms that have an internal state and produce numbers indistinguishable from randomness to those unaware of that state. If you knew the state, you could figure out what random numbers were next in the stream.

There's another PRNG on Linux and other Unix-like systems that some people mistakenly think is a true RNG, /dev/random. It's the exact same as /dev/urandom except for one key difference: it measures how much entropy it has remaining, and it will block until it has more. The primary entropy pool will be empty when its drained faster than its restored.

Restoring entropy to the primary pool is done through events like mouse activity or network activity. The more events, the more entropy is in the pool. Events must be generated externally and must be able to be measured. For instance, let's say your server receives the following packets:

Timestamp (ms)  Port            Size
------------------------------------------
1410966238020   TCP port 23415  4522 bytes
1410966238021   TCP port 80     40 bytes
1410966238193   TCP port 80     9291 bytes
1410966238261   TCP port 23415  4522 bytes
1410966238311   UDP port 3241   243 bytes

And now you want to turn this information into entropy. You can't just take the size of the packets or the port number and feed them in as random numbers. They're certainly not random, and attackers can control them by just sending you packets. You can't use the data from the packets for the same reason.

You could, however, use the timestamp. But you can't use the entire timestamp as a random number, because the bits of 1410966238020 are not all equally random. The more significant numbers, for instance, the leading 1, change on the order of decades. The numbers at the end change every millisecond. It's hard for attackers to predict the least significant bit, whether the time will be even or odd.

|---- MSB, least random
v                 
000000010100100010000100001001000011001101000100
                                               ^
                        LSB, most random ------|

And this is what the Linux kernel and other operating systems do to seed their entropy pool: they take the LSB of mouse movement timing, packet arrival times, disk read times, and so on. No attacker could predict something so minute. Thermal randomness is enough to throw off such measurements, especially when they're measured in nanoseconds.

If we want to have a continual source of good randomness, we need a way to get some LSBs fast. A popular method in the past as been sound cards with no input, but there are a few problems with this method:

The sampling rate is quite low, so there's no way to get a very large amount of data.
If you plug in a strong, known source into the source card port you can force the RNG to known values. Use epoxy or else.
Servers don't have sound cards anymore.

RdRand, available on Intel Ivy Bridge CPUs, is another hardware random number generator. Originally it was used as the only source of randomness in /dev/random in FreeBSD, while in Linux it was used in conjunction with a PRNG. The FreeBSD developers said that because of the high likelihood of backdoors in hardware RNGs, they could not continue using it without a PRNG.

We can do better. The perfect hardware RNG would be cheap, fast, and verifiable. I think that I've found just that.

The setup

I bought a bladeRF back when it was a Kickstarter project because I wanted to experiment with RF, specifically GSM. My experiments with that will probably be in another blog post, but I really liked how everything with the bladeRF is open source, including the FPGA HDL code. Its sampling rate is miles ahead of other SDRs, so I knew it could do everything I wanted it to.

After reading about the backdoors in Dual_EC_DRBG, I wanted to know more about hardware RNGs. I stumbled upon rtl-entropy, a project that uses RF sources for randomness. I plugged in my handy antenna, compiled brf-entropy, and set to work.

If you don't want to buy a bladeRF, rtl-entropy also supports the RTL-SDR, a cheap software-defined radio that can be used for this purpose. You'll need to change the brf_entropy command to rtl_entropy in the following sections and your frequency range might be different.

Using it

First, we need to start up the entropy collector:

# brf_entropy -f 850MHz -b

Then we need to start up rngd which will sample from the randomness and add it to the /dev/random pool.

# rngd -r /var/run/rtl_entropy.fifo -W95%

Now our bladeRF is connected to /dev/random directly. It won't block as much because we're adding a lot of randomness to it. Here's a simple test:

# timeout 60s /bin/bash -c "cat /dev/random > dev.random.brf"
# killall brf_entropy rngd
# timeout 60s /bin/bash -c "cat /dev/random > dev.random.no.brf"
# timeout 60s /bin/bash -c "cat /dev/urandom > dev.urandom"

This will create three files: one that's sourced from /dev/random for 60 seconds, with rngd feeding randomness from the bladeRF, another that's sourced from /dev/random for 60 seconds but without the bladeRF, and a third from /dev/urandom for comparison. Let's see how many bytes could be read in this time.

# ls -Al
-rw-r--r--  1 root root  21016277 Sep  6 18:29 dev.random.brf
-rw-r--r--  1 root root        22 Sep  6 18:31 dev.random.no.brf
-rw-r--r--  1 root root 769130496 Sep  6 18:35 dev.urandom

We only got a measly 22 bytes from /dev/random with no hardware RNG.

The CPU usage in my one core virtual machine while reading from /dev/random with cat:

PID   USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
16109 root      20   0  130512   9700   1060 S 62.5  0.5   0:23.66 brf_entropy
16124 root      20   0   11412    612    516 S 16.3  0.0   0:06.47 cat
16113 root      20   0    8964    336    224 S 13.6  0.0   0:05.43 rngd

And from /dev/urandom:

PID   USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
16149 root      20   0   11412    612    516 R 94.8  0.0   0:12.87 cat

What's next

I think having an easy-to-verify hardware random number generator is important. DJB points out that we need verifiable RNGs, but he cautions that just adding more entropy won't necessarily help. I think hardware RNGs will be important in a small but critical number of applications.

I'm hoping that the rtl-entropy project will add more randomness tests and maybe implement something like frequency sampling so that if it seems like someone is manipulating a particular frequency, it can find a frequency more amenable for its purpose. We'll also need to audit both it and bladeRF, which is fortunately possible because both are open source.

Random number generators form the cornerstone of secure cryptography. In light of backdoored software RNGs and closed hardware RNGs, we need open, verifiable ways to generate random numbers. Yarrow and Fortuna are what we need in software. The bladeRF and rtl-entropy may be what we need in hardware.