Generating data with entropy, or random number generation (RNG), is a well-known difficult problem. Many crypto algorithms and protocols assumes random data is available. There are many implementations out there, including /dev/random in the BSD and Linux kernels and API calls in crypto libraries such as GnuTLS or OpenSSL. How they work can be understood by reading the source code. The quality of the data depends on actual hardware and what entropy sources were available — the RNG implementation itself is deterministic, it merely convert data with supposed entropy from a set of data sources and then generate an output stream.

In some situations, like on virtualized environments or on small embedded systems, it is hard to find sources of sufficient quantity. Rarely are there any lower-bound estimates on how much entropy there is in the data you get. You can improve the RNG issue by using a separate hardware RNG, but there is deployment complexity in that, and from a theoretical point of view, the problem of trusting that you get good random data merely moved from one system to another. (There is more to say about hardware RNGs, I’ll save that for another day.)

For some purposes, the available solutions does not inspire enough confidence in me because of the high complexity. Complexity is often the enemy of security. In crypto discussions I have said, only half-jokingly, that about the only RNG process that I would trust is one that I can explain in simple words and implement myself with the help of pen and paper. Normally I use the example of rolling a normal six-sided dice (a D6) several times. I have been thinking about this process in more detail lately, and felt it was time to write it down, regardless of how silly it may seem.

A die with six sides produces a random number between 1 and 6. It is relatively straight forward to intuitively convinced yourself that it is not clearly biased: inspect that it looks symmetric and do some trial rolls. By repeatedly rolling the die, you can generate how much data you need, time permitting.

I do not understand enough thermodynamics to know how to estimate the amount of entropy of a physical process, so I need to resort to intuitive arguments. It would be easy to just assume that a die produces 2.5 bits of entropy, because log_2(6)~=2.584. At least I find it easy to convince myself intuitively that 2.5 bits is an upper bound, there appears to me to be no way to get out more entropy than that out looking at a die roll outcome. I suspect that most dice have some form of defect, though, which leads to a very small bias that could be found with a large number of rolls. Thus I would propose that the amount of entropy of most D6’s are slightly below 2.5 bits on average. Further, to establish a lower bound, and intuitively, it seems easy to believe that if the entropy of particular D6 would be closer to 2 bits than to 2.5 bits, this would be noticeable fairly quickly by trial rolls. That assumes the die does not have complex logic and machinery in it that would hide the patterns. With the tinfoil hat on, consider a die with a power source and mechanics in it that allowed it to decide which number it would land on: it could generate seamingly-looking random pattern that still contained 0 bits of entropy. For example, suppose a D6 is built to produce the pattern 4, 1, 4, 2, 1, 3, 5, 6, 2, 3, 1, 3, 6, 3, 5, 6, 4, … this would mean it produces 0 bits of entropy (compare the numbers with the decimals of sqrt(2)). Other factors may also influence the amount of entropy in the output, consider if you roll the die by just dropping straight down from 1cm/1inch above the table. There could also be other reasons why the amount of entropy in a die roll is more limited, intuitive arguments are sometimes completely wrong! With this discussion as background, and for simplicity, going forward, I will assume that my D6 produces 2.5 bits of entropy on every roll.

We need to figure out how many times we need to roll it. I usually find myself needing a 128-bit random number (16 bytes). Crypto algorithms and protocols typically use power-of-2 data sizes. 64 bits of entropy results in brute-force attacks requiring about 2^64 tests, and for many operations, this is feasible with today’s computing power. Performing 2^128 operations does not seem possible with today’s technology. To produce 128 bits of entropy using a D6 that produces 2.5 bits of entropy per roll, you need to perform ceil(128/2.5)=52 rolls.

We also need to design an algorithm to convert the D6 output into the resulting 128-bit random number. While it would be nice from a theoretical point of view to let each and every bit of the D6 output influence each and every bit of the 128-bit random number, this becomes difficult to do with pen and paper. **Update:**This blog post used to include an algorithm here, however it was clearly wrong (written too late in the evening…) so I’ve removed it — I need to come back and think more about this.

So what’s the next step? Depends on what you want to use the random data for. For some purposes, such as generating a high-quality 128-bit AES key, I would be done. The key is right there. To generate a high-quality ECC private key, you need to generate somewhat more randomness (matching the ECC curve size) and do a couple of EC operations. To generate a high-quality RSA private key, unfortunately you will need much more randomness, at the point where it makes more sense to implement a PRNG seeded with a strong 128-bit seed generated using this process. The latter approach is the general solution: generate 128 bits of data using the dice approach, and then seed a CSPRNG of your choice to get large number of data quickly. These steps are somewhat technical, and you lose the pen-and-paper properties, but code to implement these parts are easier to verify compared to verifying that you get good quality entropy out of your RNG implementation.

2^3=8, so actually you’d need a D8, or to tweak the algorithm a bit.

I’m quite sure it’s actually much closer to 2 bits since 2^3=8, not 6. Also, 111 is 7 😉

Well, 2^3 = 8 is not equal to 6. So you either want to to use a D8. Or only use the numbers 1-4 on the dice an roll again otherwise. With tricks and calculations you could reduce the number of times you have to roll the dice again, but there will always be cases where you have to do it, because 3 does not divide a power of 2. (At least if you want maximal entropy in your generated string.)

Note that you don’t get the 3- bits chunks 100 and 110 in your example above.

2^3 = 8 > 6, so a D6 produces less than 3 bits of entropy (in fact it’s about 2.585 bits). 128-bit numbers produced with your algorithm contain only about 110.2 bits of entropy (43 rolls of a D6 and the last bit is thrown away).

I don’t think you can use the full entropy of a D6, i.e. generate a 128-bit random number with only ceil(128/2.585) = 50 rolls, using only pen and paper.

The easiest method would be to only accept rolls of 1,2,3,4 (i.e. roll again on 5 and 6) and then to take just the lower two bits of the binary representation.

D’uh. Many people pointed my math mistake that said 2^3=6. This was written rather late in the night, and I obviously should have saved the draft and waited another day… 🙂

I’ve fixed the text a bit, and pulled the algorithm for converting die outputs to binary. Alas, it doesn’t seem to be as easy to do what I want that I first thought. More pondering needed.

Thanks for feedback!

2^3 = 8 not 6.

You never generate 0 nor 7 with a D6.

You are missing possibilities (e.g. 00…0) so your numbers are not truly random.

Nevertheless, you can use the D6 output to generate a base-6 number of any size and then write it in base-2 (or 16).

However, if you decide to fit the base-6 random numbers in the next superior 2^p number, then, some numbers of p base-2 digits cannot be casted with a D6 (their probability is 0, so they are not random) (that’s what you are doing using a D6 instead of a D8). (Say with n=3, you have 216 base-6 possible numbers, each with a 1/216 probability. If you put them in a p=8 base-2 number, you have 256 possible numbers but 40 of them have a probability of 0. Not random.)

It’s about the same if you decide to truncate the base-6 random numbers that does not fit the previous inferior 2^p number (e.g. use modulo 128): the probablity of the base-2 numbers are not equals (some are 2/6^n, others 1/6^n).

You can also decide to discard the base-6 random numbers that does not fit the previous inferior 2^p number (p=7, every number greater than 127 are discarded). But, then, the time to generate the number is not constant (nor limited: there is always the possibility that you always cast a number greater than 127).

Easier way: odd → 1, even → 0. Or just flip a coin. Or use one of methods used in Yi Jing.

Thanks for feedback! I like the coin flipping idea, that’s readily available. Takes a bit longer, but still manageable. I’m ordering D8, D16 and will make some timing measurements with those and a coin.

Base-X is great idea! It fits to all type of dice and prevents loss of bits.

With dice D100 you need only 20 rolls to generate 128 bit.

“2^3 is 6”

You… might want to check your math, you might have an old Pentium. 2^3 is 8. Easy fix would be to s/D6/D8/g.

And also the singular of dice is “die”, as in “Roll a six-sided die”

I don’t have a Pentium, I have kids that drain my sleep reserve. 🙂

It amounts to pretty much to same result though….

“It would be easy to just assume that a dice produces 3 bits of entropy, because 2^3=6”

It’s really not…

Actually, 2^3 = 8, not 6. In the above scheme, bit string 000 is not possible (and neither is 111, although you got that somehow from 6). Please double check the math.

“It would be easy to just assume that a dice produces 3 bits of entropy, because 2^3=6 which matches the number of possible outcomes.”

Uh, 2^3=8, so you can only a maximum of 2.58 bits/throw (ln 6/ln 2) if you get the maths right (this is tricky), or 2 bits/throw if you simplify by re-rolling 5s and 6s and just use values 1-4.

Alternatively, use a d8.

Thanks — re-rolling 5/6 is an interesting idea, although it will make the number of rounds you will have to do a bit more stocastic.

You should really get an 8-sided dice before you do this. Because 6-sided dice does not give you 3 bits, only about 2.585 bits per roll.

Yep. I’m ordering a couple of D8’s now, although D6’s are much more commonly available.

You can order even D16 🙂

http://en.wikipedia.org/wiki/Dice#Rarer_variations

> It would be easy to just assume that a dice produces 3 bits of entropy, because 2^3=6

A dice has about 2.5 bits of entropy. 2^3=8…

Hi,

Uhm, this is a nice write-up altogether, but… 🙂

But it suffers from a serious flaw near the very beginning of the actual discussion of using a die (“dice” is actually plural :P) directly as an entropy source. A D6 roll does not provide three bits of information – it would have if it had *eight* possible outcomes (2^3=8), not six. Thus, it provides log(6)/log(2) =~ 2.585 bits of information.

More importantly, if you write down the numbers that successive D6 rolls produce, then convert them to binary, then concatenate the three-bit strings, you’ll have a very biased bitstream generator – it would *never* produce the 110 or 111 sequence in any of the concatenated three-bit segments. This leaves 2/8ths, i.e. a full quarter, of the possible bitstreams inaccessible 🙂

Yep. A D8 or another algorithm to convert D6 outputs into binary is needed.

…and then, of course, I made a stupid mistake in the last sentence of my previous comment. The leaving-out of two values of each three-bit segment of the generated bitstream does not leave out only a quarter of the possible bitstreams – it leaves out pretty much all of them. It may produce only (3/4) * (3/4) * (3/4) * … of the possible bitstreams, and the longer the bitstream gets, the smaller this value becomes – for infinitely long bistreams it is practically indistinguishable from zero.

Hi. Isn’t 2^3=8?

That’s what YOU think!

🙂

Three comments from a random Planet Debian lurker:

(1) 2^3=6?? Is that intentional creative rounding? I mean, as a physicist by education, I am not totally adverse to the idea of declaring 6≈8 for the sake of simplifying an argument, but doing so here seems odd since it makes you /over/estimate the maximum amount of entropy in a dice roll result. If at all, I would have rounded down 6≈4=2^2.

The actual amount of entropy for an ideal (fair) dice roll result is ld 6 ≈ 2.58 bits.

(2) Since the result of one dice roll does not exactly fit three bits, arranging three-bit-blocks as you do results in non-uniformly distributed bytes. To give just the most obvious examples, your method will never generate 0x00, nor 0xFF.

A better method is to treat your string of dice roll results as a base-6 number (interpret a rolled 6 as the hexal digit 0). Then you can use conventional methods to derive uniformly distributed random integers within your intended target range. Notice that just truncating unneeded bits would introduce bias. The most straightforward method I remember is to just discard your entire sequence and roll dice again if your base-6 random number does not fit the target range.

For example, 3 dice rolls give you a base-6 random number in the range 0 to 555_6 (0 to 1266 decimal). You can use this to obtain 10 random bits (corresponding to an integer in the range 0-1023): if the number is greater than 1023=4423_6, discard the sequence and roll dice again, otherwise convert the number to dual and be done.

(3) Tangentially related: A group of (originally) PGP users have, since the 1990s, been advocating a method for using dice rolls to generate secure keyring passphrases. The idea is to use a dice roll result sequence to choose random words from an appropriately indexed wordlist. This is known as the Diceware method.

The pertaining website http://world.std.com/~reinhold/diceware.html and links therein contain lots of interesting information, including entropy calculations.

Oh my, I totally miscalculated the example. To obtain 10 random bits, you need 4 dice rolls, giving a number within 0-5555_6 = 0-1295. Rest as above. Since 555_6=215<(2^8-1), three rolls provide 7 bits only.

You appear to have miscalculated and misused binary notation.

Hint: 2^3 is not 6.

Hi!

2^3 = 8

So dice produces about 2.5849 bits of entropy, because of log2(6).

There’s a flaw in your explanation: 2^3 is 8, not 6, so a perfect D6 produces less than 3 bits of entropy, about 2.585 ( log(6) base 2). In this case multiple rolls of a D6 produce less than 3N bits (since the upper bit is twice as likely to be 0 than 1).

You should use octahedral dice for the explanation for the math to be correct. (D8s make for weirder props, but that’s the sacrifice of correctness. Or just use coin flips).

Sorry but you have a small math typo in the post:

2^3 is 8 not 6

111 = 8

110 = 6

Nice article and interesting idea, expecially if applied to teaching.

Shouldn’t you be using 1d8 to get 3 bits of entropy? (2^3 = 8)

It might not be exactly what you meant to write, but the “2^3=6” assumption in the text distracts a bit. “2*3=6”, but “2^3=8”.

nit-picking : 2^3 = 8, not 6

Huh, no, 2^3 = 8 > 6.

D6 can produce at most log(6, 2) ≈ 2.6 bits of entropy.

A few problems here :

– as said earlier: 2^3 = 8

– the binary representation of 6 is 110, not 111

– concatenating 3-bits words biases the outcome, vastly. For example, your first byte cannot be any of :

000*****

111*****

***000**

***111**

and results in ******11 and ******00 are twice more unlikely than the ones in ******10 or ******01.

2³=8, not 6.

2^3 is 8, not six. You don’t seem to be rolling D8s.

If you want a macro-scale RNG, use a larger N for your DN.

The more facets that a D has, the less energy is required to change the final state, and thus it is more chaotic. D20 are easily available cheaply; D100 are easily available reasonably cheaply. A D100 may exhibit bias, but it should exhibit less bias than a D6 with the same mass distribution problem.

2^3=8

A normal dice cannot generate 3 bit of entropy per dice roll (if you only use the information which side is up), since there are only 6 possible outcomes. For 3 bits of entropy you need 8 equally likely outcomes.

Please review the mapping of D6 output to binary strings. If you just encode the D6 output as its binary representation, the resulting bit string gets predictable in that it does never contain 000 and 111 in every group of three bits.

And 111 is not 6.

“So the 16 bytes of random data is “7C9559..89″ with “..” replaced by the 5 pieces of 3-byte chunks of data.”

> 4 is binary 011

>It must be random desaster: – never more than 3 zeros in a row –

> The name of the function is: “Tinted Seed”

> hardware devices

what about sampling some thermal noise?

just about any semiconductor device makes some of it,

(think of background hiss from any kind of amplifier)

I’ve never seen those “hardware devices” but I’d be pretty surprised if none of them did something like putting “white noise” (eg from amplified thermal noise or “hissy” noise sources) into some kind of adc (as in sample it)

probably not hard to make,…

care probably needed re RF shielding though!

just a guess that seems obvious enough to be likely…