Opened 3 years ago

Closed 3 years ago

#10964 closed enhancement (wontfix)

unbound : this should use /dev/urandom rather than /dev/random.

Reported by: ken@…
Owned by: ken@…
Priority: normal
Milestone: 8.3
Component: BOOK
Version: SVN
Severity: normal
Keywords:
Cc:

Description

Until the fix for CVE-2018-1108 (4.17-rc1, backported to 4.16.4) /dev/random was available early, even if its initialisation had not completed.

Before that fix (https://lkml.org/lkml/2018/4/12/711), unbound started promptly on all of my desktop systems. After it, some machines took two and a half minutes to run the unbound bootscript; keying Ctrl-C a few times, and perhaps <enter>, or just thumping on the keyboard, seems to expedite matters. The affected machines only have SSD drives, but some other machines with just an SSD continue to start normally (perhaps 5 seconds for unbound to run).

I use unbound to prevent DNS spoofing; IMHO that does not require cryptographic-quality randomness. As noted in http://lists.linuxfromscratch.org/pipermail/blfs-support/2018-July/080277.html et seq., unbound appears to try to fall back to /dev/urandom if using /dev/random fails. But /dev/random now correctly hangs until sufficient entropy is available, so the read does not fail and the fallback is never reached.

And NixOS apparently creates /var/lib/unbound/dev/random and bind mounts /dev/urandom to it. We, at least in sysv, currently re-seed /dev/urandom after unbound has been started.
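For illustration, the bind-mount approach could look roughly like the sketch below. This is not taken from NixOS or from the book: the helper names run and bind_urandom are hypothetical, and it assumes unbound runs chrooted in /var/lib/unbound. By default it only prints the commands; actually running them needs root.

```shell
#!/bin/sh
# Sketch only: make the chroot's dev/random an alias of the host's
# /dev/urandom. CHROOT_DIR follows the path mentioned in this ticket;
# the function names are hypothetical.

CHROOT_DIR="${CHROOT_DIR:-/var/lib/unbound}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = run them (needs root)

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

bind_urandom() {
    run mkdir -p "$CHROOT_DIR/dev"
    # A bind mount needs an existing target file to mount over.
    run touch "$CHROOT_DIR/dev/random"
    run mount --bind /dev/urandom "$CHROOT_DIR/dev/random"
}

bind_urandom
```

With the defaults this prints the three commands and changes nothing, which makes it easy to review before setting DRY_RUN=0.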

1. At the moment I have no idea if (on a regular desktop machine) the value returned by an unseeded /dev/urandom is consistent across boots, or if the re-seeding needs to be brought forward (but it appears that either it should be, or else it might not be needed at all, except perhaps on VMs).

2. The more important issue is whether my evaluation of the code in unbound is correct. For this, trying to patch out the use of /dev/random so that it should fall back to /dev/urandom seems to be the first line of approach, even if the NixOS approach turns out to be better for the long term.

Change History (6)

comment:1 by ken@…, 3 years ago

Owner: changed from blfs-book to ken@…
Status: new → assigned

comment:2 by ken@…, 3 years ago

For the first question, I ran three boots on the kaveri (i.e. the machine without a hwrng). In the unbound script I added the following at the start:

      /bin/dd if=/dev/urandom of=/tmp/random-seed count=1 &>/dev/null
      echo "random"
      hexdump /tmp/random-seed | head -n 2
      sleep 20

I only managed to note about 10 bytes before unbound was run, but I managed the remaining 6 while waiting for it to run. In all three cases the first 16 bytes were different on each run, so I conclude that the random bootscript is not needed on normal machines. On VMs, maybe it is useful, or perhaps it really adds nothing to the process.
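Rather than copying the bytes down by hand, the same check could be logged. A minimal sketch, assuming a log file that survives reboots; sample_hex and log_sample are hypothetical helper names and the log path in the comment is only an example:

```shell
#!/bin/sh
# Sketch: record the first 16 bytes of a random source as hex, one line per
# call, so that samples from successive boots can be compared afterwards.

sample_hex() {
    # $1: device or file to read (defaults to /dev/urandom)
    dd if="${1:-/dev/urandom}" bs=16 count=1 2>/dev/null |
        od -An -tx1 | tr -d ' \n'
    echo
}

log_sample() {
    # $1: log file to append to (path chosen by the caller)
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(sample_hex)" >> "$1"
}

# Example call from a bootscript (the path is an assumption):
#   log_sample /var/log/urandom-samples.log
```

Comparing the hex column of the log after a few boots answers the same question as the manual test above.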

comment:3 by ken@…, 3 years ago

Not making any progress, but for re-seeding urandom, the man page (urandom.4) says:

If a seed file is saved across reboots as recommended below (all major Linux distributions have done this since 2000 at least), the output is cryptographically secure against attackers without local root access as soon as it is reloaded in the boot sequence, and perfectly adequate for network encryption session keys.

So I guess we should continue to re-seed it for protection.

comment:4 by ken@…, 3 years ago

Hmm, apparently (since linux-3.17) /dev/urandom *does* block until properly initialised with 128 bits of entropy (noise). On this machine, I have typically 80 bits of entropy when the unbound bootscript starts.
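For anyone wanting to repeat the measurement, the kernel's own estimate can be read from /proc. A small sketch that could be dropped into a bootscript; entropy_avail is a hypothetical function name, and the fallback to "unknown" is only there for systems where the file is not readable:

```shell
#!/bin/sh
# Sketch: report the kernel's current entropy estimate, in bits.

entropy_avail() {
    f=/proc/sys/kernel/random/entropy_avail
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unknown"
    fi
}

echo "entropy_avail: $(entropy_avail)"
```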

Link for the design of the change in 3.17 : https://lwn.net/Articles/606141/ and a further confirmation from Python PEP 524 https://www.python.org/dev/peps/pep-0524/ -

Python 3.5.0 was enhanced to use the new getrandom() syscall introduced in Linux 3.17 and Solaris 11.3. The problem is that users started to complain that Python 3.5 blocks at startup on Linux in virtual machines and embedded devices: see issues hash25420 and hash26839.

(I changed '#' to 'hash' to stop trac trying to link to tickets with those numbers.)

comment:5 by Bruce Dubbs, 3 years ago (in reply to comment:4)

Replying to ken@…:

I changed '#' to 'hash' to stop trac trying to link to tickets with those numbers.

Use an exclamation mark in trac to skip special processing: writing !#1234 is displayed as plain #1234.

comment:6 by ken@…, 3 years ago

Resolution: wontfix
Status: assigned → closed

According to Ted Ts'o, it is calls to getrandom(2) which are blocking. I queried on lkml, after finding that pointing unbound to /dev/urandom still blocked, and the only mention of random in strace was /dev/urandom.

More importantly, haveged will not 'pollute' the entropy for ever:

This really depends on how paranoid / careful you are.  Remember, your
keyboard controller was almost certainly built in Shenzhen, China, and
Matt Blaze published a paper on the Jitterbug in 2006:
http://www.crypto.com/papers/jbug-Usenix06-final.pdf
In practice, after 30 minutes of operation, especially if you are
using the keyboard, the entropy pool *will* be sufficiently
randomized, whether or not it was sufficiently randomized at boot.  The
real danger of CVE-2018-1108 was always long-term keys generated at
first boot.  That was the problem that was discussed in the "Mining
your p's and q's: Detection of Widespread Weak Keys in Network
Devices" (see https://factorable.net).

So generating long-lived keys means (a) you need to be sure you trust
all of the software on the system --- some very paranoid people such
as Bruce Schneier used a freshly installed machine from CD-ROM that
was never attached to the network before examining materials from
Edward Snowden, and (b) making sure the entropy pool is initialized.

Remember we are constantly feeding input from the hardware sources
into the entropy pool; it doesn't stop the moment we think the entropy
pool is initialized.  And you can always mix extra "stuff" into the
entropy pool by echoing the results of say, taking series of dice
rolls, and sending it via the "cat" or "echo" command into
/dev/urandom.

So it should be possible to use the machine for generating long-lived
keys; you might just need to be a bit more careful before you do it.

So, since fixing unbound to use /dev/urandom will not speed up the boot when entropy is lacking, this is not worth fixing.
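For reference, the suggestion in the quoted reply about mixing extra input into the pool could be sketched as follows. This is a minimal example, not something from the book: mix_into_pool is a hypothetical helper name and the dice values are made up. According to random(4), data written to /dev/urandom is mixed into the pool but does not raise the entropy count, so this would not by itself unblock getrandom(2) early in the boot.

```shell
#!/bin/sh
# Sketch: mix caller-supplied text (e.g. a series of dice rolls) into the
# kernel's pool by writing it to /dev/urandom.

mix_into_pool() {
    # $1: text to mix in
    # $2: target to write to (defaults to /dev/urandom; a regular file can
    #     be given instead to see exactly what would be written)
    printf '%s\n' "$1" > "${2:-/dev/urandom}"
}

# Example (made-up rolls):
#   mix_into_pool "3 5 1 6 2 2 4 1 6 5"
```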
