Opened 6 years ago
Closed 6 years ago
#10964 closed enhancement (wontfix)
unbound : this should use /dev/urandom rather than /dev/random.
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | normal | Milestone: | 8.3 |
Component: | BOOK | Version: | SVN |
Severity: | normal | Keywords: | |
Cc: |
Description
Until the fix for CVE-2018-1108 (4.17-rc1, backported to 4.16.4) /dev/random was available early, even if its initialisation had not completed.
Before that fix (https://lkml.org/lkml/2018/4/12/711) unbound started promptly on all of my desktop systems, but after it some machines took two and a half minutes to run the unbound bootscript. Keying Ctrl-C a few times, and perhaps <enter>, or just thumping on the keyboard, seems to expedite matters. The affected machines only have SSD drives, but some others with just an SSD continue to start normally (perhaps 5 seconds for unbound to run).
I use unbound to prevent DNS spoofing, and IMHO that does not require cryptographic-quality randomness. As noted in http://lists.linuxfromscratch.org/pipermail/blfs-support/2018-July/080277.html et seq., unbound appears to try to fall back to /dev/urandom if using /dev/random fails. But /dev/random now correctly hangs until sufficient entropy is available, so it never fails and the fallback is never reached.
And Nixos apparently creates /var/lib/unbound/dev/random and bind mounts /dev/urandom to it. We, at least in sysv, currently re-seed /dev/urandom after unbound has been started.
1. At the moment I have no idea if (on a regular desktop machine) the value returned by an unseeded /dev/urandom is consistent across boots, or if the re-seeding needs to be brought forward (but it appears that either it should be, or else it might not be needed at all, except perhaps on VMs).
2. The more important issue is whether my evaluation of the code in unbound is correct. For this, trying to patch out the use of /dev/random so that it should fall back to /dev/urandom seems to be the first line of approach, even if the Nixos approach turns out to be better for the long term.
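To illustrate why a fallback of this kind never triggers when /dev/random merely hangs, here is a minimal sketch in Python. It is not unbound's code: `read_seed` and the stand-in file names are hypothetical, and the demonstration uses ordinary files so that nothing can block.

```python
import os
import tempfile


def read_seed(paths, nbytes=16):
    """Return (path_used, data) from the first source that opens and reads.

    Only an OSError (missing file, permission denied, ...) moves on to the
    next path. A read that merely blocks never raises, so no fallback occurs.
    """
    last_error = None
    for path in paths:
        try:
            with open(path, "rb", buffering=0) as src:
                data = src.read(nbytes)
            if data is not None and len(data) == nbytes:
                return path, data
            last_error = OSError(f"short read from {path}")
        except OSError as err:
            last_error = err
    raise last_error or OSError("no sources given")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        missing = os.path.join(tmp, "random")    # stands in for a failing /dev/random
        present = os.path.join(tmp, "urandom")   # stands in for /dev/urandom
        with open(present, "wb") as f:
            f.write(os.urandom(16))
        used, seed = read_seed([missing, present])
        print(os.path.basename(used), seed.hex())
```

With a real /dev/random that exists but has too little entropy, the first `read` would simply wait, which matches the delay seen in the bootscript.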
Change History (6)
comment:1 by , 6 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:2 by , 6 years ago
comment:3 by , 6 years ago
Not making any progress, but for re-seeding urandom, the man page (urandom.4) says:
If a seed file is saved across reboots as recommended below (all major Linux distributions have done this since 2000 at least), the output is cryptographically secure against attackers without local root access as soon as it is reloaded in the boot sequence, and perfectly adequate for network encryption session keys.
So I guess we should continue to re-seed it for protection.
follow-up: 5 comment:4 by , 6 years ago
Hmm, apparently (since linux-3.17) /dev/urandom *does* block until properly initialised with 128 bits of entropy (noise). On this machine, I have typically 80 bits of entropy when the unbound bootscript starts.
Link for the design of the change in 3.17 : https://lwn.net/Articles/606141/ and a further confirmation from Python PEP 524 https://www.python.org/dev/peps/pep-0524/ -
Python 3.5.0 was enhanced to use the new getrandom() syscall introduced in Linux 3.17 and Solaris 11.3. The problem is that users started to complain that Python 3.5 blocks at startup on Linux in virtual machines and embedded devices: see issues hash25420 and hash26839. - I changed '#' to 'hash' to stop trac trying to link to tickets with those numbers.
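The blocking behaviour described here can be probed without hanging. The following is a rough sketch, assuming Linux 3.17 or later and Python 3.6 or later; the helper names `pool_is_initialised` and `entropy_estimate` are made up for illustration.

```python
import os


def pool_is_initialised():
    """Return True/False, or None if getrandom() is unavailable here."""
    if not hasattr(os, "getrandom"):
        return None  # not Linux, or Python older than 3.6
    try:
        # GRND_NONBLOCK makes the call fail instead of waiting for the pool.
        os.getrandom(1, os.GRND_NONBLOCK)
        return True
    except BlockingIOError:
        return False  # the pool is not yet initialised
    except OSError:
        return None  # e.g. the syscall is missing on an older kernel


def entropy_estimate(path="/proc/sys/kernel/random/entropy_avail"):
    """Return the kernel's entropy estimate in bits, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("initialised:", pool_is_initialised())
    print("entropy_avail:", entropy_estimate())
```

Run early in the boot sequence, a `False` result would indicate that any program calling getrandom(2) without the non-blocking flag will wait.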
comment:5 by , 6 years ago
Replying to ken@…:
I changed '#' to 'hash' to stop trac trying to link to tickets with those numbers.
Use an exclamation mark in trac to skip special processing: !#1234
comment:6 by , 6 years ago
Resolution: | → wontfix |
---|---|
Status: | assigned → closed |
According to Ted Ts'o, it is calls to getrandom(2) which are blocking. I queried on lkml, after finding that pointing unbound to /dev/urandom still blocked, and the only mention of random in strace was /dev/urandom.
More importantly, haveged will not 'pollute' the entropy for ever:
This really depends on how paranoid / careful you are. Remember, your keyboard controller was almost certainly built in Shenzhen, China, and Matt Blaze published a paper on the Jitterbug in 2006: http://www.crypto.com/papers/jbug-Usenix06-final.pdf In practice, after 30 minutes of operation, especially if you are using the keyboard, the entropy pool *will* be sufficiently randomized, whether or not it was sufficiently randomized at boot. The real danger of CVE-2018-1108 was always long-term keys generated at first boot. That was the problem that was discussed in the "Mining your p's and q's: Detection of Widespread Weak Keys in Network Devices" (see https://factorable.net). So generating long-lived keys means (a) you need to be sure you trust all of the software on the system --- some very paranoid people such as Bruce Schneier used a freshly installed machine from CD-ROM that was never attached to the network before examining materials from Edward Snowden, and (b) making sure the entropy pool is initialized. Remember we are constantly feeding input from the hardware sources into the entropy pool; it doesn't stop the moment we think the entropy pool is initialized. And you can always mix extra "stuff" into the entropy pool by echoing the results of say, taking series of dice rolls, and sending it via the "cat" or "echo" command into /dev/urandom. So it should be possible to use the machine for generated long lived keys; you might just need to be a bit more careful before you do it.
So, since fixing unbound to use /dev/urandom will not speed up the boot when entropy is lacking, this is not worth fixing.
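The suggestion in the quoted reply, mixing dice rolls into the pool by writing them to /dev/urandom, can be sketched as follows. The function names are hypothetical, and the demonstration writes to a temporary file rather than the real device. Per random(4), data written to the device is mixed into the pool but does not raise the kernel's entropy count.

```python
import os
import tempfile


def dice_to_bytes(rolls):
    """Encode a sequence of six-sided dice rolls as ASCII digits."""
    for roll in rolls:
        if roll not in (1, 2, 3, 4, 5, 6):
            raise ValueError(f"not a six-sided die roll: {roll!r}")
    return "".join(str(roll) for roll in rolls).encode("ascii")


def mix_into_pool(data, device="/dev/urandom"):
    """Write data to the device and return the number of bytes written."""
    with open(device, "wb", buffering=0) as dev:
        return dev.write(data)


if __name__ == "__main__":
    rolls = [3, 1, 6, 2, 5, 4, 4, 1]
    # A scratch file stands in for /dev/urandom in this demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "urandom")
        written = mix_into_pool(dice_to_bytes(rolls), device=target)
        print("bytes written:", written)
```

Passing no `device` argument would write to the real /dev/urandom, which is the equivalent of the "echo" approach mentioned in the quote.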
For the first question, I ran three boots on the kaveri (i.e. the machine without a hwrng). In the unbound script I added the following at the start:
I only managed to note about 10 bytes before unbound was run, but I managed the remaining 6 while waiting for it to run. In all three cases the first 16 bytes were different on each run, so I conclude that the random bootscript is not needed on normal machines. On VMs, maybe it is useful, or perhaps it really adds nothing to the process.
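A comparable check can be written in Python. This is only an illustrative sketch, not the snippet that was added to the bootscript: `first_bytes` and `all_distinct` are hypothetical helpers, and the sample values in the demonstration are placeholders, not measurements.

```python
def first_bytes(device="/dev/urandom", nbytes=16):
    """Read nbytes from the device and return them as a hex string."""
    with open(device, "rb", buffering=0) as dev:
        return dev.read(nbytes).hex()


def all_distinct(samples):
    """Return True if no two recorded samples are identical."""
    return len(set(samples)) == len(samples)


if __name__ == "__main__":
    # Print one sample; in a boot test this would be logged once per boot.
    print(first_bytes())
    # Placeholder values standing in for samples noted on three boots.
    recorded = ["aa" * 16, "bb" * 16, "cc" * 16]
    print("all distinct:", all_distinct(recorded))
```

Logging the output to a file on each boot would avoid having to note the bytes by hand before the next script runs.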