Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 6 Oct 2016 11:13:10 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        "Philip M. Gollucci" <pgollucci@p6m7g8.com>
Cc:        Alan Cox <alc@freebsd.org>, src-committers@freebsd.org,  svn-src-user@freebsd.org
Subject:   Re: svn commit: r306713 - in user/alc/PQ_LAUNDRY: lib/libc/stdlib lib/msun/ld80 lib/msun/src sys/cam sys/vm
Message-ID:  <20161006101333.N1163@besplex.bde.org>
In-Reply-To: <CACM2dAa0XrUm4zBaPnSQGrHY8%2B2RLYU5PKC%2BtgDTPrP0_r52XQ@mail.gmail.com>
References:  <201610051803.u95I3Hq1040052@repo.freebsd.org> <CACM2dAa0XrUm4zBaPnSQGrHY8%2B2RLYU5PKC%2BtgDTPrP0_r52XQ@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 5 Oct 2016, Philip M. Gollucci wrote:

> I know you(Alan) didn't write this, its from the MFH, but how would len !=
> expected.
>
> neither are initialized or passed in
> both times its set its expected = len = foo

Er, both are initialized.  len is passed as an input/output parameter to
sysctl().  sysctl() can return a short count.  This is now detected by
recording the expected value in 'expected' and mishandled.  Previously,
the error wasn't even detected.

> On Wed, Oct 5, 2016 at 2:03 PM, Alan Cox <alc@freebsd.org> wrote:
>
>> Modified: user/alc/PQ_LAUNDRY/lib/libc/stdlib/random.c
>> ============================================================
>> ==================
>> --- user/alc/PQ_LAUNDRY/lib/libc/stdlib/random.c        Wed Oct  5
>> 17:32:06 2016        (r306712)
>> +++ user/alc/PQ_LAUNDRY/lib/libc/stdlib/random.c        Wed Oct  5
>> 18:03:17 2016        (r306713)
>> @@ -270,16 +270,17 @@ void
>>  srandomdev(void)
>>  {
>>         int mib[2];
>> -       size_t len;
>> +       size_t expected, len;
>>
>>         if (rand_type == TYPE_0)
>> -               len = sizeof(state[0]);
>> +               expected = len = sizeof(state[0]);
>>         else
>> -               len = rand_deg * sizeof(state[0]);
>> +               expected = len = rand_deg * sizeof(state[0]);
>>
>>         mib[0] = CTL_KERN;
>>         mib[1] = KERN_ARND;
>> -       sysctl(mib, 2, state, &len, NULL, 0);
>> +       if (sysctl(mib, 2, state, &len, NULL, 0) == -1 || len != expected)
>> +               abort();
>>
>>         if (rand_type != TYPE_0) {
>>                 fptr = &state[rand_sep];

I don't like this set of changes.

Originally, srandomdev() read from /dev/random.  read() can also return a
short count.  This was detected and mishandled, but not as badly as now.
The non-error of a short count was treated as an error, and both were
handled by falling back to the weak method of using gettimeofday().

This was broken by switching to the sysctl and not even checking for
errors from the sysctl.  The sysctl can and does fail, mainly when a
new userland is used with an old kernel.

This commit restores some error checking.

The implementation uses the style bugs of using the numbered sysctl
KERN_ARND.  This sysctl shouldn't exist by number, and isn't documented
by number.  It is too new to need a number or benefit from the careful
documentation of old numbered sysctls.  It is only documented by name
(kern.arandom), only in a wrong place (random(4)), and a naive reader
would expect from reading this man page that /dev/random is still the
primary interface.  Its behaviour of returning a short count might be
expected from the device's behaviour, but is not really documented for
the sysctl.  It is only documented that the sysctl will not return
random bytes unless the random device is seeded.  The return value for
this case is undocumented (is it 0 or -1?   Source code seems to say
that it is 0).  Anyway, errors seems to be possible, so abort() is not
acceptable handling.

This was handled much better in arc4random() and is still handled better
there despite recent regressions:
- the sysctl is wrapped by a function that retries for short counts.
   However, the sysctl usage has even larger style bugs -- the sysctl is
   by number, and sysctl() is named __sysctl()
- error handling was not missing.  Perhaps the retry loop can spin forever
   with a short count of 0, but if the sysctl fails then there was a fallback
   first to reading /dev/random and then to the weak gettimeofday() method.

Now there is no fallback, and the difference is just the retry loop.

The first step in the removed fallback in arc4random() was not weaker
like the log message says.  It is to the primary method according to
the man page.  According to the man page, reading /dev/random blocks
until the RNG is seeded for the first time.  Blocking for the sysctl
is undocumented.  I think the sysctl returns a short count of 0, and
we treat this as an error and abort(), so applications running early
in the boot crash instead of having the traditional behaviour of hanging.

It is unclear if short counts of nonzero can actually occur.  I think
they could easily occur in older implementations where /dev/random
blocked for more cases than before the initial seeding.  I think they
should still be possible.  Even if there is no need to block, it is
useful to be able to kill a read() with a preposterously large count.
If interrupts are allowed at all, then after one the return value must
be either -1 or a short count.  That is another reason why abort() in
a library is bad error handling.  It turns most signals into SIGABRT.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20161006101333.N1163>