Date:      Wed, 11 May 2016 09:48:56 -0700
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-arch@freebsd.org, cem@freebsd.org
Subject:   Re: KASSERT: always assert; KWARN
Message-ID:  <31200026.OetD7h0dHc@ralph.baldwin.cx>
In-Reply-To: <CAG6CVpWNzmaqKsOKJtuG1642m0knkupAoK=BoGq5iNHC1TA-Gw@mail.gmail.com>
References:  <CAG6CVpWzuK6cZx3QnQhKOu=6GZBJF4cJQdNXgJZeXYhuJJANJg@mail.gmail.com> <ca489ff1-f520-3feb-0525-425eb015af7a@freebsd.org> <CAG6CVpWNzmaqKsOKJtuG1642m0knkupAoK=BoGq5iNHC1TA-Gw@mail.gmail.com>

On Wednesday, May 11, 2016 09:30:46 AM Conrad Meyer wrote:
> On Wed, May 11, 2016 at 9:04 AM, Alfred Perlstein <alfred@freebsd.org> wrote:
> > On 5/10/16 6:24 PM, Conrad Meyer wrote:
> >
> >> Thoughts or objections?  Does anyone like the ability to opt out of
> >> invariants asserts?
> >
> >
> > Yes.
> >
> > During my time at iXsystems we used this facility several times to get a log
> > from a customer site with a number of "kasserts".
> >
> > We did this for multiple reasons:
> > 1) We needed to ship a kernel with asserts enabled.
> > 2) When we did, the first assert hit was:
> >     a) In an unrelated module and not relevant.
> >     b) Not enough information came back from just the first assert.
> > 3) We found it more useful to get multiple errors back from a customer in
> > one trip rather than one fix at a time.  Unfortunately one fix at a time
> > would have had us lose the customer.
> >
> > The KASSERT/assert system is very, very, very useful.  However if you are at
> > a last resort sending a debug kernel (with KASSERT enabled) and do not get
> > enough information back then you will lose that customer.
> >
> > I understand that a few vocal folks are upset, like seriously, seriously
> > upset, however at the time this was the only way we could effectively debug
> > a customer problem and my hope was that others could make use of it as well.
> >
> > Linux has had similar functionality for many years.  In Linux there is the
> > kernel "oops()" which does nearly the same thing.
> >
> > Initially I mocked Linux's "oops" for being silly and "wrong", using the
> > exact same reasons that many have used to dislike "kassert_warn".  However
> > once I was responsible for an extremely pissed off customer who was paying
> > us quite a sum of money AND I was not getting enough information back, I
> > changed my mind.
> >
> > https://en.wikipedia.org/wiki/Linux_kernel_oops
> 
> Hi,
> 
> Here's my follow-up from the Phabricator review.  (Alfred, you've
> already seen it.  But, for everyone else:)
> 
> Here's another proposal:
> 
> Add a mode between INVARIANTS off and INVARIANTS on. Call it
> INVARIANTS_OPTIONAL.
> 
> * In !INVARIANTS mode, you don't have KASSERTs at all (like today).
> 
> * In INVARIANTS && INVARIANTS_OPTIONAL mode, you get all the
> print-and-continue KASSERT knobs you have today. (So, same default of
> panicking, but optionally they can be disabled and turned into logs.)
> 
> * In INVARIANTS && !INVARIANTS_OPTIONAL mode, KASSERT always panics.
> 
> I would suggest GENERIC move from the current mode, effectively
> INVARIANTS_OPTIONAL, to !INVARIANTS_OPTIONAL. But if you want to ship
> a kernel with pass-through assertions enabled, you can still do that
> by enabling INVARIANTS_OPTIONAL.
> 
> This gives the expected KASSERT behavior for Coverity modeling, and
> still enables the KASSERT-lite use case.  (It would just be kicked out
> of GENERIC.)
> 
> Adrian, would that meet your needs?

Eh, if you keep going past many of the assertions the original code
enabled, you will get _more_ bogus assertions as fallout.  For example,
if you aren't holding the correct locks (or you try to unlock a lock
locked in a different mode), then you are going to start corrupting data
resulting in false positives.  This is like getting 20 compiler errors
due to missing a semicolon earlier in the file.  The first error is
real, the rest are noise.  Sometimes there are actual errors in the noise,
but you have to sort through a lot of noise.  However, the compiler doesn't
normally mangle the rest of your source code after your first error, but
in the case of assertions the kernel often _is_ going to start mangling
your data after your first failure.
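
To make the cascade concrete, here is a minimal userspace sketch in C.
Nothing in it is kernel code: SOFT_ASSERT, toy_lock, and run_scenario are
hypothetical names invented for illustration, and SOFT_ASSERT only imitates
a log-and-continue assertion.  One stray unlock is the only real bug, yet
the correct lock/unlock pair that follows trips two more assertions because
the hold count is already wrong.

```c
#include <stdio.h>

/* Hypothetical log-and-continue assertion: counts failures, never stops. */
static int soft_failures = 0;

#define SOFT_ASSERT(cond, msg) do {                                     \
        if (!(cond)) {                                                  \
                soft_failures++;                                        \
                fprintf(stderr, "assertion failed: %s\n", (msg));       \
        }                                                               \
} while (0)

/* Toy lock that only tracks a hold count (an assumption for this sketch). */
struct toy_lock {
        int held;
};

static void
toy_lock_acquire(struct toy_lock *l)
{
        SOFT_ASSERT(l->held == 0, "lock already held");
        l->held++;
}

static void
toy_unlock(struct toy_lock *l)
{
        SOFT_ASSERT(l->held > 0, "unlock of a lock that is not held");
        l->held--;      /* still runs after a failure, so state is corrupted */
}

/* One real bug (a stray unlock) followed by otherwise correct code. */
static int
run_scenario(void)
{
        struct toy_lock l = { 0 };

        soft_failures = 0;
        toy_unlock(&l);         /* the real bug: held drops to -1 */
        toy_lock_acquire(&l);   /* correct call, but held != 0: false positive */
        toy_unlock(&l);         /* correct call, but held == 0: false positive */
        return (soft_failures);
}
```

Here run_scenario() counts three assertion failures for a single bug; only
the first one points at the actual problem.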

If you have things that aren't actual errors but for which you _handle_
the unexpected case instead of just blindly corrupting data, then you
can use KWARN for those cases.  However, you have to actually handle the
condition for that to be safe.
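
As a rough sketch of what "handling the condition" means, assuming a
KWARN-like primitive that logs and returns instead of panicking:
kwarn_sketch, small_buf, and small_buf_append below are hypothetical
userspace names, not the proposed kernel API.  The unexpected request is
logged and then refused, so the buffer is never written out of bounds.

```c
#include <stdio.h>
#include <string.h>

static int warnings_logged = 0;

/*
 * Hypothetical stand-in for a KWARN-style check: logs when cond is false,
 * never panics, and returns cond so the caller can handle the failure.
 */
static int
kwarn_sketch(int cond, const char *msg)
{
        if (!cond) {
                warnings_logged++;
                fprintf(stderr, "warning: %s\n", msg);
        }
        return (cond);
}

struct small_buf {
        char    data[8];
        size_t  len;    /* invariant: len <= sizeof(data) */
};

/* Append n bytes, or warn and refuse if they would not fit. */
static int
small_buf_append(struct small_buf *b, const char *src, size_t n)
{
        if (!kwarn_sketch(n <= sizeof(b->data) - b->len,
            "append would overflow buffer"))
                return (-1);    /* handled: state is left untouched */
        memcpy(b->data + b->len, src, n);
        b->len += n;
        return (0);
}
```

The warning is only safe here because the early return keeps the buffer
consistent; logging and then falling through to the memcpy would be the
"blindly corrupting data" case.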

In your case of the cdrom driver, if you weren't using it, then turn it
off (i.e. take it out of the kernel) if you don't have time to debug it.

-- 
John Baldwin
