Date:      Sat, 8 Jun 2019 19:35:38 -0400
From:      "Jonathan T. Looney" <jtl@freebsd.org>
To:        Enji Cooper <yaneurabeya@gmail.com>
Cc:        src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r348810 - head/sys/x86/x86
Message-ID:  <CADrOrmuKO%2BH1tweQ3Ctj6XixvfUBATSF%2BzmE5Rub2s=mgiaqQg@mail.gmail.com>
In-Reply-To: <760B9EAA-6FD7-4B23-9BE3-5011B3F4C21C@gmail.com>
References:  <201906081826.x58IQnQe067743@repo.freebsd.org> <760B9EAA-6FD7-4B23-9BE3-5011B3F4C21C@gmail.com>

Hi Enji,

On Sat, Jun 8, 2019 at 3:58 PM Enji Cooper <yaneurabeya@gmail.com> wrote:

> > Modified:
> >  head/sys/x86/x86/mca.c
> >
> > Modified: head/sys/x86/x86/mca.c
> >
> > ==============================================================================
> > --- head/sys/x86/x86/mca.c    Sat Jun  8 17:49:17 2019    (r348809)
> > +++ head/sys/x86/x86/mca.c    Sat Jun  8 18:26:48 2019    (r348810)
> > @@ -86,7 +86,6 @@ struct amd_et_state {
> >
> > struct mca_internal {
> >    struct mca_record rec;
> > -    int        logged;
> >    STAILQ_ENTRY(mca_internal) link;
> > };
> >
> > @@ -101,6 +100,7 @@ static MALLOC_DEFINE(M_MCA, "MCA", "Machine Check Arch
> >
> > static volatile int mca_count;    /* Number of records stored. */
> > static int mca_banks;        /* Number of per-CPU register banks. */
> > +static int mca_maxcount = -1;    /* Limit on records stored. (-1 = unlimited) */
> >
> > static SYSCTL_NODE(_hw, OID_AUTO, mca, CTLFLAG_RD, NULL,
> >     "Machine Check Architecture");
> > @@ -125,10 +125,11 @@ SYSCTL_INT(_hw_mca, OID_AUTO, erratum383, CTLFLAG_RDTU
> > static STAILQ_HEAD(, mca_internal) mca_freelist;
> > static int mca_freecount;
> > static STAILQ_HEAD(, mca_internal) mca_records;
> > +static STAILQ_HEAD(, mca_internal) mca_pending;
> > static struct callout mca_timer;
> > static int mca_ticks = 3600;    /* Check hourly by default. */
> > static struct taskqueue *mca_tq;
> > -static struct task mca_refill_task, mca_scan_task;
> > +static struct task mca_resize_task, mca_scan_task;
> > static struct mtx mca_lock;
> >
> > static unsigned int
> > @@ -557,32 +558,49 @@ mca_check_status(int bank, struct mca_record *rec)
> > }
> >
> > static void
> > -mca_fill_freelist(void)
> > +mca_resize_freelist(void)
> > {
> > -    struct mca_internal *rec;
> > -    int desired;
> > +    struct mca_internal *next, *rec;
> > +    STAILQ_HEAD(, mca_internal) tmplist;
> > +    int count, i, desired_max, desired_min;
> >
> >    /*
> >     * Ensure we have at least one record for each bank and one
> > -     * record per CPU.
> > +     * record per CPU, but no more than twice that amount.
> >     */
> > -    desired = imax(mp_ncpus, mca_banks);
> > +    desired_min = imax(mp_ncpus, mca_banks);
> > +    desired_max = imax(mp_ncpus, mca_banks) * 2;
> > +    STAILQ_INIT(&tmplist);
> >    mtx_lock_spin(&mca_lock);
> > -    while (mca_freecount < desired) {
> > +    while (mca_freecount > desired_max) {
> > +        rec = STAILQ_FIRST(&mca_freelist);
> > +        KASSERT(rec != NULL, ("mca_freecount is %d, but list is empty",
> > +            mca_freecount));
> > +        STAILQ_REMOVE_HEAD(&mca_freelist, link);
> > +        mca_freecount--;
> > +        STAILQ_INSERT_TAIL(&tmplist, rec, link);
> > +    }
> > +    while (mca_freecount < desired_min) {
> > +        count = desired_min - mca_freecount;
> >        mtx_unlock_spin(&mca_lock);
>
> Should this also be outside the loop, like it was before, to ensure the
> locking is correct, the lock is always unlocked, and the current thread
> yields to the other threads, or should the lock be held over both loops?
>

I think it is correct as is, but I welcome feedback if you still think it is
incorrect after reading my explanation.

Here is what the new code does:

1. While holding the lock, we figure out how many entries we want to add to
the free list and store that number in a local variable (count).

2. We drop the lock and allocate count entries.

3. We reacquire the lock, add those entries to the list, and increment
mca_freecount by count.

4. We then check to see if we need to allocate more entries (which might
occur if, for example, the free list had been depleted by other threads
between steps 1 and 3, while we didn't hold the lock).

The key thing is that we must hold the lock while reading or
writing mca_freecount and mca_freelist. This code meets that criterion.

Even if mca_freecount changes between steps 1 and 3 above, we still add
count entries to the list. And, we still increment mca_freecount by count.
So, in the end, mca_freecount should match the number of entries on the
list. (It might be more or less than our target, but it will be accurate.)

The only reason we need to drop the lock is to allocate more entries, which
we can't do while holding a spin lock. I don't see a particular need to yield
more often than that.

Also, note that you'll only ever enter one of the two loops. (If you are
freeing excess entries, you won't need to allocate more to meet a shortage.)

Finally, note that you'll never leave this function with less than the
desired minimum, but you might leave with more than the desired maximum (if
entries are freed by other threads between steps 1 and 3). I considered
that an acceptable tradeoff between complexity and functionality. And, in
any case, if you are using and freeing entries so quickly that this occurs,
two things are likely true: you can probably use the extra headroom on the
free list, and something will very likely call the resize task again soon.

If I haven't explained this well enough and you still have questions, feel
free to let me know.

Jonathan


