Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 18 Jan 2001 01:41:12 -0500
From:      "Bosko Milekic" <bmilekic@technokratis.com>
To:        <arch@FreeBSD.ORG>
Cc:        "John Baldwin" <jhb@FreeBSD.org>, <jasone@freebsd.org>
Subject:   mtx_init: MTX_RECURSE and potential SMP troublespots
Message-ID:  <001d01c08119$a62482b0$1f90c918@jehovah>

next in thread | raw e-mail | index | archive | help
Hello,

    This patch: http://people.freebsd.org/~bmilekic/code/recur_witn.diff
introduces MTX_RECURSE (the old MTX_RECURSE state bit is renamed to
MTX_RECURSED) and makes passing the MTX_RECURSE bit to mtx_init() get
registered in the witness code such that during execution, the WITNESS
portion of mtx_enter/mtx_exit (witness_enter/witness_exit) will call panic if
a mutex is recursed but did not have this bit in its flags during
initialization.
    This is a first change for debugging purposes mainly in preparation for a
modified, cleaner, mtx_*() interface which is likely to be coming to -CURRENT
eventually.

    The diff also cleans up some mtx_recurse != 0 checks to use the already
provided mtx_recursed() macros. Final reviews are welcome.

    (LINT builds with this, by the way, and -CURRENT on my machine seems to
be going fine, so far).

    Now, the important bit. The patch attempts to add MTX_RECURSE to existing
mutex locks that are known to recurse. Amongst these, we have:

eventhandler, Giant, callout, sched_lock, and a whole bunch of network card
drivers in pci/* and company, as well as some others which I was able to find
out recurse. If you are a maintainer or have added locks to some subsystem,
please glance at the patch and make sure that if your locks recurse, that
their mtx_init() has been changed to include the MTX_RECURSE bit. If this is
not the case, please let us know ASAP. Furthermore, if you've added a lock
that you know/think should not be recursing and you see its mtx_init() call
was modified in the patch to include the MTX_RECURSE bit, also please let us
know. While going around the code, I've spotted some definite problems, and
some potential problems. There are likely more which I've missed, so if
you've added any locks, please make sure to review for similar problems.

sys/dev/acpica/Osd/OsdSynch.c probably has a bug where
AcpiOsDeleteSemaphore() does not mtx_destroy() on its previously mtx_init()ed
mutex locks.

sys/dev/ichsmb passes MTX_NORECURSE to mtx_init(), when it's only supposed to
be passed to mtx_exit(). Apparently, this is not in mutex(9) [yet?]

sys/dev/isp probably calls tsleep() with a mutex held and has a wakeup() for
that sleep surrounded by the mutex lock's acquire/drop.

sys/dev/pccbb seems to protect wakeup() calls w/ a mutex enter/exit. This is
odd because just code is typically not supposed to be protected with a mutex.
This may be a problem because I believe I saw this code drop the mutex before
calling tsleep() which may be a race because in this case, a wakeup() can
come before the actual tsleep() is entered (or before the switch happens) and
therefore the thread will go to sleep even if a wakeup() was already issued
just prior to.

sys/dev/yarrow.c "may" be violating lock order (there _may_ be a lock order
reversal); the fact that printf()s are surrounded with Giant lock enter/exit
and that some of the routines containing the printf() calls are called with
the driver's lock held and that some of them aren't may be a problem if those
that are not do attempt to acquire the driver lock and are entered with Giant
already held (from somewhere). There may be a lock reversal; but I haven't
been able to investigate sufficiently to be able to claim this with absolute
certainty.

Regards,
Bosko.




To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?001d01c08119$a62482b0$1f90c918>