Date: Tue, 1 Jan 2008 18:57:48 +0100
From: Peter Schuller <peter.schuller@infidyne.com>
To: freebsd-current@freebsd.org
Cc: Pawel Jakub Dawidek <pjd@freebsd.org>, current@freebsd.org
Subject: Re: (ZFS?): panic: lockmgr: locking against myself
Message-ID: <200801011857.57757.peter.schuller@infidyne.com>
In-Reply-To: <200707310126.06923.peter.schuller@infidyne.com>
References: <200707282028.37102.peter.schuller@infidyne.com> <200707292157.09742.peter.schuller@infidyne.com> <200707310126.06923.peter.schuller@infidyne.com>

(quoting last post for convenience; more history at
http://www.usenetarticles.com/thread/952336.html)

> > vnode 0xffffff00037473e0: tag devfs, type VDIR
> > usecount 0, writecount 0, refcount 1 mountedhere 0xffffff0003745ca0
> > flags (VV_ROOT)
> > lock type devfs: EXCL (count 1) by thread 0xffffff00010e6680 (pid 1)
>
> Some additional facts:
>
> Looking at the printouts, there is always a sequence of three or more
> (three at least twice; more than three at least once) vrele():s of the same
> vnode, in both the successful case and the panicing case. There are no
> vrele():s of any other vnodes in either case.
>
> Inserting enter/exit debug printouts in mountcheckdirs() confirms that all
> calls occur within the bounds of a single call to mountcheckdirs(). Does
> not this imply there is some locking mismatch in the non-ZFS specific code?
> I must admit I find the locking confusing; with several locking/unlocking
> functions/macros intermixed at different levels in the callstack. My
> (incorrect) reading was that this panic should always be happening, which
> is obviously not the case.
>
> Running with vfs.zfs.debug=1 confirms that vdev_geom open/attach/detach is
> happening prior to any vrele() even in the panicing case (i.e., zfs pool
> discovery seems to complete).
>
> In the case of an expected provider not being found, vd->vdev_devid is NULL
> in vdev_geom_open(), based on the "provider not found" debug printout
> (perhaps normal?).

I *think* I just experienced the same problem on 7.0-BETA3, except the kernel
does not have WITNESS/INVARIANTS, so I just get a hang instead of a panic. I
wanted to post the information I have for completeness; I realize what
follows is a bunch of anecdotal mumbo-jumbo.

The boot-up process hangs right before the would-be "trying to mount root
from ..." message, after all the glabel tasting has completed.

This was on a completely different system than the one in the original post,
but it also has root-on-zfs (this time on a 5 disk raidz2). It's a dual core
amd64 machine with a low-end mobo and low-end SATA controllers (SiI and some
built-in nVidia chipset).

It all started when I was booting back into FreeBSD after having had Windows
booted for a while. It wouldn't boot. I fiddled some with vfs.zfs.debug=1
and removed a CD in the drive (in case it affected timing), but it did not
help. I did not try the boot-7-live cd trick this time as I did originally on
the other machine.

I looked carefully to make sure all drives were detected, including geom
tasting on all but one of them that are in the zfs pool. The I/O indicator
LEDs on the respective drives that are part of the zfs pool did not indicate
any I/O after the hang. I waited 5+ minutes at least once in the hope that it
was a drive timing out.

After several attempts I turned off the machine and let it do a cold boot; at
this point the system booted fine.

This is different from before, in that previously the behavior was seemingly
triggered by changes in system configuration (loss of a drive, etc). This
time it was just a reboot. I *did* touch a bunch of cables in between, and
blew some air on components (for reasons not relating to this), which I
originally figured could explain the problem.

Before this incident, the system had booted with root-on-zfs several times (at
least 25, probably more like 50+) without any kind of problem, ever.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org