From owner-freebsd-fs@FreeBSD.ORG Sat Jan 25 15:54:20 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4A9C71F1 for ; Sat, 25 Jan 2014 15:54:20 +0000 (UTC) Received: from mta05.bitpro.no (mta05.bitpro.no [92.42.64.202]) by mx1.freebsd.org (Postfix) with ESMTP id 911D51CC6 for ; Sat, 25 Jan 2014 15:54:19 +0000 (UTC) Received: from mail.lockless.no (mail.lockless.no [46.29.221.38]) by mta05.bitpro.no (Postfix) with ESMTPS id B6C5017FC76 for ; Sat, 25 Jan 2014 16:54:10 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by mail.lockless.no (Postfix) with ESMTP id 998098FAB6F for ; Sat, 25 Jan 2014 16:55:01 +0100 (CET) X-Virus-Scanned: by amavisd-new-2.6.4 (20090625) (Debian) at lockless.no Received: from mail.lockless.no ([127.0.0.1]) by localhost (mail.lockless.no [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id YDWfrOilaFPY for ; Sat, 25 Jan 2014 16:54:59 +0100 (CET) Received: from mail.lockless.no (localhost [127.0.0.1]) by mail.lockless.no (Postfix) with ESMTP id 987648FAB57 for ; Sat, 25 Jan 2014 16:54:59 +0100 (CET) Subject: FW: lock order reversals w/ backtrace From: =?utf-8?Q?Hans_Petter_Selasky?= To: =?utf-8?Q?freebsd-fs=40freebsd=2Eorg?= Date: Sat, 25 Jan 2014 16:54:59 +0100 Mime-Version: 1.0 X-Priority: 3 (Normal) X-Mailer: Zarafa 7.1.4-41394 Message-Id: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.17 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Jan 2014 15:54:20 -0000 FYI - can anyone comment=3F=0D=0A=0D=0A--HPS=0D=0A=20=0D=0A=20=0D=0A-----= Original message-----=0D=0A> From:Thomas Hoffmann >=0D=0A> Sent: Saturday 25th January 2014 15:55=0D=0A= > To: freebsd-current >=0D=0A> Cc: Hans Petter Selasky >=0D=0A> Subject: Re: lo= ck order reversals w/ backtrace=0D=0A>=20=0D=0A> On Fri, Jan 24, 2014 at = 11:47 PM, Thomas Hoffmann > w= rote:=0D=0A>=20=0D=0A> > On Fri, Jan 24, 2014 at 9:52 AM, Thomas Hoffmann= > wrote:=0D=0A> >=0D=0A> >> = On Thu, Jan 23, 2014 at 11:49 AM, Hans Petter Selasky <=0D=0A> >> hans.pe= tter.selasky@bitfrost.no > wrote= :=0D=0A> >>=0D=0A> >>> Hi,=0D=0A> >>>=0D=0A> >>> Can you see if you can = snap some keywords of the backtraces, like usb_xxx usbdx_xxx cam scsi or = something like that.=0D=0A> >>>=0D=0A> >>> Else I believe there are some = sysctl options to prevent the final reboot somehow so that you can write = down the messages.=0D=0A> >>>=0D=0A> >>>=0D=0A> >>> --HPS=0D=0A> >>>=0D=0A= > >>>=0D=0A> >>> -----Original message-----=0D=0A> >>> > From:Thomas Hoff= mann >=0D=0A> >>> > Sent: Thu= rsday 23rd January 2014 17:15=0D=0A> >>>=0D=0A> >>> > To: freebsd-current= >=0D=0A= > >>> > Subject: lock order reversals w/ backtrace=0D=0A> >>> >=0D=0A> >>= > > A few days ago I started running 11.0-CURRENT at r260971 for the firs= t time.=0D=0A> >>>=0D=0A> >>>=0D=0A> >>> >=0D=0A> >>> > The last couple o= f times I shutdown my system I noticed 2 or 3 short "lock=0D=0A> >>> > or= der reversal" messages with accompanying backtraces scroll by. Do these=0D= =0A> >>> > messages represent a problem that I should report or can I ign= ore them as=0D=0A> >>>=0D=0A> >>>=0D=0A> >>> > debug in nature=3F If I sh= ould report them, how or where do these messages=0D=0A> >>> > get logged=3F= I can find no reference to them in syslog or anywhere else upon=0D=0A> >= >> > my subsequent reboot.=0D=0A> >>> >=0D=0A> >>> > I also had a couple = of these messages pop up the other day while=0D=0A> >>>=0D=0A> >>>=0D=0A>= >>> > mounting/umounting USB thumb drives. I did not think anything of t= hem at=0D=0A> >>> > the time as the mounts/umounts completed successfully= =2E=0D=0A> >>> >=0D=0A> >>> > Please advise. Thanks.=0D=0A> >>> >=0D=0A> = >>> > -Tom=0D=0A> >>>=0D=0A> >>> I managed to snap a photo of my screen = during shutdown.=0D=0A> >> Here is the full text of the first of two lock= order reversals I got last=0D=0A> >> night during system shutdown, both = of which are zfs-related to (it=0D=0A> >> appears=3F) unmounts. A few lin= es got chopped as they were out-of-frame when=0D=0A> >> I took the photo:= =0D=0A> >>=0D=0A> >> shutting down local daemons:=0D=0A> >> lock order re= versal:=0D=0A> >> 1st 0xfffff801194f67c8 zfs (zfs) @ /usr/src/sys/kern/= vfs_mount.c:1237=0D=0A> >> 2nd 0xfffff801194f6420 syncer (syncer) @=0D=0A= > >> /usr/src/sys/kern/vfs_subr.c:22[..chopped...]=0D=0A> >> KDB: stack b= acktrace:=0D=0A> >> db_trace_self_wrapper() at db_trace_delf_wrapper+0x2b= /frame=0D=0A> >> 0xfffffe01ac78[...chopped...]=0D=0A> >> kdb_backtrace() = at kdb_backtrace+0x39/frame 0xfffffe01ac784650=0D=0A> >> witness_checkord= er() at witness_checkorder+0xd23/frame 0xfffffe01ac7846e0=0D=0A> >> __loc= kmgr_args() __lockmgr_args+0x878/frame 0xfffffe01ac784810=0D=0A> >> vop_s= tdlock() at vop_stdlock+0x3c/frame 0xfffffe01ac784830=0D=0A> >> VOP_LOCK1= _APV() at VOP_LOCK1_APV+0xf5/frame 0xfffffe01ac784860=0D=0A> >> _vn_lock(= ) at _vn_lock+0xab/frame 0xfffffe01ac7848d0=0D=0A> >> vputx() at vputx+0x= 240/frame 0xfffffe01ac784930=0D=0A> >> dounmount at dounmount+0x327/frame= 0xfffffe01ac7849b0=0D=0A> >> sys_unmount() at sys_unmount+0x356/frame 0x= fffffe01ac784ac0=0D=0A> >> amd64_syscall() at amd64_syscall+0x265/frame 0= xfffffe01ac784bf0=0D=0A> >> Xfast_syscall() at Xfast_syscall+0xfb/frame 0= xfffffe01ac784bf0=0D=0A> >> --- syscall (22, FreeBSD ELF64, sys_unmount, = rip =3D 0x80191c72a,=0D=0A> >> rsp[...chopped...]=0D=0A> >>=0D=0A> >> I h= ave a zpool on an external USB HDD that I export at shutdown via=0D=0A> >= > rc.shutdown.local. Don't know if that is contributing to this or not. W= hen=0D=0A> >> I get a chance to bring down the system I will manually exp= ort it before=0D=0A> >> shutdown to see if that makes any difference.=0D=0A= > >>=0D=0A> >=0D=0A> > Was able to capture the full text of the lock orde= r reversal I've been=0D=0A> > experiencing at shutdown. Turns out to be a= ssociated with the export of one=0D=0A> > of my zpools that is hosted on = an external USB HDD. Normally I export the=0D=0A> > zpool from rc.shutdow= n.local, but tonight I exported it manually before I=0D=0A> > shutdown an= d got the following.=0D=0A> >=0D=0A> > lock order reversal:=0D=0A> > : 1s= t 0xfffff8011952b5f0 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:1237=0D=0A= > > : 2nd 0xfffff8011952b068 syncer (syncer) @=0D=0A> > /usr/src/sys/kern= /vfs_subr.c:2212=0D=0A> > KDB: stack backtrace:=0D=0A> > db_trace_self_wr= apper() at db_trace_self_wrapper+0x2b/frame=0D=0A> > 0xfffffe01add6e5a0=0D= =0A> > kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe01add6e650=0D=0A= > > witness_checkorder() at witness_checkorder+0xd23/frame 0xfffffe01add6= e6e0=0D=0A> > __lockmgr_args() at __lockmgr_args+0x878/frame 0xfffffe01ad= d6e810=0D=0A> > vop_stdlock() at vop_stdlock+0x3c/frame 0xfffffe01add6e83= 0=0D=0A> > VOP_LOCK1_APV() at VOP_LOCK1_APV+0xf5/frame 0xfffffe01add6e860= =0D=0A> > _vn_lock() at _vn_lock+0xab/frame 0xfffffe01add6e8d0=0D=0A> > v= putx() at vputx+0x240/frame 0xfffffe01add6e930=0D=0A> > dounmount() at do= unmount+0x327/frame 0xfffffe01add6e9b0=0D=0A> > sys_unmount() at sys_unmo= unt+0x356/frame 0xfffffe01add6eae0=0D=0A> > amd64_syscall() at amd64_sysc= all+0x265/frame 0xfffffe01add6ebf0=0D=0A> > Xfast_syscall() at Xfast_sysc= all+0xfb/frame 0xfffffe01add6ebf0=0D=0A> > --- syscall (22, FreeBSD ELF64= , sys_unmount), rip =3D 0x80191c72a, rsp =3D=0D=0A> > 0x7fffffffc4c8, rbp= =3D 0x7fffffffc960 ---=0D=0A> >=0D=0A> > When imported and mounted, the = zpool reports no errors and I have not=0D=0A> > experienced any anomalies= reading or writing to the zpool. Should I be=0D=0A> > concerned=3F=0D=0A= > >=0D=0A> > Hmm, seems like I'm the only one interested in this.=0D=0A>=20= =0D=0A> I think I have resolved the problem with the zpool on the externa= l USB HDD.=0D=0A> I copied off the data, destroyed the datasets and zpool= , recreated the=0D=0A> zpool and datasets and restored the data. I have e= xecuted several exports=0D=0A> on the zpool and have received no "lock or= der reversal" messages.=0D=0A>=20=0D=0A> I am still getting lock order re= versal messages for my bootpool and zroot=0D=0A> zpools at shutdown. Recr= eating bootpool would be relatively easy, but if=0D=0A> I'm going to recr= eate zroot, I might as well do a reinistall, am I am not=0D=0A> prepared = to do at this time. So I'm still in search of an alternative way=0D=0A> t= o resolve this issue. My zpools all show "no known data errors", scrub=0D= =0A> cleanly, zpool upgrade reports that all zpools support all features,= and=0D=0A> I've not experienced any other issues related to my zpools.=0D= =0A>=20=0D=0A> Based on the above lock order reversal data, the error is = in this block of=0D=0A> code (, specifically /usr/src/sys/kern/vfs_mount.= c:1237)=0D=0A>=20=0D=0A> if ((coveredvp =3D mp->mnt_vnodecovered) !=3D N= ULL) {=0D=0A> mnt_gen_r =3D mp->mnt_gen;=0D=0A> = VI_LOCK(coveredvp);=0D=0A> vholdl(coveredvp);=0D=0A= > vn_lock(coveredvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_RET= RY);=0D=0A> /* LINE 1237 */=0D=0A> vdrop(coveredvp);=0D= =0A> /*=0D=0A> * Check for mp being unmo= unted while waiting for the=0D=0A> * covered vnode lock.= =0D=0A> */=0D=0A> if (coveredvp->v_mount= edhere !=3D mp ||=0D=0A> coveredvp->v_mountedhere->mn= t_gen !=3D mnt_gen_r) {=0D=0A> VOP_UNLOCK(covered= vp, 0);=0D=0A> return (EBUSY);=0D=0A> = }=0D=0A> }=0D=0A>=20=0D=0A>=20=0D=0A> -Tom=0D=0A> __________________= _____________________________=0D=0A> freebsd-current@freebsd.org mailing list=0D=0A> http://lists.freebsd.or= g/mailman/listinfo/freebsd-current =20=0D=0A> To unsubscribe, send any mail to "freebsd= -current-unsubscribe@freebsd.org "=0D=0A>=20=0D=0A=0D=0A From owner-freebsd-fs@FreeBSD.ORG Sat Jan 25 16:28:19 2014 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 68EDA7F4; Sat, 25 Jan 2014 16:28:19 +0000 (UTC) Received: from tensor.andric.com (unknown [IPv6:2001:7b8:3a7:1:2d0:b7ff:fea0:8c26]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 131141E7F; Sat, 25 Jan 2014 16:28:19 +0000 (UTC) Received: from [IPv6:2001:7b8:3a7::719e:bbe8:ba8a:34e0] (unknown [IPv6:2001:7b8:3a7:0:719e:bbe8:ba8a:34e0]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by tensor.andric.com (Postfix) with ESMTPSA id E54C85C44; Sat, 25 Jan 2014 17:28:12 +0100 (CET) Subject: Re: BUG: possible NULL pointer dereference in nfs server Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Content-Type: multipart/signed; boundary="Apple-Mail=_70801F41-810B-4C7F-9BA3-A9451B541D04"; protocol="application/pgp-signature"; micalg=pgp-sha1 X-Pgp-Agent: GPGMail 2.1 (6062eb4) From: Dimitry Andric In-Reply-To: <1577222508.16050114.1390610315654.JavaMail.root@uoguelph.ca> Date: Sat, 25 Jan 2014 17:27:58 +0100 Message-Id: <97D8A7EA-12F3-4EFC-A933-BD8528B6306A@FreeBSD.org> References: <1577222508.16050114.1390610315654.JavaMail.root@uoguelph.ca> To: Rick Macklem X-Mailer: Apple Mail (2.1827) Cc: Roman Divacky , fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Jan 2014 16:28:19 -0000 --Apple-Mail=_70801F41-810B-4C7F-9BA3-A9451B541D04 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii On 25 Jan 2014, at 01:38, Rick Macklem wrote: > Roman Divacky wrote: >> Hi, >>=20 >> In nfs_nfsdstate.c:nfsrv_lockctrl() we call >>=20 >> getlckret =3D nfsrv_getlockfh(vp, new_stp->ls_flags, NULL, &nfh, p); >>=20 >> then in nfsrv_getlockfh() we, based on the value of flags, might >> dereference the NULL pointer: >>=20 >>=20 >> nfsrv_getlockfh(vnode_t vp, u_short flags, >> struct nfslockfile **new_lfpp, fhandle_t *nfhp, NFSPROC_T *p) >>=20 >>=20 >> if (flags & NFSLCK_OPEN) { >> new_lfp =3D *new_lfpp; >> fhp =3D &new_lfp->lf_fh; >>=20 >>=20 > I took a look and don't see a problem. NFSLCK_OPEN is only set for = Opens > { nfsrvd_open() } and it never calls nfsrv_lockctrl(). = nfsrv_lockctrl() is > always called with other flags in new_stp->ls_flags set, but never = NFSLCK_OPEN, > because nfsrv_lockctrl() is handling byte range lock cases after a = file has been > opened { which means the "else" case for this "if" always applies. The point is that the compiler cannot always know this, when it inlines nfsrv_getlockfh(). In the first invocation of nfsrv_getlockfh(), there is no problem: 712 APPLESTATIC void 713 nfsrv_dumplocks(vnode_t vp, struct nfsd_dumplocks *ldumpp, int = maxcnt, 714 NFSPROC_T *p) 715 { ... 726 ret =3D nfsrv_getlockfh(vp, 0, NULL, &nfh, p); Here both the 'flags' and 'new_lfpp' arguments are zero, so the part in nfsrv_getlockfh() where it writes to a NULL pointer can never be executed, therefore it can be optimized away safely. In the next invocation, the state of the 'flags' argument is not known to the compiler, however: 1369 tryagain: 1370 if (new_stp->ls_flags & NFSLCK_LOCK) 1371 MALLOC(other_lop, struct nfslock *, sizeof = (struct nfslock), 1372 M_NFSDLOCK, M_WAITOK); ... 1387 getlckret =3D nfsrv_getlockfh(vp, new_stp->ls_flags, = NULL, &nfh, p); If it inlines this, the result looks approximately like: 1 { 2 fhandle_t *fhp =3D NULL; 3 struct nfslockfile *new_lfp; 4 int error; 5 6 if (new_stp->ls_flags & NFSLCK_OPEN) { 7 new_lfp =3D *NULL; 8 fhp =3D &new_lfp->lf_fh; 9 } else if (&nfh) { 10 fhp =3D &nfh; 11 } else { 12 panic("nfsrv_getlockfh"); 13 } 14 error =3D nfsvno_getfh(vp, fhp, p); 15 NFSEXITCODE(error); 16 getlckret =3D error; 17 } The code in line 7 is the problematic part. Since this is undefined, the compiler inserts a trap instruction here. I think the problem Roman encountered is that on sparc64, there is no equivalent to x86's ud2 instruction, so it inserts a call to abort() instead, and that function is not available in kernel-land. ... > Sorry, I'm not a compiler guy, so I don't know why a compiler would > generate a trap instruction, but since new_lfpp is never NULL when > this is executed, I don't see a problem. >=20 > If others feel that this needs to be re-coded, please let me know what > you think the code should look like? (A test for non-NULL with a = panic() > before it is used?) >=20 > Is a trap instruction that never gets executed a problem? It's better to avoid undefined behavior in any case. Just add a NULL check, that should be sufficient. -Dimitry --Apple-Mail=_70801F41-810B-4C7F-9BA3-A9451B541D04 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.22 (Darwin) iEYEARECAAYFAlLj5hkACgkQsF6jCi4glqO2hACeKUFQagxvdOrYIgc3YqdmERwq VbIAnR5Td7N+Dqjj4BAiAN8nVTmqkW2A =9r0K -----END PGP SIGNATURE----- --Apple-Mail=_70801F41-810B-4C7F-9BA3-A9451B541D04--