From: John Baldwin
To: freebsd-stable@freebsd.org
Cc: Andriy Gapon , Julian Stecklina
Subject: Re: Reproducable Infiniband panic
Date: Thu, 6 Jun 2013 14:57:52 -0400
Message-Id: <201306061457.52278.jhb@freebsd.org>
References: <51B07705.207@os.inf.tu-dresden.de> <51B0949B.1050606@FreeBSD.org>
In-Reply-To: <51B0949B.1050606@FreeBSD.org>

On Thursday, June 06, 2013 9:54:35 am Andriy Gapon wrote:
> on 06/06/2013 14:48 Julian Stecklina said the following:
> > #7 0xffffffff807a3d83 in linux_file_dtor (cdp=0xfffffe000aeabb80) at
> > /usr/home/julian/src/freebsd/sys/ofed/include/linux/linux_compat.c:214
> > filp = (struct linux_file *) 0xfffffe000aeabb80
> > #8 0xffffffff80513c39 in devfs_destroy_cdevpriv (p=0xfffffe0005772980)
> > at /usr/home/julian/src/freebsd/sys/fs/devfs/devfs_vnops.c:159
> > No locals.
> > #9 0xffffffff80513e47 in devfs_close_f (fp=0xfffffe000b0e9aa0,
> > td=)
> > at /usr/home/julian/src/freebsd/sys/fs/devfs/devfs_vnops.c:619
> > error = 0
> > fpop = (struct file *) 0x0
>
> The problem seems to be in incorrect interaction between devfs_close_f and
> linux_file_dtor. The latter expects curthread->td_fpop to have a valid reasonable
> value. But the former sets curthread->td_fpop to fp only around vnops.fo_close()
> call and then restores it back to some (what?) previous value before calling
> devfs_fpdrop->devfs_destroy_cdevpriv. In this case the previous value is NULL.

It is normally NULL in this case. Why does linux_file_dtor even look at
td_fpop? Ah. I think it should not do that and make the data it uses in
the dtor more self-contained:

Index: sys/ofed/include/linux/linux_compat.c
===================================================================
--- linux_compat.c	(revision 251465)
+++ linux_compat.c	(working copy)
@@ -212,7 +212,7 @@ linux_file_dtor(void *cdp)
 	struct linux_file *filp;
 
 	filp = cdp;
-	filp->f_op->release(curthread->td_fpop->f_vnode, filp);
+	filp->f_op->release(filp->f_vnode, filp);
 	kfree(filp);
 }
 
@@ -232,6 +232,7 @@ linux_dev_open(struct cdev *dev, int oflags, int d
 	filp->f_dentry = &filp->f_dentry_store;
 	filp->f_op = ldev->ops;
 	filp->f_flags = file->f_flag;
+	filp->f_vnode = file->f_vnode;
 	if (filp->f_op->open) {
 		error = -filp->f_op->open(file->f_vnode, filp);
 		if (error) {

-- 
John Baldwin
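
For illustration only, the idea behind the patch can be modeled in userland C: cache the vnode in the per-open structure at open time, so the destructor does not depend on per-thread state that has already been restored to NULL by the time it runs. This is a rough sketch, not kernel code. The struct layouts are simplified stand-ins, the global td_fpop merely imitates curthread->td_fpop, and model_open, model_dtor, model_close, demo_release and released_vnode are hypothetical names that do not exist in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified userland stand-ins; NOT the real kernel definitions. */
struct vnode { int id; };
struct file { struct vnode *f_vnode; };

struct linux_file;
struct file_operations {
	int (*release)(struct vnode *, struct linux_file *);
};

struct linux_file {
	const struct file_operations *f_op;
	struct vnode *f_vnode;	/* cached at open time, as the patch does */
};

/* Imitates curthread->td_fpop (assumption: one thread, one global). */
struct file *td_fpop = NULL;

/* Records which vnode the release callback was handed. */
struct vnode *released_vnode = NULL;

int
demo_release(struct vnode *vp, struct linux_file *filp)
{
	(void)filp;
	released_vnode = vp;
	return (0);
}

const struct file_operations demo_ops = { .release = demo_release };

/* Open path: remember the vnode inside the per-open data. */
struct linux_file *
model_open(struct file *fp)
{
	struct linux_file *filp;

	filp = calloc(1, sizeof(*filp));
	if (filp == NULL)
		return (NULL);
	filp->f_op = &demo_ops;
	filp->f_vnode = fp->f_vnode;
	return (filp);
}

/* Destructor: uses only data stored in filp, never td_fpop. */
void
model_dtor(void *cdp)
{
	struct linux_file *filp = cdp;

	assert(filp->f_op != NULL && filp->f_op->release != NULL);
	filp->f_op->release(filp->f_vnode, filp);
	free(filp);
}

/*
 * Close path: td_fpop is set only around the close call and restored
 * (to NULL here) before the destructor runs, mirroring the ordering
 * described in the quoted analysis.
 */
struct vnode *
model_close(struct file *fp, struct linux_file *filp)
{
	struct file *fpop;

	fpop = td_fpop;
	td_fpop = fp;
	/* ... the fo_close call would happen here ... */
	td_fpop = fpop;
	model_dtor(filp);	/* td_fpop is NULL again at this point */
	return (released_vnode);
}
```

In this model the release callback still receives the right vnode even though td_fpop is NULL when the destructor runs. Dereferencing td_fpop at that point instead would be a NULL pointer dereference, which matches the backtrace above.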