Date:      Sun, 19 Feb 2012 00:27:06 +0200
From:      Mikolaj Golub <trociny@freebsd.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        "Robert N. M. Watson" <rwatson@freebsd.org>, freebsd-arch@freebsd.org
Subject:   Re: unix domain sockets on nullfs(5)
Message-ID:  <86vcn3u69x.fsf@kopusha.home.net>
In-Reply-To: <20120218215003.GM3283@deviant.kiev.zoral.com.ua> (Konstantin Belousov's message of "Sat, 18 Feb 2012 23:50:03 +0200")
References:  <86sjjobzmn.fsf@kopusha.home.net> <D1B8F00C-1E0D-4916-BD4B-FBCAE28E6F22@FreeBSD.org> <86fwfnti5t.fsf@kopusha.home.net> <CAOnPXZ_y5G6uEBWmfuH7qYBh+4Pw=O91ztCPEFCOTzWdCzx+RA@mail.gmail.com> <BBDE763A-F55E-453D-A503-2489C9040EF6@freebsd.org> <20120112215106.GC31224@deviant.kiev.zoral.com.ua> <86hazntwmu.fsf@kopusha.home.net> <20120123031238.GL31224@deviant.kiev.zoral.com.ua> <86zkcfu9ac.fsf@kopusha.home.net> <20120218215003.GM3283@deviant.kiev.zoral.com.ua>

On Sat, 18 Feb 2012 23:50:03 +0200 Konstantin Belousov wrote:

 KB> On Sat, Feb 18, 2012 at 11:22:03PM +0200, Mikolaj Golub wrote:
 >> After collecting all suggestions and additional testing I have got this patch
 >> set:
 >> 
 >> http://people.freebsd.org/~trociny/unp_prepare_reclaim.1.patch
 KB> Including unpcb.h into vfs_subr.c looks too extreme. Put the prototype
 KB> into vnode.h, possibly renaming the function to vfs_unp_reclaim.

 >> http://people.freebsd.org/~trociny/unp_connect.LOCKSHARED.1.patch
 >> http://people.freebsd.org/~trociny/VOP_UNP.3.patch
 KB> I have a painting suggestion there: call the vops VOP_UNP_DETACH etc.,
 KB> otherwise it takes too much reading to understand that it is not undetach.

Thanks, will do both.

 >> 
 >> It has survived some bind/connect/forced umount stress testing, which revealed
 >> only issues that are also observed without the patches.

 KB> What are the issues ?

I triggered this assert:

#9  0x80a678cc in panic (fmt=0x80fcbba0 "sofree: so_comp populated")
    at /home/golub/freebsd/git/freebsd/sys/kern/kern_shutdown.c:633
#10 0x80ada1da in sofree (so=0x8c71b820)
    at /home/golub/freebsd/git/freebsd/sys/kern/uipc_socket.c:638
#11 0x80adbab6 in soclose (so=0x8c71b820)
    at /home/golub/freebsd/git/freebsd/sys/kern/uipc_socket.c:741
#12 0x80abe3e9 in soo_close (fp=0x89fb5cb0, td=0x8ad9db80)
    at /home/golub/freebsd/git/freebsd/sys/kern/sys_socket.c:294
#13 0x80a26c13 in _fdrop (fp=0x89fb5cb0, td=0x8ad9db80) at file.h:310
#14 0x80a29640 in closef (fp=0x89fb5cb0, td=0x8ad9db80)
    at /home/golub/freebsd/git/freebsd/sys/kern/kern_descrip.c:2246
#15 0x80a29a09 in kern_close (td=0x8ad9db80, fd=3)
    at /home/golub/freebsd/git/freebsd/sys/kern/kern_descrip.c:1232
#16 0x80a29baa in sys_close (td=0x8ad9db80, uap=0xdef62cec)
    at /home/golub/freebsd/git/freebsd/sys/kern/kern_descrip.c:1178
#17 0x80deca20 in syscall (frame=0xdef62d28) at subr_syscall.c:135
#18 0x80dd5ce1 in Xint0x80_syscall ()
    at /home/golub/freebsd/git/freebsd/sys/i386/i386/exception.s:266

(kgdb) fr 10
#10 0x80ada1da in sofree (so=0x8c71b820)
    at /home/golub/freebsd/git/freebsd/sys/kern/uipc_socket.c:638
638                     KASSERT((TAILQ_EMPTY(&so->so_incomp)), ("sofree: so_comp populated"));
(kgdb) l
633                 (so->so_qstate & SQ_INCOMP) == 0,
634                 ("sofree: so_head == NULL, but still SQ_COMP(%d) or SQ_INCOMP(%d)",
635                 so->so_qstate & SQ_COMP, so->so_qstate & SQ_INCOMP));
636             if (so->so_options & SO_ACCEPTCONN) {
637                     KASSERT((TAILQ_EMPTY(&so->so_comp)), ("sofree: so_comp populated"));
638                     KASSERT((TAILQ_EMPTY(&so->so_incomp)), ("sofree: so_comp populated"));
639             }
640             SOCK_UNLOCK(so);
641             ACCEPT_UNLOCK();
642     

(BTW, the panic message should be: "sofree: so_incomp populated".)
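To make the message mix-up concrete, here is a minimal userland sketch of the two checks with the second message corrected. It is not the kernel code: mock_socket, conn, mock_socket_init, check_accept_queues and demo are hypothetical stand-ins, and only the TAILQ macros from <sys/queue.h> and the two message strings mirror the listing above.

```c
#include <assert.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical stand-ins for the kernel structures; illustration only. */
struct conn {
	TAILQ_ENTRY(conn) link;
};
TAILQ_HEAD(connq, conn);

struct mock_socket {
	struct connq so_incomp;	/* connections not yet completed */
	struct connq so_comp;	/* completed, waiting for accept() */
};

static void
mock_socket_init(struct mock_socket *so)
{
	TAILQ_INIT(&so->so_incomp);
	TAILQ_INIT(&so->so_comp);
}

/*
 * Return the message of the first violated check, or NULL if both
 * queues are empty.  Each message names the queue it actually tests.
 */
static const char *
check_accept_queues(struct mock_socket *so)
{
	if (!TAILQ_EMPTY(&so->so_comp))
		return ("sofree: so_comp populated");
	if (!TAILQ_EMPTY(&so->so_incomp))
		return ("sofree: so_incomp populated");
	return (NULL);
}

/* Small self-check: a populated so_incomp must report so_incomp. */
static void
demo(void)
{
	struct mock_socket so;
	struct conn c;

	mock_socket_init(&so);
	assert(check_accept_queues(&so) == NULL);
	TAILQ_INSERT_TAIL(&so.so_incomp, &c, link);
	assert(strcmp(check_accept_queues(&so),
	    "sofree: so_incomp populated") == 0);
}
```

With the message as it stands in the tree, a populated so_incomp would be reported as so_comp, which is what the panic string in frame #9 shows.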

The test ran a bind/listen/accept/umount -f/close loop (with some variations)
in one thread and connect/close loops in several others.

I was able to trigger it on nullfs but not on ufs. I have not looked closely
at it yet; I have only checked that the same panic is observed on a kernel
without my modifications.
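For reference, the shape of the test loop described above can be sketched roughly as follows. This is not the actual test program: fill_addr, client_loop, server_once, run_stress, MAX_CLIENTS and the socket path are all hypothetical names, and the forced unmount step is deliberately left out (it needs root and a nullfs mount), so this only exercises the bind/listen/accept/close versus connect/close race.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define MAX_CLIENTS 16		/* hypothetical cap for this sketch */

static atomic_int stop_clients;

static void
fill_addr(struct sockaddr_un *sun, const char *path)
{
	memset(sun, 0, sizeof(*sun));
	sun->sun_family = AF_UNIX;
	strncpy(sun->sun_path, path, sizeof(sun->sun_path) - 1);
}

/* Client side: connect/close in a loop; failed connects are expected. */
static void *
client_loop(void *arg)
{
	struct sockaddr_un sun;
	int fd;

	fill_addr(&sun, (const char *)arg);
	while (!atomic_load(&stop_clients)) {
		fd = socket(AF_UNIX, SOCK_STREAM, 0);
		if (fd < 0)
			continue;
		(void)connect(fd, (struct sockaddr *)&sun, sizeof(sun));
		close(fd);
	}
	return (NULL);
}

/* One bind/listen/accept/close pass; returns 0 if a client was accepted. */
static int
server_once(const char *path)
{
	struct sockaddr_un sun;
	struct pollfd pfd;
	int lfd, cfd = -1;

	fill_addr(&sun, path);
	(void)unlink(path);
	lfd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (lfd < 0)
		return (-1);
	if (bind(lfd, (struct sockaddr *)&sun, sizeof(sun)) == 0 &&
	    listen(lfd, 8) == 0) {
		pfd.fd = lfd;
		pfd.events = POLLIN;
		if (poll(&pfd, 1, 1000) > 0)
			cfd = accept(lfd, NULL, NULL);
	}
	/* The original test did "umount -f" around here; omitted in this sketch. */
	if (cfd >= 0)
		close(cfd);
	close(lfd);	/* may close with connections still queued */
	(void)unlink(path);
	return (cfd >= 0 ? 0 : -1);
}

/* Start the clients, run the server loop, and return the accept count. */
static int
run_stress(const char *path, int nclients, int iterations)
{
	pthread_t tids[MAX_CLIENTS];
	int accepted = 0, started = 0, i;

	assert(path != NULL && iterations > 0);
	if (nclients > MAX_CLIENTS)
		nclients = MAX_CLIENTS;
	atomic_store(&stop_clients, 0);
	for (i = 0; i < nclients; i++)
		if (pthread_create(&tids[started], NULL, client_loop,
		    (void *)path) == 0)
			started++;
	for (i = 0; i < iterations; i++)
		if (server_once(path) == 0)
			accepted++;
	atomic_store(&stop_clients, 1);
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	return (accepted);
}
```

Something like run_stress("/mnt/upper/test.sock", 4, 1000), built with -pthread, would approximate the loop; the unmount has to be driven separately, and the path is only an example taken from the sockstat output.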

Another issue is a socket leak observed when running the above test: stale
sockets remain after the test has finished:

kopusha:~% sockstat|grep test
?        ?          ?     ?  stream /mnt/upper/test.sock
?        ?          ?     ?  stream /mnt/lower/test.sock
?        ?          ?     ?  stream /mnt/upper/test.sock

I am going to investigate this more some day.

-- 
Mikolaj Golub


