Date: Sun, 4 Mar 2007 10:51:23 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
To: Yoshihiro Ota <ota@j.email.ne.jp>
Cc: Randall Stewart <rrs@cisco.com>, Scott Robbins <scottro@nyc.rr.com>,
    "Stephane E. Potvin" <sepotvin@FreeBSD.org>, brooks@FreeBSD.org,
    current@FreeBSD.org
Subject: Re: HEADS UP: UNIX domain socket locking changes merged to CVS HEAD
Message-ID: <20070304104439.M60688@fledge.watson.org>
In-Reply-To: <20070304010553.4c288aa6.ota@j.email.ne.jp>
References: <20070226204916.C56223@fledge.watson.org>
    <45E5D589.3080202@FreeBSD.org>
    <20070228234754.Q13593@fledge.watson.org>
    <45E6178F.8040302@cisco.com>
    <20070301031907.GD94643@mail.scottro.net>
    <45E67908.9090707@cisco.com>
    <20070301090253.M13593@fledge.watson.org>
    <45E69EE3.9010407@cisco.com>
    <20070302005803.GC26188@mail.scottro.net>
    <45E82030.7000402@cisco.com>
    <20070302132436.GB46154@mail.scottro.net>
    <20070302234750.7b57c23c.ota@j.email.ne.jp>
    <20070303221906.I60688@fledge.watson.org>
    <20070304010553.4c288aa6.ota@j.email.ne.jp>

On Sun, 4 Mar 2007, Yoshihiro Ota wrote:

>> Could you confirm that if you run the code precisely before the commits in
>> question (i.e., back out to uipc_usrreq.c:1.196 and unpcb.h:1.22) the
>> problem goes away completely?  If so, could you try running ktrace on
>> kinput2 and see if it's looping around any particular syscalls and getting
>> an error repeatedly?  It could be that an error is now (possibly
>> incorrectly) being returned and that kinput2 is not handling that well.
>
> I changed to uipc_usrreq.c 1.199 to 1.196 and I already had unpcb.h 1.11.
> After rebooting, the problem still remains.

Hmm.  That's odd -- you should really have needed unpcb.h:1.22 in order for
the kernel to compile with the recent uipc_usrreq.c changes -- perhaps
you're referring to the unpcb.h in /usr/include/sys rather than
/usr/src/sys/sys?

> The below is what I got from ktrace/kdump run on uipc_usrreq.c@1.199.  I
> think I started seeing this problem on last Sat. or Sun day.
>
> It seems that when I kill kinput2, canna dies together so that when I see
> like this:
>
> $ sh /usr/local/etc/rc.d/canna.sh stop
> Cannot connect with cannaserver "unix".
>
> % ktrace -f ktrace.out
>
>   1274 kinput2  RET   poll 1
>   1274 kinput2  CALL  poll(0x88450fb0,0x2,0)
>   1274 kinput2  RET   poll 1
>   1274 kinput2  CALL  poll(0x88450fb0,0x2,0)
>   1274 kinput2  RET   poll 1
>
> %grep poll ktrace.txt | wc
>   621264 2795688 22986795

Returning immediately from poll() with a third argument (timeout) of 0 is
expected.  However, it could be that the application is expecting something
that is no longer true, or that we're handling something differently than we
used to.  I take it that kinput2 doesn't behave this way in -STABLE?  Have
you tried using portupgrade (or a related tool) to rebuild everything and
make sure that a library didn't get out of sync?
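
The poll() behaviour discussed above can be illustrated with a minimal
sketch.  It is written in Python for brevity and is not taken from kinput2
or canna.  The helper names (poll_once, demo) are hypothetical, and the
"peer closed" scenario is only one assumed way in which a zero-timeout
poll() might keep reporting a ready descriptor; nothing in the trace
confirms that this is what kinput2 is actually hitting.

```python
import select
import socket
import time


def poll_once(sock, timeout_ms=0):
    """Poll one socket for readability; return the list of ready events."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    return poller.poll(timeout_ms)


def demo():
    # Connected pair of UNIX domain sockets (the default family on UNIX).
    a, b = socket.socketpair()
    try:
        start = time.monotonic()
        idle = poll_once(a)      # nothing to read: returns at once, no events
        elapsed = time.monotonic() - start

        b.close()                # assumed scenario: the peer goes away
        # The descriptor now stays "ready" until the caller reads EOF or
        # closes it, so a loop that only re-polls will spin.
        after_close = [len(poll_once(a)) for _ in range(3)]
        return idle, elapsed, after_close
    finally:
        a.close()


if __name__ == "__main__":
    print(demo())
```

A loop that calls poll() with a timeout of 0 and never consumes or closes
the ready descriptor would produce a long run of identical CALL/RET pairs
like the one in the kdump output.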

I may need to ask you to do a binary search on kernel dates to find when the
problem started occurring in order to get much further -- or it may take
someone familiar with (or becoming familiar with) the kinput and canna
internals to figure out if this is a new kernel bug or a bug in the
application.

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge
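
The date-based binary search mentioned above can be sketched as follows.
This is only an outline under stated assumptions: first_bad_date and the
is_bad callback are hypothetical names, the dates are placeholders, and in
practice is_bad would mean "check out the sources as of that date, build
and boot the kernel, and see whether kinput2 spins".  It also assumes the
problem, once introduced, is present on every later date.

```python
from datetime import date, timedelta


def first_bad_date(good, bad, is_bad):
    """Return the earliest date in (good, bad] for which is_bad() is true.

    Assumes is_bad(good) is False, is_bad(bad) is True, and that the
    result flips from False to True exactly once in between.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid    # problem already present: look earlier
        else:
            good = mid   # still working: look later
    return bad


if __name__ == "__main__":
    # Placeholder predicate standing in for a manual build-and-test cycle.
    culprit = date(2007, 2, 26)
    print(first_bad_date(date(2007, 2, 1), date(2007, 3, 4),
                         lambda d: d >= culprit))
```

Each step halves the remaining range, so a month of commits needs only
about five kernel builds.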