From owner-freebsd-bugs Tue May 11 15:30:57 1999
Delivered-To: freebsd-bugs@freebsd.org
Received: from luke.pmr.com (luke.pmr.com [207.170.114.132])
	by hub.freebsd.org (Postfix) with ESMTP id B74BF15A38;
	Tue, 11 May 1999 15:30:42 -0700 (PDT)
	(envelope-from bob@luke.pmr.com)
Received: (from bob@localhost)
	by luke.pmr.com (8.9.3/8.9.2) id RAA34133;
	Tue, 11 May 1999 17:30:19 -0500 (CDT)
	(envelope-from bob)
Date: Tue, 11 May 1999 17:30:19 -0500
From: Bob Willcox
To: Pierre Beyssac
Cc: Bob Willcox, freebsd-bugs@freebsd.org, FreeBSD-gnats-submit@freebsd.org
Subject: Re: kern/10872: Panic in sorecieve()
Message-ID: <19990511173019.A33995@luke.pmr.com>
Reply-To: Bob Willcox
References: <19990511185956.A12679@enst.fr> <19990511124117.A28606@luke.pmr.com> <19990511195311.R427@enst.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.95.1i
In-Reply-To: <19990511195311.R427@enst.fr>; from Pierre Beyssac on Tue, May 11, 1999 at 07:53:11PM +0200
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Well, I can easily recreate the panic with -current as of this morning.
I tried the "maxusers 128" change and that did not help.

I have attached a slightly modified test shell script that I have been
using. I run this shell script on three other systems simultaneously,
all writing to the same SCSI disk on the test system (this sort of
simulates amanda activity with multiple systems all dumping to the
holding disk). As I mentioned in an earlier note, these systems are all
connected together via a 100 Mbps full-duplex switching hub. Two of them
are running 3.1-stable and the other is running 2.2.8-release.

I run the tests simultaneously on the three systems as follows:

On obiwan:

    ./panic_test 5 10000 lando /stuff/tmp/obiwan

On deathstar:

    ./panic_test 5 10000 lando /stuff/tmp/deathstar

On luke:

    ./panic_test 5 10000 lando /stuff/tmp/luke

(I've got kind of a Star Wars theme going here)

Usually within about 5 minutes lando panics.
Note that I have built lando's kernel with the options INVARIANTS and
INVARIANT_SUPPORT. If you don't, you'll still get a panic (sbdrop), but
it will occur later on, during the close of the socket, instead of the
"receive 1" panic due to the KASSERT() that we've been talking about.

One more thing... I never got low on mbufs prior to the panic.

Thanks,
Bob

On Tue, May 11, 1999 at 07:53:11PM +0200, Pierre Beyssac wrote:
> On Tue, May 11, 1999 at 12:41:17PM -0500, Bob Willcox wrote:
> > fix). The problem as I have seen it is that the mbuf chain pointer (m)
> > is NULL and so_rcv.sb_cc is not zero. It's as though somewhere either
> > the mbuf chain pointer gets zapped with NULL or something fails to
>
> This can happen when the system is out of mbufs. Sadly there are
> many places in the kernel where the condition is not trapped at
> all.
>
> How many mbufs does netstat -m report on your system? Maybe I
> couldn't reproduce it because my kernel is configured with maxusers
> 128, which yields more mbufs. You can try that as a temporary fix.
>
> > properly update so_rcv.sb_cc as mbufs are processed.
> >
> > I believe one can expand the KASSERT macro and rewrite the line:
> >     if (m == 0 && so->so_rcv.sb_cc != 0)
>
> Oops, you're right. I stupidly looked at so_snd.sb_cc in the debug
> output, which is 0.
>
> I prefer that, it'll probably be easier to fix.
> --
> Pierre Beyssac        pb@enst.fr

--
Bob Willcox             The man who follows the crowd will usually get no
bob@luke.pmr.com        further than the crowd. The man who walks alone is
Austin, TX              likely to find himself in places no one has ever been.
                            -- Alan Ashley-Pitt
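
For anyone following the thread, here is a rough userland sketch of the condition discussed above: the receive buffer claims a non-zero byte count (sb_cc) while the mbuf chain pointer is NULL. This is not the kernel code. The struct and function names (mbuf_model, sockbuf_model, sb_chain_bytes, sb_receive_inconsistent, sb_assert_receive) are hypothetical stand-ins invented for illustration, and sb_mb is an assumed name for the chain head. Only the test itself mirrors the quoted line `if (m == 0 && so->so_rcv.sb_cc != 0)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a kernel mbuf (illustration only). */
struct mbuf_model {
    struct mbuf_model *m_next;  /* next buffer in the chain */
    size_t             m_len;   /* bytes held in this buffer */
};

/* Hypothetical, simplified stand-in for a socket buffer. */
struct sockbuf_model {
    struct mbuf_model *sb_mb;   /* head of the mbuf chain (assumed name) */
    size_t             sb_cc;   /* byte count the buffer claims to hold */
};

/* Sum the bytes actually present in the chain. */
static size_t sb_chain_bytes(const struct sockbuf_model *sb)
{
    size_t total = 0;
    const struct mbuf_model *m;

    for (m = sb->sb_mb; m != NULL; m = m->m_next)
        total += m->m_len;
    return total;
}

/*
 * The condition from the thread: the chain pointer is NULL but the
 * byte count is not zero. Returns 1 when the buffer is in that state.
 */
static int sb_receive_inconsistent(const struct sockbuf_model *sb)
{
    return sb->sb_mb == NULL && sb->sb_cc != 0;
}

/*
 * Userland analogue of an assertion-style check: abort on the bad state
 * instead of continuing with a corrupted buffer.
 */
static void sb_assert_receive(const struct sockbuf_model *sb)
{
    assert(!sb_receive_inconsistent(sb));
}
```

Calling sb_assert_receive() on a model buffer with sb_mb == NULL and sb_cc == 100 aborts immediately, which is the analogue of tripping the assertion early rather than failing later during socket close. sb_chain_bytes() is only there to show the stronger invariant one would expect to hold, namely that sb_cc equals the sum of the chain's lengths.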