From owner-freebsd-stable Thu Jan 4 1:59: 9 2001
From owner-freebsd-stable@FreeBSD.ORG Thu Jan 4 01:59:03 2001
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from tuminfo2.informatik.tu-muenchen.de (tuminfo2.informatik.tu-muenchen.de [131.159.0.81])
    by hub.freebsd.org (Postfix) with ESMTP id 3A86A37B400
    for ; Thu, 4 Jan 2001 01:59:03 -0800 (PST)
Received: from atrbg11.informatik.tu-muenchen.de ([131.159.9.196] HELO atrbg11.informatik.tu-muenchen.de ident: postfix [port 1589])
    by tuminfo2.informatik.tu-muenchen.de with SMTP id <113555-224>; Thu, 4 Jan 2001 10:58:56 +0000
Received: by atrbg11.informatik.tu-muenchen.de (Postfix, from userid 20455)
    id D3C7C13631; Thu, 4 Jan 2001 10:58:45 +0100 (CET)
From: Daniel Lang
To: Greg Lehey
Cc: freebsd-stable@freebsd.org
Subject: Re: kern/21148: multiple crashes while using vinum
Message-ID: <20010104105845.A14755@atrbg11.informatik.tu-muenchen.de>
References: <200101012239.f01MdiH40906@freefall.freebsd.org> <20010103145232.B10169@atrbg11.informatik.tu-muenchen.de> <20010104105428.D4336@wantadilla.lemis.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010104105428.D4336@wantadilla.lemis.com>; from grog@lemis.com on Thu, Jan 04, 2001 at 01:25:57AM +0000
X-Geek: GCS d-- s: a- C++ UB++++$ P+++$ L- E W+++(--) N+ o K w--- O? M- V@ PS+(++) PE--(+) Y+ PGP+ t++ 5@ X R+(-) tv+ b+ DI++ D++ G++ e+++ h---(-) r++>+++ y
Sender: langd@informatik.tu-muenchen.de
Date: Thu, 4 Jan 2001 10:58:46 +0000
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Dear Greg,

Greg Lehey wrote on Thu, Jan 04, 2001 at 01:25:57AM +0000:
[..]
> As my closing message says, the reason I closed the PR was:
>
> >> No feedback from submitter.
>
> I sent you a message on 10 September 2000 asking for additional
> information. I received none. There's no reason to get all upset

I've sent _two_ direct replies.
If you have not received them, then maybe some MX had a problem?
If you decided they still did not contain any of the information
you need, or were malformed/mutilated, etc., a short hint would
have been appreciated.

The first was:
[..]
Date: Sun, 10 Sep 2000 16:18:11 +0200
Message-ID: <20000910161811.A56954@atrbg11.informatik.tu-muenchen.de>
[..]

Still in the archives at:
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=4693+0+archive/2000/freebsd-bugs/20000917.freebsd-bugs

It includes three stack traces then with full debugging symbols.

The next day, I sent a followup:
Date: Mon, 11 Sep 2000 13:54:21 +0200
Message-ID: <20000911135421.C58840@atrbg11.informatik.tu-muenchen.de>

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=77649+0+archive/2000/freebsd-bugs/20000917.freebsd-bugs

They have been sent to you personally and to freebsd-bugs.
The second one as a pr-followup too. Alas, I omitted
freebsd-gnats-submit on the first reply.

> now, or make claims about my intentions. This was just a dead PR, and
> you've made it clear, both before and now, that you have no intention
> of following up on it. This is not a question of "ignorant morons" or
> "trust".

Sorry, I'm not upset, even if it sounded that way, and I would
not dare to speculate about your intentions. My apologies.

My memory and my mail folders tell me that I did try to be helpful
and did indeed follow your request from September 10 to provide
you with additional information. I have not heard from you since
then. If the reason is that they still did not meet your
requirements, I'm sorry, but I would have needed some more hints.
IMHO the lacking information was a valid backtrace, which I supplied.

So my intention then was indeed to follow up on it. After I
did not hear anything from you, I even emailed Søren Schmidt,
the ATA guy, because I suspected that the bug was in the ATA
driver (turned out to be unlikely because scsi-only systems showed
similar problems).
Even now, I'm still interested in helping to fix the problem,
but I may not be able to help with crash-dumps at the moment.

[..]
> Yes, this looks very much like the other issues. But you must
> understand that there's nothing I can do without further information.

Agreed.

[..]
> The trouble with that is that this only happens when the system is
> very active, and there are thousands of potential buffer headers which
> could be trashed. I do have a trace facility within Vinum, but even
> with that it's difficult to figure out what's going on.

No doubt.

[..]
> If you mean that the same part of the buffer header gets smashed every
> time, yes, this is reliably reproducible (well, in other words, when
> it happens (at random), it happens in the same place every time). It
> may mean that Vinum is doing it, but as far as I can tell it's always
> 6 words being zeroed out, and I don't do that anywhere in Vinum. The
> other possibility, which I consider most likely, is that the data
> structures accidentally get freed and used by some other driver (or,
> possibly, that some other driver freed them first and then continued
> using them). This would explain the observed correlation with the fxp
> driver.

This is indeed interesting, and maybe a reason why dmesg is
not utterly useless. ;) My boxes have a fxp NIC, as well:

[..]
fxp0: port 0xff80-0xff9f mem 0xfe900000-0xfe9fffff,0xfe2ff000-0xfe2fffff irq 11 at device 12.0 on pci0
[..]

[..]
> Well, I sent you a message on 10 September 2000, asking for additional
> information. You didn't send it to me.

See above. I at least tried. :-)

[..]
> Correct. I have no doubt about it. But some bugs are difficult to
> find, and I need help.

Ok. Here we are. Unfortunately we should have had this discussion
already in September, when the issue was more current to many of
us. :-/

However, I've got a twin box, which is not in production at the
moment, but currently runs Slowlaris X86.
I'm going to put a current FreeBSD on it, and if I find some time
and enough disks, I will set up a raid5 again. Maybe we can still
find the nasty bugger. Unfortunately I cannot tell when I will find
time to do this; it may still take a month or two.

Best regards,
 Daniel
--
IRCnet: Mr-Spock - ceterum censeo Microsoftinem esse delendam -
*Daniel Lang * dl@leo.org * +49 89 289 25735 * http://www.leo.org/~dl/*


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message