From owner-freebsd-stable Wed Jan 3 5:52:58 2001
From owner-freebsd-stable@FreeBSD.ORG Wed Jan 3 05:52:53 2001
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from tuminfo2.informatik.tu-muenchen.de (tuminfo2.informatik.tu-muenchen.de [131.159.0.81])
	by hub.freebsd.org (Postfix) with ESMTP id 51FDB37B400;
	Wed, 3 Jan 2001 05:52:53 -0800 (PST)
Received: from atrbg11.informatik.tu-muenchen.de ([131.159.9.196] HELO atrbg11.informatik.tu-muenchen.de ident: postfix [port 1553])
	by tuminfo2.informatik.tu-muenchen.de with SMTP id <111581-223>;
	Wed, 3 Jan 2001 14:52:44 +0000
Received: by atrbg11.informatik.tu-muenchen.de (Postfix, from userid 20455) id 179F513631;
	Wed, 3 Jan 2001 14:52:33 +0100 (CET)
From: Daniel Lang
To: grog@FreeBSD.org
Cc: Andy Newman , Roman Shterenzon , Daniel Lang , freebsd-gnats-submit@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: kern/21148: multiple crashes while using vinum
Message-ID: <20010103145232.B10169@atrbg11.informatik.tu-muenchen.de>
References: <200101012239.f01MdiH40906@freefall.freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <200101012239.f01MdiH40906@freefall.freebsd.org>; from grog@FreeBSD.org on Mon, Jan 01, 2001 at 11:41:19PM +0000
X-Geek: GCS d-- s: a- C++ UB++++$ P+++$ L- E W+++(--) N+ o K w--- O? M- V@ PS+(++) PE--(+) Y+ PGP+ t++ 5@ X R+(-) tv+ b+ DI++ D++ G++ e+++ h---(-) r++>+++ y
Sender: langd@informatik.tu-muenchen.de
Date: Wed, 3 Jan 2001 14:52:35 +0000
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Dear Greg, Andy, Roman,

grog@FreeBSD.org wrote on Mon, Jan 01, 2001 at 11:41:19PM +0000:
> Synopsis: multiple crashes while using vinum
[..]
> State-Changed-Why:
> No feedback from submitter.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=21148

Well, I've sent you stack traces, with (and alas also without) debugging
symbols. I am perfectly aware of your instruction page about debugging
vinum, and I am not an ignorant moron who complains without reading.
Unfortunately you don't seem to trust me or other people in this matter.

If you look at my stack traces again, you will notice that no stack frame
is part of the vinum module, so your .gdb debugging scripts cannot apply.
The reason is that _some code_ writes into unallocated memory, in my case
overwriting a data structure of an ata request with a few zero bytes,
which causes the panic. The stack trace allows me to trace the problem
back to this point, but no further. I later experienced a similar problem
on a SCSI-only system.

The reason I filed this PR under 'vinum' is that it only occurred on boxes
using vinum, and was perfectly reproducible via simple operations like a
'find /vinum/file/system -print' on a larger and moderately filled vinum
filesystem. Perfectly reproducible means: each night, periodic daily
caused the panic (traceable to the find call in /etc/security, which finds
files with setuid bits).

As far as I know, the only way to trace this writing into unallocated (or
otherwise allocated) memory, or buffer overrun, would be to set a
watchpoint on the overwritten data structure within the kernel debugger.
My stack traces showed that this memory region stays the same on the same
machine with the same kernel (although I can't tell how reliable this is).

My experience with kernel code and kernel debugging with ddb is very
limited. So is my time (I know this applies to everyone). Therefore I
stopped spending time on setting up remote gdb sessions and sending you
stack traces in an attempt to be helpful, since you obviously didn't seem
to be interested.

I further decided not to use vinum any more. We spent some cash on a few
hardware RAIDs, and the boxes have run smoothly ever since.
I am just writing this to state:

a) I did respond to your requests, trying to be as helpful as I could.
   You could blame me for not knowing, or not being willing to learn, how
   to set up a ddb/gdb session using watchpoints and wait for the next
   crash in an environment that should be in production (and now is).

b) I still believe that there is a problem somewhere in the vinum code
   (probably within the raid5 routines, since a mirror setup worked fine).

And in fact, I wouldn't have bothered if there weren't other people, like
Roman Shterenzon and Andy Newman, who seem to have the same problems.

Best regards,
 Daniel Lang

P.S.: I don't use vinum anymore, nor can I take my boxes out of
      production. The debugging kernels and crash dumps are no longer
      present, sorry.
-- 
IRCnet: Mr-Spock - Der Schatten von Hasenfuss ist ziemlich dunkel -
*Daniel Lang * dl@leo.org * +49 89 289 25735 * http://www.leo.org/~dl/*