From:      Daniel Lang <dl@leo.org>
To:        grog@FreeBSD.org
Cc:        Andy Newman <andy@silverbrook.com.au>, Roman Shterenzon <roman@jamus.xpert.com>, Daniel Lang <dl@leo.org>, freebsd-gnats-submit@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: kern/21148: multiple crashes while using vinum
Message-ID:  <20010103145232.B10169@atrbg11.informatik.tu-muenchen.de>
In-Reply-To: <200101012239.f01MdiH40906@freefall.freebsd.org>; from grog@FreeBSD.org on Mon, Jan 01, 2001 at 11:41:19PM +0000
References:  <200101012239.f01MdiH40906@freefall.freebsd.org>

Dear Greg, Andy, Roman,

grog@FreeBSD.org wrote on Mon, Jan 01, 2001 at 11:41:19PM +0000:
> Synopsis: multiple crashes while using vinum
[..]
> State-Changed-Why: 
> No feedback from submitter.
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=21148

Well, I have sent you stack traces, with (and, alas, also without)
debugging symbols. I am perfectly aware of your instruction page
about debugging vinum, and I am not an ignorant moron who complains
without reading. Unfortunately, you don't seem to trust me
or other people in this matter.

If you look at my stack traces again, you will notice that no
stack frame is part of the vinum module, so your .gdb debugging
scripts cannot apply.

The reason is that _some code_ writes into unallocated memory,
in my case overwriting a data structure of an ata request
with a few zero bytes, which causes the panic. The stack trace
allows me to trace the problem back to this point, but no
further. I later experienced a similar problem on a
SCSI-only system.

The reason why I filed this PR under 'vinum' is that it only
occurred on boxes using vinum, and was perfectly reproducible
via simple operations like a 'find /vinum/file/system -print'
on a larger and moderately filled vinum filesystem.
Perfectly reproducible means: each night, periodic daily
caused the panic (traceable to the find call in /etc/security,
which looks for files with setuid bits).
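
For anyone trying to trigger the same load by hand, the kind of
scan described above can be approximated with a short shell
function. This is only a sketch: it is not the actual code from
/etc/security, the function name list_setid_files is made up for
illustration, and /vinum/file/system is the placeholder path
used above.

```shell
# Hypothetical helper, not taken from /etc/security: walk one
# filesystem and print regular files that have the setuid or
# setgid bit set, similar in spirit to the nightly security scan.
list_setid_files() {
    # -xdev keeps find on the filesystem it started on;
    # -perm -4000 matches setuid, -perm -2000 matches setgid.
    find "$1" -xdev -type f \( -perm -4000 -o -perm -2000 \) -print
}
```

Calling it as 'list_setid_files /vinum/file/system' walks the
whole tree, which is the part that matters here; the list of
files it prints is incidental.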

As far as I know, the only way to trace this writing into
unallocated or otherwise-allocated memory (or buffer overrun)
would be to set a watchpoint on the overwritten data structure
within the kernel debugger. My stack traces showed that this
memory region stays the same on the same machine with the
same kernel (although I can't tell how reliable this is).
My experience with kernel code and kernel debugging with
ddb is very limited. So is my time (I know this applies
to everyone). Therefore I stopped spending time on setting up
remote gdb sessions and sending you stack traces in an attempt
to be helpful, since you obviously didn't seem to be interested.

I further decided not to use vinum any more. We spent some
cash on a few hardware RAIDs, and the boxes have run smoothly
ever since.

I am just writing this to state:
 a) I did respond to your requests, trying to be as helpful as
    I could. You could blame me for not knowing, or not being
    willing to learn, how to set up a ddb/gdb session using
    watchpoints and wait for the next crash in an environment
    that should be productive (and now is).
 b) I still believe that there is a problem somewhere in the
    vinum code (probably within the raid5 routines, since a
    mirror setup worked fine).

And in fact, I wouldn't have bothered if there weren't
other people, like Roman Shterenzon and Andy Newman,
who seem to have the same problems.

Best regards,
 Daniel Lang

P.S.: I don't use vinum anymore, nor can I take my boxes
      out of production. The debugging kernels and crash-dumps
      are no longer present, sorry.
-- 
IRCnet: Mr-Spock     - Der Schatten von Hasenfuss ist ziemlich dunkel -  
*Daniel Lang * dl@leo.org * +49 89 289 25735 * http://www.leo.org/~dl/*
