Date:      Fri, 10 Sep 2010 21:38:24 +0300
From:      Alexander Motin <mav@FreeBSD.org>
To:        a.smith@ukgrid.net
Cc:        freebsd-fs@freebsd.org, Andriy Gapon <avg@icyb.net.ua>
Subject:   Re: ZFS related kernel panic
Message-ID:  <4C8A7B20.7090408@FreeBSD.org>
In-Reply-To: <20100910184921.16956kbaskhrsmg4@webmail2.ukgrid.net>
References:  <20100909140000.5744370gkyqv4eo0@webmail2.ukgrid.net> <20100909182318.11133lqu4q4u1mw4@webmail2.ukgrid.net> <4C89D6A8.1080107@icyb.net.ua> <20100910143900.20382xl5bl6oo9as@webmail2.ukgrid.net> <20100910141127.GA13056@icarus.home.lan> <20100910155510.11831w104qjpyc4g@webmail2.ukgrid.net> <20100910152544.GA14636@icarus.home.lan> <20100910173912.205969tzhjiovf8c@webmail2.ukgrid.net> <4C8A6B26.8050305@icyb.net.ua> <20100910184921.16956kbaskhrsmg4@webmail2.ukgrid.net>

Hi.

a.smith@ukgrid.net wrote:
> Quoting Andriy Gapon <avg@icyb.net.ua>:
>> That's useful information I think.
>> For completeness you might also run kgdb and then do:
>> (kgdb) info line *siis_end_transaction+0x45
>> and so on for all stack trace lines involving siis.
> 
> Ok, I got this running against the data in /var/crash, is that ok?
> 
> (kgdb) info line *siis_end_transaction+0x45
> Line 1188 of "/usr/src/sys/modules/siis/../../dev/siis/siis.c"
>    starts at address 0xffffffff80ea5a05 <siis_end_transaction+69>
>    and ends at 0xffffffff80ea5a13 <siis_end_transaction+83>.
> (kgdb) info line *siis_ch_intr_locked+0x3e
> Line 809 of "/usr/src/sys/modules/siis/../../dev/siis/siis.c"
>    starts at address 0xffffffff80ea780e <siis_ch_intr_locked+62>
>    and ends at 0xffffffff80ea7811 <siis_ch_intr_locked+65>.
> (kgdb) info line *siis_intr+0x61
> Line 298 of "/usr/src/sys/modules/siis/../../dev/siis/siis.c"
>    starts at address 0xffffffff80ea7a61 <siis_intr+97>
>    and ends at 0xffffffff80ea7a72 <siis_intr+114>.

It looks like some request was completed twice during timeout handling
(which is quite a complicated process when a port multiplier is used). So
the original problem, whatever caused the timeout, is probably in hardware
(try checking/replacing cables, the multiplier, ...), but the fact that the
driver was unable to handle it is probably a siis(4) driver bug.
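
As an illustration of what "completed twice" means here, the following is a
minimal sketch in plain C. It is not siis(4) code: the slot structure, the
state names and both functions are hypothetical, and it assumes that the
timeout path and the interrupt path can both end up calling the same
completion routine for one request.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-request slot states; not taken from siis(4). */
enum slot_state { SLOT_FREE, SLOT_ACTIVE, SLOT_DONE };

struct slot {
	enum slot_state	state;
	int		completions;	/* times completion actually ran */
};

/* Mark a slot as carrying a request that is now in flight. */
static void
slot_start(struct slot *s)
{
	s->state = SLOT_ACTIVE;
	s->completions = 0;
}

/*
 * Complete a request.  Returns true if this call finished the request,
 * false if the slot was not active (e.g. the other path already
 * completed it), in which case nothing is touched.
 */
static bool
slot_complete(struct slot *s, const char *who)
{
	if (s->state != SLOT_ACTIVE) {
		fprintf(stderr, "duplicate completion from %s ignored\n", who);
		return (false);
	}
	s->state = SLOT_DONE;
	s->completions++;
	return (true);
}
```

Without the state check, a second call for the same slot would run the
completion work again on a request that is already finished, which is the
kind of double completion suspected above. A check like this only hides the
symptom; the error handling path that issues the second call would still
need to be found.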

At the moment I have no idea where to look, other than rereading the whole
error handling logic. It would help a lot if I could reproduce the
problem in a reasonable time in a controlled environment, with access to a
serial (or some other) console and the ability to insert additional
debugging.

-- 
Alexander Motin


