Date: Fri, 10 Sep 2010 21:38:24 +0300
From: Alexander Motin <mav@FreeBSD.org>
To: a.smith@ukgrid.net
Cc: freebsd-fs@freebsd.org, Andriy Gapon <avg@icyb.net.ua>
Subject: Re: ZFS related kernel panic
Message-ID: <4C8A7B20.7090408@FreeBSD.org>
In-Reply-To: <20100910184921.16956kbaskhrsmg4@webmail2.ukgrid.net>
References: <20100909140000.5744370gkyqv4eo0@webmail2.ukgrid.net>
 <20100909182318.11133lqu4q4u1mw4@webmail2.ukgrid.net>
 <4C89D6A8.1080107@icyb.net.ua>
 <20100910143900.20382xl5bl6oo9as@webmail2.ukgrid.net>
 <20100910141127.GA13056@icarus.home.lan>
 <20100910155510.11831w104qjpyc4g@webmail2.ukgrid.net>
 <20100910152544.GA14636@icarus.home.lan>
 <20100910173912.205969tzhjiovf8c@webmail2.ukgrid.net>
 <4C8A6B26.8050305@icyb.net.ua>
 <20100910184921.16956kbaskhrsmg4@webmail2.ukgrid.net>

Hi.

a.smith@ukgrid.net wrote:
> Quoting Andriy Gapon <avg@icyb.net.ua>:
>> That's useful information I think.
>> For completeness you might also run kgdb and then do:
>> (kgdb) info line *siis_end_transaction+0x45
>> and so on for all stack trace lines involving siis.
>
> Ok, I got this running against the data in /var/crash, is that ok?
>
> (kgdb) info line *siis_end_transaction+0x45
> Line 1188 of "/usr/src/sys/modules/siis/../../dev/siis/siis.c"
>   starts at address 0xffffffff80ea5a05 <siis_end_transaction+69>
>   and ends at 0xffffffff80ea5a13 <siis_end_transaction+83>.
> (kgdb) info line *siis_ch_intr_locked+0x3e
> Line 809 of "/usr/src/sys/modules/siis/../../dev/siis/siis.c"
>   starts at address 0xffffffff80ea780e <siis_ch_intr_locked+62>
>   and ends at 0xffffffff80ea7811 <siis_ch_intr_locked+65>.
> (kgdb) info line *siis_intr+0x61
> Line 298 of "/usr/src/sys/modules/siis/../../dev/siis/siis.c"
>   starts at address 0xffffffff80ea7a61 <siis_intr+97>
>   and ends at 0xffffffff80ea7a72 <siis_intr+114>.

It looks like some request was completed twice during timeout handling
(which is quite a complicated process when a port multiplier is used).
So the original problem is probably in hardware (try checking or
replacing cables, the multiplier, ...), which caused the timeout, but
the fact that the driver was unable to handle it is probably a siis(4)
driver bug.

At the moment I have no idea where to look, other than rereading the
whole error handling logic. It would help a lot if I could reproduce
the problem in a reasonable time, in a controlled environment with
access to a serial (or some other) console and the ability to insert
additional debugging.

-- 
Alexander Motin
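
To illustrate what "completed twice" means here, below is a minimal
user-space sketch of a guard against double completion. It is not taken
from siis.c and does not reflect how the driver is actually structured;
the names (struct channel, slot_start, slot_complete, NSLOTS) are
hypothetical and exist only for this example. The idea is simply that a
completion is accepted only while the slot is marked as running, so a
second completion arriving from another path (say, the interrupt handler
after the timeout handler already finished the request) is detected
rather than processed again.

```c
#include <stdbool.h>
#include <stdio.h>

#define NSLOTS 4	/* hypothetical slot count, chosen for the example */

enum slot_state { SLOT_FREE, SLOT_RUNNING };

struct channel {
	enum slot_state	slot[NSLOTS];
	unsigned	double_completions;	/* how many bogus completions were seen */
};

static void
channel_init(struct channel *ch)
{
	for (int i = 0; i < NSLOTS; i++)
		ch->slot[i] = SLOT_FREE;
	ch->double_completions = 0;
}

/* Mark a slot as carrying an outstanding request. */
static bool
slot_start(struct channel *ch, int i)
{
	if (i < 0 || i >= NSLOTS || ch->slot[i] != SLOT_FREE)
		return (false);
	ch->slot[i] = SLOT_RUNNING;
	return (true);
}

/*
 * Complete a request.  Returns true only for the first completion;
 * any later completion of the same slot is counted and rejected.
 */
static bool
slot_complete(struct channel *ch, int i, const char *who)
{
	if (i < 0 || i >= NSLOTS)
		return (false);
	if (ch->slot[i] != SLOT_RUNNING) {
		ch->double_completions++;
		fprintf(stderr, "slot %d: completion from %s ignored, "
		    "slot is not running\n", i, who);
		return (false);
	}
	ch->slot[i] = SLOT_FREE;
	return (true);
}
```

In a real driver a check of this kind would more likely be an assertion
or a debug printf under the channel lock, which is the sort of
additional debugging that would help narrow down the failing path.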