Date: Mon, 11 Feb 2008 13:11:33 +0100
From: remco@spacemarines.us (Remco van Bekkum)
To: Remco van Bekkum <remco@spacemarines.us>
Cc: freebsd-stable@freebsd.org
Subject: Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1
Message-ID: <20080211121133.GA5910@marshal.spacemarines.us>
In-Reply-To: <20080211120057.GA5821@marshal.spacemarines.us>
References: <479A0731.6020405@skyrush.com>
 <20080125162940.GA38494@eos.sc1.parodius.com>
 <479A3764.6050800@skyrush.com>
 <3803988D-8D18-4E89-92EA-19BF62FD2395@mac.com>
 <479A4CB0.5080206@skyrush.com>
 <20080126003845.GA52183@eos.sc1.parodius.com>
 <20080211120057.GA5821@marshal.spacemarines.us>
On Mon, Feb 11, 2008 at 01:00:57PM +0100, Remco van Bekkum wrote:
> On Fri, Jan 25, 2008 at 04:38:46PM -0800, Jeremy Chadwick wrote:
> > Joe, I wanted to send you a note about something that I'm still in the
> > process of dealing with. The timing couldn't be more ironic.
> >
> > I decided it would be worthwhile to migrate from my two-disk ZFS stripe
> > with a non-ZFS disk for nightly backups, to a RAIDZ pool of all 3
> > disks combined (since they're all the same size). I had another
> > terminal with gstat -I500ms running in it, so I could see overall I/O.
> >
> > All was going well until about the 81GB mark of the copy. gstat started
> > showing 0KB in/out on all the drives, and the rsync was stalled. ^Z did
> > nothing, which is usually a bad sign. :-) I ssh'd in and did a dmesg
> > (summarised):
> >
> > ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> > ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> > ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951071
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951327
> > ad6: FAILURE - WRITE_DMA timed out LBA=13951071
> > ad6: FAILURE - WRITE_DMA timed out LBA=13951327
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951583
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951839
> > ad6: FAILURE - WRITE_DMA timed out LBA=13951583
> > ad6: FAILURE - WRITE_DMA timed out LBA=13951839
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952095
> > ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952351
> > g_vfs_done():ad6s1d[WRITE(offset=7142916096, length=131072)]error = 5
> > g_vfs_done():ad6s1d[WRITE(offset=7143047168, length=131072)]error = 5
> > g_vfs_done():ad6s1d[WRITE(offset=7143178240, length=131072)]error = 5
> > g_vfs_done():ad6s1d[WRITE(offset=7143309312, length=131072)]error = 5
> > g_vfs_done():ad6s1d[WRITE(offset=7143440384, length=131072)]error = 5
> >
> > It appears my /dev/ad6 (a Seagate -- more irony) must have some bad
> > blocks. Actually, after letting things go for a while, I realised the
> > box just locked up. Probably kernel panic'd due to the I/O problem.
> > I'll have to poke at SMART stats later to see what showed up.
> >
> > --
> > | Jeremy Chadwick                                jdc at parodius.com |
> > | Parodius Networking                       http://www.parodius.com/ |
> > | UNIX Systems Administrator                  Mountain View, CA, USA |
> > | Making life hard for others since 1977.              PGP: 4BD6C0CB |
> >
> > _______________________________________________
> > freebsd-stable@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>
> Hi all,
>
> After having replaced my first SATA disk with one of the same type,
> having still the same errors, I replaced this 1TB drive with 4x500GB
> Hitachi P7K500 in raidz. It worked fine for a week, but yesterday I
> cvsupped and rebuilt world. This afternoon everything is breaking down
> again with the same errors:
>
> Feb 11 12:34:09 xaero kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Feb 11 12:34:13 xaero kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Feb 11 12:34:17 xaero kernel: ad6: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> Feb 11 12:34:21 xaero kernel: ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> Feb 11 12:34:25 xaero kernel: ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly
> Feb 11 12:34:25 xaero kernel: ad6: FAILURE - WRITE_DMA48 timed out LBA=298014274
>
> Feb 11 12:34:29 xaero kernel: ad8: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Feb 11 12:34:33 xaero kernel: ad8: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Feb 11 12:34:37 xaero kernel: ad8: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> Feb 11 12:34:41 xaero kernel: ad8: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> Feb 11 12:34:45 xaero kernel: ad8: WARNING - SET_MULTI taskqueue timeout - completing request directly
> Feb 11 12:34:45 xaero kernel: ad8: FAILURE - WRITE_DMA48 timed out LBA=298013590
>
> So of 6 new disks I have 4 with the same errors. It would be quite safe then
> to not blame the disks imho. I've tested the second drive in another
> machine, but still got these timeout errors. What's wrong here?
> It's on an amd64, Asus m2a-vm with ati xp600, AMD BE-2350 CPU, 2GB
> 800MHz RAM.
>
> Regards,
>
> Remco

Sorry, ati ixp sb600 that is...

Remco
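
For anyone comparing logs like the ones quoted above across several disks, here is a minimal Python sketch that tallies the ata messages per device and collects the LBAs of failed requests. It is only an illustration: the function name summarize, the regular expression, and the assumption that each message sits on a single unwrapped line are mine, not anything from the ata driver or from this thread. The sample lines are copied from the quoted log with the mail line wrapping undone.

```python
import re
from collections import Counter

# Matches messages such as
#   "ad6: FAILURE - WRITE_DMA48 timed out LBA=298014274"
# with or without a leading syslog timestamp and host name.
# Assumption: one message per line (mail-wrapped lines must be rejoined first).
ATA_MSG = re.compile(r"\b(ad\d+): (WARNING|TIMEOUT|FAILURE) - (.+?)(?: LBA=(\d+))?$")


def summarize(lines):
    """Count (device, severity) pairs and collect LBAs of FAILURE messages."""
    counts = Counter()
    failed_lbas = {}
    for line in lines:
        m = ATA_MSG.search(line.strip())
        if m is None:
            continue  # not an ata message in the expected shape
        dev, severity, _text, lba = m.groups()
        counts[(dev, severity)] += 1
        if severity == "FAILURE" and lba is not None:
            failed_lbas.setdefault(dev, []).append(int(lba))
    return counts, failed_lbas


sample = [
    "Feb 11 12:34:21 xaero kernel: ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly",
    "Feb 11 12:34:25 xaero kernel: ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly",
    "Feb 11 12:34:25 xaero kernel: ad6: FAILURE - WRITE_DMA48 timed out LBA=298014274",
    "Feb 11 12:34:45 xaero kernel: ad8: FAILURE - WRITE_DMA48 timed out LBA=298013590",
]

counts, failed_lbas = summarize(sample)
for (dev, severity), n in sorted(counts.items()):
    print(dev, severity, n)
for dev, lbas in sorted(failed_lbas.items()):
    print(dev, "failed LBAs:", lbas)
```

Feeding it the full /var/log/messages (read line by line) would show at a glance whether the failures cluster on one device or are spread across all disks on the controller, which is the distinction the thread is trying to draw.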
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20080211121133.GA5910>