Date:      Fri, 25 Jan 2008 16:38:46 -0800
From:      Jeremy Chadwick <koitsu@FreeBSD.org>
To:        Joe Peterson <joe@skyrush.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1
Message-ID:  <20080126003845.GA52183@eos.sc1.parodius.com>
In-Reply-To: <479A4CB0.5080206@skyrush.com>
References:  <479A0731.6020405@skyrush.com> <20080125162940.GA38494@eos.sc1.parodius.com> <479A3764.6050800@skyrush.com> <3803988D-8D18-4E89-92EA-19BF62FD2395@mac.com> <479A4CB0.5080206@skyrush.com>

Joe, I wanted to send you a note about something that I'm still in the
process of dealing with.  The timing couldn't be more ironic.

I decided it would be worthwhile to migrate from my two-disk ZFS stripe
plus a non-ZFS disk for nightly backups to a RAIDZ pool of all three
disks combined (since they're all the same size).  I had another
terminal with gstat -I500ms running in it, so I could see overall I/O.

All was going well until about the 81GB mark of the copy.  gstat started
showing 0KB in/out on all the drives, and the rsync was stalled.  ^Z did
nothing, which is usually a bad sign.  :-)  I ssh'd in and did a dmesg
(summarised):

ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951071
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951327
ad6: FAILURE - WRITE_DMA timed out LBA=13951071
ad6: FAILURE - WRITE_DMA timed out LBA=13951327
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951583
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13951839
ad6: FAILURE - WRITE_DMA timed out LBA=13951583
ad6: FAILURE - WRITE_DMA timed out LBA=13951839
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952095
ad6: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=13952351
g_vfs_done():ad6s1d[WRITE(offset=7142916096, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143047168, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143178240, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143309312, length=131072)]error = 5
g_vfs_done():ad6s1d[WRITE(offset=7143440384, length=131072)]error = 5

It appears my /dev/ad6 (a Seagate -- more irony) has some bad blocks.
Actually, after letting things go for a while, I realised the box had
locked up completely.  It probably panicked due to the I/O problem.
I'll have to poke at SMART stats later to see what showed up.
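
As a sanity check, the g_vfs_done offsets in the dmesg output can be tied
back to the LBAs in the WRITE_DMA messages.  Below is a minimal Python
sketch of that arithmetic.  It assumes 512-byte sectors and that ad6s1d
begins at absolute sector 63; neither is stated anywhere above, both are
inferred from the numbers themselves, and offset_to_lba is a hypothetical
helper named only for this illustration.

```python
# Assumptions (not stated in the dmesg output, inferred from the numbers):
SECTOR_SIZE = 512       # bytes per sector
PARTITION_START = 63    # absolute sector where ad6s1d is assumed to begin


def offset_to_lba(offset, partition_start=PARTITION_START,
                  sector_size=SECTOR_SIZE):
    """Convert a byte offset inside the partition to an absolute disk LBA."""
    return partition_start + offset // sector_size


# Byte offsets copied from the g_vfs_done() lines.
offsets = [7142916096, 7143047168, 7143178240, 7143309312, 7143440384]

lbas = [offset_to_lba(o) for o in offsets]

for offset, lba in zip(offsets, lbas):
    print(f"offset={offset} -> LBA={lba}")

# Each failed write is 131072 bytes long, i.e. this many sectors:
sectors_per_write = 131072 // SECTOR_SIZE
print(f"sectors per write: {sectors_per_write}")
```

Under those assumptions the five offsets come out as LBAs 13951071,
13951327, 13951583, 13951839 and 13952095, which are the first five LBAs
in the WRITE_DMA lines, each 256 sectors (131072 bytes) apart.  So the
filesystem-level errors and the ATA-level timeouts are describing the
same failed writes, not two separate problems.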

-- 
| Jeremy Chadwick                                    jdc at parodius.com |
| Parodius Networking                           http://www.parodius.com/ |
| UNIX Systems Administrator                      Mountain View, CA, USA |
| Making life hard for others since 1977.                  PGP: 4BD6C0CB |



