FreeBSD Mail Archives

Date:      Thu, 30 Aug 2007 23:48:49 +0100
From:      "Steven Hartland" <killing@multiplay.co.uk>
To:        "Mark Powell" <M.S.Powell@salford.ac.uk>, <freebsd-current@freebsd.org>
Subject:   Re: Another ZFS kernel panic on same block on every drive in raidz
Message-ID:  <03b401c7eb57$ee714030$b6db87d4@multiplay.co.uk>
References:  <20070830183305.X60345@rust.salford.ac.uk>

That sounds very much like an overflow error on the controller / drive.

We had a very similar issue with the Highpoint 1820a drivers, which turned out
to be a compatibility issue between the drive firmware and the controller.

The controller was using standard (28-bit) LBA to access the drive up until the
point where 48-bit LBA was required. This caused problems with, in our case,
Seagate drives, which would report an error when addressed this way beyond a
specific point. The fix was for the controller to always use 48-bit addressing
for drives which supported it.
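
As a rough illustration of why that boundary looks relevant here: 28-bit LBA
can address 2^28 = 268435456 sectors, and the failing LBA in the quoted log
(268435340) sits just below that. The sketch below is only a sanity check, not
driver code. It assumes 512-byte sectors and that the 65536-byte ZFS write
reaches the drive as a single request; the function name is made up for
illustration.

```python
SECTOR_SIZE = 512          # assumption: 512-byte sectors
LBA28_LIMIT = 1 << 28      # 28-bit LBA addresses sectors 0 .. 2**28 - 1


def crosses_lba28_limit(start_lba, nbytes, sector_size=SECTOR_SIZE):
    """Return True if a request starts within 28-bit range but ends beyond it.

    Hypothetical helper for illustration only.
    """
    nsectors = -(-nbytes // sector_size)  # ceiling division
    return start_lba < LBA28_LIMIT and start_lba + nsectors > LBA28_LIMIT


start = 268435340          # LBA from the WRITE_DMA timeouts in the log
size = 65536               # size from the ZFS vdev I/O failure lines

print(LBA28_LIMIT - start)               # sectors left below the limit: 116
print(crosses_lba28_limit(start, size))  # True
```

If those assumptions hold, the request starts 116 sectors below the limit and
runs 12 sectors past it, which would be consistent with a 28-bit/48-bit
addressing problem, though it does not prove one.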

Hope this helps.

    Regards
    Steve

----- Original Message ----- 
From: "Mark Powell" <M.S.Powell@salford.ac.uk>
To: <freebsd-current@freebsd.org>
Sent: Thursday, August 30, 2007 6:47 PM
Subject: Another ZFS kernel panic on same block on every drive in raidz


> Hi,
>   I am testing a 3 drive raidz1 array which has been built with 3 new WD 
> 500GB SATA drives /dev/ad1[468], bought from 2 different sources.
>   I am being told that a DMA error is occurring on the same block on all 3 
> drives at the same time:
> 
> Aug 30 18:13:15 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:15 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:15 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:25 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad14s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:25 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad16s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:25 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad18s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:41 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad18s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:41 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad14s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:41 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad16s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:41 echo root: ZFS: vdev I/O failure, zpool=pool path= offset=396215451648 size=131072 error=5
> 
> And then the kernel panics:
> 
> panic: ZFS: I/O failure (write on <unknown> off 0: zio 0xffffff0013b0d000 
> [L0 ZFS plain file] 20000L/20000P DVA[0]=<5:5c40480000:30000> fletcher2 
> uncompressed LE contiguous birth=20167 fill=1 cksum=cfcfcfcfcfcfce00:cfcfcfcfcfcfce00:8a8a8a8a8a56e700:8a8a8a8a8a56e
> cpuid = 0
> 
>   I think I saw someone else have a similar problem to this. They were 
> told their hardware was probably flaky and to look for errors with geli.
>   Just performing a scrub now to see what happens.
>   Let me know if you need any further info.
>   Cheers.
> 
> -- 
> Mark Powell - UNIX System Administrator - The University of Salford
> Information Services Division, Clifford Whitworth Building,
> Salford University, Manchester, M5 4WT, UK.
> Tel: +44 161 295 4837  Fax: +44 161 295 5888  www.pgp.com for PGP key

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?03b401c7eb57$ee714030$b6db87d4>
