Date:      Thu, 25 Dec 2014 16:39:06 +0200
From:      George Kontostanos <gkontos.mail@gmail.com>
To:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>, freebsd-hardware@freebsd.org
Subject:   LSI SAS 9300-8i weird ZFS checksum errors
Message-ID:  <CA%2BdUSyo56ioZC4Kn4XTcf_GgeSsQrtd7FYpCxjsqOxQ5ON-_CA@mail.gmail.com>

Hello list, and Merry Christmas to all,

I am seeing some weird checksum errors during a scrub. The configuration
is as follows:

Board:        Supermicro Motherboard X10DRi-T4+ (
http://www.supermicro.com/products/motherboard/xeon/c600/x10dri-t4_.cfm)
Controller:  LSI SAS 9300-8i (
http://www.lsi.com/products/host-bus-adapters/pages/lsi-sas-9300-8i.aspx)
HDD:         21 x 6TB Western Digital WD60EFRX
SSD:         2 x Intel SATA 600GB Solid-State Drive SSDSC2BB600G401 DC S3500
(swap, ZIL, cache)
Chassis:    Supermicro 847BE1C-R1K28LPB 4U Storage Chassis
RAM:         64 GB

I initially installed FreeBSD 10.1-RELEASE and created one pool consisting
of 3 x 7-disk vdevs in RAIDZ3. I used NFS to start copying some data. After
copying around 3TB I initiated a scrub.
The result was the following: http://pastebin.com/rswgCY2A and
http://pastebin.com/DQ2urGXk

I tried to flash the controller, but the LSI utility did not recognize it.
I then installed FreeBSD 9.3-RELEASE with LSI's mpslsi3 driver, and that
way I was able to flash the latest BIOS and firmware.

LSI Corporation SAS3 Flash Utility
Version 07.00.00.00 (2014.08.14)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved

Adapter Selected is a LSI SAS: SAS3008(C0)

Controller Number              : 0
Controller                     : SAS3008(C0)
PCI Address                    : 00:82:00:00
SAS Address                    : 500605b-0-06ce-27e0
NVDATA Version (Default)       : 06.03.00.05
NVDATA Version (Persistent)    : 06.03.00.05
Firmware Product ID            : 0x2221 (IT)
Firmware Version               : 06.00.00.00
NVDATA Vendor                  : LSI
NVDATA Product ID              : SAS9300-8i
BIOS Version                   : 08.13.00.00
UEFI BSD Version               : 02.00.00.00
FCODE Version                  : N/A
Board Name                     : SAS9300-8i
Board Assembly                 : H3-25573-00E
Board Tracer Number            : SV32928040

I recreated the pool and started writing data via NFS again. After 3 TB of
data I started a scrub, and I am still getting checksum errors, although
there are no longer any messages about the drives in /var/log/messages.

  pool: Pool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P

  scan: scrub in progress since Thu Dec 25 08:46:21 2014
        2.28T scanned out of 5.54T at 816M/s, 1h9m to go
        11.9M repaired, 41.26% done
config:

NAME                     STATE     READ WRITE CKSUM
Pool                     ONLINE       0     0     0
  raidz3-0               ONLINE       0     0     0
    gpt/WD-WX41D94RN5A3  ONLINE       0     0    15  (repairing)
    gpt/WD-WX41D948YE1U  ONLINE       0     0    14  (repairing)
    gpt/WD-WX41D94RN879  ONLINE       0     0    16  (repairing)
    gpt/WD-WX21D947NC83  ONLINE       0     0    24  (repairing)
    gpt/WD-WX21D947NT77  ONLINE       0     0    15  (repairing)
    gpt/WD-WX41D948YAKV  ONLINE       0     0    19  (repairing)
    gpt/WD-WX21D9421SCV  ONLINE       0     0    20  (repairing)
  raidz3-1               ONLINE       0     0     0
    gpt/WD-WX21D9421F6F  ONLINE       0     0    16  (repairing)
    gpt/WD-WX41D948YPN4  ONLINE       0     0    14  (repairing)
    gpt/WD-WX21D947NE2K  ONLINE       0     0    22  (repairing)
    gpt/WD-WX41D948Y2PX  ONLINE       0     0    19  (repairing)
    gpt/WD-WX41D94RNAX7  ONLINE       0     0    17  (repairing)
    gpt/WD-WX21D947N1RP  ONLINE       0     0    12  (repairing)
    gpt/WD-WX21D94216X7  ONLINE       0     0    20  (repairing)
  raidz3-2               ONLINE       0     0     0
    gpt/WD-WX41D948YAHP  ONLINE       0     0    25  (repairing)
    gpt/WD-WX21D947N06F  ONLINE       0     0    18  (repairing)
    gpt/WD-WX21D947N3T1  ONLINE       0     0    21  (repairing)
    gpt/WD-WX41D94RNT7D  ONLINE       0     0     5  (repairing)
    gpt/WD-WX41D948Y9VV  ONLINE       0     0    18  (repairing)
    gpt/WD-WX41D94RNS62  ONLINE       0     0    24  (repairing)
    gpt/WD-WX21D9421ZP9  ONLINE       0     0    28  (repairing)
logs
  mirror-3               ONLINE       0     0     0
    gpt/zil0             ONLINE       0     0     0
    gpt/zil1             ONLINE       0     0     0
cache
  gpt/cache0             ONLINE       0     0     0
  gpt/cache1             ONLINE       0     0     0

errors: No known data errors
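
To make the pattern in the output above easier to see, here is a minimal Python sketch that totals the CKSUM column per vdev. It is only an illustration, not something that was run on the system: the function name cksum_by_vdev is made up, and the parser assumes the indentation of the config section is preserved and that the counters are plain integers (no "1.2K" style abbreviations).

```python
import re
from collections import defaultdict

# Config section copied from the `zpool status` output above.
STATUS = """\
NAME                     STATE     READ WRITE CKSUM
Pool                     ONLINE       0     0     0
  raidz3-0               ONLINE       0     0     0
    gpt/WD-WX41D94RN5A3  ONLINE       0     0    15  (repairing)
    gpt/WD-WX41D948YE1U  ONLINE       0     0    14  (repairing)
    gpt/WD-WX41D94RN879  ONLINE       0     0    16  (repairing)
    gpt/WD-WX21D947NC83  ONLINE       0     0    24  (repairing)
    gpt/WD-WX21D947NT77  ONLINE       0     0    15  (repairing)
    gpt/WD-WX41D948YAKV  ONLINE       0     0    19  (repairing)
    gpt/WD-WX21D9421SCV  ONLINE       0     0    20  (repairing)
  raidz3-1               ONLINE       0     0     0
    gpt/WD-WX21D9421F6F  ONLINE       0     0    16  (repairing)
    gpt/WD-WX41D948YPN4  ONLINE       0     0    14  (repairing)
    gpt/WD-WX21D947NE2K  ONLINE       0     0    22  (repairing)
    gpt/WD-WX41D948Y2PX  ONLINE       0     0    19  (repairing)
    gpt/WD-WX41D94RNAX7  ONLINE       0     0    17  (repairing)
    gpt/WD-WX21D947N1RP  ONLINE       0     0    12  (repairing)
    gpt/WD-WX21D94216X7  ONLINE       0     0    20  (repairing)
  raidz3-2               ONLINE       0     0     0
    gpt/WD-WX41D948YAHP  ONLINE       0     0    25  (repairing)
    gpt/WD-WX21D947N06F  ONLINE       0     0    18  (repairing)
    gpt/WD-WX21D947N3T1  ONLINE       0     0    21  (repairing)
    gpt/WD-WX41D94RNT7D  ONLINE       0     0     5  (repairing)
    gpt/WD-WX41D948Y9VV  ONLINE       0     0    18  (repairing)
    gpt/WD-WX41D94RNS62  ONLINE       0     0    24  (repairing)
    gpt/WD-WX21D9421ZP9  ONLINE       0     0    28  (repairing)
logs
  mirror-3               ONLINE       0     0     0
    gpt/zil0             ONLINE       0     0     0
    gpt/zil1             ONLINE       0     0     0
cache
  gpt/cache0             ONLINE       0     0     0
  gpt/cache1             ONLINE       0     0     0
"""

# One device row: indentation, name, state, then three integer counters.
# The header row and the bare "logs"/"cache" lines do not match and are skipped.
ROW = re.compile(
    r"^(?P<indent>[ \t]*)(?P<name>\S+)\s+[A-Z]+\s+\d+\s+\d+\s+(?P<cksum>\d+)"
)


def cksum_by_vdev(text):
    """Sum the CKSUM counters of leaf devices, grouped by their parent vdev."""
    rows = [
        (len(m["indent"]), m["name"], int(m["cksum"]))
        for m in map(ROW.match, text.splitlines())
        if m
    ]
    totals = defaultdict(int)
    parent, parent_indent = None, -1
    for i, (indent, name, cksum) in enumerate(rows):
        # A row is a leaf if the next row is not indented deeper than it.
        is_leaf = i + 1 == len(rows) or rows[i + 1][0] <= indent
        if not is_leaf:
            parent, parent_indent = name, indent
        elif parent is not None and indent > parent_indent:
            totals[parent] += cksum
        else:
            # Leaf with no enclosing vdev (e.g. a cache device): its own group.
            totals[name] += cksum
    return dict(totals)


if __name__ == "__main__":
    for vdev, total in cksum_by_vdev(STATUS).items():
        print(f"{vdev:12s} {total}")
```

For the output above this gives 123, 120 and 139 checksum errors for raidz3-0, raidz3-1 and raidz3-2, and 0 for the log mirror and both cache devices. Every one of the 21 data disks has a non-zero count.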

This is really driving me crazy, since smartmontools does not report any
errors on the drives.

Any suggestions are most welcome!

Thank you for your time,

--
George Kontostanos