Date: Wed, 27 Mar 2024 16:00:06 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 277992] mpr and possible trim issues
Message-ID: <bug-277992-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277992

            Bug ID: 277992
           Summary: mpr and possible trim issues
           Product: Base System
           Version: 14.0-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: mike@sentex.net

The thread
https://lists.freebsd.org/archives/freebsd-hardware/2024-March/000094.html
has most of the details.

In summary, a set of WD Blue SA510 SSDs with the latest firmware as of Mar
2024 will eventually start throwing errors and detach from the controller
when I copy and then destroy a zfs dataset with several million files. It
sort of feels like a TRIM issue, but I am not sure. With the disks attached
to the onboard SATA controller, I cannot recreate the issue.

If I start with a low level trim (trim -f /dev/daX), create a raidz1 zfs
pool with four 1 TB WD disks, import a dataset of about 280GB (compressed)
that has many files (20+ million), do a
zfs send original pool | zfs recv copy-of-pool, then
zfs destroy copy-of-pool, and repeat about 4 or 5 times, the drives in the
pool will start throwing errors.

If I do a hard trim of the disks, I can start from scratch and again get 4
or 5 cycles before the errors. Hence, it sort of feels like a broken trim
issue?

I tried with auto trim on and off, and with a manual zfs trim <pool> between
the zfs send | zfs recv tests, to no avail. When the disks are on the mpr
controller I will get errors such as

(da6:mpr0:0:16:0): READ(10). CDB: 28 00 6d e0 ae 28 00 00 08 00
(da6:mpr0:0:16:0): CAM status: CCB request completed with an error
(da6:mpr0:0:16:0): Retrying command, 3 more tries remain
(da6:mpr0:0:16:0): WRITE(10). CDB: 2a 00 0c cb 3f 00 00 00 e8 00
(da6:mpr0:0:16:0): CAM status: CCB request completed with an error
(da6:mpr0:0:16:0): Retrying command, 3 more tries remain
(da6:mpr0:0:16:0): READ(10). CDB: 28 00 6d e0 ad 28 00 01 00 00
(da6:mpr0:0:16:0): CAM status: CCB request completed with an error
(da6:mpr0:0:16:0): Retrying command, 3 more tries remain
(da6:mpr0:0:16:0): READ(10). CDB: 28 00 6d e0 ac 28 00 00 f8 00
(da6:mpr0:0:16:0): CAM status: CCB request completed with an error
(da6:mpr0:0:16:0): Retrying command, 3 more tries remain
(da6:mpr0:0:16:0): WRITE(10). CDB: 2a 00 40 07 df 88 00 01 00 00
(da6:mpr0:0:16:0): CAM status: CCB request completed with an error
(da6:mpr0:0:16:0): Retrying command, 3 more tries remain
(da6:mpr0:0:16:0): WRITE(10). CDB: 2a 00 3f 48 72 08 00 01 00 00
(da6:mpr0:0:16:0): CAM status: SCSI Status Error
(da6:mpr0:0:16:0): SCSI status: Check Condition
(da6:mpr0:0:16:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da6:mpr0:0:16:0): Retrying command (per sense data)
mpr0: Controller reported scsi ioc terminated tgt 15 SMID 2036 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 15 SMID 637 loginfo 31110f00
(da5:mpr0:0:15:0): WRITE(10). CDB: 2a 00 41 98 42 00 00 01 00 00
mpr0: Controller reported scsi ioc terminated tgt 15 SMID 1242 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 15 SMID 979 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 15 SMID 1243 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 15 SMID 2091 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 15 SMID 1612 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 15 SMID 2093 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 15 SMID 152 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 15 SMID 2132 loginfo 31110f00
(da5:mpr0:0:15:0): CAM status: CCB request completed with an error
(da5:mpr0:0:15:0): Retrying command, 3 more tries remain
(da5:mpr0:0:15:0): WRITE(10). CDB: 2a 00 43 17 dc 88 00 01 00 00
(da5:mpr0:0:15:0): CAM status: CCB request completed with an error
(da5:mpr0:0:15:0): Retrying command, 3 more tries remain
(da5:mpr0:0:15:0): WRITE(10). CDB: 2a 00 41 98 43 00 00 00 50 00
(da5:mpr0:0:15:0): CAM status: CCB request completed with an error
(da5:mpr0:0:15:0): Retrying command, 3 more tries remain
(da5:mpr0:0:15:0): WRITE(10). CDB: 2a 00 0c d4 f6 80 00 00 68 00
(da5:mpr0:0:15:0): CAM status: CCB request completed with an error
(da5:mpr0:0:15:0): Retrying command, 3 more tries remain
(da5:mpr0:0:15:0): WRITE(10). CDB: 2a 00 0c d4 f5 80 00 01 00 00
(da5:mpr0:0:15:0): CAM status: CCB request completed with an error
(da5:mpr0:0:15:0): Retrying command, 3 more tries remain
(da5:mpr0:0:15:0): READ(10). CDB: 28 00 05 dc 12 28 00 00 f8 00
(da5:mpr0:0:15:0): CAM status: CCB request completed with an error
(da5:mpr0:0:15:0): Retrying command, 3 more tries remain
(da5:mpr0:0:15:0): READ(10). CDB: 28 00 05 dc 0f b0 00 00 88 00
(da5:mpr0:0:15:0): CAM status: CCB request completed with an error
(da5:mpr0:0:15:0): Retrying command, 3 more tries remain
(da5:mpr0:0:15:0): WRITE(10). CDB: 2a 00 02 96 7e 80 00 00 10 00
(da5:mpr0:0:15:0): CAM status: CCB request completed with an error
(da5:mpr0:0:15:0): Retrying command, 3 more tries remain
(da5:mpr0:0:15:0): READ(10). CDB: 28 00 6f 5b 8d 68 00 01 00 00
(da5:mpr0:0:15:0): CAM status: CCB request completed with an error
(da5:mpr0:0:15:0): Retrying command, 3 more tries remain
(da5:mpr0:0:15:0): WRITE(10). CDB: 2a 00 41 98 42 00 00 01 00 00
(da5:mpr0:0:15:0): CAM status: SCSI Status Error
(da5:mpr0:0:15:0): SCSI status: Check Condition
(da5:mpr0:0:15:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da5:mpr0:0:15:0): Retrying command (per sense data)

The same tests with Samsung disks work without issue, or at least I was not
able to recreate the error.

# mprutil show adapter
mpr0 Adapter:
       Board Name: INSPUR 3008IT
   Board Assembly: INSPUR
        Chip Name: LSISAS3008
    Chip Revision: ALL
    BIOS Revision: 18.00.00.00
Firmware Revision: 16.00.12.00
  Integrated RAID: no
         SATA NCQ: ENABLED
 PCIe Width/Speed: x8 (8.0 GB/sec)
        IOC Speed: Full
      Temperature: 56 C

I originally ran into this problem with the same series of LSI adapter, but
it was not in IT mode and instead was using the mrsas driver.

When on the ATA controller the disks are DSM_TRIM. When on MPR, they are
ATA_TRIM.

-- 
You are receiving this mail because:
You are the assignee for the bug.
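
The READ(10) and WRITE(10) CDBs in the log above can be decoded to see which
block ranges the failing commands touched. What follows is a minimal sketch,
not part of the original report: the helper name decode_rw10 is hypothetical,
and it assumes the standard 10-byte layout (opcode, flags, 4-byte big-endian
LBA, group number, 2-byte big-endian transfer length, control).

```python
def decode_rw10(cdb_hex):
    """Decode a READ(10)/WRITE(10) CDB given as space-separated hex bytes.

    Hypothetical helper for reading the log lines; assumes the standard
    10-byte CDB layout with a big-endian LBA and transfer length.
    """
    raw = bytes.fromhex(cdb_hex)
    if len(raw) != 10:
        raise ValueError("expected a 10-byte CDB")

    names = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
    if raw[0] not in names:
        raise ValueError("not a READ(10) or WRITE(10) CDB")

    return {
        "op": names[raw[0]],
        "lba": int.from_bytes(raw[2:6], "big"),     # bytes 2-5: starting LBA
        "blocks": int.from_bytes(raw[7:9], "big"),  # bytes 7-8: length in blocks
    }


if __name__ == "__main__":
    # Two CDBs copied from the da6 log lines.
    for cdb in ("28 00 6d e0 ae 28 00 00 08 00",
                "2a 00 0c cb 3f 00 00 00 e8 00"):
        print(decode_rw10(cdb))
```

The transfer length is in logical blocks, so converting it to bytes needs the
sector size the drive reports, which the log does not show.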