Date: Tue, 09 Feb 2021 14:14:48 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 253373] mps(4) drive status request during spin-up timeout results in zpool degradation
Message-ID: <bug-253373-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253373

            Bug ID: 253373
           Summary: mps(4) drive status request during spin-up timeout
                    results in zpool degradation
           Product: Base System
           Version: 12.2-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: grembo@FreeBSD.org

Short description:
------------------

When a drive spins up while `camcontrol epc -c status' is run on that drive, a command timeout occurs that leaves the drive temporarily unavailable and marked as FAULTED in the zpool. Increasing MPS_ATA_ID_TIMEOUT (in code, not a tunable) might be a workaround.

Long version:
-------------

I started seeing this after updating from 11.3 to 12.1 and, subsequently, 12.2. It's not necessarily caused by the update itself; it might be related to switching to `camcontrol epc' after the update - I used `camcontrol cmd -a "E5 00..."' beforehand.

I have an LSI (Avago/Broadcom) SAS2008-based controller with two spindles (Seagate IronWolf) connected to it, storing data only (the system drives are separate). As this is a home NAS, the drives are configured to spin down when idle for more than 30 minutes using camcontrol:

  /sbin/camcontrol standby /dev/da2 -v -t 1800
  /sbin/camcontrol standby /dev/da3 -v -t 1800

A cron job running every ten minutes logs the spin state of the drives using camcontrol:

  camcontrol epc da2 -c status -P
  camcontrol epc da3 -c status -P

All of this works just fine, but once in a while the zpool goes to a DEGRADED state because one of the drives is in a FAULTED state (too many errors). Based on the logs and the timestamps therein, this seems to be triggered when a drive spins up at exactly the same time an EPC status check is done. So the order of events is like this:

1. System is idle, drives are on standby
2. Cron job checks the drive spin state using camcontrol epc status
3. At the same time, a disk access happens, causing the drives in the zpool to spin up
4. Spinning up of the drives delays reporting the spin state
5. The timeout causes the mps driver to abort the command; subsequent requests to the drive fail
6. The failed requests cause the drive to go to a FAULTED state and degrade the zpool

This happens every few weeks by chance; today, for example, I accessed a file at exactly the right moment, which was recorded in the local webserver's error log, so everything matched up.

This is the log output. First, the drive status inquiry (which was started at 10:40:00) fails:

  Feb  9 10:40:05 nas kernel: (pass2:mps0:0:2:0): INQUIRY. CDB: 12 01 00 00 ff 00 length 255 SMID 803 Command timeout on target 2(0x0009) 5000 set, 5.388682389 elapsed

Then the command is aborted:

  Feb  9 10:40:05 nas kernel: mps0: Sending abort to target 2 for SMID 803
  Feb  9 10:40:05 nas kernel: (pass2:mps0:0:2:0): INQUIRY. CDB: 12 01 00 00 ff 00 length 255 SMID 803 Aborting command 0xfffffe00026c5708
  Feb  9 10:40:08 nas kernel: mps0: Controller reported scsi ioc terminated tgt 2 SMID 369 loginfo 31140000
  Feb  9 10:40:08 nas kernel: mps0: Finished abort recovery for target 2
  Feb  9 10:40:08 nas kernel: mps0: Unfreezing devq for target ID 2

At 10:40:09 the webserver's log shows a 502 error (the request for the file hung for a couple of seconds beforehand, so the two events clearly overlapped - I can't say whether the spin-up was already initiated/the webserver request started shortly before the status request or whether it was the other way round).

Following this, syslog shows a ton of messages about the drive not being ready:

  Feb  9 10:40:08 nas kernel: (da2:mps0:0:2:0): READ(10). CDB: 28 00 e4 00 4a 40 00 00 20 00
  Feb  9 10:40:08 nas kernel: (da2:mps0:0:2:0): CAM status: CCB request completed with an error
  Feb  9 10:40:08 nas kernel: (da2:mps0:0:2:0): Retrying command, 9 more tries remain
  Feb  9 10:40:08 nas kernel: (da2:mps0:0:2:0): READ(10).
  CDB: 28 00 e4 00 4a 40 00 00 20 00
  Feb  9 10:40:08 nas kernel: (da2:mps0:0:2:0): CAM status: SCSI Status Error
  Feb  9 10:40:08 nas kernel: (da2:mps0:0:2:0): SCSI status: Check Condition
  Feb  9 10:40:08 nas kernel: (da2:mps0:0:2:0): SCSI sense: NOT READY asc:4,0 (Logical unit not ready, cause not reportable)
  ...
  Feb  9 10:40:09 nas kernel: (da2:mps0:0:2:0): Error 5, Retries exhausted
  ...

Until ZFS realizes the vdev has issues:

  Feb  9 10:40:09 nas ZFS[38943]: vdev state changed, pool_guid=$xxxx vdev_guid=$xxxx

After that there are more NOT READY errors, and another timeout/abort:

  Feb  9 10:40:14 nas kernel: (pass2:mps0:0:2:0): INQUIRY. CDB: 12 01 00 00 ff 00 length 255 SMID 724 Command timeout on target 2(0x0009) 5000 set, 5.353551387 elapsed
  Feb  9 10:40:14 nas kernel: mps0: Sending abort to target 2 for SMID 724
  Feb  9 10:40:14 nas kernel: (pass2:mps0:0:2:0): INQUIRY. CDB: 12 01 00 00 ff 00 length 255 SMID 724 Aborting command 0xfffffe00026bece0
  Feb  9 10:40:14 nas kernel: mps0: Finished abort recovery for target 2
  Feb  9 10:40:14 nas kernel: mps0: Unfreezing devq for target ID 2

Afterwards, the drive is just fine, and running "zpool clear" plus a scrub shows that all is good again (it's also not always the same drive; it just depends on timing).

Now, the obvious workaround is to stop logging the drive spin state for the time being, which should avoid this specific failure, but I wonder whether there is a real solution (either mps should be able to handle the timeout scenario gracefully, or it should be possible to avoid it by increasing a timeout).
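In the meantime, rather than dropping the logging entirely, the cron job itself could be made tolerant of a slow first answer. A minimal sketch, with several assumptions not taken from this report (device names da2/da3 as above, a 30-second budget per query via timeout(1), and a single retry after 15 seconds - all guesses); note this does not fix the mps(4) timeout/abort itself, it only keeps the logging job from failing noisily while a drive is still spinning up:

```shell
#!/bin/sh
# Tolerant spin-state logger sketch. Assumptions: devices da2/da3,
# 30 s per-query budget, one retry after 15 s (all arbitrary choices).
DEVICES="da2 da3"
result=ok
if ! command -v camcontrol >/dev/null 2>&1; then
    result=skipped          # not a FreeBSD box with camcontrol available
else
    for dev in $DEVICES; do
        # -P reports the power state without forcing a spin-up.
        if ! state=$(timeout 30 camcontrol epc "$dev" -c status -P 2>&1); then
            sleep 15        # give a spinning-up drive time to become ready
            state=$(timeout 30 camcontrol epc "$dev" -c status -P 2>&1) ||
                { state="status query failed twice"; result=error; }
        fi
        logger -t spinstate "$dev: $state"
    done
fi
echo "spinstate check: $result"
```

On a machine without camcontrol the script simply reports "skipped", so it is safe to drop into cron unmodified and observe before trusting it.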
I haven't increased kern.cam.scsi_delay yet (it's also 5 seconds, but seems otherwise unrelated). Looking at the mps code, it seems like increasing MPS_ATA_ID_TIMEOUT in /usr/src/sys/dev/mps/mpsvar.h might help avoid this specific timeout:

  #define MPS_ATA_ID_TIMEOUT 5    /* 5 second timeout for SATA ID cmd */

I would prefer to hear from someone familiar with the driver before starting to poke around in there, though.
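Should bumping that define turn out to be the right knob, the experiment itself is a one-line source change before a kernel rebuild. A sketch only - the new value of 15 seconds is my untested guess, not anything validated by the driver maintainers, and the path assumes stock sources:

```shell
#!/bin/sh
# Sketch: raise MPS_ATA_ID_TIMEOUT from 5 to 15 seconds in the kernel
# sources before rebuilding. 15 is a guess; MPSVAR may be overridden to
# point at a copy of the file for a dry run.
MPSVAR=${MPSVAR:-/usr/src/sys/dev/mps/mpsvar.h}
if [ -w "$MPSVAR" ]; then
    # BSD sed: -i '' edits in place without keeping a backup.
    sed -i '' \
        's/#define[[:space:]]*MPS_ATA_ID_TIMEOUT[[:space:]]*5/#define MPS_ATA_ID_TIMEOUT 15/' \
        "$MPSVAR"
    grep MPS_ATA_ID_TIMEOUT "$MPSVAR"
else
    echo "kernel sources not present or not writable; nothing changed" >&2
fi
```

Any change here would of course still need review on whether a longer SATA identify timeout has side effects elsewhere in the driver.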