Date:      Fri, 05 Apr 2024 21:28:12 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 277992] mpr and possible trim issues
Message-ID:  <bug-277992-227-EOjYk1VDVt@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-277992-227@https.bugs.freebsd.org/bugzilla/>
References:  <bug-277992-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277992

--- Comment #2 from Warner Losh <imp@FreeBSD.org> ---
"Power on reset" sense code typically means that the drive dropped off the
bus. If you can rule out flaky power (either bad connectors, or inadequate
power to cope with maximum power draw), then you are left with "commands
sent to the drive freaked it out". I always bet at least a nickel on 'TRIM'.

If you think that this is TRIM related, then you can try disabling TRIM at
the da device level by changing the delete method to 'NONE':
kern.cam.da.0.delete_method: NONE
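
On a running system that could look roughly like the sketch below. It
assumes the suspect disk is da0 and that you are root. The helper name
da_set_delete_method and the SYSCTL override variable are made up for
illustration; only the sysctl name comes from the line above.

```shell
#!/bin/sh
# Illustrative helper, not part of FreeBSD: set the delete (TRIM/UNMAP)
# method for one da(4) unit through the sysctl named above.
# SYSCTL can be overridden (for example SYSCTL=echo) for a dry run.
: "${SYSCTL:=sysctl}"

da_set_delete_method() {
    unit=$1      # da unit number, e.g. 0 for da0 (assumed here)
    method=$2    # method name, e.g. NONE as suggested above
    "$SYSCTL" "kern.cam.da.${unit}.delete_method=${method}"
}
```

Calling `da_set_delete_method 0 NONE` then changes the setting for da0.
sysctl reports the old and new values when it sets a variable, so the
previous method is easy to note down and restore after testing.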

The TRIM may also be too large. You can see what delete_max is, and try
reducing it (I usually halve it each time when looking for problems like
this). I've never had to do this, but it is tunable, and 'too large' is a
known issue, or used to be years ago:
kern.cam.da.0.delete_max: 1048576
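
The halving search can be sketched as a small shell function that prints
the values to try. The function name halve_steps and the stopping floor are
assumptions for illustration, not anything the driver defines; the starting
value is whatever sysctl reports for your drive.

```shell
#!/bin/sh
# Illustrative sketch: print successive halvings of a delete_max value,
# stopping before the value would drop below a chosen floor.
halve_steps() {
    value=$1     # current delete_max as reported by sysctl
    floor=$2     # smallest value worth testing (arbitrary choice)
    value=$((value / 2))
    while [ "$value" -ge "$floor" ]; do
        echo "$value"
        value=$((value / 2))
    done
}
```

For example, `halve_steps 1048576 131072` prints 524288, 262144 and 131072,
one per line. Each value can then be applied as root with
`sysctl kern.cam.da.0.delete_max=VALUE` (again assuming da0) before
re-running the workload that triggers the reset.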

You can leave the trim type at its normal setting to trim everything, and
then turn it off for the drive while you do the zfs send / receive. For
testing purposes, trim doesn't matter in terms of drive life (it only
matters when you write to the drive day in / day out and keep some parts of
the drive unused for more than, say, a few hours or a day).

It may also be that weird commands are being sent by something like smartd.
If you can reproduce this, then we may be able to adapt a prototype dtrace
script I have into a tcpdump-like script. We can use it to dump the last N
commands when we get a unit attention.

Now, having said all this, I think that this sense code might be mishandled
right now. Regardless of how we got here, I think that this should pause the
I/O and reset the parameters we've set in the drive, so that its state is
back to what the driver expects (though I think only the write through cache
and similar settings might matter).

-- 
You are receiving this mail because:
You are the assignee for the bug.


