Date:      Tue, 17 May 2016 10:09:10 +0100
From:      Steven Hartland <killing@multiplay.co.uk>
To:        freebsd-stable@freebsd.org
Subject:   Re: ZFS and NVMe, trim caused stalling
Message-ID:  <20a155fd-8695-ca42-6a72-32cb78864a22@multiplay.co.uk>
In-Reply-To: <BD7424F9-2968-410D-8146-27496054BCFA@sarenet.es>
References:  <5E710EA5-C9B0-4521-85F1-3FE87555B0AF@bsdimp.com> <BD7424F9-2968-410D-8146-27496054BCFA@sarenet.es>

On 17/05/2016 08:49, Borja Marcos wrote:
>> On 05 May 2016, at 16:39, Warner Losh <imp@bsdimp.com> wrote:
>>
>>> What do you think? In some cases it’s clear that TRIM can do more harm than good.
>> I think it’s best we not overreact.
> I agree. But with this issue the system is almost unusable for now.
>
>> This particular case is caused by the nvd driver, not the Intel P3500 NVMe drive. You need
>> a solution (3): Fix the driver.
>>
>> Specifically, ZFS is pushing down a boatload of BIO_DELETE requests. In ata/da land, these
>> requests are queued up, then collapsed together as much as makes sense (or is possible).
>> This vastly helps performance (even with the extra sorting that I forced to be in there that I
>> need to fix before 11). The nvd driver needs to do the same thing.
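To illustrate the collapsing described above, here is a minimal Python sketch.
It is not the actual ata/da or nvd driver code; the function name
coalesce_deletes and the (offset, length) representation of a BIO_DELETE
request are assumptions made purely for illustration. The idea is that many
small adjacent delete requests become a few large ones before they reach
the device.

```python
def coalesce_deletes(requests):
    """Merge (offset, length) delete requests that touch or overlap.

    Hypothetical helper: sorts the queued requests by offset and folds
    each one into the previous range when the two are contiguous.
    """
    merged = []
    for offset, length in sorted(requests):
        end = offset + length
        if merged and offset <= merged[-1][1]:
            # Contiguous or overlapping: extend the previous range.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Gap before this request: start a new range.
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


# Five queued deletes, given out of order, collapse into two ranges.
pending = [(0, 4096), (4096, 4096), (16384, 8192), (8192, 4096), (20480, 8192)]
print(coalesce_deletes(pending))
```

With the sample queue above, five requests are reduced to two, which is
the kind of reduction that keeps a burst of deletes from flooding the device.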
> I understand that, but I don’t think it’s good that ZFS depends blindly on a driver feature such
> as that. Of course, it’s great to exploit it.
>
> I have also noticed that ZFS has a good throttling mechanism for write operations. A similar
> mechanism should throttle trim requests so that trim requests don’t clog the whole system.
It already does.
>
>> I’d be extremely hesitant to toss away TRIMs. They are actually quite important for
>> the FTL in the drive’s firmware to properly manage the NAND wear. More free space always
>> reduces write amplification. It tends to go as 1 / freespace, so simply dropping them on
>> the floor should be done with great reluctance.
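As a rough worked example of the "1 / freespace" rule of thumb quoted above:
the sketch below only encodes that proportionality. The function name and
the use of a free-space fraction between 0 and 1 are assumptions for
illustration, and real drives will deviate from this simple model.

```python
def relative_write_amplification(free_fraction):
    """Relative write amplification under the 1 / freespace rule of thumb.

    free_fraction is the share of flash the FTL knows to be free (0 < f <= 1).
    The result is a relative factor, not a measured value for any drive.
    """
    if not 0 < free_fraction <= 1:
        raise ValueError("free_fraction must be in (0, 1]")
    return 1.0 / free_fraction


# Halving the free space the drive knows about doubles the relative factor.
for free in (0.5, 0.25, 0.125):
    print(free, relative_write_amplification(free))
```

Under this model, every dropped TRIM shrinks the free space the firmware can
see, so the cost of discarding them grows as the drive fills.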
> I understand. I was wondering about choosing the lesser between two evils. A 15 minute
> I/O stall (I deleted 2 TB of data, that’s a lot, but not so unrealistic) or setting trims aside
> during the peak activity.
>
> I see that I was wrong on that, as a throttling mechanism would be more than enough probably,
> unless the system is close to running out of space.
>
> I’ve filed a bug report anyway. And copying to -stable.
>
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209571
>
TBH it sounds like you may have badly behaved HW. We've used ZFS + TRIM
for years on large production boxes, and while we've seen slowdowns,
we haven't experienced the total lockups you're describing.

The graphs on your ticket seem to indicate a peak throughput of 250MB/s,
which is extremely slow for standard SSDs, let alone NVMe ones, and when
you add in the fact that you have 10 of them, well, it seems like
something is VERY wrong.

I just did a quick test on our DB box here, creating and then deleting a
2G file as you describe, and I couldn't even spot the delete in the
general noise; it was that quick to process, and that's a 6-disk machine
with P3700s.

     Regards
     Steve




