Date: Thu, 14 May 2020 11:26:43 -0400 From: Mark Johnston <markj@freebsd.org> To: Andriy Gapon <avg@freebsd.org> Cc: Bryan Drewery <bdrewery@freebsd.org>, freebsd-current@freebsd.org Subject: Re: zfs deadlock on r360452 relating to busy vm page Message-ID: <20200514152643.GC4268@raichu> In-Reply-To: <b1cc2021-274d-f7c9-665d-c6c71d2f40c3@FreeBSD.org> References: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org> <e10d3c61-0b47-55f2-0fe1-9fafaafe7799@FreeBSD.org> <0e9cceba-84d0-ec4f-8784-36703452201d@FreeBSD.org> <889cb93b-85c7-3ec4-4ccf-5fb56ec38fa5@FreeBSD.org> <20200513144257.GA24239@raichu> <b1cc2021-274d-f7c9-665d-c6c71d2f40c3@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, May 14, 2020 at 02:16:15PM +0300, Andriy Gapon wrote: > On 13/05/2020 17:42, Mark Johnston wrote: > > On Wed, May 13, 2020 at 10:45:24AM +0300, Andriy Gapon wrote: > >> On 13/05/2020 10:35, Andriy Gapon wrote: > >>> In r329363 I re-worked zfs_getpages and introduced range locking to it. > >>> At the time I believed that it was safe and maybe it was, please see the commit > >>> message. > >>> There, indeed, have been many performance / concurrency improvements to the VM > >>> system and r358443 is one of them. > >> > >> Thinking more about it, it could be r352176. > >> I think that vm_page_grab_valid (and later vm_page_grab_valid_unlocked) are not > >> equivalent to the code that they replaced. > >> The original code would check valid field before any locking and it would > >> attempt any locking / busing if a page is invalid. The object was required to > >> be locked though. > >> The new code tries to busy the page in any case. > >> > >>> I am not sure how to resolve the problem best. Maybe someone who knows the > >>> latest VM code better than me can comment on my assumptions stated in the commit > >>> message. > > > > The general trend has been to use the page busy lock as the single point > > of synchronization for per-page state. As you noted, updates to the > > valid bits were previously interlocked by the object lock, but this is > > coarse-grained and hurts concurrency. I think you are right that the > > range locking in getpages was ok before the recent change, but it seems > > preferable to try and address this in ZFS. > > > >>> In illumos (and, I think, in OpenZFS/ZoL) they don't have the range locking in > >>> this corner of the code because of a similar deadlock a long time ago. > > > > Do they just not implement readahead? > > I think so, but not 100% sure. > I recall seeing a comment in illumos code that they do not care about read-ahead > because there is ZFS prefetch and the data will be cached in ARC. That makes > sense from the I/O point of view, but it does not help with page faults. > > > Can you explain exactly what the > > range lock accomplishes here - is it entirely to ensure that znode block > > size remains stable? > > As far as I can recall, this is the reason indeed. It seems to me that zfs_getpages() could use a non-blocking rangelock_enter() operation to avoid the deadlock. The ZFS rangelock implementation doesn't have one, but it is easy to add. I'm not able to trigger the deadlock with this patch: https://reviews.freebsd.org/D24839
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20200514152643.GC4268>