From: Andriy Gapon
Date: Mon, 28 Mar 2016 20:19:14 +0300
Subject: Re: Process stuck in "vnread"
To: Konstantin Belousov, Maxim Sobolev
Cc: freebsd-fs@FreeBSD.org, Kirk McKusick, stable@FreeBSD.org, kib@FreeBSD.org
Message-ID: <56F96792.2010800@FreeBSD.org>
In-Reply-To: <20160328162310.GJ1741__41334.1269981631$1459182219$gmane$org@kib.kiev.ua>
List-Id: Production branch of FreeBSD source code

On 28/03/2016 19:23, Konstantin Belousov wrote:
> On Mon, Mar 28, 2016 at 08:52:03AM -0700, Maxim Sobolev wrote:
>> Done some head scratching, it looks like it's got page fault in the
>> copyin() (cp(1) AFAIK mmaps source file). There might be some interlock
>> issue between competing write to the same ZFS, the md0 device is locked
>> forever waiting for the write operation to complete at the very same time.
>> I am curious as to whether we are allowed to sleep in the dmu_write_uio_dbuf(),
>> AFAIK dmu is ZFS's transaction layer, so maybe copyin() should be done
>> earlier to avoid possible page fault in there?

Maxim, is this a copy from UFS to ZFS?
It looks that way, because the copyin() fault goes to
vnode_pager_generic_getpages() -> bwait()...

> No idea about ZFS, but if the issue is due to copyin(9) recursing into
> VM and then VFS while owning file system locks, it is well-known and
> long-standing issue. I sometimes call it 'ups deadlock', for some
> reasons, see tools/test/upsdl/ for the distilled test case.
>
> It is handled for UFS and NFS, read the long comment starting with 'The
> vn_io_fault() is a wrapper' in sys/kern/vfs_vnops.c, which describes the
> deadlock in details and explains the mechanism which is used to prevent
> it. Filesystems must opt-in into it by specifying MNTK_NO_IOPF flag,
> and then being ready to get an array of pages for io instead of the buffer
> KVA.

I don't have any idea why the thread would be stuck in bwait(), or what locks
and threads are involved here.
But, as Kostik said, there is a general problem, and I have a patch for ZFS:
https://reviews.freebsd.org/D2790

-- 
Andriy Gapon
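
P.S. For anyone trying to reproduce this, here is a minimal userland sketch of the access pattern Maxim describes: map the source file and write(2) to the destination straight from the mapping. It assumes cp(1) really does take the mmap path for the file in question. The function name copy_via_mmap() is made up for illustration, and this is not cp(1)'s actual code. When the mapped source pages are not resident, the copyin() inside write(2) has to fault them in from the source vnode, which is where the recursion into VM and VFS discussed above can start.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from cp(1): copy src to dst by
 * mmap()ing src and write()ing from the mapping.
 * Returns 0 on success, -1 on any error.
 */
int
copy_via_mmap(const char *src, const char *dst)
{
	struct stat st;
	char *p;
	size_t len, off;
	ssize_t n;
	int in, out, rc;

	rc = -1;
	in = open(src, O_RDONLY);
	if (in < 0)
		return (-1);
	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (out < 0) {
		close(in);
		return (-1);
	}
	if (fstat(in, &st) != 0)
		goto done;
	if (st.st_size == 0) {
		/* Nothing to copy; a zero-length mmap() would fail. */
		rc = 0;
		goto done;
	}
	len = (size_t)st.st_size;
	p = mmap(NULL, len, PROT_READ, MAP_SHARED, in, 0);
	if (p == MAP_FAILED)
		goto done;
	/*
	 * The user buffer passed to write(2) is the file mapping itself,
	 * so the kernel's copy from user space may page-fault on src.
	 */
	for (off = 0; off < len; off += (size_t)n) {
		n = write(out, p + off, len - off);
		if (n <= 0)
			break;
	}
	if (off == len)
		rc = 0;
	munmap(p, len);
done:
	close(out);
	close(in);
	return (rc);
}
```

Calling copy_via_mmap("/ufs/some/file", "/zfs/some/file") with a source on UFS (or on an md-backed filesystem) and a destination on ZFS should exercise the same path. Both paths here are placeholders, and whether it actually hangs will depend on memory pressure and on what else is writing to the pool at the time.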