Date: Tue, 21 Mar 2006 14:56:17 -0800 (PST)
From: Matthew Dillon
Message-Id: <200603212256.k2LMuHT0006842@apollo.backplane.com>
To: Mikhail Teterin
References: <200603211607.30372.mi+mx@aldan.algebra.com> <200603211747.36251.mi+mx@aldan.algebra.com>
Cc: alc@freebsd.org, stable@freebsd.org
Subject: Re: more weird bugs with mmap-ing via NFS

:When the client is in this state it remains quite usable except for the
:following:
:
:    1) Trying to start `systat 1 -vm' stalls ALL access to local disks,
:       apparently -- no new programs can start, and the running ones
:       can not access any data either; attempts to Ctrl-C the starting
:       systat succeed only after several minutes.
:
:    2) The writing process is stuck unkillable in the following state:
:
:        CPU PRI NI     VSZ    RSS MWCHAN STAT  TT       TIME
:         27  -4  0 1351368 137764 nfs    DL    p4    1:05,52
:
:       Sending it any signal has no effect. (Large sizes are explained
:       by it mmap-ing its large input and output.)
:
:    3) Forceful umount of the share, that the program is writing to,
:       paralyzes the system for several minutes -- unlike in 1), not
:       even the mouse is moving. It would seem, the process is dumping
:       core, but it is not -- when the system unfreezes, the only
:       message from the kernel is:
:
:        vm_fault: pager read error, pid XXXX (mzip)
:
:Again, this is on 6.1/i386 from today, which we are about to release into the
:cruel world.
:
:Yours,
:
:    -mi

    There are a number of problems with using a block size of 65536.
    First of all, I think you can only safely do it if you use a TCP
    mount, and only if the TCP buffer size is large enough to hold an
    entire packet. For UDP mounts, 65536 is too large: the UDP length
    field is 16 bits, so a UDP packet can not exceed 65535 bytes
    including its header. For that matter, the *IP* packet itself can
    not exceed 65535 bytes. So 65536 will not work with a UDP mount.

    The second problem is related to the network driver. The packet MTU
    is 1500, which typically means a limit of around 1460-1480 payload
    bytes per packet. A large UDP packet that is, say, 48KB, will be
    broken down into over 33 IP packet fragments. The network stack
    could very well drop some of these fragments, and losing any one of
    them loses the whole UDP packet, which makes delivery of the overall
    UDP packet unreliable.

    The NFS protocol itself does allow read and write packets to be
    truncated, provided that the read or write operation is either
    bounded by the file EOF or (for a read) the remaining data is all
    zeros. Typically the all-zeros case is only optimized by the NFS
    server when the underlying filesystem block itself is unallocated
    (i.e. a 'hole' in the file). In all other cases the full NFS block
    size is passed between client and server.

    I would stick to an NFS block size of 8K or 16K.
    Frankly, there is no real reason to use a larger block size.

                        -Matt
                        Matthew Dillon
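
    The size and fragmentation arithmetic in the reply can be checked
    with a short sketch. It is only an illustration, and it assumes
    IPv4 with a 20-byte header (no options), an 8-byte UDP header, and
    a 1500-byte MTU. The function names are hypothetical, RPC/NFS
    header overhead is ignored, and the loss model (each fragment
    dropped independently with the same probability) is a
    simplification that is not taken from the message.

```python
import math

IP_HEADER = 20         # assumption: IPv4 header without options
UDP_HEADER = 8         # fixed UDP header size
MAX_IP_PACKET = 65535  # largest value of the 16-bit IP total-length field


def max_udp_payload():
    """Largest UDP payload that fits in one IPv4 packet."""
    return MAX_IP_PACKET - IP_HEADER - UDP_HEADER


def fits_in_udp(payload_bytes):
    """True if a payload of this size can be sent as one UDP datagram."""
    return payload_bytes <= max_udp_payload()


def fragment_count(payload_bytes, mtu=1500):
    """Number of IP fragments needed for one UDP datagram.

    Every fragment except the last must carry a multiple of 8 data
    bytes, so the per-fragment capacity is rounded down accordingly.
    """
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil((payload_bytes + UDP_HEADER) / per_fragment)


def delivery_probability(fragments, fragment_loss):
    """Chance that every fragment arrives (independent-loss assumption)."""
    return (1.0 - fragment_loss) ** fragments


if __name__ == "__main__":
    print("max UDP payload:", max_udp_payload())
    print("65536 fits in UDP:", fits_in_udp(65536))
    for size in (8 * 1024, 16 * 1024, 48 * 1024):
        n = fragment_count(size)
        p = delivery_probability(n, 0.01)  # hypothetical 1% fragment loss
        print(f"{size} bytes: {n} fragments, delivery chance {p:.2f}")
```

    Under these assumptions a 48KB datagram needs several times as many
    fragments as an 8K or 16K one, and because one lost fragment loses
    the whole datagram, its chance of arriving intact drops accordingly.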