From: Mikolaj Golub
To: freebsd-fs@FreeBSD.org
Date: Tue, 19 Jan 2010 09:58:32 +0200
Subject: Re: FreeBSD NFS client/Linux NFS server issue
Organization: TOA Ukraine
Message-ID: <86zl4awmon.fsf@zhuzha.ua1>
In-Reply-To: <86tyuqnz9x.fsf@zhuzha.ua1>
References: <86ocl272mb.fsf@kopusha.onet> <86tyuqnz9x.fsf@zhuzha.ua1>

On Wed, 13 Jan 2010 11:13:14 +0200 Mikolaj Golub wrote:

> On Sun, 10 Jan 2010 11:03:56 +0200 Mikolaj Golub wrote:
>
> So because it was appending to the file, every php write call caused the
> following sequence of rpcs: ACCESS - READ - WRITE - COMMIT. And trying to
> flush the next line of the log it got stuck after the READ call (the next
> should be a WRITE call but the client never made it).
>
> The same thing holds for the other log file written by the other php
> process. The last rpc for this file:
>
> 30990 18:02:05.050063 172.30.10.54 172.30.10.83 NFS V3 READ Call (Reply In 31068), FH:0x532fa29d Offset:131072 Len:2686
> 31068 18:02:05.062801 172.30.10.83 172.30.10.54 NFS V3 READ Reply (Call In 30990) Len:2685
>
> A bit later there were several successful COMMIT calls (when php processes
> were closing other files, I think). Other NFS activity was observed too --
> our nagios checks and other applications, which were just looking for the
> presence and status of certain files, were running successfully, and in
> tcpdump there are successful readdir/access/lookup/fstat calls. The df
> utility did not hang then either.
>
> Later, when our engineer tried to access the mounted folder with mc, the
> process locked acquiring the nfs vn_lock held by the php script
> (td=0xc6bf4690):

Analyzing the logs of our php scripts, we have found cases where a process
(or two simultaneously) got stuck writing to NFS and was later "unfrozen" when
another, newly started php process wrote to this NFS share (to some other log
file).

We have a tcpdump for such a case, and it looks like the following:

1) ACCESS - READ - WRITE - COMMIT sequences while the php process is writing
to the log file.

2) Then at some moment this stops after a READ rpc call and a successful
reply.

3) After this, successful readdir/access/lookup/fstat calls are observed from
our other utilities, which just check for the presence of some files.

4) A new php process starts and writes to some other log file (successful
ACCESS - READ - WRITE - COMMIT sequences). After this, writing to the first
file continues too (starting from the WRITE rpc, so there are no retransmits).

As a workaround we installed cron scripts that just write to some file every
2 minutes. We have been running this for 3 days and there have been no
incidents since then, but we will only be able to say whether it has really
helped after running it for a week or more. (Also, we are upgrading one of our
servers, the one where the problem has been observed most frequently, to 7.2.)

We have many FreeBSD 7.1 hosts with NFS mounts, but the problem has been
observed on only 3 of them, and currently we don't know a way to reproduce it.

--
Mikolaj Golub
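
P.S. The pattern in steps 1) and 2) above can be looked for mechanically in a text dump of a capture. What follows is only a rough sketch for illustration, not the tooling behind the analysis in this message. It assumes summary lines shaped like the two READ lines quoted earlier (frame number, time, ..., "V3 <OP> Call", "FH:0x..."); the function name find_stalled_appends and the regular expression are made up for this example. It reports file handles that were written to at least once and whose last call in the dump is a READ, i.e. the expected WRITE never followed.

```python
import re

# Assumed line shape, modelled on the READ call line quoted earlier:
#   <frame> <time> ... V3 <OP> Call (...), FH:0x<handle> ...
# Reply lines and anything else that does not match are skipped.
CALL_RE = re.compile(
    r"^\s*(?P<frame>\d+)\s+(?P<time>\S+)\s.*?\bV3\s+"
    r"(?P<op>[A-Z]+)\s+Call\b.*?FH:(?P<fh>0x[0-9a-fA-F]+)"
)


def find_stalled_appends(lines):
    """Return {file_handle: (frame, time)} for handles that saw a WRITE
    earlier and whose most recent call in the dump is a READ."""
    last_call = {}         # file handle -> (op, frame, time) of latest call
    wrote_before = set()   # file handles with at least one WRITE call
    for line in lines:
        m = CALL_RE.match(line)
        if m is None:
            continue
        fh, op = m.group("fh"), m.group("op")
        if op == "WRITE":
            wrote_before.add(fh)
        last_call[fh] = (op, int(m.group("frame")), m.group("time"))
    return {
        fh: (frame, time)
        for fh, (op, frame, time) in last_call.items()
        if op == "READ" and fh in wrote_before
    }
```

Calling find_stalled_appends(open("dump.txt")) on such a dump (dump.txt being a placeholder name) gives the candidate handles with the frame number and time of the final READ call. A handle that was simply idle at the end of the capture after a read-only access is not reported, because a previous WRITE is required; a handle whose capture ended mid-sequence would still be a false positive.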