From: Andrey Zonov
Date: Wed, 14 Dec 2011 23:47:10 +0400
To: Jeremy Chadwick
Cc: freebsd-stable@freebsd.org
Subject: Re: directory listing hangs in "ufs" state
Message-ID: <4EE8FD3E.8030902@zonov.org>
In-Reply-To: <20111214182252.GA5176@icarus.home.lan>
References: <4EE7BF77.5000504@zonov.org> <20111213221501.GA85563@icarus.home.lan> <4EE8E6E3.7050202@zonov.org> <20111214182252.GA5176@icarus.home.lan>
List-Id: Production branch of FreeBSD source code

On 14.12.2011 22:22, Jeremy Chadwick wrote:
> On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:
>> Hi Jeremy,
>>
>> This is not hardware problem, I've already checked that.
>> I also ran fsck today and got no errors.
>>
>> After some more exploration of how mongodb works, I found that then
>> listing hangs, one of mongodb thread is in "biowr" state for a long
>> time. It periodically calls msync(MS_SYNC) accordingly to ktrace
>> out.
>>
>> If I'll remove msync() calls from mongodb, how often data will be
>> sync by OS?
>>
>> --
>> Andrey Zonov
>>
>> On 14.12.2011 2:15, Jeremy Chadwick wrote:
>>> On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:
>>>>
>>>> Have you any ideas what is going on? or how to catch the problem?
>>>
>>> Assuming this isn't a file on the root filesystem, try booting the
>>> machine in single-user mode and using "fsck -f" on the filesystem in
>>> question.
>>>
>>> Can you verify there's no problems with the disk this file lives on as
>>> well (smartctl -a /dev/disk)? I'm doubting this is the problem, but
>>> thought I'd mention it.
>
> I have no real answer, I'm sorry. msync(2) indicates it's effectively
> deprecated (see BUGS). It looks like this is effectively a mmap-version
> of fsync(2).

I replaced msync(2) with fsync(2). Unfortunately, it is not obvious
from the man pages that I can do this. Anyway, thanks.

>
> I'm extremely confused by this problem. What you're describing above is
> that the process is "stuck in biowr state for a long time", but what you
> stated originally was that the process was "stuck in ufs state for a
> few minutes":

A listing of the directory with the mongodb files by ls(1) gets stuck
in the "ufs" state while one of mongodb's threads is in the "biowr"
state. It looks like the system holds a global lock on the file that is
being msync(2)-ed, so the lstat(2) call cannot return immediately.

>
>> I've got STABLE-8 (r221983) with mongodb-1.8.1 installed on it. A
>> couple days ago I observed that listing of mongodb directory stuck in
>> a few minutes in "ufs" state.
>
> Can we narrow down what we're talking about here? Does the process
> actually deadlock? Or are you concerned about performance implications?
>
> I know nothing about this "mongodb" software, but the reason it's
> calling msync() is because it wants to try and ensure that the data it
> changed in an mmap()-mapped page to be reflected (fully written) on the
> disk. This behaviour is fairly common within database software, but
> "how often" the software chooses to do this is entirely a design
> implementation choice by the authors.
>
> Meaning: if mongodb is either 1) continually calling msync(), or 2)
> waiting for too long a period of time before calling msync(),
> performance within the process will suffer. #1 could result in overall
> bad performance, while #2 could result in a process that's spending a
> lot of time doing I/O (flushing to disk) and therefore appears
> "deadlocked" when in fact the kernel/subsystems are doing exactly what
> they were told to do.
>
> Removing the msync() call could result in inconsistent data (possibly
> non-recoverable) if the mongodb software crashes or if some other piece
> (thread or child? Not sure) expects to open a new fd on that file which
> has mmap()'d data.

Yes, I understand this clearly. I have been thinking about some system
tuning instead, but nothing has come to mind.

>
> This is about all I know. I would love to be able to tell you "consider
> a different database" but that seems like an excuse rather than an
> actual solution. I guess if all you're seeing is the process "stall"
> for long periods of time, but recover normally, then I would open up a
> support ticket with the mongodb folks to discuss performance.
>

-- 
Andrey Zonov
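
P.S. To make the msync(2) -> fsync(2) replacement mentioned above
concrete, here is a minimal sketch of the two ways to flush a shared
mapping. It is not mongodb's code: write_and_flush() and its parameters
are hypothetical names made up for illustration. Whether fsync(2) alone
covers pages dirtied through a mapping is an assumption that should be
checked against the target system's man pages.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from mongodb): copy `msg` into a shared
 * file mapping, then flush it either with msync(MS_SYNC) on the mapping
 * or with fsync() on the underlying descriptor.
 * Returns 0 on success, -1 on any failure.
 */
int
write_and_flush(const char *path, const char *msg, int use_fsync)
{
	size_t len = strlen(msg);
	void *map;
	int rc = -1;
	int fd;

	if (len == 0)
		return (-1);

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return (-1);

	/* The file must be at least as large as the region we map. */
	if (ftruncate(fd, (off_t)len) == -1)
		goto out_close;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* Dirty the mapped pages. */
	memcpy(map, msg, len);

	if (use_fsync)
		rc = fsync(fd);			/* flush via the descriptor */
	else
		rc = msync(map, len, MS_SYNC);	/* flush via the mapping */

	if (munmap(map, len) == -1)
		rc = -1;
out_close:
	if (close(fd) == -1)
		rc = -1;
	return (rc);
}
```

Calling write_and_flush(path, "hello", 0) takes the msync(MS_SYNC) path
and write_and_flush(path, "hello", 1) takes the fsync(2) path. Both
calls block until the kernel reports the write-back as done, so neither
is expected to remove a long "biowr" wait by itself.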