From owner-freebsd-stable@FreeBSD.ORG  Wed Dec 14 18:22:57 2011
Date: Wed, 14 Dec 2011 10:22:52 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Andrey Zonov <andrey@zonov.org>
Message-ID: <20111214182252.GA5176@icarus.home.lan>
References: <4EE7BF77.5000504@zonov.org>
	<20111213221501.GA85563@icarus.home.lan>
	<4EE8E6E3.7050202@zonov.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4EE8E6E3.7050202@zonov.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-stable@freebsd.org
Subject: Re: directory listing hangs in "ufs" state

On Wed, Dec 14, 2011 at 10:11:47PM +0400, Andrey Zonov wrote:
> Hi Jeremy,
> 
> This is not hardware problem, I've already checked that. I also ran
> fsck today and got no errors.
> 
> After some more exploration of how mongodb works, I found that then
> listing hangs, one of mongodb thread is in "biowr" state for a long
> time. It periodically calls msync(MS_SYNC) accordingly to ktrace
> out.
> 
> If I'll remove msync() calls from mongodb, how often data will be
> sync by OS?
> 
> -- 
> Andrey Zonov
> 
> On 14.12.2011 2:15, Jeremy Chadwick wrote:
> >On Wed, Dec 14, 2011 at 01:11:19AM +0400, Andrey Zonov wrote:
> >>
> >>Have you any ideas what is going on? or how to catch the problem?
> >
> >Assuming this isn't a file on the root filesystem, try booting the
> >machine in single-user mode and using "fsck -f" on the filesystem in
> >question.
> >
> >Can you verify there's no problems with the disk this file lives on as
> >well (smartctl -a /dev/disk)?  I'm doubting this is the problem, but
> >thought I'd mention it.

I have no real answer, I'm sorry.  The msync(2) man page indicates the
call is effectively deprecated (see BUGS).  It looks like it is
effectively an mmap version of fsync(2).

I'm extremely confused by this problem.  What you're describing above is
that the process is "stuck in biowr state for a long time", but what you
stated originally was that the process was "stuck in ufs state for a
few minutes":

> I've got STABLE-8 (r221983) with mongodb-1.8.1 installed on it.  A
> couple days ago I observed that listing of mongodb directory stuck in
> a few minutes in "ufs" state.

Can we narrow down what we're talking about here?  Does the process
actually deadlock?  Or are you concerned about performance implications?

I know nothing about this "mongodb" software, but the reason it's
calling msync() is that it wants to ensure the data it changed in an
mmap()-mapped page is reflected (fully written) on the disk.  This
behaviour is fairly common within database software, but "how often"
the software chooses to do this is entirely a design/implementation
choice by the authors.
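
For illustration only, here is a minimal sketch of that pattern in C.
It is not taken from mongodb; the helper name write_and_sync() and its
arguments are made up for this example.  It modifies a file through a
shared mapping and then uses msync(MS_SYNC) to block until the change
has been written out.

```c
/* Expose POSIX declarations even under strict -std= compiler modes. */
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not mongodb code): copy `len` bytes into a file
 * through a shared mapping, then block until they have been written out.
 * Returns 0 on success, -1 on failure.
 */
int
write_and_sync(const char *path, const char *data, size_t len)
{
	char *map;
	int fd, rc;

	assert(path != NULL && data != NULL && len > 0);

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd == -1)
		return (-1);
	/* The file must be at least as large as the region being mapped. */
	if (ftruncate(fd, (off_t)len) == -1) {
		close(fd);
		return (-1);
	}
	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return (-1);
	}
	/* This only dirties pages in memory; it does not wait for the disk. */
	memcpy(map, data, len);
	/* MS_SYNC makes the caller wait until the dirty pages are written. */
	rc = msync(map, len, MS_SYNC);
	munmap(map, len);
	close(fd);
	return (rc);
}
```

A caller would use it as write_and_sync("/tmp/example.dat", "hello", 5).
The time spent inside msync() is where a thread would plausibly show up
as waiting on disk writes.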

Meaning: if mongodb is either 1) continually calling msync(), or 2)
waiting for too long a period of time before calling msync(),
performance within the process will suffer.  #1 could result in overall
bad performance, while #2 could result in a process that's spending a
lot of time doing I/O (flushing to disk) and therefore appears
"deadlocked" when in fact the kernel/subsystems are doing exactly what
they were told to do.
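
The trade-off between #1 and #2 can be explored with a rough sketch
like the one below.  It is an assumption-laden test harness, not
mongodb's logic: the helper dirty_and_sync(), the struct sync_stats and
the "interval" parameter are all hypothetical names.  It dirties a
number of pages, calls msync(MS_SYNC) after every "interval" pages, and
records how many calls were made and how long the slowest one blocked.

```c
/* Expose POSIX declarations even under strict -std= compiler modes. */
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

struct sync_stats {
	int	calls;		/* msync() calls issued */
	double	worst_sec;	/* longest single msync() call, in seconds */
};

/*
 * Hypothetical helper: dirty `npages` pages of a shared mapping and call
 * msync(MS_SYNC) after every `interval` pages, plus once at the end if
 * pages are still pending.  Returns 0 on success, -1 on failure.
 */
int
dirty_and_sync(const char *path, size_t npages, size_t interval,
    struct sync_stats *st)
{
	struct timespec t0, t1;
	size_t i, len, pagesz;
	double elapsed;
	char *map;
	int fd;

	assert(path != NULL && st != NULL && npages > 0 && interval > 0);
	pagesz = (size_t)sysconf(_SC_PAGESIZE);
	len = npages * pagesz;
	st->calls = 0;
	st->worst_sec = 0.0;

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd == -1)
		return (-1);
	if (ftruncate(fd, (off_t)len) == -1) {
		close(fd);
		return (-1);
	}
	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping remains valid after close() */
	if (map == MAP_FAILED)
		return (-1);

	for (i = 0; i < npages; i++) {
		/* Dirty one page in memory. */
		memset(map + i * pagesz, (int)(i & 0xff), pagesz);
		if ((i + 1) % interval != 0 && i + 1 != npages)
			continue;
		/* Measure how long the caller is blocked in msync(). */
		clock_gettime(CLOCK_MONOTONIC, &t0);
		if (msync(map, len, MS_SYNC) == -1) {
			munmap(map, len);
			return (-1);
		}
		clock_gettime(CLOCK_MONOTONIC, &t1);
		elapsed = (double)(t1.tv_sec - t0.tv_sec) +
		    (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
		st->calls++;
		if (elapsed > st->worst_sec)
			st->worst_sec = elapsed;
	}
	munmap(map, len);
	return (0);
}
```

Running it with a small interval (many short calls) and then with a
large one (few calls, each with more dirty data to flush) and comparing
calls and worst_sec gives a feel for the two cases.  The actual numbers
depend entirely on the disk, the filesystem and the amount of data, so
none are claimed here.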

Removing the msync() call could result in inconsistent data (possibly
non-recoverable) if the mongodb software crashes or if some other piece
(thread or child?  Not sure) opens a new fd on the mmap()'d file and
expects to see the changed data.

This is about all I know.  I would love to be able to tell you "consider
a different database" but that seems like an excuse rather than an
actual solution.  I guess if all you're seeing is the process "stall"
for long periods of time, but recover normally, then I would open up a
support ticket with the mongodb folks to discuss performance.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |