From owner-freebsd-stable@FreeBSD.ORG  Wed Jan 20 00:16:43 2010
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A4A3C1065670;
	Wed, 20 Jan 2010 00:16:43 +0000 (UTC)
	(envelope-from ohartman@mail.zedat.fu-berlin.de)
Received: from outpost1.zedat.fu-berlin.de (outpost1.zedat.fu-berlin.de
	[130.133.4.66])
	by mx1.freebsd.org (Postfix) with ESMTP id 4B6308FC1C;
	Wed, 20 Jan 2010 00:16:43 +0000 (UTC)
Received: from inpost2.zedat.fu-berlin.de ([130.133.4.69])
	by outpost1.zedat.fu-berlin.de (Exim 4.69) with esmtp
	(envelope-from <ohartman@mail.zedat.fu-berlin.de>)
	id <1NXOFW-0006mw-4d>; Wed, 20 Jan 2010 01:16:42 +0100
Received: from e178038234.adsl.alicedsl.de ([85.178.38.234]
	helo=thor.walstatt.dyndns.org)
	by inpost2.zedat.fu-berlin.de (Exim 4.69) with esmtpsa
	(envelope-from <ohartman@mail.zedat.fu-berlin.de>)
	id <1NXOFW-0001qD-06>; Wed, 20 Jan 2010 01:16:42 +0100
Message-ID: <4B564B69.6080102@mail.zedat.fu-berlin.de>
Date: Wed, 20 Jan 2010 01:16:41 +0100
From: "O. Hartmann" <ohartman@mail.zedat.fu-berlin.de>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.1.5) Gecko/20091219 Thunderbird/3.0
MIME-Version: 1.0
To: krad <kraduk@googlemail.com>
References: <4B54C100.9080906@mail.zedat.fu-berlin.de>	<4B54C5EE.5070305@pp.dyndns.biz>
	<d36406631001190109x154ff1c1lb2354d5a07212455@mail.gmail.com>
In-Reply-To: <d36406631001190109x154ff1c1lb2354d5a07212455@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: 85.178.38.234
Cc: =?UTF-8?B?TW9yZ2FuIFdlc3N0csO2bQ==?= <freebsd-questions@pp.dyndns.biz>,
	FreeBSD Stable <freebsd-stable@freebsd.org>, freebsd-questions@freebsd.org
Subject: Re: immense delayed write to file system (ZFS and UFS2), performance
 issues
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 20 Jan 2010 00:16:43 -0000

On 01/19/10 10:09, krad wrote:
> 2010/1/18 Morgan Wesström <freebsd-questions@pp.dyndns.biz>
>
>> O. Hartmann wrote:
>>> I have noticed strange behaviour on several FreeBSD 8.0-STABLE/amd64
>>> boxes. All boxes run the most recent STABLE. One box is a UP system, the
>>> two others are SMP boxes: one with a 4-core Q6600, the other a Xeon with
>>> 2x 4 cores (Dell Poweredge III).
>>>
>>> Symptom: All boxes have ZFS and UFS2 filesystems. For about two weeks,
>>> the I/O performance has sometimes dropped massively when doing 'svn
>>> update', 'make world' or even 'make kernel'. No matter how much memory
>>> and how many CPUs the box has, it gets stuck and freezes for several
>>> seconds. On the UP box, this sometimes lasts 10 - 20 seconds.
>>> A very interesting phenomenon is the massively delayed file writing I
>>> notice on ZFS filesystems. When I edit a file in 'vi' in one xterm and
>>> compile that file from a shell in another xterm, it sometimes takes up
>>> to 20 seconds for the file to be updated after it has been written.
>>> It's like having an old, slow NFS connection with long cache delays.
>>> These massively delayed file transactions do not necessarily occur under
>>> heavy load; sometimes they occur in a relaxed situation. They seem to
>>> occur much more often on the UP box than on the SMP boxes, but this
>>> strange phenomenon also occurs on the Dell Poweredge II, which has 16 GB
>>> RAM and, all in all, 16 cores. It occurs on both ZFS and UFS2
>>> filesystems. It is hard to reproduce.
>>>
>>> Is there any known issue?
>>>
>>> Regards,
>>> Oliver
>>
>>
>> The disks involved don't happen to be Western Digital Green Power disks,
>> do they? The Intelli-Park function in these disks is wreaking havoc
>> with I/O in Linux-land at least, causing massive stalls and iowait
>> through the roof during the 25-30 seconds it takes for the heads to
>> unload after parking. I have two of these disks sitting on my desk now
>> collecting dust...
>> /Morgan
>>
>
>
> ZFS is copy-on-write; therefore, to optimize write performance, it delays
> writes for as long as possible, up to a set maximum time. It will then
> flush to the disks. How long this time is depends on how much free RAM you
> have available. Assuming processes are eating up all your RAM, I would
> imagine you are hitting the max limit. I'm not sure exactly what it's set
> to on BSD, but I know the default on OpenSolaris is 30s. I think this
> explains your delayed writes.
>
> Not sure what will cause the lock ups though.
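
To check the maximum flush interval mentioned above on a given box, something like the following could be used. This is only a sketch: it assumes the interval is exposed as the sysctl `vfs.zfs.txg.timeout` (please verify the name on your release), and the helper name `read_txg_timeout` is made up for illustration.

```python
import subprocess


def read_txg_timeout(name="vfs.zfs.txg.timeout"):
    """Return the sysctl value as an int, or None if it cannot be read.

    The sysctl name is an assumption; it may differ between releases.
    """
    try:
        result = subprocess.run(
            ["sysctl", "-n", name],
            capture_output=True,
            text=True,
            check=True,
        )
        return int(result.stdout.strip())
    except (OSError, subprocess.CalledProcessError, ValueError):
        # sysctl binary missing, unknown name, or non-numeric output
        return None


if __name__ == "__main__":
    timeout = read_txg_timeout()
    if timeout is None:
        print("could not read the ZFS txg timeout on this system")
    else:
        print(f"ZFS txg timeout: {timeout} s")
```

If the value matches the length of the observed delays, that would support the explanation above; if not, the cause is probably elsewhere.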

This could end badly when one process writes a file with some arbitrary
content and a subsequent process is supposed to read that file. Even if
the processes run serially, those 'delays' could break the chain! In a
development environment the delay is annoying, but under other
circumstances the consequences could be much worse.
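
A minimal sketch of how such a writer/reader chain could be tested. It assumes a local POSIX filesystem, where a read that starts after the write has returned would normally be served from the cache regardless of when the data reaches the disk; `os.fsync()` additionally forces the data out to the disk. The function names and the file name are made up for illustration.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data and ask the OS to push it to stable storage before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # flush Python's userspace buffer
        os.fsync(f.fileno())   # ask the kernel to flush to disk


def read_back(path):
    """Read the whole file, as the second process in the chain would."""
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "handoff.dat")
        payload = b"some arbitrary stuff\n"
        write_durably(path, payload)
        print("reader sees writer's data:", read_back(path) == payload)
```

If the reader ever sees stale or missing data in a test like this, that would point to a real bug and not merely to delayed flushing.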

I have been seeing this strange behaviour for several weeks now, so I
guess something essential has changed in the code.
On UP boxes the situation is sometimes worse, but on SMP boxes with lots
of RAM (8 and 16 GB, 4 or 8 CPU cores) it is still bad. I have a server
that acts as an 'rsync' backup system, gathering data from satellite
servers from time to time. Since this slowness first occurred, this
4-core, 8 GB RAM box crawls for minutes. Even with X11 disabled, working
on the console is 'bumpy': terminal output slows down, the mouse pointer
jumps, etc. As I wrote, the same happens on an 8-core/16 GB box, but not
as severely.
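
Since the stalls are hard to reproduce, a small probe that times synchronous writes might help to catch them. The following is only a sketch: the function name, the probe file name and the defaults are arbitrary choices, and it measures write-plus-fsync latency only, not the whole I/O path.

```python
import os
import sys
import tempfile
import time


def probe_write_latency(directory, rounds=20, size=4096):
    """Time `rounds` small write+fsync cycles in `directory`.

    Returns the list of per-round latencies in seconds.
    """
    path = os.path.join(directory, "latency_probe.tmp")
    payload = b"x" * size
    latencies = []
    try:
        for _ in range(rounds):
            start = time.monotonic()
            with open(path, "wb") as f:
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())
            latencies.append(time.monotonic() - start)
    finally:
        # Clean up the probe file even if a write failed.
        if os.path.exists(path):
            os.remove(path)
    return latencies


if __name__ == "__main__":
    # Probe the directory given on the command line, else the temp directory.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    samples = probe_write_latency(target)
    print(f"max {max(samples):.3f} s, mean {sum(samples) / len(samples):.3f} s")
```

Running it in a loop against a ZFS and a UFS2 directory while doing 'svn update' or 'make kernel' should show whether the maximum latency jumps into the range of seconds.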