Date:      Tue, 14 Jun 2005 22:20:03 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Glenn Dawson <glenn@antimatter.net>
Cc:        freebsd-performance@freebsd.org
Subject:   Re: vn(4) performance on 4.11 versus md(4) on 5.4
Message-ID:  <20050614213135.K38258@delplex.bde.org>
In-Reply-To: <6.1.0.6.2.20050604230636.01bf68c0@cobalt.antimatter.net>
References:  <6.1.0.6.2.20050604230636.01bf68c0@cobalt.antimatter.net>

On Sat, 4 Jun 2005, Glenn Dawson wrote:

> I have a number of systems running 4.11 that have file backed virtual disks, 
> each of which contains a jail.  I need to start using 5.4 for new servers. 
> The catch is, file backed virtual disks using md(4) seem to be much slower 
> than similar virtual disks on 4.11 using vn(4).  vn(4) on 4.11 is about 2.24 
> times faster than the equivalent setup using md(4) on 5.4.
>
> I've posted the results of some tests that I ran at 
> http://www.antimatter.net/md-versus-vn.txt
>
> Is this decrease in performance known?  Is there something I can do in order 
> to come close to the performance that 4.11 has?  I've tried changing some of 
> the parameters of the filesystem on the virtual disk, but the performance 
> didn't change.

Writes by md are now synchronous.  Try turning this off using
"mdconfig -o async ...", though async is probably too dangerous to use
in production -- the sync writes are a hack to work around hangs, and
my system hung almost instantly while testing async.
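
For illustration, setting up a file-backed md with sync writes turned
off looks something like this (the file name, size, unit number and
mount point here are only examples):

	# create a backing file and attach it with async writes
	dd if=/dev/zero of=/var/jail0.img bs=1m count=1024
	mdconfig -a -t vnode -o async -f /var/jail0.img -u 3
	newfs /dev/md3
	mount -o noatime /dev/md3 /jail0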

For copying a cached copy of /usr/src/sys/ (~100MB) on an old de-GEOMed
version of -current, with all filesystems mounted -async -noatime, I
got the following times:

# ffs1 fs on ad2s2d
         6.21 real         0.52 user         3.39 sys
# ffs2 fs on md2 (default) on file zz on previous fs
        63.83 real         0.56 user         3.34 sys
# ffs2 fs on md3 (-o async) on same file (after mdconfig -u 2)
        16.10 real         0.50 user         3.40 sys

Syncing of the last fs deadlocked the file systems on md3 and ad2s2d :-(
but not the others.
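
(For completeness, the stacked setup above was built roughly like this;
the backing file size and newfs flags are guesses, not from my notes:

	mount -o async,noatime /dev/ad2s2d /mnt
	truncate -s 512m /mnt/zz              # backing file zz on the ffs1 fs
	mdconfig -a -t vnode -f /mnt/zz -u 2  # md2, default sync writes
	newfs -O 2 /dev/md2                   # ffs2 fs on md2
	mount -o async,noatime /dev/md2 /mnt2

and similarly for md3 with "-o async", after detaching md2 with
"mdconfig -d -u 2".)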

For dd'ing /dev/zero to a large file, the sync writes gave a loss of
performance of almost exactly your factor of 2.24 relative to the
non-md fs: the raw disk speed is about 55MB/sec; writing to the native
ffs gave 54MB/sec by mostly writing with a physical block size of 64K,
while writing via md2 gave 25MB/sec by always writing with a physical
block size of 16K.  The size of 64K results from clustering, and the
size of 16K results from sync writes breaking clustering (md always
writes the fs block size, which is 16K in my tests, and since the
writes are sync they must be done individually, so they cannot be
clustered).
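
(The dd test was essentially just the following; the size and block
size shown are illustrative, and the rate is taken from dd's final
status line:

	dd if=/dev/zero of=/mnt2/zero bs=1m count=1024

)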

From mdconfig(1):

%      -o [no]option
%              Set or reset options.
% 
%              [no]async
%                      For vnode backed devices: avoid IO_SYNC for increased
%                      performance but at the risk of deadlocking the entire
%                      kernel.
%              ...
%              [no]cluster
%                      Enable clustering on this disk.

A nearby bug in md is that "-o cluster" has always been silently ignored.
I think we decided that it is the user's responsibility to mount md-backed
file systems (and other file systems on non-physical or memory-like
devices) with -o noclusterw -o noclusterr to prevent wasteful clustering.
This is easy to forget, however.  vn used to turn off clustering
unconditionally to avoid some deadlock problems, but this was removed long
before 4.11, when the deadlock problems were supposed to be fixed, so
turning off clustering was supposed to be only a small optimization.  Try
turning it off to see if it reduces deadlocks.
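
E.g. (device and mount point are examples):

	mount -o noclusterr,noclusterw /dev/md3 /jail0

or put noclusterr,noclusterw in the options field of the fstab entry.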

From md.c's cvs history:

% RCS file: /home/ncvs/src/sys/dev/md/md.c,v
% Working file: md.c
% head: 1.124
% ...
% ----------------------------
% revision 1.115
% date: 2004/03/10 20:41:08;  author: phk;  state: Exp;  lines: +5 -3
% Fix a long-standing deadlock issue with vnode backed md(4) devices:
% 
% On vnode backed md(4) devices over a certain, currently undetermined
% size relative to the buffer cache our "lemming-syncer" can provoke
% a buffer starvation which puts the md thread to sleep on wdrain.
% 
% This generally tends to grind the entire system to a stop because the
% event that is supposed to wake up the thread will not happen until a fair
% bit of the piled up I/O requests in the system finish, and since a lot
% of those are on a md(4) vnode backed device which is currently waiting
% on wdrain until a fair amount of the piled up ... you get the picture.
% 
% The cure is to issue all VOP_WRITES on the vnode backing the device
% with IO_SYNC.
% 
% In addition to more closely emulating a real disk device with a
% non-lying write-cache, this makes the writes exempt from rate-limiting
% (there to avoid starving the buffer cache) and consequently prevents
% the deadlock.
% 
% Unfortunately performance takes a hit.
% 
% Add "async" option to give people who know what they are doing the
% old behaviour.
% ----------------------------

Bruce


