Date:      Fri, 22 Mar 2013 12:17:45 -0600
From:      Josh Beard <josh@signalboxes.net>
To:        Steven Hartland <killing@multiplay.co.uk>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS + NFS poor performance after restarting from 100 day uptime
Message-ID:  <CAHDrHSvXCu+v+ps3ctg=T0qtHjKGkXxvnn_EaNrt_eenkJ9dbQ@mail.gmail.com>
In-Reply-To: <D763F64A24B54755BBF716E91D646F6A@multiplay.co.uk>
References:  <CAHDrHSsCunt9eQKjMy9epPBYTmaGs5HNgKV2+UKuW0RQZPpw+A@mail.gmail.com> <D763F64A24B54755BBF716E91D646F6A@multiplay.co.uk>

On Thu, Mar 21, 2013 at 10:14 AM, Steven Hartland
<killing@multiplay.co.uk> wrote:

>
> ----- Original Message ----- From: "Josh Beard" <josh@signalboxes.net>
> To: <freebsd-fs@freebsd.org>
> Sent: Thursday, March 21, 2013 3:53 PM
> Subject: ZFS + NFS poor performance after restarting from 100 day uptime
>
>
>
>> Hello,
>>
>> I have a system with 12 disks spread between 2 raidz1 vdevs.  I'm using
>> the native ("new") NFS server to export a pool on this.  This has worked
>> very well all along, but since a reboot it has performed horribly -
>> unusably under load.
>>
>> The system was running 9.1-RC3 and I upgraded it to 9.1-RELEASE-p1
>> (GENERIC kernel) after ~110 days of uptime (with zero performance
>> issues).  After rebooting from the upgrade, I'm finding the disks seem
>> constantly slammed: gstat reports 90-100% busy most of the day with only
>> ~100-130 ops/s.
>>
>> I didn't change any settings in /etc/sysctl.conf or /boot/loader.conf.
>> No ZFS tuning, etc.  I've looked at the commits between 9.1-RC3 and
>> 9.1-RELEASE-p1 and I can't see any reason why simply upgrading would
>> cause this.
>>
> ...
>
>> A snip of gstat:
>> dT: 1.002s  w: 1.000s
>> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>    0      0      0      0    0.0      0      0    0.0    0.0| cd0
>>    0      1      0      0    0.0      1     32    0.2    0.0| da0
>>    0      0      0      0    0.0      0      0    0.0    0.0| da0p1
>>    0      1      0      0    0.0      1     32    0.2    0.0| da0p2
>>    0      0      0      0    0.0      0      0    0.0    0.0| da0p3
>>    4    160    126   1319   31.3     34    100    0.1  100.3| da1
>>    4    146    110   1289   33.6     36     98    0.1   97.8| da2
>>    4    142    107   1370   36.1     35    101    0.2  101.9| da3
>>    4    121     95   1360   35.6     26     19    0.1   95.9| da4
>>    4    151    117   1409   34.0     34    102    0.1  100.1| da5
>>    4    141    109   1366   35.9     32    101    0.1   97.9| da6
>>    4    136    118   1207   24.6     18     13    0.1   87.0| da7
>>    4    118    102   1278   32.2     16     12    0.1   89.8| da8
>>    4    138    116   1240   33.4     22     55    0.1  100.0| da9
>>    4    133    117   1269   27.8     16     13    0.1   86.5| da10
>>    4    121    102   1302   53.1     19     51    0.1  100.0| da11
>>    4    120     99   1242   40.7     21     51    0.1   99.7| da12
>>
>
> Your ops/s are maxing out your disks. You say "only", but ~190 ops/s is
> about what HDs peak at, so whatever your machine is doing is saturating
> the available IO on your disks.
>
> If you boot back to your previous kernel does the problem go away?
>
> If so, you could look at the changes between the two kernel revisions
> for possible causes and, if needed, do a binary chop with kernel builds
> to narrow down the cause.
>
>    Regards
>    Steve
>
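
A rough sanity check on that per-spindle ceiling, using typical 7200 rpm
SATA figures (assumptions for illustration, not measurements from this
system):

    # half a revolution at 7200 rpm:  60s / 7200 / 2  ~= 4.2 ms
    # average seek (typical SATA):                    ~= 8.5 ms
    # => random-IO ceiling:  1000 / (4.2 + 8.5)       ~= 79 ops/s,
    #    rising toward 150-200 ops/s with command queueing and short seeks

So the ~120-160 ops/s per disk in the gstat snip above is plausibly right
at the drives' random-IO limit.
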
Steve,

Thanks for your response.  I booted with the old kernel (9.1-RC3) and the
problem disappeared!  We're getting 3x the performance with the previous
kernel compared to the 9.1-RELEASE-p1 kernel:

Output from gstat:

    1    362      0      0    0.0    345  20894    9.4   52.9| da1
    1    365      0      0    0.0    348  20893    9.4   54.1| da2
    1    367      0      0    0.0    350  20920    9.3   52.6| da3
    1    362      0      0    0.0    345  21275    9.5   54.1| da4
    1    363      0      0    0.0    346  21250    9.6   54.2| da5
    1    359      0      0    0.0    342  21352    9.5   53.8| da6
    1    347      0      0    0.0    330  20486    9.4   52.3| da7
    1    353      0      0    0.0    336  20689    9.6   52.9| da8
    1    355      0      0    0.0    338  20669    9.5   53.0| da9
    1    357      0      0    0.0    340  20770    9.5   52.5| da10
    1    351      0      0    0.0    334  20641    9.4   53.1| da11
    1    362      0      0    0.0    345  21155    9.6   54.1| da12

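Rough aggregate numbers from the two gstat samples (per-disk kBps times
the 12 pool disks; the two snapshots may capture different workload
mixes, so treat this as indicative only):

    # 9.1-RELEASE-p1:  ~1,300 kBps/disk x 12  ~=  16 MB/s at ~100% busy
    #                  (~126 r/s at ~10 kB per read)
    # 9.1-RC3:        ~21,000 kBps/disk x 12  ~= 250 MB/s at  ~53% busy
    #                  (~345 w/s at ~60 kB per write)
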
The kernels were compiled identically, using GENERIC with no modifications.
I'm no expert, but nothing in the svn commits I've looked at between the
two revisions looks like it would have any impact on this.  Any clues?
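
For reference, a minimal sketch of one iteration of the binary chop Steve
suggests, assuming an svn checkout of the 9.1 source tree in /usr/src; the
revision number is a placeholder, not an actual suspect commit:

    cd /usr/src
    svn update -r 245000                  # placeholder midpoint revision
    make -j4 buildkernel KERNCONF=GENERIC
    make installkernel KERNCONF=GENERIC   # old kernel kept as /boot/kernel.old
    shutdown -r now
    # after reboot, re-run the NFS load and watch gstat; if this kernel is
    # bad, fall back for one boot with: nextboot -k kernel.old

Repeating this while halving the revision range each time should pin the
regression down to a single commit in a handful of builds.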


