Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 03 May 2017 16:21:58 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-bugs@FreeBSD.org
Subject:   [Bug 219049] ftruncate() slow on UFS+journaled softupdates
Message-ID:  <bug-219049-8@https.bugs.freebsd.org/bugzilla/>

next in thread | raw e-mail | index | archive | help
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D219049

            Bug ID: 219049
           Summary: ftruncate() slow on UFS+journaled softupdates
           Product: Base System
           Version: 10.3-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: walter@lifeforms.nl

After upgrading to pkg 1.10, I noticed on my package build machine (bare me=
tal
i7+SSD) that extracting packages was about 100x slower than with pkg 1.9, a
'pkg upgrade' took hours instead of minutes, making it unusable.

I noticed that pkg was in 'biowr' state, and 'systat -vmstat' showed the di=
sk
at '100% busy' even though there were maybe 10-50 transactions/sec and less
than 1MB/s throughput. Doing 'dd' in a separate terminal can write with 170=
MB/s
so the physical disk is not saturated.

The truss tool showed that a lot of time was spent in the ftruncate() call,
which was introduced in pkg 1.10
(https://github.com/freebsd/pkg/commit/2085a3e975fc3b1102cbf809f325cb2d3941=
17bc)
hence why the problem appeared only recently.

I created a minimal example which exhibits the problem:

________ example.c ________

#include <fcntl.h>
#include <unistd.h>

int main()
{
        char* outfile =3D "woot2";

        int fd =3D open(outfile, O_WRONLY|O_CREAT|O_EXCL, 0100644);
        ftruncate(fd, 100000);
        close(fd);
}

________ loop.sh ________

#!/bin/sh

/usr/bin/cc -o example example.c || exit 1

/usr/bin/uname -a
/sbin/mount | grep ada

while true;
do
        rm -f ./woot2
        sleep 1
        /usr/bin/truss -D -o truss.out ./example
        grep ftruncate truss.out
done

_________________________

This produces the following output on my machine:

FreeBSD builder.dt.lfms.nl 10.3-RELEASE-p18 FreeBSD 10.3-RELEASE-p18 #0: Tue
Apr 11 10:31:00 UTC 2017=20=20=20=20
root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
/dev/ada0p2 on / (ufs, local, noatime, journaled soft-updates)
0.439747415 ftruncate(3,0x186a0)                 =3D 0 (0x0)
0.431741841 ftruncate(3,0x186a0)                 =3D 0 (0x0)
0.000638086 ftruncate(3,0x186a0)                 =3D 0 (0x0)
0.000648680 ftruncate(3,0x186a0)                 =3D 0 (0x0)
0.477675635 ftruncate(3,0x186a0)                 =3D 0 (0x0)
0.000652645 ftruncate(3,0x186a0)                 =3D 0 (0x0)
0.467652474 ftruncate(3,0x186a0)                 =3D 0 (0x0)
0.000646409 ftruncate(3,0x186a0)                 =3D 0 (0x0)
0.466081847 ftruncate(3,0x186a0)                 =3D 0 (0x0)
0.432613030 ftruncate(3,0x186a0)                 =3D 0 (0x0)
[...]

As can be seen, a significant part of the ftruncate() calls take almost hal=
f a
second to complete. You would expect this operation to be 0.0 seconds almost
always. It explains why pkg, which ftruncates thousands of files, is sudden=
ly
so slow.

I have tried disabling SU-journaling, and also disabling softupdates entire=
ly.
The problem only appears when using UFS with journaled softupdates:

mount                          | delay
-------------------------------|--------
journaled softupdates          | PRESENT
softupdates without journaling | absent
no softupdates                 | absent

I have tried booting in single user mode from various live CDs, letting the
loop script run for a longer time. The problem is most severe on FreeBSD 10=
.3,
is absent on FreeBSD 11.0, and seems back in very diminished form on a
12.0-CURRENT (Thu Apr 20 05:39:11) image. I took the following measurements:

Kernel                     | slow | fast | slow%
---------------------------|------|------|------
10.3-RELEASE-p18 amd64     | 54   | 138  | 28%
11.0-RELEASE-p1 amd64      | 0    | 299  | 0%
12.0-CURRENT r317181 amd64 | 5    | 224  | 2%

I have tried to reproduce this problem on various VMs, but could not get it=
 to
appear. I just have a single bare metal Core i7 SSD machine which has it.

The disk is a:=20

ada0 at ahcich2 bus 0 scbus2 target 0 lun 0
ada0: <OCZ-VERTEX4 1.5> ACS-2 ATA SATA 3.x device
ada0: Serial Number OCZ-****************
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 244198MB (500118192 512 byte sectors)
ada0: quirks=3D0x1<4K>
ada0: Previously was known as ad8

Partition table should be good I think:

=3D>       34  500118125  ada0  GPT  (238G)
         34          6        - free -  (3.0K)
         40        128     1  freebsd-boot  (64K)
        168  490733432     2  freebsd-ufs  (234G)
  490733600    8388608     3  freebsd-swap  (4.0G)
  499122208     995951        - free -  (486M)

I would like to debug further into the kernel but have little dtrace
experience. Any hints would be welcome.

--=20
You are receiving this mail because:
You are the assignee for the bug.=



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-219049-8>