Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 15 Jun 2016 13:07:15 -0700
From:      Bryan Drewery <bdrewery@FreeBSD.org>
To:        Mateusz Guzik <mjguzik@gmail.com>, freebsd-pkg@freebsd.org
Subject:   Re: pkg vs namecache
Message-ID:  <73fee5f2-5a18-7eb1-be66-c0ea981f15de@FreeBSD.org>
In-Reply-To: <20160613232827.GA6841@dft-labs.eu>
References:  <20160613232827.GA6841@dft-labs.eu>

next in thread | previous in thread | raw e-mail | index | archive | help
On 6/13/16 4:28 PM, Mateusz Guzik wrote:
> Hello,
>=20
> it is not a secret that poudriere builds expose a lot of contention in
> the kernel.
>=20
> I performed several tests few months back. On a 24-way machine, once al=
l
> distfiles are downloaded, the top contending lock belongs to the
> namecache. The namecache is unfortunately protected with one big lock
> taken in shared or exclusive mode. I have a WIP patch which introduces
> fine grained locking and in effect pushes it way down. It is unlikely t=
o
> hit 11.0, but 11.1 is an option.
>=20
> Regardless, if performance is a concern, userspace should not do
> unnecessary stuff (which in this case means hammering the namecache for
> no reason).
>=20
> Playing around with dtrace e.g. like this:
> dtrace -n 'vfs:namecache:: { @[execname] =3D count(); } tick-30s {
> exit(0); }'
>=20
> shows that pkg-static is amongst top namecache users and further
> inspection suggests that several (if not most) lookups are unnecessary.
>=20
> For any path-based operation, each component needs to be looked up.
> Lookups of non-cached files add entries, which blocks other consumers.
>=20
> Note that the startup procedure for dynamically linked binaries is
> wasteful to an extent and it may beget fixed later.
>=20
> With this out of the way, let's have a look what pkg 1.8.4_1 does when
> prompted to install rexima-1.4. I only left entries I consider
> suspicious + enough context to see what's up.=20
>=20
> Would be nice if this stuff got fixed. Thaks.
>=20
> 63984: __getcwd("/root",1024)			 =3D 0 (0x0)
>=20
> Why?
>=20
> 63984: open("/etc/pkg/",O_NONBLOCK|O_DIRECTORY|O_CLOEXEC,041673240) =3D=
 4 (0x4)
> 63984: fstatfs(4,{ fstypename=3Dzfs,mntonname=3D/,mntfromname=3Dzrootmj=
g,fsid=3Da294c20dde42cb7d }) =3D 0 (0x0)
> 63984: getdirentries(0x4,0x80305f000,0x1000,0x80305c268) =3D 72 (0x48)
> 63984: getdirentries(0x4,0x80305f000,0x1000,0x80305c268) =3D 0 (0x0)
> 63984: close(4)					 =3D 0 (0x0)
> 63984: __getcwd("/root",1024)			 =3D 0 (0x0)
>=20
> Another __getcwd. If the first one was really needed (was it?), the

I think an important point to remember here is that pkg itself uses a
lot of other libraries that do redundant things.  Libc and the kernel
should be optimized more for doing such things.  getcwd() should in
theory not be expensive to call and should cache the value somewhere in
userspace and change it on [f]chdir(2).  Of course this breaks ENOENT,
but tradeoffs are needed to fix our expensive syscalls IMO.

getpid() is another that should be a free lunch every time IMO.

> result should be remembered. This particular run got 7 getcwds in total=
,
> bigger packages take more.
>=20
> 63984: lstat("/etc",{ mode=3Ddrwxr-xr-x ,inode=3D17,size=3D113,blksize=3D=
7680 }) =3D 0 (0x0)
> 63984: lstat("/etc/pkg",{ mode=3Ddrwxr-xr-x ,inode=3D36,size=3D4,blksiz=
e=3D4096 }) =3D 0 (0x0)
> 63984: lstat("/etc/pkg/FreeBSD.conf",{ mode=3D-rw-r--r-- ,inode=3D32163=
7,size=3D436,blksize=3D4096 }) =3D 0 (0x0)
> 63984: stat("/etc/pkg/FreeBSD.conf",{ mode=3D-rw-r--r-- ,inode=3D321637=
,size=3D436,blksize=3D4096 }) =3D 0 (0x0)
> 63984: openat(AT_FDCWD,"/etc/pkg/FreeBSD.conf",O_RDONLY,00) =3D 4 (0x4)
>=20

This particular pattern looks like realpath(3).  It was hugely
problematic for buildworld and bmake's processing of .meta files with
filemon data.  It would spam a ton of realpath(3) that led to a 48
minutes NOP build.  We added in a simple realpath(3) cache in bmake
which dropped it down to 8 minutes.  The caching fix was tested on Linux
and saw no improvement there since their realpath(3) was already fast.

For pkg, indeed it is from UCL's ucl_parser_add_file() ->
ucl_parser_add_file_priority() which does a realpath(3), stat(2) and
then open(2) on the file provided.

It seems odd that UCL would need to realpath(3) the file being opened.

> Why not just open the file. If some kind of includes are supported, you
> can do the last stat and then compare inode + device id pairs.
>=20
> 63984: munmap(0x80064f000,436)			 =3D 0 (0x0)
> 63984: open("/usr/local/etc/pkg/repos/",O_NONBLOCK|O_DIRECTORY|O_CLOEXE=
C,041673240) ERR#2 'No such file or directory'
> 63984: stat("/var/db/pkg",{ mode=3Ddrwxr-xr-x ,inode=3D78,size=3D5,blks=
ize=3D4096 }) =3D 0 (0x0)
> 63984: eaccess("/var/db/pkg",R_OK|W_OK)		 =3D 0 (0x0)
> 63984: stat("/var/db/pkg/local.sqlite",{ mode=3D-rw-r--r-- ,inode=3D475=
5,size=3D11584512,blksize=3D131072 }) =3D 0 (0x0)
> 63984: eaccess("/var/db/pkg/local.sqlite",R_OK|W_OK) =3D 0 (0x0)
> 63984: stat("/var/db/pkg/repo-FreeBSD.sqlite",{ mode=3D-rw-r--r-- ,inod=
e=3D323679,size=3D46235648,blksize=3D131072 }) =3D 0 (0x0)
> 63984: eaccess("/var/db/pkg/repo-FreeBSD.sqlite",R_OK|W_OK) =3D 0 (0x0)
> 63984: stat("/var/db/pkg",{ mode=3Ddrwxr-xr-x ,inode=3D78,size=3D5,blks=
ize=3D4096 }) =3D 0 (0x0)
> 63984: eaccess("/var/db/pkg",R_OK|W_OK)		 =3D 0 (0x0)
> 63984: stat("/var/db/pkg/repo-FreeBSD.sqlite",{ mode=3D-rw-r--r-- ,inod=
e=3D323679,size=3D46235648,blksize=3D131072 }) =3D 0 (0x0)
> 63984: eaccess("/var/db/pkg/repo-FreeBSD.sqlite",R_OK|W_OK) =3D 0 (0x0)
> 63984: write(1,"Updating FreeBSD repository cata"...,41) =3D 41 (0x29)
> 63984: statfs("/var/db/pkg",{ fstypename=3Dzfs,mntonname=3D/,mntfromnam=
e=3Dzrootmjg,fsid=3Da294c20dde42cb7d }) =3D 0 (0x0)
> 63984: access("/var/db/pkg/FreeBSD.meta",R_OK)	 =3D 0 (0x0)
> 63984: __getcwd("/root",1024)			 =3D 0 (0x0)
> 63984: lstat("/var",{ mode=3Ddrwxr-xr-x ,inode=3D58,size=3D25,blksize=3D=
4096 }) =3D 0 (0x0)
> 63984: lstat("/var/db",{ mode=3Ddrwxr-xr-x ,inode=3D72,size=3D22,blksiz=
e=3D4096 }) =3D 0 (0x0)
> 63984: lstat("/var/db/pkg",{ mode=3Ddrwxr-xr-x ,inode=3D78,size=3D5,blk=
size=3D4096 }) =3D 0 (0x0)
> 63984: lstat("/var/db/pkg/FreeBSD.meta",{ mode=3D-rw-r--r-- ,inode=3D32=
3678,size=3D246,blksize=3D4096 }) =3D 0 (0x0)
> 63984: stat("/var/db/pkg/FreeBSD.meta",{ mode=3D-rw-r--r-- ,inode=3D323=
678,size=3D246,blksize=3D4096 }) =3D 0 (0x0)
> 63984: openat(AT_FDCWD,"/var/db/pkg/FreeBSD.meta",O_RDONLY,00) =3D 4 (0=
x4)
>=20
> Again, several passes over /var/db/pkg.
>=20
> 63984: stat("/etc/nsswitch.conf",{ mode=3D-rw-r--r-- ,inode=3D14316,siz=
e=3D255,blksize=3D4096 }) =3D 0 (0x0)
> 63984: open("/etc/nsswitch.conf",O_CLOEXEC,0666) =3D 7 (0x7)
> 63984: ioctl(7,TIOCGETA,0xffff90f0)		 ERR#25 'Inappropriate ioctl for d=
evice'
> 63984: fstat(7,{ mode=3D-rw-r--r-- ,inode=3D14316,size=3D255,blksize=3D=
4096 }) =3D 0 (0x0)
> 63984: read(7,"#\n# nsswitch.conf(5) - name ser"...,4096) =3D 255 (0xff=
)
> 63984: read(7,0x80305f000,4096)			 =3D 0 (0x0)
> 63984: access("/usr/local/lib/nss_compat.so.1",F_OK) ERR#2 'No such fil=
e or directory'
> 63984: access("/lib/nss_compat.so.1",F_OK)	 ERR#2 'No such file or dire=
ctory'
> 63984: access("/usr/lib/nss_compat.so.1",F_OK)	 ERR#2 'No such file or =
directory'
> 63984: access("/usr/lib/compat/nss_compat.so.1",F_OK) ERR#2 'No such fi=
le or directory'
> 63984: access("/usr/local/lib/nss_compat.so.1",F_OK) ERR#2 'No such fil=
e or directory'
> 63984: access("/usr/local/lib/gcc5/nss_compat.so.1",F_OK) ERR#2 'No suc=
h file or directory'
> 63984: access("/usr/local/lib/perl5/5.20/mach/CORE/nss_compat.so.1",F_O=
K) ERR#2 'No such file or directory'
> 63984: access("/usr/local/llvm36/lib/nss_compat.so.1",F_OK) ERR#2 'No s=
uch file or directory'
> 63984: access("/usr/local/llvm37/lib/nss_compat.so.1",F_OK) ERR#2 'No s=
uch file or directory'
> 63984: access("/lib/casper/nss_compat.so.1",F_OK) ERR#2 'No such file o=
r directory'
> 63984: access("/lib/nss_compat.so.1",F_OK)	 ERR#2 'No such file or dire=
ctory'
> 63984: access("/usr/lib/nss_compat.so.1",F_OK)	 ERR#2 'No such file or =
directory'
> 63984: access("/usr/local/lib/nss_nis.so.1",F_OK) ERR#2 'No such file o=
r directory'
> 63984: access("/lib/nss_nis.so.1",F_OK)		 ERR#2 'No such file or direct=
ory'
> 63984: access("/usr/lib/nss_nis.so.1",F_OK)	 ERR#2 'No such file or dir=
ectory'
> 63984: access("/usr/lib/compat/nss_nis.so.1",F_OK) ERR#2 'No such file =
or directory'
> 63984: access("/usr/local/lib/nss_nis.so.1",F_OK) ERR#2 'No such file o=
r directory'
> 63984: access("/usr/local/lib/gcc5/nss_nis.so.1",F_OK) ERR#2 'No such f=
ile or directory'
> 63984: access("/usr/local/lib/perl5/5.20/mach/CORE/nss_nis.so.1",F_OK) =
ERR#2 'No such file or directory'
> 63984: access("/usr/local/llvm36/lib/nss_nis.so.1",F_OK) ERR#2 'No such=
 file or directory'
> 63984: access("/usr/local/llvm37/lib/nss_nis.so.1",F_OK) ERR#2 'No such=
 file or directory'
> 63984: access("/lib/casper/nss_nis.so.1",F_OK)	 ERR#2 'No such file or =
directory'
> 63984: access("/lib/nss_nis.so.1",F_OK)		 ERR#2 'No such file or direct=
ory'
> 63984: access("/usr/lib/nss_nis.so.1",F_OK)	 ERR#2 'No such file or dir=
ectory'
> 63984: access("/usr/local/lib/nss_files.so.1",F_OK) ERR#2 'No such file=
 or directory'
> 63984: access("/lib/nss_files.so.1",F_OK)	 ERR#2 'No such file or direc=
tory'
> 63984: access("/usr/lib/nss_files.so.1",F_OK)	 ERR#2 'No such file or d=
irectory'
> 63984: access("/usr/lib/compat/nss_files.so.1",F_OK) ERR#2 'No such fil=
e or directory'
> 63984: access("/usr/local/lib/nss_files.so.1",F_OK) ERR#2 'No such file=
 or directory'
> 63984: access("/usr/local/lib/gcc5/nss_files.so.1",F_OK) ERR#2 'No such=
 file or directory'
> 63984: access("/usr/local/lib/perl5/5.20/mach/CORE/nss_files.so.1",F_OK=
) ERR#2 'No such file or directory'
> 63984: access("/usr/local/llvm36/lib/nss_files.so.1",F_OK) ERR#2 'No su=
ch file or directory'
> 63984: access("/usr/local/llvm37/lib/nss_files.so.1",F_OK) ERR#2 'No su=
ch file or directory'
> 63984: access("/lib/casper/nss_files.so.1",F_OK) ERR#2 'No such file or=
 directory'
> 63984: access("/lib/nss_files.so.1",F_OK)	 ERR#2 'No such file or direc=
tory'
> 63984: access("/usr/lib/nss_files.so.1",F_OK)	 ERR#2 'No such file or d=
irectory'
> 63984: access("/usr/local/lib/nss_dns.so.1",F_OK) ERR#2 'No such file o=
r directory'
> 63984: access("/lib/nss_dns.so.1",F_OK)		 ERR#2 'No such file or direct=
ory'
> 63984: access("/usr/lib/nss_dns.so.1",F_OK)	 ERR#2 'No such file or dir=
ectory'
> 63984: access("/usr/lib/compat/nss_dns.so.1",F_OK) ERR#2 'No such file =
or directory'
> 63984: access("/usr/local/lib/nss_dns.so.1",F_OK) ERR#2 'No such file o=
r directory'
> 63984: access("/usr/local/lib/gcc5/nss_dns.so.1",F_OK) ERR#2 'No such f=
ile or directory'
> 63984: access("/usr/local/lib/perl5/5.20/mach/CORE/nss_dns.so.1",F_OK) =
ERR#2 'No such file or directory'
> 63984: access("/usr/local/llvm36/lib/nss_dns.so.1",F_OK) ERR#2 'No such=
 file or directory'
> 63984: access("/usr/local/llvm37/lib/nss_dns.so.1",F_OK) ERR#2 'No such=
 file or directory'
> 63984: access("/lib/casper/nss_dns.so.1",F_OK)	 ERR#2 'No such file or =
directory'
> 63984: access("/lib/nss_dns.so.1",F_OK)		 ERR#2 'No such file or direct=
ory'
> 63984: access("/usr/lib/nss_dns.so.1",F_OK)	 ERR#2 'No such file or dir=
ectory'
>=20
> Now that's a bummer, but likely not a pkg issue.
>=20
> 63984: mkdir("/tmp",0777)			 ERR#17 'File exists'
>=20
> Why?
>=20
> 63984: getpid()					 =3D 63984 (0xf9f0)
> 63984: stat("/tmp",{ mode=3Ddrwxrwxrwt ,inode=3D4,size=3D76,blksize=3D1=
6384 }) =3D 0 (0x0)
> 63984: open("/tmp/packagesite.yaml.txz.DlhH4v",O_RDWR|O_CREAT|O_EXCL,06=
00) =3D 5 (0x5)
> 63984: unlink("/tmp/packagesite.yaml.txz.DlhH4v") =3D 0 (0x0)
> 63984: unlink("")				 ERR#2 'No such file or directory'
>=20
> ?
>=20
> 3984: stat("/etc/nsswitch.conf",{ mode=3D-rw-r--r--
> ,inode=3D14316,size=3D255,blksize=3D4096 }) =3D 0 (0x0)
> 63984: geteuid()                                 =3D 0 (0x0)
> 63984: open("/etc/spwd.db",O_CLOEXEC,00)         =3D 12 (0xc)
> 63984: fstat(12,{ mode=3D-rw------- ,inode=3D319614,size=3D40960,blksiz=
e=3D40960
> }) =3D 0 (0x0)
> 63984: read(12,"\0\^F\^Ua\0\0\0\^B\0\0\^D\M-R\0"...,260) =3D 260 (0x104=
)
> 63984: pread(0xc,0x803064000,0x1000,0x6000)      =3D 4096 (0x1000)
> 63984: pread(0xc,0x803063000,0x1000,0x4000)      =3D 4096 (0x1000)
> 63984: close(12)                                 =3D 0 (0x0)
> 63984: stat("/etc/nsswitch.conf",{ mode=3D-rw-r--r--
> ,inode=3D14316,size=3D255,blksize=3D4096 }) =3D 0 (0x0)
> 63984: open("/etc/group",O_CLOEXEC,0666)         =3D 12 (0xc)
> 63984: fstat(12,{ mode=3D-rw-r--r-- ,inode=3D319611,size=3D500,blksize=3D=
4096 })
> =3D 0 (0x0)
> 63984: lseek(12,0x0,SEEK_CUR)                    =3D 0 (0x0)
> 63984: lseek(12,0x0,SEEK_SET)                    =3D 0 (0x0)
> 63984: read(12,"# $FreeBSD$\n#\nwheel:*:0:root\n"...,4096) =3D 500 (0x1=
f4)
> 63984: close(12)                                 =3D 0 (0x0)
>=20
> This repeats itself a lot. I wonder if that's an iffeciency in libc or
> pkg.

I think some of this and the nss stuff earlier is purely libc.  There's
also timezone lookups in libc that are done quite often that can be
optimized.  I did some work on this at Isilon to help lookups on the
timezone data but it's not really safe for upstream libc since it is
dangerous for long-living processes.

For NSS perhaps we could use a database to avoid all of the module
lookup syscall spam by only doing it when updating the nssswitch.conf fil=
e.

>=20
> 63984: ioctl(1,TIOCGETA,0xffffced0)		 =3D 0 (0x0)
> 63984: write(1,"[1/1] Extracting rexima-1.4:   0"...,33) =3D 33 (0x21)
> 63984: mkdirat(11,"usr",0755)			 ERR#17 'File exists'
> 63984: mkdirat(11,"usr/local",0755)		 ERR#17 'File exists'
> 63984: mkdirat(11,"usr/local/bin",0755)		 ERR#17 'File exists'

It would be better if the do_extract pre-calculated a list of
directories needed and made them all then rather than on every single
file extraction trying mkdirat_p() again.  And then also optimize
mkdirs/mkdirat_p to check if the path already exists before walking the
tree as you suggest.

> 63984: openat(11,"usr/local/bin/rexima.DcozrktiDx8Y",O_WRONLY|O_CREAT|O=
_EXCL,0755) =3D 12 (0xc)
> 63984: fstat(12,{ mode=3D-rwxr-xr-x ,inode=3D4182,size=3D0,blksize=3D13=
1072 }) =3D 0 (0x0)
>=20
> why fstat?

It is archive_read_data_into_fd(). Pkg opens it then calls
archive_read_data_into_fd:

can_lseek =3D (fstat(fd, &st) =3D=3D 0) && S_ISREG(st.st_mode);
if (!can_lseek)
        nulls =3D calloc(1, nulls_size);

>=20
> In the common case you can expect these directories to already be there=
.
> So how about instead you try the open first and only if it fails you tr=
y
> to create the tree. But you start trying from the top (here: "bin").
>=20
> 63984:
> fstatat(11,"usr/local/bin/rexima",0x7fffffffd5f0,AT_SYMLINK_NOFOLLOW)
> ERR#2 'No such file or directory'
> 63984:
> renameat(11,"usr/local/bin/rexima.DcozrktiDx8Y",11,"usr/local/bin/rexim=
a")
> =3D 0 (0x0)
>=20
> Why not just replace?
>=20

The fstat is done to check for flag schg and remove if there.

Worse though is that pkg is now unlinking before rename which
invalidates half of the whole point of using rename (atomic and not
leaving files missing).

--=20
Regards,
Bryan Drewery



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?73fee5f2-5a18-7eb1-be66-c0ea981f15de>