Date:      Wed, 13 Mar 2013 09:52:28 -0700
From:      Kirk McKusick <mckusick@mckusick.com>
To:        Palle Girgensohn <girgen@freebsd.org>
Cc:        freebsd-fs@freebsd.org, Jeff Roberson <jroberson@jroberson.net>
Subject:   Re: leaking lots of unreferenced inodes (pg_xlog files?), maybe after moving tables and indexes to tablespace on different volume 
Message-ID:  <201303131652.r2DGqSr4051899@chez.mckusick.com>
In-Reply-To: <51405391.1020006@FreeBSD.org> 

Thanks for your report. It is certainly unlike anything that we
have seen reported before. 

Are you running your /usr filesystem with (the default) journalled
soft updates? You can check this by running the `mount' command 
with no arguments.
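
A quick way to check this from the `mount' output is sketched below. It
assumes that FreeBSD lists the option as "journaled soft-updates" inside
the parenthesised flags (e.g. "/dev/da0s1f on /usr (ufs, local, journaled
soft-updates)"); the function name has_suj is made up for illustration.

```shell
# Reads mount output on stdin and succeeds if the given mount point
# is listed with journaled soft updates.
# has_suj is a hypothetical helper, not a FreeBSD command.
has_suj() {
    awk -v mp="$1" '
        # Expected line shape (assumed): <device> on <mountpoint> (<flags>)
        $2 == "on" && $3 == mp && /journaled soft-updates/ { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Usage (not run here):
#   mount | has_suj /usr && echo "/usr uses journaled soft updates"
```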

Rather than rebooting your system, it would be most helpful if you
could instead shut it down to single user. Then do the following:

Create a transcript of your session by running `script'. Once
running in the session run these commands:

Run `mount' to show your filesystem configuration.
Run `df -hi /usr' to see whether the inodes are still missing.
Verify that you can cleanly unmount /usr (e.g., that the unmount
  does not hang and does not complain).
Remount /usr and run `df -hi' to see whether the inodes are
  still missing.
Unmount /usr again and run `fsck_ffs -p -f -d /usr'. If the fsck_ffs
fails with an unexpected inconsistency, you can run `fsck_ffs -y -d /usr'
to force it to clean up. When you have the filesystem successfully
cleaned up, type `exit' to get out of the script session and mail
me the transcript of the session (typescript).
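
The steps above could be collected into a small shell function along
these lines. This is only a sketch: the names run, usr_diag and DRY_RUN
are invented here, and it assumes /usr is listed in /etc/fstab so that
`mount /usr' and `fsck_ffs ... /usr' can find the device. Typing the
commands by hand inside the `script' session works just as well.

```shell
# Print each command before running it; with DRY_RUN set, only print.
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# Hypothetical wrapper around the manual steps for one filesystem.
usr_diag() {
    fs=${1:-/usr}
    run mount                      # show the filesystem configuration
    run df -hi "$fs"               # are the inodes still missing?
    run umount "$fs" || return 1   # should neither hang nor complain
    run mount "$fs"
    run df -hi "$fs"               # still missing after a remount?
    run umount "$fs" || return 1
    # Preen first; only if that fails, let fsck_ffs answer yes to everything.
    run fsck_ffs -p -f -d "$fs" || run fsck_ffs -y -d "$fs"
}

# Usage, in single-user mode (not run here):
#   script
#   . ./usr_diag.sh && usr_diag /usr
#   exit
```

Setting DRY_RUN=1 before calling usr_diag prints the command sequence
without touching the filesystem, which is a cheap way to review it first.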

Thanks for your help in tracking this down.

	Kirk McKusick

----- Original Message:

Date: Wed, 13 Mar 2013 11:23:13 +0100
From: Palle Girgensohn <girgen@freebsd.org>
To: freebsd-fs@freebsd.org
Subject: leaking lots of unreferenced inodes (pg_xlog files?), maybe after
 moving tables and indexes to tablespace on different volume

Hi!

Running postgresql-9.2.2 on FreeBSD 9.1 amd64 using a vanilla ufs file system.

I have the postgresql base/ on the /usr disk, and a separate volume /opt
where the default tablespace resides. This means that the amount of data
on the /usr disk should be stable. This is not the case: the disk usage
grows linearly (it seems to leave many inodes unreferenced).

The discrepancy between df and du is now huge:

# du -sxh /usr; df -h /usr
4,6G	/usr
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/da0s1f    104G     88G    8.0G    92%    /usr

4,6G vs 88GB, that must be more than a rounding error?

Strange thing is I cannot find any open files among the missing.

# lsof /usr| awk '{print $9}'|xargs ls -l > /dev/null

returns no errors (a missing file would make ls report an error). If
there were open files not referenced in any directory, they should have
been found.
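
The same check can be written so that it prints only the paths that no
longer exist, rather than relying on ls error messages. A minimal sketch,
assuming lsof's ninth column (NAME) carries a full path for every open
file; missing_paths is a made-up helper name:

```shell
# Reads one path per line on stdin and prints those that do not exist.
# missing_paths is a hypothetical helper.
missing_paths() {
    while IFS= read -r p; do
        [ -e "$p" ] || printf '%s\n' "$p"
    done
}

# Usage (not run here); NR > 1 skips the lsof header line:
#   lsof /usr | awk 'NR > 1 {print $9}' | missing_paths
```

Empty output would mean that every path lsof reported still has a
directory entry.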

Next thing is fsck, and yes, there are plenty of unreferenced files.

I ran fsck while the system was running (i.e. read-only) to get a grip on
the amount of lost inodes:

fsck /usr | awk '{print $1}'|cut -f 2 -d=| perl -e '$i = 0; while (<>) {
$i += $_;}; print $i / 1024 / 1024; print "\n";'
85223.3530330658

~85 GB gone, that's 80% of the disk, and it accounts for all the missing
space.
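
For reference, the same sum can be done in awk alone. The sketch below
assumes that fsck reports each unreferenced file on two lines, the second
beginning with SIZE=<bytes> (which is what the pipeline above relies on
when it takes the first field); sum_unref_mib is an invented name.

```shell
# Reads fsck output on stdin and prints the total of all SIZE= fields
# in MiB. sum_unref_mib is a hypothetical helper.
sum_unref_mib() {
    awk '
        # Only lines whose first field looks like SIZE=<bytes> are counted.
        $1 ~ /^SIZE=/ { split($1, kv, "="); total += kv[2] }
        END { printf "%.1f\n", total / 1024 / 1024 }
    '
}

# Usage (not run here):
#   fsck /usr | sum_unref_mib
```

As a cross-check, 85223 MiB is about 83 GiB, which is roughly the gap
between the 88G that df reports as used and the 4,6G that du can find.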

The mtimes of the inodes are pretty evenly spread over the time since the
machine was updated to FreeBSD 9.1, rebooted, and PostgreSQL was updated
to 9.2. All was done at the same time, so I can't really tell which is to
blame, but this is the only server, out of a dozen that were updated to
exactly the same versions, that has this problem. All other servers have
their /usr disk usage stable (since all data resides on a separate
tablespace).

The unreferenced inodes are almost exclusively around 16 MB in size, so
they are almost certainly all postgresql pg_xlog files. This means all
the files are lost from the same portion of code in the database engine.
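
One way to put a number on "almost exclusively" is to count how many of
the reported sizes match a WAL segment exactly. The sketch below assumes
the default PostgreSQL segment size of 16 MB (16777216 bytes) and the
same SIZE=<bytes> line format as above; count_wal_sized is a made-up
name.

```shell
# Reads fsck output on stdin and prints how many SIZE= entries are
# exactly one default-sized WAL segment. count_wal_sized is hypothetical.
count_wal_sized() {
    awk -v seg=16777216 '
        $1 ~ /^SIZE=/ {
            split($1, kv, "=")
            n++
            if (kv[2] == seg) wal++
        }
        END { printf "%d of %d\n", wal, n }
    '
}

# Usage (not run here):
#   fsck /usr | count_wal_sized
```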

How could it possibly leave unreferenced inodes around like
this at such a scale? Is the culprit a combination of postgresql and
file system code? Both were updated.

pg_xlog checkpoints seem to start every five minutes, each taking about 150 seconds:

Mar 13 00:39:08 dbserver postgres[5298]: [48-1] db=,user= LOG:
checkpoint starting: time
Mar 13 00:41:38 dbserver postgres[5298]: [49-1] db=,user= LOG:
checkpoint complete: wrote 2542 buffers (0.3%); 0 transaction log
file(s) added, 0 removed, 1 recycled; write=149.667 s, sync=0.101 s,
total=149.770 s; sync files=628, longest=0.021 s, average=0.000 s
Mar 13 00:44:08 dbserver postgres[5298]: [50-1] db=,user= LOG:
checkpoint starting: time
Mar 13 00:46:38 dbserver postgres[5298]: [51-1] db=,user= LOG:
checkpoint complete: wrote 3996 buffers (0.4%); 0 transaction log
file(s) added, 0 removed, 1 recycled; write=149.438 s, sync=0.111 s,
total=149.551 s; sync files=823, longest=0.006 s, average=0.000 s
Mar 13 00:49:08 dbserver postgres[5298]: [52-1] db=,user= LOG:
checkpoint starting: time
Mar 13 00:51:38 dbserver postgres[5298]: [53-1] db=,user= LOG:
checkpoint complete: wrote 13736 buffers (1.4%); 0 transaction log
file(s) added, 0 removed, 2 recycled; write=149.958 s, sync=0.311 s,
total=150.271 s; sync files=1335, longest=0.079 s, average=0.000 s
Mar 13 00:54:08 dbserver postgres[5298]: [54-1] db=,user= LOG:
checkpoint starting: time
Mar 13 00:56:38 dbserver postgres[5298]: [55-1] db=,user= LOG:
checkpoint complete: wrote 14638 buffers (1.5%); 0 transaction log
file(s) added, 0 removed, 17 recycled; write=149.330 s, sync=0.271 s,
total=149.603 s; sync files=1363, longest=0.017 s, average=0.000 s
Mar 13 00:59:08 dbserver postgres[5298]: [56-1] db=,user= LOG:
checkpoint starting: time
Mar 13 01:01:38 dbserver postgres[5298]: [57-1] db=,user= LOG:
checkpoint complete: wrote 8035 buffers (0.8%); 0 transaction log
file(s) added, 0 removed, 21 recycled; write=149.285 s, sync=0.146 s,
total=149.433 s; sync files=1160, longest=0.003 s, average=0.000 s
Mar 13 01:04:08 dbserver postgres[5298]: [58-1] db=,user= LOG:
checkpoint starting: time
Mar 13 01:06:37 dbserver postgres[5298]: [59-1] db=,user= LOG:
checkpoint complete: wrote 2156 buffers (0.2%); 0 transaction log
file(s) added, 0 removed, 9 recycled; write=149.402 s, sync=0.057 s,
total=149.461 s; sync files=610, longest=0.000 s, average=0.000 s
Mar 13 01:09:08 dbserver postgres[5298]: [60-1] db=,user= LOG:
checkpoint starting: time
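
To see how many segments PostgreSQL reports as removed versus recycled
over a longer stretch of log, the "checkpoint complete" lines can be
tallied. A minimal sketch, assuming the "<n> removed, <n> recycled"
wording shown above; wal_checkpoint_totals is an invented name and the
log path in the usage line is only an example.

```shell
# Reads PostgreSQL log lines on stdin and sums the removed and recycled
# segment counts. wal_checkpoint_totals is a hypothetical helper.
wal_checkpoint_totals() {
    awk '
        # substr(...) + 0 converts e.g. "17 recycled" to the number 17.
        match($0, /[0-9]+ removed/)  { removed  += substr($0, RSTART, RLENGTH) + 0 }
        match($0, /[0-9]+ recycled/) { recycled += substr($0, RSTART, RLENGTH) + 0 }
        END { printf "removed=%d recycled=%d\n", removed, recycled }
    '
}

# Usage (not run here; the path is an assumption):
#   grep 'checkpoint complete' /var/log/messages | wal_checkpoint_totals
```

In the excerpt above every checkpoint reports 0 added and 0 removed; only
recycled segments appear.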


I'm pretty certain that unmounting the file system and running fsck will
regain the lost space, but will it stop there?

Stopping postgresql briefly did not help; I tried that. That would have
helped if the files were open, but they're not. It seems that postgresql
did the right thing, and FreeBSD failed to unreference the files.

The server has about 30 databases and ~127 concurrent connections (not
all being active simultaneously, though), so it is fair to say it is
pretty active, but nothing extreme.

Hardware is HP DL360, using their HT Smart Array P410i.

Any ideas how to debug this? Or shall I just reboot, fsck, hope the
problem will go away, and when it does, forget about it?

Thanks,
Palle