Date:      Fri, 21 Oct 2011 21:11:08 +0200
From:      Miroslav Lachman <000.fbsd@quip.cz>
To:        Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc:        freebsd-fs@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject:   Re: dirhash and dynamic memory allocation
Message-ID:  <4EA1C3CC.3090500@quip.cz>
In-Reply-To: <20111021162025.GA89885@icarus.home.lan>
References:  <4E97FEDD.7060205@quip.cz> <j7938v$66s$1@dough.gmane.org> <4EA19203.5050503@quip.cz> <20111021162025.GA89885@icarus.home.lan>

Jeremy Chadwick wrote:
> On Fri, Oct 21, 2011 at 05:38:43PM +0200, Miroslav Lachman wrote:
>> Hi, I am back on this topic...
>>
>> Ivan Voras wrote:
>>> On 14/10/2011 11:20, Miroslav Lachman wrote:
>>>> Hi all,
>>>>
>>>> I tried some tuning of dirhash on our servers and after googling a bit, I
>>>> found an old GSoC project wiki page about Dynamic Memory Allocation for
>>>> Dirhash: http://wiki.freebsd.org/DirhashDynamicMemory
>>>> Is there any reason not to use it / not commit it to HEAD?
>>>
>>> AFAIK it's sort-of already present. In 8-stable and recent kernels you
>>> can give huge amounts of memory to dirhash via vfs.ufs.dirhash_maxmem
>>> (but except in really large edge cases I don't think you *need* more
>>> than 32 MB), and the kernel will scale-down or free the memory if not
>>> needed.
>>>
>>> In effect, vfs.ufs.dirhash_maxmem is the upper limit - the kernel will
>>> use less and will free the allocated memory in low memory situations
>>> (which I've tried and it works).
>>
>> So the current behavior is that on 7.3+ and 8.x we have a smaller
>> average dirhash buffer (by default) than we had initially 10 years
>> ago, because it started as a 2MB fixed size and now we have a 2MB
>> maximum, which is lowered in low-memory situations... and sometimes
>> it is set to 0MB!
>>
>> I caught this 2 days ago:
>>
>> root@rip ~/# sysctl vfs.ufs
>> vfs.ufs.dirhash_reclaimage: 5
>> vfs.ufs.dirhash_lowmemcount: 36953
>> vfs.ufs.dirhash_docheck: 0
>> vfs.ufs.dirhash_mem: 0
>> vfs.ufs.dirhash_maxmem: 8388608
>> vfs.ufs.dirhash_minsize: 2560
>>
>> I set maxmem to 8MB in sysctl.conf to increase performance, and
>> dirhash_mem being 0 is a really bad surprise!
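(For context, the tuning mentioned above amounts to a one-line sysctl.conf
entry; the value is in bytes, and 8MB here simply mirrors the setting
described in the quoted text:)

```
# /etc/sysctl.conf -- raise the dirhash memory ceiling (bytes);
# 8 MB matches the setting mentioned above, purely illustrative
vfs.ufs.dirhash_maxmem=8388608
```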
>
> Actually, the "bad surprise" is the dirhash_lowmemcount of 36953.  Your
> increasing dirhash_maxmem is fine -- what you're seeing is that your
> machine keeps running out of memory, or that your directories hold so
> many files that you're repeatedly exhausting the dirhash.
>
> I'm going to be blunt and just ask: why does that happen?  Or do you
> have a filesystem with an absurdly high number of files in a single
> directory?  If the former, ignore the next paragraph.

There is not an absurdly high number of files in any single directory, 
because I know about this potential problem and I am fighting it together 
with the webapp developers. But I see a similar lowmemcount on almost all 
of our UFS-based servers. Most of them are webhosting machines (running 
open-source or proprietary CMSes, so most of the content lives in MySQL). 
Many of our servers have long uptimes (about a year or more), so the 
lowmemcount numbers are higher on them. Each webserver hosts about 
100-150 websites.

Examples from 4 of our servers:

vfs.ufs.dirhash_lowmemcount: 45295
up 39 days

vfs.ufs.dirhash_lowmemcount: 164782
up 419 days

vfs.ufs.dirhash_lowmemcount: 391452
up 102 days

vfs.ufs.dirhash_lowmemcount: 633202
up 417 days

Only a few of our servers have a lowmemcount lower than 1000 (but still 
higher than 500).

One example is a server with jails, where UFS is used only for the host 
system and the jails are on ZFS.

This server has 4GB of RAM and 362MB of used swap space:

vfs.ufs.dirhash_lowmemcount: 936
up 284 days

> I've harped on this before on the mailing list: one of the first things
> I learned as a system administrator was that you Do Not(tm) fill
> directories with tens of thousands of files.  Split them up into
> subdirs.  Even caching daemons (squid, etc.) support this kind of thing;
> filename "aj1j11hsfkqXaj21" should really be aj/1j/11hsfkqXaj21.  You
> get the idea.  DNS/BIND administrators of systems which have tens of
> thousands of domains are quite familiar with this scenario too.
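Jeremy's aj/1j/... scheme can be sketched as a small shell helper
(shard_path is a hypothetical name, and two characters per level is just
one common choice, as used by squid and similar caches):

```shell
#!/bin/sh
# Sketch: map a flat cache filename onto two levels of subdirectories
# so no single directory accumulates tens of thousands of entries.
shard_path() {
    name=$1
    d1=$(printf '%s' "$name" | cut -c1-2)   # first shard level
    d2=$(printf '%s' "$name" | cut -c3-4)   # second shard level
    rest=$(printf '%s' "$name" | cut -c5-)  # remainder of the name
    printf '%s/%s/%s\n' "$d1" "$d2" "$rest"
}

shard_path aj1j11hsfkqXaj21   # -> aj/1j/11hsfkqXaj21
```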
>
>> I am worried about bad performance when dirhash is emptied while the
>> server is already running at maximum load (there is some memory-hungry
>> process, the system may start swapping to disk, and dirhash is
>> effectively disabled).
>>
>> I found a PR kern/145246
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=145246
>>
>> Is it possible to add some dirhash_minmem limit so that the dirhash
>> memory is never cleared completely?
>> Then I could set dirhash_minmem=2MB and dirhash_maxmem=16MB, and
>> dirhash_mem would always stay between these two limits.
>
> dirhash shouldn't be "disabled", it's that memory pressure from other
> things has priority over the dirhash, but I understand what you mean.
> This is quite evident from dirhash_lowmemcount being so high.
>
> I understand what you want, and maybe there is a way to get what you
> want (with little effort), but I am strongly inclined to say you need to
> figure out on your system what is causing such memory pressure and solve
> that.  Honest: try to solve the real problem rather than dancing around
> it.  If you have a process that skyrockets in RSS/RES usage due to a
> memory leak or out-of-control design (such as a daemonised perl script
> which blindly uses .= to append data to a scalar, or blindly keeps
> appending data to the end of a list), then fix that problem.

As the servers are running 3rd-party apps (customers' websites), it is 
out of my control to fix issues with PHP CMSes etc. So the "easy" 
low-memory fix is to buy and add more RAM.

> Basically I'm trying to say that it shouldn't be the responsibility of
> dirhash to "work around" other problems happening on the system that
> diminish or exhaust available memory.  You end up with a kernel design
> that has tons of one-offs in it and that does nothing but bite you in
> the butt down the road.  (Linux has been through this many times over.)

You are partially right. But the dirhash lowmem hook seems too sensitive 
to me: I see high lowmemcount numbers on systems with almost empty swap 
(a few kB in swap, not MB). That's why I am looking for dirhash_minmem.
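The dirhash_minmem idea amounts to clamping: on a lowmem event the hash
would shrink toward a floor instead of all the way to zero. A rough
user-space illustration of the proposed semantics (dirhash_minmem does
not exist in the kernel; the byte values below are only the 2MB/16MB
example from this mail):

```shell
#!/bin/sh
# Sketch of the *proposed* semantics: dirhash_mem would be clamped into
# [dirhash_minmem, dirhash_maxmem] instead of being allowed to hit 0.
clamp_dirhash_mem() {
    mem=$1 minmem=$2 maxmem=$3
    if [ "$mem" -lt "$minmem" ]; then
        echo "$minmem"   # lowmem pressure may shrink it, but never below the floor
    elif [ "$mem" -gt "$maxmem" ]; then
        echo "$maxmem"   # never above the configured ceiling
    else
        echo "$mem"
    fi
}

clamp_dirhash_mem 0 2097152 16777216   # -> 2097152 (the 2MB floor)
```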

Miroslav Lachman



