Date:      Wed, 24 Sep 1997 06:17:23 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        dyson@FreeBSD.ORG
Cc:        syssgm@dtir.qld.gov.au, freebsd-current@FreeBSD.ORG
Subject:   Re: New timeout capability (was Re: cvs commit:....)
Message-ID:  <199709240617.XAA04899@usr07.primenet.com>
In-Reply-To: <199709230920.EAA00190@dyson.iquest.net> from "John S. Dyson" at Sep 23, 97 04:20:00 am

> I could possibly imagine a reasonable use for a 16K basic allocation size.

8k is where I typically stop, mostly because of frag size.  1k frags
are about my limit.  8-).


> I think that 4K performs pretty darned well anyway though.  In the
> real world, I wouldn't think that one would see much of a performance
> difference between 4K and 16K. 

For 8k, there used to be about a 40% improvement over 4k for iozone; I
haven't really tried this for about 5 months now, though.

I expect a bit of a drop for 16k because of the 2k frags, actually.

I'd think that 32k would go back up -- perhaps way up -- because of
4k page aligned frags being good for you.
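
The frag sizes quoted above follow from simple division.  Here is a
minimal sketch of that arithmetic, assuming 8 frags per block and a 4k
VM page; the names FRAGS_PER_BLOCK, VM_PAGE_BYTES, frag_size() and
frag_is_page_aligned() are made up for illustration and are not taken
from any kernel source.

```c
#include <assert.h>

/* Assumption: 8 frags per block, which gives the 1k/2k/4k figures above. */
#define FRAGS_PER_BLOCK 8u
/* Assumption: 4k VM pages. */
#define VM_PAGE_BYTES   4096u

/* Frag size in bytes for a given block size in bytes. */
static unsigned
frag_size(unsigned block_size)
{
    assert(block_size % FRAGS_PER_BLOCK == 0);
    return block_size / FRAGS_PER_BLOCK;
}

/* Nonzero if every frag is a whole number of VM pages. */
static int
frag_is_page_aligned(unsigned block_size)
{
    return frag_size(block_size) % VM_PAGE_BYTES == 0;
}
```

Under these assumptions, 8k blocks give 1k frags, 16k blocks give 2k
frags, and 32k is the first block size where a frag is a full page.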

It really matters how sequentially you are accessing your files.

For random writes of less than (or otherwise not equal to) 4k, there
is a requirement of read-before-write.  Technically, you could take
this down to 512b, since the VM has the bitmap for it.  If so, block
sizes over 4k (with frags larger than a disk block) would get
relatively more expensive *fast*, as long as you were doing I/O on
block boundaries.

I'm not sure whether I/O on a block boundary for a page causes a read
before write or not.  It probably does; this is technically not needed,
so there's a tiny optimization there for better iozone numbers.  8-).

If the read-before-write could be done on a block basis using a block
bitmap to indicate which 512b chunks had been read and which hadn't,
and you were guaranteed read-before-write, and if you wrote a whole
block, you'd map it read without reading, and you respected this bitmap
when responding to the dirty bit, well... that'd be a lot of work.  8-).
It would also give a more uniform win for block aligned accesses in
block increments (ndbm?), and certainly make IOZONE happier, as well
as making MSDOS FS happier.

So to recap, a 512b aligned write of block 3 in a new 4k page would
result in b00001000 in the bitmap, and the dirty bit set on the page.
A 43b write in block 5 not crossing a block boundary would result in
b00100000 in the bitmap, a 512b read of that block from disk, and a
43b write somewhere in the block, with the dirty bit set on the page.
Probably a useful optimization for fixed size record based random
record I/O for records 2k or smaller (so page locality is less of an
issue, and so that you shouldn't just read the whole page anyway).
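
The recap above can be sketched in C roughly as follows.  This is only
an illustration of the bookkeeping, not actual VM code: struct
page_state, page_write() and the constants are hypothetical names, and
the sketch assumes a 4k page split into eight 512b chunks with bit i of
the bitmap standing for chunk i.  It does no I/O; it just reports which
chunks would have to be read first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES  4096u   /* assumption: 4k VM page */
#define CHUNK_BYTES 512u    /* assumption: 512b disk block */

/* Hypothetical per-page state: one valid bit per 512b chunk. */
struct page_state {
    uint8_t valid;  /* bit i set: chunk i holds valid data */
    int     dirty;  /* page-level dirty bit */
};

/*
 * Record a write of len bytes at byte offset off within the page.
 * Returns a mask of the chunks that would need a 512b read from disk
 * before the write, i.e. chunks that are only partly overwritten and
 * are not yet valid.  Fully overwritten chunks need no read.
 */
static uint8_t
page_write(struct page_state *pg, size_t off, size_t len)
{
    assert(len > 0 && off < PAGE_BYTES && len <= PAGE_BYTES - off);

    size_t first = off / CHUNK_BYTES;
    size_t last = (off + len - 1) / CHUNK_BYTES;
    uint8_t need_read = 0;

    for (size_t i = first; i <= last; i++) {
        size_t cstart = i * CHUNK_BYTES;
        size_t cend = cstart + CHUNK_BYTES;
        int covers = (off <= cstart && off + len >= cend);
        uint8_t bit = (uint8_t)(1u << i);

        if (!covers && !(pg->valid & bit))
            need_read |= bit;   /* partial write to an unread chunk */
        pg->valid |= bit;       /* chunk is valid after the write */
    }
    pg->dirty = 1;
    return need_read;
}
```

With a fresh page, page_write(&pg, 3 * 512, 512) sets the bitmap to
b00001000 and asks for no read, while a 43b write inside block 5 on
another fresh page sets b00100000 and asks for that one 512b block to
be read first, matching the two cases in the recap.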

I don't know what the impact would be on the pager in the general
case; probably not pretty at all, actually.  Maybe John could comment
(probably to say I'm insane ;-)).


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


