Date:      Tue, 07 Oct 2003 01:11:11 +0200
From:      "Poul-Henning Kamp" <phk@phk.freebsd.dk>
To:        Scott Long <scottl@freebsd.org>
Cc:        Garrett Wollman <wollman@khavrinen.lcs.mit.edu>
Subject:   Re: Alignment of disk-I/O from userland. 
Message-ID:  <27374.1065481871@critter.freebsd.dk>
In-Reply-To: Your message of "Mon, 06 Oct 2003 16:44:32 MDT." <20031006163218.L55190@pooker.samsco.home> 

In message <20031006163218.L55190@pooker.samsco.home>, Scott Long writes:

>We already
>have the busdma interface whose sole purpose is to take system
>buffers and prepare them for transfer to/from hardware [...]

I certainly do agree that _if_ we do want to do copy/align, busdma would
be a good place for it.

>As for returning an error code for a buffer that we (arbitrarily) believe
>to be too big to align, [...]

I have never advocated returning an error based on "alignment and size",
only based on alignment alone.


But I also just realized a complication I had not thought of earlier,
and which may modify our thinking further:  This is an issue for
all physread()/physwrite() drivers, not just disks.

In other words, if I want to write 1MB blocks to a SCSI tape,
and I don't align my in-memory buffer sufficiently for the hardware,
busdma would have to allocate 1MB of memory (it may _possibly_ be
able to do so as disjoint pages rather than consecutively) and copy
the entire request over.

For disks we can chop the request at sector boundaries or multiples
thereof and deal with it that way, but we don't have that option
for scsi_sa or even scsi_pt devices.
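
To illustrate the disk case only, here is a minimal sketch of chopping
a request into pieces that are each a multiple of the sector size, so
that each piece could be copied/aligned on its own.  The names
split_request(), chunk_fn and count_chunk() are hypothetical and are
not taken from any driver; a tape or pass-through request could not be
split this way because each request has to go to the device whole.

```c
#include <stddef.h>

/* Hypothetical callback invoked once per chunk; returns 0 on success. */
typedef int (*chunk_fn)(size_t offset, size_t len, void *arg);

/*
 * Hypothetical helper: walk a request of `total` bytes in chunks of at
 * most `max_chunk` bytes, rounded down to a multiple of `sector`.
 * Returns -1 for arguments that cannot be split on sector boundaries,
 * otherwise the first non-zero value returned by `fn`, or 0.
 */
static int
split_request(size_t total, size_t sector, size_t max_chunk,
    chunk_fn fn, void *arg)
{
	size_t chunk, off, n;
	int error;

	if (sector == 0 || total % sector != 0 || max_chunk < sector)
		return (-1);
	chunk = max_chunk - (max_chunk % sector);
	for (off = 0; off < total; off += n) {
		n = (total - off < chunk) ? total - off : chunk;
		error = fn(off, n, arg);
		if (error != 0)
			return (error);
	}
	return (0);
}

/* Example callback: just count how many chunks were produced. */
static int
count_chunk(size_t offset, size_t len, void *arg)
{
	(void)offset;
	(void)len;
	(*(size_t *)arg)++;
	return (0);
}
```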

Currently we impose a 128k upper limit on I/O requests, but we have
already more or less agreed that needs to grow into the 4-16MB range
soon.

The more I think about it, the more arguments I find for retaining
the status quo of requiring userland to do proper alignment (but
with better error-checking).

Particularly since the only unaligned case I know of yet, newfs(8),
is by trivial accident rather than need or intent.
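
As a rough sketch of what an alignment-only check could look like:
the function name check_io_alignment(), the `align` parameter and the
choice of EINVAL are all assumptions made for illustration, not an
existing kernel interface.  The point it shows is that the decision
depends on the buffer address alone and never on the request size.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: reject a transfer purely on the alignment of the
 * buffer address.  `align` is the alignment the hardware is assumed to
 * need and must be a power of two.  The request length is deliberately
 * not consulted.
 */
static int
check_io_alignment(const void *buf, size_t align)
{
	if (align == 0 || (align & (align - 1)) != 0)
		return (EINVAL);	/* not a power of two */
	if (((uintptr_t)buf & (align - 1)) != 0)
		return (EINVAL);	/* misaligned buffer */
	return (0);
}
```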

The question of how to communicate the alignment required to
userland has been raised.  I propose this answer:

    Sufficient alignment can be obtained by any one of these methods:
	1.  Allocate your buffer with malloc(3).
	2.  Align it to the request size.
	3.  Align it to a page.

(The first is somewhat dependent on the behaviour of phkmalloc, and
can be removed, but it offers a nice clean shortcut for most
programmers.)
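
For method 3, a minimal userland sketch, assuming a system that
provides posix_memalign(3) and sysconf(3) with _SC_PAGESIZE.  The
helper name alloc_page_aligned() is hypothetical; the buffer it
returns is released with free(3).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: return `len` bytes aligned to a page, or NULL. */
static void *
alloc_page_aligned(size_t len)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	void *buf = NULL;

	if (pagesize <= 0)
		return (NULL);
	/* posix_memalign() returns 0 on success, an error number otherwise. */
	if (posix_memalign(&buf, (size_t)pagesize, len) != 0)
		return (NULL);
	/* Sanity check: the address is a multiple of the page size. */
	assert(((uintptr_t)buf % (uintptr_t)pagesize) == 0);
	return (buf);
}
```

A caller would then pass the returned pointer straight to read(2) or
write(2) on the raw device.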

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.


