Date:      Mon, 7 Feb 2000 12:56:52 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Alfred Perlstein <bright@wintelcom.net>
Cc:        hackers@FreeBSD.ORG
Subject:   Re: Syncing a vector of fileoffsets and lengths?
Message-ID:  <200002072056.MAA50540@apollo.backplane.com>
References:  <20000207114042.E25520@fw.wintelcom.net> <200002071938.LAA50114@apollo.backplane.com> <20000207125636.G25520@fw.wintelcom.net>

:I think two kinds of behavior are needed, ordered range fsync and
:unordered async fsync.
:
:The ordered range could be taken care of easily by your implementation,
:however for maximum effectiveness you'd want to allow for unordered
:async fsync and notification.
:
:The simplest way I can think of doing this is keeping a per-process count
:of how many buffers were scheduled for async IO and allowing as many
:async ops to happen, incrementing the count; as each io completes it
:decrements the count and calls wakeup_one once it reaches 0 again.

    "Eeek".

    First, keep in mind that it is not possible to guarantee write ordering
    after the fact even if we had an interface for it.  There are too many
    other subsystems which might flush a buffer out of order - the page
    daemon, update daemon, buf daemon, and clustering code, for example.
    Once the data has been thrown into a filesystem buffer, the game is over.
    (See the last paragraph for more on this.)

    You can guarantee write ordering in only one place:  when you actually
    issue the write.  It should be possible to extend the AIO mechanism to
    handle the necessary synchronous and fsync cases by adding new opcodes,
    and we can certainly create an AIO call for fsync2, e.g. aio_fsync2(),
    to handle notification.  This would run on top of the fsync2() system
    call and VOP_FSYNC2() filesystem API.  We can add a link pointer
    dependency to the aiocb to guarantee commit ordering, or even to allow
    multiple aiocbs to be issued in a single system call (and run
    sequentially).  You then issue multiple aio's chained together with
    dependencies and wait for the last one to complete, then wait for the
    previous ones to complete (which will not block, since you know they
    have already run once the last one returns).
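
    To illustrate what "ordering at the point of issue" means with calls
    that exist today, here is a minimal synchronous sketch.  It does not
    use the proposed fsync2(), aio_fsync2() or VOP_FSYNC2() interfaces,
    which are only ideas at this point; it uses plain pwrite() and fsync().
    The names struct sync_range and ordered_commit() are made up for the
    example.  Each range is forced to stable storage before the next write
    is issued, so nothing can reorder them afterwards.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical descriptor for one range to commit (not from the mail). */
struct sync_range {
    const void *data;   /* bytes to write */
    size_t      len;    /* number of bytes */
    off_t       offset; /* file offset to write them at */
};

/*
 * Write each range in array order and fsync() before issuing the next
 * one.  Ordering is enforced by when the writes are issued, not by the
 * buffer cache.  Returns 0 on success, -1 on the first failure.
 */
int
ordered_commit(int fd, const struct sync_range *v, int n)
{
    for (int i = 0; i < n; i++) {
        const char *p = v[i].data;
        size_t left = v[i].len;
        off_t off = v[i].offset;

        /* pwrite() may write less than requested; loop until done. */
        while (left > 0) {
            ssize_t w = pwrite(fd, p, left, off);
            if (w <= 0)
                return -1;
            p += w;
            left -= (size_t)w;
            off += w;
        }

        /* Do not issue range i+1 until range i is on stable storage. */
        if (fsync(fd) < 0)
            return -1;
    }
    return 0;
}
```

    This costs one full fsync() per range and blocks the caller, which is
    the overhead a ranged, asynchronous interface would be meant to avoid.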

    What we do not want to do is to create a whole new kernel notification
    mechanism *just* for fsync, nor do we want to pollute the argument space
    *just* to avoid making multiple system calls.

:I think there are enough fields in the struct buf to support this unordered;
:I'm not sure it will be possible to do this if the application wants
:FIFO async fsync.

    We aren't going to mess with struct buf.  The goal is to simplify
    struct buf, not complexify it.  Dealing with ordering dependencies
    properly is difficult at best - look at softupdates for example.  The
    chance of getting it right and not introducing new deadlock or runaway
    bugs in anything under a couple of months is low.  We would be letting
    ourselves in for a world of hurt.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>


:What do you think?
:
:-Alfred


