Date:      Wed, 4 Jul 2007 15:25:20 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Gary Palmer <gpalmer@freebsd.org>
Cc:        freebsd-fs@freebsd.org, Pawel Jakub Dawidek <pjd@freebsd.org>
Subject:   Re: FFS writes to read-only mount
Message-ID:  <20070704133335.I16373@besplex.bde.org>
In-Reply-To: <20070704014205.GA53564@in-addr.com>
References:  <45F776AE.8090702@nokia.com> <20070314161041.GI7847@garage.freebsd.pl> <45F8EE27.6070208@nokia.com> <20070315090031.GB80993@deviant.kiev.zoral.com.ua> <4689EFAA.4080009@nokia.com> <20070704102358.W95084@delplex.bde.org> <20070704014205.GA53564@in-addr.com>

On Tue, 3 Jul 2007, Gary Palmer wrote:

> On Wed, Jul 04, 2007 at 11:08:36AM +1000, Bruce Evans wrote:

>> In some non-current versions of FreeBSD, I have debugging code in
>> ffs_update() that complains about attempts to update inodes on read-only
>> file systems.  Such attempts certainly occur, due to historical mistakes.
>> They are supposed to be handled (starting sometime in 4.x) by silently
>> ignoring the problem and clearing the IN_MODIFIED flag and related flags
>> so that the update is not retried later.  I don't know of any cases where
>> this doesn't work.
>
> Does silently clearing that flag mean data could be lost?  Or are these
> just async metadata updates and all the file content is properly
> flushed prior to the FS going RO?

I think at most some timestamps were lost, and then maybe only for the
short time while transitioning from rw to ro.  Timestamps related only
to that transition period _should_ be lost, since it isn't worth
restarting the transition to pick up changes to timestamps alone.  Now
it looks like the hack in ufs_itimes() to write out timestamps related
to before the transition (but not yet finalized) never worked and has
been lost.  Maybe I just don't understand the code and everything now
works without hacks.  I think what should happen for MNT_UPDATE is:

o first call vn_start_write().  Hopefully this prevents all writes from
   userland during the transition.  Writes from the kernel must be permitted
   so as to sync old writes from userland.
o sync all old writes using something like a synchronous ffs_sync(), but
   more forceful so as not to forget syncing IN_LAZYMOD inodes.
o set MNT_RDONLY in the vnode for the mount point.  I think this alone was
   supposed to prevent writes from userland.  It works poorly for this since
   it also prevents some writes from the kernel.  E.g., in ufs_itimes() it
   now prevents ufs_itimes() changing anything, so if timestamps haven't
   already been finalized and flushed then there is a bug.  Some old versions
   of ufs_timestamp() starting with ufs_vnops.c 1.182 handled this problem
   badly by setting IN_MODIFIED before checking any readonly flag, but I
   think this did less than nothing since these versions proceeded to check
   MNT_RDONLY and make null changes to the timestamps if that flag is set;
   thus they broke assertions about no writes to read-only file systems
   without actually syncing old timestamps.
o for ffs, set fs_ronly in the superblock to prevent all writes via the
   file system.  ffs_update() checks this, and this is supposed to permit
   the kernel to update timestamps between the setting of MNT_RDONLY and
   the setting of fs_ronly, but this never worked right.
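
The ordering above can be sketched as a small user-space toy model.  This
is not kernel code: every toy_* name, the struct layouts and the flag value
are hypothetical stand-ins invented for illustration, and the real
vn_start_write(), ffs_sync() and ffs_update() are considerably more
involved.  The sketch only shows why the sync has to finish before either
read-only flag is set: once fs_ronly is set, the update path drops the
dirty flag without writing anything.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value; not the real kernel definition. */
#define TOY_IN_MODIFIED 0x1

struct toy_mount {
	bool userland_writes_blocked;	/* stands in for the effect of vn_start_write() */
	bool mnt_rdonly;		/* stands in for MNT_RDONLY */
	bool fs_ronly;			/* stands in for fs_ronly in the superblock */
	int  disk_writes;		/* counts inode writes that reached "disk" */
};

struct toy_inode {
	int flags;
};

/*
 * Toy update path: a dirty inode is written only while fs_ronly is clear.
 * On a read-only file system the dirty flag is cleared without writing,
 * so that the update is not retried later.  Returns 1 if a write happened.
 */
static int
toy_update(struct toy_mount *mp, struct toy_inode *ip)
{
	if ((ip->flags & TOY_IN_MODIFIED) == 0)
		return (0);
	ip->flags &= ~TOY_IN_MODIFIED;
	if (mp->fs_ronly)
		return (0);		/* silently ignored */
	mp->disk_writes++;
	return (1);
}

/* Toy userland write path: refused during and after the transition. */
static bool
toy_user_write(struct toy_mount *mp, struct toy_inode *ip)
{
	if (mp->userland_writes_blocked || mp->mnt_rdonly)
		return (false);
	ip->flags |= TOY_IN_MODIFIED;
	return (true);
}

/* Toy rw -> ro transition, in the order of the list above. */
static void
toy_remount_ro(struct toy_mount *mp, struct toy_inode *inodes, int n)
{
	int i;

	mp->userland_writes_blocked = true;	/* 1: stop new userland writes */
	for (i = 0; i < n; i++)			/* 2: sync old writes */
		(void)toy_update(mp, &inodes[i]);
	for (i = 0; i < n; i++)			/* nothing may remain dirty */
		assert((inodes[i].flags & TOY_IN_MODIFIED) == 0);
	mp->mnt_rdonly = true;			/* 3: mount-level flag */
	mp->fs_ronly = true;			/* 4: file-system-level flag */
	mp->userland_writes_blocked = false;	/* mnt_rdonly now refuses writes */
}
```

In this model, swapping steps 2 and 4 makes toy_update() clear the dirty
flags without counting a single disk write, which corresponds to the lost
timestamps discussed above.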

There are related problems with IN_LAZYMOD and IN_LAZYACCESS.  IN_LAZYMOD
inodes are only fully synced by going through ufs_reclaim() (for ffs).
I think this doesn't happen early enough (if at all) to work for the
rw -> ro transition (it works for unmount()).  This problem is moot
in -current since IN_LAZYMOD is only used for cdevs and there are no
cdevs on ffs.  (I also use it for atime updates but don't test it much
since I also use -noatime for almost all file systems).  Problems with
IN_LAZYACCESS are similar, but are more likely to be all fixed since
they are serious if they occur.  Writes of even atimes by the kernel
must be prevented while taking snapshots.  This is handled by delaying
the atime updates; all other writes are supposed to be prevented by
something like vn_start_write().
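
A toy illustration of the IN_LAZYMOD problem, again in user space with
made-up names (the lz_* functions, struct lazy_inode and the LZ_* flag
values are all hypothetical): a sync pass that only looks for the ordinary
dirty flag never writes an inode that is marked only as lazily modified.
A more forceful pass first promotes the lazy flag to the ordinary one, in
the way I assume the reclaim path does, and then syncs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; not the real kernel definitions. */
#define LZ_IN_MODIFIED	0x1
#define LZ_IN_LAZYMOD	0x2

struct lazy_inode {
	int flags;
	int writes;	/* how many times this inode was written out */
};

/* Plain sync: writes only inodes carrying the ordinary dirty flag. */
static int
lz_sync(struct lazy_inode *ip, size_t n)
{
	size_t i;
	int written = 0;

	for (i = 0; i < n; i++) {
		if ((ip[i].flags & LZ_IN_MODIFIED) == 0)
			continue;	/* lazy-only inodes are skipped here */
		ip[i].flags &= ~LZ_IN_MODIFIED;
		ip[i].writes++;
		written++;
	}
	return (written);
}

/* Forceful sync: promote lazy modifications first, then sync as usual. */
static int
lz_sync_forceful(struct lazy_inode *ip, size_t n)
{
	size_t i;
	int written;

	for (i = 0; i < n; i++) {
		if (ip[i].flags & LZ_IN_LAZYMOD) {
			ip[i].flags &= ~LZ_IN_LAZYMOD;
			ip[i].flags |= LZ_IN_MODIFIED;
		}
	}
	written = lz_sync(ip, n);
	for (i = 0; i < n; i++)		/* no dirty state of either kind remains */
		assert(ip[i].flags == 0);
	return (written);
}
```

With one ordinarily dirty inode and one lazy-only inode, lz_sync() writes
only the first and leaves the second still flagged; lz_sync_forceful()
then writes the second.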

Concerning fsck not working after "mount -u -o ro /": fsck generally
doesn't work on mounted file systems, even if the mount is ro.  It
works for the root file system after plain mount only because
ffs_mountfs() has an extra g_access() call to make it work.  This
call is missing for mount -u.

Bruce
