Date:      Sun, 22 Mar 2015 11:25:07 -0700 (PDT)
From:      Don Lewis <truckman@FreeBSD.org>
To:        kostikbel@gmail.com
Cc:        src-committers@FreeBSD.org, jilles@stack.nl, svn-src-all@FreeBSD.org, delphij@FreeBSD.org, brde@optusnet.com.au, svn-src-head@FreeBSD.org
Subject:   Re: svn commit: r280308 - head/sys/fs/devfs
Message-ID:  <201503221825.t2MIP7jv096531@gw.catspoiler.org>
In-Reply-To: <20150322162507.GD2379@kib.kiev.ua>

On 22 Mar, Konstantin Belousov wrote:
> On Sun, Mar 22, 2015 at 02:37:09PM +0100, Jilles Tjoelker wrote:
>> On Sat, Mar 21, 2015 at 08:49:00PM +1100, Bruce Evans wrote:
>> > On Sat, 21 Mar 2015, Xin LI wrote:
>> 
>> > > Log:
>> > >  Disable timestamping on devfs read/write operations by default.
>> 
>> > >  Currently we update timestamps unconditionally when doing read or
>> > >  write operations.  This may slow things down on hardware where
>> > >  reading timestamps is expensive (e.g. HPET, because the default
>> > >  vfs.timestamp_precision setting is nanosecond now) with limited
>> > >  benefit.
>> 
>> > >  A new sysctl variable, vfs.devfs.dotimes, is added, which can be
>> > >  set to a non-zero value when the old behavior is desirable.
>> 
>> > I don't like this.  It defaults to non-POSIX-conformant behaviour...
>> 
>> > The slowness is mostly from no delayed update of times in devfs.
>> > Switching vfs.timestamp_precision to a hardware timecounter would
>> > have been even more expensive for regular files if file systems
>> > didn't have delayed updates.  The assumption that vfs_timestamp()
>> > doesn't use a slow timecounter was so often satisfied that no one
>> > missed devfs also not supporting mounting with -noatime.
>> 
>> > Delayed updates are even easier to implement for devfs than for disk
>> > file systems, since the times never need to be written to disk.  A slow update
>> > is still wasteful for atimes, but not as bad as for disk file systems
>> > since it doesn't trigger a slower sync to disk.
>> 
>> Yes, I think implementing delayed updates is the right solution to this
>> problem. This way, only stat and last close will need to read the clock.
>> No configuration option will be needed.
>> 
>> A subtle difference with most other file systems is that devfs nodes
>> often stay open for very long, so the timestamps will usually come from
>> stat() calls, which may be much later than the actual read or write.
>> Still that is better than not updating timestamps at all.
>> 
>> > > ...
>> > > Modified: head/sys/fs/devfs/devfs_vnops.c
>> > > ==============================================================================
>> > > --- head/sys/fs/devfs/devfs_vnops.c	Sat Mar 21 00:21:30 2015	(r280307)
>> > > +++ head/sys/fs/devfs/devfs_vnops.c	Sat Mar 21 01:14:11 2015	(r280308)
>> > > ...
>> > > @@ -1700,7 +1708,8 @@ devfs_write_f(struct file *fp, struct ui
>> > > 	resid = uio->uio_resid;
>> > >
>> > > 	error = dsw->d_write(dev, uio, ioflag);
>> > > -	if (uio->uio_resid != resid || (error == 0 && resid != 0)) {
>> > > +	if (devfs_dotimes &&
>> > > +	    (uio->uio_resid != resid || (error == 0 && resid != 0))) {
>> > > 		vfs_timestamp(&dev->si_ctime);
>> > > 		dev->si_mtime = dev->si_ctime;
>> > > 	}
>> 
>> > An old bug is evident in the diff.  Writing shouldn't change the ctime.
>> 
>> That is not a bug. POSIX unambiguously requires write() to update both
>> mtime and ctime.
> Does POSIX ever say anything about special files?
> 
> Devfs already has non-POSIX behaviour, e.g. write on one mount is
> reflected as mtime/ctime update on all mounts. On reboot, the time
> stamps are re-created, i.e. changes are not persistent.
> 
> I think the deviations may be summarized as 'devfs mtime is useless for
> the usual mtime purposes'.  From this PoV, I have no objections to the
> patch.  Doing extra work with delayed updates of times, which might
> be somewhat non-trivial, is a feature with non-obvious gain.

It's not totally worthless.  I think the mtime on tty devices is used to
calculate the idle time that is printed by the w command.  We just don't
need nanosecond accuracy for that.
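
For reference, the delayed-update scheme that Bruce and Jilles describe
above can be sketched in userland C roughly as follows.  This is only an
illustration under stated assumptions, not devfs code: struct node_times,
node_mark_read(), node_mark_write(), node_flush_times() and the
clock_reads counter are hypothetical names, and timespec_get() stands in
for the kernel's vfs_timestamp().  The read and write paths only set a
flag; the clock is read when the times are actually needed (stat or last
close).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a device node; NOT the real struct cdev. */
struct node_times {
	struct timespec atime;
	struct timespec mtime;
	struct timespec ctime;
	bool atime_dirty;	/* a read happened since the last flush */
	bool mtime_dirty;	/* a write happened since the last flush */
};

/* Counts clock reads so the saving is observable (illustration only). */
static int clock_reads;

/* Stand-in for vfs_timestamp(); assumed to be the expensive step. */
static void
read_clock(struct timespec *ts)
{
	clock_reads++;
	timespec_get(ts, TIME_UTC);
}

/* Hot paths: record that an update is pending, do not touch the clock. */
static void
node_mark_read(struct node_times *n)
{
	assert(n != NULL);
	n->atime_dirty = true;
}

static void
node_mark_write(struct node_times *n)
{
	assert(n != NULL);
	n->mtime_dirty = true;
}

/* Slow path: would be called from stat and last close. */
static void
node_flush_times(struct node_times *n)
{
	struct timespec now;

	assert(n != NULL);
	if (!n->atime_dirty && !n->mtime_dirty)
		return;			/* nothing pending, no clock read */
	read_clock(&now);
	if (n->atime_dirty)
		n->atime = now;
	if (n->mtime_dirty) {
		/* write() updates both mtime and ctime, per the thread. */
		n->mtime = now;
		n->ctime = now;
	}
	n->atime_dirty = false;
	n->mtime_dirty = false;
}
```

With this shape, any number of writes followed by one flush costs a
single clock read, at the price Jilles mentions: the recorded time is
that of the flush, which may be much later than the I/O itself.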




