Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 25 Aug 2014 20:27:55 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Mateusz Guzik <mjguzik@gmail.com>, freebsd-hackers@freebsd.org
Subject:   Re: atomic_load_acq_int in sequential_heuristic
Message-ID:  <20140825172755.GD2737@kib.kiev.ua>
In-Reply-To: <20140825130433.GD14344@dft-labs.eu>
References:  <20140824115729.GC2045@dft-labs.eu> <20140824162331.GW2737@kib.kiev.ua> <20140824164236.GX2737@kib.kiev.ua> <20140825005659.GA14344@dft-labs.eu> <20140825073404.GZ2737@kib.kiev.ua> <20140825081526.GB14344@dft-labs.eu> <20140825083539.GB2737@kib.kiev.ua> <20140825091056.GC14344@dft-labs.eu> <20140825111000.GC2737@kib.kiev.ua> <20140825130433.GD14344@dft-labs.eu>

next in thread | previous in thread | raw e-mail | index | archive | help

--Y3cq2BYpkEb43po+
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Aug 25, 2014 at 03:04:33PM +0200, Mateusz Guzik wrote:
> On Mon, Aug 25, 2014 at 02:10:01PM +0300, Konstantin Belousov wrote:
> > On Mon, Aug 25, 2014 at 11:10:56AM +0200, Mateusz Guzik wrote:
> > > On Mon, Aug 25, 2014 at 11:35:39AM +0300, Konstantin Belousov wrote:
> > > > > +			atomic_set_int(&fp->f_flag, FHASLOCK);
> > > > You misspelled FRDAHEAD as FHASLOCK, below as well.
> > > > Was this tested ?
> > > >=20
> > >=20
> > > Oops, damn copy-pasto. Sorry.
> > >=20
> > > > > +			VOP_UNLOCK(vp, 0);
> > > > >  		} else {
> > > > > -			do {
> > > > > -				new =3D old =3D fp->f_flag;
> > > > > -				new &=3D ~FRDAHEAD;
> > > > > -			} while (!atomic_cmpset_rel_int(&fp->f_flag, old, new));
> > > > > +			atomic_clear_int(&fp->f_flag, FHASLOCK);
> > > > So what about extending the vnode lock to cover the flag reset ?
> > > >=20
> > >=20
> > > Sure.
> > >=20
> > > So this time I tested it properly and found out it is impossible to
> > > disable the hint. The test is:
> > >=20
> > > -1 is truncated and then read from intptr_t which yields a big positi=
ve
> > > number instead. Other users in the function use int tmp to work around
> > > this issue.
> > Could you provide me with the test case which demonstrates the problem ?
> >=20
>=20
> Nothing special:
> https://people.freebsd.org/~mjg/patches/F_READAHEAD.c
And how did you verified that fcntl(F_READAHEAD, -1) did not worked ?
I ended up with adding printf() to kern_fcntl() to see arg value.

>=20
> > The fcntl(2) prototype in sys/fcntl.h is variadic, so int arg argument
> > is not promoted.  On the other hand, syscalls.master declares arg as lo=
ng.
> > Did you tried to pass -1L as third argument to disable ?
> >=20
>=20
> Yes, -1L deals with the problem. I would still argue that using 'tmp'
> like the rest of the function would not hurt as a cheap solution.
This would deliberately break the current ABI (which takes the argument
as long for F_READAHEAD), which is not acceptable.

I do think that there is bug in the "-1" stuff, but it is in compat32
shims. The compat/freebsd32/syscalls.master does not provide the compat
for fcntl(2), which means that 32bit fcntl(2) does not work when either
signed extension is needed (the F_READAHEAD case), or on the big-endian
machines.  On i386, it did not practically matter before F_READAHEAD,
since x86 is little-endian and flags passed as arg did not touch the
sign bit.

Note that fcntl(2) man page is wrong, it claims that optional argument
arg is int.  It cannot be true since pointer on LP64 platform cannot
fit into int.  The SUSv4 is explicit in describing which command
takes which type; our man page must be fixed, but this is for later.

See the patch at the end of the reply for the fix.  It needs sysent
regen for actual build.

>=20
> > >=20
> > > diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c
> > > index 7abdca0..774f647 100644
> > > --- a/sys/kern/kern_descrip.c
> > > +++ b/sys/kern/kern_descrip.c
> > > @@ -760,7 +760,8 @@ kern_fcntl(struct thread *td, int fd, int cmd, in=
tptr_t arg)
> > >  			error =3D EBADF;
> > >  			break;
> > >  		}
> > > -		if (arg >=3D 0) {
> > > +		tmp =3D arg;
> > > +		if (tmp >=3D 0) {
> > >  			vp =3D fp->f_vnode;
> > >  			error =3D vn_lock(vp, LK_SHARED);
> > >  			if (error !=3D 0) {
> > > @@ -769,7 +770,7 @@ kern_fcntl(struct thread *td, int fd, int cmd, in=
tptr_t arg)
> > >  			}
> > >  			bsize =3D fp->f_vnode->v_mount->mnt_stat.f_iosize;
> > >  			VOP_UNLOCK(vp, 0);
> > > -			fp->f_seqcount =3D (arg + bsize - 1) / bsize;
> > > +			fp->f_seqcount =3D (tmp + bsize - 1) / bsize;
> > >  			do {
> > >  				new =3D old =3D fp->f_flag;
> > >  				new |=3D FRDAHEAD;
> > >=20
> > > Then the patch in question:
> > >=20
> > > diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c
> > > index 774f647..4efadb0 100644
> > > --- a/sys/kern/kern_descrip.c
> > > +++ b/sys/kern/kern_descrip.c
> > > @@ -476,7 +476,6 @@ kern_fcntl(struct thread *td, int fd, int cmd, in=
tptr_t arg)
> > >  	struct vnode *vp;
> > >  	cap_rights_t rights;
> > >  	int error, flg, tmp;
> > > -	u_int old, new;
> > >  	uint64_t bsize;
> > >  	off_t foffset;
> > > =20
> > > @@ -760,27 +759,25 @@ kern_fcntl(struct thread *td, int fd, int cmd, =
intptr_t arg)
> > >  			error =3D EBADF;
> > >  			break;
> > >  		}
> > > +		vp =3D fp->f_vnode;
> > > +		/*
> > > +		 * Exclusive lock synchronizes against
> > > +		 * sequential_heuristic().
> > I would also add a sentence that we care about f_seqcount update in
> > seq_heur().
> >=20
>=20
> /*
>  * Exclusive lock synchronizes against f_seqcount reads and writes in
>  * sequential_heuristic().
>  */
>=20
> > Another place to add the locking annotation is the struct file in
> > sys/file.h.  Now f_seqcount is 'protected' by the vnode lock.
> > I am not sure how to express the locking model shortly.
> >=20
>=20
> /*
>  * (a) f_vnode lock required (shared allows both reads and writes)
>  */
Ok.

diff --git a/sys/compat/freebsd32/freebsd32_misc.c b/sys/compat/freebsd32/f=
reebsd32_misc.c
index 815a9b7..fb8736c 100644
--- a/sys/compat/freebsd32/freebsd32_misc.c
+++ b/sys/compat/freebsd32/freebsd32_misc.c
@@ -2980,3 +2980,28 @@ freebsd32_procctl(struct thread *td, struct freebsd3=
2_procctl_args *uap)
 	return (kern_procctl(td, uap->idtype, PAIR32TO64(id_t, uap->id),
 	    uap->com, data));
 }
+
+int
+freebsd32_fcntl(struct thread *td, struct freebsd32_fcntl_args *uap)
+{
+	intptr_t tmp;
+
+	switch (uap->cmd) {
+	/*
+	 * Do unsigned conversion for arg when operation
+	 * interprets it as flags or pointer.
+	 */
+	case F_SETLK_REMOTE:
+	case F_SETLKW:
+	case F_SETLK:
+	case F_GETLK:
+	case F_SETFD:
+	case F_SETFL:
+		tmp =3D (unsigned int)(uap->arg);
+		break;
+	default:
+		tmp =3D uap->arg;
+		break;
+	}
+	return (kern_fcntl(td, uap->fd, uap->cmd, tmp));
+}
diff --git a/sys/compat/freebsd32/syscalls.master b/sys/compat/freebsd32/sy=
scalls.master
index 3339690..161f69d 100644
--- a/sys/compat/freebsd32/syscalls.master
+++ b/sys/compat/freebsd32/syscalls.master
@@ -200,7 +200,8 @@
 89	AUE_GETDTABLESIZE	NOPROTO	{ int getdtablesize(void); }
 90	AUE_DUP2	NOPROTO	{ int dup2(u_int from, u_int to); }
 91	AUE_NULL	UNIMPL	getdopt
-92	AUE_FCNTL	NOPROTO	{ int fcntl(int fd, int cmd, long arg); }
+92	AUE_FCNTL	STD	{ int freebsd32_fcntl(int fd, int cmd, \
+				    int arg); }
 93	AUE_SELECT	STD	{ int freebsd32_select(int nd, fd_set *in, \
 				    fd_set *ou, fd_set *ex, \
 				    struct timeval32 *tv); }

--Y3cq2BYpkEb43po+
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBAgAGBQJT+3IbAAoJEJDCuSvBvK1BdzQP/jEbCtmdwVzlWQBLJtWRkAhn
C7idTvbhGdRWi1a8G1yEa4fNnxm7z5KGl+WRW+mO7npj8B7w/taX0pCt97J/TCIA
9iJI6qNuzylKXezuiGBrTx78pysmQ/6Y2ozQa3rGwIN4c03802Z+v5bz7uFbKFCP
9THL4Jzi7l8sNicGoG7upd0/gYEOYHg9sNuEgS7204Mn3NzRd5CQqouStm77kAea
byx4IWoJrSB66bVSqb8bZcgsUn0xdM0iUcHLQQrF5lEyL6B52K7t+BC5Y5abVA78
XbUjDV1jlzjxxXLidx2ZyABjriK8FGFDBpjXquFjS1YCoz+wxb009pikmRgQP2Wf
BBkibiZI/kkSS5rnqKJL2c75CJOOf7ONd/Ye0zvPVPtZbFgF9vmwGrOUwQCKhn+Q
OWuQCIvMKDShL9GncrSjHP4cBHzHmxZOozUb8ysO30gWV+4tihPHKakMw9ssKnbv
OqAP+qK1oPznhWlIXiciD3oRw/2SpIHErWDJUwXtMhFZvAuWwLZzntUYE5kUjTkD
plVRi8gAzAazDVZYDhEa7ycP27P/9PvEzw5URpuxT6ABdzfK9vKpFEFoqiAqcXuE
WdBG6cJQTzROrV0cDwfgowmW23FYhdijiHZdv/DccTbKIu5oMrS53bTXZne/ODhC
k6nfxbtxS8DHD/KMJcIO
=fHHK
-----END PGP SIGNATURE-----

--Y3cq2BYpkEb43po+--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140825172755.GD2737>