Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 7 May 2007 14:14:42 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Scot Hetzel <swhetzel@gmail.com>
Cc:        emulation@freebsd.org
Subject:   Re: linuxolator: LTP lseek03 failure
Message-ID:  <20070507124848.S47783@besplex.bde.org>
In-Reply-To: <790a9fff0705061005v1a1a883ehc155bac7f747c3eb@mail.gmail.com>
References:  <790a9fff0705021345j2ad9ae98o56aaf357d556fe27@mail.gmail.com>  <790a9fff0705040004oab16ed8q1a1c476386379ea9@mail.gmail.com>  <20070504190007.Y37951@besplex.bde.org> <790a9fff0705040819u24e4c2f0s5c9fc34b93770e13@mail.gmail.com> <790a9fff0705061005v1a1a883ehc155bac7f747c3eb@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 6 May 2007, Scot Hetzel wrote:

> On 5/4/07, Scot Hetzel <swhetzel@gmail.com> wrote:

>> Would the best fix be to change the native lseek to return EINVAL when
>> fo_ioctl returns ENOTTY?

Probably.

> I had a look at the OpenSolaris implementation of lseek and found that
> we are returning the wrong error code for the ENOTTY case.  When
> ENOTTY is returned by fo_ioctl, we should be checking for the
> following cases:
>
> SEEK_DATA - Is offset past end of file
> SEEK_HOLE - Return virtual hole at end of file, if offset is valid
>
> on error, SEEK_DATA and SEEK_HOLE should be returning ENXIO for these
> cases.  OpenSolaris also checks offset > OFF_MAX, and returns
> EOVERFLOW.
>
> When the lseek03 test is run, with these changes, it returns with ENXIO.

Shouldn't it return with EOVERFLOW even for zfs?  I guess the test doesn't
check for that, so the result should be EINVAL for non-zfs and maybe
ENXIO for zfs if the test didn't set up right for the seeks to work
(presumably it sets the offset to something reasonable so as not to get
spurious errors from args other than `whence', but it expects i/o to fail
so it doesn't set up for i/o.

> I sent the attached patch for kern/vfs_syscalls.c to pjd for review.

The case of negative offsets also seems to be broken.  POSIX requires
returning EINVAL for negative offsets for files of type regular, block
special and some others.  We do this near the end of the the function
in the noneg case, but:
- block special files have rotted to just character special variants.
   Seeking to negative offsets on disk files is useless, so the noneg
   case should apply to cdevs that are disks like it must apply to
   bdevs that were disks.
- the SEEK_DATA/SEEK_HOLE ioctl runs before the check for negative offsets
   (other cases fall through to the check before doing anything that will
   break the check).  zfs_ioctl() does bogus bounds checking via zfs_holey():
   - "off_t offset" has become "offset_t off" (offset_t == off_t so there
     is no type error yet)
   - "off" is assigned to uint64_t noff.  This may overflow in theory, but
     doesn't in practive since off_t is int64_t.  uoff_t (spelled u_offset_t
     in the opensolaris compat layer) should be used to avoid this logic
     error.
   - the bounds of noff are checked.  If offset was negative, then noff is
     large unsugned so ENXIO is returned.  I can't see anywhere that
     translates this to EOVERFLOW.

% Index: vfs_syscalls.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/kern/vfs_syscalls.c,v
% retrieving revision 1.438
% diff -u -r1.438 vfs_syscalls.c
% --- vfs_syscalls.c	4 May 2007 14:23:28 -0000	1.438
% +++ vfs_syscalls.c	6 May 2007 15:37:39 -0000
% @@ -1755,9 +1755,36 @@
%  		break;
%  	case SEEK_DATA:
%  		error = fo_ioctl(fp, FIOSEEKDATA, &offset, cred, td);
% +		if (error == ENOTTY) {
% +			/*
% +			 * The ioctl is not supported. Is the offset
% +			 * past the end of the file.
% +			 */
% +			error = VOP_GETATTR(vp, &vattr, fp->f_cred, td);
% +			if (error == 0 &&
% +			   (offset >= (uoff_t)vattr.va_size))
% +				error = ENXIO;
% +		}

I think this should just return EINVAL (or later, when all file systems
are supposed to support SEEK_DATA, maybe EOPNOTSUPP).  Detecting an error
only at the end of a file (and still returning the impossible error
ENOTTY otherwise) is not useful.

% +		if (noneg && (offset > OFF_MAX))
% +			error = EOVERFLOW;

This shouldn't happen.  offset has type off_t, and OFF_MAX is the maximum
value of an off_t.  I think EOVERFLOW is only the appropriate error if
the final offset would be unrepresentable, as can happen for relative
seeks, but here the offset is either garbage or the same as the origina;
offset, and if SEEK_DATA is actually implemented then the final offset
could only be > OFF_MAX if the file system implements files with
unreachable offsets.  Strictly, `offset' is garbage on error, so we
shouldn't check it.

The case where the original offset is < 0 still falls through to the
"offset < 0" check later, but that check is only done if errno == 0
(correctly, since the offset is garbage on error), so the above still
returns ENOTTY for negative offsets.

%  		break;
%  	case SEEK_HOLE:
%  		error = fo_ioctl(fp, FIOSEEKHOLE, &offset, cred, td);
% +		if (error == ENOTTY) {
% +			/*
% +			 * The ioctl is not supported. Return virtual
% +			 * hole at end of file, if offset is valid.
% +			 */
% +			error = VOP_GETATTR(vp, &vattr, fp->f_cred, td);
% +			if (error == 0) {
% +				if (uap->offset < (off_t)vattr.va_size)

False positive if uap->offset < 0.

% +					offset = (uoff_t)vattr.va_size;
% +				else
% +					error = ENXIO;
% +			}
% +		}

Skipping over possible holes to reach EOF is even less useful than detecting
some errors since it has wrong sematics.  However, this case has no effect,
since error remains set at ENOTTY, so the rest of the function does nothing
except clean up and return ENOTTY.

% +		if (noneg && (offset > OFF_MAX))
% +			error = EOVERFLOW;
%  		break;

As above.

%  	default:
%  		error = EINVAL;

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20070507124848.S47783>