Date:      Fri, 5 Jul 2019 19:30:54 +0200
From:      Jilles Tjoelker <jilles@stack.nl>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        "freebsd-current@FreeBSD.org" <freebsd-current@FreeBSD.org>, "kib@freebsd.org" <kib@FreeBSD.org>, Alan Somers <asomers@freebsd.org>
Subject:   Re: should a copy_file_range(2) syscall be interrupted via a signal
Message-ID:  <20190705173054.GA30404@stack.nl>
In-Reply-To: <YTXPR01MB0285E79DFAAE250FD7A7A181DDF50@YTXPR01MB0285.CANPRD01.PROD.OUTLOOK.COM>
References:  <YTXPR01MB0285E79DFAAE250FD7A7A181DDF50@YTXPR01MB0285.CANPRD01.PROD.OUTLOOK.COM>

On Fri, Jul 05, 2019 at 12:28:51AM +0000, Rick Macklem wrote:
> I have been working on a Linux compatible copy_file_range(2) syscall
> (the current code can be found at https://reviews.freebsd.org/D20584).

> One outstanding issue is how it should deal with signals. Right now, I
> have vn_start_write() without PCATCH, so that it won't be interrupted
> by a signal, but I notice that vn_write() (i.e. the write syscall) does
> have PCATCH on vn_start_write() and so does vn_rdwr() when it is
> called without IO_NODELOCKED.

A regular write() is only interruptible when writing to a terminal,
pseudo-terminal master, pipe, socket, or, under certain conditions, a
file on an NFS intr mount. Therefore, applications may not have the code
to resume interrupted writes to regular files gracefully.

> I am thinking that copy_file_range(2) should do this also.
> However, if it returns an error, it is impossible for the caller to
> know how much of the data range got copied.

A regular write() returns partial success if interrupted by a signal
when it has already written something. Therefore, the application can
resume the operation by adjusting pointers and counts.
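
As a rough illustration of "adjusting pointers and counts", the usual
userland loop looks something like the sketch below. The name write_all
is made up for this example; only write(2) itself is a real interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write all `len` bytes of `buf` to `fd`, resuming after short writes.
 * Returns 0 on success, -1 with errno set on failure.
 * (write_all is a hypothetical helper name, not a libc function.)
 */
int
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	assert(buf != NULL || len == 0);
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n == -1) {
			if (errno == EINTR)
				continue;	/* nothing was written; retry as-is */
			return (-1);
		}
		p += n;			/* partial success: skip what was written */
		len -= (size_t)n;
	}
	return (0);
}
```

The loop only works because a partially successful write() reports the
byte count instead of an error.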

Something similar applies to "deterministic" errors like [EFBIG]: the
first call writes as far as possible and, if that is more than zero
bytes, returns successfully; the next attempt then returns the error.
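
A small experiment along these lines, assuming RLIMIT_FSIZE is used to
provoke [EFBIG] and SIGXFSZ is ignored so that write() returns instead
of the process being killed. The function name efbig_demo and the
5-byte limit are arbitrary choices for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Issue two consecutive write() calls on a fresh temporary file while the
 * file size limit is temporarily lowered to 5 bytes. The return values are
 * stored in *first and *second, and errno after the second call in *err.
 * Returns 0 if the experiment could be set up, -1 otherwise.
 * (efbig_demo is a hypothetical name used only for this sketch.)
 */
int
efbig_demo(ssize_t *first, ssize_t *second, int *err)
{
	struct rlimit old, lim;
	void (*oldsig)(int);
	FILE *fp;
	int fd;

	assert(first != NULL && second != NULL && err != NULL);
	if ((fp = tmpfile()) == NULL)
		return (-1);
	fd = fileno(fp);
	if (getrlimit(RLIMIT_FSIZE, &old) == -1) {
		fclose(fp);
		return (-1);
	}
	lim = old;
	lim.rlim_cur = 5;			/* soft limit only */
	oldsig = signal(SIGXFSZ, SIG_IGN);	/* get EFBIG instead of a signal */
	if (setrlimit(RLIMIT_FSIZE, &lim) == -1) {
		signal(SIGXFSZ, oldsig);
		fclose(fp);
		return (-1);
	}

	*first = write(fd, "12345678", 8);	/* crosses the limit */
	*second = write(fd, "678", 3);		/* starts at the limit */
	*err = errno;

	setrlimit(RLIMIT_FSIZE, &old);		/* restore previous state */
	signal(SIGXFSZ, oldsig);
	fclose(fp);
	return (0);
}
```

The expectation is that the first call reports a short count and only
the second one fails with [EFBIG].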

> What do you think the copy_file_range(2) code should do?

I'm not sure it should actually be done, but the need for adjusting
pointers and counts could be avoided with a little extra kernel and libc
code. The system call would receive an additional argument pointing to
an off_t that indicates how many bytes previous calls have already
written. A libc wrapper would initialize this to 0. With this, the
system call can be restarted automatically after a signal.
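
A userland simulation of that idea, purely as a sketch: sim_copy_range
stands in for the system call with the extra off_t pointer, and
copy_range_wrapper for the libc wrapper. Both names and signatures are
invented here and are not what D20584 implements. The retry loop in the
wrapper stands in for the kernel's automatic restart.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * HYPOTHETICAL stand-in for the system call. Copies up to `len` bytes from
 * `infd` at `inoff` to `outfd` at `outoff`. `*done` is the number of bytes
 * that earlier calls have already copied; it is updated as data is written,
 * so the call can be repeated with the same arguments after an error.
 * Returns the total number of bytes copied, or -1 with errno set.
 */
ssize_t
sim_copy_range(int infd, off_t inoff, int outfd, off_t outoff, size_t len,
    off_t *done)
{
	char buf[BUFSIZ];

	assert(done != NULL && *done >= 0);
	while ((size_t)*done < len) {
		size_t want = len - (size_t)*done;
		ssize_t r, w;

		if (want > sizeof(buf))
			want = sizeof(buf);
		r = pread(infd, buf, want, inoff + *done);
		if (r == -1)
			return (-1);	/* *done still records the progress */
		if (r == 0)
			break;		/* end of input file */
		w = pwrite(outfd, buf, (size_t)r, outoff + *done);
		if (w == -1)
			return (-1);
		*done += w;		/* a short write just advances less */
	}
	return ((ssize_t)*done);
}

/*
 * HYPOTHETICAL libc wrapper: initializes the progress counter to 0 and
 * repeats the call with unchanged arguments when it is interrupted.
 */
ssize_t
copy_range_wrapper(int infd, off_t inoff, int outfd, off_t outoff, size_t len)
{
	off_t done = 0;
	ssize_t r;

	do {
		r = sim_copy_range(infd, inoff, outfd, outoff, len, &done);
	} while (r == -1 && errno == EINTR);
	return (r);
}
```

The caller never sees the counter and never has to adjust offsets or
lengths itself.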

In any case, [EINTR] and the internal ERESTART must not be returned
unless it is safe to repeat the call with the same (direct) arguments.

-- 
Jilles Tjoelker
