Date:      Fri, 5 Jul 2019 15:51:49 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Mark Johnston <markj@freebsd.org>
Cc:        "freebsd-current@FreeBSD.org" <freebsd-current@FreeBSD.org>, "kib@freebsd.org" <kib@FreeBSD.org>, Alan Somers <asomers@freebsd.org>
Subject:   Re: should a copy_file_range(2) syscall be interrupted via a signal
Message-ID:  <YTXPR01MB0285BE1F5A0CA1D51DDB75F6DDF50@YTXPR01MB0285.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <20190705143845.GA50901@raichu>
References:  <YTXPR01MB0285E79DFAAE250FD7A7A181DDF50@YTXPR01MB0285.CANPRD01.PROD.OUTLOOK.COM>, <20190705143845.GA50901@raichu>

Mark Johnston wrote:
>On Fri, Jul 05, 2019 at 12:28:51AM +0000, Rick Macklem wrote:
>> Hi,
>>
>> I have been working on a Linux compatible copy_file_range(2) syscall
>> (the current code can be found at https://reviews.freebsd.org/D20584).
>>
>> One outstanding issue is how it should deal with signals.
>> Right now, I have vn_start_write() without PCATCH, so that it won't be
>> interrupted by a signal, but I notice that vn_write() {ie. write syscall} does
>> have PCATCH on vn_start_write() and so does vn_rdwr() when it is called
>> without IO_NODELOCKED.
>>
>> I am thinking that copy_file_range(2) should do this also.
>> However, if it returns an error, it is impossible for the caller to know how much
>> of the data range got copied.
>
>Couldn't copy_file_range() return the number of bytes copied in this
>case?  (The Linux man page notes that short writes are possible.) I
>would expect to see the same error handling that we have in
>dofilewrite(), where certain errnos are squashed.
I think this would be a good approach for local file systems. I believe the
only place where EINTR can be generated is the vn_start_write() call, since
vn_rdwr(IO_NODELOCKED) never returns it and the call completes before
returning.

As such, the EINTR happens at a "well known" place in the copy, and returning
the bytes copied should be fine.
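
To illustrate the rule under discussion, here is a minimal userland sketch of
dofilewrite()-style error squashing: an interruption is reported as a short,
successful copy once some bytes have been transferred. It is not the code in
D20584. The helper name copy_result() and its parameters are invented for
this example, and ERESTART is defined locally when the headers don't provide
it, because on FreeBSD it is a kernel-internal value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ERESTART
#define ERESTART (-1)	/* kernel-internal on FreeBSD; assumed here for the sketch */
#endif

/*
 * Hypothetical helper: decide what a copy that stopped early should report.
 * "error" is the errno-style value from the step that stopped the copy,
 * "requested" is the byte count asked for, "remaining" is what was left
 * uncopied.  Returns the errno to report (0 for success) and stores the
 * number of bytes copied in *copied.
 */
static int
copy_result(int error, size_t requested, size_t remaining, size_t *copied)
{
	assert(remaining <= requested);
	*copied = requested - remaining;

	/* Partial progress plus an interruption becomes a short copy. */
	if (error != 0 && *copied > 0 &&
	    (error == EINTR || error == ERESTART || error == EWOULDBLOCK))
		error = 0;
	return (error);
}
```

With this rule, copy_result(EINTR, 100, 40, &n) reports success with n == 60,
while an EINTR before any byte was copied is still returned as EINTR.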

Now, for NFS, it gets a little weird...
- For NFSv3, many use the "intr" mount option, which means that a VOP_WRITE()
  can return EINTR and the caller doesn't know if the write succeeded on the
  NFS server or not.
  --> Returning "bytes copied" instead of an error for this case doesn't seem
      appropriate to me, since there is no way to know if the last write
      happened.
However, "intr" is not recommended for NFSv4, and NFSv4.2 is the only case
where there is an RPC to do the copy on the server.

Maybe nfs_copy_file_range() shouldn't "hide" EINTR, although the local file
systems do so.

I think this sounds like a good approach.
What do others think?

>> What do you think the copy_file_range(2) code should do?
>
>I'd find it surprising if copy_file_range() isn't interruptible.
I'll admit I haven't tested on Linux, so I don't know what happens there.
The Linux man page doesn't mention EINTR, but I don't know what happens
for a Linux "intr" NFS mount. I do have a Linux system for testing, but it is
the same system I have been using to test this syscall on FreeBSD. Maybe I
need to boot/play around with it.

I do think returning "bytes copied" instead of EINTR is a good idea, where
practical.
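
From the caller's side, a "bytes copied" return means an interrupted copy can
simply be resumed for the remainder. A rough sketch of such a loop follows.
The syscall is abstracted behind a function pointer so the example doesn't
depend on any particular copy_file_range() prototype. The names copy_step_fn,
copy_all, toy_src and toy_step are invented for illustration, and toy_step is
only a stand-in that models short copies.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for one copy_file_range()-like call: returns the
 * bytes copied, 0 at end of input, or -1 on error.
 */
typedef ssize_t (*copy_step_fn)(void *ctx, size_t len);

/*
 * Copy up to "len" bytes, calling "step" again after each short copy.
 * Returns the total copied, or -1 if a step failed before any progress.
 */
static ssize_t
copy_all(copy_step_fn step, void *ctx, size_t len)
{
	size_t done = 0;

	assert(step != NULL);
	while (done < len) {
		ssize_t n = step(ctx, len - done);

		if (n < 0) {
			/* Report the progress made so far, if any. */
			if (done > 0)
				break;
			return (-1);
		}
		if (n == 0)	/* end of input */
			break;
		done += (size_t)n;
	}
	return ((ssize_t)done);
}

/* Toy source for illustration: hands out at most "chunk" bytes per call. */
struct toy_src {
	size_t left;	/* bytes still available */
	size_t chunk;	/* max bytes per call, models a short copy */
};

static ssize_t
toy_step(void *ctx, size_t len)
{
	struct toy_src *s = ctx;
	size_t n = len < s->chunk ? len : s->chunk;

	if (n > s->left)
		n = s->left;
	s->left -= n;
	return ((ssize_t)n);
}
```

If the call returned only EINTR, this loop would have no value of "done" to
resume from, which is the problem raised earlier in the thread.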

Thanks for the comments, rick


