Date:      Mon, 3 Mar 2003 20:08:59 -0800
From:      Sean Chittenden <sean@chittenden.org>
To:        Terry Lambert <tlambert2@mindspring.com>
Cc:        Hiten Pandya <hiten@unixdaemons.com>, arch@FreeBSD.ORG
Subject:   Re: Should sendfile() to return ENOBUFS?
Message-ID:  <20030304040859.GB79234@perrin.int.nxad.com>
In-Reply-To: <3E641131.431A0BA8@mindspring.com>
References:  <20030303224418.GU79234@perrin.int.nxad.com> <20030304001230.GC36475@unixdaemons.com> <20030304002218.GY79234@perrin.int.nxad.com> <3E641131.431A0BA8@mindspring.com>


> sendfile:
>
>      When using a socket marked for non-blocking I/O, sendfile() may
>      send fewer bytes than requested.  In this case, the number of
>      bytes successfully written is returned in *sbytes (if
>      specified), and the error EAGAIN is returned.
>
> This seems to indicate several things:
>
> 1)	The correct error is EAGAIN, *not* ENOBUFS

EAGAIN/EWOULDBLOCK, I'm inclined to agree...

> 2)	You need to be damn sure you can guarantee a correct update
> 	of *sbytes; I believe this is very difficult in the case in
> 	question, which is why it blocks

I'm not convinced of this.  Have you poked through
src/sys/kern/uipc_syscalls.c?  It's not that ugly/hard, nothing's
impossible with a bit of refactoring.

> 3)	If sbytes is NULL, you should probably block, even on a
> 	non-blocking call.  The reason for this is that there is
> 	no way for the application to restart without *sbytes

This degrades terribly, though, and if you get a spike in traffic, how
performance degrades is critical.  Going from a non-blocking
application to a blocking call simply because of high load is
murderous, and is in itself enough justification for me to move away
from the really nice zero-copy sockets that sendfile() affords me,
back to the sluggish writev() syscall.  If a system is busy, the
process is stuck in the sfbufa state and the server is blocked from
servicing thousands of connections.  The symptoms are the same as
those of mbuf exhaustion or any other kind of buffer exhaustion...  my
point is that having this block is the worst way that sendfile() can
degrade under high load.

> 4)	If you get rid of the blocking with (sbytes == NULL), you
> 	better add a BUGS section to the manual page.

There's nothing that says that sbytes can't be set to 0 if errno is
EAGAIN; in fact, that's what it does right now.
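For illustration, here is a minimal sketch of how an application could
restart after EAGAIN by using the value returned in *sbytes, including
the case where it comes back as 0.  Everything in it is an assumption
made for the example: sendfile_fn is a hypothetical stand-in that
mirrors the argument order of sendfile(2) minus hdtr and flags, and
struct xfer, xfer_step() and fake_sendfile() are invented names.  The
fake exists only so that the logic can be exercised without a socket.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in: same argument order as sendfile(2), minus the
 * hdtr and flags arguments, so the sketch builds without a real socket. */
typedef int (*sendfile_fn)(int fd, int s, off_t offset, size_t nbytes,
                           off_t *sbytes);

/* Per-connection progress; the names are illustrative only. */
struct xfer {
    int    fd;        /* file being sent */
    int    sock;      /* non-blocking socket */
    off_t  offset;    /* next byte of the file to send */
    size_t remaining; /* bytes still to send */
};

/*
 * Make one attempt.  Returns 1 when the transfer is complete, 0 when the
 * caller should wait for writability and call again, and -1 on any error
 * other than EAGAIN (errno is left as the send function set it).
 */
int xfer_step(struct xfer *x, sendfile_fn sf)
{
    off_t sent = 0;
    int rc;

    assert(x != NULL && sf != NULL);
    if (x->remaining == 0)
        return 1;

    rc = sf(x->fd, x->sock, x->offset, x->remaining, &sent);
    if (rc == -1 && errno != EAGAIN)
        return -1;

    /* On success and on EAGAIN alike, sent says how far we got; 0 is fine. */
    x->offset += sent;
    x->remaining -= (size_t)sent;
    return x->remaining == 0 ? 1 : 0;
}

/* Test double: accepts at most fake_room bytes, then reports EAGAIN. */
size_t fake_room;

int fake_sendfile(int fd, int s, off_t offset, size_t nbytes, off_t *sbytes)
{
    size_t n = nbytes < fake_room ? nbytes : fake_room;

    (void)fd; (void)s; (void)offset;
    fake_room -= n;
    if (sbytes != NULL)
        *sbytes = (off_t)n;
    if (n < nbytes) {
        errno = EAGAIN;
        return -1;
    }
    return 0;
}
```

An event loop would call xfer_step() whenever the socket is reported
writable and keep the struct xfer around between calls.  A caller that
passes NULL for sbytes has nothing to add to offset, which is the
restart problem raised in point 3.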

> Frankly I'm really surprised that you are blocking in this place; it
> indicates an inability to get a page in the kernel map in the sf
> zone, which, in turn, indicates that your NSFBUFS is improperly
> tuned; if you are using sendfile, and tune up your other kernel
> parameters for your system, don't forget NSFBUFS.

Well, it's set to 65535 at the moment.  How much higher do you think I
should set it?  :-] At some point I have to say, "it's high enough and
I just need to get the application to degrade gracefully."  :-]

> While you could *technically* make sf_buf_alloc() non-blocking, in
> general this would be a bad idea, given that the one place it's
> called is in an interior loop that can be the subject of a "goto"
> (so it's an embedded interior loop) in sendfile() itself.  I think
> it would be very hard to satisfy #2, to allow it to be restartable
> by the application, in the face of failure, and since *sbytes is not
> a mandatory parameter, likely your application will end up barfing
> (e.g. sending partial FTP files or HTML documents down, with no way
> to recover from a failure, other than closing the client socket, and
> hoping the client can recover).

Frankly, if a developer is stupid enough to pass in NULL for sbytes,
they get what they deserve.  Returning -1 and setting errno to EAGAIN
in the event that there aren't any sf_bufs available isn't what I'd
call the programming exercise of the decade.  :-P
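To make the shape of that change concrete, here is a toy user-space
model of the idea.  It is not the code in uipc_syscalls.c and it does
not show how sf_buf_alloc() works; struct pool, pool_get_nowait(),
pool_put() and queue_pages() are all hypothetical names, and POOL_SIZE
is an arbitrary small number.  The only point is that a get which fails
instead of sleeping lets the caller report EAGAIN along with how far it
got.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define POOL_SIZE 4 /* toy value; unrelated to any real NSFBUFS setting */

struct buf {
    struct buf *next;
};

struct pool {
    struct buf  slots[POOL_SIZE];
    struct buf *free_list;
};

/* Put every slot on the free list. */
void pool_init(struct pool *p)
{
    size_t i;

    assert(p != NULL);
    p->free_list = NULL;
    for (i = 0; i < POOL_SIZE; i++) {
        p->slots[i].next = p->free_list;
        p->free_list = &p->slots[i];
    }
}

/* Non-blocking get: returns NULL on exhaustion instead of sleeping. */
struct buf *pool_get_nowait(struct pool *p)
{
    struct buf *b = p->free_list;

    if (b != NULL)
        p->free_list = b->next;
    return b;
}

/* Return a buffer to the pool. */
void pool_put(struct pool *p, struct buf *b)
{
    b->next = p->free_list;
    p->free_list = b;
}

/*
 * Try to take one buffer per page.  On exhaustion, return -1 with errno
 * set to EAGAIN; *queued always reports how many buffers were obtained,
 * so the caller knows where to resume.
 */
int queue_pages(struct pool *p, size_t npages, struct buf **held,
                size_t *queued)
{
    size_t i;

    assert(p != NULL && held != NULL && queued != NULL);
    *queued = 0;
    for (i = 0; i < npages; i++) {
        struct buf *b = pool_get_nowait(p);

        if (b == NULL) {
            errno = EAGAIN;
            return -1;
        }
        held[(*queued)++] = b;
    }
    return 0;
}
```

In this model the count in *queued plays the role that *sbytes plays
for sendfile(): the partial progress is reported together with the
error, so a non-blocking caller can retry later from that point.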

> In a "flash crowd" case on an HTTP server, this basically means that
> you will continuously get retries, and the situation will worsen,
> exponentially, as people retry getting the same page.  In the FTP
> case, or some other protocol without automatic retry on session
> abandonment, of course, it will be fatal.

Hrm, let me redefine "fatal" as "changing the behavior of a system
call to go from returning in less than 0.001ms, to returning in 2-15s
for every connection when trying to make over ~500K sendfile(2) calls
a second."  I'd call that a catastrophic failure to degrade
gracefully.  -sc

-- 
Sean Chittenden

