Date:      Wed, 28 May 2003 08:58:11 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Igor Sysoev <is@rambler-co.ru>
Cc:        arch@freebsd.org
Subject:   Re: sendfile(2) SF_NOPUSH flag proposal
Message-ID:  <3ED4DC93.42A44D09@mindspring.com>
References:  <Pine.BSF.4.21.0305281324220.50420-100000@is>

Igor Sysoev wrote:
> On Tue, 27 May 2003, Terry Lambert wrote:
> > NOTE: TCP_NOPUSH *specifically* mentions writev(2), which, like
> > sendfile(2), takes data from multiple discrete buffers and sends
> > it.
> 
> I agree with you, but writev() takes data from memory while
> sendfile() can read it from a disk - that is one cause of the partially
> filled packets in the middle of the file stream.  TF_NOPUSH (the internal
> TCP_NOPUSH representation) can be used to avoid it.

The writev() takes it from memory... and sendfile() takes it
from memory.  The only difference is whether the memory that
is referred to by the mbuf headers is from the program's
address space, and copied into an mbuf in the kernel's address
space, or is an external mbuf referred to by an sf_buf, and in
the kernel's address space because it's in the buffer cache.


> Suppose you have one page in VM and you need to read the next pages
> from a disk.  What would you do?  If you send this single page - it
> will go as 1460, 1460 and 1176.

Only if I stupidly set TCP_NODELAY on the socket, which
I have to go out of my way to do.  If I can't read the next
block off the disk, wire it, and set up an EXT_SFBUF for it
in 2MSL, there's something seriously wrong in the OS.  2MSL
is a *very* long time on modern systems.
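
For reference, the 1460, 1460 and 1176 split quoted above is what
falls out of cutting one page into segments when nothing is held
back for coalescing.  A minimal sketch, assuming a 4096-byte page
and a 1460-byte MSS (a 1500-byte Ethernet MTU less 40 bytes of IP
and TCP headers, no options); segment_sizes() is a hypothetical
helper for illustration, not a kernel function:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Split `len` bytes into segments of at most `mss` bytes, assuming every
 * available byte is sent immediately (nothing is held back to coalesce
 * with later data).  Stores up to `max` segment sizes in `out` and
 * returns the total number of segments.
 */
size_t
segment_sizes(size_t len, size_t mss, size_t *out, size_t max)
{
	size_t n = 0;

	assert(mss > 0);
	while (len > 0) {
		size_t seg = len < mss ? len : mss;

		if (n < max)
			out[n] = seg;
		n++;
		len -= seg;
	}
	return n;
}
```

Calling segment_sizes(4096, 1460, seg, 4) yields three segments,
and the last one is the short 1176-byte packet under discussion.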

The "problem" is the call to:

	error = (*so->so_proto->pr_usrreqs->pru_send)(so, 0, m, 0, 0, td);

in sendfile(2) in uipc_syscalls.c, in the case where it's not
true that:

	(sbspace(&so->so_snd) >= so->so_snd.sb_lowat)

...or, more specifically, that it's effectively sent TCP_NODELAY.
You'll notice that the page is only unwired when the external
mbuf is freed.
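
On the point that TCP_NODELAY has to be asked for: a userland
program can confirm that a new TCP socket starts with the option
clear.  A minimal sketch using only socket(2), setsockopt(2) and
getsockopt(2); nodelay_state() is a hypothetical helper name, and
the result is normalized to 0 or 1 because the raw value that
getsockopt() hands back for a set option can differ between
systems:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>

/*
 * Create a TCP socket, optionally set TCP_NODELAY on it (set == 1),
 * and report the option's state: 1 if set, 0 if clear, -1 on error.
 */
int
nodelay_state(int set)
{
	int s, val = 0, result;
	socklen_t len = sizeof(val);

	assert(set == 0 || set == 1);

	s = socket(AF_INET, SOCK_STREAM, 0);
	if (s < 0)
		return -1;

	if (set) {
		int on = 1;

		if (setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) < 0) {
			close(s);
			return -1;
		}
	}

	if (getsockopt(s, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
		result = -1;
	else
		result = (val != 0);

	close(s);
	return result;
}
```

nodelay_state(0) reports the default for an untouched socket;
nodelay_state(1) shows the explicit step a program has to take to
turn the option on.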

-- Terry
