Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 08 Sep 2010 16:29:56 +0000
From:      "Poul-Henning Kamp" <phk@phk.freebsd.dk>
To:        mdf@FreeBSD.org
Cc:        FreeBSD Arch <freebsd-arch@FreeBSD.org>
Subject:   Re: Extending sbufs with a drain 
Message-ID:  <67049.1283963396@critter.freebsd.dk>
In-Reply-To: Your message of "Wed, 08 Sep 2010 08:43:40 MST." <AANLkTikerOYdEyT6jt7rAN3JP_N_oofq7ZXc4NptA4zP@mail.gmail.com> 

next in thread | previous in thread | raw e-mail | index | archive | help
In message <AANLkTikerOYdEyT6jt7rAN3JP_N_oofq7ZXc4NptA4zP@mail.gmail.com>, mdf@
FreeBSD.org writes:

>Regardless of what sbufs were originally designed to do, this seems
>like a useful enhancement to me that does not take away the ability to
>use them for overflow detection.  An sbuf without a drain marked
>SBUF_FIXED will still perform in this manner.  But now there are more
>options.

I am not saying it is not useful, I am questioning if sbufs is the
right vehicle for this usefulness.

>> For one thing, sbuf is not a kernel facility, it is a general purpose
>> string editing library which is also used outside FreeBSD.   If
>> we add the drain facility it should work in both userland and kernel.
>
>Given that sys/sbuf.h is included in 100+ kernel files and 4
>user-space files, [...]

That is probably more an indication that our userland is largely
unchanged from 1995, whereas as everybody loves to hack up the
kernel :-)  The various people who have told me they adopted sbufs
have been very happy with them in userland.

>As for being used outside FreeBSD, I was unaware of this.  Who is
>using it, and what kinds of code are we pulling from the other
>implementations?

As usual with the BSD license, we have no idea and no way to find out.

One place I do know about is Varnish :-)

My main point here is that you should not treat sbufs as a kernel
facility, but see it as a generally desirable and useful way of
doing safe string operations, and that if we make a major extension
in functionality, it should be done both in userland and kernel.

>> The drain function does not, and should not have to know that sbufs
>> are involved
>
>Why not?  This is an assertion without any proof.

See later in my previous reply:  The contents of sbufs is private,
nobody outside subr_sbuf.c should ever fondle the internals.  See
further below.

>to receive the sbuf pointer as an argument allows for future
>flexibility I haven't even imagined yet.

I'll let Gettys (The programmer, not the general) shoot that canard down:

   3. The only thing worse than generalizing from one example
      is generalizing from no examples at all.

>> int sbuf_drain(void *private, const char *ptr, int len)
>>
>> Return value:
>>  >= 0  # bytes drained
>>        -1      error, errno relevante
>>        -2      error, errno not relevant
>
>This prototype and return scheme does not work in the kernel where
>there is no errno.

Which bit of "errno not relevant" in the description of -2 did
you fail to notice ?  :-)

>Note that in the existing implementation,
>sbuf_extend() will return -1 and not set errno in the case where
>SBUF_AUTOEXTEND is not set.

You were the one to involve errno's in this (patch 0001...) I
don't see them having any relevance, since ENOMEM is the only
one that's realistically relevant.

Realistiscally you could have a drain function that dumps into
a socket (userland or kernel) and it would be nice to report
the ECONNRESET back.

I was never really happy with des@' choice of the name sbuf_overflowed()
I would have prefered sbuf_error() or sbuf_ok().   If we decide to
introduce errnos, this gives us the chance to add those and advocate
their use instead of sbuf_overflowed().

>> Note: The the private argument should be a void*, not an uintptr_t
>
>I seem to recall that casting a pointer to uintptr_t and back is safe

Yes, but why would you want to ?  Normally private arguments are typed
"void *" as they more often than not point to some structure, to
which you can cast a void* directly, whereas uintptr_t requires a
explicit cast.

>> This means that the autoextend can not be implemented in terms of
>> a drain, but that looked plain wrong to me any way.
>
>Implementing autoextend as a drain shows off the flexibility of the
>drain function, among other things.

Uhm no, it does not.

It results in the wrong prototype for the drain function, exposes
the sbuf internals where they should not be fondled, and requires
the addition of two extra functions which are unnecessary with the
simpler prototype I suggested above.

>Line boundaries will never be
>perfect since a line can always exceed whatever static sized buffer is
>used.

Good point, forget about drain-policy then.

The size 2 buffer for char by char requires good documentation and
possibly a distinct #define.

>> sbuf_flush() is unecessary, sbuf_finish() should be used, because
>> that is the only way you can be sure the error code is available
>> for checking afterwards.
>
>With the existing sbuf_finish() implementation, it was unusable.
>Mostly because of the SBUF_FINISHED state change.  There is no reason
>off-hand that I can think of to have the sbuf track whether it's
>really "done"; that is up to the user of the service who should know
>whether all the data has been printed to the sbuf.  More below...

The critical feature of the sbuf API is that you do not need to
check the error status on every call.  That is why UNIX grew the
EPIPE hack: nobody checks the return value of printf().

With sbufs no damage happens, no matter what you do, and you can
(and should!) check the compound operation at the end with
sbuf_overflowed().

For a sbuf with a drain function, sbuf_finish() should flush the
buffer, but it should probably not set the SBUF_FINISHED flag.  That
would also catch confused calls to sbuf_data() with an assert.

What is the semantics of sbuf_len() when you have a drain function ?

Number of undrained bytes ?

>> sbuf_unfinished_data() is just plain wrong and shall not be added, ever.
>
>This was required because of the assert_sbuf_state(SBUF_FINISHED)

No, this was required because you chose the wrong prototype for the
drain function.

>> The internals of sbufs are strictly private and should stay private.
>>
>> Sbuf does not make any promises about how the string is stored
>> internally until you sbuf_finish() the buffer.
>
>sbuf did make this promise.  Since sbuf_data and sbuf_len are public
>accessor functions [...]

Sbuf made no such promise, sbuf_data() and sbuf_len() are there
precisly to isolate the internals.

Amongst the original ideas was to have a multipart buffer to avoid
multiple realloc()/memcpy operations on buffer extends, and only
collapse the buffer at the end when the final size was known.  This
optimization has so far not been necessary.  Your API change would
make it impossible.

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?67049.1283963396>