Date:      Fri, 07 Sep 2007 15:03:22 +0200
From:      Jean-Sébastien Pédron <dumbbell@freebsd.org>
To:        freebsd-arch@freebsd.org
Subject:   Pipe direct write and pipeselwakeup()
Message-ID:  <46E14C1A.1060606@freebsd.org>

This is a multi-part message in MIME format.
--------------060407030105010800010608
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

I'm investigating a problem where select/poll/kevent is not triggered
when data is written to a pipe. Below I explain what I have understood
and, at the end of this mail, I propose a patch. I would like feedback
on this solution.

The problem comes from the way pipes are implemented. The kernel has
two ways to write data to a pipe:

    o  buffered write. This is used when there are fewer than 8192 bytes
       (PIPE_MINDIRECT) in the _current_ iov. Data from _all_ iovs is
       uiomove()'d to an internal buffer until there's no more data or
       the buffer is full.

    o  direct write. This is used when there are at least 8192 bytes in
       the current iov.
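
The choice between the two paths can be sketched as a small predicate.
This is a simplified user-space model, not the kernel code: the enum and
the function pipe_write_mode() are hypothetical names used only for
illustration, and only the 8192-byte threshold (PIPE_MINDIRECT) is taken
from the description above. The real kernel may check further conditions.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Threshold quoted above; redefined here so the sketch stands alone. */
#ifndef PIPE_MINDIRECT
#define PIPE_MINDIRECT 8192
#endif

/* Hypothetical names, used only for this illustration. */
enum pipe_write_mode { PIPE_WRITE_BUFFERED, PIPE_WRITE_DIRECT };

/*
 * Pick the write path from the length of the current iov only.
 * The lengths of the other iovs in the same writev(2) call are
 * deliberately not looked at.
 */
static enum pipe_write_mode
pipe_write_mode(const struct iovec *cur)
{
    assert(cur != NULL);
    return (cur->iov_len >= PIPE_MINDIRECT ?
        PIPE_WRITE_DIRECT : PIPE_WRITE_BUFFERED);
}
```

With this model, an iov of 8191 bytes takes the buffered path and an iov
of 8192 bytes takes the direct path.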

The two techniques can't be mixed. So if the kernel needs to switch from
one to the other during a single call to writev(2), it must wake up
reader processes and select/poll/kqueue before the write continues.
But in some of these switches, including the one from buffered write to
direct write hit by the testcase below, the kernel only wakes up reader
processes, not select/poll/kqueue.

Someone provided me with a testcase to reproduce the bug. I attached the
sources to this mail ("rd.c" and "wr.c"). Use it like this:
    ./rd ./wr
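
For readers who cannot easily decode the attachments, here is a
condensed, poll(2)-only sketch of what the two programs do together. It
is not the attached code: poll_after_writev() is a hypothetical helper,
the iov sizes (1 or 2, 4096 and 69630 bytes) are assumed from "wr.c"
with a 4096-byte page, and the kevent and select variants of "rd.c" are
left out.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sizes assumed from the attached "wr.c" with a 4096-byte page. */
#define MID_LEN  4096
#define LAST_LEN 69630  /* 17 * 4096 - 2 */

/*
 * Fork a writer that sends first_len + MID_LEN + LAST_LEN bytes with a
 * single writev(2), then poll(2) the read end of the pipe.
 * Returns poll()'s result: 1 if the pipe is reported readable,
 * 0 on timeout, -1 on error.
 */
static int
poll_after_writev(size_t first_len, int timeout_ms)
{
    static char buf[2 + MID_LEN + LAST_LEN];
    struct iovec iov[3];
    struct pollfd pfd;
    pid_t pid;
    int fds[2], res;

    assert(first_len == 1 || first_len == 2);
    if (pipe(fds) < 0)
        return (-1);
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return (-1);
    }
    if (pid == 0) {
        /* Writer: wait until the parent is sleeping in poll(). */
        close(fds[0]);
        sleep(1);
        iov[0].iov_base = buf;
        iov[0].iov_len = first_len;
        iov[1].iov_base = buf + first_len;
        iov[1].iov_len = MID_LEN;
        iov[2].iov_base = buf + first_len + MID_LEN;
        iov[2].iov_len = LAST_LEN;
        if (writev(fds[1], iov, 3) < 0)
            _exit(1);
        _exit(0);
    }
    /* Reader: only wait for the readable event, do not read. */
    close(fds[1]);
    pfd.fd = fds[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    res = poll(&pfd, 1, timeout_ms);
    /* Closing the read end unblocks a writer stuck in writev(). */
    close(fds[0]);
    (void)waitpid(pid, NULL, 0);
    return (res);
}
```

Calling poll_after_writev(1, 10000) corresponds to the variant that works;
poll_after_writev(2, 10000) corresponds to TRIGGER_WRITEV_BUG and, on an
affected kernel, would be expected to return 0 after the timeout.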

Here is what's going on with this testcase:
    1.  the first iov is smaller than 8192 bytes (1 or 2 bytes), so
        buffered write is selected.
    2.  the kernel internal buffer is 65536 bytes long, so uiomove()
        fills it completely from the 73727 or 73728 bytes being written.
        Afterwards, 8191 or 8192 bytes remain, depending on
        TRIGGER_WRITEV_BUG in "wr.c".
    3a. with 8191 bytes remaining, buffered write is still selected but
        the buffer is full: readers and selects are awakened.
        Everything's fine.
    3b. with 8192 bytes remaining, direct write is selected. It sees
        that the internal buffer is in use: readers are awakened (so the
        buffer can be flushed) but selects are not. Here, the
        select/poll/kevent times out.
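
The arithmetic behind steps 2, 3a and 3b can be checked with a few lines
of C. This is only a worked example of the numbers quoted above: the
helper names are hypothetical, and it assumes the internal buffer starts
empty and that all remaining bytes sit in the current iov, as in the
testcase.

```c
#include <assert.h>
#include <stddef.h>

#define PIPE_BUFFER_SIZE 65536  /* internal buffer size from step 2 */
#ifndef PIPE_MINDIRECT
#define PIPE_MINDIRECT   8192   /* direct write threshold */
#endif

/*
 * Bytes still to be written once the buffered path has filled the
 * (initially empty) internal buffer.
 */
static size_t
bytes_left_after_fill(size_t total)
{
    assert(total >= PIPE_BUFFER_SIZE);
    return (total - PIPE_BUFFER_SIZE);
}

/* Nonzero if the remainder is large enough to select direct write. */
static int
remainder_selects_direct(size_t total)
{
    return (bytes_left_after_fill(total) >= PIPE_MINDIRECT);
}
```

For a total of 73727 bytes the remainder is 8191 and the buffered path is
kept (case 3a); for 73728 bytes the remainder is 8192 and the direct path
is selected (case 3b).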

There are 3 cases where only readers are awakened. The attached patch
adds calls to pipeselwakeup() in them. This fixes the testcase, but I'd
like to know if there was a good reason not to call pipeselwakeup() in
these 3 specific cases?

Also, in the third case, the PIPE_WANTW flag isn't set either. I think
it should be set too. What do you think?

Thanks for any feedback!

- --
Jean-Sébastien Pédron
http://www.dumbbell.fr/

PGP Key: http://www.dumbbell.fr/pgp/pubkey.asc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG4UwZa+xGJsFYOlMRAqKBAJwLx+9WoQmPs4pa8VEPOzT2b5r3VQCfarLY
giS8UUEYvNuUQGBqtJ4jhJU=
=Rato
-----END PGP SIGNATURE-----

--------------060407030105010800010608
Content-Type: text/plain;
 name="rd.c"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
 filename="rd.c"

Ci8qICJnY2MgLVdhbGwgLW8gcmQgcmQuYyIgKi8KLyogImdjYyAtRF9SRUVOVFJBTlQgLVdh
bGwgLW8gcmQgcmQuYyAtbGNfciIgKi8KI2luY2x1ZGUgPGZjbnRsLmg+CiNpbmNsdWRlIDxz
eXMvdHlwZXMuaD4KI2luY2x1ZGUgPHN5cy9ldmVudC5oPgojaW5jbHVkZSA8c3lzL3RpbWUu
aD4KI2luY2x1ZGUgPHN5cy91aW8uaD4KI2luY2x1ZGUgPHN0ZGlvLmg+CiNpbmNsdWRlIDxz
dGRsaWIuaD4KI2luY2x1ZGUgPHVuaXN0ZC5oPgojaW5jbHVkZSA8c3RyaW5nLmg+CiNpbmNs
dWRlIDxlcnJuby5oPgojaW5jbHVkZSA8cG9sbC5oPgojaW5jbHVkZSA8c3lzL3NlbGVjdC5o
PgoKI2lmIDEKI2RlZmluZSBVU0VfS1EKI2VsaWYgMQojZGVmaW5lIFVTRV9QT0xMCiNlbmRp
ZgoKI2RlZmluZSBFVl9USU1FT1VUIDEwCgppbnQKbWFpbihpbnQgYXJnYywgY2hhciAqYXJn
dltdKQp7CiAgICBpbnQgZmRzWzJdOwogICAgaW50IHJlczsKCiAgICBpZiAoYXJnYyAhPSAy
KSB7CglmcHJpbnRmKHN0ZGVyciwgIlVzYWdlOiByZCA8cGF0aCB0byB3cml0ZXI+XG4iKTsK
CWV4aXQoMSk7CiAgICB9CgogICAgaWYgKHBpcGUoZmRzKSA8IDApIHsKCXBlcnJvcigicGlw
ZSgpIGZhaWxlZCIpOwoJZXhpdCgxKTsKICAgIH0KCiAgICBmcHJpbnRmKHN0ZGVyciwgInBp
cGUgZmRzPXslZCwgJWR9XG4iLCBmZHNbMF0sIGZkc1sxXSk7CgogICAgZnByaW50ZihzdGRl
cnIsICJzZXR0aW5nICVkIGluIG5vbi1ibG9ja2luZyBtb2RlXG4iLCBmZHNbMF0pOwogICAg
ZmNudGwoZmRzWzBdLCBGX1NFVEZMLCBmY250bChmZHNbMF0sIEZfR0VURkwsIDApIHwgT19O
T05CTE9DSyk7CgogICAgcmVzID0gZm9yaygpOwogICAgaWYgKHJlcyA8IDApIHsKCXBlcnJv
cigiZm9yaygpIGZhaWxlZFxuIik7CglleGl0KDEpOwogICAgfQogICAgZWxzZSBpZiAocmVz
ID09IDApIHsKCWNsb3NlKDEpOwoJZHVwKGZkc1sxXSk7CgljbG9zZShmZHNbMF0pOwoJY2xv
c2UoZmRzWzFdKTsKCWV4ZWNsKGFyZ3ZbMV0sIGFyZ3ZbMV0sIE5VTEwpOwoJcGVycm9yKCJl
eGVjbCgpIGZhaWxlZFxuIik7CglleGl0KDEpOwogICAgfQoKICAgIGNsb3NlKGZkc1sxXSk7
CgojaWZkZWYgVVNFX0tRCiAgICB7CglpbnQga3E7CglpbnQgaTsKCXN0cnVjdCBrZXZlbnQg
ZXZbMTBdOwoJc3RydWN0IHRpbWVzcGVjIHR2ID0gezAsIDB9OwoJc3RydWN0IGtldmVudCBj
aGdbMV07CgoJa3EgPSBrcXVldWUoKTsKCWlmIChrcSA8IDApIHsKCSAgICBwZXJyb3IoImtx
dWV1ZSgpIGZhaWxlZCIpOwoJICAgIGV4aXQoMSk7Cgl9CgkKCWZwcmludGYoc3RkZXJyLCAi
c2V0dGluZyBFVkZJTFRfUkVBRCBvbiAlZFxuIiwgZmRzWzBdKTsKCUVWX1NFVCgmY2hnWzBd
LCBmZHNbMF0sIEVWRklMVF9SRUFELCBFVl9BREQsIDAsIDAsICh2b2lkICopIDIpOwoJcmVz
ID0ga2V2ZW50KGtxLCAmY2hnWzBdLCAxLCBOVUxMLCAwLCAmdHYpOwoJaWYgKHJlcyA8IDAp
IHsKCSAgICBwZXJyb3IoImtldmVudCgpIGZhaWxlZFxuIik7CgkgICAgZXhpdCgxKTsKCX0K
CWVsc2UgewoJICAgIGZwcmludGYoc3RkZXJyLCAia2V2ZW50KCkgcmV0dXJuZWQgPSAlZFxu
IiwgcmVzKTsKCX0KCXR2LnR2X3NlYyA9IEVWX1RJTUVPVVQ7Cgl0di50dl9uc2VjID0gMDsK
CWZwcmludGYoc3RkZXJyLCAia2V2ZW50IHdhaXRpbmcgZm9yICVkIHNlY3MgZm9yIGV2ZW50
cy4uLlxuIiwKCQlFVl9USU1FT1VUKTsKCXJlcyA9IGtldmVudChrcSwgTlVMTCwgMCwgJmV2
WzBdLCAxMCwgJnR2KTsKCWlmIChyZXMgPCAwKSB7CgkgICAgcGVycm9yKCJrZXZlbnQgZmFp
bGVkXG4iKTsKCSAgICBleGl0KDEpOwoJfQoJZWxzZSBpZiAocmVzID09IDApIHsKCSAgICBm
cHJpbnRmKHN0ZGVyciwgImtldmVudCB0aW1lZCBvdXRcbiIpOwoJICAgIGV4aXQoMSk7Cgl9
CglmcHJpbnRmKHN0ZGVyciwgImtldmVudCByZXR1cm5lZCA9ICVkXG4iLCByZXMpOwoKCWZv
ciAoaSA9IDA7IGkgPCAxMCAmJiBpIDwgcmVzOyBpKyspIHsKCSAgICBmcHJpbnRmKHN0ZGVy
ciwgInJlc3VsdCBldmVudCAlZDogZmQ9JWQ6ICIsIGksIChpbnQpZXZbaV0uaWRlbnQpOwoJ
ICAgIGlmIChldltpXS5mbGFncyAmIEVWX0VSUk9SKSB7CgkJZnByaW50ZihzdGRlcnIsICJF
Vl9FUlJPUjogJXMgIiwgc3RyZXJyb3IoZXZbaV0uZGF0YSkpOwoJICAgIH0KCSAgICBlbHNl
IHsKCQlpZiAoZXZbaV0uZmlsdGVyID09IEVWRklMVF9SRUFEKQoJCSAgICBmcHJpbnRmKHN0
ZGVyciwgIkVWRklMVF9SRUFEICIpOwoJCWlmIChldltpXS5maWx0ZXIgPT0gRVZGSUxUX1dS
SVRFKQoJCSAgICBmcHJpbnRmKHN0ZGVyciwgIkVWRklMVF9XUklURSAiKTsKCSAgICB9Cgkg
ICAgZnByaW50ZihzdGRlcnIsICJcbiIpOwoJfQogICAgfQoKI2VsaWYgZGVmaW5lZChVU0Vf
UE9MTCkKICAgIHsKCXN0cnVjdCBwb2xsZmQgcGZkc1sxXTsKCQoJcGZkc1swXS5mZCA9IGZk
c1swXTsKCXBmZHNbMF0uZXZlbnRzID0gKFBPTExJTnxQT0xMUkROT1JNKTsKCXBmZHNbMF0u
cmV2ZW50cyA9IDA7CgoJZnByaW50ZihzdGRlcnIsICJwZmRzWzBdLmZkID0gJWQgcGZkc1sw
XS5ldmVudHMgPSBQT0xMSU58UE9MTFJETk9STVxuIiwKCQlwZmRzWzBdLmZkKTsKCSAgICAK
CglmcHJpbnRmKHN0ZGVyciwgInBvbGwgd2FpdGluZyBmb3IgJWQgc2VjcyBmb3IgZXZlbnRz
Li4uXG4iLAoJCUVWX1RJTUVPVVQpOwoJcmVzID0gcG9sbCgmcGZkc1swXSwgMSwgRVZfVElN
RU9VVCoxMDAwKTsKCWlmIChyZXMgPCAwKSB7CgkgICAgcGVycm9yKCJwb2xsIGZhaWxlZFxu
Iik7CgkgICAgZXhpdCgxKTsKCX0KCWVsc2UgaWYgKHJlcyA9PSAwKSB7CgkgICAgZnByaW50
ZihzdGRlcnIsICJwb2xsIHRpbWVkIG91dFxuIik7CgkgICAgZXhpdCgxKTsKCX0KCWZwcmlu
dGYoc3RkZXJyLCAicG9sbCByZXR1cm5lZCA9ICVkXG4iLCByZXMpOwoJZnByaW50ZihzdGRl
cnIsICJmZD0lZCAiLCBwZmRzWzBdLmZkKTsKCWlmIChwZmRzWzBdLnJldmVudHMgJiBQT0xM
SU4pCgkgICAgZnByaW50ZihzdGRlcnIsICJQT0xMSU4gIik7CglpZiAocGZkc1swXS5yZXZl
bnRzICYgUE9MTFJETk9STSkKCSAgICBmcHJpbnRmKHN0ZGVyciwgIlBPTExSRE5PUk0gIik7
CglpZiAocGZkc1swXS5yZXZlbnRzICYgUE9MTE9VVCkKCSAgICBmcHJpbnRmKHN0ZGVyciwg
IlBPTExPVVQgIik7CglmcHJpbnRmKHN0ZGVyciwgIlxuIik7CiAgICB9CiNlbHNlIC8qIHVz
ZSBzZWxlY3QgKi8KICAgIHsKCWZkX3NldCByZWFkZmRzOwoJc3RydWN0IHRpbWV2YWwgc3R2
ID0ge0VWX1RJTUVPVVQsIDB9OwoJRkRfWkVSTygmcmVhZGZkcyk7CglGRF9TRVQoZmRzWzBd
LCAmcmVhZGZkcyk7CgoJZnByaW50ZihzdGRlcnIsICJzZWxlY3RpbmcgZmQgPSAlZFxuIiwg
ZmRzWzBdKTsKCSAgICAKCglmcHJpbnRmKHN0ZGVyciwgInNlbGVjdCB3YWl0aW5nIGZvciAl
ZCBzZWNzIGZvciBldmVudHMuLi5cbiIsCgkJRVZfVElNRU9VVCk7CglyZXMgPSBzZWxlY3Qo
ZmRzWzBdKzEsICZyZWFkZmRzLCBOVUxMLCBOVUxMLCAmc3R2KTsKCWlmIChyZXMgPCAwKSB7
CgkgICAgcGVycm9yKCJzZWxlY3QgZmFpbGVkXG4iKTsKCSAgICBleGl0KDEpOwoJfQoJZWxz
ZSBpZiAocmVzID09IDApIHsKCSAgICBmcHJpbnRmKHN0ZGVyciwgInNlbGVjdCB0aW1lZCBv
dXRcbiIpOwoJICAgIGV4aXQoMSk7Cgl9CglmcHJpbnRmKHN0ZGVyciwgInNlbGVjdCByZXR1
cm5lZCA9ICVkXG4iLCByZXMpOwoJZnByaW50ZihzdGRlcnIsICJmZD0lZCAiLCBmZHNbMF0p
OwoJaWYgKEZEX0lTU0VUKGZkc1swXSwgJnJlYWRmZHMpKQoJICAgIGZwcmludGYoc3RkZXJy
LCAiZmQgaXMgc2V0ICIpOwoJZnByaW50ZihzdGRlcnIsICJcbiIpOwogICAgfQojZW5kaWYK
ICAgIGNsb3NlKGZkc1swXSk7CiAgICByZXR1cm4gMDsKfQoKCgoKCgo=
--------------060407030105010800010608
Content-Type: text/plain;
 name="wr.c"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
 filename="wr.c"

Ci8qICJnY2MgLVdhbGwgLW8gd3Igd3IuYyIgYnVnZ3kgKi8KLyogImdjYyAtRF9SRUVOVFJB
TlQgLVdhbGwgLW8gd3Igd3IuYyAtbGNfciIgbm90IGJ1Z2d5ISAqLwoKCiNkZWZpbmUgVFJJ
R0dFUl9XUklURVZfQlVHIDEKCgojaW5jbHVkZSA8c3lzL3R5cGVzLmg+CiNpbmNsdWRlIDxz
eXMvdWlvLmg+CiNpbmNsdWRlIDx1bmlzdGQuaD4KI2luY2x1ZGUgPHN0ZGlvLmg+CgojaWZu
ZGVmIFBBR0VfU0laRQojZGVmaW5lIFBBR0VfU0laRSA0MDk2CiNlbmRpZgoKLyogVGhlIGZv
bGxvd2luZyAicGlwZSBkZWZpbmVzIiBjdXQgZnJvbSBzeXMvcGlwZS5oICovCgojaWZuZGVm
IFBJUEVfU0laRQojZGVmaW5lIFBJUEVfU0laRSAgICAgICAxNjM4NAojZW5kaWYKCiNpZm5k
ZWYgQklHX1BJUEVfU0laRQojZGVmaW5lIEJJR19QSVBFX1NJWkUgICAoNjQqMTAyNCkKI2Vu
ZGlmCgojaWZuZGVmIFNNQUxMX1BJUEVfU0laRQojZGVmaW5lIFNNQUxMX1BJUEVfU0laRSBQ
QUdFX1NJWkUKI2VuZGlmCgojaWZuZGVmIFBJUEVfTUlORElSRUNUCiNkZWZpbmUgUElQRV9N
SU5ESVJFQ1QgIDgxOTIKI2VuZGlmCgojaWZuZGVmIFBJUEVOUEFHRVMKI2RlZmluZSBQSVBF
TlBBR0VTICAgICAgKEJJR19QSVBFX1NJWkUgLyBQQUdFX1NJWkUgKyAxKQojZW5kaWYKCiNp
ZiBUUklHR0VSX1dSSVRFVl9CVUcKI2RlZmluZSBCVUYwX1NaIDIKI2Vsc2UKI2RlZmluZSBC
VUYwX1NaIDEKI2VuZGlmCiNkZWZpbmUgQlVGMV9TWiBQQUdFX1NJWkUKI2RlZmluZSBCVUYy
X1NaIChQSVBFTlBBR0VTICogUEFHRV9TSVpFIC0gMikKCnN0YXRpYyBjaGFyIGJ1ZltCVUYw
X1NaICsgQlVGMV9TWiArIEJVRjJfU1pdOwoKCmludCBtYWluKHZvaWQpCnsKICAgIGludCB3
cjsKICAgIHN0cnVjdCBpb3ZlYyBpb3ZbM107CgogICAgc2xlZXAoMSk7CgogICAgZnByaW50
ZihzdGRlcnIsICJQSVBFTlBBR0VTPSVkXG5QQUdFX1NJWkU9JWRcbiIsCgkgICAgUElQRU5Q
QUdFUywgUEFHRV9TSVpFKTsKICAgIGZwcmludGYoc3RkZXJyLCAiQlVGMF9TWj0lZCwgQlVG
MV9TWj0lZCwgQlVGMl9TWj0lZFxuIiwKCSAgICBCVUYwX1NaLCBCVUYxX1NaLCBCVUYyX1Na
KTsKCiAgICBpb3ZbMF0uaW92X2Jhc2UgPQojaWYgQlVGMF9TWiA9PSAwCglOVUxMCiNlbHNl
CgkmYnVmWzBdCiNlbmRpZgoJOwogICAgaW92WzBdLmlvdl9sZW4gPSBCVUYwX1NaOwoKCiAg
ICBpb3ZbMV0uaW92X2Jhc2UgPQojaWYgQlVGMV9TWiA9PSAwCglOVUxMCiNlbHNlCgkmYnVm
W0JVRjBfU1pdCiNlbmRpZgoJOwogICAgaW92WzFdLmlvdl9sZW4gPSBCVUYxX1NaOwoKCiAg
ICBpb3ZbMl0uaW92X2Jhc2UgPQojaWYgQlVGMl9TWiA9PSAwCglOVUxMCiNlbHNlCgkmYnVm
W0JVRjBfU1orQlVGMV9TWl0KI2VuZGlmCgk7CiAgICBpb3ZbMl0uaW92X2xlbiA9IEJVRjJf
U1o7CgkJCiAgICB3ciA9IHdyaXRldigxLCAmaW92WzBdLCAzKTsKICAgIGZwcmludGYoc3Rk
ZXJyLCAid3JpdGUgcmV0dXJuZWQ6ICVkXG4iLCB3cik7CiAgICByZXR1cm4gMDsKfQoKCgoK

--------------060407030105010800010608
Content-Type: text/plain;
	name="sys-kern-sys_pipe.c-pipeselwakeup_with_directwrite-a.patch"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
	filename*0="sys-kern-sys_pipe.c-pipeselwakeup_with_directwrite-a.patch"

SW5kZXg6IHN5cy9rZXJuL3N5c19waXBlLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQpSQ1MgZmlsZTogL2hv
bWUvZHVtYmJlbGwvcHJvamVjdHMvZnJlZWJzZC9jdnMtbWlycm9yL3NyYy9zeXMva2Vybi9z
eXNfcGlwZS5jLHYKcmV0cmlldmluZyByZXZpc2lvbiAxLjE5MQpkaWZmIC11IC1yMS4xOTEg
c3lzX3BpcGUuYwotLS0gc3lzL2tlcm4vc3lzX3BpcGUuYwkyNyBNYXkgMjAwNyAxNzozMzox
MCAtMDAwMAkxLjE5MQorKysgc3lzL2tlcm4vc3lzX3BpcGUuYwk2IFNlcCAyMDA3IDEzOjI4
OjIxIC0wMDAwCkBAIC04ODEsNiArODgxLDcgQEAKIAkJCXdha2V1cCh3cGlwZSk7CiAJCX0K
IAkJd3BpcGUtPnBpcGVfc3RhdGUgfD0gUElQRV9XQU5UVzsKKwkJcGlwZXNlbHdha2V1cCh3
cGlwZSk7CiAJCXBpcGV1bmxvY2sod3BpcGUpOwogCQllcnJvciA9IG1zbGVlcCh3cGlwZSwg
UElQRV9NVFgod3BpcGUpLAogCQkgICAgUFJJQklPIHwgUENBVENILCAicGlwZHd3IiwgMCk7
CkBAIC04OTYsNiArODk3LDcgQEAKIAkJCXdha2V1cCh3cGlwZSk7CiAJCX0KIAkJd3BpcGUt
PnBpcGVfc3RhdGUgfD0gUElQRV9XQU5UVzsKKwkJcGlwZXNlbHdha2V1cCh3cGlwZSk7CiAJ
CXBpcGV1bmxvY2sod3BpcGUpOwogCQllcnJvciA9IG1zbGVlcCh3cGlwZSwgUElQRV9NVFgo
d3BpcGUpLAogCQkgICAgUFJJQklPIHwgUENBVENILCAicGlwZHdjIiwgMCk7CkBAIC0xMDgw
LDYgKzEwODIsNyBAQAogCQkJCXdwaXBlLT5waXBlX3N0YXRlICY9IH5QSVBFX1dBTlRSOwog
CQkJCXdha2V1cCh3cGlwZSk7CiAJCQl9CisJCQlwaXBlc2Vsd2FrZXVwKHdwaXBlKTsKIAkJ
CXBpcGV1bmxvY2sod3BpcGUpOwogCQkJZXJyb3IgPSBtc2xlZXAod3BpcGUsIFBJUEVfTVRY
KHJwaXBlKSwgUFJJQklPIHwgUENBVENILAogCQkJICAgICJwaXBid3ciLCAwKTsK
--------------060407030105010800010608--


