Date:      Fri, 27 Jul 2012 16:52:22 +0800
From:      David Xu <listlog2011@gmail.com>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        Garrett Cooper <yanegomi@gmail.com>, freebsd-bugs@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org
Subject:   Re: kern/170203: [kern] piped dd's don't behave sanely when dealing with a fifo
Message-ID:  <501256C6.5000307@gmail.com>
In-Reply-To: <20120727103622.B933@besplex.bde.org>
References:  <201207262256.q6QMurVf077480@red.freebsd.org> <20120727103622.B933@besplex.bde.org>

On 2012/7/27 10:07, Bruce Evans wrote:
> On Thu, 26 Jul 2012, Garrett Cooper wrote:
>
>>> Description:
>> Creating a fifo and then dd'ing across the fifo using /dev/zero 
>> doesn't seem to yield the behavior one would expect to have; dd 
>> should either exit thanks to SIGPIPE being sent or the count being 
>> completed.
>>
>> Furthermore, the count is bogus:
>>
>> Terminal 1:
>>
>> $ dd if=fifo bs=512k count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.002121 secs (15449523 bytes/sec)
>> $ dd if=fifo bs=512k count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.001483 secs (22096295 bytes/sec)
>> ...
>
> I think it's working almost as expected.  Large blocks give non-atomic
> I/O, so the reader sees small blocks, then EOF when it gets ahead of
> the writer.  This always happens without SMP.
>
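
To make the short-read behaviour concrete, here is a minimal sketch in Python (not dd's code) that counts full and partial records the way dd reports them as "F+P records in". The helper name count_records is hypothetical, and an anonymous pipe stands in for the fifo so the example is self-contained.

```python
import os

def count_records(fd, block_size):
    """Read fd until EOF; return (full, partial, total_bytes)."""
    full = partial = total = 0
    while True:
        chunk = os.read(fd, block_size)  # may return fewer bytes than requested
        if not chunk:                    # empty read: EOF, no writer left
            break
        total += len(chunk)
        if len(chunk) == block_size:
            full += 1                    # a complete block, dd's "F"
        else:
            partial += 1                 # a short read, dd's "P"
    return full, partial, total

if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"\0" * 100)   # 100 bytes available to the reader
    os.close(w)                # writer gone: reader sees EOF after the data
    print(count_records(r, 64))  # (1, 1, 100)
    os.close(r)
```

With bs=512k and a pipe that hands over 8K at a time, every read is short, which matches the "0+4 records in" and 32768 bytes shown in the report.
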
> Not is a bug (debugged below).  There is no SIGPIPE at the start of
> write() because there is a reader then, and no SIGPIPE for the next
> write() because there is no next write() -- the current one doesn't
> notice when the reader goes away.
>
After fixing dd so that it no longer opens a fifo output file in O_RDWR
mode, I still found the writer blocked even though the reader had already
exited. I think this is definitely a bug: if the reader has exited, the
writer should be aborted too, but I found it still blocked in state
"pipdwt". The code in /sys/fs/fifo_vnops.c clearly wants to wake up the
writer when the reader closes the fifo, but it fails because the writer
forgot to set the PIPE_WANTW bit flag. The close path therefore skips
wakeup(), and the writer never gets a chance to see that the EOF bit
flag is set.
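
The missed wakeup can be modelled in a few lines. This is only a toy state machine in Python, not the kernel code: the class ModelPipe, its methods, and the bit values are hypothetical and chosen for the sketch. Only the idea comes from the description above: the closing reader calls wakeup() only if the sleeping writer has set PIPE_WANTW.

```python
PIPE_WANTW = 0x1  # hypothetical bit value for this sketch
PIPE_EOF = 0x2    # hypothetical bit value for this sketch

class ModelPipe:
    def __init__(self):
        self.state = 0
        self.writer_asleep = False

    def writer_sleep(self, set_wantw):
        # The writer is about to block; a correct writer announces it first.
        if set_wantw:
            self.state |= PIPE_WANTW
        self.writer_asleep = True

    def reader_close(self):
        # The closing reader marks EOF and wakes the writer only on request.
        self.state |= PIPE_EOF
        if self.state & PIPE_WANTW:
            self.state &= ~PIPE_WANTW
            self.writer_asleep = False  # stands in for wakeup()

def writer_stuck(set_wantw):
    """Return True if the writer is still asleep after the reader closes."""
    p = ModelPipe()
    p.writer_sleep(set_wantw)
    p.reader_close()
    return p.writer_asleep

if __name__ == "__main__":
    print(writer_stuck(set_wantw=False))  # True: EOF is set but nobody woke the writer
    print(writer_stuck(set_wantw=True))   # False: the writer wakes and can see EOF
```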

I had to apply the following two patches to make the bug go away:
http://people.freebsd.org/~davidxu/patch/fifopipe/kernel_pipe.diff
http://people.freebsd.org/~davidxu/patch/fifopipe/dd.diff



> This is what happens under FreeBSD-~5.2 with the old fifo implementation,
> at least.  It also shows a bug in truss(1) -- the current write() is not
> shown, because it hasn't returned.  kdump shows that the write() has
> started but not returned.
>
>> $ dd if=fifo bs=512M count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.003908 secs (8384514 bytes/sec)
>>
>> Terminal 2:
>>
>> $ dd if=/dev/zero bs=512k count=4 of=fifo
>> ^T
>> load: 0.40  cmd: dd 1779 [sbwait] 2.63r 0.00u 0.00s 0% 1800k
>
> FreeBSD-~5.2 shows [runnable] for the wait channel.  This is
> strange.  dd should be blocked waiting for a reader, and only
> sbwait makes sense for that.  FreeBSD-9 apparently doesn't
> have the new named pipe implementation either.  -current shows
> [pipdwt].  This makes it clearer that it is waiting in write()
> and not in open().  dd probably does the wrong thing for
> fifos, by always trying to open files in O_RDWR mode first.
> This breaks the normal synchronization of readers and writers.
> In fact, this explains why there is no SIGPIPE -- there is
> always a reader since dd can always talk to itself.  First
> the open succeeds without blocking as expected.
>
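
The effect of O_RDWR on a fifo can be checked from userland. The following is a sketch under assumptions: POSIX leaves O_RDWR on a FIFO undefined, and the behaviour shown (the open succeeds and the opener counts as its own reader) is what FreeBSD and Linux do. The function name fifo_open_behaviour is hypothetical.

```python
import errno
import os
import tempfile

def fifo_open_behaviour():
    """Return (result of a write-only open with no reader, bytes written via O_RDWR)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "fifo")
        os.mkfifo(path)

        # Write-only, non-blocking, no reader: the open itself fails with ENXIO.
        try:
            fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
            os.close(fd)
            wronly = "opened"
        except OSError as e:
            wronly = errno.errorcode[e.errno]

        # O_RDWR: the open returns at once and the process is its own reader,
        # so a small write succeeds and no SIGPIPE can be generated.
        fd = os.open(path, os.O_RDWR)
        written = os.write(fd, b"x" * 16)
        os.close(fd)
        return wronly, written

if __name__ == "__main__":
    print(fifo_open_behaviour())
```

A blocking O_WRONLY open would instead wait for a reader, which is the normal synchronization that the O_RDWR open bypasses.
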
> After changing the O_RDWR to O_WRONLY in FreeBSD-~5.2, dd almost
> works as expected.  The reader reads 4 blocks of size 8K and
> then exits.  The writer first blocks in open.  Then it is
> killed by SIGPIPE.  Its SIGPIPE handling is broken (nonexistent),
> and the signal kills it without it printing a status message:
>
> %   1266 dd       RET   read 524288/0x80000
> %   1266 dd       CALL  write(0x4,0x8063000,0x80000)
> %   1266 dd       RET   write -1 errno 32 Broken pipe
> %   1266 dd       PSIG  SIGPIPE SIG_DFL
>
> The read is from /dev/zero.  The write is of 512K to the fifo.
> This delivers 4*8K then is killed.  If dd caught the signal
> like it should, then we would expect to see either a short
> write().  The signal handling should clear SA_RESTART, else
> the write() would be restarted and would deliver endless
> SIGPIPEs, now for failing writes.  Reporting of short writes
> is quite broken and this is an interesting test for it.
>
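
For comparison, this is what a writer sees when SIGPIPE does not kill it. It is a sketch in Python rather than dd's C code: CPython sets SIGPIPE to SIG_IGN at startup, so the failing write() surfaces as EPIPE (BrokenPipeError) and the writer can still report what it delivered. The function name write_after_reader_exit is hypothetical, and an anonymous pipe stands in for the fifo.

```python
import os

def write_after_reader_exit(chunk=4096):
    """Write once with a reader present, once after it has gone."""
    r, w = os.pipe()
    delivered = os.write(w, b"\0" * chunk)  # reader present: data is accepted
    os.read(r, chunk)                       # the reader consumes it ...
    os.close(r)                             # ... and goes away
    try:
        os.write(w, b"\0" * chunk)          # no reader left
        outcome = "written"
    except BrokenPipeError:                 # errno EPIPE, process still alive
        outcome = "EPIPE"
    finally:
        os.close(w)
    return delivered, outcome

if __name__ == "__main__":
    delivered, outcome = write_after_reader_exit()
    # The writer is still running here, so it can print a status line.
    print(f"{delivered} bytes delivered, then {outcome}")
```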
> -current delivers 4*64K instead of 4*8K.  This is because
> the i/o unit is BIG_PIPE_SIZE = 64K for nameless pipes and
> now for named pipes.  Apparently the unit is 8K for
> sockets.  I think the unit of atomicity is only 512 bytes
> for both.  Certainly, PIPE_BUF is still 512 in limits.h.
> I think limits.h is broken since the unit isn't actually
> 512 bytes for _all_ file types.  For sockets, you can control
> the watermarks and I think this changes the unit of atomicity.
> I wonder if the socket ioctls for this work with the old named pipe
> implementation.
>
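
The atomicity limit discussed above can be queried at run time instead of being read from limits.h. A small sketch, assuming only that the platform exposes PC_PIPE_BUF through fpathconf(); the function name pipe_buf is hypothetical. POSIX guarantees a value of at least 512, and the value says nothing about the larger i/o unit (8K or 64K) that a reader actually observes.

```python
import os

def pipe_buf():
    """Return the PIPE_BUF limit the system reports for a pipe."""
    r, w = os.pipe()
    try:
        # Writes of at most this many bytes are not interleaved with others.
        return os.fpathconf(w, "PC_PIPE_BUF")
    finally:
        os.close(r)
        os.close(w)

if __name__ == "__main__":
    print(pipe_buf())
```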
> The pipe wait channel names are less than perfect.  "pipdw"
> means "pipe direct write".  "wt" looks like an abbreviation
> for "write", but there are 3 waits in pipe_direct_write()
> and they are distinguished by the suffixes "w", "c" and "t".
> It isn't clear what these mean.
>
>>> How-To-Repeat:
>> mkfifo fifo
>>
>> Terminal 1:
>>
>> dd if=fifo bs=512k count=4
>>
>> Terminal 2:
>>
>> dd if=/dev/zero bs=512k count=4 of=fifo
>
> Remember to kill the writing dd if you stop it with ^Z. Otherwise, since
> the unhacked version is talking to itself, the fifo acts strangely for
> other tests.
>
> conv=block and conv=noerror (with cbs=512k) change the behaviour only
> slightly (slightly worse).  What works easily is omitting the count.
> dd then reads until EOF, in 256 records of size exactly 8K each under
> FreeBSD-~5.2.  Not giving the count is normal practice, since you
> rarely know the block size for pipes and many other file types. If
> there is another bug here, then it is conv=foo not working.  But
> reblocking is confusing, and I probably did it wrong.
>
> Another thing that doesn't work well here is trying to control the
> writer with SIGPIPE from the reader.  Even if you can get the reblocking
> right and read precisely 2MB, and fix SIGPIPE, then the SIGPIPE may be
> delivered after the writer has dirtied the fifo with a little more than
> 2MB.  The unread data then remains to bite the next reader.
>
> Bruce



