Date:      Sun, 19 Nov 2006 10:05:34 -0500
From:      Bill Moran <wmoran@collaborativefusion.com>
To:        Dieter <freebsd@sopwith.solgatos.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: TCP parameters and interpreting tcpdump output
Message-ID:  <20061119100534.a37a6e5c.wmoran@collaborativefusion.com>
In-Reply-To: <200611190742.HAA04640@sopwith.solgatos.com>
References:  <20061119020247.GB15898@dan.emsphone.com> <200611190742.HAA04640@sopwith.solgatos.com>

On Sat, 18 Nov 2006 23:42:31 +0000
Dieter <freebsd@sopwith.solgatos.com> wrote:

[snip]

> Bill writes:
> 
> Bill> My guess would be that your process blocked on stdout.
> Bill> You don't mention what you're doing with stdout from the program, are
> Bill> you just letting it scroll on the terminal, or redirecting it to a file?
> 
> Just redirected to a file.  FFS, soft updates, 7200 rpm SATA drive
> with the disk's write cache turned off.  Input data rate is less
> than 20 M bits/sec.  I can write to the disk at approx 6 M Bytes/sec
> sustained.  (or 10x that with disk write cache turned on, but
> I don't like trashed filesystems after the machine goes down hard)
> The machine and the disk are plenty fast enough, AMD64, 2 GB main memory.
> CPU is 90-something percent idle.
> 
> Sometimes it works fine for extended periods, 30-40 minutes.  Other times
> the src box reports thousands of network errors.  So far I haven't figured
> out what the difference is between the working tests and the failing tests.
> The crontab directory is empty, so it shouldn't be cron jobs.
> 
> > As an experiment, try running the process and redirecting
> > stdout to /dev/null -- if it doesn't exhibit the problem, then you
> > need to look at where you're actually storing the data and speed that
> > part up.
> 
> I've thought of trying /dev/null but haven't yet.  It might provide
> a clue.
> 
> I would expect that the filesystem should be buffering the write
> from short term disk latency.  Surely FreeBSD 6.0 provides the
> classic Unix write-behind?
> 
> The disk activity LED flashes constantly, so it doesn't appear to be
> saving up disk writes and then doing a bunch at once,
> 
> > Is the data coming in at a fairly constant rate?
> 
> Yes.
> 
> > you've got plenty of RAM
> 
> The machine has 2 GB.  I wonder if the process is getting its fair share?
> I have been observing other problems where disk activity to one disk
> will make an unrelated process reading data from a different disk *very*
> unresponsive.

Sounds like a hardware problem to me.  If you've got a crappy SATA
controller that's going to block every now and again, you're going to
have trouble with this.

It's not impossible to work around.  I get the impression that
this machine is doing little or nothing other than receiving this data.
If that's the case, you can use the entire 2G for buffering, and store
incoming data until the disk starts responding again.  I don't know
off the top of my head, but I seriously doubt the OS is going to use
2G to buffer disk writes.  You, however, can.  As your program stands,
it will buffer a maximum of 15000 bytes, but you're using blocking IO,
so that doesn't even help you if the write blocks.  If you want to take
advantage of all that RAM, you'll have to add some complexity to your
program.  The following is roughed out, not tested (may contain fencepost
errors), but liable to work once you fill in the blanks, although it's
lacking any error checking.  Start by making BUFFER_SIZE _much_ larger --
several megs at least, or even bigger if this machine is dedicated to
this task:

[...]
/* Set stdout to non-blocking (keep whatever flags are already set) */
fcntl(1, F_SETFL, fcntl(1, F_GETFL) | O_NONBLOCK);
/* startwrite: where incoming data is stored in the buffer
 * startread:  where data is taken from to go out on stdout
 */
startread = startwrite = buffer;
while (1) {
	/* Find out how much contiguous unused space is in the buffer.
	 * One byte always stays free so that startread == startwrite
	 * can only mean "empty", never "full".
	 */
	if (startwrite >= startread) {
		avail_buffer = buffer + BUFFER_SIZE - startwrite;
		if (startread == buffer)
			avail_buffer--;
	} else {
		avail_buffer = startread - startwrite - 1;
	}
	/* Read in as much as possible */
	if (avail_buffer > 0) {
		num_read = read(fd2, startwrite, avail_buffer);
		/* move the write pointer */
		if (num_read > 0)
			startwrite += num_read;
	}
	/* Wrap around if we're at the end of the buffer */
	if (startwrite == buffer + BUFFER_SIZE) {
		startwrite = buffer;
	}
	/* Calculate how many contiguous bytes we have to write */
	if (startwrite >= startread) {
		readytowrite = startwrite - startread;
	} else {
		readytowrite = buffer + BUFFER_SIZE - startread;
	}
	/* Here is the key ... this only works if descriptor 1 is
	 * non-blocking ... otherwise your code will wait here until
	 * all the bytes are written ... you don't want that
	 */
	if (readytowrite > 0) {
		num_written = write(1, startread, readytowrite);
		/* -1 (e.g. EAGAIN) means nothing went out this time */
		if (num_written > 0)
			startread += num_written;
	}
	/* Wrap around if we're at the end of the buffer */
	if (startread == buffer + BUFFER_SIZE) {
		startread = buffer;
	}
}
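
The fenceposts in a loop like the one above are easier to check if the
wrap-around arithmetic is pulled out into small helpers.  What follows is
only a sketch under assumptions, not part of the original program: the
names struct ring, ring_free_contig, ring_used_contig, ring_produce,
ring_consume and set_nonblocking are all made up for illustration, and the
ring is tracked with offsets rather than pointers.  One byte is
deliberately left unused so that "empty" and "full" can be told apart.

```c
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical ring-buffer bookkeeping, tracked by offsets. */
struct ring {
	size_t size;	/* total bytes in the backing buffer */
	size_t head;	/* offset where incoming data is stored next */
	size_t tail;	/* offset of the oldest byte not yet written out */
};

/* Contiguous free bytes starting at head.  One byte is always left
 * unused so that head == tail can only mean "empty".
 */
static size_t
ring_free_contig(const struct ring *r)
{
	if (r->head >= r->tail) {
		size_t n = r->size - r->head;
		return (r->tail == 0) ? n - 1 : n;
	}
	return r->tail - r->head - 1;
}

/* Contiguous buffered bytes starting at tail. */
static size_t
ring_used_contig(const struct ring *r)
{
	if (r->head >= r->tail)
		return r->head - r->tail;
	return r->size - r->tail;
}

/* Advance head after read() stored n bytes there. */
static void
ring_produce(struct ring *r, size_t n)
{
	r->head = (r->head + n) % r->size;
}

/* Advance tail after write() took n bytes from there. */
static void
ring_consume(struct ring *r, size_t n)
{
	r->tail = (r->tail + n) % r->size;
}

/* Add O_NONBLOCK to the descriptor's existing status flags.
 * Returns -1 on error, something else on success.
 */
static int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

In the main loop, read() would be handed ring_free_contig() bytes at offset
head, and write() would be handed ring_used_contig() bytes at offset tail,
skipping either call when the count is zero.  As a rough sizing guide using
the figures quoted earlier: at just under 20 Mbit/s (about 2.5 MB/s), an
8 MB buffer (a size picked here purely as an example) would ride out a disk
stall of roughly three seconds.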


