Date:      Sat, 10 Apr 2004 18:47:30 -0600
From:      Brandon Erhart <berhart@ErhartGroup.COM>
To:        freebsd-hackers@freebsd.org
Subject:   Weird behavior with either reading or write()ing !?
Message-ID:  <6.0.2.0.2.20040410183811.01c7f668@mx1.erhartgroup.com>

Hello,

This is a rather odd bug or weird behavior. I am fairly confident that it
is not a logic error in my code this time. Please read the following
carefully!

In a web-crawling program I am writing, I deal with several thousand fds at
a time. I am using FreeBSD's kqueue to keep track of them all so that I am
notified when an event is pending on a given socket. The program works as it
should for about 75% of the connections. The other 25% don't work so well.

I have implemented read timeouts as follows: whenever I am in the callback
function for data waiting to be read off an fd (EVFILT_READ), I store the
time (via gettimeofday()) at which data was last read on that socket. Then,
in my main loop, I check every socket to see whether its last read was more
than 10 seconds ago.
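
The timeout bookkeeping described above could look roughly like the
following. This is only a sketch under assumptions: the struct conn layout
and the names conn_mark_read() and conn_read_timed_out() are hypothetical
and are not taken from the actual crawler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/time.h>

#define READ_TIMEOUT_SEC 10   /* the 10-second limit mentioned above */

/* Hypothetical per-connection record; the real program's layout is unknown. */
struct conn {
    int            fd;
    struct timeval last_read;  /* when data was last read on this fd */
};

/* Convert a timeval to microseconds so two times can be compared directly. */
long long tv_to_usec(const struct timeval *tv)
{
    return (long long)tv->tv_sec * 1000000LL + (long long)tv->tv_usec;
}

/* Call from the read callback, right after data has been read. */
void conn_mark_read(struct conn *c)
{
    assert(c != NULL);
    gettimeofday(&c->last_read, NULL);
}

/* True if strictly more than READ_TIMEOUT_SEC separate last_read and now. */
bool conn_read_timed_out(const struct conn *c, const struct timeval *now)
{
    assert(c != NULL && now != NULL);
    long long idle = tv_to_usec(now) - tv_to_usec(&c->last_read);
    return idle > (long long)READ_TIMEOUT_SEC * 1000000LL;
}
```

Passing the current time in as an argument means the main loop can call
gettimeofday() once per pass and reuse the result for every socket, rather
than once per fd.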

However, I am getting a lot of read timeouts. I keep track of the last
response from the remote server and the current state I'm in (e.g., sent
another GET request on a keepalive connection). In several cases, I had
received a response for the last page I requested, processed/parsed it, and
sent down another request. However, no data ever came back to me. Not after
10 seconds. Hell, not even after 30 seconds in some cases.

What I am wondering is: is it possible either for my write() to be failing
to get data to the remote site (I check the return value of write(), and
it's always returning the number of bytes I am writing), or for data to be
"dropped", so to speak, on my end by the kernel (no data waiting on the
socket)? I have all my sockets in O_NONBLOCK mode.
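
For reference, a write() check on a non-blocking descriptor is usually
written along the lines below. This is a minimal sketch, and
write_nonblocking() is a hypothetical helper name, not something from the
crawler. Note that a full-length return from write() only means the kernel
accepted the bytes into the socket's send buffer; it says nothing about
whether the remote side has received them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to write len bytes to a non-blocking fd.
 * Returns the number of bytes the kernel accepted, which may be fewer
 * than len if the send buffer filled up, or -1 on a real error.
 */
ssize_t write_nonblocking(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n > 0) {
            done += (size_t)n;      /* partial or full write accepted */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;                  /* buffer full: caller retries later */
        if (n == 0)
            break;                  /* nothing accepted: stop trying */
        return -1;                  /* genuine error, errno is set */
    }
    return (ssize_t)done;
}
```

If the return value is smaller than len, the caller would keep the unsent
tail and try again once the descriptor is reported writable.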

To test whether kqueue is failing to notify me of waiting data, or whether
I am not grabbing the event off the queue in time, I call read() on the
socket one last time when I catch the read timeout. Almost always (about 99%
of the time) there is no data waiting.
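
A last-chance read() of this kind can also tell apart "no data yet", "peer
closed the connection", and "socket error", which may help narrow things
down. The sketch below assumes a non-blocking descriptor; probe_fd(),
set_nonblocking(), and the enum values are hypothetical names used only for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

enum probe_result {
    PROBE_DATA,   /* read() returned bytes: an event was missed or late */
    PROBE_EOF,    /* read() returned 0: the peer closed the connection */
    PROBE_EMPTY,  /* EAGAIN: nothing is waiting in the receive buffer */
    PROBE_ERROR   /* any other error; errno says which */
};

/* Put fd into O_NONBLOCK mode. Returns 0 on success, -1 on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* One final read() attempt on a timed-out fd; *nread gets read()'s result. */
enum probe_result probe_fd(int fd, char *buf, size_t cap, ssize_t *nread)
{
    assert(buf != NULL && cap > 0);
    ssize_t n;

    do {
        n = read(fd, buf, cap);
    } while (n < 0 && errno == EINTR);   /* retry if interrupted */

    if (nread != NULL)
        *nread = n;
    if (n > 0)
        return PROBE_DATA;
    if (n == 0)
        return PROBE_EOF;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return PROBE_EMPTY;
    return PROBE_ERROR;
}
```

Logging which of the four outcomes occurs at each timeout would show whether
the "no data" cases are really empty buffers or connections the remote side
has already closed.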

This all seems to be random. It is never consistent: the same server does
not fail reliably across several runs of the program.

Any ideas, folks? This has completely stumped me.

Thanks for your support,

Brandon


