Date:      Thu, 28 Aug 2008 06:10:59 -0700
From:      Jeremy Chadwick <koitsu@FreeBSD.org>
To:        Steven Hartland <killing@multiplay.co.uk>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: lighttpd failing to accept new connections ( connection reset )
Message-ID:  <20080828131059.GA46853@icarus.home.lan>
In-Reply-To: <A4FCC80B7CC742C393346F1FFE7AA18F@multiplay.co.uk>
References:  <A4FCC80B7CC742C393346F1FFE7AA18F@multiplay.co.uk>

On Thu, Aug 28, 2008 at 01:13:57PM +0100, Steven Hartland wrote:
> We're using lighttpd here for a new project and we're having issues
> whereby it simply stops processing after 1-2 days.
>
> Having looked at it in some detail this morning, it seems that
> the kernel is resetting the connection without notifying the
> lighttpd process that there is a new connection attempt. I assume
> that the listen queue is full, but why kevent is not notifying
> lighttpd that there are outstanding events is beyond me.
>
>
> The following is a truss of the process which is currently in
> this state:-
> kevent(6,0x0,0,{},11096,{1.000000000})           = 0 (0x0)
> gettimeofday({1219920575.149428},0x0)            = 0 (0x0)
> kevent(6,0x0,0,{},11096,{1.000000000})           = 0 (0x0)
> gettimeofday({1219920576.150443},0x0)            = 0 (0x0)
>
> ktrace of the operation as well:-
> 28363 lighttpd RET   kevent 0
> 28363 lighttpd CALL  gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET   gettimeofday 0
> 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
> 28363 lighttpd GIO   fd 6 wrote 0 bytes
>       ""
> 28363 lighttpd GIO   fd 6 read 0 bytes
>       ""
> 28363 lighttpd RET   kevent 0
> 28363 lighttpd CALL  gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET   gettimeofday 0
> 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
> 28363 lighttpd GIO   fd 6 wrote 0 bytes
>       ""
> 28363 lighttpd GIO   fd 6 read 0 bytes
>       ""
> 28363 lighttpd RET   kevent 0
> 28363 lighttpd CALL  gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET   gettimeofday 0
> 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
> 28363 lighttpd GIO   fd 6 wrote 0 bytes
>       ""
> 28363 lighttpd GIO   fd 6 read 0 bytes
>       ""
> 28363 lighttpd RET   kevent 0
> 28363 lighttpd CALL  gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET   gettimeofday 0
> 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
> 28363 lighttpd GIO   fd 6 wrote 0 bytes
>       ""
> 28363 lighttpd GIO   fd 6 read 0 bytes
>       ""
> 28363 lighttpd RET   kevent 0
> 28363 lighttpd CALL  gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET   gettimeofday 0
> 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
> 28363 lighttpd GIO   fd 6 wrote 0 bytes
>       ""
> 28363 lighttpd GIO   fd 6 read 0 bytes
>       ""
> 28363 lighttpd RET   kevent 0
> 28363 lighttpd CALL  gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET   gettimeofday 0
> 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
>
>
> tcpdump shows:-
> 12:10:29.475255 IP (tos 0x10, ttl  64, id 9536, offset 0, flags [DF], 
> proto: TCP (6), length: 64) client.61224 > server.80: S, cksum 0x6d22 
> (incorrect (-> 0xedfa), 291994449:291994449(0) win 65535 <mss 
> 1460,nop,wscale 1,nop,nop,timestamp 3661727139 0,sackOK,eol>
> 12:10:29.481396 IP (tos 0x0, ttl  61, id 25503, offset 0, flags [DF], 
> proto: TCP (6), length: 60) server.80 > client.61224: S, cksum 0xbf22 
> (correct), 3444532576:3444532576(0) ack 291994450 win 65535 <mss 
> 1460,nop,wscale 9,sackOK,timestamp 3136311843 3661727139>
> 12:10:29.481419 IP (tos 0x10, ttl  64, id 9538, offset 0, flags [DF], 
> proto: TCP (6), length: 52) client.61224 > server.80: ., cksum 0x6d16 
> (incorrect (-> 0x6bd2), 1:1(0) ack 1 win 33304 <nop,nop,timestamp 
> 3661727145 3136311843>
> 12:10:29.487519 IP (tos 0x10, ttl  61, id 25504, offset 0, flags [DF], 
> proto: TCP (6), length: 40) server.80 > client.61224: R, cksum 0x20c7 
> (correct), 3444532577:3444532577(0) win 0
>
> This may have been raised before, back in 2003, as bug kern/57380,
> but it was closed after no response from the reporter.
>
> Another possibly related issue is:-
> http://trac.lighttpd.net/trac/ticket/1734
>
>
> I've currently got one of the production machines offline
> with this error ( hence the important flag ) in the hope
> that someone can suggest a test which will shed more light
> on the issue before I restart it.

Can you change the polling method in lighttpd to use poll or select
instead of kqueue?  This would help in determining if the problem is
with the daemon itself or the kevent system.
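
For reference, a minimal sketch of that change in lighttpd.conf. This
assumes the stock server.event-handler directive and the handler names
as I recall them from the lighttpd documentation; check both against
the version you are running before relying on it:

```conf
# lighttpd.conf (sketch; directive and values are assumptions to verify)
# Use poll(2) instead of the kqueue-based handler for this test.
server.event-handler = "poll"      # or "select"

# The kqueue handler this replaces is normally selected with:
# server.event-handler = "freebsd-kqueue"
```

If the stall no longer shows up after a few days on poll or select,
that points towards the kqueue path; if it still happens, the daemon
itself becomes the more likely suspect.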

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



