Date: Sun, 08 May 2016 14:32:51 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 204426] Processes terminating cannot access memory
Message-ID: <bug-204426-16-ww6KTSmkie@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-204426-16@https.bugs.freebsd.org/bugzilla/>
References: <bug-204426-16@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204426

--- Comment #108 from Robert Blayzor <rblayzor@inoc.net> ---

It's been longer than average now and I have not run into the processes terminating abnormally with the last patch installed against 10.3. HOWEVER, I have noticed a new issue with the network stack that seems to be happening at roughly the same interval. I'm not sure if the two are related, or if fixing one problem surfaced another.

Basically, we're now getting the servers filling up with TCP connections stuck in a "CLOSED" state. We'll end up with thousands of them until connections to the processes just time out. Sometimes we'll see kernel messages:

sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (46 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (46 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (50 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (42 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (44 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (38 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (40 occurrences)
[...]

But not always...
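For what it's worth, this is roughly how I've been checking whether it's the listen queue ceiling or the stranded pcbs (a diagnostic sketch for our FreeBSD 10.x boxes; the sysctl name is the 10.x one, and the awk filters just match the netstat/sockstat column layouts):

```shell
# Per-socket listen queue ceiling; the "767 already in queue" in the
# sonewconn messages is one socket's backlog hitting its limit.
# (kern.ipc.soacceptqueue is the FreeBSD 10.x name; older releases
# expose it as kern.ipc.somaxconn.)
sysctl kern.ipc.soacceptqueue

# Live queue depth per listening socket, shown as qlen/incqlen/maxqlen.
netstat -Lan -p tcp

# How many tcp6 connections are stuck in CLOSED right now
# (last column of netstat output is the TCP state).
netstat -an -p tcp | awk '$NF == "CLOSED"' | wc -l

# Orphaned pcbs: sockstat prints "?" for user/command/pid/fd when no
# process owns the socket any more.
sockstat -6 -P tcp | awk '$1 == "?"'
```

Note this only measures the symptom; raising kern.ipc.soacceptqueue buys queue headroom but wouldn't free pcbs that no process owns.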
Currently I have a server in this state and I'll see:

tcp6       0      0 mta1.imap      mta-slb-1.alb1.i.20511   CLOSED
tcp6       0      0 mta1.pop3      mta-slb-1.alb1.i.43879   CLOSED
tcp6       0      0 mta1.imap      mta-slb-0.alb1.i.5259    CLOSED
tcp6       0      0 mta1.pop3      mta-slb-0.alb1.i.12519   CLOSED
tcp6       0      0 mta1.imap      mta-slb-1.alb1.i.1316    CLOSED
tcp6       0      0 mta1.pop3      mta-slb-1.alb1.i.65343   CLOSED
tcp6       0      0 mta1.imap      mta-slb-0.alb1.i.16289   CLOSED
tcp6       0      0 mta1.pop3      mta-slb-0.alb1.i.19215   CLOSED
tcp6      32      0 mta1.sieve     mta-slb-0.alb1.i.19549   CLOSED
tcp6       0      0 mta1.imap      mta-slb-1.alb1.i.49287   CLOSED
tcp6      32      0 mta1.sieve     mta-slb-1.alb1.i.54187   CLOSED
tcp6       0      0 mta1.pop3      mta-slb-1.alb1.i.39767   CLOSED
tcp6       0      0 mta1.imap      mta-slb-0.alb1.i.54366   CLOSED
tcp6       0      0 mta1.pop3      mta-slb-0.alb1.i.47579   CLOSED
tcp6       0      0 mta1.imap      mta-slb-1.alb1.i.48798   CLOSED
tcp6       0      0 mta1.pop3      mta-slb-1.alb1.i.40190   CLOSED
...
[ 1000's of lines truncated ]

It's not just Dovecot either; we also see several stuck in CLOSED from Exim. So it doesn't look like an application issue. In fact, sockstat shows these stuck sockets are no longer associated with any process, ie:

?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:143   2607:f058:110:2::f:1:56602
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:4190  2607:f058:110:2::f:0:32558
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:110   2607:f058:110:2::f:1:53931
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:110   2607:f058:110:2::f:0:58671
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:143   2607:f058:110:2::f:1:58788
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:143   2607:f058:110:2::f:0:30523
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:110   2607:f058:10::10:32375
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:143   2607:f058:110:2::f:0:46131
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:110   2607:f058:110:2::f:1:50671
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:143   2607:f058:110:2::f:1:4223
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:143   2607:f058:110:2::f:1:15773
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:110   2607:f058:110:2::f:0:26610
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:4190  2607:f058:110:2::f:0:38765
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:143   2607:f058:110:2::f:0:42310
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:143   2607:f058:110:2::f:1:5643
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:143   2607:f058:110:2::f:0:37143
?        ?        ?     ?  tcp6   2607:f058:110:2::1:1:4190  2607:f058:110:2::f:0:24906
[...]
(again, thousands of lines truncated)

netstat -ans -p tcp

tcp:
        8497364 packets sent
                3825626 data packets (484438498 bytes)
                12 data packets (5560 bytes) retransmitted
                1 data packet unnecessarily retransmitted
                0 resends initiated by MTU discovery
                4106401 ack-only packets (0 delayed)
                0 URG only packets
                0 window probe packets
                313890 window update packets
                251435 control packets
        5525333 packets received
                4126773 acks (for 483181992 bytes)
                108333 duplicate acks
                0 acks for unsent data
                3787497 packets (1012422242 bytes) received in-sequence
                151 completely duplicate packets (0 bytes)
                0 old duplicate packets
                0 packets with some dup. data (0 bytes duped)
                0 out-of-order packets (0 bytes)
                0 packets (0 bytes) of data after window
                0 window probes
                7067 window update packets
                6376 packets received after close
                0 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short
                0 discarded due to memory problems
        7546 connection requests
        315652 connection accepts
        0 bad connection attempts
        0 listen queue overflows
        73753 ignored RSTs in the windows
        323196 connections established (including accepts)
        323132 connections closed (including 1414 drops)
                121869 connections updated cached RTT on close
                121869 connections updated cached RTT variance on close
                0 connections updated cached ssthresh on close
        2 embryonic connections dropped
        4126615 segments updated rtt (of 3726607 attempts)
        12 retransmit timeouts
                0 connections dropped by rexmit timeout
        0 persist timeouts
                0 connections dropped by persist timeout
        14 Connections (fin_wait_2) dropped because of timeout
        19343 keepalive timeouts
                19298 keepalive probes sent
                45 connections dropped by keepalive
        240267 correct ACK header predictions
        792401 correct data packet header predictions
        315653 syncache entries added
                0 retransmitted
                0 dupsyn
                0 dropped
                315652 completed
                0 bucket overflow
                0 cache overflow
                1 reset
                0 stale
                0 aborted
                0 badack
                0 unreach
                0 zone failures
        315653 cookies sent
        0 cookies received
        164 hostcache entries added
        0 bucket overflow
        0 SACK recovery episodes
        0 segment rexmits in SACK recovery episodes
        0 byte rexmits in SACK recovery episodes
        0 SACK options (SACK blocks) received
        0 SACK options (SACK blocks) sent
        0 SACK scoreboard overflow
        0 packets with ECN CE bit set
        0 packets with ECN ECT(0) bit set
        0 packets with ECN ECT(1) bit set
        0 successful ECN handshakes
        0 times ECN reduced the congestion window
        0 packets with valid tcp-md5 signature received
        0 packets with invalid tcp-md5 signature received
        0 packets with tcp-md5 signature mismatch
        0 packets with unexpected tcp-md5 signature received
        0 packets without expected tcp-md5 signature received

If I attempt to kill and restart the processes, sometimes it works and sometimes it doesn't, and I have to end up rebooting the server.

-- 
You are receiving this mail because:
You are the assignee for the bug.