Date:      Sun, 08 May 2016 14:32:51 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-threads@FreeBSD.org
Subject:   [Bug 204426] Processes terminating cannot access memory
Message-ID:  <bug-204426-16-ww6KTSmkie@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-204426-16@https.bugs.freebsd.org/bugzilla/>
References:  <bug-204426-16@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204426

--- Comment #108 from Robert Blayzor <rblayzor@inoc.net> ---
It's been longer than average now and I have not run into the processes
terminating abnormally with the last patch installed against 10.3. HOWEVER, I
have noticed a new issue with the network stack that seems to be happening at
roughly the same interval. I'm not sure if the two are related or if fixing one
problem has exposed another.

Basically, the servers are now filling up with TCP connections stuck in a
"CLOSED" state. We end up with thousands of them, until connections to the
processes just time out.

Sometimes we'll see kernel messages:

sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (46 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (46 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (50 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (42 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (44 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (38 occurrences)
sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: 767 already in queue awaiting acceptance (40 occurrences)
[...]
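
For anyone wanting to total these up from a saved log, here is a minimal Python sketch. It assumes the messages have the exact wording shown above; the function name `summarize_overflows` and the sample built from the seven counts in this comment are illustrative only, not part of any FreeBSD tool.

```python
import re

# Matches the rate-limited kernel message quoted above. \s+ tolerates the
# message being wrapped between "queue" and "awaiting" by a mail client.
OVERFLOW_RE = re.compile(
    r"sonewconn: pcb (0x[0-9a-f]+): Listen queue overflow: "
    r"(\d+) already in queue\s+awaiting acceptance \((\d+) occurrences\)"
)

def summarize_overflows(log_text):
    """Return {pcb: (largest queue length seen, total occurrences)}."""
    summary = {}
    for pcb, queued, occurrences in OVERFLOW_RE.findall(log_text):
        max_queued, total = summary.get(pcb, (0, 0))
        summary[pcb] = (max(max_queued, int(queued)), total + int(occurrences))
    return summary

# Sample rebuilt from the seven messages quoted in this comment.
counts = [46, 46, 50, 42, 44, 38, 40]
sample = "\n".join(
    "sonewconn: pcb 0xfffff80001b36930: Listen queue overflow: "
    f"767 already in queue awaiting acceptance ({n} occurrences)"
    for n in counts
)

for pcb, (max_queued, total) in summarize_overflows(sample).items():
    print(f"{pcb}: queue length up to {max_queued}, {total} overflows logged")
```

Grouping by pcb address shows whether a single listening socket accounts for all of the overflows, as it does in the excerpt above.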


But not always... Currently I have a server in this state and I'll see:


tcp6       0      0 mta1.imap              mta-slb-1.alb1.i.20511 CLOSED
tcp6       0      0 mta1.pop3              mta-slb-1.alb1.i.43879 CLOSED
tcp6       0      0 mta1.imap              mta-slb-0.alb1.i.5259  CLOSED
tcp6       0      0 mta1.pop3              mta-slb-0.alb1.i.12519 CLOSED
tcp6       0      0 mta1.imap              mta-slb-1.alb1.i.1316  CLOSED
tcp6       0      0 mta1.pop3              mta-slb-1.alb1.i.65343 CLOSED
tcp6       0      0 mta1.imap              mta-slb-0.alb1.i.16289 CLOSED
tcp6       0      0 mta1.pop3              mta-slb-0.alb1.i.19215 CLOSED
tcp6      32      0 mta1.sieve             mta-slb-0.alb1.i.19549 CLOSED
tcp6       0      0 mta1.imap              mta-slb-1.alb1.i.49287 CLOSED
tcp6      32      0 mta1.sieve             mta-slb-1.alb1.i.54187 CLOSED
tcp6       0      0 mta1.pop3              mta-slb-1.alb1.i.39767 CLOSED
tcp6       0      0 mta1.imap              mta-slb-0.alb1.i.54366 CLOSED
tcp6       0      0 mta1.pop3              mta-slb-0.alb1.i.47579 CLOSED
tcp6       0      0 mta1.imap              mta-slb-1.alb1.i.48798 CLOSED
tcp6       0      0 mta1.pop3              mta-slb-1.alb1.i.40190 CLOSED
... [ 1000's of lines truncated ]
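
With thousands of lines, a per-service tally is easier to read than the raw listing. The following Python sketch assumes six-column netstat lines in the layout shown above (proto, Recv-Q, Send-Q, local address, foreign address, state). The helper name `count_by_service_and_state` is hypothetical.

```python
from collections import Counter

def count_by_service_and_state(netstat_text):
    """Count TCP sockets per (local service, state) in netstat-style output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a six-column tcp/tcp6 line.
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        # The local address looks like "mta1.imap"; the service follows the last dot.
        service = fields[3].rsplit(".", 1)[-1]
        counts[(service, fields[5])] += 1
    return counts

# A few lines copied from the listing above.
sample = """\
tcp6       0      0 mta1.imap              mta-slb-1.alb1.i.20511 CLOSED
tcp6       0      0 mta1.pop3              mta-slb-1.alb1.i.43879 CLOSED
tcp6       0      0 mta1.imap              mta-slb-0.alb1.i.5259  CLOSED
tcp6      32      0 mta1.sieve             mta-slb-0.alb1.i.19549 CLOSED
"""

for (service, state), n in sorted(count_by_service_and_state(sample).items()):
    print(f"{service:8} {state:12} {n}")
```

Running it against the full netstat output, rather than this sample, would show how the stuck CLOSED sockets are distributed across imap, pop3 and sieve.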


It's not just Dovecot either; we also see several sockets stuck in CLOSED from
Exim. So it doesn't look like an application issue. In fact, sockstat shows
these stuck sockets as no longer associated with any process, i.e.:

?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:56602
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:4190 2607:f058:110:2::f:0:32558
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:110 2607:f058:110:2::f:1:53931
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:110 2607:f058:110:2::f:0:58671
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:58788
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:143 2607:f058:110:2::f:0:30523
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:110 2607:f058:10::10:32375
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:143 2607:f058:110:2::f:0:46131
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:110 2607:f058:110:2::f:1:50671
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:4223
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:15773
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:110 2607:f058:110:2::f:0:26610
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:4190 2607:f058:110:2::f:0:38765
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:143 2607:f058:110:2::f:0:42310
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:143 2607:f058:110:2::f:1:5643
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:143 2607:f058:110:2::f:0:37143
?        ?          ?     ?  tcp6   2607:f058:110:2::1:1:4190 2607:f058:110:2::f:0:24906
[...] (again, thousands of lines truncated)


netstat -ans -p tcp
tcp:
        8497364 packets sent
                3825626 data packets (484438498 bytes)
                12 data packets (5560 bytes) retransmitted
                1 data packet unnecessarily retransmitted
                0 resends initiated by MTU discovery
                4106401 ack-only packets (0 delayed)
                0 URG only packets
                0 window probe packets
                313890 window update packets
                251435 control packets
        5525333 packets received
                4126773 acks (for 483181992 bytes)
                108333 duplicate acks
                0 acks for unsent data
                3787497 packets (1012422242 bytes) received in-sequence
                151 completely duplicate packets (0 bytes)
                0 old duplicate packets
                0 packets with some dup. data (0 bytes duped)
                0 out-of-order packets (0 bytes)
                0 packets (0 bytes) of data after window
                0 window probes
                7067 window update packets
                6376 packets received after close
                0 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short
                0 discarded due to memory problems
        7546 connection requests
        315652 connection accepts
        0 bad connection attempts
        0 listen queue overflows
        73753 ignored RSTs in the windows
        323196 connections established (including accepts)
        323132 connections closed (including 1414 drops)
                121869 connections updated cached RTT on close
                121869 connections updated cached RTT variance on close
                0 connections updated cached ssthresh on close
        2 embryonic connections dropped
        4126615 segments updated rtt (of 3726607 attempts)
        12 retransmit timeouts
                0 connections dropped by rexmit timeout
        0 persist timeouts
                0 connections dropped by persist timeout
        14 Connections (fin_wait_2) dropped because of timeout
        19343 keepalive timeouts
                19298 keepalive probes sent
                45 connections dropped by keepalive
        240267 correct ACK header predictions
        792401 correct data packet header predictions
        315653 syncache entries added
                0 retransmitted
                0 dupsyn
                0 dropped
                315652 completed
                0 bucket overflow
                0 cache overflow
                1 reset
                0 stale
                0 aborted
                0 badack
                0 unreach
                0 zone failures
        315653 cookies sent
        0 cookies received
        164 hostcache entries added
                0 bucket overflow
        0 SACK recovery episodes
        0 segment rexmits in SACK recovery episodes
        0 byte rexmits in SACK recovery episodes
        0 SACK options (SACK blocks) received
        0 SACK options (SACK blocks) sent
        0 SACK scoreboard overflow
        0 packets with ECN CE bit set
        0 packets with ECN ECT(0) bit set
        0 packets with ECN ECT(1) bit set
        0 successful ECN handshakes
        0 times ECN reduced the congestion window
        0 packets with valid tcp-md5 signature received
        0 packets with invalid tcp-md5 signature received
        0 packets with tcp-md5 signature mismatch
        0 packets with unexpected tcp-md5 signature received
        0 packets without expected tcp-md5 signature received




If I attempt to kill and restart the processes, sometimes it works and
sometimes it doesn't, and then I end up having to reboot the server.

-- 
You are receiving this mail because:
You are the assignee for the bug.


