Date: Tue, 5 Apr 2016 09:27:12 +0200
From: "O. Hartmann" <ohartman@zedat.fu-berlin.de>
To: freebsd-current@freebsd.org
Subject: Re: CURRENT slow and shaky network stability
Message-ID: <20160405092712.131ee52c@freyja.zeit4.iv.bundesimmobilien.de>
In-Reply-To: <201604050646.u356k850078565@slippy.cwsent.com>
References: <20160405082047.670d7241@freyja.zeit4.iv.bundesimmobilien.de>
 <201604050646.u356k850078565@slippy.cwsent.com>
Organization: FU Berlin

On Mon, 04 Apr 2016 23:46:08 -0700
Cy Schubert wrote:

> In message <20160405082047.670d7241@freyja.zeit4.iv.bundesimmobilien.de>,
> "O. Hartmann" writes:
> > On Sat, 02 Apr 2016 16:14:57 -0700
> > Cy Schubert wrote:
> >
> > > In message <20160402231955.41b05526.ohartman@zedat.fu-berlin.de>,
> > > "O. Hartmann" writes:
> > > > On Sat, 2 Apr 2016 11:39:10 +0200
> > > > "O. Hartmann" wrote:
> > > >
> > > > > On Sat, 2 Apr 2016 10:55:03 +0200
> > > > > "O. Hartmann" wrote:
> > > > >
> > > > > > On Sat, 02 Apr 2016 01:07:55 -0700
> > > > > > Cy Schubert wrote:
> > > > > >
> > > > > > > In message <56F6C6B0.6010103@protected-networks.net>, Michael
> > > > > > > Butler writes:
> > > > > > > > -current is not great for interactive use at all. The strategy
> > > > > > > > of pre-emptively dropping idle processes to swap is hurting ..
> > > > > > > > big time.
> > > > > > >
> > > > > > > FreeBSD doesn't "preemptively" or arbitrarily push pages out to
> > > > > > > disk. LRU doesn't do this.
> > > > > > >
> > > > > > > > Compare inactive memory to swap in this example ..
> > > > > > > >
> > > > > > > > 110 processes: 1 running, 108 sleeping, 1 zombie
> > > > > > > > CPU:  1.2% user,  0.0% nice,  4.3% system,  0.0% interrupt, 94.5% idle
> > > > > > > > Mem: 474M Active, 1609M Inact, 764M Wired, 281M Buf, 119M Free
> > > > > > > > Swap: 4096M Total, 917M Used, 3178M Free, 22% Inuse
> > > > > > >
> > > > > > > To analyze this you need to capture vmstat output. You'll see the
> > > > > > > free pool dip below a threshold and pages go out to disk in
> > > > > > > response. If you have daemons with small working sets, pages that
> > > > > > > are not part of the working sets for daemons or applications will
> > > > > > > eventually be paged out. This is not a bad thing. In your example
> > > > > > > above, the 281 MB of UFS buffers are more active than the 917 MB
> > > > > > > paged out. If it's paged out and never used again, then it
> > > > > > > doesn't hurt. However the 281 MB of buffers saves you I/O. The
> > > > > > > inactive pages are part of your free pool that were active at one
> > > > > > > time but now are not. They may be reclaimed and if they are,
> > > > > > > you've just saved more I/O.
> > > > > > >
> > > > > > > Top is a poor tool to analyze memory use. Vmstat is the better
> > > > > > > tool to help understand memory use. Inactive memory isn't a bad
> > > > > > > thing per se. Monitor page outs, scan rate and page reclaims.
> > > > > >
> > > > > > I give up! Tried to check via ssh/vmstat what is going on. Last
> > > > > > lines before broken pipe:
> > > > > >
> > > > > > [...]
> > > > > > procs  memory       page                        disks     faults        cpu
> > > > > >  r b w  avm   fre   flt  re  pi  po    fr   sr ad0 ad1   in     sy     cs us sy id
> > > > > > 22 0 22 5.8G  1.0G 46319  0   0   0  55721 1297   0   4  219  23907   5400 95  5  0
> > > > > > 22 0 22 5.4G  1.3G 51733  0   0   0  72436 1162   0   0  108  40869   3459 93  7  0
> > > > > > 15 0 22  12G  1.2G 54400  0  27   0  52188 1160   0  42  148  52192   4366 91  9  0
> > > > > > 14 0 22  12G  1.0G 44954  0  37   0  37550 1179   0  39  141  86209   4368 88 12  0
> > > > > > 26 0 22  12G  1.1G 60258  0  81   0  69459 1119   0  27  123 779569 704359 87 13  0
> > > > > > 29 3 22  13G  774M 50576  0  68   0  32204 1304   0   2  102 507337 484861 93  7  0
> > > > > > 27 0 22  13G  937M 47477  0  48   0  59458 1264   3   2  112  68131  44407 95  5  0
> > > > > > 36 0 22  13G  829M 83164  0   2   0  82575 1225   1   0  126  99366  38060 89 11  0
> > > > > > 35 0 22 6.2G  1.1G 98803  0  13   0 121375 1217   2   8  112  99371   4999 85 15  0
> > > > > > 34 0 22  13G  723M 54436  0  20   0  36952 1276   0  17  153  29142   4431 95  5  0
> > > > > > Fssh_packet_write_wait: Connection to 192.168.0.1 port 22: Broken pipe
> > > > > >
> > > > > > This makes this crap system completely unusable. The server
> > > > > > (FreeBSD 11.0-CURRENT #20 r297503: Sat Apr  2 09:02:41 CEST 2016
> > > > > > amd64) in question did a poudriere bulk job.
> > > > > > I can not even determine which terminal goes down first - another
> > > > > > one, idle for much more time than the one showing the "vmstat 5"
> > > > > > output, is still alive!
> > > > > >
> > > > > > I consider this a serious bug and there is no benefit in what has
> > > > > > happened since this "fancy" update. :-(
> > > > >
> > > > > By the way - it might be of interest and some hint.
> > > > >
> > > > > One of my boxes is acting as server and gateway. It utilises NAT and
> > > > > IPFW. When it is under high load, as it was today, passing the
> > > > > network flow from the ISP into the network for the clients is
> > > > > sometimes extremely slow. I do not consider this the reason for the
> > > > > collapsing ssh sessions, since this incident also happens under
> > > > > no-load, but in the overall view onto the problem, this could be a
> > > > > hint - I hope.
> > > >
> > > > I just checked on one box that "broke pipe" very quickly after I
> > > > started poudriere, while it had done well for a couple of hours before
> > > > the pipe broke. It seems to be load dependent when the ssh session gets
> > > > wrecked, but more important: after the long-haul poudriere run, I
> > > > rebooted the box and tried again, with the mentioned broken pipe a
> > > > couple of minutes after poudriere ran. Then I left the box for several
> > > > hours, logged in again and checked the swap. Although there had been no
> > > > load or other pressure for hours, 31% of swap was still in use (the
> > > > box has 16 GB of RAM and is propelled by a XEON E3-1245 V2).
> > >
> > > 31%! Is it *actively* paging or is the 31% previously paged out and no
> > > paging is *currently* being experienced? 31% of how much swap space in
> > > total?
> > >
> > > Also, what does ps aumx or ps aumxww say? Pipe it to head -40 or
> > > similar.
> >
> > On FreeBSD 11.0-CURRENT #4 r297573: Tue Apr  5 07:01:19 CEST 2016 amd64,
> > local network, no NAT. Stuck ssh session in the middle of administering,
> > after leaving the console/ssh session alone for a couple of minutes:
> >
> > root      2064  0.0  0.1 91416 8492  -  Is  07:18  0:00.03 sshd: hartmann [priv] (sshd)
> > hartmann  2108  0.0  0.1 91416 8664  -  I   07:18  0:07.33 sshd: hartmann@pts/0 (sshd)
> > root     72961  0.0  0.1 91416 8496  -  Is  08:11  0:00.03 sshd: hartmann [priv] (sshd)
> > hartmann 72970  0.0  0.1 91416 8564  -  S   08:11  0:00.02 sshd: hartmann@pts/1 (sshd)
> >
> > The situation is worse and I consider this a serious bug.
>
> There's not a lot to go on here. Do you have physical access to the machine
> to pop into DDB and take a look? You did say you're using a lot of swap,
> IIRC 30%. You didn't answer how much that 30% was of. Without more data I
> can't help you. At best I can take wild guesses, but that won't help you.
> Try to answer the questions I asked last week and we can go further. Until
> then all we can do is wildly guess.

Hello Cy,

sorry for the lack of information. The machine in question is not accessible
at this very moment. The box has 16 GB of physical RAM, 32 GB of swap (on an
SSD) and a 4-core/8-thread CPU (I think that is also important due to the
allocation of arbitrary memory). The problem I described arose when using
poudriere.
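Once that machine is reachable again I will collect the numbers you asked
for. A minimal sketch of what I plan to run there during the next poudriere
run (standard base-system tools; the exact flags are my reading of the man
pages, nothing exotic):

   swapinfo -h                       # total swap and how much of it is in use
   vmstat -s | grep -E 'swap|page'   # cumulative page-in/page-out/reclaim counters
   vmstat 5                          # watch the po, fr and sr columns under load
   ps aumxww | head -40              # biggest consumers, as you suggested

That should at least show whether the 31% is stale, previously paged-out
memory or active, ongoing paging.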
The box uses 6 builders, but each builder can, as I understand it, spawn
several instances of jobs for compiling/linking etc. But that box is only a
placeholder for the weirdness that is going on (despite the fact that it is
using NAT, since it is attached to a DSL line). In contrast, the system I
face today at work is not(!) behind NAT and doesn't have the "toy" network
binding. The box I'm accessing now has 16 GB of physical RAM and two sockets,
each populated with an oldish 4-core XEON 5XXX from the Core2Duo age (no
SMT). That box does not run poudriere, only postgresql and some other
services.

In February, I was able to push the other box in question (behind NAT, as a
remark) to its limits using poudriere with 8 builders. The network became
slow, since the box also acts as gateway, but it never failed, broke or
dropped the ssh session due to a "broken pipe". Without changing the
configuration except the base system's sources for CURRENT, for roughly two,
at most three weeks now I have been getting these weird drops. And this is
why I started "whining" - there is also a drop in performance when compiling
world, which lengthens the compile time by ~ 5 - 10 minutes on the NATed box.

I'm fully aware of being on CURRENT, but I think it is reasonable to report
such weird things happening now. I did not receive any word about dramatic
changes that could trigger such behaviour. And as I understand the thread we
are in here, a change has been made which results in more aggressive swapping
of inactive processes.

I tried stopping all services on the boxes (postgresql, icinga2, http etc.)
to check whether those could force the kernel to swap out a process. But the
loss of the ssh connection, and the very strange way the ssh session becomes
unresponsive, is erratic: sometimes it happens very fast, seconds after I
stop touching the xterm/ssh session to the remote box, sometimes it takes up
to 30 minutes, even under load.

So, there is probably a problem with me understanding this new feature ...
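P.S.: Until the cause is found, I will try to keep the sessions alive with
client-side keepalives. A rough sketch of what I intend to use (plain OpenSSH
client options; the interval/count values are picked more or less
arbitrarily):

   # send an application-level keepalive every 30 s and give up only after
   # 6 unanswered probes (~3 minutes), instead of waiting silently until the
   # TCP write finally fails with "Broken pipe"
   ssh -o ServerAliveInterval=30 -o ServerAliveCountMax=6 hartmann@192.168.0.1

If the sessions still die even with keepalives flowing, that would at least
suggest the connection itself is being starved rather than an idle timeout.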