From owner-freebsd-current@freebsd.org  Sat Apr  2 21:19:36 2016
Date: Sat, 2 Apr 2016 23:19:55 +0200
From: "O. Hartmann" <ohartman@zedat.fu-berlin.de>
To: Cy Schubert
Cc: Michael Butler, "K. Macy", FreeBSD CURRENT
Subject: Re: CURRENT slow and shaky network stability
Message-ID: <20160402231955.41b05526.ohartman@zedat.fu-berlin.de>
In-Reply-To: <20160402113910.14de7eaf.ohartman@zedat.fu-berlin.de>
References: <56F6C6B0.6010103@protected-networks.net>
	<201604020807.u3287tgc034452@slippy.cwsent.com>
	<20160402105503.7ede5be1.ohartman@zedat.fu-berlin.de>
	<20160402113910.14de7eaf.ohartman@zedat.fu-berlin.de>

On Sat, 2 Apr 2016 11:39:10 +0200, "O. Hartmann" wrote:

> On Sat, 2 Apr 2016 10:55:03 +0200, "O. Hartmann" wrote:
>
> > On Sat, 02 Apr 2016 01:07:55 -0700, Cy Schubert wrote:
> >
> > > In message <56F6C6B0.6010103@protected-networks.net>, Michael Butler writes:
> > > > -current is not great for interactive use at all. The strategy of
> > > > pre-emptively dropping idle processes to swap is hurting .. big time.
> > >
> > > FreeBSD doesn't "preemptively" or arbitrarily push pages out to disk. LRU
> > > doesn't do this.
> > >
> > > > Compare inactive memory to swap in this example ..
> > > >
> > > > 110 processes: 1 running, 108 sleeping, 1 zombie
> > > > CPU:  1.2% user,  0.0% nice,  4.3% system,  0.0% interrupt, 94.5% idle
> > > > Mem: 474M Active, 1609M Inact, 764M Wired, 281M Buf, 119M Free
> > > > Swap: 4096M Total, 917M Used, 3178M Free, 22% Inuse
> > >
> > > To analyze this you need to capture vmstat output. You'll see the free pool
> > > dip below a threshold and pages go out to disk in response.
> > > If you have daemons with small working sets, pages that are not part of the
> > > working sets for daemons or applications will eventually be paged out. This
> > > is not a bad thing. In your example above, the 281 MB of UFS buffers are more
> > > active than the 917 MB paged out. If it's paged out and never used again,
> > > then it doesn't hurt. However the 281 MB of buffers saves you I/O. The
> > > inactive pages are part of your free pool that were active at one time but
> > > now are not. They may be reclaimed, and if they are, you've just saved more
> > > I/O.
> > >
> > > Top is a poor tool to analyze memory use. Vmstat is the better tool to help
> > > understand memory use. Inactive memory isn't a bad thing per se. Monitor
> > > page outs, scan rate and page reclaims.
> >
> > I give up! I tried to check via ssh/vmstat what is going on. These are the
> > last lines before the broken pipe:
> >
> > [...]
> > procs      memory       page                        disks      faults         cpu
> >  r b  w   avm   fre    flt  re  pi  po     fr    sr ad0 ad1   in     sy     cs us sy id
> > 22 0 22  5.8G  1.0G  46319   0   0   0  55721  1297   0   4  219  23907   5400 95  5  0
> > 22 0 22  5.4G  1.3G  51733   0   0   0  72436  1162   0   0  108  40869   3459 93  7  0
> > 15 0 22   12G  1.2G  54400   0  27   0  52188  1160   0  42  148  52192   4366 91  9  0
> > 14 0 22   12G  1.0G  44954   0  37   0  37550  1179   0  39  141  86209   4368 88 12  0
> > 26 0 22   12G  1.1G  60258   0  81   0  69459  1119   0  27  123 779569 704359 87 13  0
> > 29 3 22   13G  774M  50576   0  68   0  32204  1304   0   2  102 507337 484861 93  7  0
> > 27 0 22   13G  937M  47477   0  48   0  59458  1264   3   2  112  68131  44407 95  5  0
> > 36 0 22   13G  829M  83164   0   2   0  82575  1225   1   0  126  99366  38060 89 11  0
> > 35 0 22  6.2G  1.1G  98803   0  13   0 121375  1217   2   8  112  99371   4999 85 15  0
> > 34 0 22   13G  723M  54436   0  20   0  36952  1276   0  17  153  29142   4431 95  5  0
> > Fssh_packet_write_wait: Connection to 192.168.0.1 port 22: Broken pipe
> >
> > This makes this crap system completely unusable. The server in question
> > (FreeBSD 11.0-CURRENT #20 r297503: Sat Apr  2 09:02:41 CEST 2016 amd64) was
> > running a poudriere bulk job. I cannot even determine which terminal goes
> > down first - another one, idle for much longer than the one showing the
> > "vmstat 5" output, is still alive!
> >
> > I consider this a serious bug, and nothing that has happened since this
> > "fancy" update is a benefit. :-(
>
> By the way - it might be of interest and some kind of hint.
>
> One of my boxes acts as server and gateway. It uses NAT and IPFW; when it is
> under high load, as it was today, passing the network flow from the ISP to the
> clients on the internal network is sometimes extremely slow. I do not consider
> this the reason for the collapsing ssh sessions, since the incident also
> happens under no load, but in the overall view of the problem it could be a
> hint - I hope.

I just checked on one box that "broke the pipe" very quickly after I started
poudriere, although it had been doing fine for a couple of hours before the pipe
broke. It seems to be load dependent when the ssh session gets wrecked. More
importantly: after the long-haul poudriere run I rebooted the box, tried again,
and got the mentioned broken pipe within a couple of minutes of poudriere
starting. Then I left the box alone for several hours, logged in again and
checked the swap. Although there had been no load or other pressure for hours,
31% of swap was still in use (the box has 16 GB of RAM and is propelled by a
XEON E3-1245 V2).
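
To judge whether that stale swap actually hurts, I will snapshot the swap pager
counters before and after touching the box again - if the swap-in counter barely
moves, the pages were simply never needed again, as Cy described. A minimal
sketch; the counter names are the vm.stats sysctls as I know them, everything
else is my own choice:

    #!/bin/sh
    # Record cumulative swap pager activity and current swap usage; run once,
    # use the box for a while, run again and compare the numbers.
    date
    sysctl vm.stats.vm.v_swappgsin vm.stats.vm.v_swappgsout
    swapinfo -k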
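
For the next bulk run I will also try to log the paging statistics Cy asked for
to a file, so the data survives the dead ssh session. Just a plain sh sketch;
the log location and the 5 second interval are arbitrary choices of mine:

    #!/bin/sh
    # Append periodic swap/paging snapshots to a local file instead of an ssh
    # terminal, so the data survives a broken pipe. The cumulative counters
    # from vmstat -s (page outs, reclaims, pagedaemon activity) can then be
    # compared between snapshots after the run.
    LOG=/var/tmp/paging.log      # arbitrary location

    while true; do
            { date; swapinfo -k; vmstat -s; } >> "$LOG"
            sleep 5
    done

Started from the console or under nohup(1) this keeps collecting even when every
ssh session is gone.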
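
And to keep at least one session alive long enough to watch it happen, I will
add the usual OpenSSH client keepalives on the machine I ssh in from - obviously
only a workaround for the dropped sessions, not a fix for whatever CURRENT is
doing underneath. The 30 second interval and the count are just guesses:

    #!/bin/sh
    # Append client keepalive options for the box that shows up in the
    # "Broken pipe" message above; the values are arbitrary.
    cat >> ~/.ssh/config <<'EOF'
    Host 192.168.0.1
            ServerAliveInterval 30
            ServerAliveCountMax 4
    EOF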