Date:      Sat, 9 Apr 2016 10:54:44 +0200
From:      "O. Hartmann" <ohartman@zedat.fu-berlin.de>
To:        Cy Schubert <Cy.Schubert@komquats.com>
Cc:        Michael Butler <imb@protected-networks.net>, "K. Macy" <kmacy@freebsd.org>, FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject:   Re: CURRENT slow and shaky network stability
Message-ID:  <20160409105444.7020f2f1.ohartman@zedat.fu-berlin.de>
In-Reply-To: <201604050646.u356k850078565@slippy.cwsent.com>
References:  <20160405082047.670d7241@freyja.zeit4.iv.bundesimmobilien.de> <201604050646.u356k850078565@slippy.cwsent.com>


On Mon, 04 Apr 2016 23:46:08 -0700
Cy Schubert <Cy.Schubert@komquats.com> wrote:

> In message <20160405082047.670d7241@freyja.zeit4.iv.bundesimmobilien.de>,
> "O. Hartmann" writes:
> > On Sat, 02 Apr 2016 16:14:57 -0700
> > Cy Schubert <Cy.Schubert@komquats.com> wrote:
> > > In message <20160402231955.41b05526.ohartman@zedat.fu-berlin.de>,
> > > "O. Hartmann" writes:
> > > > On Sat, 2 Apr 2016 11:39:10 +0200
> > > > "O. Hartmann" <ohartman@zedat.fu-berlin.de> wrote:
> > > > > On Sat, 2 Apr 2016 10:55:03 +0200
> > > > > "O. Hartmann" <ohartman@zedat.fu-berlin.de> wrote:
> > > > > > On Sat, 02 Apr 2016 01:07:55 -0700
> > > > > > Cy Schubert <Cy.Schubert@komquats.com> wrote:
> > > > > > > In message <56F6C6B0.6010103@protected-networks.net>,
> > > > > > > Michael Butler writes:
> > > > > > > > -current is not great for interactive use at all. The
> > > > > > > > strategy of pre-emptively dropping idle processes to swap
> > > > > > > > is hurting .. big time.
> > > > > > >
> > > > > > > FreeBSD doesn't "preemptively" or arbitrarily push pages out
> > > > > > > to disk. LRU doesn't do this.
> > > > > > >
> > > > > > > > Compare inactive memory to swap in this example ..
> > > > > > > >
> > > > > > > > 110 processes: 1 running, 108 sleeping, 1 zombie
> > > > > > > > CPU:  1.2% user,  0.0% nice,  4.3% system,  0.0% interrupt, 94.5% idle
> > > > > > > > Mem: 474M Active, 1609M Inact, 764M Wired, 281M Buf, 119M Free
> > > > > > > > Swap: 4096M Total, 917M Used, 3178M Free, 22% Inuse
> > > > > > >
> > > > > > > To analyze this you need to capture vmstat output. You'll
> > > > > > > see the free pool dip below a threshold and pages go out to
> > > > > > > disk in response. If you have daemons with small working
> > > > > > > sets, pages that are not part of the working sets for
> > > > > > > daemons or applications will eventually be paged out. This
> > > > > > > is not a bad thing. In your example above, the 281 MB of UFS
> > > > > > > buffers are more active than the 917 MB paged out. If it's
> > > > > > > paged out and never used again, then it doesn't hurt.
> > > > > > > However the 281 MB of buffers saves you I/O. The inactive
> > > > > > > pages are part of your free pool that were active at one
> > > > > > > time but now are not. They may be reclaimed and if they are,
> > > > > > > you've just saved more I/O.
> > > > > > >
> > > > > > > Top is a poor tool to analyze memory use. Vmstat is the
> > > > > > > better tool to help understand memory use. Inactive memory
> > > > > > > isn't a bad thing per se. Monitor page outs, scan rate and
> > > > > > > page reclaims.
> > > > > >
> > > > > > I give up! I tried to check via ssh/vmstat what is going on.
> > > > > > Last lines before the broken pipe:
> > > > > >
> > > > > > [...]
> > > > > > procs  memory      page                       disks     faults         cpu
> > > > > > r  b w  avm   fre    flt  re  pi  po     fr    sr ad0 ad1   in     sy     cs us sy id
> > > > > > 22 0 22 5.8G  1.0G  46319   0   0   0  55721  1297   0   4  219  23907   5400 95  5  0
> > > > > > 22 0 22 5.4G  1.3G  51733   0   0   0  72436  1162   0   0  108  40869   3459 93  7  0
> > > > > > 15 0 22  12G  1.2G  54400   0  27   0  52188  1160   0  42  148  52192   4366 91  9  0
> > > > > > 14 0 22  12G  1.0G  44954   0  37   0  37550  1179   0  39  141  86209   4368 88 12  0
> > > > > > 26 0 22  12G  1.1G  60258   0  81   0  69459  1119   0  27  123 779569 704359 87 13  0
> > > > > > 29 3 22  13G  774M  50576   0  68   0  32204  1304   0   2  102 507337 484861 93  7  0
> > > > > > 27 0 22  13G  937M  47477   0  48   0  59458  1264   3   2  112  68131  44407 95  5  0
> > > > > > 36 0 22  13G  829M  83164   0   2   0  82575  1225   1   0  126  99366  38060 89 11  0
> > > > > > 35 0 22 6.2G  1.1G  98803   0  13   0 121375  1217   2   8  112  99371   4999 85 15  0
> > > > > > 34 0 22  13G  723M  54436   0  20   0  36952  1276   0  17  153  29142   4431 95  5  0
> > > > > > Fssh_packet_write_wait: Connection to 192.168.0.1 port 22: Broken pipe
> > > > > >
> > > > > > This makes this crap system completely unusable. The server
> > > > > > (FreeBSD 11.0-CURRENT #20 r297503: Sat Apr  2 09:02:41 CEST
> > > > > > 2016 amd64) in question was running a poudriere bulk job. I
> > > > > > cannot even determine which terminal goes down first - another
> > > > > > one, much more idle than the one showing the "vmstat 5"
> > > > > > output, is still alive!
> > > > > >
> > > > > > I consider this a serious bug, and nothing good has come of
> > > > > > this "fancy" update. :-(
> > > > >
> > > > > By the way - it might be of interest and some hint.
> > > > >
> > > > > One of my boxes acts as server and gateway. It utilises NAT and
> > > > > IPFW. When it is under high load, as it was today, passing the
> > > > > network flow from the ISP to the clients on the network is
> > > > > sometimes extremely slow. I do not consider this the reason for
> > > > > the collapsing ssh sessions, since the incident also happens
> > > > > under no load, but in the overall view of the problem this
> > > > > could be a hint - I hope.
> > > >
> > > > I just checked on one box that "broke pipe" very quickly after I
> > > > started poudriere, while it had been doing well for a couple of
> > > > hours before the pipe broke. It seems load dependent when the ssh
> > > > session gets wrecked. More importantly, after the long-haul
> > > > poudriere run I rebooted the box and tried again, hitting the
> > > > mentioned broken pipe a couple of minutes after poudriere
> > > > started. Then I left the box alone for several hours, logged in
> > > > again and checked the swap. Although there had been no load or
> > > > other pressure for hours, 31% of swap was still in use (the box
> > > > has 16 GB of RAM and is propelled by a XEON E3-1245 V2).
> > >
> > > 31%! Is it *actively* paging, or was the 31% previously paged out
> > > and no paging is *currently* being experienced? 31% of how much
> > > swap space in total?
> > >
> > > Also, what does ps aumx or ps aumxww say? Pipe it to head -40 or
> > > similar.
> >
> > On FreeBSD 11.0-CURRENT #4 r297573: Tue Apr  5 07:01:19 CEST 2016
> > amd64, local network, no NAT. Stuck ssh session in the middle of
> > administering, after leaving the console/ssh session for a couple of
> > minutes:
> >
> > root        2064   0.0  0.1  91416  8492  -  Is   07:18   0:00.03 sshd: hartmann [priv] (sshd)
> > hartmann    2108   0.0  0.1  91416  8664  -  I    07:18   0:07.33 sshd: hartmann@pts/0 (sshd)
> > root       72961   0.0  0.1  91416  8496  -  Is   08:11   0:00.03 sshd: hartmann [priv] (sshd)
> > hartmann   72970   0.0  0.1  91416  8564  -  S    08:11   0:00.02 sshd: hartmann@pts/1 (sshd)
> >
> > The situation is getting worse and I consider this a serious bug.
>
> There's not a lot to go on here. Do you have physical access to the
> machine to pop into DDB and take a look? You did say you're using a
> lot of swap, IIRC 30%, but you didn't answer how much that 30% was
> of. Without more data I can't help you. At best I can take wild
> guesses, but that won't help you. Try to answer the questions I asked
> last week and we can go further. Until then all we can do is wildly
> guess.

Apologies for the late answer; I have been busy.

Well, the "homebox" is physically accessible, as are the systems at work, but the
machines at work are heavily used right now.

As you stated in your previous email, I "overload" the boxes. Yes, I do this
intentionally, and FreeBSD CURRENT withstood those attacks until approximately 3 or 4
weeks ago, when these problems occurred.
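
As a stopgap against the dropping sessions I will try client-side keepalives; a
minimal sketch for ~/.ssh/config (the host alias is just a placeholder, and I do not
know yet whether it helps against the packet_write_wait errors):

  # ~/.ssh/config - send an application-level keepalive every 30 s,
  # give up only after 4 missed replies (roughly 2 minutes)
  Host buildbox
      ServerAliveInterval 30
      ServerAliveCountMax 4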

The 30% of swap was the "remainder" after I had started poudriere; poudriere "died"
due to a lost/broken ssh session, and the swap usage did not relax even after hours!
The box didn't do anything in the time after the pipe broke - that is why I
mentioned it.
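
To answer your question about what the 31% is of: next time I will capture the
numbers with the base tools, roughly like this (the grep pattern is only my guess at
the relevant counters):

  swapinfo -h              # total vs. used swap space
  ps aumxww | head -40     # biggest memory consumers, as you suggested
  vmstat -s | grep -i pag  # cumulative pagein/pageout/reclaim counters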

You also mentioned UFS and ZFS concurrency. Yes, I use a mixed system: UFS for the
system's partitions and ZFS for the data volumes. UFS on SSDs feels "faster", but
that is only a subjective impression of mine. Having /usr/ports on both UFS and ZFS,
with enough memory (32 GB RAM), shows significant differences on the very same HDD:
while UFS has already finished updating a "matured" svn tree, the ZFS-based tree can
take another 5 or 6 minutes to finish. I think this is due to the growing .svn
folder, but on ZFS it occurs only the first time /usr/ports is updated.
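
The comparison was nothing more scientific than timing the same update on both
trees (the paths are just how my trees happen to be laid out):

  time svn update /usr/ports   # UFS-backed tree
  time svn update /tank/ports  # ZFS-backed copy of the same tree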

Just to say: if UFS and ZFS coexistence is critical, this definitely must go into
the Handbook!

But on the other hand, what I am complaining about is a dramatic change in the
stability of CURRENT since these problems first occurred. Before that, the very same
hardware, the very same setup and the very same jobs performed well. I pushed the
boxes to their limits with poudriere and several scientific jobs, and they took it
like a German tank.

By the way, I use csh in all scenarios - I do not know whether this helps.

At this moment I am quite unfamiliar with deeper investigation of the FreeBSD OS
using debugging tools, but this has high priority on my to-do list. If someone can
point me towards the right tools and literature (manpages, or perhaps sections of
the FreeBSD development literature), it would be highly appreciated.
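
For the capture you asked for, something as crude as this should do (file names are
arbitrary; started on the console, so a dropped ssh session cannot kill it):

  nohup vmstat 5 >> /var/log/vmstat.capture 2>&1 &
  nohup iostat 5 >> /var/log/iostat.capture 2>&1 &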

And another piece of information that just came to my mind:

I use tmpfs on /tmp and /var/run. I also have a GELI-encrypted swap partition (on a
UFS-based SSD).
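
The relevant fstab entries look roughly like this (the swap device name is an
example; the .eli suffix makes the rc scripts attach the swap device with a one-time
GELI key at boot):

  # /etc/fstab (excerpt)
  tmpfs            /tmp      tmpfs  rw,mode=1777  0  0
  tmpfs            /var/run  tmpfs  rw            0  0
  /dev/ada0p3.eli  none      swap   sw            0  0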

Kind regards,

Oliver



