Date:      Tue, 16 Oct 2018 23:31:44 -0400
From:      Allan Jude <allanjude@freebsd.org>
To:        freebsd-hackers@freebsd.org
Subject:   Re: High load and MySQL slow without apparent reason
Message-ID:  <248cd85b-f36e-58ea-873d-8d89846f1c93@freebsd.org>
In-Reply-To: <CAG0rGZecYsycwuBzhRBngnBc7TG5Y5913VmdLPPhCbodZPKu8Q@mail.gmail.com>
References:  <CAG0rGZecYsycwuBzhRBngnBc7TG5Y5913VmdLPPhCbodZPKu8Q@mail.gmail.com>

On 2018-10-16 22:25, Darek Margas wrote:
> Hi Everyone,
>
> I'm trying to refresh my old FreeBSD experience by moving a MySQL platform
> from Linux onto FreeBSD+ZFS.
>
> Before I ask for your help I would like to give you some context.
>
> The machine is a Dell server with 2x20 cores, an Intel IXL NIC, 1TB of RAM
> and lots of SAS SSD drives.
> The kernel is slightly modified: some unused stuff removed, the ixl driver
> replaced with the latest from the Intel website, and NUMA enabled.
> The whole thing runs a number of MySQL daemons packed in jails (bridged
> network) with settings optimized for ZFS ARC caching (O_DIRECT, small
> buffers, etc).
>
> This is 11.2-RELEASE.
>
> When I first tested it I ran into trouble with back pressure on ARC
> when short on memory, leading the machine to death. I also found that
> disabling ARC compression solved the silent death, but decided to do some
> tuning to keep more memory free for sudden need.
>
> Ran some tests, used it for replication slaves, etc.
>
> Here is the thing - how I crashed this machine without understanding what
> has happened.
>
> First my tunes. I adjusted v_free_target and v_free_min aiming for 128G and
> 64G respectively. However, I overlooked the fact that this is in pages, not
> in 1k blocks. As a result I set:
>
> - 700G max ARC size
> - 512G v_free_target
> - 256G v_free_min

You likely want to tune 'vfs.zfs.arc_free_target' to a value very close
to v_free_target, or at least v_free_min, so that ZFS also gives back
memory at that level of memory shortage.
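
The unit mix-up described above is easy to check with a little arithmetic.
The sketch below is only an illustration and assumes 4 KiB pages (verify
with `sysctl hw.pagesize`); it also assumes vfs.zfs.arc_free_target is
expressed in pages like the v_free_* values, which is worth confirming
with `sysctl -d` on the system in question. The helper names are
hypothetical.

```python
# Convert between GiB and VM pages. PAGE_SIZE is an assumption (4 KiB);
# check the real value with `sysctl hw.pagesize`.
PAGE_SIZE = 4096
GIB = 1024 ** 3


def gib_to_pages(gib, page_size=PAGE_SIZE):
    """Number of pages needed to cover `gib` GiB."""
    return gib * GIB // page_size


def pages_to_gib(pages, page_size=PAGE_SIZE):
    """Size in GiB represented by `pages` pages."""
    return pages * page_size / GIB


# Intended settings: 128G free target, 64G free minimum.
free_target_pages = gib_to_pages(128)
free_min_pages = gib_to_pages(64)

# The mistake: treating the unit as 1k blocks instead of pages.
mistaken_target = 128 * GIB // 1024

print("vm.v_free_target=%d" % free_target_pages)
print("vm.v_free_min=%d" % free_min_pages)
# Assumption: arc_free_target uses the same unit (pages).
print("vfs.zfs.arc_free_target=%d" % free_target_pages)
print("mistaken value %d pages = %.0f GiB"
      % (mistaken_target, pages_to_gib(mistaken_target)))
```

With 4 KiB pages the mistaken value works out to 512 GiB, which matches
the 512G v_free_target quoted above.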

>
> Obviously this is nonsense; however, the machine worked calmly until ARC
> got half of the memory. Then shit happened. As I built the machine with no
> swap at all, I got a number of zombies and problems with reclaiming the
> console (say, open VI, which works, then exit and VI stays on the console
> while having become a zombie).
> That was "fixed" by disabling swapping via sysctl. I also noticed 25% of
> CPU taken by "system" with nothing popping up in top except pagedaemon and
> zfs (on arc_reclaim).
>
> I added 40G of swap and rebooted the machine but kept the wrong settings.
>
> It was again calm until ARC got half of the memory. This is when I found
> what I did and fixed the v_free stuff to be
>
> - 128G v_free_target
> - 64G v_free_min
>
> The machine started managing memory the right way, wiping inactive to
> laundry and laundering only when needed. I still observed 25% of
> unexplained load from "system" (floating 5-60%) but all seemed OK.
>
> At this point I switched one replica to be master and put production
> queries on it.
>
> Summarizing the above - the machine had issues and has not been rebooted
> but seemed OK with memory management while having unexplained system load.
>
> Once I switched my SQLs from the Linux master to FreeBSD I noticed slow
> performance. There is a stored proc called every 15 minutes. On the old
> machine and all others it takes around 30-40s to complete, and the previous
> master had a spike in ROW executions to 650kps (one minute sample), while
> the new one got up to 350kps and ran for nearly 3 minutes.
>
> I started looking deeper and found:
> - Made all MySQL settings the same (where possible, as some follow the
> platform) with no improvement
> - MySQL reload did not help
> - Stopping all replicas running on the same machine (5 of them) to
> release resources made it worse (over 5 minutes to complete the call).
> Starting the replicas made it better again by one minute.
>
> BTW - the jail was limited to one NUMA zone and half the cores. Not all
> replicas had the same NUMA and CPU group.
>
> I copied the ZFS content to a test machine which is exactly the same and
> kicked off the same MySQL in the same jail and with the same settings.
> - The test instance ran correctly, with completion time similar to the old
> Linux master
> - ARC on the test machine was loaded up to 700G so I thought it would be
> good enough to compare, but the machine still had lots of memory
>
> To make it closer I compiled a "memory allocator" which simply allocates
> and fills memory until killed or the system dies.
>
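
For readers who want to reproduce this kind of memory pressure, here is a
rough sketch of such an allocator. It is not the tool used above (that
one was compiled and its source was not posted); the function name, chunk
size and the safety limit are all assumptions made for illustration.
Passing limit_mib=None removes the limit, so the loop then runs until the
process is killed or allocation fails.

```python
PAGE = 4096  # assumed page size, used only as the stride for touching memory


def fill_memory(chunk_mib=64, limit_mib=256):
    """Allocate chunk_mib-sized buffers and touch every page in each.

    Stops once limit_mib has been allocated; with limit_mib=None it keeps
    going until the process is killed or allocation fails.
    Returns the list of buffers so they stay referenced (and resident).
    """
    chunks = []
    allocated_mib = 0
    while limit_mib is None or allocated_mib < limit_mib:
        buf = bytearray(chunk_mib * 1024 * 1024)
        # A fresh zeroed buffer may not be backed by real memory yet;
        # writing one byte per page forces the pages to be populated.
        for offset in range(0, len(buf), PAGE):
            buf[offset] = 1
        chunks.append(buf)
        allocated_mib += chunk_mib
    return chunks


held = fill_memory(chunk_mib=16, limit_mib=64)
print("holding %d MiB in %d chunks"
      % (sum(len(c) for c in held) // (1024 * 1024), len(held)))
```

Keeping the buffers in a list matters: dropping the references would let
the memory be freed again and the pressure would disappear.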
> Ran it on the test machine first:
> - No effect until v_free_target passed
> - Once passed, pagedaemon kicked in, memory got wiped and shifted, swap got
> full (paging only anyway)
> - Load around 20% appeared from system, similar to the broken production
> machine
> - Got down to 50G, passing v_free_min
> - Killed allocator
> - After 1-2s of freezing all got back to normal, load from system was gone.
> - Swap was in use for some time after but finally got clean (that was only
> 4G swap on test machine)
> - After some time machine is still calm and MySQL fast
>
> Repeated the same on the production machine:
> - All as above, except:
> - after killing the allocator the machine froze for, say, 10-15s
> - memory was released but load did not change - it neither got much higher
> while allocating memory nor lower after.
> - Machine remained slow
>
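
When repeating a test like the two runs above, it can help to log where
the free page count sits relative to the two thresholds. The sketch below
is a rough labeling based only on the behaviour described in this thread,
not on the kernel's actual pagedaemon logic. The function names are
hypothetical; the sysctl names (vm.stats.vm.v_free_count,
vm.v_free_target, vm.v_free_min) are the ones I would expect on FreeBSD,
so check them with `sysctl -a` first.

```python
import subprocess


def read_sysctl_int(name):
    """Read one integer sysctl value via `sysctl -n` (FreeBSD only)."""
    result = subprocess.run(["sysctl", "-n", name],
                            capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


def classify(free_pages, free_target, free_min):
    """Label the free page count relative to the two thresholds."""
    if free_pages < free_min:
        return "below v_free_min"
    if free_pages < free_target:
        return "below v_free_target"
    return "above v_free_target"


def snapshot():
    """Return (free pages, label) for the running system."""
    free = read_sysctl_int("vm.stats.vm.v_free_count")
    target = read_sysctl_int("vm.v_free_target")
    minimum = read_sysctl_int("vm.v_free_min")
    return free, classify(free, target, minimum)
```

Calling snapshot() in a loop while the allocator runs gives a simple
timeline to compare between the test and production machines.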
> Finally I rebooted the whole machine and now it is fast while building ARC.
> I believe it won't have the same issue soon as the v_free stuff is set
> correctly; however, I need to understand why this MySQL process suffered and
> whether it was possible to recover it without a reboot. I can imagine it was
> something running in a loop, or contention on something otherwise unused,
> or simply another clash in settings triggering something in an unusual way,
> but I have no idea where to look to investigate it. Well, it's possible that
> there is a bug too.
>
> Before the reboot I collected various vmstats, tops, ran ktrace on MySQL and
> sysctl to dump settings. Not posting as I don't know what would be useful.
>
> Could you please point me in the right direction?
>
> Cheers,
> Darek
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>


-- 
Allan Jude





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?248cd85b-f36e-58ea-873d-8d89846f1c93>