Date:      Sun, 11 Jun 2017 16:51:13 +0000
From:      "Caza, Aaron" <Aaron.Caza@ca.weatherford.com>
To:        Allan Jude <allanjude@freebsd.org>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: FreeBSD10 Stable + ZFS + PostgreSQL + SSD performance drop < 24 hours
Message-ID:  <a8523e8099404bd699525f8ff7763819@DM2PR58MB013.032d.mgd.msft.net>

Thanks Allan for the suggestions.  I tried gstat -d but deletes (d/s) doesn't seem to be it, as it stays at 0 despite vfs.zfs.trim.enabled=1.
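In case it helps, this is roughly how I'm pulling the d/s column out of gstat batch output (the sample line and the ada0 device name are made up for illustration; the column position may differ between gstat versions, so check against a live header line):

```shell
#!/bin/sh
# Extract the d/s (deletes per second) column from `gstat -d -b` output.
# With -d, the delete rate is the 9th column and the device name the 13th
# on this system.  The sample line below is illustrative only; on a live
# box, pipe `gstat -d -b -I 10s` through the same awk program instead.
sample=' 0 120 100 3200 0.4 20 800 0.6 5 320 1.2 12.3 ada0'
echo "$sample" | awk '{ print "d/s =", $9, "on", $13 }'
```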

This is most likely due to the "layering" I use: for historical reasons, I have GEOM ELI set up to essentially emulate 4k sectors regardless of the underlying media.  I do my own alignment and partition sizing, and have the ZFS record size set to 8k for Postgres.
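For what it's worth, a quick sanity check on the alignment (the start sector below is just an example; on a real box, read it from `gpart show`.  The 4k emulation itself comes from geli's sector-size flag, e.g. `geli init -s 4096`):

```shell
#!/bin/sh
# Example alignment check: a partition whose start sector (in 512-byte
# units) times 512 is a multiple of 4096 is 4k-aligned.  The start
# sector here is illustrative; substitute the value from `gpart show`.
start_sector=2048
bytes=$((start_sector * 512))
if [ $((bytes % 4096)) -eq 0 ]; then
    echo "4k-aligned"
else
    echo "misaligned"
fi
```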

In gstat, the SSDs' %busy is 90-100% on startup after reboot.  Once the performance degradation hits (<24 hours later), I'm seeing %busy at ~10%.

#!/bin/sh
psql --username=test --password=supersecret -h /db -d test << EOL
\timing on
select count(*) from test;
\q
EOL

Sample run of above script after reboot (before degradation hits) (Samsung 850 Pros in ZFS mirror):
Timing is on.
  count
----------
 21568508
(1 row)

Time: 57029.262 ms

Sample run of above script after degradation (Samsung 850 Pros in ZFS mirror):
Timing is on.
  count
----------
 21568508
(1 row)

Time: 583595.239 ms
(Uptime ~1 day in this particular case.)


Any other suggestions?

Regards,
A

-----Original Message-----
From: owner-freebsd-hackers@freebsd.org [mailto:owner-freebsd-hackers@freebsd.org] On Behalf Of Allan Jude
Sent: Saturday, June 10, 2017 9:40 PM
To: freebsd-hackers@freebsd.org
Subject: [EXTERNAL] Re: FreeBSD10 Stable + ZFS + PostgreSQL + SSD performance drop < 24 hours

On 06/10/2017 12:36, Slawa Olhovchenkov wrote:
> On Sat, Jun 10, 2017 at 04:25:59PM +0000, Caza, Aaron wrote:
>
>> Gents,
>>
>> I'm experiencing an issue where iterating over a PostgreSQL table of ~21.5 million rows (select count(*)) goes from ~35 seconds to ~635 seconds on Intel 540 SSDs.  This is using a FreeBSD 10 amd64 stable kernel back from Jan 2017.  SSDs are basically 2 drives in a ZFS mirrored zpool.  I'm using PostgreSQL 9.5.7.
>>
>> I've tried:
>>
>> *       Using the FreeBSD10 amd64 stable kernel snapshot of May 25, 2017.
>>
>> *       Tested on half a dozen machines with different models of SSDs:
>>
>> o   Intel 510s (120GB) in ZFS mirrored pair
>>
>> o   Intel 520s (120GB) in ZFS mirrored pair
>>
>> o   Intel 540s (120GB) in ZFS mirrored pair
>>
>> o   Samsung 850 Pros (256GB) in ZFS mirrored pair
>>
>> *       Using bonnie++ to remove Postgres from the equation, and performance does indeed drop.
>>
>> *       Rebooting server and immediately re-running test, and performance is back to original.
>>
>> *       Tried using Karl Denninger's patch from PR187594 (which took some work to find a kernel that the FreeBSD10 patch would both apply and compile cleanly against).
>>
>> *       Tried disabling ZFS lz4 compression.
>>
>> *       Ran the same test on a FreeBSD9.0 amd64 system using PostgreSQL 9.1.3 with 2 Intel 520s in ZFS mirrored pair.  System had 165 days uptime, and the test took ~80 seconds, after which I rebooted, re-ran the test, and was still at ~80 seconds (older processor and memory in this system).
>>
>> I realize that there's a whole lot of info I'm not including (dmesg, zfs-stats -a, gstat, et cetera): I'm hoping some enlightened individual will be able to point me to a solution with only the above to go on.
>
> Just a random guess: can you try r307264 (I mean the regression in
> r307266)?
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>

This sounds a bit like an issue I investigated for a customer a few months ago.

Look at gstat -d (includes DELETE operations like TRIM)

If you see a lot of that happening, try vfs.zfs.trim.enabled=0 in /boot/loader.conf and see if your issues go away.
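Roughly (tunable name as above; the loader.conf change needs a reboot to take effect):

```shell
# Check the current setting at runtime, then disable ZFS TRIM in
# /boot/loader.conf to test whether batched TRIMs cause the slowdown.
sysctl vfs.zfs.trim.enabled
echo 'vfs.zfs.trim.enabled=0' >> /boot/loader.conf   # then reboot
```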

The FreeBSD TRIM code for ZFS basically waits until the sector has been free for a while (to avoid doing a TRIM on a block we'll immediately reuse), so your benchmark will run fine for a little while, then suddenly the TRIM will kick in.

For postgres, fio, bonnie++, etc., make sure the ZFS dataset you are storing the data on / benchmarking has a recordsize that matches the workload.
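For a Postgres data directory that usually means something like the following (pool/dataset names are examples; note recordsize only applies to newly written blocks, so rewrite the data after changing it):

```shell
# Match recordsize to PostgreSQL's 8 kB page size.  "tank/pgdata" is an
# example dataset name.  Only newly written blocks pick up the new
# recordsize, so dump/restore or copy the data back in afterwards.
zfs get recordsize tank/pgdata
zfs set recordsize=8k tank/pgdata
```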

If you are doing a write-only benchmark and you see lots of reads in gstat, you know you are having to do read/modify/writes, and that is why your performance is so bad.
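To put a rough number on it: with the default 128k recordsize, each 8k random write can turn into a full-record read-modify-write, e.g.:

```shell
#!/bin/sh
# Worst-case write amplification from a record-size mismatch: an 8 kB
# write landing in a 128 kB record forces the whole record to be read,
# modified, and rewritten (compression and ARC caching ignored).
app_write_kb=8
recordsize_kb=128
echo "worst-case amplification: $((recordsize_kb / app_write_kb))x"
```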


--
Allan Jude



