Date: Mon, 18 Jun 2018 18:31:40 -0700
From: Mark Millard <marklmi@yahoo.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: freebsd-arm@freebsd.org
Subject: Re: GPT vs MBR for swap devices
Message-ID: <BC42DDF9-9383-437B-8AE2-A538050C5160@yahoo.com>
In-Reply-To: <20180619005519.GB81275@www.zefox.net>
References: <7AB401DF-7AE4-409B-8263-719FD3D889E5@yahoo.com>
 <20180618230419.GA81275@www.zefox.net>
 <A8D00616-ADA7-4A33-8787-637AFEF547CF@yahoo.com>
 <20180619005519.GB81275@www.zefox.net>
On 2018-Jun-18, at 5:55 PM, bob prohaska <fbsd at www.zefox.net> wrote:

> On Mon, Jun 18, 2018 at 04:42:21PM -0700, Mark Millard wrote:
>>
>> On 2018-Jun-18, at 4:04 PM, bob prohaska <fbsd at www.zefox.net> wrote:
>>
>>> On Sat, Jun 16, 2018 at 04:03:06PM -0700, Mark Millard wrote:
>>>>
>>>> Since the "multiple swap partitions across multiple
>>>> devices" context (my description) is what has problems,
>>>> it would be interesting to see swapinfo information
>>>> from around the time frame of the failures: how much is
>>>> used vs. available on each swap partition? Is only one
>>>> being (significantly) used? The small one (1 GiByte)?
>>>>
>>> There are some preliminary observations at
>>>
>>> http://www.zefox.net/~fbsd/rpi3/swaptests/newtests/1gbusbflash_1gbsdflash_swapinfo/1gbusbflash_1gbsdflash_swapinfo.log
>>>
>>> If you search for 09:44: (the time of the OOM kills) it looks like
>>> both swap partitions are equally used, but only 8% full.
>>>
>>> At this point I'm wondering if the gstat interval (presently 10 seconds)
>>> might well be shortened and the ten-second sleep eliminated. On the runs
>>> that succeed, swap usage changes little in twenty seconds, but the failures
>>> seem to culminate rather briskly.
>>
>> One thing I find interesting somewhat before the OOM activity is
>> the 12355 ms/w and 12318 ms/w on da0 and da0d that goes along
>> with having 46 or 33 L(q) and large %busy figures in the same
>> lines --and 0 w/s on every line:
>>
>> Mon Jun 18 09:42:05 PDT 2018
>> Device          1K-blocks     Used    Avail Capacity
>> /dev/da0b         1048576     3412  1045164     0%
>> /dev/mmcsd0s3b    1048576     3508  1045068     0%
>> Total             2097152     6920  2090232     0%
>> dT: 10.043s  w: 10.000s
>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>>     0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0
>>    46      0      0      0    0.0      0     16  12355      0      0    0.0   85.9  da0
>>     0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0s3
>>     0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0s3a
>>    33      0      0      0    0.0      0     22  12318      0      0    0.0  114.1  da0d
>> Mon Jun 18 09:42:25 PDT 2018
>> Device          1K-blocks     Used    Avail Capacity
>> /dev/da0b         1048576     3412  1045164     0%
>> /dev/mmcsd0s3b    1048576     3508  1045068     0%
>> Total             2097152     6920  2090232     0%
>>
>> The kBps figures for the writes are not very big above.
>>
>
> If it takes 12 seconds to write, I can understand the swapper getting impatient....
> However, the delay is on /usr, not swap.
>
> In the subsequent 1 GB USB flash-alone test case at
> http://www.zefox.net/~fbsd/rpi3/swaptests/newtests/1gbusbflash_swapinfo/1gbusbflash_swapinfo.log
> the worst-case seems to be at time 13:45:00
>
> dT: 13.298s  w: 10.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>     0      0      0      0    0.0      0      5    5.5      0      0    0.0    0.1  mmcsd0
>     9     84      0      0    0.0     84   1237   59.6      0      0    0.0   94.1  da0
>     0      0      0      0    0.0      0      5    5.5      0      0    0.0    0.1  mmcsd0s3
>     0      0      0      0    0.0      0      5    5.6      0      0    0.0    0.1  mmcsd0s3a
>     5     80      0      0    0.0     80   1235   47.2      0      0    0.0   94.1  da0b
>     4      0      0      0    0.0      0      1   88.1      0      0    0.0    0.7  da0d
> Mon Jun 18 13:45:00 PDT 2018
> Device          1K-blocks     Used    Avail Capacity
> /dev/da0b         1048576    22872  1025704     2%
>
> 1.2 MB/s writing to swap seems not too shabby, hardly reason to kill a process.

That is kBps instead of ms/w.
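(As an aside: latency spikes like the 12355 ms/w above can be flagged mechanically instead of by eye when scanning long logs. A minimal sketch, assuming gstat's default 13-column layout; the 1000 ms threshold is an arbitrary illustration value, not anything from the logs themselves:)

```python
# Flag devices whose gstat ms/w (average write latency) exceeds a threshold.
# Assumes gstat's default column order:
#   L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
# Sample rows are taken from the log excerpt quoted above.

SAMPLE = """\
    0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0
   46      0      0      0    0.0      0     16  12355      0      0    0.0   85.9  da0
   33      0      0      0    0.0      0     22  12318      0      0    0.0  114.1  da0d
"""

def slow_writers(gstat_text, threshold_ms=1000.0):
    """Return (name, ms_w) pairs for rows whose ms/w exceeds threshold_ms."""
    hits = []
    for line in gstat_text.splitlines():
        fields = line.split()
        if len(fields) != 13:
            continue  # skip timestamps, headers, swapinfo rows, etc.
        ms_w, name = float(fields[7]), fields[12]
        if ms_w > threshold_ms:
            hits.append((name, ms_w))
    return hits

print(slow_writers(SAMPLE))  # -> [('da0', 12355.0), ('da0d', 12318.0)]
```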
I see a ms/w (and ms/r) that is fairly large (but notably smaller
than the ms/w of over 12000):

Mon Jun 18 13:12:58 PDT 2018
Device          1K-blocks     Used    Avail Capacity
/dev/da0b         1048576        0  1048576     0%
dT: 10.400s  w: 10.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
    0      4      0      0    0.0      4     66    3.4      0      0    0.0    1.3  mmcsd0
    8     18      1     32   1991     17    938   2529      0      0    0.0   88.1  da0
    0      4      0      0    0.0      4     63    3.5      0      0    0.0    1.3  mmcsd0s3
    0      4      0      0    0.0      4     63    3.5      0      0    0.0    1.3  mmcsd0s3a
    6     11      1     32   1991     10    938   3207      0      0    0.0   94.7  da0d
Mon Jun 18 13:13:19 PDT 2018
Device          1K-blocks     Used    Avail Capacity
/dev/da0b         1048576        0  1048576     0%

Going in a different direction: I believe that you have reported
needing more than 1 GiByte of swap space, so the 1048576 "1K-blocks"
would not be expected to be sufficient. So the specific failing point
may well be odd, but the build would not be expected to finish without
an OOM in this context, if I understand right.

> Thus far I'm baffled. Any suggestions?

Can you get a failure without involving da0, the drive that is
sometimes showing these huge ms/w (and ms/r) figures? (This question
presumes having sufficient swap space, so, say, 1.5 GiByte or more
total.)

Having the partition(s) each be sufficiently sized, but with a total
that would not produce the notice for too large a swap space, was my
original "additional" suggestion. I still want to see what such a
configuration does as a variation of a failing context. But now it
would seem to be a good idea to avoid da0 and its sometimes-large
ms/w and ms/r figures.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
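P.S. (editorial sketch) -- whether a given set of swap partitions reaches a required total can be checked directly from swapinfo output. A small illustration; the 1.5 GiByte figure is just the minimum suggested above, and the sample rows come from the two-partition log quoted earlier:

```python
# Sum the 1K-blocks column of swapinfo output and compare against a
# required total. Rough sketch; device lines are identified by their
# /dev/ prefix, so any trailing "Total" row is ignored.

SWAPINFO = """\
Device          1K-blocks     Used    Avail Capacity
/dev/da0b         1048576     3412  1045164     0%
/dev/mmcsd0s3b    1048576     3508  1045068     0%
"""

def total_swap_kib(swapinfo_text):
    """Total configured swap in 1K-blocks, summed over /dev/ lines."""
    return sum(int(line.split()[1])
               for line in swapinfo_text.splitlines()
               if line.startswith("/dev/"))

total = total_swap_kib(SWAPINFO)
needed = int(1.5 * 1024 * 1024)   # 1.5 GiByte expressed in 1K-blocks
print(total, total >= needed)     # -> 2097152 True
```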