Date:      Sun, 20 Jul 2014 13:16:55 -0700
From:      Jeremy Chadwick <jdc@koitsu.org>
To:        Steven Hartland <killing@multiplay.co.uk>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: Consistently "high" CPU load on 10.0-STABLE
Message-ID:  <20140720201655.GA70545@icarus.home.lan>
In-Reply-To: <ED826825202341E58B71A3F718B60562@multiplay.co.uk>
References:  <20140720062413.GA56318@icarus.home.lan> <97EA8E571E634DBBAA70F7AA7F0DE97C@multiplay.co.uk> <20140720173524.GA67065@icarus.home.lan> <ED826825202341E58B71A3F718B60562@multiplay.co.uk>

I suppose I could try "reverting" r265944 + r265945 and see if there's
any improvement, and if not, then try "reverting" r254304 and see if
there's any behavioural change.

In the interim, I ended up writing this silly one-liner:

while : ; do (uptime ; vmstat -s ; echo) | tee -a uptime_vmstat_s.txt ; sleep 2 ; done 

Then I waited for moments where the load suddenly shot up while the
system really wasn't (from my perspective) doing anything.  Here's an
example (a small snippet from the log, where the load went from 0.06 to
0.30 for no reason I can discern; the full log has some other
occurrences too):

http://jdc.koitsu.org/freebsd/releng10_perf_issue/uptime_vmstat_s.txt

Obviously that output is not ideal for a human to analyse, so I can
write a perl or awk script which accomplishes the same thing --
focusing on specific vmstat -s fields/lines and printing the load along
with deltas (between the previous run and the most recent run) for each
statistic, so we can see which counters suddenly "jump" in correlation
with the load average jumping up.  Think of it like a "vmstat 1" or
"iostat 1" combined with printing the load, all on one line.

I just need to know what lines in vmstat -s are of interest.
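
Something along these lines is what I have in mind -- an untested awk
sketch that post-processes the uptime_vmstat_s.txt log from the
one-liner above.  The counters it matches are just placeholders until I
know which vmstat -s lines actually matter:

#!/usr/bin/awk -f
# Print the 1-minute load average plus per-sample deltas for selected
# vmstat -s counters, one line per two-second sample, reading the
# uptime_vmstat_s.txt log produced by the while-loop above.

/load averages/ {
	# uptime line: grab the 1-minute load average and strip the comma
	load = $(NF-2)
	sub(/,/, "", load)
	next
}

/swap pager pageins|swap pager pageouts|cpu context switches/ {
	# vmstat -s line: "<count> <description>" (placeholder counters)
	n = $1
	$1 = ""
	key = substr($0, 2)
	cur[key] = n
	next
}

/^$/ {
	# blank line from the "echo" ends a sample: print load plus deltas
	# against the previous sample (first sample prints load only)
	line = "load=" load
	for (k in cur) {
		if (k in prev)
			line = line sprintf("  [%s]=%d", k, cur[k] - prev[k])
		prev[k] = cur[k]
	}
	print line
	for (k in cur)
		delete cur[k]
}

Running it as "awk -f deltas.awk uptime_vmstat_s.txt" (the script name
is made up) should give one line per two-second sample once the
interesting counters are plugged into the second pattern.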

One thing I haven't tried is booting the system into single-user mode
and letting it sit there to see if the issue happens then.

But I have tried these with no change in behaviour:

- Shut off samba, isc-dhcpd, named, apache22, redis, and powerd
- Commented out all cron jobs
- Ran swapoff -a

I'm hoping this isn't some ZFS-related thing.  (Yes, I can try to rule
out ZFS if necessary, but I'd have to copy data over to UFS filesystems
and so on to make the system usable in that state; thankfully I keep
root/swap/var/tmp/usr on UFS.)

On Sun, Jul 20, 2014 at 08:09:49PM +0100, Steven Hartland wrote:
> Hmm, does vmstat -s indicate that 10 is swapping much more than 9?
> 
> I know there were some issues in early 10 which resulted in it
> swapping more, but I was under the impression this was fixed by:
> http://svnweb.freebsd.org/base?view=revision&revision=265944
> http://svnweb.freebsd.org/base?view=revision&revision=265945
> 
> Maybe there's still an issue there?
> 
>     Regards
>     Steve
> ----- Original Message ----- 
> From: "Jeremy Chadwick" <jdc@koitsu.org>
> To: "Steven Hartland" <killing@multiplay.co.uk>
> Cc: <freebsd-stable@freebsd.org>
> Sent: Sunday, July 20, 2014 6:35 PM
> Subject: Re: Consistently "high" CPU load on 10.0-STABLE
> 
> 
> > Yes and no, heh... :-)
> > 
> > Using top -a -H -S -z -s 1 and watching very very closely, what I end up
> > seeing is that occasionally syncer reaches WCPU percentages of around
> > 0.50% (or maybe higher) -- but when that happens, the actual load average
> > **does not** suddenly increase.
> > 
> > The load just seems to go from like 0.01 or 0.02 to 0.12 or 0.20
> > sporadically with no real evidence of "why" in top.  Possibly this is
> > because I'm using -s 1 (one-second update intervals) and whatever
> > happens is so quick that it lasts less than a second, so top doesn't
> > catch it even though it still impacts the load?
> > 
> > Anyway, the "top" processes per TIME in the above command are here:
> > 
> >  PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
> >   17 root        16    -     0K    16K syncer  2   1:31   0.49% syncer
> >    0 root       -16    0     0K  4912K swapin  0   1:04   0.00% kernel{swapper}
> >    0 root       -92    0     0K  4912K -       2   0:17   0.00% kernel{em0 que}
> > 1767 root        20    0 57312K 10104K select  1   0:15   0.00% smbd
> >   12 root       -60    -     0K   528K WAIT    0   0:13   0.00% intr{swi4: clock}
> >  643 dhcpd       20    0 24524K 13044K select  1   0:07   0.00% dhcpd
> >   12 root       -88    -     0K   528K WAIT    1   0:04   0.00% intr{irq259: ahci0:ch}
> >   14 root       -16    -     0K    16K -       2   0:04   0.00% rand_harvestq
> > 58515 jdc         20    0 37028K  6196K select  2   0:03   0.00% sshd
> >  420 bind        20    0 70872K 36552K kqread  0   0:02   0.00% named{named}
> >   19 root        20    -     0K    16K sdflus  2   0:02   0.00% softdepflush
> >  420 bind        20    0 70872K 36552K uwait   3   0:02   0.00% named{named}
> > 
> > This is with a system uptime of 14 hours.
> > 
> > Comparatively, the RELENG_9 box I have (although it's on a VPS), which
> > does a lot more in general and has an uptime of 39 days, shows something
> > like this:
> > 
> >  PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
> >   12 root       -60    -     0K   224K WAIT    0 132:49   0.00% intr{swi4: clock}
> >    0 root       -92    0     0K   160K -       0  89:01   0.00% kernel{em0 taskq}
> >   17 root        16    -     0K    16K syncer  0  45:46   0.00% syncer
> >   12 root       -88    -     0K   224K WAIT    1  16:31   0.00% intr{irq14: ata0}
> >   12 root       -60    -     0K   224K WAIT    1  12:52   0.00% intr{swi4: clock}
> >   13 root        -8    -     0K    48K -       1   8:03   0.00% geom{g_down}
> > 15490 halbot      20    0 76172K 22532K select  0   6:34   0.00% perl
> >  593 bind        20    0 92288K 23740K uwait   1   4:39   0.00% named{named}
> >  593 bind        20    0 92288K 23740K uwait   0   4:36   0.00% named{named}
> > 
> > So syncer looks like it might be about right for both systems, but
> > "swapper" (still not sure what that is exactly) sure seems to be much
> > busier on the RELENG_10 box doing nothing vs. the RELENG_9 box.
> > Here's the swapper line from the RELENG_9 box:
> > 
> > $ top -a -H -S -b 9999 | grep swapper
> >    0 root       -16    0     0K   160K swapin  0   0:55   0.00% kernel{swapper}
> > 
> > It's the only suspect I have at this point but it's not very good
> > evidence of anything.  :/
> > 
> > Maybe I can use while : ; do top -a -S -H -z -b 99999 >> somelog.txt ;
> > sleep 0.25 ; done and let that run for a while in hopes of catching
> > something?  But I also worry that such a test would actually impact the
> > load itself.
> > 
> > On Sun, Jul 20, 2014 at 03:26:02PM +0100, Steven Hartland wrote:
> >> If you add -H -z to your top command does anything stand out?
> >> 
> >>     Regards
> >>     Steve
> >> ----- Original Message ----- 
> >> From: "Jeremy Chadwick" <jdc@koitsu.org>
> >> To: <freebsd-stable@freebsd.org>
> >> Sent: Sunday, July 20, 2014 7:24 AM
> >> Subject: Consistently "high" CPU load on 10.0-STABLE
> >> 
> >> 
> >> > (Please keep me CC'd as I'm not subscribed to freebsd-stable@)
> >> > 
> >> > Today I took the liberty of upgrading my main home server from
> >> > 9.3-STABLE (r268785) to 10.0-STABLE (r268894).  The upgrade consisted of
> >> > doing a fresh install of 10.0-STABLE on a brand new unused SSD.  Most
> >> > everything went as planned, barring a couple ports-related anomalies,
> >> > and I seemed fairly impressed by the fact that buildworld times had
> >> > dropped to 27 minutes and buildkernel to 4 minutes with clang (something
> >> > I'd been avoiding like the plague for a long while).  Kudos.
> >> > 
> >> > But after an hour or so, I noticed a consistent (i.e. reproducible)
> >> > trend: the system load average tends to hang around 0.10 to 0.15 all the
> >> > time.  There are times where the load drops to 0.03 or 0.04 but then
> >> > something kicks it back up to 0.15 or 0.20 and then it slowly levels out
> >> > again (over the course of a few minutes) then repeats.
> >> > 
> >> > Obviously this is normal behaviour for a system when something is going
> >> > on periodically.  So I figured it might have been a userland process
> >> > behaving differently under 10.x than 9.x.  I let top -a -S -s 1 run and
> >> > paid very very close attention to it for several minutes.  Nothing.  It
> >> > doesn't appear to be something userland -- it appears to be something
> >> > kernel-level, but nothing in top -S shows up as taking up any CPU time
> >> > other than "[idle]" so I have no idea what might be doing it.
> >> > 
> >> > The box isn't doing anything like routing network traffic/NAT, it's pure
> >> > IPv4 (IPv6 disabled in world and kernel, and my home network does
> >> > basically no IPv6) and sits idle most of the time fetching mail.  It
> >> > does use ZFS, but not for /, swap, /var, /tmp, or /usr.
> >> > 
> >> > vmstat -i doesn't particularly show anything awful.  All the cpuX:timer
> >> > entries tend to fluctuate in rate, usually 120-200 or so; I'd expect an
> >> > interrupt storm to be showing something in the 1000+ range.
> >> > 
> >> > The only thing I can think of is the fact that the SSD being used has no
> >> > 4K quirk entry in the kernel (and its ATA IDENTIFY responds with 512
> >> > logical, 512 physical, even though we know it's 4K).  The partitions are
> >> > all 1MB-aligned regardless.
> >> > 
> >> > This is all bare-metal, by the way -- no virtualisation involved.
> >> > 
> >> > I do have DTrace enabled/built on this box but I have absolutely no clue
> >> > how to go about profiling things.  For example maybe output of this sort
> >> > would be helpful (but I've no idea how to get it):
> >> > 
> >> > http://lists.freebsd.org/pipermail/freebsd-stable/2014-July/079276.html
> >> > 
> >> > I'm certain I didn't see this behaviour in 9.x so I'd be happy to try
> >> > and track it down if I had a little bit of hand-holding.
> >> > 
> >> > I've put all the things I can think of that might be relevant to "system
> >> > config/tuning bits" up here:
> >> > 
> >> > http://jdc.koitsu.org/freebsd/releng10_perf_issue/
> >> > 
> >> > I should note my kernel config is slightly inaccurate (I've removed some
> >> > stuff from the file in an attempt to rebuild, but building world prior to
> >> > kernel failed due to r268896 breaking world; anyone subscribed here
> >> > has already seen the Jenkins job of that ;-) ).
> >> > 
> >> > Thanks.
> >> > 
> >> > -- 
> >> > | Jeremy Chadwick                                   jdc@koitsu.org |
> >> > | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> >> > | Making life hard for others since 1977.             PGP 4BD6C0CB |
> >> > 
> >> > _______________________________________________
> >> > freebsd-stable@freebsd.org mailing list
> >> > http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> >> > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
> >> >
> > 
> > -- 
> > | Jeremy Chadwick                                   jdc@koitsu.org |
> > | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> > | Making life hard for others since 1977.             PGP 4BD6C0CB |
> > 
> > _______________________________________________
> > freebsd-stable@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
> >

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |



