Date:      Tue, 12 Jul 2011 11:05:15 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Steve Kargl <sgk@troutmask.apl.washington.edu>
Cc:        freebsd-current@FreeBSD.org
Subject:   Re: Heavy I/O blocks FreeBSD box for several seconds
Message-ID:  <4E1C003B.4090604@FreeBSD.org>
In-Reply-To: <20110711161654.GA97361@troutmask.apl.washington.edu>
References:  <20110706170132.GA68775@troutmask.apl.washington.edu> <5080.1309971941@critter.freebsd.dk> <20110706180001.GA69157@troutmask.apl.washington.edu> <4E14A54A.4050106@freebsd.org> <4E155FF9.5090905@FreeBSD.org> <20110707151440.GA75537@troutmask.apl.washington.edu> <4E160C2F.8020001@FreeBSD.org> <20110707200845.GA77049@troutmask.apl.washington.edu> <ivf221$oo2$1@dough.gmane.org> <4E1B1198.6090308@FreeBSD.org> <20110711161654.GA97361@troutmask.apl.washington.edu>

on 11/07/2011 19:16 Steve Kargl said the following:
> On Mon, Jul 11, 2011 at 06:07:04PM +0300, Andriy Gapon wrote:
>> But it's not clear which of the processes are slaves and which is master.
>> It's also not clear why the master takes so much CPU (on par with the
>> slaves) - from my reading of its description (by Steve) it should be doing
>> only light periodic work.
> 
> These are all slave processes.  The master process was on a different
> node in the cluster.  Each process is doing the exact same computation
> with only a small change in a coordinate from (x,y,z) to (x,y+n*dy,z)
> with n = 1, 2, 3, 4.  The small change does not cause a different
> code path, so all should complete in nearly identical times.

OK, the situation is much clearer (to me) now.

>> If it does have to do CPU-heavy work, then I'd imagine that it should
>> spawn only Ncpus - 1 slaves.
> 
> And if you have M users on the system?  Also note, you can get the
> exact same loading problem by launching Ncpu+1 completely independent
> cpu-bound processes.  Ncpu-1 processes will be bound to specific cpus
> and 2 processes will ping-pong on one cpu.  This ping-ponging will
> simply kill performance.

I'd still argue that if someone cares about doing some calculations as fast as
possible then he shouldn't have more than Ncpu CPU-bound processes.  How to
achieve that is a technical/administrative issue.
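
To put rough numbers on the Ncpu+1 case discussed above, here is a minimal
sketch of an idealized model. It is not a measurement and not a description of
how any FreeBSD scheduler actually behaves. It assumes identical CPU-bound jobs,
perfect time slicing, and zero migration or cache cost; the function names and
the work unit are hypothetical.

```python
from fractions import Fraction


def completion_times_pinned(ncpu, work):
    """Ncpu+1 equal jobs: Ncpu-1 jobs each own a CPU, two jobs share the last one.

    Under ideal time slicing, the two sharing jobs each get half a CPU,
    so both finish after 2 * work time units.
    """
    alone = [Fraction(work)] * (ncpu - 1)
    shared = [Fraction(2 * work)] * 2
    return alone + shared


def completion_times_fair(ncpu, work):
    """Ncpu+1 equal jobs, each receiving an equal share of all Ncpu CPUs.

    Each job runs at ncpu / (ncpu + 1) of full speed (idealized, no
    migration overhead), so all jobs finish at the same time.
    """
    njobs = ncpu + 1
    return [Fraction(work * njobs, ncpu)] * njobs


if __name__ == "__main__":
    ncpu, work = 4, 100  # hypothetical: 4 CPUs, 100 time units of work per job
    print("pinned:", [str(t) for t in completion_times_pinned(ncpu, work)])
    print("fair:  ", [str(t) for t in completion_times_fair(ncpu, work)])
```

In this model the last job finishes at 2 * work in the pinned case, against
work * (Ncpu+1)/Ncpu when the load is spread evenly. With exactly Ncpu jobs,
every job finishes at work under either policy.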

But nevertheless I now see what the problem is.
I think that the best thing you can provide next (as objective evidence for
the problem at hand) is ktr(4) traces for at least the KTR_SCHED mask.  Perhaps
you already have them from your previous sessions with Jeff.

P.S. This is not a promise to actually debug this issue based on the traces :-)
-- 
Andriy Gapon


