Date:      Mon, 20 Aug 2012 12:59:58 +0300
From:      Alexander Motin <mav@FreeBSD.org>
To:        Doug Barton <dougb@FreeBSD.org>
Cc:        Adrian Chadd <adrian@freebsd.org>, lev@freebsd.org, current@freebsd.org
Subject:   Re: CURRENT as gateway on not-so-fast hardware: where is a bottlneck?
Message-ID:  <50320A9E.5070303@FreeBSD.org>
In-Reply-To: <5031F636.1020405@FreeBSD.org>
References:  <157941699.20120815004542@serebryakov.spb.ru> <CAJ-Vmon86-FPs4%2BXXkQXAow1jW465pMM2Sj7ZHi_0_E9VYSFSA@mail.gmail.com> <502AE8B5.9090106@FreeBSD.org> <502B775D.7000101@FreeBSD.org> <5031F636.1020405@FreeBSD.org>

On 20.08.2012 11:32, Doug Barton wrote:
> On 08/15/2012 03:18, Alexander Motin wrote:
>> On 15.08.2012 03:09, Doug Barton wrote:
>>> On 08/14/2012 12:20 PM, Adrian Chadd wrote:
>>>> Would you be willing to compile a kernel with KTR so you can capture
>>>> some KTR scheduler dumps?
>>>>
>>>> That way the scheduler peeps can feed this into schedgraph.py (and you
>>>> can too!) to figure out what's going on.
>>>>
>>>> Maybe things aren't being scheduled correctly and the added latency is
>>>> killing performance?
>>>
>>> You might also try switching to SCHED_ULE to see if it helps.
>>>
>>> Although, in the last few months as mav has been converging the 2 I've
>>> started to see the same problems I saw on my desktop systems previously
>>> re-appear even using ULE. For example, if I'm watching an AVI with VLC
>>> and start doing anything that generates a lot of interrupts (like moving
>>> large quantities of data from one disk to another) the video and sound
>>> start to skip. Also, various other desktop features (like menus, window
>>> switching, etc.) start to take measurable time to happen, sometimes
>>> seconds.
>>>
>>> ... and lest you think this is just a desktop problem, I've seen the
>>> same scenario on 8.x systems used as web servers. With ULE they were
>>> frequently getting into peak load situations that created what I called
>>> "mini thundering herd" problems where they could never quite get caught
>>> up. Whereas switching to 4BSD the same servers got into high-load
>>> situations less often, and they recovered on their own in minutes.
>>
>> It is quite pointless to speculate without real info like mentioned
>> above KTR_SCHED traces.
>
> I'm sorry, you're quite wrong about that. In the cases I mentioned, and
> in about 2 out of 3 of the cases where users reported problems and I
> suggested that they try 4BSD, the results were clear. This obviously
> points out that there is a serious problem with ULE, and if I were the
> one who was responsible for that code I would be looking at ways of
> helping users figure out where the problems are. But that's just me.

I am not saying anything bad about 4BSD. The choice is provided because
they are indeed different and neither is perfect. 4BSD also has problems.
What I would like to say is that if we want to improve the situation, we
need more detailed info than just a verbal description. I am not saying
that ULE is perfect. I went there because I've seen problems, and I am
still fixing some pieces. I am just trying to explain the described
behavior from what I know about it, hoping that it may help somebody to
set up some new experiments or try some tuning/fixing.

-- 
Alexander Motin


