Date:      Thu, 04 Apr 2002 16:58:05 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        John Regehr <regehr@cs.utah.edu>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: Linuxthreads on Linux vs FreeBSD performance question
Message-ID:  <3CACF69D.3A992FFD@mindspring.com>
References:  <Pine.LNX.4.21.0204041009440.27824-100000@famine.cs.utah.edu>

John Regehr wrote:
> No need to use me as an excuse to vent your feelings about
> microbenchmarks vs. good benchmarks.  I'm showing how to use a
> user-space instrumented application to measure scheduling behavior, not
> trying to make any claims about the relative merits of the operating
> systems in realistic conditions.  These can and should be separate
> activities.

I was only trying to point out why you were seeing the
behaviour; you would have gotten the same answer on your
own, eventually, from the context switch measurement
suggestions that you received in answer to your question.

My comments on the relative merits of the microbenchmarks
had more to do with pointing out that the Linux optimization
is a special case.  I included my idealized benchmarking
process because it demonstrates the problems I described
with the Linux approach, with regard to thread group
affinity.


> Maybe I should include a few comments to this effect in the paper, in
> order to forestall reactions like yours?  The last thing I want is to
> get into some sort of Linux vs. FreeBSD thing.  Maybe I can prevent some
> of this by telling people that I used to hack the Windows 2000
> scheduler! :)

Heh.

The problem is that it's not clear what the graphs you posted
are comparing.  In the context of the paper, this will probably
be mitigated somewhat.  However, there are a lot of people who
will turn directly to the graphs in any paper, and yell about
them, so I doubt you are safe, no matter what you do.

It would be useful, I think, to indicate what the benchmarks
*aren't* measuring, in the paper, in addition to what they
*are*, so that people don't make wrong application of them.
I'm not saying that they aren't figures of merit, only that
the scope of the merit is not well defined.

> If it would help draw the flames now, while I can still do something
> about it (paper is due around Apr 15), I'd be happy to post a pointer to
> my paper.

I think that would help, but if you are going to publish and
present, you probably want to limit distribution.  8-(.


> Technical comments follow.
> 
> > Because you are attempting a comparative benchmark, I would
> > suspect that you are probably running a significantly
> > quiescent system (just the benchmark itself, and the code
> > being benchmarked, running).  I expect that you have not
> > stopped "cron" (which runs once a second), nor have you
> > stopped other "system processes" which will end up in the
> > scheduling queue,
> 
> No, I haven't stopped these activities.  However, I'm only measuring the
> times for context switches between threads in my test application, so
> the things you mention are not throwing off the numbers.  How is this
> accomplished?  When other apps get to run during the test, this shows up
> as large gaps in the CPU time seen by my application, and these are
> thrown out as outliers -- they don't influence the statistics.  The test
> is actually quite robust in the sense that a fair amount of background
> activity doesn't throw off the numbers, but in this case more care has
> to be taken to throw out the outliers in a sensible way.
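
In rough terms, that filtering amounts to something like the
following sketch (the threshold is a made-up placeholder, and
the TSC read is i386-specific):

	#include <stdio.h>

	/* Read the Pentium timestamp counter. */
	static unsigned long long
	rdtsc(void)
	{
		unsigned long long tsc;

		__asm __volatile("rdtsc" : "=A" (tsc));
		return (tsc);
	}

	#define	SAMPLES		100000
	#define	THRESHOLD	50000ULL	/* cycles; placeholder */

	int
	main(void)
	{
		unsigned long long prev, now, delta, sum = 0;
		unsigned long i, kept = 0;

		prev = rdtsc();
		for (i = 0; i < SAMPLES; i++) {
			now = rdtsc();
			delta = now - prev;
			prev = now;
			if (delta < THRESHOLD) {
				/* plausible sample: keep it */
				sum += delta;
				kept++;
			}
			/* else: a large gap -- something else got the
			 * CPU; discard the sample as an outlier */
		}
		printf("mean delta: %llu cycles over %lu samples\n",
		    kept ? sum / kept : 0, kept);
		return (0);
	}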

What I mentioned is specific to the code in /sys/i386/i386/swtch.s.

The interesting statistics are the ones guarded by
SWTCH_OPTIM_STATS.

The particular counter that you need to be looking at is
_tlb_flush_count.
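
In C terms, the check those counters instrument amounts to
roughly the following (a paraphrase of the assembly in
cpu_switch(), not the literal source; the names approximate
the 4.x code, with rcr3() and load_cr3() coming from
<machine/cpufunc.h>):

	/* Paraphrase of the %cr3 handling in cpu_switch()
	 * (sys/i386/i386/swtch.s). */
	static void
	switch_address_space(struct pcb *pcb)
	{
		u_int newcr3 = pcb->pcb_cr3;	/* new page dir, physical */

		if (rcr3() == newcr3) {
#ifdef SWTCH_OPTIM_STATS
			swtch_optim_stats++;	/* same space: reload skipped */
#endif
		} else {
#ifdef SWTCH_OPTIM_STATS
			tlb_flush_count++;	/* %cr3 load flushes the TLB */
#endif
			load_cr3(newcr3);
		}
	}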

If you look at the Linux code, when they move from one thread
in a process to another thread in the same process (they are
really in the same process), you will see that they do not
reload CR3, and therefore don't engage in any TLB flushing.
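
Rendered as a rough C sketch of the 2.4 switch_mm() path
(heavily simplified, not the literal source):

	/* Sketch of the idea behind the i386 switch_mm()
	 * (Linux 2.4, include/asm-i386/mmu_context.h). */
	static void
	switch_mm_sketch(struct mm_struct *prev_mm,
	    struct mm_struct *next_mm)
	{
		if (prev_mm == next_mm) {
			/* Two threads share a single mm: %cr3 is
			 * left alone, so no TLB flush happens. */
		} else {
			/* New address space: the TLB is flushed. */
			load_cr3(next_mm->pgd);
		}
	}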

FreeBSD does TLB flushing in the idle loop (this is arguably a
stupid thing for it to do, since the last process to run may
be the next process rescheduled... indeed, your test is an
example of where this would be the case).
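
The flush happens because the idle loop itself reloads %cr3;
roughly (IdlePTD being the kernel page directory in the 4.x
source):

	load_cr3(IdlePTD);	/* drop user mappings; flushes the TLB */

which guarantees the same-%cr3 check above never matches when
a process is rescheduled out of idle.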

It also does an unnecessary reload of the address space when
moving from one VM to another, even though the two are the same,
the RFMEM flag having been set when they were created.
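
That sharing is what the linuxthreads port gets from rfork(2);
a minimal sketch using rfork_thread(3) (present in 4.3 and
later; error handling and a sensible stack size are omitted):

	#include <sys/types.h>
	#include <stdlib.h>
	#include <unistd.h>

	static int
	thread_main(void *arg)
	{
		/* Runs in a new process sharing the parent's VM
		 * space (same p_vmspace), courtesy of RFMEM. */
		return (0);
	}

	int
	main(void)
	{
		size_t stksz = 64 * 1024;
		char *stk = malloc(stksz);

		/* RFPROC: create a process; RFMEM: share the
		 * address space.  The child starts in
		 * thread_main() on the supplied stack. */
		pid_t pid = rfork_thread(RFPROC | RFMEM,
		    stk + stksz, thread_main, NULL);

		return (pid == -1);
	}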

Check out the "p_vmspace" references in i386/i386/pmap.c,
i386/i386/trap.c, and i386/i386/vm_machdep.c.

I don't know if you are using USER_LDT in your compiled FreeBSD
kernels, or if you have done any other tuning of the FreeBSD
kernel to make it perform worse (or better) than GENERIC as
shipped, but USER_LDT will seriously drop performance as well.
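
(That option, where present, is a single line in the kernel
config file:

	options		USER_LDT	# process-private LDTs

GENERIC, as shipped, does not include it.)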


> I'm not running on an SMP either.

This is actually better for Linux, since the patches for
per-CPU schedulers didn't go in until 2.5.2.


> > Right now, you are comparing apples and oranges.
> 
> Sure, if:
> 
> apples == expected time to execute the Linux context switch code when
> switching between two Linuxthreads, when the system load consists of 10
> CPU-bound threads and very little other activity
> 
> oranges == expected time to execute the FreeBSD context switch code when
> switching between two Linuxthreads, when the system load consists of 10
> CPU-bound threads and very little other activity

apples == newer version of Linux.

oranges == maintenance release of FreeBSD that was never supposed
	   to happen because 5.0 was supposed to be out by now.

You are testing specifically for something that was intentionally
left unoptimized.

> Thanks for the detailed answer,

No problem.  Good luck with the paper!  I always appreciate real
academic work; if you had been Joe Schmoe making the same
observation, I would have considered you a troll, and not even
bothered to answer.  8-).  As it is, I thought it was more
important to ensure that you were operating on correct
assumptions about the code and about what you were really
seeing, rather than on what people were posting about what
you might be seeing.

-- Terry
