Date:      Fri, 25 Jun 1999 01:15:02 -0700 (PDT)
From:      Julian Elischer <julian@whistle.com>
To:        current@freebsd.org
Subject:   Re: Microsoft performance (was: ...) (fwd)
Message-ID:  <Pine.BSF.3.95.990625010814.2990A-100000@current1.whistle.com>

Some interesting comments from an interesting source.

BTW, I'd appreciate it if everyone noticed how friendly and helpful Linus
is, and didn't find something to rag on Linux about with this. The aim of
this thread is to improve FreeBSD in any way we see as being a real
improvement. Don't cc him back into the topic; I'm sure he has enough to
worry about. This is FYI only, OK?



---------- Forwarded message ----------
Date: Fri, 25 Jun 1999 00:51:10 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
To: Sean Eric Fagan <sef@kithrup.com>
Cc: Julian Elischer <julian@whistle.com>
Subject: Re: Microsoft performance (was: ...)



On Thu, 24 Jun 1999, Sean Eric Fagan wrote:
>
> Not sure if you've seen this stuff before; thought you might find it
> interesting if not.  (This is not intended to be anything other than, "NT
> performs surprisingly well" :).)

Indeed (Julian cc'd).

We had this taught to us the hard way with the mindcraft benchmark.

NT scales VERY well for netbench. It became pretty obvious pretty quickly
that it's one of their basic benchmarks, and it's what they have optimized
NT and IIS for. Fair enough - netbench may not be the best benchmark out
there, but there isn't anything better.

Julian, feel free to forward my comments to anybody, for whatever they are
worth.

> Ok well here are some real numbers for you..
> Win NT 4processors 1GB ram + raid array + IIS
> webbench... 4000 transactions per second...
> 
> FreeBSD.. Identical hardware..
> 1450 transactions per second
> Linux: 2000 per second
> Solaris86  6000 per second

The major problem with webbench is apache, although the OS _does_ matter.
The webbench overhead is largely in the socket timeouts, and you have to
get accept() right, scalability-wise. I suspect NT just doesn't do the
timewait stuff at all, but I could be wrong.

We ended up doing the accept() thing in Linux (waking up just one thread),
even though I personally think it doesn't have that big an impact in real
life. It was easy enough to do.
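
For anyone who hasn't run into the accept() problem: the classic
pre-forked server (which is what apache 1.x is) has every idle worker
blocked in accept() on the same listening socket. A naive kernel wakes
all of them for each incoming connection, and all but one immediately go
back to sleep - the "thundering herd". The fix Linus describes is on the
kernel side (wake exactly one sleeper); the user-space pattern that
triggers it looks roughly like this sketch (worker count and port are
made up):

    /* prefork.c - illustration of the accept() pattern, not the kernel fix */
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <string.h>
    #include <unistd.h>

    #define NWORKERS 16                     /* hypothetical */

    int main(void)
    {
        int i, lfd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in sin;

        memset(&sin, 0, sizeof sin);
        sin.sin_family = AF_INET;
        sin.sin_port = htons(8080);         /* hypothetical port */
        sin.sin_addr.s_addr = htonl(INADDR_ANY);
        bind(lfd, (struct sockaddr *)&sin, sizeof sin);
        listen(lfd, 128);

        for (i = 0; i < NWORKERS; i++) {
            if (fork() == 0) {
                for (;;) {
                    /* all workers sleep here on the same socket; an
                     * unoptimized kernel wakes every one of them per
                     * connection, an optimized one wakes exactly one */
                    int cfd = accept(lfd, NULL, NULL);
                    if (cfd < 0)
                        continue;
                    /* ... serve the request ... */
                    close(cfd);
                }
            }
        }
        for (;;)
            pause();                        /* parent just waits */
    }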

Once you get over that, the numbers seem to be purely about scaling the
fs and network, but I never got interested enough to look into it more.
The netbench load was more interesting in that area, and we ended up
having a benchmark that didn't have the apache overhead, so it was a
"purer" benchmark anyway. AND it was what NT did really well on, so..

The apache guys are definitely looking at a new threading model; you might
try to approach Dean Gaudet and ask him what he's up to. That, together
with the scalability stuff, should hopefully be enough. NT didn't do
_that_ well on this one.

And btw, if you're feeling bad about the numbers above: c't in Germany
did a similar kind of benchmark which was more UNIX-oriented, with perl
CGI scripts etc., and NT did really badly on it. Another case of NT being
optimized for one thing, and one thing only.

Oh, except that if you do the "MS CGI" (whatever they call their built-in
modules), NT actually does that pretty well - built into IIS. So it's not
even "dynamic content" - it's really about what _kind_ of CGI you do.

> With Netbench:
> NT blows us away.

Netbench is the big thing for NT. And the numbers you quote are the "bad"
NT numbers (ie bad enough that Linux actually was able to inch ahead of
NT). The _good_ NT numbers are three times as good, but they are done with
(a) NT running FATFS, (b) w98 clients exclusively.

> (we're talking an order of magnitude faster)
> I'm not going to give real numbers as I don't have them readily at hand
> but they are something like 12MB/sec for FreeBSD vs 90MB/sec for NT and
> 120MB/sec for Linux. Matt has some patches that raise the 12 to 35 and
> Kirk has some changes that may raise the numbers to 70 or more,
> and John has some patches that may add more again, but it's all theory,
> and some of the patches have had smaller results than we expected.

Quite frankly, to get close to NT, you have to scale really well to four
CPUs. We've had the core "deep SMP people" working on this for the last
month, and we _finally_ have a system that seems to scale pretty well:
well enough that we think we can beat the _good_ NT numbers too. Or at
least come so close as to not matter any more.

When I say scale really well, I mean it. You need to (a) make sure your
disk never does anything at all, and (b) make sure that you get almost
perfect scaling for cached reads and writes. That's what NT comes pretty
close to doing, it seems.

The new Linux code has a per-page lock on the page cache, and pretty much
nothing else. And I don't think you can beat NT with anything less.
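
To make "a per-page lock and pretty much nothing else" concrete, here is
a rough user-space analogy (this is not the actual Linux 2.3 code; the
names and the hash layout are made up). Each cached page carries its own
lock, so two CPUs working on different cached pages never touch a shared
lock; the only serialization left is a short-held lock on the hash chain
used to find the page:

    /* pagecache_sketch.c - user-space caricature of a per-page-locked cache */
    #include <pthread.h>
    #include <stdlib.h>
    #include <string.h>

    #define PAGE_SIZE 4096
    #define HASH_SIZE 1024                  /* hypothetical */

    struct page {
        unsigned long   index;              /* file offset / PAGE_SIZE */
        pthread_mutex_t lock;               /* the per-page lock */
        struct page    *next;               /* hash chain */
        char            data[PAGE_SIZE];
    };

    /* one short-held lock per hash bucket instead of one big kernel lock */
    static struct page     *buckets[HASH_SIZE];
    static pthread_mutex_t  bucket_lock[HASH_SIZE];

    void cache_init(void)
    {
        int i;
        for (i = 0; i < HASH_SIZE; i++)
            pthread_mutex_init(&bucket_lock[i], NULL);
    }

    static struct page *find_or_add_page(unsigned long index)
    {
        unsigned long h = index % HASH_SIZE;
        struct page *p;

        pthread_mutex_lock(&bucket_lock[h]);
        for (p = buckets[h]; p; p = p->next)
            if (p->index == index)
                break;
        if (!p) {
            p = calloc(1, sizeof *p);
            p->index = index;
            pthread_mutex_init(&p->lock, NULL);
            p->next = buckets[h];
            buckets[h] = p;
        }
        pthread_mutex_unlock(&bucket_lock[h]);
        return p;
    }

    /* a cached write only ever takes the lock of the page it touches,
     * which is why cached reads and writes can scale almost perfectly */
    void cache_write(unsigned long index, const char *buf, size_t len)
    {
        struct page *p = find_or_add_page(index);

        pthread_mutex_lock(&p->lock);
        memcpy(p->data, buf, len < PAGE_SIZE ? len : PAGE_SIZE);
        pthread_mutex_unlock(&p->lock);
    }

The only point of the sketch is the locking shape: contention is spread
across pages and buckets, so four CPUs hammering cached data rarely wait
on each other.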

> With Uniprocessor things are a lot more equal.
> but we still suck on netbench.

If you suck on netbench even on UP, the most likely reason is just that
you're doing too much IO. You can make a noticeable portion of the load go
away by trying to keep dirty state in memory, and making sure that
unlink() removes the dirty state without ever writing it out. 

That doesn't get rid of the problem (the netbench load is large enough
that it doesn't fit in 1GB - each part of it would fit, but if you
actually do the clients concurrently their combined working set is more
than you can cache).
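
The load pattern being described is roughly: a client creates a file,
writes it, maybe reads it back, and then deletes it. If the dirty pages
are still only in memory when the unlink() arrives, the kernel can simply
throw them away and the disk never sees any of it. A toy version of that
access pattern (file names and sizes are made up, not the actual netbench
file set):

    /* scratch.c - create/write/unlink; never needs to touch the disk
     * if the cache can hold the dirty data until the unlink happens */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char block[8192];
        int i, j;

        memset(block, 'x', sizeof block);

        for (i = 0; i < 1000; i++) {            /* hypothetical count */
            char name[64];
            int fd;

            snprintf(name, sizeof name, "scratch.%d", i);
            fd = open(name, O_CREAT | O_RDWR | O_TRUNC, 0644);
            for (j = 0; j < 64; j++)            /* ~512KB of dirty data */
                write(fd, block, sizeof block);
            close(fd);

            /* if the dirty pages were never flushed, this unlink lets
             * the kernel discard them instead of writing them out */
            unlink(name);
        }
        return 0;
    }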

The way to cheat on netbench _seems_ to be to actually serialize the
client accesses - doing a few clients at a time, and letting the clients
remove their own crud and avoiding the writeouts because the files got
unlinked. That's what it looked like from our analysis, anyway. That way
you can cache the whole working set because you do it piecemeal.

HOWEVER - NT doesn't actually seem to cheat that way, so we never pursued
that avenue either (I'd feel really bad if the only way I could beat
NT was through cheating). I just thought I'd bring it up as an example of
how bad a benchmark it really is.

> This is due to the exact form of netbench which is exactly nonoptimal for
> FreeBSD.

netbench really is a horrible benchmark. I don't know what the FreeBSD
tuning parameters are, but if you can avoid flushing dirty buffers that
will help. Try it (it will make interactive performance suck, but hey,
it's a benchmark, not real life).
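
I don't know the right knobs for the FreeBSD of that era either, but as a
hedged illustration of the kind of tuning meant here: later FreeBSD
versions expose the syncer's writeback delays as the kern.filedelay,
kern.dirdelay and kern.metadelay sysctls (assuming those exist on the
version being tested), and cranking them up postpones dirty-buffer
flushing for the length of a benchmark run:

    /* tune_syncer.c - benchmark-only trick: delay dirty-buffer writeback.
     * The sysctl names are an assumption about the FreeBSD version in use. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/sysctl.h>

    static void bump(const char *name, int seconds)
    {
        if (sysctlbyname(name, NULL, NULL, &seconds, sizeof seconds) != 0)
            perror(name);               /* knob may not exist here */
    }

    int main(void)
    {
        bump("kern.filedelay", 600);    /* hypothetical values, in seconds */
        bump("kern.dirdelay",  590);
        bump("kern.metadelay", 580);
        return 0;
    }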

> Also because of the GKL (Giant Kernel Lock) (see Solaris's results)

That's the big one. We're seeing a factor of 2.5 on a four-way. It's not a
fourfold increase, but it seems to become memory bus limited at that
point. 

I'll warn you: we were able to get rid of the big kernel lock around the
filesystem reasonably quickly, but that was mainly because we had all the
infrastructure in place (that's what Linux-2.2 was all about - in
Linux-2.2 the kernel lock is still heavily used, but it's not really a
fundamental requirement any more). Even so, it wasn't exactly painless. We
had to change the page cache quite extensively, even though I originally
tried to design the page cache exactly for something like this.

The networking layer also needs to scale. I refused to put as much effort
into that, because I don't think that four 100Mbps Ethernet cards are all
that realistic - it's much more realistic to have a single gigabit card
and a switch for this kind of server, so I feel that for the networking
most of the scalability is in the end going to be limited by the IO.

On the filesystem side, I decided that the cache really has to scale
perfectly.

> Basically there are some applications and benchmarks for which FreeBSD
> will really suck. We're working on them but some things are just a result
> of how we do things.
> 
> So don't assume that NT figures must be bad..
> we have too many weaknesses in our own code to throw stones. 

I was certainly surprised by how well NT does. NT does reasonably well in
another area too: writing back data through a shared file mapping was one
of the few benchmarks NT actually _won_ against both Solaris and Linux on
lmbench. We fixed it in the meantime, but the point is that there _are_
things that NT does well.
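
The lmbench case meant here is the one where you dirty file data through
a MAP_SHARED mapping and the kernel has to notice the dirty pages and
write them back. The access pattern is roughly the following (file name
and size are mine, not lmbench's actual harness):

    /* mmap_shared_write.c - dirty a file through a shared mapping */
    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        size_t len = 8 * 1024 * 1024;           /* hypothetical 8MB file */
        int fd = open("mapped.dat", O_CREAT | O_RDWR | O_TRUNC, 0644);
        char *p;

        if (fd < 0 || ftruncate(fd, len) < 0)
            return 1;
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            return 1;

        /* every store dirties a page the kernel must write back to the
         * file; this writeback path is what the benchmark exercises */
        memset(p, 0xab, len);

        msync(p, len, MS_SYNC);     /* force the writeback and wait */
        munmap(p, len);
        close(fd);
        return 0;
    }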

However, I still feel pretty confident that the "NT does this really
well" category is a matter of a few special things that they have worked
on a lot, rather than anything "good across the board due to a kick-ass
design" kind of goodness.

So I do feel pretty good, and I do believe that I'll be able to release a
new version of Linux by the end of the year that actually outperforms NT
on the things it seems to shine on. I'm waiting with interest to see what
benchmark the MS people are going to find next that they do better on.

They've obviously started looking for benchmarks that they do well
on..

> It'd be interesting to see how FreeBSD 1.1.5 would have performed on the
> same tests. Sometimes we've gained in general performance but lost in
> some specific cases.

I don't think you'll find that you'd do very differently on either
NetBench or WebBench, unless you've really been working on the parts that
they test. But I'd be happy to share what we found out when we were trying
to figure out exactly how NT was able to do so much better..

		Linus



