Date:      Mon, 17 Jan 2000 11:12:08 -0500 (EST)
From:      Bill Paul <wpaul@skynet.ctr.columbia.edu>
To:        ap@bnc.net (Achim Patzner)
Cc:        current@freebsd.org
Subject:   Re: Current, XEON and MP performance
Message-ID:  <200001171612.LAA04447@skynet.ctr.columbia.edu>
In-Reply-To: <20000117122223.C9402@bnc.net> from "Achim Patzner" at Jan 17, 2000 12:22:23 pm

Of all the gin joints in all the towns in all the world, Achim Patzner 
had to walk into mine and say:

> > Can you show us the actual results from your testing (and hopefully your
> > testing methods as well) that led you to this conclusion? Details matter.
> 
> First test was running an apache and fcgi noticing that it took considerably
> longer to respond than the old server. Next test was a number of homegrown
> programs under time(1).
> 
> > Are these programs I/O bound, CPU bound, or a little of both?
> 
> All of it; one is page-view accounting for the current web server, others
> are meteorological simulations.
> 
> > FreeBSD's SMP support still depends largely on the big giant lock approach
> > which means that while you can indeed get processes running on multiple
> > CPUs at the same time, you end up using only one CPU once you enter the
> > kernel.
> 
> Which would account for needing more wall-clock-time. But heaps of CPU time?
> 
> > And you have to enter the kernel in order to perform any disk, network
> > or even console I/O. If your programs suck large datasets into memory,
> > do lots of number crunching on them, then spit the results back out to
> > a disk file, then they should benefit from more CPUs. However if they
> > read and write data a lot while running, you're going to be limited by
> > the big giant lock.
> 
> Hm. Both are MP systems. Both were - at the time of test - virtually idle
> besides running the tests. I could even retry in single user mode.

I'm talking about your test programs themselves, not the rest of the
system. It doesn't matter if the system is otherwise idle: if *your*
programs perform a lot of I/O, then the big giant lock will hurt you.
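
As a rough illustration of why the I/O share matters, the effect can be
modelled with Amdahl's law. This is a simplification brought in for
illustration, not something measured on the machines in this thread: it
assumes all time spent in the kernel is fully serialized by the giant
lock and all user-mode time scales perfectly across CPUs. The function
name and the `kernel_fraction` parameter are hypothetical.

```c
#include <assert.h>

/*
 * Upper bound on speedup for a workload that spends `kernel_fraction`
 * of its run time in the kernel (serialized by one global lock) and the
 * rest in user mode (assumed to scale perfectly over `ncpu` CPUs).
 * This is plain Amdahl's law; it ignores lock contention overhead,
 * cache effects, and time spent spinning while waiting for the lock.
 */
double giant_lock_speedup(double kernel_fraction, int ncpu)
{
    assert(kernel_fraction >= 0.0 && kernel_fraction <= 1.0);
    assert(ncpu >= 1);

    /* The serialized part stays as-is; only the user part is divided. */
    return 1.0 / (kernel_fraction + (1.0 - kernel_fraction) / (double)ncpu);
}
```

Under this model a purely CPU-bound job on 4 CPUs can approach a 4x
speedup, while a job that spends half its time in the kernel is capped
at about 1.6x, no matter how idle the rest of the system is.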
 
> > There may also be scalability issues (i.e. does FreeBSD perform better
> > as you add more CPUs or does it spend so much time trying to stay out
> > of its own way that it actually performs worse) however I don't know
> > enough to say if you could be running into such problems as the only
> > SMP machines I have access to have only 2 CPUs.
> 
> As far as I understood HP Germany, they'd be glad to lend you one of them
> for a few months. You'd just have to ask.

I'm not an SMP expert, though perhaps those who are SMP experts might
want to take advantage of this.
 
> > > [Worse: The LH4
> > > behaves like a spoilt brat when it comes to hardware, disliking the Intel
> > > EtherExpress that came with it (generating bus mastering problems after
> > > bringing it up),
> >
> > Which model Intel EtherExpress? What chipset?
> 
> I'll have to put it into another machine - it's sitting on a shelf right
> now. But it was working reliably AND fast on another ASUS P2B-DS machine.
> 
> > What bus mastering problems exactly?
> 
> Time outs, followed by a panic.

This is still a bit vague. It would be nice to know exactly what
the timeout error messages said, as well as what the panic message
said.
 
> > > having interrupt routing problems with two DEC TULIP based ethernet
> > > cards sharing the same IRQ
> >
> > Which tulip cards?
> 
> CNet CN100 with an 21443

21143, not 21443.
 
> > What driver?
> 
> if_de

Try if_dc instead.
 
> > What kind of problems?
> 
> Constant time outs under higher load than a few packets a second; no more
> than 18 KB/second on a switch, no matter whether full duplex or not.

This is probably due to a bug in the if_de driver. I finally got a proper
handle on it the other day. The 21143 is different from the 21140 in a
couple of ways, the most important being that it has built-in NWAY
autonegotiation support. If you want to make a 10/100 autonegotiating
NIC with a 21140 chip, you need to use an external MII-compliant
transceiver chip with NWAY support. You can design a board with the 21143
like this as well, however you can *also* do it using the 21143's own
built-in NWAY support. The DEC DE500-BA card works like this. The
problem is that in adding the built-in autoneg support, DEC/Intel did
a really dumb thing. In the CSR6 register, there's a bit that controls
full duplex/half duplex operation: turn it on, the chip runs in full
duplex mode, turn it off and it runs in half duplex. On the 21140, this
is the *only* function that this bit has. But in the 21143, this bit
has two functions: if the internal autoneg is shut off, then it behaves
like it does on the 21140, but if autoneg is turned *on*, then it controls
whether or not the chip advertises 10Mbps half duplex when performing
autonegotiation.

Why is this dumb? For the following reasons:

- There are plenty of unused bits in other registers that DEC/Intel
  could have used instead of overloading the meaning of the full
  duplex enable bit in CSR6.
- The chip defaults to autoneg turned on after a reset. It also
  defaults to half duplex. So if you want to manually enable full
  duplex, you *must* first turn off the autoneg (by clearing CSR14).

The de driver doesn't know about this, and the result is that full
duplex mode just won't work. If your board uses an external transceiver,
the internal NWAY support should be turned off, but it isn't. So
selecting full duplex mode manually or autonegotiating full duplex
with a link partner doesn't work, and you stay in half duplex mode
no matter what you do.
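
The ordering problem described above can be sketched with a toy model
of the two registers. This is not code from the de or dc drivers: the
struct, the function names, and the bit positions are assumptions made
for illustration, and real values should be taken from the 21143
datasheet. The model only captures the rule stated in this message:
while internal autoneg is enabled, the CSR6 duplex bit does not force
full duplex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative masks; the bit positions are assumptions, not datasheet values. */
#define CSR6_FULL_DUPLEX  0x00000200u  /* CSR6: full duplex enable (overloaded) */
#define CSR14_AUTONEG_EN  0x00000080u  /* CSR14: internal NWAY autoneg enable */

/* Toy register file standing in for the chip; not a real driver structure. */
struct toy_21143 {
    uint32_t csr6;
    uint32_t csr14;
};

/* Reset state as described above: autoneg on, duplex bit clear. */
void toy_reset(struct toy_21143 *c)
{
    assert(c != NULL);
    c->csr6 = 0;
    c->csr14 = CSR14_AUTONEG_EN;
}

/*
 * Returns 1 if the CSR6 bit is currently forcing full duplex.
 * With internal autoneg enabled, the bit only changes what the chip
 * advertises, so nothing is forced.
 */
int toy_forced_full_duplex(const struct toy_21143 *c)
{
    assert(c != NULL);
    if (c->csr14 & CSR14_AUTONEG_EN)
        return 0;
    return (c->csr6 & CSR6_FULL_DUPLEX) != 0;
}

/* 21140-style approach: set the duplex bit and nothing else. */
void toy_set_fdx_naive(struct toy_21143 *c)
{
    assert(c != NULL);
    c->csr6 |= CSR6_FULL_DUPLEX;
}

/* 21143-aware approach: turn autoneg off first, then set the duplex bit. */
void toy_set_fdx_careful(struct toy_21143 *c)
{
    assert(c != NULL);
    c->csr14 = 0;  /* "clearing CSR14"; a real driver may clear only the enable bit */
    c->csr6 |= CSR6_FULL_DUPLEX;
}
```

In this model, calling toy_set_fdx_naive() right after toy_reset()
leaves toy_forced_full_duplex() returning 0, which matches the "stuck
in half duplex" behaviour, while toy_set_fdx_careful() makes it
return 1.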

The if_dc driver handles this correctly. You could tweak the de driver
to handle it too, however looking at the code, I don't think it handles
non-MII cards properly. The dc driver should handle those correctly
(at the very least, you can always manually override the mode with
ifconfig even if the autoneg gets it wrong).
 
> > I find it unusual
> > that two PCI devices would wind up with the same IRQ with the APIC enabled
> > since it's supposed to give you a lot more IRQs than in UP mode.
> 
> *ROTFL* (sorry). HP's algorithm for allocating IRQs is giving "same devices"
> the same IRQs if they are running out of IRQs.
> 
> Just take a look at this:

Uhm...

[...]

> APIC_IO: Testing 8254 interrupt delivery
> APIC_IO: Broken MP table detected: 8254 is not connected to IO APIC int pin 2
> APIC_IO: routing 8254 via 8259 on pin 0
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #3 Launched!

I'm a little worried about this. You might want to forward all this
information to the freebsd-smp list: I'm a little out of my depth.

> >
> > > and being picky just which 3C906B-TX it
> > > gets plugged in.
> >
> > There is no such card as a 3c906B. There's a 3c905B, and there's a
> > 3c905C. Unfortunately, 3Com did go through several different ASIC
> > revisions with the 3c905B series, some of which work better than others,
> > but again, I see no details here.
> 
> 2nd time. Sorry. I should really be more careful:
> 
> xl0@pci1:3:0:   class=0x020000 card=0x905510b7 chip=0x905510b7 rev=0x64 hdr=0x00
> xl1@pci2:2:0:   class=0x020000 card=0x905510b7 chip=0x905510b7 rev=0x24 hdr=0x00
> 
> The one not working stated rev=0x00. And it is working perfectly well in
> another machine.

Again, there is a distinct lack of details. You can't just say it
doesn't work. You have to describe the failure.

> 
> Ok. Tell me what info to gather. Any preferred benchmarks?

Again, you didn't show us the actual results from your original tests.
You just said "it's not as fast." We don't want your interpretation of
the data: we want the data itself. Again, I would ask the freebsd-smp
list about this. Also, I would try installing a more current version
of -current just to be sure you're on the same page as they are.

-Bill

-- 
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message



