Date: Tue, 9 Jul 1996 11:31:59 +1000
From: Bruce Evans
Message-Id: <199607090131.LAA28497@godzilla.zeta.org.au>
To: matt@lkg.dec.com
Subject: Re: Some interesting papers on BSD ...
Cc: freebsd-hackers@freebsd.org, tech-kern@netbsd.org
Sender: owner-hackers@freebsd.org

> http://www.eecs.harvard.edu/~chris/papers/cli.ps
>Draft of paper discussing different hardware synchronization schemes
>on the x86. It discusses how to avoid talking to the PIC/8259 which
>scheduling critical sections. While it probably won't save much on
>lightly loaded systems, I have to wonder how it would effect heavily
>loaded systems such wcarchive.cdrom.com...

FreeBSD has never used the PIC for spl*(). I fixed spl*() in
386BSD-0.0. I wouldn't have considered using the PIC in the first
place. Non-multitasking x86 systems normally use cli/sti to get
essentially only two interrupt priorities (the PIC provides some
priority stuff but it is not much use), and this works OK for
multitasking provided the critical sections are short.

Even cli/sti has become much slower than the memory-based spl's used
in FreeBSD. On Pentiums, pushfl/cli/popfl takes 14 cycles, while
`s = splhigh(); splx(s);' takes 4 cycles assuming no cache or branch
target buffer misses.

The effect of fixing the spl's was very large even on unloaded systems.
TCP to localhost sped up by 25% or so on a 486/33, because it involves
a surprisingly large number of spl's (40000/MB IIRC, but this seems
too large) and each spl pair used 26 PIC i/o instructions under
386BSD-0.0. The speedup would be a factor of about 10-20 on a fast
Pentium (from 1 or 2 MB/s to about 20 MB/s). Of course, a better
implementation using the PIC would only involve 2 or 4 i/o
instructions per spl pair.

spl is probably fundamentally wrong for SMP. I haven't thought much
about what to use instead.

FreeBSD still uses the PIC for masking in-service interrupts. This
could probably be avoided for edge-sensitive interrupts by depending
on the interrupt signal staying high until the interrupt is dismissed
so that another edge doesn't occur until the next i/o completion. At
worst, the PIC could be used only for noisy interrupts. I haven't
tried this because the benefits would be small. Hardware interrupts
are much rarer than spl's.

Bruce
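
For readers who have not seen a memory-based spl, the idea mentioned
above can be sketched roughly as follows. This is only an illustration
of the technique, not FreeBSD's code: the names cpl, ipending,
handlers, irq_arrived and the demo_* helpers are hypothetical, the
priority level is reduced to a plain bitmask of blocked IRQs, and a
real kernel would also have to make the check-and-update steps safe
against an interrupt arriving in the middle of them. The point it
shows is that raising and lowering the level touches only memory, and
a blocked interrupt is merely recorded and replayed later.

```c
#include <stdint.h>

/* All names below are hypothetical; this models the idea only. */
#define NIRQ 16

typedef void (*irq_handler_t)(void);

static volatile uint32_t cpl;      /* bit n set => IRQ n is blocked    */
static volatile uint32_t ipending; /* bit n set => IRQ n was deferred  */
static irq_handler_t handlers[NIRQ];

/* Demonstration handler: counts how many times it has run. */
static int demo_count;
static void demo_handler(void) { demo_count++; }

/* Block everything: one load and one store, no port i/o at all. */
static uint32_t splhigh(void)
{
    uint32_t old = cpl;
    cpl = ~(uint32_t)0;
    return old;
}

/* Restore a saved level, then replay whatever is no longer blocked. */
static void splx(uint32_t saved)
{
    uint32_t runnable;

    cpl = saved;
    while ((runnable = ipending & ~cpl) != 0) {
        for (int irq = 0; irq < NIRQ; irq++) {
            uint32_t bit = (uint32_t)1 << irq;
            if (runnable & bit) {
                ipending &= ~bit;          /* consume the pending bit */
                if (handlers[irq] != 0)
                    handlers[irq]();
            }
        }
    }
}

/* Stand-in for the low-level interrupt entry point. */
static void irq_arrived(int irq)
{
    uint32_t bit = (uint32_t)1 << irq;

    if (cpl & bit) {        /* blocked: just remember it in memory */
        ipending |= bit;
        return;
    }
    if (handlers[irq] != 0) /* not blocked: run the handler now */
        handlers[irq]();
}
```

With demo_handler installed for IRQ 3, calling irq_arrived(3) between
`s = splhigh();' and `splx(s);' only sets a bit in ipending; the
handler runs when splx() lowers the level again. Neither splhigh() nor
splx() executes an i/o instruction in this model, which is the
property the cycle counts above depend on.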