Date: Thu, 30 Dec 2004 11:23:26 +0300
From: Peter Trifonov <pvtrifonov@mail.ru>
To: freebsd-smp@freebsd.org
Subject: Lost interrupts on SMP systems
Message-ID: <200412301123.26785.pvtrifonov@mail.ru>

Hello,

I have already posted a message about this problem to freebsd-stable and freebsd-hardware. After more detailed investigation I believe that the problem is related to the SMP implementation.

I have a dual-processor Pentium Pro box with 3 NICs: xl0 is at irq 10; fxp0 and fxp1 are at irq 11. If heavy traffic occurs on both fxp's, the system says

fxp0: device timeout
fxp1: device timeout

and they stop working. The fxp's can be revived by doing

ifconfig fxp{0,1} down;ifconfig fxp{0,1} up

Sometimes this causes the system to say

fxp0: SCB timeout: 0x70 0x0 0x50 0x0
fxp1: SCB timeout: 0x20 0x0 0x50 0x0

However, the network comes back to life.

After replacing both fxp's with 3Com NICs, the "fxp* device timeout" messages changed to "xl{1,2}: watchdog timeout". xl0 still works fine. Again, the network can be fixed by doing ifconfig down and up. Therefore, the problem is not likely to be related to the fxp or xl drivers.

Now I have the following configuration:

xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xf400-0xf47f mem 0xffbec000-0xffbec07f irq 10 at device 12.0 on pci0
xl1: <3Com 3c905B-TX Fast Etherlink XL> port 0xf000-0xf07f mem 0xffbdc000-0xffbdc07f irq 11 at device 13.0 on pci0
xl2: <3Com 3c905C-TX Fast Etherlink XL> port 0xec00-0xec7f mem 0xffbcc000-0xffbcc07f irq 11 at device 14.0 on pci0

Currently I have to run a cron script that checks the network every 5 minutes and does ifconfig down and up.

I believe that IRQ sharing on SMP causes some interrupts to be lost. Please let me know if any patches or workarounds are known.

--
With best regards,
P. Trifonov
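
For illustration, the periodic check-and-reset workaround described in this message could be sketched in Python roughly as follows. The message does not say how the original cron script detects a stalled interface; this sketch assumes it can be inferred from new "device timeout" / "watchdog timeout" kernel messages, and every function name here (count_timeouts, stalled_since, bounce_commands, check_once) is hypothetical. Only the ifconfig down/up sequence is taken from the message itself.

```python
import re
import subprocess

# Matches the kernel messages quoted in the report, e.g.
# "fxp0: device timeout" or "xl1: watchdog timeout".
# "SCB timeout" lines are deliberately not matched.
TIMEOUT_RE = re.compile(r"^(\w+\d+): (?:device|watchdog) timeout\s*$",
                        re.MULTILINE)


def count_timeouts(log_text):
    """Return {interface: number of timeout messages} found in log_text."""
    counts = {}
    for iface in TIMEOUT_RE.findall(log_text):
        counts[iface] = counts.get(iface, 0) + 1
    return counts


def stalled_since(previous, current):
    """Return the interfaces whose timeout count grew since the last check."""
    return sorted(i for i, n in current.items() if n > previous.get(i, 0))


def bounce_commands(iface):
    """The down/up sequence from the message, as argument lists."""
    return [["ifconfig", iface, "down"], ["ifconfig", iface, "up"]]


def check_once(previous, read_log, run=subprocess.check_call):
    """Reset every interface that logged a new timeout; return new counts.

    read_log is any callable returning the kernel message text (assumption:
    something like the output of dmesg); run executes one command.
    """
    current = count_timeouts(read_log())
    for iface in stalled_since(previous, current):
        for cmd in bounce_commands(iface):
            run(cmd)
    return current
```

Because cron starts a fresh process each time, the counts returned by check_once would have to be saved between runs (for example in a small state file), or the function could instead be called from a loop that sleeps 300 seconds between checks. The sketch also ignores the case where the kernel message buffer wraps and old messages disappear.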