Date: Wed, 5 Nov 1997 17:09:45 -0500 (EST)
From: "John W. DeBoskey" <jwd@unx.sas.com>
To: freebsd-current@freebsd.org
Cc: jwd@unx.sas.com (John W. DeBoskey)
Subject: fxp0 causes machine lockup
Message-ID: <199711052209.AA29928@iluvatar.unx.sas.com>
Hi,

I've had a problem (which I've ignored until now) with my fxp0
device causing my machine to lock up completely, requiring a power
reset to clear. If anyone can give me a clue as to where to start
looking, I'll be glad to try to run this down.

I have a vx0 device which works, but it does not speak correctly
with my network appliance fileserver. I have other machines with
fxp0 devices installed which work fine with the network appliance
fileserver, but the fxp0 cards don't work in all machines...

When I issue the command:

   ifconfig fxp0 10.26.1.237 netmask 0xffff0000

the machine locks up.

   ifconfig vx0 10.26.1.237 netmask 0xffff0000

works correctly.

I have tried with only the fxp0 card, with the fxp0 card and the
vx0 card, and with the fxp0 card in every available PCI slot, with
the same result.

   Hardware: Dell Optiplex 200MHz PPro (running 3.0-110397-SNAP or 0911)

The fxp0 card works like a champ in:

   Hardware: Dell Optiplex 180MHz Pentium (running 3.0-091197-SNAP)

I have tried the 0911 and 1103 snaps on the failing machine with
the same result (all tests done using the generic kernel). Note: a
Natoma SMP/UP fix was put in since the last time this worked.

Machine Mem     kernel   Snap  result
P6      64MB    Generic  0911  fail
P6      128MB   Generic  0911  fail
P6      64MB    Generic  1103  fail
P6      128MB   Generic  1103  fail

The oldest SNAP I have lying around is 3.0-970716-SNAP, which
exhibits the same problem.

In looking through the archives I found this message, which appears
to be similar, though with different hardware:

>From: "Mike Durian" <durian@plutotech.com>
>Date: Wed, 01 Oct 1997 12:45:27 -0600
>Subject: strange interaction with Pentium and fxp
>
> I've been chatting with David Greenman about this problem
>I'm seeing, but since we've determined it's not really a
>fxp driver bug, I'd like to get some input from a wider
>audience.
> When I boot single user and ifconfig fxp0 I get a PCI
>bus failure with a new -current kernel, but don't with
>an old kernel.
>The nature of the PCI bus failure is
>that the 430fx chipset never asserts TRDY# for the
>read mem multiple command issued by the EtherExpress as
>part of its very first DMA. Eventually the command times
>out, the PCI cards (including the EtherExpress) get confused
>by the invalid PCI command and start throwing interrupts
>that aren't normally checked for in the interrupt handler,
>thus locking up the system.
> So I need to figure out why TRDY# isn't getting asserted
>with the new kernel. I've got traces of both a working
>instance of this first mem read multiple from an old kernel
>and one that fails with the new kernel. The only difference
>I can detect is that the old kernel stores the mbuf at
>a physical address like 0x2bxx54 and the new one has the
>mbuf at 0x3f54 - a much lower memory address.
> I should also mention that this problem does not occur
>on a Pentium Pro system. I have not stuck my PCI bus
>analyzer on the P6 machine, so I'm not positive it uses
>the same addresses, but I'm assuming it would. This could
>very well be a 430FX chipset bug, but I still need a work
>around.
> I have not yet verified that this problem exists on a
>different Pentium system, so it is possible that it is
>specific to the motherboard.
> Does anyone have any ideas?
>
>mike

with this single followup from the original poster:

>From: "Mike Durian" <durian@plutotech.com>
>Date: Wed, 01 Oct 1997 17:06:05 -0600
>Subject: Re: strange interaction with Pentium and fxp
>
>On Wed, 01 Oct 1997 12:45:27 MDT, "Mike Durian" <durian@plutotech.com> wrote:
>>The only difference
>>I can detect is that the old kernel stores the mbuf at
>>a physical address like 0x2bxx54 and the new one has the
>>mbuf at 0x3f54 - a much lower memory address.
>> I should also mention that this problem does not occur
>>on a Pentium Pro system. I have not stuck my PCI bus
>>analyzer on the P6 machine, so I'm not positive it uses
>>the same addresses, but I'm assuming it would.
>>This could
>>very well be a 430FX chipset bug, but I still need a work
>>around.
>
> I'm got a better grasp on the problem now. I tried running
>the new kernel on another P6 system and when I experienced the
>same problem, I knew it wasn't a chipset bug. The only difference
>between the two P6's was the amount of memory. The one that
>worked had 64MB and the one that failed on 32MB. When I put
>64MB in the one that failed, it started working. Then I put
>64MB in the Pentium machine and it too started working. Here's
>what I know:
>
>Machine Mem     kernel  mbuf Phys Addr. result
>P6      64MB    new     NA              OK
>P6      32MB    new     NA              fail
>P5      32MB    new     0x00003f54      fail
>P5      32MB    old     0x002b9f54      OK
>P5      64MB    new     0x0009bf54      OK
>
>Apparently, there is a problem with the EtherExpress card
>DMAing data out of host memory at physical address 0x3f54
>using the memory read multiple PCI transaction.
> Does anyone know why 0x3f54 would be an unacceptable
>address, and does anyone have a fix?
>
>mike

My complete dmesg output follows:

Copyright (c) 1992-1997 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 3.0-971102-SNAP #0: Sun Nov 2 10:15:35 GMT 1997
    root@make.ican.net:/usr/src/sys/compile/GENERIC
CPU: Pentium Pro (199.43-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x617  Stepping=7
  Features=0xfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV>
real memory  = 67108864 (65536K bytes)
avail memory = 62435328 (60972K bytes)
Probing for devices on PCI bus 0:
Correcting Natoma config for non-SMP
chip0: <Intel 82440FX (Natoma) PCI and memory controller> rev 0x02 on pci0.0.0
chip1: <Intel 82371SB PCI to ISA bridge> rev 0x00 on pci0.13.0
ide_pci0: <Intel PIIX3 Bus-master IDE controller> rev 0x00 on pci0.13.1
chip2: <PCI to PCI bridge (vendor=1011 device=0021)> rev 0x00 on pci0.14.0
vga0: <VGA-compatible display device> rev 0x00 int a irq 9 on pci0.16.0
vx0: <3COM 3C905 Fast Etherlink XL PCI> rev 0x00 int a irq 15 on pci0.17.0
mii[*mii*] address 00:a0:24:bb:88:3e
Probing for devices on PCI bus 1:
fxp0: <Intel EtherExpress Pro 10/100B Ethernet> rev 0x04 int a irq 14 on pci1.9.0
fxp0: Ethernet address 00:a0:c9:8b:09:a5
ahc0: <Adaptec 2940 Ultra SCSI host adapter> rev 0x00 int a irq 11 on pci1.10.0
ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs
ahc0: waiting for scsi devices to settle
scbus0 at ahc0 bus 0
sd0 at scbus0 target 11 lun 0
sd0: <SEAGATE ST15150W 0023> type 0 fixed SCSI 2
sd0: Direct-Access 4095MB (8388315 512 byte sectors)
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
ed0 not found at 0x280
fe0 not found at 0x300
sio0 at 0x3f8-0x3ff irq 4 flags 0x10 on isa
sio0: type 16550A
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
lpt0 at 0x378-0x37f irq 7 on isa
lpt0: Interrupt-driven port
lp0: TCP/IP capable interface
lpt1 at 0x378-0x37f on isa
lpt1 not probed due to I/O address conflict with lpt0 at 0x378
mse0 not found at 0x23c
psm0 at 0x60-0x64 irq 12 on motherboard
psm0: device ID 0
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
wdc0 not found at 0x1f0
wdc1 not found at 0x170
bt0 not found at 0x330
uha0 not found at 0x330
aha0 not found at 0x330
aic0 not found at 0x340
nca0 not found at 0x1f88
nca1 not found at 0x350
sea0 not found at 0xffff
wt0 not found at 0x300
mcd0 not found at 0x300
matcdc0 not found at 0x230
scd0 not found at 0x230
ie0: unknown board_id: f000
ie0 not found at 0x300
ep0 not found at 0x300
ex0 not found
le0 not found at 0x300
lnc0 not found at 0x280
ze0 not found at 0x300
zp0 not found at 0x300
npx0 on motherboard
npx0: INT 16 interface
changing root device to sd0a
WARNING: / was not properly dismounted.

-- 
jwd@unx.sas.com       (w) John W. De Boskey       (919) 677-8000 x6915
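
For reference, the mbuf physical addresses in Mike's quoted table can be split into a page frame number and an offset within the page. The sketch below assumes the 4 KB page size used on i386; that is an assumption, not something stated in either message, and the helper name `split_phys_addr` is made up for illustration.

```python
PAGE_SHIFT = 12                     # assumed: 4 KB pages, as on i386
PAGE_MASK = (1 << PAGE_SHIFT) - 1   # low 12 bits select the byte within a page


def split_phys_addr(addr):
    """Return (page frame number, offset within page) for a physical address."""
    return addr >> PAGE_SHIFT, addr & PAGE_MASK


# mbuf physical addresses copied from the quoted table
cases = [
    ("P5 32MB new", 0x00003F54, "fail"),
    ("P5 32MB old", 0x002B9F54, "OK"),
    ("P5 64MB new", 0x0009BF54, "OK"),
]

for label, addr, result in cases:
    pfn, off = split_phys_addr(addr)
    print(f"{label}: addr=0x{addr:08x} page=0x{pfn:x} offset=0x{off:03x} {result}")
```

Under that assumption all three addresses share the in-page offset 0xf54 and differ only in page frame (0x3, 0x2b9 and 0x9b), so the failing case is the one in the lowest page. That matches the observation that only the low address fails, but it does not by itself explain why.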