Date:      Wed, 5 Nov 1997 17:09:45 -0500 (EST)
From:      "John W. DeBoskey" <jwd@unx.sas.com>
To:        freebsd-current@freebsd.org
Cc:        jwd@unx.sas.com (John W. DeBoskey)
Subject:   fxp0 causes machine lockup
Message-ID:  <199711052209.AA29928@iluvatar.unx.sas.com>

Hi,

   I've had a problem (which I've ignored until now) with my fxp0
device causing my machine to lock up completely, requiring a power
reset to clear. If anyone can give me a clue as to where to start
looking, I'll be glad to try to run this down.

   I have a vx0 device which works, but does not talk correctly to
my Network Appliance fileserver. I have other machines with fxp0
devices installed which work fine with the Network Appliance
fileserver, but the fxp0 cards do not work in every machine...

   When I issue the command:

ifconfig fxp0 10.26.1.237 netmask 0xffff0000

   the machine locks up.

ifconfig vx0 10.26.1.237 netmask 0xffff0000

   works correctly.
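
   For reference, the hex netmask used in both commands can be
checked with a short Python sketch. The helper name describe_netmask
is made up for illustration; it only confirms that 0xffff0000 is a
/16 mask and shows which network the address falls in. It says
nothing about the lockup itself.

```python
import ipaddress


def describe_netmask(addr, hex_mask):
    """Return (dotted-quad mask, prefix length, network) for addr and a hex mask.

    Hypothetical helper for illustration; not part of ifconfig.
    """
    # ifconfig accepts the mask as a hex number; convert it to an IPv4 value.
    mask = ipaddress.IPv4Address(int(hex_mask, 16))
    # Combine address and mask so the network can be derived.
    iface = ipaddress.ip_interface(f"{addr}/{mask}")
    return str(mask), iface.network.prefixlen, str(iface.network)


if __name__ == "__main__":
    # Values taken from the ifconfig commands above.
    print(describe_netmask("10.26.1.237", "0xffff0000"))
```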


  I have tried with only the fxp0 card, with the fxp0 card and the vx0
card, and with the fxp0 card in every available pci slot with the same
result.

  Hardware: Dell Optiplex 200MHz PPro (running 3.0-110397-SNAP or 0911)

the fxp0 card works like a champ in:

  Hardware: Dell Optiplex 180MHz Pentium (running 3.0-091197-SNAP)


  I have tried the 0911 and 1103 snaps on the failing machine with the
same result (all tests done using the generic kernel). Note: a Natoma
SMP/UP fix was put in since the last time this worked.

Machine Mem     kernel     Snap         result
P6      64MB    Generic    0911         fail
P6     128MB    Generic    0911         fail
P6      64MB    Generic    1103         fail
P6     128MB    Generic    1103         fail


  The oldest SNAP I have lying around is 3.0-970716-SNAP, which
exhibits the same problem.

 
  In looking through the archives I found this message which appears
to be similar, though with different hardware:

>From: "Mike Durian" <durian@plutotech.com>
>Date: Wed, 01 Oct 1997 12:45:27 -0600
>Subject: strange interaction with Pentium and fxp
>
>  I've been chatting with David Greenman about this problem
>I'm seeing, but since we've determined it's not really a
>fxp driver bug, I'd like to get some input from a wider
>audience.
>  When I boot single user and ifconfig fxp0 I get a PCI
>bus failure with a new -current kernel, but don't with
>an old kernel.  The nature of the PCI bus failure is
>that the 430fx chipset never asserts TRDY# for the
>read mem multiple command issued by the EtherExpress as
>part of its very first DMA.  Eventually the command times
>out, the PCI cards (including the EtherExpress) get confused
>by the invalid PCI command and start throwing interrupts
>that aren't normally checked for in the interrupt handler,
>thus locking up the system.
>  So I need to figure out why TRDY# isn't getting asserted
>with the new kernel.  I've got traces of both a working
>instance of this first mem read multiple from an old kernel
>and one that fails with the new kernel.  The only difference
>I can detect is that the old kernel stores the mbuf at
>a physical address like 0x2bxx54 and the new one has the
>mbuf at 0x3f54 - a much lower memory address.
>  I should also mention that this problem does not occur
>on a Pentium Pro system.  I have not stuck my PCI bus
>analyzer on the P6 machine, so I'm not positive it uses
>the same addresses, but I'm assuming it would.  This could
>very well be a 430FX chipset bug, but I still need a work
>around.
>  I have not yet verified that this problem exists on a
>different Pentium system, so it is possible that it is
>specific to the motherboard.
>  Does anyone have any ideas?
>
>mike

with this single followup from the original poster:

>From: "Mike Durian" <durian@plutotech.com>
>Date: Wed, 01 Oct 1997 17:06:05 -0600
>Subject: Re: strange interaction with Pentium and fxp
>
>On Wed, 01 Oct 1997 12:45:27 MDT, "Mike Durian" <durian@plutotech.com> wrote:
>>The only difference
>>I can detect is that the old kernel stores the mbuf at
>>a physical address like 0x2bxx54 and the new one has the
>>mbuf at 0x3f54 - a much lower memory address.
>>  I should also mention that this problem does not occur
>>on a Pentium Pro system.  I have not stuck my PCI bus
>>analyzer on the P6 machine, so I'm not positive it uses
>>the same addresses, but I'm assuming it would.  This could
>>very well be a 430FX chipset bug, but I still need a work
>>around.
>
>  I'm got a better grasp on the problem now.  I tried running
>the new kernel on another P6 system and when I experienced the
>same problem, I knew it wasn't a chipset bug.  The only difference
>between the two P6's was the amount of memory.  The one that
>worked had 64MB and the one that failed on 32MB.  When I put
>64MB in the one that failed, it started working.  Then I put
>64MB in the Pentium machine and it too started working.  Here's
>what I know:
>
>Machine Mem     kernel  mbuf Phys Addr. result
>P6      64MB    new     NA              OK
>P6      32MB    new     NA              fail
>P5      32MB    new     0x00003f54      fail
>P5      32MB    old     0x002b9f54      OK
>P5      64MB    new     0x0009bf54      OK
>
>Apparently, there is a problem with the EtherExpress card
>DMAing data out of host memory at physical address 0x3f54
>using the memory read multiple PCI transaction.
>  Does anyone know why 0x3f54 would be an unacceptable
>address, and does anyone have a fix?
>
>mike
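
  To make the mbuf addresses in Mike's table easier to compare, here
is a small Python sketch that splits each physical address into a
page number and an offset within the page. It assumes 4 KB pages
(the usual i386 page size); the function name split_phys_addr is
hypothetical. It only restates the numbers from the table and does
not explain why the DMA fails.

```python
PAGE_SIZE = 4096  # assumption: 4 KB pages, as on i386


def split_phys_addr(addr):
    """Split a physical address into (page number, offset within page)."""
    return addr // PAGE_SIZE, addr % PAGE_SIZE


# Addresses and results copied from the quoted table.
MBUF_ADDRS = [
    ("P5 32MB new", 0x00003f54, "fail"),
    ("P5 32MB old", 0x002b9f54, "OK"),
    ("P5 64MB new", 0x0009bf54, "OK"),
]

if __name__ == "__main__":
    for label, addr, result in MBUF_ADDRS:
        page, offset = split_phys_addr(addr)
        print(f"{label}: page 0x{page:x}, offset 0x{offset:x}, {result}")
```

  Under that assumption all three addresses have the same offset
(0xf54) and differ only in the page: the failing case sits in
page 3, i.e. within the first 16 KB of physical memory, while the
working cases are in pages 0x2b9 and 0x9b.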



My complete dmesg output follows:


Copyright (c) 1992-1997 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
	The Regents of the University of California. All rights reserved.
FreeBSD 3.0-971102-SNAP #0: Sun Nov  2 10:15:35 GMT 1997
    root@make.ican.net:/usr/src/sys/compile/GENERIC
CPU: Pentium Pro (199.43-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x617  Stepping=7
  Features=0xfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV>
real memory  = 67108864 (65536K bytes)
avail memory = 62435328 (60972K bytes)
Probing for devices on PCI bus 0:
Correcting Natoma config for non-SMP
chip0: <Intel 82440FX (Natoma) PCI and memory controller> rev 0x02 on pci0.0.0
chip1: <Intel 82371SB PCI to ISA bridge> rev 0x00 on pci0.13.0
ide_pci0: <Intel PIIX3 Bus-master IDE controller> rev 0x00 on pci0.13.1
chip2: <PCI to PCI bridge (vendor=1011 device=0021)> rev 0x00 on pci0.14.0
vga0: <VGA-compatible display device> rev 0x00 int a irq 9 on pci0.16.0
vx0: <3COM 3C905 Fast Etherlink XL PCI> rev 0x00 int a irq 15 on pci0.17.0
mii[*mii*] address 00:a0:24:bb:88:3e
Probing for devices on PCI bus 1:
fxp0: <Intel EtherExpress Pro 10/100B Ethernet> rev 0x04 int a irq 14 on pci1.9.0
fxp0: Ethernet address 00:a0:c9:8b:09:a5
ahc0: <Adaptec 2940 Ultra SCSI host adapter> rev 0x00 int a irq 11 on pci1.10.0
ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs
ahc0: waiting for scsi devices to settle
scbus0 at ahc0 bus 0
sd0 at scbus0 target 11 lun 0
sd0: <SEAGATE ST15150W 0023> type 0 fixed SCSI 2
sd0: Direct-Access 4095MB (8388315 512 byte sectors)
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
ed0 not found at 0x280
fe0 not found at 0x300
sio0 at 0x3f8-0x3ff irq 4 flags 0x10 on isa
sio0: type 16550A
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
lpt0 at 0x378-0x37f irq 7 on isa
lpt0: Interrupt-driven port
lp0: TCP/IP capable interface
lpt1 at 0x378-0x37f on isa
lpt1 not probed due to I/O address conflict with lpt0 at 0x378
mse0 not found at 0x23c
psm0 at 0x60-0x64 irq 12 on motherboard
psm0: device ID 0
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
wdc0 not found at 0x1f0
wdc1 not found at 0x170
bt0 not found at 0x330
uha0 not found at 0x330
aha0 not found at 0x330
aic0 not found at 0x340
nca0 not found at 0x1f88
nca1 not found at 0x350
sea0 not found at 0xffff
wt0 not found at 0x300
mcd0 not found at 0x300
matcdc0 not found at 0x230
scd0 not found at 0x230
ie0: unknown board_id: f000
ie0 not found at 0x300
ep0 not found at 0x300
ex0 not found
le0 not found at 0x300
lnc0 not found at 0x280
ze0 not found at 0x300
zp0 not found at 0x300
npx0 on motherboard
npx0: INT 16 interface
changing root device to sd0a
WARNING: / was not properly dismounted.



-- 
jwd@unx.sas.com       (w) John W. De Boskey          (919) 677-8000 x6915


