Date:      Sat, 13 Oct 2007 03:27:06 +0200
From:      Alson van der Meulen <alson+ml@alm.flutnet.org>
To:        freebsd-stable@freebsd.org
Subject:   Re: Unable to boot recent -stable with MSI/MSIX enabled
Message-ID:  <20071013012706.GA2546@waalsdorp.nl>
In-Reply-To: <2a41acea0710121630i7c1f2e9dk6e55832c69e864a7@mail.gmail.com>
References:  <20071012175042.GA1750@waalsdorp.nl> <2a41acea0710121630i7c1f2e9dk6e55832c69e864a7@mail.gmail.com>

* Jack Vogel <jfvogel@gmail.com> [2007-10-13 01:30]:
> Hmmm, so am I correct in understanding that this root is remote, so it's
> really coming in over the em driver?

No, the root is local: a gmirror of two SATA disks on an ATA (AHCI)
controller. This host has no remote filesystems, and em is not needed for
mounting the root fs. I'm not 100% sure that em is to blame, but:
- The em merge is the only remotely related commit to RELENG_6 that I
  could find between October 1 and October 10.
- Disabling MSI/MSIX fixes it, and em is the only MSI user as far as I
  can see in the dmesg.
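
One way to double-check which devices report MSI use is to scan the boot
messages for lines that mention MSI. The following is only a minimal Python
sketch, not what was actually run here. It assumes that dmesg lines start
with a "name<unit>:" prefix and that a driver mentions "MSI" in its attach
messages, which not every driver does. The function name msi_users is made
up for illustration.

```python
import re

# Assumed line shape: "<driver><unit>: <message>", i.e. a device name ending
# in a unit number, followed by a colon. Drivers that use MSI without
# logging it will be missed.
_DEVICE_LINE = re.compile(r"^([a-z][a-z0-9_]*\d+): (.*)$")

# Matches "MSI", "MSIX" and "MSI-X" as whole words.
_MSI_WORD = re.compile(r"\bMSI(-?X)?\b")


def msi_users(dmesg_text):
    """Return the sorted device names whose dmesg lines mention MSI/MSI-X."""
    devices = set()
    for line in dmesg_text.splitlines():
        match = _DEVICE_LINE.match(line.strip())
        if match and _MSI_WORD.search(match.group(2)):
            devices.add(match.group(1))
    return sorted(devices)
```

On the affected machine the function could be fed the text of
/var/run/dmesg.boot or the output of dmesg(8).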

It's possible that the use of MSI by em triggers a bug in the PCI/ATA
driver. It's even possible that the chipset has broken MSI support (see
previous mail for dmesgs).

Friday morning (local time, CEST), it did boot up with the new kernel
and mounted its root FS successfully, but when I attempted to log in a
few hours later, none of the network interfaces (em and fxp) worked. fxp
is not even on a PCIe link, but is a plain PCI card, so the problem appears
to affect any PCI/PCIe device. Logging in via the console gave this error:
getty[1709]: /usr/bin/login: Exec format error
That is probably because it could no longer properly access /usr (which is
on the ATA disks).

The system appears to have worked initially, but started to fail when my
workstation, which is directly connected to the em interface, was turned
on. I also saw a watchdog timeout on the em interface about ten minutes
after the link went up. After my workstation was turned on, this box lost
all network connections. Unplugging the cable from the em interface might
prevent the problem from occurring, which would also point to the em driver
as the trigger. I'll try to verify this.

Below is a list of files in /usr/src/sys changed since the last working
kernel of 2007-10-01. I don't see any PCI changes relevant to amd64, so
it appears to be at least triggered by the em driver.

regards,
Alson

./alpha/isa/isa.c
./alpha/pci/apecs_pci.c
./alpha/pci/lca_pci.c
./alpha/pci/pcibus.c
./amd64/acpica/madt.c
./amd64/amd64/local_apic.c
./amd64/amd64/mp_machdep.c
./amd64/amd64/mptable.c
./amd64/amd64/nexus.c
./amd64/conf/NOTES
./amd64/include/apicvar.h
./arm/arm/nexus.c
./arm/xscale/i80321/i80321_pci.c
./arm/xscale/i80321/obio.c
./compat/ia32/ia32_sysvec.c
./conf/files
./conf/files.amd64
./conf/files.i386
./conf/kern.pre.mk
./dev/em/LICENSE
./dev/em/if_em.c
./dev/em/if_em.h
./dev/em/e1000_80003es2lan.c
./dev/em/e1000_80003es2lan.h
./dev/em/e1000_82540.c
./dev/em/e1000_82541.c
./dev/em/e1000_82541.h
./dev/em/e1000_82542.c
./dev/em/e1000_82543.c
./dev/em/e1000_82543.h
./dev/em/e1000_82571.c
./dev/em/e1000_82571.h
./dev/em/e1000_82575.c
./dev/em/e1000_82575.h
./dev/em/e1000_api.c
./dev/em/e1000_api.h
./dev/em/e1000_defines.h
./dev/em/e1000_hw.h
./dev/em/e1000_ich8lan.c
./dev/em/e1000_ich8lan.h
./dev/em/e1000_mac.c
./dev/em/e1000_mac.h
./dev/em/e1000_manage.c
./dev/em/e1000_manage.h
./dev/em/e1000_nvm.c
./dev/em/e1000_nvm.h
./dev/em/e1000_osdep.h
./dev/em/e1000_phy.c
./dev/em/e1000_phy.h
./dev/em/e1000_regs.h
./dev/re/if_re.c
./dev/mxge/eth_z8e.h
./dev/mxge/ethp_z8e.h
./dev/mxge/if_mxge.c
./dev/mxge/if_mxge_var.h
./dev/mxge/mcp_gen_header.h
./dev/mxge/mxge_lro.c
./dev/mxge/mxge_mcp.h
./dev/mxge/mxge_eth_z8e.c
./dev/mxge/mxge_ethp_z8e.c
./fs/devfs/devfs_vnops.c
./fs/fifofs/fifo_vnops.c
./i386/acpica/madt.c
./i386/conf/NOTES
./i386/i386/local_apic.c
./i386/i386/mp_machdep.c
./i386/i386/mptable.c
./i386/i386/nexus.c
./i386/include/apicvar.h
./ia64/ia64/nexus.c
./kern/uipc_usrreq.c
./kern/vfs_vnops.c
./modules/acpi/Makefile
./modules/em/Makefile
./modules/mxge/mxge_eth_z8e/Makefile
./modules/mxge/mxge_ethp_z8e/Makefile
./net/if_bridge.c
./netgraph/ng_l2tp.c
./opencrypto/cryptodev.c
./powerpc/powermac/grackle.c
./powerpc/powermac/hrowpic.c
./powerpc/powermac/macio.c
./powerpc/powermac/uninorth.c
./powerpc/powerpc/openpic.c
./powerpc/psim/iobus.c
./sparc64/ebus/ebus.c
./sparc64/isa/ofw_isa.c
./sparc64/pci/apb.c
./sparc64/pci/ofw_pci.c
./sparc64/pci/ofw_pcib_subr.c
./sparc64/pci/ofw_pcibus.c
./sparc64/pci/psycho.c
./sparc64/sbus/sbus.c
./sparc64/sparc64/nexus.c
./vm/vnode_pager.c
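
A list like the one above can be produced in several ways, for example from
the version control history or with find(1). As an illustration only, and
not necessarily how this list was generated, here is a small Python sketch
that walks a source tree and reports files whose modification time is later
than a cutoff. It assumes that the mtimes in the tree reflect when an update
last touched each file. The function name changed_since is made up for the
example.

```python
import os
from datetime import datetime


def changed_since(root, cutoff):
    """Return sorted "./relative/path" names of files under root
    whose modification time is later than cutoff (a datetime)."""
    threshold = cutoff.timestamp()
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime > threshold:
                rel = os.path.relpath(path, root)
                changed.append("./" + rel.replace(os.sep, "/"))
    return sorted(changed)
```

For the case described here, a call might look like
changed_since("/usr/src/sys", datetime(2007, 10, 1)), where the naive
datetime is interpreted in local time.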


