Date: Wed, 27 Oct 1999 22:40:00 -0400 (EDT)
From: Andrew Gallatin <gallatin@cs.duke.edu>
To: freebsd-hackers@freebsd.org
Cc: freebsd-alpha@freebsd.org
Subject: ip forwarding broken on alpha
Message-ID: <14359.43410.495963.975277@grasshopper.cs.duke.edu>

I have an older AlphaStation 600 5/266 running -current (cvsupped last
week) which is set up as a router between two 100Mb networks. When the
machine is pushed fairly hard (like running a
netperf -tUDP_STREAM -- -m 100 across the router, e.g. about 10-20k
100-byte packets/sec), the alpha falls over almost instantly. I have
not enabled any NAT or firewall functionality, just ip forwarding.

It generally crashes in MCLGET down in the ethernet driver's receiver
interrupt handler. The driver doesn't seem to matter -- I've tried
Intel Etherexpress Pro 100Bs and 3Com 3c905C-TX Fast Etherlink XLs.
A typical stack trace looks like this:

fatal kernel trap:

    trap entry = 0x2 (memory management fault)
    a0         = 0x826417b78f222
    a1         = 0x1
    a2         = 0x0
    pc         = 0xfffffc00004b31bc
    ra         = 0xfffffc00004b315c
    curproc    = 0

ddbprinttrap from 0xfffffc00004b31bc
ddbprinttrap(0x826417b78f222, 0x1, 0x0, 0x2)
panic: trap
panic
Stopped at      Debugger+0x2c:  ldq     ra,0(sp) <0xfffffe0005ab57d0>   <ra=0xfffffc00005042e0,sp=0xfffffe0005ab57d0>
db> tr
Debugger() at Debugger+0x2c
panic() at panic+0xf4
trap() at trap+0x5cc
xl_newbuf() at xl_newbuf+0x15c
(null)() at 0x4
db> c

This maps to pci/if_xl.c:1654. But the if_xl driver is probably not at
fault, as I can crash just as easily in fxp_add_rfabuf() when using
Intel NICs. Before trying the 3Com cards, I had been working under the
assumption that it was a problem with the fxp driver.

I instrumented the mbuf routines somewhat (I hate debugging macros)
and it seems the bad access is due to mclfree getting trashed &
replaced by a "random" bad value (0x826417b78f222 in this panic).

This might be a red herring, but I've found that if I run the entire
ip_input path under splnet() (added splnet() around the call to
ip_input() in ipintr()), things get a hell of a lot more stable.
Rather than crashing in a few seconds, it sometimes takes minutes.
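
To make the splnet() experiment above concrete, here is a minimal user-space model of it. It is only a sketch and not kernel code: every model_* name and both IPL_* values are hypothetical stand-ins, and the only behaviour assumed is that splnet() raises the priority level and returns the old one, and that splx() restores it.

```c
/*
 * User-space model of wrapping ip_input() in splnet()/splx().
 * NOT kernel code: all model_* names and IPL_* values are hypothetical.
 */
#include <stddef.h>

enum { IPL_NONE = 0, IPL_NET = 1 };

int model_cur_ipl = IPL_NONE;  /* simulated current priority level */
int model_exposed;             /* packets handled while a net interrupt could preempt */

/* Raise the simulated level to at least IPL_NET; return the previous level. */
int model_splnet(void)
{
    int old = model_cur_ipl;

    if (model_cur_ipl < IPL_NET)
        model_cur_ipl = IPL_NET;
    return old;
}

/* Restore a level previously returned by model_splnet(). */
void model_splx(int s)
{
    model_cur_ipl = s;
}

/* In this model a network device interrupt is deliverable only below IPL_NET. */
int model_net_intr_deliverable(void)
{
    return model_cur_ipl < IPL_NET;
}

/* Stand-in for ip_input(): records whether a receive interrupt could preempt it. */
void model_ip_input(int pkt)
{
    (void)pkt;
    if (model_net_intr_deliverable())
        model_exposed++;
}

/*
 * Stand-in for ipintr(): drains a queue of packets. When protect is
 * nonzero, each model_ip_input() call runs between model_splnet() and
 * model_splx(). Returns how many packets were handled while exposed.
 */
int model_ipintr(const int *queue, size_t n, int protect)
{
    size_t i;

    model_exposed = 0;
    for (i = 0; i < n; i++) {
        int s = 0;

        if (protect)
            s = model_splnet();
        model_ip_input(queue[i]);
        if (protect)
            model_splx(s);
    }
    return model_exposed;
}
```

With protect set to 0, every packet in the model is processed while a receive interrupt could preempt it; with protect set to 1, none are. The model only shows that the change narrows the window in which the receive handler can run on top of the input path. It says nothing about what actually overwrites mclfree.
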
And rather than an illegal access, I tend to run out of kernel stack
space (either a panic("possible stack overflow\n"); in
alpha/alpha/interrupt.c, or I end up in the SRM console after calling
halt from a PC which isn't in the kernel, which smells like an overrun
stack to me). I'm not sure if this is related, or if it is a separate
problem entirely.

Since an x86 (PII@300MHz, 440lx motherboard, kernel built from same
sources) is rock solid under the same workload, I suspect there's
something wrong that is alpha specific, but I'll be damned if I can
figure it out. My best guess is that it has something to do with the
different interrupt structure on i386 & alpha. As I understand it,
the i386 can mask off particular interrupt sources, but the alpha
simply raises & lowers the ipl with the following levels available
(from alpha/include/alpha_cpu.h):

#define ALPHA_PSL_IPL_0         0x0000          /* all interrupts enabled */
#define ALPHA_PSL_IPL_SOFT      0x0001          /* software ints disabled */
#define ALPHA_PSL_IPL_IO        0x0004          /* I/O dev ints disabled */
#define ALPHA_PSL_IPL_CLOCK     0x0005          /* clock ints disabled */
#define ALPHA_PSL_IPL_HIGH      0x0006          /* all but mchecks disabled */

Can anybody hazard a guess as to what's going on? I've appended dmesg
output & my config file for completeness.

BTW, as long as the load is light, ip forwarding seems to work. I
can't seem to make this happen using 2 100Mb tulips in this box (which
must copy on the input path due to DMA alignment problems; this slows
things down quite a bit, due to the low memory bandwidth of this
machine).

Thanks,

Drew

------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer  http://www.cs.duke.edu/~gallatin
Duke University                         Email: gallatin@cs.duke.edu
Department of Computer Science          Phone: (919) 660-6590

Copyright (c) 1992-1999 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 4.0-CURRENT #4: Wed Oct 27 11:35:25 EDT 1999
    gallatin@torrent.cs.duke.edu:/usr/project/ari_scratch2/gallatin/src/sys/compile/ALPHA
AlphaStation 500 or 600 (KN20AA)
Digital AlphaStation 600 5/266, 266MHz
8192 byte page size, 1 processor.
CPU: EV5 (21164) major=5 minor=0
OSF PAL rev: 0x1000000020116
real memory  = 131940352 (128848K bytes)
avail memory = 122200064 (119336K bytes)
Preloaded elf kernel "kernel" at 0xfffffc0000674000.
cia0: ALCOR/ALCOR2, pass 2
pcib0: <2117x PCI host bus adapter> on cia0
pci0: <PCI bus> on pcib0
xl0: <3Com 3c905C-TX Fast Etherlink XL> irq 8 at device 7.0 on pci0
xl0: interrupting at CIA irq 8
xl0: Ethernet address: 00:50:da:09:3e:41
miibus0: <MII bus> on xl0
xlphy0: <3c905C 10/100 internal PHY> on miibus0
xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pcib1: <DEC 21050 PCI-PCI bridge> at device 8.0 on pci0
pci1: <PCI bus> on pcib1
de0: <Digital 21040 Ethernet> irq 16 at device 0.0 on pci1
de0: interrupting at CIA irq 16
de0: DEC 21040 [10Mb/s] pass 2.3
de0: address 08:00:2b:e7:e6:d6
isp0: <Qlogic ISP 1020/1040 PCI SCSI Adapter> irq 17 at device 1.0 on pci1
isp0: interrupting at CIA irq 17
isp0: invalid NVRAM header (aa,aa,aa,aa)
isp0: isp_mboxcmd sees mailbox int with 0x0 in mbox0
isp0: isp_mboxcmd sees mailbox int with 0x0 in mbox0
<..>
isp1: <Qlogic ISP 1020/1040 PCI SCSI Adapter> irq 18 at device 2.0 on pci1
isp1: interrupting at CIA irq 18
isp1: isp_mboxcmd sees mailbox int with 0x0 in mbox0
isp1: invalid NVRAM header (55,55,55,55)
isp1: isp_mboxcmd sees mailbox int with 0x0 in mbox0
isp1: isp_mboxcmd sees mailbox int with 0x0 in mbox0
de1: <Digital 21140 Fast Ethernet> irq 12 at device 9.0 on pci0
de1: interrupting at CIA irq 12
de1: DEC DE500-XA 21140 [10-100Mb/s] pass 1.1
de1: address 00:00:f8:00:99:ba
de1: enabling Full Duplex 100baseTX port
isab0: <Intel 82375EB PCI-EISA bridge> at device 10.0 on pci0
isa0: <ISA bus> on isab0
xl1: <3Com 3c905C-TX Fast Etherlink XL> irq 0 at device 11.0 on pci0
xl1: interrupting at CIA irq 0
xl1: Ethernet address: 00:50:da:09:42:41
miibus1: <MII bus> on xl1
xlphy1: <3c905C 10/100 internal PHY> on miibus1
xlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl2: <3Com 3c905C-TX Fast Etherlink XL> irq 4 at device 12.0 on pci0
xl2: interrupting at CIA irq 4
xl2: Ethernet address: 00:50:da:09:3f:e8
miibus2: <MII bus> on xl2
xlphy2: <3c905C 10/100 internal PHY> on miibus2
xlphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
mcclock0: <MC146818A real time clock> at port 0x70-0x71 on isa0
sio0 at port 0x3f8-0x3ff irq 4 on isa0
sio0: type 16550A, console
sio0: interrupting at ISA irq 4
sio1 at port 0x2f8-0x2ff irq 3 flags 0x80 on isa0
sio1: type 16550A
sio1: interrupting at ISA irq 3
fdc0: interrupting at ISA irq 6
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <keyboard controller (i8042)> at port 0x60-0x6f on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
atkbd0: interrupting at ISA irq 1
struct nfssvc_sock bloated (> 256bytes)
Try reducing NFS_UIDHASHSIZ
struct nfsuid bloated (> 128bytes)
Try unionizing the nu_nickname and nu_flag fields
Timecounter "alpha"  frequency 266671691 Hz
Waiting 3 seconds for SCSI devices to settle
isp0: driver initiated bus reset of bus 0
isp1: driver initiated bus reset of bus 0
de0: autosense failed: cable problem?
Creating DISK da0
Creating DISK da1
Creating DISK cd0
da0 at isp0 bus 0 target 0 lun 0
da0: <SEAGATE ST15150W 0023> Fixed Direct Access SCSI-2 device
da0: 20.000MB/s transfers (10.000MHz, offset 12, 16bit), Tagged Queueing Enabled
da0: 4095MB (8388315 512 byte sectors: 255H 63S/T 522C)
da1 at isp0 bus 0 target 1 lun 0
da1: <SEAGATE ST32171W 0484> Fixed Direct Access SCSI-2 device
da1: 20.000MB/s transfers (10.000MHz, offset 12, 16bit), Tagged Queueing Enabled
da1: 2062MB (4223444 512 byte sectors: 255H 63S/T 262C)
cd0 at isp0 bus 0 target 5 lun 0
cd0: <DEC RRD45 (C) DEC 1645> Removable CD-ROM SCSI-2 device
cd0: 4.032MB/s transfers (4.032MHz, offset 12)
cd0: Attempt to query device size failed: NOT READY, Medium not present

#
machine         alpha
cpu             EV4
cpu             EV5
ident           ALPHA
maxusers        32

# Platforms supported
options         DEC_AXPPCI_33           # UDB, Multia, AXPpci33, Noname
options         DEC_EB164               # EB164, PC164, PC164LX, PC164SX
options         DEC_EB64PLUS            # EB64+, Aspen Alpine, etc
options         DEC_2100_A50            # AlphaStation 200, 250, 255, 400
options         DEC_KN20AA              # AlphaStation 500, 600
options         DEC_ST550               # Personal Workstation 433, 500, 600
options         DEC_ST6600              # xp1000, dp264, ds20, ds10, family
#options        DEC_3000_300            # DEC3000/300* Pelic* family
#options        DEC_3000_500            # DEC3000/[4-9]00 Flamingo/Sandpiper family

options         INET                    #InterNETworking
options         FFS                     #Berkeley Fast Filesystem
options         NFS                     #Network Filesystem
options         MFS                     #Memory Filesystem
options         MFS_ROOT                #Memory Filesystem as rootfs
options         MSDOSFS                 #MSDOS Filesystem
options         CD9660                  #ISO 9660 Filesystem
options         CD9660_ROOT             #CD-ROM usable as root device
options         FFS_ROOT                #FFS usable as root device [keep this!]
options         NFS_ROOT                #NFS usable as root device
options         PROCFS                  #Process filesystem
options         COMPAT_43               #Compatible with BSD 4.3 [KEEP THIS!]
options         SCSI_DELAY=3000         #Be pessimistic about Joe SCSI device
options         UCONSOLE                #Allow users to grab the console
options         SOFTUPDATES

# Standard busses
controller      pci0
controller      isa0

# A single entry for any of these controllers (ncr, ahb, ahc, amd) is
# sufficient for any number of installed devices.
controller      ncr0
controller      isp0
controller      ahc0
#controller     esp0

controller      scbus0

device          da0
device          sa0
device          pass0
device          cd0

#
# ATA and ATAPI devices
# This is work in progress, use at your own risk.
# It currently reuses the majors of wd.c and friends.
# It cannot co-exist with the old system in one kernel.
# You only need one "controller ata0" for it to find all
# PCI devices on modern machines.
controller      ata0
device          atadisk0                # ATA disk drives
device          atapicd0                # ATAPI CDROM drives
device          atapifd0                # ATAPI floppy drives
device          atapist0                # ATAPI tape drives

# real time clock
device          mcclock0 at isa0 port 0x70

controller      fdc0    at isa? port IO_FD1 irq 6 drq 2
disk            fd0     at fdc0 drive 0

controller      atkbdc0 at isa? port IO_KBD
device          atkbd0  at atkbdc? irq 1
device          psm0    at atkbdc? irq 12

device          vga0    at isa? port ? conflicts

# splash screen/screen saver
pseudo-device   splash

# syscons is the default console driver, resembling an SCO console
device          sc0     at isa?

device          sio0    at isa0 port IO_COM1 irq 4
device          sio1    at isa0 port IO_COM2 irq 3 flags 0x80

# MII bus support, required for some 10/100 NICs.
controller      miibus0

# Operational PCI Ethernet drivers.
device          al0
device          ax0
device          de0
device          dm0
device          fxp0
device          le0
device          mx0
device          pn0
device          rl0
device          sf0
device          sis0
device          ste0
device          tl0
device          vr0
device          wb0
device          xl0

pseudo-device   loop
pseudo-device   ether
pseudo-device   sl      1
pseudo-device   ppp     1
pseudo-device   tun
pseudo-device   pty
pseudo-device   bpf     4

# KTRACE enables the system-call tracing facility ktrace(2).
# This adds 4 KB bloat to your kernel, and slightly increases
# the costs of each syscall.
options         KTRACE                  #kernel tracing

# This provides support for System V shared memory and message queues.
#
options         SYSVSHM
options         SYSVMSG
options         SYSVSEM

#
# everything above is essentially GENERIC. customizations below.
#
options         DDB
options         BREAK_TO_DEBUGGER


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-alpha" in the body of the message