Date:      Tue, 14 Aug 2001 12:54:26 -0400
From:      Mike Tancsa <mike@sentex.net>
To:        "Michael M." <mikem@wmis.net>, freebsd-stable@FreeBSD.ORG
Subject:   Re: fxp SCB timeout problems, anyone have a solution?
Message-ID:  <5.1.0.14.0.20010814122135.01abc0a0@marble.sentex.ca>
In-Reply-To: <5.1.0.14.2.20010814120333.04649230@127.0.0.1>
References:  <5.1.0.14.0.20010814081137.0593b758@192.168.0.12> <5.1.0.14.2.20010813181107.04174140@127.0.0.1> <20010810100632.D18533@nexus.root.com> <BBDEEDD2EB67D311A0240008C74B9345129C78@ntxmidcity.sdccd.cc.ca.us> <BBDEEDD2EB67D311A0240008C74B9345129C78@ntxmidcity.sdccd.cc.ca.us>


Try /usr/ports/benchmarks/netperf in 100baseTX. I forget which test, but
one of them should be able to do it as I remember reading results on a pair
of OC-3 cards that pushed close to 120Mb/s.  Or, check to see if the ping
flood is generating or trying to generate data up near the medium limit. e.g.

netstat -ni;sleep 10;netstat -ni;sleep 10;netstat -ni

to see if you are near 100Mb/s
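
As a rough sketch of the arithmetic behind that check: take two byte-counter readings for the interface a known interval apart and convert the difference to Mb/s. The sample values below are made up for illustration, and the sketch assumes the readings come from byte columns (for example `netstat -nib`), since plain `netstat -ni` reports packet counts. The 32-bit wrap is also an assumption.

```python
def mbits_per_sec(bytes_before, bytes_after, interval_s, counter_bits=32):
    """Average throughput in Mb/s between two byte-counter samples."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Assumption: the counter wraps at 2**counter_bits; the modulo
    # tolerates at most one wrap between the two samples.
    delta = (bytes_after - bytes_before) % (1 << counter_bits)
    return delta * 8 / interval_s / 1_000_000


if __name__ == "__main__":
    # Hypothetical samples taken 10 seconds apart (not from this thread).
    rate = mbits_per_sec(1_000_000, 113_500_000, 10)
    print(f"{rate:.1f} Mb/s")  # 90.0 Mb/s
```

A result well below the nominal 100 Mb/s would suggest the ping flood is not actually saturating the link.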

Also, see if the problem happens at 10Mb full-duplex.  That's what I run
here on the ET version of the chip and I don't run into the problems.  Also,
I have never had the machine panic on me, even with the EM version, which I
have problems with, period.

         ---Mike

At 12:13 PM 8/14/01 -0400, Michael M. wrote:
>Good thing you asked.. this one slipped by me.  It turns out I can duplicate
>the problem on both the A *AND* the A2 board..  when I was testing the
>machine with the A board, I didn't realize that it was plugged in at
>100baseTX (f/d), while the A2 board was in running at "10baseT/UTP".
>I did some shuffling.. tried the A board in the 10baseT hub.. was able to cause
>kernel panic with just a few ping -fs1500's ...  BUT, when I plugged either
>machine into the 100baseTX switch, not only could I not crash them, but the
>fxp SCB timeout messages disappeared.  (Perhaps I just couldn't generate
>enough data??).
>
>So.. to sum it up, both the A and A2 boards will have fxp SCB timeout
>messages, as well as kernel panics if running at 10baseT.  The timeout
>messages and kernel panics go away if they're zipping along at 100baseTX.
>(Unless I'm not sending enough data to crash them at this higher speed...)
>
>Hope this helps!
>
>Mike
>
>
>At 08:16 AM 8/14/2001 -0400, Mike Tancsa wrote:
>
>>Very strange. I have been using the ET version of the board without the
>>timeout problems. With the EM, however, I can reproduce them in very short
>>order. I have a very similar board to yours (AOpen)
>>
>>fxp1: <Intel Pro/100 Ethernet> port 0xc400-0xc43f mem 0xd5101000-0xd5101fff irq 11 at device 8.0 on pci1
>>fxp1: Ethernet address 00:01:80:05:d2:67
>>inphy1: <i82562ET 10/100 media interface> on miibus1
>>inphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>>
>>But I don't see these problems.  Are you running at 10baseT? 100baseTX?
>>Half or full duplex?
>>
>>         ---Mike
>>
>>At 07:07 PM 8/13/2001 -0400, Michael M. wrote:
>>>Greetings..
>>>
>>>I've been experiencing problems with fxp SCB timeouts too..  I've actually
>>>had the problem for almost a month now.. starting with a build from
>>>7/17/2001, 06:42:09 EDT.  I installed FreeBSD on this system about
>>>a month ago, running a -STABLE build dated 7/17/2001.  I've cvsup'd a
>>>couple of times since then, through my recent build today:
>>>
>>>FreeBSD 4.4-PRERELEASE (CORP2) #4: Mon Aug 13 14:34:10 EDT 2001
>>>
>>>I'm still experiencing the same problem.  For the past month, this machine
>>>would hard lock every few days, when it was sitting idle (not in production),
>>>seemingly without reason.  After reading some of the new posts about the
>>>fxp messages on the list, I decided to do some testing... If I set up a ping -f
>>>to hammer away at the machine in question from just 2 other machines on
>>>the LAN, I can make corp2 (the machine with the fxp problem) kernel panic.
>>>
>>>At first it was surprising to me that this machine was experiencing a problem,
>>>as it is running the same board as another machine that was just built that
>>>has been running -STABLE fine for at least 2 months now (its current build
>>>date is 7/24/2001).  After further investigation, I found corp2 was actually
>>>running the Intel "D815EEA2" board - a slightly different model than the
>>>"D815EEA" board the other (working) machine is running.
>>>
>>>A snippet from Intel's website:
>>>
>>>What is the difference between Intel Desktop Boards D815EEA and
>>>D815EPEA and the Intel Desktop Boards D815EEA2 and D815EPEA2?
>>>The main differences are Intel® Desktop Boards D815EEA2 and D815EPEA2
>>>now support two additional USB ports out the back panel (a total of four). Also
>>>the game port on the back panel has been removed.
>>>Note: The D815EEA2 and D815EPEA2 require a different I/O shield and
>>>software image than the one used on D815EEA and D815EPEA.
>>>
>>>Possible bug in just the A2?
>>>
>>>Anyway.. here is the dmesg, and stack trace from the crash:
>>>(notice the fxp0 / SCB timeouts too..)
>>>
>>>Fatal trap 10: trace trap while in kernel mode
>>>instruction pointer     = 0x8:0xc0212d86
>>>stack pointer           = 0x10:0xc025085c
>>>frame pointer           = 0x10:0x0
>>>code segment            = base 0x0, limit 0xfffff, type 0x1b
>>>                         = DPL 0, pres 1, def32 1, gran 1
>>>processor eflags        = interrupt enabled, IOPL = 0
>>>current process         = Idle
>>>interrupt mask          = none
>>>trap number             = 10
>>>panic: trace trap
>>>
>>>syncing disks... 4 4 4 4 4 4 4 fxp0: SCB timeout: 0x70, 0x0, 0x50 0x400
>>>4 4 4 4 4 fxp0: SCB timeout: 0x80, 0x0, 0x50 0x400
>>>fxp0: SCB timeout: 0x80, 0x0, 0x50 0x400
>>>4 4 4 4 4 4 fxp0: SCB timeout: 0x80, 0x0, 0x50 0x400
>>>fxp0: SCB timeout: 0x80, 0x0, 0x50 0x400
>>>4 4
>>>giving up on 4 buffers
>>>ad0: WRITE command timeout tag=0 serv=0 - resetting
>>>ata0: resetting devices .. done
>>>fxp0: SCB timeout: 0x80, 0x0, 0x50 0x400
>>>fxp0: SCB timeout: 0x80, 0x0, 0x50 0x400
>>>Uptime: 11m10s
>>>
>>>dumping to dev #ad/0x30001, offset 24787
>>>dump ata0: resetting devices .. done
>>>254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237
>>>236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219
>>>218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201
>>>200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183
>>>182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165
>>>164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147
>>>146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129
>>>128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111
>>>110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91
>>>90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67
>>>66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43
>>>42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19
>>>18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 succeeded
>>>Automatic reboot in 15 seconds - press a key on the console to abort
>>>Rebooting...
>>>Copyright (c) 1992-2001 The FreeBSD Project.
>>>Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>>>         The Regents of the University of California. All rights reserved.
>>>FreeBSD 4.4-PRERELEASE #4: Mon Aug 13 14:34:10 EDT 2001
>>>     root@corp2.:/usr/obj/usr/src/sys/CORP2
>>>Timecounter "i8254"  frequency 1193182 Hz
>>>Timecounter "TSC"  frequency 996769942 Hz
>>>CPU: Pentium III/Pentium III Xeon/Celeron (996.77-MHz 686-class CPU)
>>>   Origin = "GenuineIntel"  Id = 0x68a  Stepping = 10
>>>Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
>>>real memory  = 267124736 (260864K bytes)
>>>config> di sn0
>>>config> di lnc0
>>>config> di ie0
>>>config> di cs0
>>>config> q
>>>avail memory = 257245184 (251216K bytes)
>>>Preloaded elf kernel "kernel" at 0xc02d5000.
>>>Preloaded userconfig_script "/boot/kernel.conf" at 0xc02d509c.
>>>Pentium Pro MTRR support enabled
>>>md0: Malloc disk
>>>npx0: <math processor> on motherboard
>>>npx0: INT 16 interface
>>>pcib0: <Host to PCI bridge> on motherboard
>>>pci0: <PCI bus> on pcib0
>>>pci0: <Intel model 1132 VGA-compatible display device> at 2.0 irq 11
>>>pcib1: <Intel 82801BA/BAM (ICH2) Hub to PCI bridge> at device 30.0 on pci0
>>>pci1: <PCI bus> on pcib1
>>>fxp0: <Intel Pro/100 Ethernet> port 0xdf00-0xdf3f mem 0xff8ff000-0xff8fffff irq 11 at device 8.0 on pci1
>>>fxp0: Ethernet address 00:03:47:xx:xx:xx
>>>inphy0: <i82562ET 10/100 media interface> on miibus0
>>>inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>>>isab0: <Intel 82801BA/BAM (ICH2) PCI to LPC bridge> at device 31.0 on pci0
>>>isa0: <ISA bus> on isab0
>>>atapci0: <Intel ICH2 ATA100 controller> port 0xffa0-0xffaf at device 31.1 on pci0
>>>ata0: at 0x1f0 irq 14 on atapci0
>>>ata1: at 0x170 irq 15 on atapci0
>>>pci0: <Intel 82801BA/BAM (ICH2) USB controller USB-A> at 31.2 irq 11
>>>pci0: <unknown card> (vendor=0x8086, dev=0x2443) at 31.3 irq 9
>>>pci0: <Intel 82801BA/BAM (ICH2) USB controller USB-B> at 31.4 irq 10
>>>pci0: <unknown card> (vendor=0x8086, dev=0x2445) at 31.5 irq 9
>>>orm0: <Option ROMs> at iomem 0xc0000-0xcbfff,0xcc000-0xcd7ff on isa0
>>>fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
>>>fdc0: FIFO enabled, 8 bytes threshold
>>>fd0: <1440-KB 3.5" drive> on fdc0 drive 0
>>>atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
>>>atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
>>>kbd0 at atkbd0
>>>psm0: failed to get data.
>>>psm0: <PS/2 Mouse> irq 12 on atkbdc0
>>>psm0: model Generic PS/2 mouse, device ID 0
>>>vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
>>>sc0: <System console> at flags 0x100 on isa0
>>>sc0: VGA <16 virtual consoles, flags=0x300>
>>>sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
>>>sio0: type 16550A
>>>sio1 at port 0x2f8-0x2ff irq 3 on isa0
>>>sio1: type 16550A
>>>ad0: 39266MB <IBM-DTLA-305040> [79780/16/63] at ata0-master UDMA100
>>>acd0: CDROM <FX4824T> at ata1-master using PIO4
>>>Mounting root from ufs:/dev/ad0s1a
>>>
>>>----------
>>>
>>>corp2# nm -n /kernel | grep c0212d
>>>c0212d00 T __bb_init_func
>>>c0212d1c t idle
>>>c0212d64 t idle_loop
>>>c0212d84 T default_halt
>>>c0212d88 T cpu_switch
>>>corp2#
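
The lookup above can be sketched as a small script: given the sorted `nm -n` excerpt, find the last symbol that starts at or below the faulting instruction pointer. This is only an illustration using the five symbols quoted in the post, so it is valid only for addresses inside that excerpt; the helper name `resolve` is hypothetical.

```python
import bisect

# Symbol table copied from the `nm -n /kernel | grep c0212d` output above.
SYMBOLS = [
    (0xC0212D00, "__bb_init_func"),
    (0xC0212D1C, "idle"),
    (0xC0212D64, "idle_loop"),
    (0xC0212D84, "default_halt"),
    (0xC0212D88, "cpu_switch"),
]


def resolve(addr, symbols=SYMBOLS):
    """Return (name, offset) of the last symbol starting at or below addr."""
    starts = [start for start, _ in symbols]  # nm -n output is already sorted
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        raise LookupError(f"{addr:#x} is below the first symbol")
    start, name = symbols[i]
    return name, addr - start


if __name__ == "__main__":
    name, offset = resolve(0xC0212D86)  # instruction pointer from the trap
    print(f"{name}+{offset:#x}")  # default_halt+0x2
```

With the addresses from the post, the instruction pointer falls two bytes into `default_halt`, which fits the "current process = Idle" line in the trap message.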
>>>
>>>----------
>>>
>>>(kgdb) where
>>>#0  dumpsys () at ../../kern/kern_shutdown.c:472
>>>#1  0xc0142f7b in boot (howto=256) at ../../kern/kern_shutdown.c:312
>>>#2  0xc0143348 in poweroff_wait (junk=0xc024856c, howto=-1071349574) at
>>>../../kern/kern_shutdown.c:580
>>>#3  0xc0213c76 in trap_fatal (frame=0xc025081c, eva=0) at
>>>../../i386/i386/trap.c:951
>>>#4  0xc021367f in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = -65520,
>>>tf_edi = -1, tf_esi = 0, tf_ebp = 0, tf_isp = -1071314872,
>>>       tf_ebx = 0, tf_edx = 75616, tf_ecx = -886177536, tf_eax = 0,
>>> tf_trapno = 10, tf_err = 0, tf_eip = -1071567482, tf_cs = 8,
>>>       tf_eflags = 582, tf_esp = -1071567487, tf_ss = 15}) at
>>> ../../i386/i386/trap.c:613
>>>(kgdb)
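
kgdb prints the trap frame fields as signed 32-bit integers, which makes them hard to compare with the hex addresses in the panic message. A minimal sketch of the conversion (the helper name `to_u32` is hypothetical):

```python
def to_u32(value):
    """Reinterpret a signed 32-bit integer as its unsigned bit pattern."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    # tf_eip as printed by kgdb in the backtrace above.
    print(hex(to_u32(-1071567482)))  # 0xc0212d86
```

The result matches the instruction pointer 0xc0212d86 reported in the "Fatal trap 10" message.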
>>>
>>>............................
>>>
>>>Is it likely a work-around can be found.. or do I need to scrap this board?
>>>
>>>I hope this helps to shed some light on this problem...
>>>If any additional information is needed, please don't hesitate to ask.
>>>
>>>Mike
>>>
>>>At 10:06 AM 8/10/2001 -0700, you wrote:
>>>> >
>>>> >There is currently no fix for this. I went through the same thing about 2
>>>> >months ago with Intel's newest i815 chipset MB. What I got from David
>>>> >Greenman is that although the 82562 is seen as an fxp0, it was never tested.
>>>> >I had looked into this problem quite a bit and forget exactly why it does
>>>> >not work, but it does not work. You might try upgrading to the newest stable.
>>>>
>>>>    Jonathan Lemon looked into this problem extensively and found that there
>>>>were actually some bugs in the new chip that were responsible for the problem.
>>>>There were some Intel workarounds, which were implemented, but I don't think
>>>>they solved the problem for all versions of the new chips.
>>>>
>>>>-DG
>>>>
>>>>David Greenman
>>>>Co-founder, The FreeBSD Project - http://www.freebsd.org
>>>>President, TeraSolutions, Inc. - http://www.terasolutions.com
>>>>Pave the road of life with opportunities.
>>>
>>>--------------------------------------------------------------------
>>>Mike Tancsa,                                      tel +1 519 651 3400
>>>Network Administration,                           mike@sentex.net
>>>Sentex Communications                             www.sentex.net
>>>Cambridge, Ontario Canada                         www.sentex.net/mike
>>
>>
>>To Unsubscribe: send mail to majordomo@FreeBSD.org
>>with "unsubscribe freebsd-stable" in the body of the message


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message



