Date:      Wed, 4 Apr 2001 22:29:45 -0400 (EDT)
From:      "T. William Wells" <bill@twwells.com>
To:        barney@tp.databus.com (Barney Wolff)
Cc:        stable@freebsd.org
Subject:   Re: possible problem with dc driver
Message-ID:  <E14kzWr-000M5Y-00@twwells.com>
In-Reply-To: <20010404193113.A43291@tp.databus.com> from "Barney Wolff" at Apr 04, 2001 07:31:13 PM

Barney Wolff <barney@tp.databus.com> wrote:
> If the machine actually locks up, rather than the transfer
> just running slowly while other things on the machine run
> normally, then it's either h/w or s/w on the machine itself.

It had actually locked up. I've been daring my machine to lock up
again by running some tests while writing this e-mail. :) So far,
it's been behaving normally; that is, it isn't locking up, but the
transfers are slow. [I got a crash!]

> The full list of possibilities includes cables, the hub,
> the cards, the motherboards, the disk controllers, the disks.
> And of course, it might actually be a driver bug.

I've eliminated the cables and hub as possibilities, by swapping
things and using tests between various machines.

> To eliminate the disks, either run ttcp (from ports) or make
> sure the file is cached and ftp it to /dev/null.

Well, if memory serves, it was transfers to the machine that
caused lockups. [Memory did not serve -- I made it fail by
transferring from the machine.] My current tests have been with,
and without, writing the file to disk. Still no locks, though I've
definitely been getting losses.

> You might also try moving the card to a different slot, where
> it doesn't share an irq with something else (if dmesg.boot
> shows that it does that).

It doesn't.

> If you need to start swapping components, start with the cheapest
> and keep going until the problem is gone.

There's only one swappable "component" -- the NIC. :) Everything
else is on the motherboard.

> Try to enjoy this :)

Heh. I'm not a driver newbie (I've written a few) so I won't be
lost. However, I'm really not looking forward to debugging a
driver I didn't write. :) OTOH, the dc driver looks better than
most....

========

As my editorial notes said, I got it to crash. What I did was, on
P (the machine with the failing NIC):

	while :; do scp /usr/tmp/bigfile G:/usr/tmp/bigfile; done

and waited until the system froze. Occasionally, I noted that
after a transfer, there was no output from the next scp. The
system wasn't frozen; I could switch to another console and do a
ps, to see that scp was waiting in [connec] (I think) state. It
was possible to ^C out of that and re-enter the command to
continue the test.
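
For anyone repeating this kind of test, a minimal sketch of a variant of
the loop above that records the outcome and duration of each transfer, so
a stalled or failed run is easier to spot afterwards. The helper name
run_transfers and its arguments are assumptions made for illustration,
not something from the test described here; it does not detect a hung
scp on its own, it only logs what each completed attempt did.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original test): run a command
# COUNT times and print one log line per attempt.
# usage: run_transfers COUNT COMMAND [ARGS...]
run_transfers() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        start=$(date +%s)              # seconds since the epoch
        if "$@"; then
            status=ok
        else
            status=fail                # non-zero exit from the command
        fi
        end=$(date +%s)
        echo "run $i: $status, $((end - start))s"
        i=$((i + 1))
    done
}
```

It could be used as, e.g.:

	run_transfers 100 scp /usr/tmp/bigfile G:/usr/tmp/bigfile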

After a bit of work, I got a coredump. BTW, the handbook is in
error here; DDB panic followed by continue doesn't do the right
thing. What I did was 'call panic(0)' but I wouldn't be surprised
to discover there is a better way. Anyway, here's the relevant
portion of the backtrace. (I've removed the nonsense relating to
the control-alt-ESC that got me into the debugger.)

#14 0xc019787b in dc_rxeof (sc=0xc095f000) at /usr/src/sys/pci/if_dc.c:2365
#15 0xc0197edf in dc_intr (arg=0xc095f000) at /usr/src/sys/pci/if_dc.c:2640
#16 0xc020aa92 in slow_copyin ()
#17 0xc015d248 in sosend (so=0xc3d03480, addr=0x0, uio=0xc420fed8, top=0x0,
    control=0x0, flags=0, p=0xc3fa73c0) at /usr/src/sys/kern/uipc_socket.c:585
#18 0xc015178c in soo_write (fp=0xc09dbd00, uio=0xc420fed8, cred=0xc0a69180,
    flags=0, p=0xc3fa73c0) at /usr/src/sys/kern/sys_socket.c:81
#19 0xc014e3b1 in dofilewrite (p=0xc3fa73c0, fp=0xc09dbd00, fd=3,
    buf=0x8145004, nbyte=135088, offset=-1, flags=0)
    at /usr/src/sys/sys/file.h:163
#20 0xc014e26a in write (p=0xc3fa73c0, uap=0xc420ff80)
    at /usr/src/sys/kern/sys_generic.c:329
#21 0xc020c27d in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
      tf_edi = 134684360, tf_esi = -1077939264, tf_ebp = -1077939324,
      tf_isp = -1004470316, tf_ebx = 135088, tf_edx = 134682404, tf_ecx = 3,
      tf_eax = 4, tf_trapno = 0, tf_err = 2, tf_eip = 672976284, tf_cs = 31,
      tf_eflags = 642, tf_esp = -1077939368, tf_ss = 47})
    at /usr/src/sys/i386/i386/trap.c:1150
#22 0xc01ffe65 in Xint0x80_syscall ()
#23 0x804ef42 in ?? ()
#24 0x804c7aa in ?? ()
#25 0x804c111 in ?? ()
#26 0x804af05 in ?? ()

Suggestions as to where to go from here?
