Date:      Wed, 19 Apr 95 23:25:42 +0000
Subject:   IP problem with 950412-SNAP (and earlier -SNAPs)
Message-ID:  <"MAC-950420002635-558D*/G=Andrew/S=Gordon/O=Net-Tel Computer Systems Ltd/PRMD=Net-Tel/ADMD=Gold 400/C=GB/"@MHS>

I have been suffering from a networking-related problem that causes my
machines to spontaneously reboot.  I had previously assumed that this was
down to the new VM/buffer stuff, but since 950412 is supposed to be much more
stable, yet I have exactly the same problem, maybe my problem is something
else.

The problem is hard to get a handle on, since you can do various tests on one
day and think you are getting a nice reproducible pattern, yet on another day
it can be different.  The best common factor I can find is situations where
packets would probably be discarded due to congestion at an internal
interface -  the following are typical failure cases:

a) Use tcpdump injudiciously on a busy network - dumping with some filter
that keeps the level of traffic down to what the terminal can keep up with
is fine, but a 'dump everything' runs for a while and then reboots the
machine when traffic on the network hots up.

b) Start the IIJ PPP in on-demand mode.  Ping some external host and it dials
up as required.  You can now ping away as much as you like and it doesn't
break.  However, if you do ANYTHING involving TCP packets, the machine
reboots.

c) Build a kernel with IPFIREWALL, then give a silly ipfw command that blocks
all packets on all interfaces.  The machine reboots after a few seconds (and
can be kicked over faster by pinging localhost for example).

I am generally running a kernel that is essentially generic with the devices
that I don't use deleted; obviously for tests a) and c) I also have
"pseudo-device bpfilter 4", and for test c) "options IPFIREWALL".  Since the
latest snap has tun0 in the generic kernel, I have also tried test b) having
booted from kernel.GENERIC out of the bindist - but in all cases the
behaviour seems exactly the same whichever brew of kernel I try.
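
A sketch of the additions to the kernel configuration file, using only the
two lines quoted above.  The comments are annotation, and the rest of the
file is assumed to be GENERIC with the unused devices deleted:

    # packet filter devices, present for tests a) and c)
    pseudo-device   bpfilter 4
    # firewall code, added only for test c)
    options         IPFIREWALL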

When I say 'the machine reboots' I mean just that - no kernel panic message,
just a freeze for a second or so then the BIOS memory count starts.  This
makes it rather hard to guess what is going on...

I am not really inclined to blame the hardware, since I have two machines
that are COMPLETELY different in hardware configuration yet behave the same
way, and both will stay up for days on end if I avoid doing any of the things
I know will kill them.

For reference, the two machines are:

a) 486/66, 16MB, Adaptec VLB SCSI with disc, tape and CD attached, WD
Paradise VLB video, SMC ISA ethernet card (16 bit).  IDE disc fitted but not
in use (DOS on it).

b) 386SX/20, 10MB, IDE disk, cheapo unaccelerated VGA card, SMC ISA ethernet
card (8 bit).

Since I seem to be able to kill myself so easily, I'm surprised no one else
has run into this one.  Any ideas?

