Date:      Fri, 21 Jul 1995 14:31:27 -0400 (EDT)
From:      A boy and his worm gear <wpaul@skynet.ctr.columbia.edu>
To:        graichen@omega.physik.fu-berlin.de (Thomas Graichen)
Cc:        bugs@freebsd.org
Subject:   Re: 3 ways to crash FreeBSD (2.0.5 and 950412-SNAP)
Message-ID:  <199507211831.OAA05262@skynet.ctr.columbia.edu>
In-Reply-To: <9507210817.AA25956@omega.physik.fu-berlin.de> from "Thomas Graichen" at Jul 21, 95 10:17:53 am

Of all the gin joints in all the world, Thomas Graichen had to walk
into mine and say:
 
> hello - here are my 3 ways to crash FreeBSD:

[three-finger salute during boot causes strange crash]

Just a guess: I remember someone saying that CTRL-ALT-DEL is supposed to
cause the kernel to send a signal to init to tell it to start shutting
down the system. If it tries to do the same thing when init isn't
running (yet), then I imagine nasty things would happen.

> * simply do the following as root at the console:
> 
>   modload -u -o /tmp/saver_mod -e saver_init -q /lkm/${saver}_saver_mod.o
>   modload -u -o /tmp/saver_mod -e saver_init -q /lkm/${saver}_saver_mod.o
>   (this will giva an error)
>   modunload -n star_saver

I think I fixed this one. I noticed a similar problem with the if_sl
module; all the MISC type modules are susceptible to the bug. The problem
is that duplicate checking (i.e. checking to see if the user is trying
to load a second instance of the same module) didn't work quite right
with MISC modules: there are handler routines for other module types
(VFS, EXEC, etc...) which do the checking before the module is actually
called for the first time, but there is no such handler function for
MISC modules, and MISC modules aren't smart enough to do it themselves.
 
What's happening is that the duplicate checking is done
_after_ the module's internal initialization routine is called. By
the time the kernel notices the problem, the module has already wired
itself in. As part of the error handling, the kernel tries to unmap the
duplicate instance of the module. This is akin to gnawing your own arm
off: the next time the now-dead module's address space is referenced,
the system will blow up.

What you need to do to fix this is grab a new copy of /usr/src/sys/sys/lkm.h,
install it (it goes in /usr/include/sys too, if you have just the lkm
sources installed) and rebuild all the MISC modules. The new lkm.h
has a tiny modification in the DISPATCH() macro: it makes a quick call
to lkmexists() before actually trying to run the module's initialization
routine.
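
To illustrate the ordering, here is a minimal userland sketch in C. It is
not the actual lkm.h change: module_exists(), load_module(), saver_init()
and the loaded[] table are hypothetical stand-ins I made up for lkmexists(),
the DISPATCH() path, a MISC module's init routine and the kernel's module
list. The only point is that the duplicate check happens before any of the
module's own code runs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODULES 8

/* Hypothetical stand-in for the kernel's list of loaded modules. */
static const char *loaded[MAX_MODULES];
static int nloaded;
static int init_calls;          /* how many times an init routine ran */

/* Stand-in for lkmexists(): nonzero if a module of this name is loaded. */
static int module_exists(const char *name)
{
    for (int i = 0; i < nloaded; i++)
        if (strcmp(loaded[i], name) == 0)
            return 1;
    return 0;
}

/* Stand-in for a MISC module's init routine, which wires itself in. */
static int saver_init(void)
{
    init_calls++;
    return 0;
}

/* Check for a duplicate first; run the init routine only for a new name. */
static int load_module(const char *name, int (*init)(void))
{
    int error;

    assert(name != NULL && init != NULL);
    if (module_exists(name))
        return EEXIST;          /* refuse before any module code has run */
    if (nloaded == MAX_MODULES)
        return ENOMEM;
    if ((error = init()) != 0)
        return error;
    loaded[nloaded++] = name;
    return 0;
}
```

Calling load_module("star_saver", saver_init) twice in this sketch returns
0 the first time and EEXIST the second, and saver_init() runs only once,
so there is nothing to tear down after the failed second load.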

(This should be pulled into the STABLE branch if it hasn't already.)

> * the last is a problem i have since the early january SNAP's - this is what
> i've written some times before to jordan:
> 
> the system crashes then i log in (but couriously not if somebody else or
> root does this) via xdm - /var/log/messages says
> 
> Feb  9 10:49:44 julia /vmunix: Error in getattr: 70

Lessee... errno 70 is Stale NFS file handle.

> Feb  9 10:49:45 julia /vmunix: instruction pointer      = 0x8:0xf0125b1b
                                                                  ^^^^^^^^
Do an 'nm /vmunix' and see if you can find a symbol with an address
close to this one. This will give you a rough idea of where the system
is getting hosed (though it may not point you directly at the problem).
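
If you'd rather not eyeball the nm output, the lookup amounts to "find the
symbol with the highest address that is still at or below the instruction
pointer". A rough sketch in C follows; struct ksym, nearest_symbol() and
the entries in example_syms[] are all made up for illustration (real
entries would be parsed from the 'nm /vmunix' output, and I have no idea
what symbols actually sit near that address in your kernel).

```c
#include <assert.h>
#include <stddef.h>

/* One line of symbol-table output: an address and a symbol name. */
struct ksym {
    unsigned long addr;
    const char *name;
};

/*
 * Return the symbol with the highest address <= pc, or NULL if every
 * symbol lies above pc. The table does not need to be sorted.
 */
static const struct ksym *
nearest_symbol(const struct ksym *tab, size_t n, unsigned long pc)
{
    const struct ksym *best = NULL;

    assert(tab != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tab[i].addr <= pc &&
            (best == NULL || tab[i].addr > best->addr))
            best = &tab[i];
    }
    return best;
}

/* Made-up entries for illustration only; not real kernel symbols. */
static const struct ksym example_syms[] = {
    { 0xf0125a00UL, "hypothetical_func_a" },
    { 0xf0125b00UL, "hypothetical_func_b" },
    { 0xf0125c40UL, "hypothetical_func_c" },
};
```

With these made-up entries, nearest_symbol(example_syms, 3, 0xf0125b1bUL)
picks hypothetical_func_b, i.e. the fault would be 0x1b bytes into that
routine.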
 
> its absolutely reproducable: reboot - xdm is started - i try to login - the
> xdm login window disappears - i here the disk writing the coredump - but the
> problem did'nt appear if somebody else logs in (who has his homedirectory on
> another machine - but both dec alpha's osf/1 3.0 - we are mounting the
> homedirs via amd with nfs) - my nfs-homedir-server says:
> 
> Feb  9 10:45:48 sirius vmunix: NFS server: stale file handle fs(8,2054) file
> 116839 gen 792314558
> Feb  9 10:45:48 sirius vmunix:  getattr, client address = 130.133.3.235, errno
> 22

Errno 22 is Invalid Argument. This stuff is out of my league, but it
sounds like a locking problem or race condition. It happens that there
have been many changes to the NFS and VM code in FreeBSD-current.
You might try setting up a -current system and seeing if the problem
persists.
 
> * and one last thing:
> we mount all our homedir's via amd - which mounts them if they are needed
> (user logs in) and tries to unmount them automatic if they are no longer used
> - but FreeBSD seems to loose the directories from time to time - that means
> the directory will be mounted again and again (this way i sometimes get 20
> times the same dir mounted) - and after each of these overmountings i get an
> "getting cwd failed" from my tcsh (because the directory is new ... mounted) -
> do you have any ideas ?

Sorry: I use amd on my system and it works fine. My configuration is
probably different from yours though. (I mount each user's home directory
just once and then use amd to create symlinks that point to the right
directories in each filesystem. So amd mounts /q/elara/home/elara via NFS,
creates a /home/elara link that points to /q/elara/home/elara, then
it makes, for example, a /homes/foouser link that points to
/home/elara/foouser (and a /homes/baruser that points to
/home/elara/baruser, and a /homes/bazuser, etc.). This way, everyone's
home directory is always /homes/<username>.  Note that I use the Berkeley 
amd on all my machines too. The only special thing I have to do with 
FreeBSD is use the resvport option.)
 
> to all the points above - i'll try to give you all the information you need
> and as far as i can all the help i may give you - thanks in advance - t

Try -current first and see if the problems are still there.

-Bill

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~T~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Bill Paul            (212) 854-6020 | System Manager
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The Møøse Illuminati: ignore it and be confused, or join it and be confusing!
~~~~~~ "Welcome to All Things BSDish! If it's not BSDish, it's crap!" ~~~~~~~


