Date:      Mon, 28 Apr 1997 11:51:19 +0930 (CST)
From:      Michael Smith <msmith@atrad.adelaide.edu.au>
To:        Shimon@i-Connect.Net (Simon Shapiro)
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: A Desparate Plea for Help...
Message-ID:  <199704280221.LAA13874@genesis.atrad.adelaide.edu.au>
In-Reply-To: <XFMail.970427164721.Shimon@i-Connect.Net> from Simon Shapiro at "Apr 27, 97 04:24:58 pm"

Simon Shapiro stands accused of saying:
> 
> At first, the problems appear to have been ahc related.  This was worked on
> and appears to be corrected.  Then we had a problem with the sd.c code (?)
> flipping when an Iomega Jaz drive is in sleep mode and being accessed.
> This causes panics and we know how to live with it (by not using the Jaz
> drive;  We cannot get out of Iomega ANY technical data, not even how to
> keep it spinning for a while longer.  So much for using Jaz drive in this
> project (their loss of large sale.  Not really mine)).

Simon; I have heard this reported before, but I have a Jaz, and
regularly access it when it's not spinning.  I have _never_ had a
problem with this, and if you can provide something that can be
reproduced elsewhere, then I am certain that it will be fixed.

> The last and most troubling problem is a complete crash/freeze when running
> certain X11 applications;  At ONE point, we managed to observe a panic that
> went something like:
> 
> Fatal trap 12 page fault at 0xf71e0014
> 
> _spec_open+0x6e
> _vm_open
> _open
> _syscall
> _Xsyscall
> 
> There was a mention of bash in this panic.

That would probably just be the 'current process'.

> I need to solve this problem, not to be told (indirectly) that it must be
> my fault, as it does not happen on someone else's machine.

That's not what that response means.  "I can't make it happen here"
means "I can't work out what is wrong because I can't reproduce the
problem, and I need to reproduce the problem to have all the
information I need to hand".

If you can configure your system(s) to dump kernel cores, and put the 
cores and matching kernels _compiled_with_debugging_information_
up for FTP, this is _very_ helpful.  If you can give details so that 
we can check out exactly the same sources as you are using, that'll
help too.

> I have spent enough years (over 25) in the Unix kernel business to know how
> to read a config file and to know that when applications make a system
> call, they are NOT supposed to panic the system.  Even if the confoguration
> file is not perfect (this should no compile, and should not compile and
> crash on an open(2) call).

The trap you see above is somewhere near the top of spec_open in
sys/miscfs/specfs/spec_vnops.c.  Without knowing exactly what the trap
was, specifically the fault address, it's hard to infer more.  There
are several pointer references near the top of spec_open that might be
the problem; the most likely IMHO is:

        /*
         * Don't allow open if fs is mounted -nodev.
         */
        if (vp->v_mount && (vp->v_mount->mnt_flag & MNT_NODEV))
                return (ENXIO);

We have seen problems with vp->v_mount being NULL before; this
appears most often with MFS filesystems.  Are you using MFS in your
systems?
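
To make the pointer chain in that check concrete, here is a minimal
userland sketch.  It is not the kernel source: sketch_mount,
sketch_vnode, SKETCH_MNT_NODEV and sketch_nodev_check are hypothetical
stand-ins, and the flag value is only illustrative.  It shows that the
guarded form never dereferences a NULL v_mount, because && stops
evaluating once its left-hand side is false.

```c
#include <assert.h>  /* assert */
#include <errno.h>   /* ENXIO */
#include <stddef.h>  /* NULL */

/* Hypothetical stand-ins for the kernel's struct mount / struct vnode. */
#define SKETCH_MNT_NODEV 0x10   /* illustrative flag bit (an assumption) */

struct sketch_mount {
        int mnt_flag;                   /* mount flags */
};

struct sketch_vnode {
        struct sketch_mount *v_mount;   /* may be NULL */
};

/*
 * Mirror of the -nodev check quoted above.  Returns ENXIO when the
 * vnode's filesystem is mounted nodev, 0 otherwise.  A NULL v_mount is
 * treated as "no flags set": the left operand of && is false, so
 * vp->v_mount->mnt_flag is never evaluated.
 */
int
sketch_nodev_check(const struct sketch_vnode *vp)
{
        assert(vp != NULL);     /* caller must pass a valid vnode */

        if (vp->v_mount && (vp->v_mount->mnt_flag & SKETCH_MNT_NODEV))
                return (ENXIO);
        return (0);
}
```

Any access to vp->v_mount->mnt_flag that is not guarded this way is a
NULL dereference when v_mount is NULL, which in the kernel shows up as
a page fault trap.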

-- 
]] Mike Smith, Software Engineer        msmith@gsoft.com.au             [[
]] Genesis Software                     genesis@gsoft.com.au            [[
]] High-speed data acquisition and      (GSM mobile)     0411-222-496   [[
]] realtime instrument control.         (ph)          +61-8-8267-3493   [[
]] Unix hardware collector.             "Where are your PEZ?" The Tick  [[


