From owner-freebsd-bugs  Wed Sep 20 05:20:11 1995
Return-Path: owner-bugs
Received: (from root@localhost)
          by freefall.freebsd.org (8.6.12/8.6.6) id FAA00849
          for bugs-outgoing; Wed, 20 Sep 1995 05:20:11 -0700
Received: (from gnats@localhost)
          by freefall.freebsd.org (8.6.12/8.6.6) id FAA00843
          ; Wed, 20 Sep 1995 05:20:06 -0700
Resent-Date: Wed, 20 Sep 1995 05:20:06 -0700
Resent-Message-Id: <199509201220.FAA00843@freefall.freebsd.org>
Resent-From: gnats (GNATS Management)
Resent-To: freebsd-bugs
Resent-Reply-To: FreeBSD-gnats@freefall.FreeBSD.org,
        kato@eclogite.eps.nagoya-u.ac.jp
Received: from mail.barrnet.net (mail.barrnet.net [131.119.246.7])
          by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id FAA00818
          for <FreeBSD-gnats-submit@freebsd.org>; Wed, 20 Sep 1995 05:17:46 -0700
Received: from marble.eps.nagoya-u.ac.jp (marble.eps.nagoya-u.ac.jp [133.6.57.68]) by mail.barrnet.net (8.6.10/MAIL-RELAY-LEN) with ESMTP id FAA12434 for <FreeBSD-gnats-submit@freebsd.org>; Wed, 20 Sep 1995 05:17:45 -0700
Received: (from kato@localhost) by marble.eps.nagoya-u.ac.jp (8.6.12+2.4W/3.3W9) id VAA00386; Wed, 20 Sep 1995 21:13:55 +0900
Message-Id: <199509201213.VAA00386@marble.eps.nagoya-u.ac.jp>
Date: Wed, 20 Sep 1995 21:13:55 +0900
From: kato@eclogite.eps.nagoya-u.ac.jp
Reply-To: kato@eclogite.eps.nagoya-u.ac.jp
To: FreeBSD-gnats-submit@freebsd.org
X-Send-Pr-Version: 3.2
Subject: kern/729: unexpected signal 4/10/11
Sender: owner-bugs@freebsd.org
Precedence: bulk


>Number:         729
>Category:       kern
>Synopsis:       unexpected signal 4/10/11
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Sep 20 05:20:00 PDT 1995
>Last-Modified:
>Originator:     KATO Takenori
>Organization:
Dept. Earth Planet. Sci. Nagoya Univ.
>Release:        FreeBSD 2.2-CURRENT i386
>Environment:
FreeBSD-current (after September 5) on i486DX4 box
>Description:
Programs catch signal 11 and terminate immediately after they start.  Once
a program has caught signal 11, that program can no longer be executed, so
I have to reboot my box.  The signal caught is usually signal 11, but
sometimes it is signal 10 and, in rare cases, signal 4.

In most cases, the virtual address where the signal occurred is in a shared
library (I checked this by running the programs under gdb).

This phenomenon has appeared since September 5.  Before then, the problem
occurred only rarely.  (Much of the vm-related code was changed between
September 3 and 5.)

>How-To-Repeat:
I don't know how to reproduce this problem on an arbitrary machine.  On my
box, it happens every day!

>Fix:
I think this problem is due to a vm bug, but I don't know a complete fix.
I have found three vm-related problems.

(1) Function splimp doesn't block disk I/O.  Even though 4.4BSD-derived
code assumes splhigh is higher than or equal to splbio + splnet, net_imask
doesn't include bio_imask (cf. isa.c).  This may cause access to kmem
without a lock if a disk I/O interrupt occurs.  In most of the code, the
splimp calls from 4.4BSD have been changed to splhigh (why 'splhigh', which
blocks ALL interrupts?), but some have not been changed yet.  The next
problem is one of them.

My quick hack is to add the following code just above spl0() in isa_configure:
	net_imask |= bio_imask;
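
The effect of this one-line change can be illustrated with a minimal
user-space sketch that models the interrupt masks as plain bit sets.  The
IRQ bit assignments and the helpers blocked_by_splimp() and
apply_quick_hack() are hypothetical and exist only for this illustration;
this is not the actual isa.c code.

```c
#include <assert.h>

/* Hypothetical IRQ bit assignments, for illustration only. */
#define IRQ_NET  (1u << 10)
#define IRQ_DISK (1u << 14)

static unsigned net_imask = IRQ_NET;   /* interrupts masked at splimp */
static unsigned bio_imask = IRQ_DISK;  /* interrupts masked at splbio */

/* Hypothetical helper: nonzero if the given IRQ is masked at splimp. */
static int blocked_by_splimp(unsigned irq)
{
	return (net_imask & irq) != 0;
}

/* The quick hack from the report: make splimp also mask disk I/O. */
static void apply_quick_hack(void)
{
	net_imask |= bio_imask;
	/* Every bit of bio_imask is now part of net_imask. */
	assert((net_imask & bio_imask) == bio_imask);
}
```

In this model a disk interrupt is not masked at splimp until
apply_quick_hack() has run; afterwards both the network and the disk
interrupt are masked.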


(2) In function mbinit (/sys/kern/uipc_mbuf.c), function m_clalloc is
called at splimp.  In m_clalloc, kmem_malloc is called.  The comment on
kmem_malloc in /sys/vm/vm_kern.c says that kmem_malloc should be called
at splhigh.  So splhigh and splx should be added before and after the
kmem_malloc call in m_clalloc.
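
A minimal user-space sketch of the bracketing proposed in (2).  All
functions with a _model suffix are hypothetical stand-ins, not the kernel
routines: priority levels are modeled as small integers and malloc()
stands in for the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Modeled priority levels (illustrative values only). */
enum { LEVEL_NONE = 0, LEVEL_IMP = 1, LEVEL_HIGH = 2 };
static int cur_level = LEVEL_NONE;
static int level_seen_by_kmem_malloc = -1;

/* Raise to the modeled splimp; returns the previous level. */
static int splimp_model(void)
{
	int s = cur_level;
	if (cur_level < LEVEL_IMP)
		cur_level = LEVEL_IMP;
	return s;
}

/* Raise to the modeled splhigh; returns the previous level. */
static int splhigh_model(void)
{
	int s = cur_level;
	cur_level = LEVEL_HIGH;
	return s;
}

/* Restore a previously saved level. */
static void splx_model(int s)
{
	cur_level = s;
}

/* Stand-in allocator that insists on being called at splhigh. */
static void *kmem_malloc_model(size_t size)
{
	assert(cur_level == LEVEL_HIGH);
	level_seen_by_kmem_malloc = cur_level;
	return malloc(size);
}

/* Proposed shape: bracket the allocation with splhigh/splx. */
static void *m_clalloc_model(size_t size)
{
	int s = splhigh_model();
	void *p = kmem_malloc_model(size);
	splx_model(s);
	return p;
}
```

A caller already running at the modeled splimp sees the level raised only
for the duration of the allocation; splx_model() then restores the
caller's level.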

(3) splhigh() is misplaced in function vm_map_function (/sys/vm/vm_map.c).
I think this splhigh was added to avoid a recursive lock_write call
(splhigh doesn't appear in vm_map_function in 4.4BSD).  There are two ways
to avoid the recursive lock.  One is to block interrupts, as FreeBSD does;
the other is to create a submap so that nothing competes for the map.  I
think FreeBSD chose the former.  In that case, splhigh should be placed
BEFORE vm_map_lock, because an interrupt may occur between vm_map_lock and
splhigh, and kmem_map is not locked.  (I heard that combining both
approaches makes splhigh unnecessary in NetBSD.)
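
The ordering argument in (3) can be illustrated with a small user-space
model.  It only tracks whether the map lock is ever held while interrupts
are still enabled; every name below is a hypothetical stand-in, and the
release order is an assumption of the sketch, not taken from vm_map.c.

```c
#include <assert.h>

static int intr_blocked;   /* 1 while at the modeled splhigh */
static int map_locked;     /* 1 while the modeled map lock is held */
static int unsafe_window;  /* set if the lock was held with interrupts on */

/* Record any moment where the lock is held but interrupts are enabled. */
static void check(void)
{
	if (map_locked && !intr_blocked)
		unsafe_window = 1;
}

static int splhigh_model(void)
{
	int s = intr_blocked;
	intr_blocked = 1;
	check();
	return s;
}

static void splx_model(int s)
{
	intr_blocked = s;
	check();
}

static void map_lock_model(void)
{
	assert(!map_locked);  /* a second acquisition would be the recursion */
	map_locked = 1;
	check();
}

static void map_unlock_model(void)
{
	map_locked = 0;
	check();
}

/* Order reported as problematic: take the lock, then block interrupts. */
static int lock_then_block(void)
{
	int s;

	unsafe_window = 0;
	map_lock_model();
	s = splhigh_model();
	/* ... map manipulation would happen here ... */
	map_unlock_model();
	splx_model(s);
	return unsafe_window;
}

/* Proposed order: block interrupts first, then take the lock. */
static int block_then_lock(void)
{
	int s;

	unsafe_window = 0;
	s = splhigh_model();
	map_lock_model();
	/* ... map manipulation would happen here ... */
	map_unlock_model();
	splx_model(s);
	return unsafe_window;
}
```

In this model lock_then_block() reports a window in which an interrupt
handler could try to take the same lock, while block_then_lock() does not.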

After applying the above three fixes, the time from reboot until the problem
appears becomes longer (but once the problem happens, I still have to reboot).
>Audit-Trail:
>Unformatted: