Date:      Wed, 20 Oct 2004 14:44:53 -0400 (EDT)
From:      Mike Tancsa <freebsd-dev@sentex.net>
To:        FreeBSD-gnats-submit@FreeBSD.org
Subject:   kern/72935: sio tty and uhid tty (perhaps others) stomp on each other leading to kernel data corruption and a panic
Message-ID:  <200410201844.i9KIirjF099052@granite.sentex.ca>
Resent-Message-ID: <200410201850.i9KIoMrl095763@freefall.freebsd.org>


>Number:         72935
>Category:       kern
>Synopsis:       sio tty and uhid tty (perhaps others) stomp on each other leading to kernel data corruption and a panic
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Oct 20 18:50:21 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Mike Tancsa
>Release:        RELENG_4
>Organization:
Sentex Communications
>Environment:
System: FreeBSD station.sentex.ca 4.10-STABLE FreeBSD 4.10-STABLE #19: Wed Oct 20 10:44:23 EDT 2004     root@station.sentex.ca:/usr/obj/usr/src/sys/gas  i386


	i386,RELENG_4
>Description:
	On 4.10-STABLE we have been seeing an intermittent panic under sustained
serial I/O combined with sustained USB/uhid device I/O. The panics point to
corruption of cfreelist in sys/kern/tty_subr.c.

The two panics we have seen are

         panic("clist reservation botch"); in sys/kern/tty_subr.c:103

And

         panic("free: multiple frees"); in sys/kern/kern_malloc.c:632

We believe the problem is that the tty_subr routines rely on spltty() to
serialize access to the clist free list.  The uhid device
(sys/dev/usb/uhid.c) is not of class TTY, but its interrupt handler
(uhid_intr) calls the b_to_q routine.  spltty() therefore does not block
uhid_intr, so we believe uhid_intr can run while some other tty routine is
partway through updating cfreelist.
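
To illustrate the suspected race, here is a minimal user-space sketch in C.
It is a model only, not the code in tty_subr.c: struct block, list_pop(),
pop_masked() and maybe_interrupt() are hypothetical names, and the
"interrupt" is simulated by a function call placed between the read of the
list head and the write of the new head.

```c
#include <assert.h>
#include <stddef.h>

#define NBLOCKS 3

/* Hypothetical stand-in for a cblock; not the real kernel structure. */
struct block {
    struct block *next;
};

static struct block pool[NBLOCKS];
static struct block *freelist;   /* models cfreelist */
static int intr_pending;         /* a simulated device interrupt is waiting */
static int intr_blocked;         /* models "raised to a level that masks it" */
static struct block *intr_got;   /* block obtained by the simulated handler */

static struct block *list_pop(void);

/* Rebuild the list pool[0] -> pool[1] -> pool[2] and arm one interrupt. */
static void list_reset(void)
{
    for (int i = 0; i < NBLOCKS; i++)
        pool[i].next = (i + 1 < NBLOCKS) ? &pool[i + 1] : NULL;
    freelist = &pool[0];
    intr_pending = 1;
    intr_blocked = 0;
    intr_got = NULL;
}

/* Deliver the simulated interrupt if one is pending and not masked. */
static void maybe_interrupt(void)
{
    if (intr_pending && !intr_blocked) {
        intr_pending = 0;
        intr_got = list_pop();   /* the handler also allocates a block */
    }
}

/* Pop the head of the free list with a preemption window in the middle. */
static struct block *list_pop(void)
{
    struct block *b = freelist;  /* step 1: read the head */
    if (b == NULL)
        return NULL;
    maybe_interrupt();           /* window: the handler may run here */
    freelist = b->next;          /* step 2: write the head from a stale read */
    b->next = NULL;
    return b;
}

/* Same pop, but with the interrupt masked for the whole update. */
static struct block *pop_masked(void)
{
    assert(!intr_blocked);       /* this model does not nest masking */
    intr_blocked = 1;
    struct block *b = list_pop();
    intr_blocked = 0;
    maybe_interrupt();           /* the deferred interrupt runs afterwards */
    return b;
}
```

In this model, calling list_pop() after list_reset() hands pool[0] to both
the caller and the simulated handler and leaves freelist NULL, which is
consistent with the symptoms reported here. Calling pop_masked() instead
gives the two callers different blocks and leaves the list intact.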

The cfreelist within tty_subr is getting corrupted (and/or set to NULL).
We have been able to reproduce the problem in a short period of time by
introducing a delay within cblock_alloc() and cblock_free().  We have also
been able to fix the problem (as a proof of concept only) by doing the
following in uhid_open:

     int s = splhigh();
     tty_imask |= bio_imask;
     splx( s );
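
The effect of OR-ing bio_imask into tty_imask can be sketched with a small
user-space model in C. This is an assumption-laden illustration, not the
kernel's implementation: the bit values IRQ_SIO and IRQ_USB are made up,
model_spltty() and model_splx() are hypothetical stand-ins for spltty() and
splx(), and the model assumes the uhid interrupt belongs to the bio class,
as the change above implies.

```c
#include <assert.h>

typedef unsigned int intrmask_t;

/* Hypothetical interrupt bits, chosen only for illustration. */
#define IRQ_SIO (1u << 4)
#define IRQ_USB (1u << 11)

static intrmask_t tty_imask = IRQ_SIO;  /* interrupts in the tty class */
static intrmask_t bio_imask = IRQ_USB;  /* interrupts in the bio class */
static intrmask_t cpl;                  /* currently blocked interrupts */

/* Model of spltty(): block everything in tty_imask, return the old level. */
static intrmask_t model_spltty(void)
{
    intrmask_t old = cpl;
    cpl |= tty_imask;
    return old;
}

/* Model of splx(): restore a previously saved level. */
static void model_splx(intrmask_t old)
{
    cpl = old;
}

/* An interrupt can be delivered only if its bit is not blocked. */
static int irq_can_fire(intrmask_t irq)
{
    return (cpl & irq) == 0;
}

/* Model of the proposed change: make the tty class include the bio class. */
static void merge_bio_into_tty(void)
{
    tty_imask |= bio_imask;
}

/* Returns 1 if the USB interrupt could preempt a section guarded by spltty. */
static int usb_can_preempt_tty_section(void)
{
    intrmask_t s = model_spltty();
    int can = irq_can_fire(IRQ_USB);
    model_splx(s);
    assert(cpl == s);            /* the saved level was restored */
    return can;
}
```

Before merge_bio_into_tty() is called, usb_can_preempt_tty_section() returns
1 in this model; afterwards it returns 0, while the USB interrupt can still
fire outside the guarded section.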



>How-To-Repeat:
	Generate heavy sio activity (preferably with a PUC card) and UHID activity at the same time.
On average, a panic occurs after about 3-5 days. See
http://lists.freebsd.org/pipermail/freebsd-stable/2004-October/008964.html
>Fix:

Possibly,

*** uhid.c.orig Wed Oct 20 14:16:05 2004
--- uhid.c      Wed Oct 20 14:16:56 2004
***************
*** 411,416 ****
--- 411,424 ----
        if (sc->sc_dying)
                return (ENXIO);

+ /* KDW - test change to force class tty to include uhid */
+       {
+               int s = splhigh();
+               tty_imask |= bio_imask;
+               splx( s );
+       }
+ /* end KDW */
+
        if (sc->sc_state & UHID_OPEN)
                return (EBUSY);
        sc->sc_state |= UHID_OPEN;






>Release-Note:
>Audit-Trail:
>Unformatted:


