From owner-freebsd-scsi@FreeBSD.ORG  Mon Oct  9 19:25:47 2006
Date: Mon, 9 Oct 2006 15:25:36 -0400 (EDT)
From: Charles Sprickman <spork@bway.net>
X-X-Sender: spork@gee5.nat.fasttrackmonkey.com
To: freebsd-scsi@freebsd.org
Message-ID: <Pine.OSX.4.61.0610091516450.416@gee5.nat.fasttrackmonkey.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Subject: mysterious panic w/adaptec RAID

Hi all,

I have a panic that follows one specific Postgres database from box to
box as I move it around.  So far it has happened on at least three
different boxes.  I am unable to get a kernel dump, since the panic
seems to lock up the whole disk subsystem and then freeze the box.

These are all 4.11 boxes, all using Adaptec 2110S or 2015S ZCR cards.  Two
are Supermicro boxes and one is an older Intel box.  All are dual-processor
boxes (one dual P-III, two dual Xeon).  I understand this is not much to go
on, but are there any known issues with the asr driver in 4.11?

In every case, the "current process" is dpteng.

This is all I get.  The panic message is caught by the console server and
saved in a log:

[-- MARK -- Fri Oct  6 03:00:00 2006]


Fatal trap 12: page fault while in kernel mode
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0x4
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc012c76e
stack pointer           = 0x10:0xe9b4dc44
frame pointer           = 0x10:0xe9b4dc44
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 29675 (dpteng)
interrupt mask          = cam  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks... asr0: Blink LED 0x43 resetting adapter
[-- MARK -- Fri Oct  6 04:00:00 2006]

Another:

[-- MARK -- Tue Aug  1 16:00:00 2006]


Fatal trap 12: page fault while in kernel mode
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0x4
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc012c78e
stack pointer           = 0x10:0xeb8b2c44
frame pointer           = 0x10:0xeb8b2c44
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 23840 (dpteng)
interrupt mask          = cam  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks... asr0: Blink LED 0x3 resetting adapter
(da0:asr0:0:0:0): lost device
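
Since there is no dump, about the only thing left to work with is the
instruction pointer in each trap message.  Below is a rough sketch of how
such an address could be mapped to the nearest preceding kernel symbol,
given a sorted symbol listing in the "address type name" form that
`nm -n` prints.  The symbol table in the sample is entirely made up for
illustration (the names example_func_a and so on are hypothetical, not
real kernel symbols), and the two panics may well come from different
kernel builds, so each address would need to be checked against the
kernel that was actually running at the time.

```python
import bisect


def parse_nm(text):
    """Parse `nm -n` style lines ("address type name") into a sorted list."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip undefined symbols, which carry no address
        addr, _kind, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def lookup(symbols, ip):
    """Return (name, offset) for the nearest symbol at or below ip, else None."""
    addrs = [addr for addr, _name in symbols]
    i = bisect.bisect_right(addrs, ip) - 1
    if i < 0:
        return None
    addr, name = symbols[i]
    return name, ip - addr


# HYPOTHETICAL symbol table, for illustration only; not from a real kernel.
SAMPLE_NM = """\
c012c700 T example_func_a
c012c760 T example_func_b
c012c7c0 T example_func_c
         U undefined_symbol
"""

if __name__ == "__main__":
    syms = parse_nm(SAMPLE_NM)
    # Instruction pointers taken from the two trap messages above.
    for ip in (0xC012C76E, 0xC012C78E):
        print(hex(ip), lookup(syms, ip))
```

In real use the sample text would be replaced with the output of
`nm -n` run against the kernel file that panicked (on 4.x that is
normally /kernel).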

Any ideas?

Thanks,

Charles