Date: Mon, 9 Oct 2006 15:25:36 -0400 (EDT)
From: Charles Sprickman
To: freebsd-scsi@freebsd.org
Subject: mysterious panic w/adaptec RAID

Hi all,

I have a problem that is moving from box to box as I move a specific
Postgres database around. I think so far it has happened on at least
three different boxes. I am unable to get a kernel dump since the panic
seems to lock up the whole disk subsystem and then freeze up the box.

These are all 4.11 boxes, all using Adaptec 2110s or 2015s ZCR cards.
2 are Supermicro boxes, 1 an older Intel box. All are dual processor
boxes (1 dual P-III, 2 dual Xeon).

I understand this is not much to go on, but are there any known issues
with the asr driver in 4.11? In all examples, the "current process" is
dpteng.

This is all I get, the panic message gets caught by the console server
and saved in a log:

[-- MARK -- Fri Oct 6 03:00:00 2006]

Fatal trap 12: page fault while in kernel mode
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0x4
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc012c76e
stack pointer           = 0x10:0xe9b4dc44
frame pointer           = 0x10:0xe9b4dc44
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 29675 (dpteng)
interrupt mask          = cam <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks... asr0: Blink LED 0x43 resetting adapter

[-- MARK -- Fri Oct 6 04:00:00 2006]

Another:

[-- MARK -- Tue Aug 1 16:00:00 2006]

Fatal trap 12: page fault while in kernel mode
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0x4
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc012c78e
stack pointer           = 0x10:0xeb8b2c44
frame pointer           = 0x10:0xeb8b2c44
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 23840 (dpteng)
interrupt mask          = cam <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks... asr0: Blink LED 0x3 resetting adapter
(da0:asr0:0:0:0): lost device

Any ideas?

Thanks,

Charles