From owner-freebsd-stable  Thu Apr 18 11:28:14 2002
Delivered-To: freebsd-stable@freebsd.org
Received: from mikea.ath.cx (ip68-97-1-197.ok.ok.cox.net [68.97.1.197])
	by hub.freebsd.org (Postfix) with ESMTP id 6096A37B4DB
	for <freebsd-stable@FreeBSD.ORG>; Thu, 18 Apr 2002 11:27:16 -0700 (PDT)
Received: (from mikea@localhost)
	by mikea.ath.cx (8.11.6/8.11.1) id g3IIQl204054
	for freebsd-stable@FreeBSD.ORG; Thu, 18 Apr 2002 13:26:47 -0500 (CDT)
	(envelope-from mikea)
Date: Thu, 18 Apr 2002 13:26:47 -0500
From: mikea <mikea@mikea.ath.cx>
To: freebsd-stable@FreeBSD.ORG
Subject: Re: Clarification of kernel log in messages
Message-ID: <20020418132647.D3021@mikea.ath.cx>
Mail-Followup-To: mikea <mikea@mikea.ath.cx>,
	freebsd-stable@FreeBSD.ORG
References: <20020418082343.E2170-100000@stalker.amigo.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20020418082343.E2170-100000@stalker.amigo.net>; from randys@amigo.net on Thu, Apr 18, 2002 at 08:27:15AM -0600
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-stable.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-stable>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-stable>
X-Loop: FreeBSD.ORG

On Thu, Apr 18, 2002 at 08:27:15AM -0600, Randy Smith wrote:
> Hi all,
> 
> I noticed this in my logs yesterday and I'm not sure what it means. Can
> one of you fine people explain this to me? Do I need to worry about it?
> 
> Thank you for your time.
> 
> $ uname -a
> FreeBSD pop1.amigo.net 4.4-RELEASE FreeBSD 4.4-RELEASE #3: Thu Sep 27
> 15:11:13 MDT 2001     root@pop1.amigo.net:/usr/obj/usr/src/sys/POP1  i386
> 
> ----
> 
> Apr 17 11:32:43 pop1 /kernel: (da1:ahc0:0:2:0): SCB 0x66 - timed out
> Apr 17 11:32:43 pop1 /kernel: ahc0: Dumping Card State while idle, at
> SEQADDR 0x7
> Apr 17 11:32:43 pop1 /kernel: ACCUM = 0xdf, SINDEX = 0x23, DINDEX = 0xe4,
> ARG_2 = 0x0
> Apr 17 11:32:43 pop1 /kernel: HCNT = 0x0
> Apr 17 11:32:43 pop1 /kernel: SCSISEQ = 0x12, SBLKCTL = 0x2
> Apr 17 11:32:43 pop1 /kernel: DFCNTRL = 0x0, DFSTATUS = 0x28
> Apr 17 11:32:43 pop1 /kernel: LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 =
> 0x80
> Apr 17 11:32:43 pop1 /kernel: SSTAT0 = 0x5, SSTAT1 = 0xa
> Apr 17 11:32:43 pop1 /kernel: STACK == 0x3, 0xf7, 0x150, 0x0
> Apr 17 11:32:43 pop1 /kernel: SCB count = 140
> Apr 17 11:32:43 pop1 /kernel: Kernel NEXTQSCB = 17
> Apr 17 11:32:43 pop1 /kernel: Card NEXTQSCB = 17
> Apr 17 11:32:43 pop1 /kernel: QINFIFO entries:
> Apr 17 11:32:43 pop1 /kernel: Waiting Queue entries:
> Apr 17 11:32:43 pop1 /kernel: Disconnected Queue entries: 19:67 11:86
> 6:107 24:48 4:41 10:79 0:46
> 26:13 17:108 8:92 9:95 25:121 14:123 31:7 7:83 23:25 12:70 21:100 1:3
> 30:45 3:60 15:61 28:53 29:30 20:2 27:104 2:15 5:77 16:18 22:139 18:102
> Apr 17 11:32:43 pop1 /kernel: QOUTFIFO entries:
> Apr 17 11:32:43 pop1 /kernel: Sequencer Free SCB List: 13
> Apr 17 11:32:43 pop1 /kernel: Pending list: 54, 67, 86, 107, 32, 38, 48,
> 37, 31, 81, 34, 56, 5, 6, 4, 101, 109, 49, 75, 57, 93, 41, 63, 79, 46, 13,
> 108, 92, 95, 121, 123, 7, 83, 25, 40, 19, 106, 36, 43, 97, 28, 73, 29, 51,
> 80, 58, 23, 66, 62, 70, 100, 3, 45, 60, 61, 53, 30, 2, 104, 15, 77, 18,
> 139, 102
> Apr 17 11:32:43 pop1 /kernel: Kernel Free SCB list: 35 9 78 44 27 84 26 14
> 21 10 16 99 72 59 103 122 74 88 42 87 1 98 24 120 89 12 76 82 71 96 105 55
> 68 52 91 69 50 65 20 94 0 39 85 90 119 118 117 116 115 114 113 112 111 110
> 129 128 127 126 125 124 8 33 11 22 47 64 138 137 136 135 134 133 132
> 131 130
> Apr 17 11:32:43 pop1 /kernel: sg[0] - Addr 0x18b5e000 : Length 1024
> Apr 17 11:32:43 pop1 /kernel: (da1:ahc0:0:2:0): Queuing a BDR SCB
> Apr 17 11:32:43 pop1 /kernel: (da1:ahc0:0:2:0): Bus Device Reset Message
> Sent
> Apr 17 11:32:43 pop1 /kernel: (da1:ahc0:0:2:0): no longer in timeout,
> status = 34b
> Apr 17 11:32:43 pop1 /kernel: ahc0: Bus Device Reset on A:2. 64 SCBs
> aborted

Looks like a disk I/O request took longer than the watchdog timer
interval, and so the I/O handler dumped all the information it
had about the SCSI card and its state. 

Sending a Bus Device Reset to a disk and aborting 64 SCBs that
were active or pending wastes a lot of work and requires all that
I/O to be redriven. 

I would feel a bit uncomfortable about it. Maybe even a _lot_
uncomfortable. It's a should-not-occur situation, and may be
your first inkling of an impending failure. That's me wearing
my 37-years-as-a-mainframe-system-programmer hat, but I think
the situation translates well from one arena to the other.
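
If you want to see whether this is a one-off or a trend, a rough
sketch like the following could tally the timeouts and resets per
device. It is only an illustration: the function name summarize and
the two regular expressions are my own, written against the two
message shapes in the quoted log (with the mail-wrapped lines
rejoined), and other driver versions may word these messages
differently.

```python
import re
from collections import Counter

# Patterns modeled on the two messages in the quoted log; assumed shapes,
# not an official description of the ahc driver's output.
TIMEOUT_RE = re.compile(
    r"\((da\d+):(ahc\d+):\d+:\d+:\d+\): SCB 0x[0-9a-fA-F]+ - timed out"
)
RESET_RE = re.compile(
    r"(ahc\d+): Bus Device Reset on (\w):(\d+)\. (\d+) SCBs aborted"
)


def summarize(lines):
    """Count SCB timeouts per disk and aborted SCBs per controller."""
    timeouts = Counter()  # disk name, e.g. "da1" -> number of timeouts
    aborted = Counter()   # controller, e.g. "ahc0" -> total SCBs aborted
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts[m.group(1)] += 1
            continue
        m = RESET_RE.search(line)
        if m:
            aborted[m.group(1)] += int(m.group(4))
    return timeouts, aborted


# Two lines from the quoted log, unwrapped to one line each.
SAMPLE = [
    "Apr 17 11:32:43 pop1 /kernel: (da1:ahc0:0:2:0): SCB 0x66 - timed out",
    "Apr 17 11:32:43 pop1 /kernel: ahc0: Bus Device Reset on A:2. "
    "64 SCBs aborted",
]

if __name__ == "__main__":
    timeouts, aborted = summarize(SAMPLE)
    print("timeouts per disk:", dict(timeouts))
    print("SCBs aborted per controller:", dict(aborted))
```

To run it against a real log, pass an open file instead of SAMPLE,
for example summarize(open("/var/log/messages")), assuming that is
where your syslog sends kernel messages. A count that keeps climbing
for the same disk would back up the worry above.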


-- 
Mike Andrews
mikea@mikea.ath.cx
Tired old sysadmin since 1964
