From owner-freebsd-arch Thu Nov 11 21:32:37 1999
Delivered-To: freebsd-arch@freebsd.org
Received: from ns1.yes.no (ns1.yes.no [195.204.136.10])
	by hub.freebsd.org (Postfix) with ESMTP id 3E7001541B
	for ; Thu, 11 Nov 1999 21:32:19 -0800 (PST)
	(envelope-from eivind@bitbox.follo.net)
Received: from bitbox.follo.net (bitbox.follo.net [195.204.143.218])
	by ns1.yes.no (8.9.3/8.9.3) with ESMTP id GAA10063
	for ; Fri, 12 Nov 1999 06:32:17 +0100 (CET)
Received: (from eivind@localhost)
	by bitbox.follo.net (8.8.8/8.8.6) id GAA16089
	for freebsd-arch@freebsd.org; Fri, 12 Nov 1999 06:32:17 +0100 (MET)
Received: from panzer.kdm.org (panzer.kdm.org [216.160.178.169])
	by hub.freebsd.org (Postfix) with ESMTP id 2A38915172
	for ; Thu, 11 Nov 1999 21:31:56 -0800 (PST)
	(envelope-from ken@panzer.kdm.org)
Received: (from ken@localhost)
	by panzer.kdm.org (8.9.3/8.9.1) id WAA32504;
	Thu, 11 Nov 1999 22:30:22 -0700 (MST)
	(envelope-from ken)
Message-Id: <199911120530.WAA32504@panzer.kdm.org>
Subject: Re: I/O Evaluation Questions (Long but interesting!)
In-Reply-To: <382BA304.EE2F0D66@simon-shapiro.org> from Simon Shapiro at "Nov 12, 1999 00:17:56 am"
To: shimon@simon-shapiro.org (Simon Shapiro)
Date: Thu, 11 Nov 1999 22:30:22 -0700 (MST)
Cc: rjesup@wgate.com (Randell Jesup), freebsd-arch@freebsd.org
From: "Kenneth D. Merry"
X-Mailer: ELM [version 2.4ME+ PL54 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Simon Shapiro wrote...
> "Kenneth D. Merry" wrote:
> > It could be that the combination of the DPT controller's 256MB cache and
> > fancy queueing, and your 1GB of RAM is causing the amazingly fast disk speeds.
>
> These DPTs seem to be optimal for RAID-5, very good at RAID-0
> and nothing exciting for single disks.  I have some FC-AL
> gear on order.
>
> What worries me is not the performance, but the corruption
> of the stack that I see.
>
> For example, I can run the same 400 processes against the
> raw device all day and all night without a hitch.
> Run them against a block device and something bizarre
> happens: a filesystem gets corrupted, the Adaptec driver
> times out, tsleep segfaults, something.  At times I can
> get the error in the driver, but then it makes no sense
> either.  There are tons of self-checks and state
> verifications in the code.  None trip, or when they do
> they are as illogical as the null pointer inside tsleep.

Well, since you've done a lot of work to try to isolate the problem in your
code, but haven't tracked it down, I'd suggest taking your code out of the
picture as a variable.

Create a CCD or Vinum array, using the same disks on Adaptec controllers.
Run the same tests, against the raw and block devices, and see if you get
the same sort of weird behavior.

If you do, you have solid proof that it's not your code, since your code
wasn't in the kernel.  If you don't, unfortunately, you don't have solid
proof either way.  (In that case, it could be some set of circumstances
that your driver tickles and that CCD or Vinum don't.)

One other thing to make sure of is that you're running a -stable with
Justin's Adaptec driver bug fix from September 20th.  It fixed some cases
where corruption could happen with Ultra 2 Adaptec controllers.

Ken
-- 
Kenneth Merry
ken@kdm.org