Date:      Thu, 13 Jul 1995 11:07:05 -0500 (CDT)
From:      Karl Denninger <karl@Mcs.Net>
To:        rgrimes@gndrsh.aac.dev.com (Rodney W. Grimes)
Cc:        karl@Mcs.Net, freebsd-hackers@FreeBSD.ORG
Subject:   Re: SCSI disk wedge
Message-ID:  <199507131607.LAA01870@Jupiter.mcs.net>
In-Reply-To: <199507130214.TAA19888@gndrsh.aac.dev.com> from "Rodney W. Grimes" at Jul 12, 95 07:14:53 pm

> > The drives on these machines are (1) less than two months old, (2) have
> > current firmware, and (3) don't have ANY problems with BSDI.
> 
> Slow down... (1) New drives are often prone to firmware bugs if by
> new you also mean new model.  (2) Good!!  But the ``new'' firmware
> could still have a bug in it.  (3) This is good, but it does not
> necessarily mean the bug is in FreeBSD.  We do things like
> very large I/O requests through the vm system; perhaps one of your
> drives does not like it when we drop a 64K I/O operation to it.

This is possible, but I believe that BSDI 2.x does support it.

> > Those 83-day uptimes are recorded on our production NFS servers which run a
> > much heavier disk load, with the same devices, on a different OS with no
> > problems.
> 
> Same _exact_ devices, or same _model_/_pn_/_revision_/_date_code_?

Same EXACT devices in one machine's case, in that the machine WAS running
BSDI 2.x and is now running FreeBSD.

> I know these things:
> a) You have a hang problem on a 2742 with no error message
> b) You have a hang problem on a 1742 with some error before it, but
>    I did not see any error in your mail.
> c) You are using Seagate and Micropolis (I think that is what you said)
>    disk drives, but I have no idea as to what models.
> d) You have BSDI running on similar hardware (maybe even the exact
>    hardware) with long uptimes.
> e) You crash once a day.
> f) You publicly posted that you get a 200% performance boost running
>    FreeBSD over BSDI, telling me we are probably pushing your hardware
>    quite a bit harder than BSDI did.
> 
> What I do not know:
> 
> a) Are you using active termination?

Yes.

> b) Do your SCSI cables meet the SCSI-II spec with respect to all
>    parameters (length, impedance, capacitance, etc.)?

Yes.  We use HP SCSI-II cables in most of our applications.  No cheap 
stuff here.  

> c) What exact model of disk drives you are using?

For the one which has the 274X adapter:

ahc1: target 0 synchronous at 10.0MB/s, offset = 0xf
ahc1: target 0 Tagged Queuing Device
(ahc1:0:0): "MICROP 3221-10MZ 1128K1 HT02" type 0 fixed SCSI 2
sd0(ahc1:0:0): Direct-Access 1955MB (4004219 512 byte sectors)
ahc1: target 1 synchronous at 10.0MB/s, offset = 0xf
ahc1: target 1 Tagged Queuing Device
(ahc1:1:0): "MICROP 4221-09MZ  Q4D HT02" type 0 fixed SCSI 2
sd1(ahc1:1:0): Direct-Access 1955MB (4004219 512 byte sectors)
ahc1: target 2 synchronous at 10.0MB/s, offset = 0xf
ahc1: target 2 Tagged Queuing Device
(ahc1:2:0): "MICROP 4221-09MZ  Q4D HT02" type 0 fixed SCSI 2
sd2(ahc1:2:0): Direct-Access 1955MB (4004219 512 byte sectors)
ahc1: target 3 synchronous at 10.0MB/s, offset = 0xf
ahc1: target 3 Tagged Queuing Device
(ahc1:3:0): "MICROP 4221-09MZ  Q4D HT02" type 0 fixed SCSI 2
sd3(ahc1:3:0): Direct-Access 1955MB (4004219 512 byte sectors)

For the two systems which have the 1742 adapters:

ahb0 waiting for scsi devices to settle
(ahb0:0:0): "SEAGATE ST31200N 8648" type 0 fixed SCSI 2
sd0(ahb0:0:0): Direct-Access 1006MB (2061108 512 byte sectors)
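
When comparing probe output like the above across machines, it can help to tabulate it mechanically. What follows is only a minimal sketch in Python, assuming the boot messages look exactly like the lines quoted here. The name `parse_probe` and both regular expressions are illustrative and are not part of any FreeBSD tool. It collects the inquiry string for each bus address and recomputes the capacity from the sector count (sectors times sector size, in units of 2^20 bytes, rounded down) to cross-check the reported figure.

```python
import re
from collections import Counter, defaultdict

# Both patterns are assumptions derived only from the probe lines quoted above.
INQUIRY = re.compile(r'^\((\w+:\d+:\d+)\): "([^"]+)"')
CAPACITY = re.compile(
    r'^(sd\d+)\((\w+:\d+:\d+)\): Direct-Access (\d+)MB '
    r'\((\d+) (\d+) byte sectors\)'
)

def parse_probe(text):
    """Map each bus address (e.g. 'ahc1:0:0') to what the probe reported."""
    devices = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        m = INQUIRY.match(line)
        if m:
            # Keep the inquiry string verbatim (vendor, model, firmware).
            devices[m.group(1)]["inquiry"] = m.group(2)
            continue
        m = CAPACITY.match(line)
        if m:
            disk, addr, mb, sectors, size = m.groups()
            devices[addr].update(
                disk=disk,
                reported_mb=int(mb),
                # Capacity in units of 2**20 bytes, rounded down.
                computed_mb=int(sectors) * int(size) // 2**20,
            )
    return dict(devices)

# A subset of the probe messages quoted above.
SAMPLE = """\
ahc1: target 0 synchronous at 10.0MB/s, offset = 0xf
ahc1: target 0 Tagged Queuing Device
(ahc1:0:0): "MICROP 3221-10MZ 1128K1 HT02" type 0 fixed SCSI 2
sd0(ahc1:0:0): Direct-Access 1955MB (4004219 512 byte sectors)
ahc1: target 1 synchronous at 10.0MB/s, offset = 0xf
ahc1: target 1 Tagged Queuing Device
(ahc1:1:0): "MICROP 4221-09MZ  Q4D HT02" type 0 fixed SCSI 2
sd1(ahc1:1:0): Direct-Access 1955MB (4004219 512 byte sectors)
ahb0 waiting for scsi devices to settle
(ahb0:0:0): "SEAGATE ST31200N 8648" type 0 fixed SCSI 2
sd0(ahb0:0:0): Direct-Access 1006MB (2061108 512 byte sectors)
"""

devices = parse_probe(SAMPLE)
for addr, info in sorted(devices.items()):
    print(addr, info["disk"], repr(info["inquiry"]),
          info["reported_mb"], info["computed_mb"])

# Count how many devices share each inquiry string.
print(Counter(info["inquiry"] for info in devices.values()))
```

Grouping on the full inquiry string rather than on the model number alone means two drives count as "the same" only if the vendor, model, and firmware fields all match.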

> d) What is the error message you get?

I'll get it the next time we get a crash; I have posted this one before.  It
is a timeout message.  We probably have it in the logbook, but I want to
make sure the message matches exactly.

> e) What motherboard you are running on, as much detail as possible.

ASUS Dual Pentia motherboard, single P90 processor.  This is the EISA/PCI
model of their product and has been EXTREMELY stable in other applications.
All systems in consideration have 64MB RAM.

> f) What exact model/revision aha174x and 274x are you using?

I'll have to get this one; both of these boards are *very* recent production
(less than 2 months old for most; the one machine which was converted has a
1742 that is about a year old, but is the same revision -- which indicates
that they haven't changed it).

> g) What other I/O cards are in the machine.

Two SMC Ethernet cards in each, one standard (512k) VGA ISA board.

> h) What is the system running as far as a work load, does any one specific
>    work load tend to bring the crash out?

Varied workloads; one is a news server (INN), the others run http and user
processes.  There is no pattern to the crashes related to time of day or
work in progress at the time.

> i) Are you willing to pay for production type support, or is this the
>    reason you switched from BSDI to FreeBSD and now expect to get that
>    level of support for free?  Contracted support is available from
>    several people if you expect that level of service.

Sure, provided we really get the fixes.  I am not averse to paying for
support that actually performs.  What I won't pay for is support that
doesn't get answers to us in a timely fashion.

> What I am willing to do:
> 
> a) As long as you keep answering the questions and filling in the
>    details I will continue to follow the thread so that we might
>    come to a final resolution of your problem.
> 
> b) Resurrect my DX2/66 EISA 1742 based system to run some testing,
>    duplicating your environment as much as I can, time permitting
>    (and I am one very busy person), to try and duplicate the bug here.

The DX2 may NOT have the problem due to it being significantly slower.

> c) Loan you my aha1742 that I know has worked for 2.5 years with
>    FreeBSD without a single hiccup.
> 
> d) Since you mentioned ``production'': If you are in a real hurry
>    to get it fixed, you can pay me at contracted rates and I will be 
>    at your site with my equipment within 2 days.  This is an expensive
>    option, but one that does exist.

--
Karl Denninger (karl@MCS.Net)| MCSNet - The Finest Internet Connectivity
Modem: [+1 312 248-0900]     | (shell, PPP, SLIP, leased) in Chicagoland
Voice: [+1 312 248-8649]     | 7 Chicagoland POPs, ISDN, 28.8, much more
Fax: [+1 312 248-9865]       | Email to "info@mcs.net" WWW: http://www.mcs.net
ISDN - Get it here TODAY!    | Home of Chicago's only FULL AP Clarinet feed!


