Date:      Tue, 3 Mar 1998 22:55:31 +0100 (MET)
From:      Wilko Bulte <wilko@yedi.iaf.nl>
To:        shimon@simon-shapiro.org
Cc:        grog@lemis.com, hackers@FreeBSD.ORG, blkirk@float.eli.net, jdn@acp.qiv.com, tlambert@primenet.com, sbabkin@dcn.att.com
Subject:   Re: SCSI Bus redundancy...
Message-ID:  <199803032155.WAA04054@yedi.iaf.nl>
In-Reply-To: <XFMail.980303121008.shimon@simon-shapiro.org> from Simon Shapiro at "Mar 3, 98 12:10:08 pm"

As Simon Shapiro wrote...
> 
> On 03-Mar-98 Greg Lehey wrote:
> ...
> 
> > Obviously there are a number of problems.  But in fact it's not as
> > difficult as it sounds.  There's a problem with RAID 5 anyway if
> > there's, say, a power failure during a write.  After bringing it back
> > up again, you can recognize that there's a parity error, but where?
> 
> This sounds like an uncommitted transaction.  Quite easy to arrange for a
> rollback at boot time.  You keep such gems in NVRAM, for example, or in a
> known place on the disks, or you implement a journal, or...

Not on disk; in NVRAM. Otherwise you would need synchronous writes at the
hardware level. Say you have issued the I/O to your 'rollback' disk. The
drive stashes it in its I/O queue (tagged queueing and all), so the data
is not yet on the platters when the power fails... Chances are slim, agreed.
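
To make the rollback idea concrete, here is a minimal sketch of an
on-disk intent log in Python. Everything in it is hypothetical and only
for illustration: the function names, the one-JSON-record-per-line
format, and the log path are assumptions, not anything from an existing
RAID implementation. Note that os.fsync() only asks the operating system
to flush its buffers; whether the drive has actually committed the data
from its own cache or queue depends on the platform and the drive, which
is exactly the window described above.

```python
import json
import os


def log_intent(log_path, stripe, data_hex):
    """Append one intent record and ask the OS to push it to the device.

    Hypothetical record format: one JSON object per line.
    """
    record = json.dumps({"stripe": stripe, "data": data_hex}) + "\n"
    fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, record.encode("ascii"))
        # Flushes OS buffers; the drive's own write cache may still hold
        # the data, so this alone does not close the power-fail window.
        os.fsync(fd)
    finally:
        os.close(fd)


def pending_intents(log_path):
    """Return the complete intent records found in the log.

    A boot-time recovery pass would replay or roll back these stripes.
    A torn final record (power lost mid-write) is ignored.
    """
    if not os.path.exists(log_path):
        return []
    records = []
    with open(log_path, "r", encoding="ascii") as log:
        for line in log:
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except ValueError:
                break  # incomplete record: stop at the tear
    return records
```

A write path built on this would call log_intent() before touching the
stripe and truncate the log once data and parity are both written; at
boot, pending_intents() tells the recovery code which stripes to
re-examine.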

<cut>

> > Does that make sense?  I'll try to formulate it more clearly if
> > anybody has difficulty with the concepts.
>  
> The only problem I have here, is the assumption that the O/S will do all
> that.  Not only does it consume much CPU, I/O bus, memory bandwidth, etc., but
> O/S crashes are the number one cause of failure in any modern computer. 
> Putting all this logic there is asking for it to crash frequently, and run
> under load all the time.  I think that the RAID logic should be outside the
> CPU/O/S proper, just like CRC checking is not done in the CPU anymore, and
> since SCSI and IDE, so is data separation, PLL detection loops, etc.  If
> your data is so important to you, spend a few dollars to get it done in a
> predictable and reliable manner.

Hear, hear. RAID parity is also done in hardware these days, mostly for
speed reasons. A second reason to go for a standalone RAID box is of course
the clustering/multi-host thingy. Backplane RAID is IMHO more for
low(er)-end solutions.
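
For readers who have not seen it spelled out, the parity in question is
a byte-wise XOR across the data blocks of a stripe. The Python sketch
below is only an illustration under simple assumptions (one stripe,
equal-sized blocks, hypothetical function names); it is not how any
particular controller does it. It shows both halves of the point made
earlier in the thread: a lost block can be rebuilt from the others, but
after an interrupted write the mismatch looks the same whether the data
or the parity is the stale one.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_ok(data_blocks, parity):
    """True if the stored parity matches the data blocks."""
    return xor_blocks(data_blocks) == parity


def rebuild(surviving_blocks, parity):
    """Reconstruct the single missing data block from survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


def syndrome(data_blocks, parity):
    """XOR of everything in the stripe; all zero bytes when consistent."""
    return xor_blocks(list(data_blocks) + [parity])


old_data = [b"AAAA", b"BBBB", b"CCCC"]
old_parity = xor_blocks(old_data)

new_data = [b"AAAA", b"XXXX", b"CCCC"]
new_parity = xor_blocks(new_data)

# Case A: new data reached the disk, the parity update did not.
case_a = syndrome(new_data, old_parity)
# Case B: new parity reached the disk, the data update did not.
case_b = syndrome(old_data, new_parity)
```

Both case_a and case_b come out non-zero and identical, so the stripe is
known to be inconsistent but the parity alone cannot say which block to
distrust. That is why some record of the write in flight (NVRAM, a
journal) is needed.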

> BTW, Since 1984 or so, I NEVER lost data due to disk failure.  I lost a LOT
> of data due to O/S failures, and some data due to bugs in RAID logic.
> 
> Although I do not believe Seagate's claim for 1,000,000 hours MTBF, I think
> the realized MTBF will far exceed any FreeBSD uptime.

This is probably true. Also keep in mind that the early production
units of a given drive model tend to have substantially lower MTBFs. It
seems that once the manufacturing plants get the 'feel' for producing a
specific model, the MTBF improves.
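
To put the 1,000,000 hour figure in perspective, here is a rough
calculation in Python. It assumes a constant failure rate (the usual
exponential model), which is a simplification: it ignores precisely the
early-production effect mentioned above. The function names are made up
for this example.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_failure_probability(mtbf_hours):
    """Chance that one drive fails within a year of continuous operation,
    assuming a constant failure rate of 1/MTBF."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures_per_year(mtbf_hours, drive_count):
    """Expected number of failures per year across a population of drives,
    under the same constant-rate assumption."""
    return drive_count * HOURS_PER_YEAR / mtbf_hours
```

Under that assumption, annual_failure_probability(1_000_000) is a little
under 0.9 percent per drive per year, and a population of 100 such drives
would expect somewhat less than one failure a year. So the claimed MTBF
says little about one drive lasting a century; it is a statement about
failure rates across many drives.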

> This discussion is very enlightening nonetheless.

Sure.
_     ______________________________________________________________________
 |   / o / /  _  Bulte email: wilko @ yedi.iaf.nl http://www.tcja.nl/~wilko
 |/|/ / / /( (_) Arnhem, The Netherlands - Do, or do not. There is no 'try'
---------------  Support your local daemons: run [Free,Net,Open]BSD Unix  --

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message


