Date:      Wed, 4 Mar 1998 22:54:04 +0100 (MET)
From:      Wilko Bulte <wilko@yedi.iaf.nl>
To:        shimon@simon-shapiro.org
Cc:        julian@whistle.com, hackers@FreeBSD.ORG
Subject:   Re: SCSI Bus redundancy...
Message-ID:  <199803042154.WAA04141@yedi.iaf.nl>
In-Reply-To: <XFMail.980304125832.shimon@simon-shapiro.org> from Simon Shapiro at "Mar 4, 98 12:58:32 pm"

As Simon Shapiro wrote...
> 
> On 04-Mar-98 Wilko Bulte wrote:
> 
> ...
> 
> >   Anxiously awaiting. I just missed an opportunity today to obtain a
> >   Mylex DAC960 3 channel RAIDcard. Bah.
> 
> Last I touched these, they were where DPT was 5 years prior, only buggier.
> I was at Intel at the time, working on a ``big'' benchmark and could get
> zilch support.  I fared a lot better calling, anonymously, into the DPT hotline,
> saying ``I have this 1991 vintage card a friend gave me, and it does...''
> 
> Part of a product is its producer and support.  Maybe Mylex is much better
> at it today.

I never talked to Mylex directly. And I'd only get one if I get it (nearly)
free, like a $10-15 pricepoint. No real use for it here, but maybe fun
to play with. Like the FDDI network here ;-)

> >> enough cache to hold it, it is pretty fast.  I can sustain about 2us per
> >> transaction overhead and about 120MB/Sec.  This gives us about a second
> >> or two.  The new DPT's can retain the cache until power returns.
> >> Even a small UPS (with power alarms) will last long enough.
> > 
> >   But how do you checkpoint things? So, where did the processor leave
> >   off?
> 
> The DPT gets transactions from the host.  It processes them in an
> autonomous manner.  If the entire transaction is OK, an ACK is sent to the
> host.  If not, not.  If Power-Fail is detected, the DPT simply halts until
> it sees a reset from the host.  Once the reset arrives, it checks the
> disks.  If they are all there, it can choose to flush the caches.
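The controller behaviour described above can be sketched as a tiny state
machine. This is only a toy model under stated assumptions, not DPT firmware:
the class, method names, and the in-memory "cache" and "disk" lists are all
hypothetical, and the choice to stay halted when a disk is missing after reset
is my assumption (the description only says what happens when all disks are
there).

```python
from enum import Enum, auto


class State(Enum):
    RUNNING = auto()
    HALTED = auto()


class Controller:
    """Toy model of a write-back caching controller (hypothetical names)."""

    def __init__(self, disks_present: bool = True) -> None:
        self.state = State.RUNNING
        self.cache: list = []   # writes that were ACKed but are not on disk yet
        self.disk: list = []    # writes that have reached the disks
        self.disks_present = disks_present

    def write(self, block) -> bool:
        """Return True (ACK) if the write was accepted, False (no ACK) otherwise."""
        if self.state is not State.RUNNING:
            return False
        self.cache.append(block)
        return True

    def power_fail(self) -> None:
        """Power-fail detected: halt and keep the cache contents as they are."""
        self.state = State.HALTED

    def reset(self) -> bool:
        """Host reset: check the disks, flush the retained cache if all are there.

        Assumption: if a disk is missing, keep the cache and stay halted.
        """
        if not self.disks_present:
            return False
        self.disk.extend(self.cache)
        self.cache.clear()
        self.state = State.RUNNING
        return True


if __name__ == "__main__":
    c = Controller()
    print(c.write("block-1"))   # accepted while running
    c.power_fail()
    print(c.write("block-2"))   # refused while halted
    print(c.reset(), c.disk)    # retained cache flushed after reset
```

The point of the sketch is that an ACK only ever means "in the cache", so the
retained cache plus the flush after reset is what makes the ACK trustworthy.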
> 
> On the host, once you detect a power-fail, you write all that you want to
> the DPT.  The DPT takes the WRITE requests and ACKs (it acts as a
> write-back cache, normal modus operandi).  The only fly in this cup is:
> What if there is more main memory than cache on the DPT (which is normally
> the case)?  What we do here, is a callback to an emergency shutdown routine
> that calls sync() in the kernel, and then calls boot().  It assumes the UPS
> can sustain the system this long, but that is very doable.  1GB worth of
> buffers will take (at 6 MB/sec - slow RAID-5) just under three minutes to
> flush.  Most systems are much faster than that.

 Agreed. Sounds ok.
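
 As a quick sanity check on the flush time quoted above, here is a minimal
 sketch of the arithmetic. The function name is made up for illustration, and
 I am assuming 1 GB means 1024 MB and that the 6 MB/sec rate is sustained for
 the whole flush.

```python
def flush_seconds(buffer_mb: float, write_mb_per_sec: float) -> float:
    """Seconds needed to write out buffer_mb of dirty buffers at a steady rate."""
    if write_mb_per_sec <= 0:
        raise ValueError("write rate must be positive")
    return buffer_mb / write_mb_per_sec


if __name__ == "__main__":
    # 1 GB taken as 1024 MB; 6 MB/sec is the slow RAID-5 figure from the quote.
    t = flush_seconds(1024, 6)
    print(f"{t:.0f} s = {t / 60:.1f} min")   # prints "171 s = 2.8 min"
```

 So the UPS has to carry the host for roughly three minutes in the slow case,
 plus whatever margin you want for the shutdown itself.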

> So, the answer is:  there is exactly one checkpoint, and it is a one-shot.
> Once we detect power failure, we assume we have reserve power to flush
> everything and shut down.
> 
> This does not protect you from disk bay power failures, but these are
> almost always on N+1 power systems and hooked up to separate UPSs.

 An extra power supply is money well spent. We ship all our standalone
 arrays with at least N+1, optionally 2N power. 2N gives you 2 separate
 power entry points to the power grid. Now we only need to educate people
 to use two different power branches (phases? what's the right English term?)

> To have the kernel actually checkpoint itself, with any better resolution,
> or intelligence will have to change too many things.  I am trying to make

 OK, that was my original question. Had a bad feeling about exactly what you
 mention here.

_     ______________________________________________________________________
 |   / o / /  _  Bulte email: wilko @ yedi.iaf.nl http://www.tcja.nl/~wilko
 |/|/ / / /( (_) Arnhem, The Netherlands - Do, or do not. There is no 'try'
---------------  Support your local daemons: run [Free,Net,Open]BSD Unix  --

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message


