Date:      Wed, 25 Jul 2001 21:16:59 +0200
From:      Gabriel Ambuehl <gabriel_ambuehl@buz.ch>
To:        Paul Robinson <paul@akita.co.uk>
Cc:        freebsd-isp@freebsd.org, freebsd-cluster@freebsd.org
Subject:   Re[2]: Redundant setup on a budget??
Message-ID:  <6335845483.20010725211659@buz.ch>
In-Reply-To: <20010725151745.A36223@jake.akitanet.co.uk>
References:  <510EAC2065C0D311929200A0247252622F7A7B@NETIVITY-FS> <20010724154211.C34017@jake.akitanet.co.uk> <1241681557.20010725114735@buz.ch> <20010725112250.N83511@jake.akitanet.co.uk> <1996903256.20010725131437@buz.ch> <20010725124353.A6548@jake.akitanet.co.uk> <2411019395.20010725142313@buz.ch> <20010725151745.A36223@jake.akitanet.co.uk>

-----BEGIN PGP SIGNED MESSAGE-----

Hello Paul,

Wednesday, July 25, 2001, 4:17:45 PM, you wrote:

> On Jul 25, Gabriel Ambuehl <gabriel_ambuehl@buz.ch> wrote:
>> Actually, it's more solarisy ;-). AFAIK, it works  on *BSD,
>> Solaris and HPUX and Linux 2.0.x. But the state stuff is really
>> interesting.  

> Solaris is even worse.

Slowlaris, as I tend to call it.

> Technically ipfw appears to have stateful extensions,
> but it does them through "dynamic rules" and so people don't think
> it's stateful.

Didn't work for my LAN firewall without some awfully stupid ruleset.
ipf does just great.

>> ACK. But as said, I don't care for them on FS level. FS isn't
>> meant to be used as a DB, for that we got DBMS installed.
> And what if you want multiple MySQL/Postgres servers, writing to
> the same DB?

As said: use one master with write perms and as many slaves as you
like which are responsible for the selects.
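
A minimal sketch of that split, assuming the application picks the
server per statement. The class name, the server labels and the crude
"starts with SELECT" test are made up for illustration and are not
part of MySQL or any driver. Slaves can lag behind the master, so a
read that must see its own write would still have to go to the master.

```python
import itertools


class ReplicatedPool:
    """Route writes to one master, spread SELECTs over read-only slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # With no slaves configured, reads simply go to the master too.
        self._readers = itertools.cycle(slaves or [master])

    def pick(self, sql):
        # Crude classification (assumption): only plain SELECTs are reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.master


pool = ReplicatedPool("master", ["slave1", "slave2"])
print(pool.pick("SELECT * FROM users"))          # goes to a slave
print(pool.pick("UPDATE users SET name = 'x'"))  # goes to the master
```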

> And you think NFS is flaky? That sounds pretty dodgy to me as
> you're describing it, but it's your neck.

This is just some sort of realtime backup. Good for sleeping well,
you know.

>> For DBMS, the only solution I can think of is faster
>> hardware. Shared DBMS is a big mess.
> Over lunch, this came up in conversation, and the only real problem
> anybody could see was detecting dead-locks. Apart from that, it
> shouldn't be too difficult.

Not too difficult? I mean I don't know a hell of a lot about DBMS
implementation, but if I were to write one, I'd surely cache immense
amounts of data INSIDE my app, which I surely wouldn't rebuild from HD
after a statement that could affect it. Instead, I'd run the statement
against the cache and write the changes made to the cache back to the
HD. Now how does your cluster know when it should rebuild its caches?
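
To make the cache problem concrete, here is a toy sketch, assuming two
DBMS processes that share one disk and each keep a private in-memory
cache. The Node class and the dict standing in for the shared
filesystem are invented for the example; no real DBMS is this simple.

```python
class Node:
    """A DBMS process that caches rows and writes back to shared storage."""

    def __init__(self, disk):
        self.disk = disk   # shared storage (a dict stands in for the FS)
        self.cache = {}    # private to this node

    def read(self, key):
        # Only hit the disk on a cache miss.
        if key not in self.cache:
            self.cache[key] = self.disk[key]
        return self.cache[key]

    def write(self, key, value):
        # Update our cache and write back; nobody tells the other node.
        self.cache[key] = value
        self.disk[key] = value


disk = {"balance": 100}
a, b = Node(disk), Node(disk)
b.read("balance")         # b now caches 100
a.write("balance", 50)    # a changes its cache and the disk
print(b.read("balance"))  # prints 100: b answers from its stale cache
```

File locking on the shared disk does not help here, since b never
touches the disk again. Something has to invalidate b's cache.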

> All MTAs are NFS safe these days providing you have locking
> implemented. On FBSD, that won't be possible unless you're tracking
> -current. Otherwise, I bet I can break a qmail box in a cluster.


You could try to hack OpenSSH, ACK.

> Anyway, every admin with any sense is running Exim. </flambait>

I always thought that was some LAN-quality software, not up to
handling thousands of mails per day ;-). The only alternative to qmail
I can think of is postfix.


>> I'd rather want FreeBSD to support TCP/IP over firewire ;-)
> Ummmmm.... yeah. that sounds *great*. :-)

Cheap and cheerful 400 Mbit/s. Anyone?

> Which is why I want to try and fix it. Like I say, I'm doing this
> because I want to. If it's possible to get working, I personally
> believe that would be a useful contribution to the community.

Oh if it were to work like you want, I'd probably use it. It's only
that I don't think it would. Or if it would, it would be SLOW.

> We come back to where we started. Replication is not a safe way to
> deal with atomic transactions, and therefore is useless in anything
> that is important to your business. MySQL *should*not* be looking
> after this. Lower-layer transports and architectures should be
> supporting it. Or at least, that's the way I'm progressing.


IMHO not. MySQL knows better than the FS what its operations will do
to the data. The FS is just the container for the MySQL data.
Remember, many older DBMS used raw partitions!

>> Badly written daemon?
> Because you need to lock spools against multiple poppers being
> invoked on multiple machines at the same time.

WTF does pop3d need to do with my spool, anyway? I only allow SMTP
messing with it.

> people (as it should be). At the moment, daemons have to implement
> it in their own code. My argument is why NFS can't help it in a
> clustered setup.  

My opinion about NFS is that it's best to stay away from it anyway.
Security is way too weak. This might get better with NFSv4 but I
currently don't see anything like this for FBSD.

> come from. On the majority of systems, you'll be able to deliver
> maybe 5,000 mails a day without ever needing locking. But the odd
> mail will occasionally go missing. Been there, didn't buy the
> t-shirt because it was a horrid place to be. :-)

qmail does NOT lose ANY mail. Never (unless the sender is some really
braindead MTA that is too stupid to resend a message for which it
didn't get an OK, or your fsync() is broken).

>> stable enough for production. And I've got my doubts whether I
>> would want to rely on a multiple master setup with MySQL 3.23.
> Well, the tech I'm talking about is at a level where MySQL is the
> most likely to break it, but because I'm talking about FS level, it
> should benefit mail clusters and suchlike as well.

As said, I don't see any need for locking with regard to mail
clusters. About the only place it is really required is flat-file
based CGI scripts, but I don't think it's my job to take care of this
kind of badly written code. Need a DB? Use the DBMS.

> I thought you said qpopper was a badly written daemon?

Considering securityfocus shows 7 vulnerabilities, I'd say so, yes.

> How do you think qmail does it if isn't using big, fat, expensive
> lock files?  


"Why should I use maildir?
Two words: no locks. An MUA can read and delete messages while new
mail is being delivered: each message is stored in a separate file
with a unique name, so it isn't affected by operations on other
messages. An MUA doesn't have to worry about partially delivered
mail: each message is safely written to disk in the tmp subdirectory
before it is moved to new. The maildir format is reliable even over
NFS." - http://cr.yp.to/proto/maildir.html

Nice, uh?
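
A rough sketch of the delivery step that quote describes, assuming a
POSIX filesystem. The deliver() function and its file naming are
invented for illustration and are much simpler than what qmail really
does; in particular it uses a plain rename for the final move. The
point is only that the message is complete on disk in tmp before it
ever shows up in new.

```python
import itertools
import os
import socket
import time

_seq = itertools.count()  # keeps names unique within this process


def deliver(maildir, message: bytes) -> str:
    """Deliver one message: write it to tmp/, then move it into new/."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)

    # Simplified unique name (assumption): time, pid, counter, hostname.
    name = "%d.%d_%d.%s" % (
        time.time(), os.getpid(), next(_seq), socket.gethostname())
    tmp_path = os.path.join(maildir, "tmp", name)
    new_path = os.path.join(maildir, "new", name)

    # O_EXCL: fail instead of overwriting if the name already exists.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(message)
        f.flush()
        os.fsync(f.fileno())  # make sure the data is really on disk

    # Readers only look in new/, so they never see a partial message.
    os.rename(tmp_path, new_path)
    return new_path
```

An MUA reading new/ needs no lock file: a message is either fully
there or not there at all.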

> of, but the point is that with NFS level locking your daemon
> (whether it's an MTA or a UberWidget 2001 MultiVibrationAlert or
> whatever), doesn't need to worry about it.

Instead, I got to worry about the locking performance. Great.

>> Oh I see, you follow our business model (lots of cheap servers are
>> much better for your reliability than one expensive one). I just
>> feel 
> Better for the wallet as well. :-)

For damn sure. And it leaves many more places to do creative work.

> THAT'S MY POINT! THAT is EXACTLY what I want to work on. I want to
> try and work out what it would take (patching software, whatever,
> if need be) to make this a relatively trivial exercise. I know
> we're on -isp here, but this is exactly the sort of conversation we
> should be having on -cluster and trying to make it happen.


I think the DBMS should take care of this. Best approach, IMHO, would
be to have global row/table/whatever locking for the DBMS and then a
reliable replication protocol.

Still, I can think of situations where not even a single host DBMS
can guarantee the consistency of your data.

> I think the current transaction support in the beta relies on
> Berkeley DB transaction support.

Uhm yes. There's some other DB format which allows it as well but
it's still far from being ACID, AFAIK.



Best regards,
 Gabriel

-----BEGIN PGP SIGNATURE-----
Version: PGP 6.5i

iQEVAwUBO18NIMZa2WpymlDxAQEQIAgAiJUQciA8VlOqoq7KzvNR1XRWj0VDjsTF
u13yDsHhEDvlSyLjMUXIvrDPMIgCe9n40MJNVKNTSLs5j9Xx+dJ74pcAnW42+6OC
Av8WWlTJ7n5nHhoar4a37kVu2nNyiBHLRct6RavCk9gGeTrfgfZKhSfn7r6IHU8f
XhvMozCfxIpYsPMqp9CxkvUtUQfM1RK72WTaly8WLczv80typL2FbRIbgExZx4cs
5AAgBUdhhJ3ZjiGR/QyYrMg82UNZ1Aal+C3OCen4vdKezI0c8igKQ3lfO5dhcDoe
HT4G5WtdeYe7Vpz/J3KT1jPYaLXZj7gRI1u8aeZUJdBJ97Dcs9vMTg==
=Pp5/
-----END PGP SIGNATURE-----


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-isp" in the body of the message
