Date:      Fri, 14 Feb 2003 01:40:53 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Brad Knowles <brad.knowles@skynet.be>
Cc:        Rahul Siddharthan <rsidd@online.fr>, freebsd-chat@freebsd.org
Subject:   Re: Email push and pull (was Re: matthew dillon)
Message-ID:  <3E4CB9A5.645EC9C@mindspring.com>
References:  <20030211032932.GA1253@papagena.rockefeller.edu> <a05200f2bba6e8fc03a0f@[10.0.1.2]> <3E498175.295FC389@mindspring.com> <a05200f37ba6f50bfc705@[10.0.1.2]> <3E49C2BC.F164F19A@mindspring.com> <a05200f43ba6fe1a9f4d8@[10.0.1.2]> <3E4A81A3.A8626F3D@mindspring.com> <a05200f4cba70710ad3f1@[10.0.1.2]> <3E4B11BA.A060AEFD@mindspring.com> <a05200f5bba7128081b43@[10.0.1.2]> <3E4BC32A.713AB0C4@mindspring.com> <a05200f07ba71ee8ee0b6@[10.0.1.2]>

Brad Knowles wrote:
> At 8:09 AM -0800 2003/02/13, Terry Lambert wrote:
> >  OK, then why do you keep talking about I/O throughput?  Do you
> >  mean *network I/O*?  Why the hell would you care about disk I/O
> >  on a properly designed message store, when the bottleneck is
> >  going to first be network I/O, followed closely by bus bandwidth?
> 
>         Disk I/O is many orders of magnitude slower than any other thing
> on the system.

If you can saturate 50% of a PCI bus, and the other half of that
goes to networking, you are set, as far as disk I/O speed.  If you
are using an NFS server (which you are), then it's based on your
ability to saturate your network device.


> Moreover, disk I/O suffers from issues with synchronous meta-data
> updates where entire directories must be locked for the entire
> period of time during which an update is occurring, thus reducing
> by many more orders of magnitude the number of small operations
> (e.g., file creation and deletion, renaming, updating of other file
> attributes, etc...) that we can perform in a given unit of time.

Disagree.  These locking issues are an artifact of the system
design (FS, application, or both).


>         This is an issue for MTAs, and is an issue for message stores,
> especially when the message stores use a meta-data intensive storage
> mechanism such as found in Maildir and Cyrus (to a lesser degree).

Simple answer: Don't use a metadata intensive storage mechanism.


> >  So what's the difference between not enforcing a quota, and ending
> >  up with the email sitting on your disks in a user maildrop, or
> >  enforcing a quota, and ending up with the email sitting on your
> >  disks in an MTA queue?

[ ... bogus business model ... ]

> In the case
> where you do have this issue, at the very least you can hold the
> message in the queue for a while, in the hope that the user will come
> clean out their mailbox.

In other words, the message takes up your disk space, no matter
what.


> >  Quotas are actually a strong argument for single image storage.
> 
>         SIS increases SPOFs, reduces reliability, increases complexity,
> increases the probability of hot-spots and other forms of contention,
> and all for very little possible benefit.

The only one of these I agree with is that it increases complexity.


> >  Obviously, unless setting the quota low on purpose is your revenue
> >  model (HotMail, Yahoo Mail).
> 
>         As I said above, "free" systems frequently set quotas
> ridiculously low.  They are not of interest for this discussion.

This discussion *started* because there was a set of list floods,
and someone made a stupid remark about an important researcher
indicating he was cancelling his subscription to the -hackers
mailing list over it, and I pointed out to the person belittling
the important researcher that such flooding has consequences that
depend on the mail transport technology over and above "just having
to delete a bunch of identical email".



> >  How?  It's going to sit on your disks, no matter what, the only
> >  choice you really have on it is *which* disk it's going to sit on.
> 
>         True, but it's easier for me to deal with multiple gigabytes of
> DOS crap in the mail queue than it is for the user to try to deal
> with multiple gigabytes of crap in their mailbox.  There are things
> that they need to be protected from, because they don't have the
> access or the power on their end.  If they did, they wouldn't need us.

They need the middlemen because there are antidisintermediation
strategies in use on most leaf node connections to the Internet,
not because the middlemen have some inherent value that can be
obtained no other way.  8-|.

As far as "dealing with DOS", in for a penny, in for a pound: if
you are willing to burn CPU cycles, then implement Sieve or some
other technology to permit server-side filtering.

In reality, we both know that at some point it becomes too
computationally expensive to deal with this sort of thing on
the ISP side of things, and that there's an impedance mismatch
in the transport mechanism vs. the bandwidth reduction point.
That's exactly the niche that "value added email services"
attempt to exploit (fee for compute resources on the fat side
of the pipe).

We also know that, for most DOS cases on maildrops, the user
simply loses, and that's that.


> >>          If 95-99% of all users never even notice that there is a quota,
> >>  then I've solved the part of the problem that is feasible to solve.
> >>  The remainder cannot possibly be solved with any quota at any level,
> >>  and these users need to be dealt with separately.
> >
> >  Again, how?
> 
>         Outside of the DOS problem, they need education and proper
> management of their expectations.  TANSTAAFL.

Let's quit talking about the free services.  Outside of funneling
idiots into the Microsoft Passport or competing Yahoo "single
signon" mechanisms, the free mail services are loss-leaders.  The
business model is simply unsustainable.  Neither service will
*guarantee*, at *any* level of payment, that a message is committed
to stable storage before the "250 OK" response is sent, nor take
ultimate responsibility that the message will not be
lost/dropped/dumped/hacked prior to final delivery.

So let's limit ourselves to the realm of LWCYM - "Lunches Which
Cost You Money".


> >  Flood fill will only work as part of an individual infrastructure,
> >  not as part of a shared infrastructure, if what you are trying to
> >  sell is to be any different from what everyone else is giving away
> >  for free.
> 
>         Ahh, something akin to the Yasushi model.  See
> <http://www.shub-internet.org/brad/papers/dihses/lisa2000/sld038.htm>.
> 
>         When restricted to the network internal to the mail system,
> replicating the mailbox over multiple servers is not a bad idea,
> although I don't think it matters so much what replication model you
> use.

The replication model is actually a pretty profound issue.  Prior
to replication, if you connect to one of the replicas, the message
can be seen as "in transit".  Likewise, after a deletion on an
original but prior to replication, the deletion can be seen as "in
transit".  The worst case failure modes are that a message has
increased apparent delivery latency, or the message "comes back"
after it's deleted.
Both of these are acceptable, in terms of failure modes, particularly
if you compare them to the alternatives.
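The two failure modes described above can be shown with a toy model. This is a minimal sketch, assuming naive copy-what-is-missing replication with no deletion markers; the `Replica` class and the helper functions are hypothetical names for illustration, not any real mail server's design.

```python
# Toy model of lazy (asynchronous) mailbox replication between two servers.
# All names are hypothetical; no real message store works exactly like this.

class Replica:
    def __init__(self, name):
        self.name = name
        self.messages = {}  # message id -> body

def deliver(replica, msg_id, body):
    replica.messages[msg_id] = body

def delete(replica, msg_id):
    replica.messages.pop(msg_id, None)

def replicate(src, dst):
    """Naive one-way sync: copy whatever dst is missing (no tombstones)."""
    for msg_id, body in src.messages.items():
        dst.messages.setdefault(msg_id, body)

a, b = Replica("a"), Replica("b")

# Failure mode 1: the message was delivered to a but has not replicated yet.
# A client reading b just does not see it, as if it were still in transit.
deliver(a, "m1", "hello")
seen_on_b_before = "m1" in b.messages
replicate(a, b)
seen_on_b_after = "m1" in b.messages

# Failure mode 2: the user deletes m1 on a while b still holds a copy.
# The next sync from b to a makes the message "come back".
delete(a, "m1")
replicate(b, a)
came_back = "m1" in a.messages

print(seen_on_b_before, seen_on_b_after, came_back)
```

In neither case is a message lost: the worst outcomes are a delayed appearance or a reappearance, which matches the argument that these failure modes are tolerable.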


> >>          If you store them on the recipient system, you have what exists
> >>  today for e-mail.  Of the three, this is the only one that has proved
> >>  sustainable (so far) and sufficiently reliable.
> >
> >  This argument is flawed.  Messages are not stored on recipient
> >  systems, they are stored on the systems of the ISP that the
> >  recipient subscribes to.
> 
>         That's what I was calling the "recipient system".  It is the
> system where the message was received.

This is not useful to talk about in terms of a POP3 maildrop.
To all intents and purposes, messages in a POP3 maildrop are
"in transit on a point to point mail transport".  That's really
the whole point of acknowledging a "pull" technology exists, in
the first place.


> >  Yet those same guarantees are specifically disclaimed by HotMail
> >  and other "free" providers, even though there is no technological
> >  difference between a POP3 maildrop hosted at EarthLink and accessed
> >  via a mail client, and a POP3/IMAP4 maildrop hosted at HotMail and
> >  accessed via a mail client.
> 
>         Again, you're referencing situations that I consider to be
> irrelevant to the discussion.  I don't give a flying flip about the
> poor business model they employ.  I care about real systems that are
> paid for by real people and real companies.

Good, then we are in agreement that we will not reference things
like quotas and so on, which are artifacts of their business model,
and not things which actually save anyone total disk space.  8-).


[ ... ]

> >  I think I see the misunderstanding here.  You think IDE disks are
> >  server parts.  8-).
> 
>         No, not at all.  I think that focusing on disk storage capacity
> and not paying attention to disk I/O latency and I/O capacity is pure
> folly.

The majority of that latency is an artifact of the FS technology,
not an artifact of the disk technology, except as it impacts the
ability of the FS technology to be implemented without stall
barriers (e.g. IDE write data transfers not permitting disconnect
ruin your whole day).


> >  It gets rid of the quota problem.
> 
>         No, not at all.  You eliminate damn few duplicate messages, you
> greatly increase system complexity, you increase SPOFs, you increase
> system hot-spots, you reduce system reliability (and replication,
> something which you seem to be so fond of), and all for very, very
> little benefit.

Unless I can use someone else's stored copy of the message to
recover my corrupted stored copy of the message, that's not
replication, it's duplication.

The reason I brought up SIS again is that you seemed more than
willing to let a message sit in the main mail queue, but almost
panicked at the idea of throwing it into the user mailbox instead.

The only legitimate reason for such a panic is if you felt that
moving it into the user's mailbox would result in amplification
of the disk space being used.  Otherwise, you've already accepted
responsibility for delivery of the message, and deleting it out
of the mail queue is not really an option.
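To illustrate why single image storage avoids that amplification, here is a minimal sketch of a reference-counted store keyed by content hash. The class name, its methods, and the choice of SHA-256 are assumptions made for the example; this is not how Exchange, Notes, or any other product is known to implement it.

```python
import hashlib

class SingleInstanceStore:
    """Hypothetical single-instance store: one body, many mailbox references."""

    def __init__(self):
        self.bodies = {}     # digest -> message bytes (stored once)
        self.refcount = {}   # digest -> number of mailbox references
        self.mailboxes = {}  # user -> list of digests

    def deliver(self, recipients, body):
        digest = hashlib.sha256(body).hexdigest()
        self.bodies.setdefault(digest, body)
        for user in recipients:
            self.mailboxes.setdefault(user, []).append(digest)
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return digest

    def delete(self, user, digest):
        self.mailboxes[user].remove(digest)
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            # Last reference gone: only now is the disk space reclaimed.
            del self.refcount[digest]
            del self.bodies[digest]

    def bytes_stored(self):
        return sum(len(body) for body in self.bodies.values())

store = SingleInstanceStore()
flood = b"x" * 1000
d = store.deliver(["alice", "bob", "carol"], flood)
size_after_delivery = store.bytes_stored()   # one copy, three references

store.delete("alice", d)
store.delete("bob", d)
size_after_two_deletes = store.bytes_stored()

store.delete("carol", d)
size_after_all_deletes = store.bytes_stored()
```

Moving a queued message into three mailboxes costs the same space as holding it once in the queue, which is the point being argued. The sketch ignores the reliability and contention objections raised in the thread.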


>         Try taking a real-world mail server and processing the logs.
> Count the number of recipients per message and see just how much
> space you'd actually save.  I did that, and included my numbers in
> the previous message -- an average of ~1.3 recipients per message.
> 
>         You want to do all this for about 30% savings?!?

Nope; I want to do it to get you to agree to turn off quotas,
if your business model is not based on the idea that it's OK
to drop email into /dev/null for customers who don't pay you
more money.
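For reference, the arithmetic behind the ~1.3 recipients-per-message figure can be worked through as follows. This is a rough sketch, assuming every message is the same size and ignoring per-mailbox index overhead; the function names are made up for the example.

```python
def sis_savings(avg_recipients):
    """Fraction of space saved versus storing one copy per recipient."""
    return 1.0 - 1.0 / avg_recipients

def duplication_overhead(avg_recipients):
    """Extra space per-recipient copies use, relative to a single instance."""
    return avg_recipients - 1.0

r = 1.3  # average recipients per message, as quoted above
print(f"saved vs. per-recipient copies: {sis_savings(r):.1%}")
print(f"overhead vs. single instance:   {duplication_overhead(r):.1%}")
```

Under those assumptions, per-recipient copies take about 30% more space than a single instance, which is the same as a single instance taking about 23% less space than per-recipient copies. Which number applies depends on the baseline.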


> >  Mark's wrong.  His assumptions are incorrect, and based on the
> >  idea that metadata updates are not synchronous in all systems.
> 
>         Meta-data updates are at least partially synchronous on all
> systems I know of.  Well, unless you are running with asynchronous
> mounts, but if you're doing that then you shouldn't be running a mail
> system until you understand why that's a bad idea.
> 
>         Even if they're not synchronous, they're still bottlenecks to be
> avoided if possible.

FS design issue.  And metadata updates in FreeBSD (with soft
updates) or SVR4.2 or Solaris (with delayed ordered writes) are
*NOT* synchronous, they are merely ordered.
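The difference between "synchronous" and "merely ordered" can be sketched with a toy write cache. This is an illustration only, assuming a single ordering rule (an inode must reach disk before the directory entry that points at it); real soft updates and delayed ordered writes track far more than this, and the class and block names are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

class OrderedWriteCache:
    """Toy model: writes return immediately, ordering is enforced at flush."""

    def __init__(self):
        self.pending = {}   # block -> set of blocks that must hit disk first
        self.disk_log = []  # order in which blocks were actually written

    def write(self, block, after=()):
        # No disk wait here: the caller is never stalled on the update.
        self.pending.setdefault(block, set()).update(after)
        for dep in after:
            self.pending.setdefault(dep, set())

    def flush(self):
        # Emit blocks so that every dependency precedes its dependents.
        for block in TopologicalSorter(self.pending).static_order():
            self.disk_log.append(block)
        self.pending.clear()

cache = OrderedWriteCache()
cache.write("dirent:msg1", after=["inode:msg1"])
cache.write("dirent:msg2", after=["inode:msg2"])
written_before_flush = list(cache.disk_log)  # nothing has touched disk yet

cache.flush()
log = cache.disk_log
```

The file creations complete without waiting on the disk, yet the on-disk order still guarantees that no directory entry ever points at an uninitialized inode.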


> >  Cyrus is much closer to commercial usability, but it has its own
> >  set of problems, too.
> 
>         It is somewhat closer.  If you want real commercial usability,
> you have to start with the MessagingDirect code, which is based on
> Cyrus but with lots of bug fixes, increased reliability and
> robustness, etc....  Then you graduate to Sendmail Advanced Message
> Server, which takes that to the next level.

You limited my options to Open Source, however.


> >>          Either way, locking is a very important issue that has to be
> >>  solved, one way or the other.
> >
> >  No, it's a very important issue that has to be designed around,
> >  rather than implemented.
> 
>         Somebody said that when they invented Maildir.  I didn't believe
> it then, and I don't believe it now.

Maildir is a kludge around NFS locking.  Nothing more, and nothing
less.

> >  You are unlikely to ever find someone using NFS in this capacity,
> >  except as a back end for a single server message store.
> 
>         Show me an IMAP server that actually implements SIS.  I don't know of any.

MS Exchange does, and so does Lotus Notes.  I know they suck, but
they are examples.

In the Open Source world, you're not going to find one: another
problem that Open Source has is an inability to tackle problems
above a certain level of complexity.


> >  The point was that, without making changes requiring an in depth
> >  understanding of the code of the components involved, which Nick's
> >  solution doesn't really demonstrate, you're never going to get more
> >  than "marginally better" numbers.
> 
>         Could be.  In that case, we may have to find an alternative
> message store solution.  If I can prove that this really is a
> problem, then I'll try to help them find a suitable SAN solution and
> then drop in SAMS.  If not, I may end up writing a paper or doing
> another invited talk.

8-).


> >  It works on NFS.  You just have to run the delivery agent on the
> >  same machine that's running the access agent, and not try to mix
> >  multiple hosts accessing the same data.
> 
>         Nope.  mmap on NFS doesn't work.

Who's using mmap?!?

[ ... ]

> >  The part of Netscape that Sun bought used to provide an IMAP4
> >  server (based on heavily modified UW IMAP code).  Is there a
> >  reason you can't use that?  I guess the answer must be "I have
> >  been directed to use Open Source".  8-).
> 
>         Actually, no.  They would much prefer commercial software.
> However, they don't have any money to spend on software, and I know
> from personal experience that the Netscape/iPlanet stuff doesn't
> scale.  Indeed, we're already in the process of scrapping all other
> Netscape/iPlanet software because we've had excessive problems with
> it.

This is interesting to know; from the documentation available,
they imply they scale, and a single instance of one seems to
match their claims for a single instance.  I guess it's always
worse than the marketing literature, when you deploy it.  8-(.


> >  This should be no problem.  You should be able to handle this
> >  with a single machine, IMO, without worrying about locking, at
> >  all.
> 
>         Remember, Maildir doesn't do locking.
> 
> >        10,000 client machines is nothing.
> 
>         10,000 LAN clients?  With 44MB messages and 200MB mailboxes?  On
> NFS?  Sorry, my testing so far indicates that this is a significant
> load and we need to take care to make sure that it is handled
> properly.

40 seconds to transfer on a Gigabit ethernet... assuming you can get
it off the disks.  8-).  Do you really expect them all simultaneously?


> >                                              so you can treat the
> >  inbound one as a bastion host, and keep the outbound entirely
> >  inside, and the inbound server should use a transport protocol
> >  for internal delivery to the machine running the IMAP4 server,
> >  which makes locking go away.
> 
>         How does locking go away?  Through Maildir?  Or did you have
> something else in mind?

You don't need to assert a lock over NFS, if the only machine doing
the reading is the one doing the writing, and it asserts the lock
locally (this was more talking about the Cyrus cache files, not
maildir).
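As a rough sketch of what "asserts the lock locally" could look like when a single host does all the reading and writing, the following uses an advisory `flock()` around an append. The `append_locked` helper and the file name are hypothetical, and whether `flock()` stays purely local on an NFS mount depends on the operating system and mount options, so treat that part as an assumption.

```python
import fcntl
import os
import tempfile

def append_locked(path, data):
    """Append data under an exclusive advisory lock held by this host."""
    with open(path, "ab") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # blocks until other local holders release
        try:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())        # push the data out before unlocking
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

# Usage on a scratch file standing in for a per-mailbox cache file.
path = os.path.join(tempfile.mkdtemp(), "cache")
append_locked(path, b"one\n")
append_locked(path, b"two\n")

with open(path, "rb") as f:
    contents = f.read()
```

Because the delivery agent and the access agent run on the same machine, they serialize against each other through the local kernel, and no lock request has to cross the wire.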


> >                                At worst, you can limit the number
> >  of bastion to internal server connections, which will make things
> >  queue up at the bastion, if you get a large activity burst, and
> >  let it drain out to the internal server, over time.
> 
>         I'm not worried about internal SMTP connections.  But we have to
> be careful to make sure we don't put any additional limits on POP3 or
> IMAP connections.

I was talking about machine capacity for connections.  POP3 is one
at a time, IMAP4 is (usually, worst case) 4 per client.
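For the 10,000 clients mentioned earlier, those per-client figures give a simple upper bound. This is back-of-the-envelope arithmetic, assuming every client is connected at the same moment, which is unlikely in practice; the function name is made up for the example.

```python
def worst_case_connections(clients, per_client):
    """Upper bound on simultaneous connections if every client is active."""
    return clients * per_client

clients = 10_000
pop3_connections = worst_case_connections(clients, 1)   # one at a time
imap4_connections = worst_case_connections(clients, 4)  # usual worst case

print(pop3_connections, imap4_connections)
```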

-- Terry




