Date:      Thu, 31 Oct 1996 13:18:41 -0800 (PST)
From:      Mark Crispin <MRC@Panda.COM>
To:        Terry Lambert <terry@lambert.org>
Cc:        terry@lambert.org, gpalmer@FreeBSD.org, jgreco@brasil.moneng.mei.com, j@uriah.heep.sax.de, roberto@keltia.freenix.fr, current@FreeBSD.org, scrappy@ki.net
Subject:   Re: /var/mail (was: re: Help, permission problems...)
Message-ID:  <MailManager.846796721.5917.mrc@Ikkoku-Kan.Panda.COM>
In-Reply-To: <199610312059.NAA26174@phaeton.artisoft.com>

On Thu, 31 Oct 1996 13:59:28 -0700 (MST), Terry Lambert wrote:
> Well, my personal opinion is that it should be runtime, not build-time,
> configured.

Great!!!!!!  This is real progress here!  You're seeing it my way.

> Failing that, you could look at /var/mail with stat and look at st_dev
> to see if it was NFS

Alas!!  Been there, done that.

Using st_dev to check for NFS isn't portable.  There's no portable way to
check for NFS on BSD systems.  On SVR4, you could do ustat and check f_tinode,
but SUN broke that in Solaris 2.5.

> and the permissions to see if they will let your
> users create .lock files successfully.

Of course, you do this on a file-by-file basis, since it may work in some
directories but not others (not all mail is on /var/mail).

Or, you could just try to create the .lock file and see what happens.
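A minimal sketch of "just try to create the .lock file and see what happens" might look like the following. The function and enum names are hypothetical and are not taken from any real mail program. It assumes a single exclusive-create attempt with open(); note that O_EXCL has historically not been reliable over NFS, so this shows the idea of the run-time test rather than an NFS-safe lock.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical result codes for one .lock attempt. */
enum dotlock_result {
    DOTLOCK_OK,     /* we created the .lock file and now hold it */
    DOTLOCK_BUSY,   /* somebody else already holds the .lock file */
    DOTLOCK_NOPERM, /* we are not permitted to create .lock files here */
    DOTLOCK_ERROR   /* anything else (name too long, I/O error, ...) */
};

/* Try once to create "<mailbox>.lock" and report what happened. */
enum dotlock_result try_dotlock(const char *mailbox)
{
    char lockname[1024];
    int n = snprintf(lockname, sizeof lockname, "%s.lock", mailbox);
    if (n < 0 || (size_t)n >= sizeof lockname)
        return DOTLOCK_ERROR;

    /* O_CREAT|O_EXCL fails with EEXIST if the lock file already exists. */
    int fd = open(lockname, O_WRONLY | O_CREAT | O_EXCL, 0666);
    if (fd >= 0) {
        close(fd);
        return DOTLOCK_OK;
    }

    switch (errno) {
    case EEXIST:
        return DOTLOCK_BUSY;
    case EACCES:
        return DOTLOCK_NOPERM;  /* the case discussed in this thread */
    default:
        return DOTLOCK_ERROR;
    }
}
```

The caller removes the .lock file with unlink() when done.  The test has to be repeated for each mailbox, since the answer depends on the directory the mailbox lives in.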

OK, we're still on the same track.

> You may not be able to know that fcntl() *will* work, but you can know
> .lock files *won't* work.

So, what do we do now?  We've determined that .lock files won't work.  We no
longer have any guidance from run time tests.  We have to choose a policy from
here.

What are our choices?
	1) Error: "Lock failed, mailbox open aborted"
	2) Warning: "Lock failed, proceeding as if locked"
	3) Ignore the condition entirely.

Policy (1) is safe, but is probably unacceptable for almost everyone.  It is
certainly unacceptable to FreeBSD users; it is a perfectly normal condition
for .lock files to fail.

Policy (2) is potentially dangerous, but it lets the user see her mail and she
is alerted that there may be a problem.  This may or may not be right for
FreeBSD users, depending upon whether or not they are part of an NFS cluster.

Policy (3) is extremely dangerous unless you know that .lock files are never
needed.  This is probably right for FreeBSD users who never use NFS, and know
they will never do so.

I distribute the software with policy (2), and a build-time conditional to set
policy (3).
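As a rough sketch only, the three policies could be expressed like this.  IGNORE_LOCK_EACCES_ERRORS is the build-time conditional discussed in this thread; everything else (the enum, the DEFAULT_LOCK_POLICY macro, the function name, and the way the conditional selects a policy) is a hypothetical illustration, not the actual source.

```c
#include <stdio.h>

/* The three choices available once .lock creation has failed. */
enum lock_policy {
    POLICY_ABORT  = 1,  /* (1) refuse to open the mailbox */
    POLICY_WARN   = 2,  /* (2) tell the user, then carry on */
    POLICY_IGNORE = 3   /* (3) carry on silently */
};

/* Assumed mapping: warn by default, ignore only if built with the flag. */
#ifdef IGNORE_LOCK_EACCES_ERRORS
#define DEFAULT_LOCK_POLICY POLICY_IGNORE
#else
#define DEFAULT_LOCK_POLICY POLICY_WARN
#endif

/* Returns 1 if the mailbox open should proceed, 0 if it must be aborted. */
int proceed_after_lock_failure(enum lock_policy policy,
                               const char *mailbox, FILE *out)
{
    switch (policy) {
    case POLICY_ABORT:
        fprintf(out, "Error: Lock failed, mailbox open aborted: %s\n",
                mailbox);
        return 0;
    case POLICY_WARN:
        fprintf(out, "Warning: Lock failed, proceeding as if locked: %s\n",
                mailbox);
        return 1;
    case POLICY_IGNORE:
        return 1;
    }
    return 0;  /* unknown policy: take the safe choice and abort */
}
```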

> > Please explain how config will know what sort of systems you will access
> > over NFS in the future.
> The same way your config knows that the remotely mounted /var/mail that
> will be accessed in the future is mode 1777.  8^).

In other words, it doesn't.  The only thing it can do is be prepared.  Good
Boy Scout software it is, yessiree.  Apply every possible lock I can, that's
the trick.

> Again, I would prefer runtime configuration.  If nothing else, because
> I use AMD to relocate my mail files to user accounts, and some are local
> and some are NFS mounted.

This is EXACTLY what I'm doing for you, sir!!

Now, do you have some magic run time test that could lead me to discriminate
between the two cases:
	1) system call locking was the right thing, don't worry.
	2) you really needed the .lock file, and the fool who set up the
	   system did so in a way that you don't have privileges to lock the
	   mail file.
If you did, I'd be happier than a pig in mud.

About 6 years ago, a VAX BSD guru assured me that "turn off the annoying
message; if .lock file locking fails, system call locking is the right thing,
don't worry."  I made the mistake of believing him.

I subsequently have YEARS of unhappy experience with systems that use setgid
access to the mail spool.  This is a particularly nasty problem, since the
poor guys don't know that they have a problem for months until bad luck
strikes and mail delivery happens while their mailbox is being rewritten.
And, for a while, they put it down as a random glitch or user goof, until a
long-term continuous pattern comes up.  And when they finally ask for help,
they are VERY upset, and even more upset to learn that the condition was
detected but they were never told about it.

It is going to take a lot more than flaming to get me to trust the word of BSD
folks again.  You're going to have to show me what I can do to make things any
better for you without reintroducing this problem in the remaining 99% of the
market.

> > Please explain how FreeBSD, perhaps the only system in the world that
> > still uses system call locking, is going to get the world to convert.
> By publicising the fact that fcntl and flock locking are available on
> all POSIX conformant and POSIX compliant systems, and that POSIX
> strictly mandates how an FS must act, and NFS is an FS?

Too many NFS clusters have had cluster-wide shutdowns from lockd/statd.

I am unaware of any POSIX requirement that "mail spool locking must be done
using fcntl() and must not be done by using .lock files".  If there is such a
requirement, it's being widely ignored out there.  Out of 3 dozen UNIX
variants that I support, only two are believed to have mail delivery programs
that use fcntl() or flock().

> Actually the question is where there will be a BSD4.4-Lite specific #ifdef
> for the default FreeBSD configuration to avoid the warning so that we
> don't have to hack around the warning using "port" code.

Sigh.  How many times did I tell you that the conditional is already there?
For the benefit of the lazy, it's IGNORE_LOCK_EACCES_ERRORS.  So, assuming you
still use the neb port, it's
	make neb -DIGNORE_LOCK_EACCES_ERRORS

This is not going to be the default, because of the reasons with NFS outlined
above.

> fcntl() should work, and removing the incentive to fix fcntl()
> is not an effective means of moving forward.

I'll back off a bit, and say that in theory it is possible that fcntl() over
NFS (lockd/statd) may somehow be made to work.  I have yet to see a system
where it has.

Recently:

The guys at IBM swore on a stack of Bibles that it would work in AIX, and
backed up their word with an offer of free RS/6000s if it didn't work.  Thanks
for the free machines, guys.

The guys at SUN swore up and down that they got it working on Solaris 2.5.
Well, the field results are in.  It doesn't work.  Now they're saying 2.6, for
sure.  At least one of my spies inside SUN is wagering with me that it won't.

Now, you guys may be smarter than the guys at IBM and SUN, and will come up
with fcntl()/lockd/statd that works for everyone.  That would be great news,
and you would save the world.

By the way, in terms of saving the world, here are a few more items:

I hope that you'll fix the bug in fcntl() that if you have a lock on a file,
and then lock the same file on a different file designator in the same
process, the first lock is blown away.  This apparently happens even if the
locks are owned by different threads in the same process; locks are apparently
recorded in a per-process inode table.
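One way this can show up is sketched below.  It assumes POSIX record-lock semantics, where fcntl() locks belong to the process rather than to the descriptor, so a second lock through another descriptor does not conflict and closing that descriptor drops every lock the process holds on the file.  The function names are hypothetical; a child process is used to probe the lock because a process cannot conflict with its own locks.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Non-blocking exclusive fcntl() lock on the whole file; 0 on success. */
static int lock_whole_file(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/*
 * Lock the file through one descriptor, lock it again through a second
 * descriptor in the same process, close the second descriptor, then ask
 * a child whether the file is still locked.
 * Returns 1 if the first lock was lost, 0 if it survived, -1 on error.
 */
int first_lock_lost(const char *path)
{
    int fd1 = open(path, O_RDWR | O_CREAT, 0600);
    if (fd1 < 0)
        return -1;
    int fd2 = open(path, O_RDWR);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    if (lock_whole_file(fd1) != 0) {
        close(fd2);
        close(fd1);
        return -1;
    }
    if (lock_whole_file(fd2) != 0) {
        /* The second lock was refused, so the first one is intact. */
        close(fd2);
        close(fd1);
        return 0;
    }
    close(fd2);                 /* under POSIX rules this drops both locks */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd1);
        return -1;
    }
    if (pid == 0) {
        /* Child: fcntl() locks are not inherited, so this is a fair probe. */
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        _exit(lock_whole_file(fd) == 0 ? 0 : 1);
    }

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
        if (WEXITSTATUS(status) == 0)
            result = 1;         /* child got the lock: ours is gone */
        else if (WEXITSTATUS(status) == 1)
            result = 0;         /* child was refused: ours survived */
    }
    close(fd1);
    return result;
}
```

Threads share one process, so two threads locking through their own descriptors would interfere with each other in the same way.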

Oh, and there's also the flock() bug that if you promote a lock from shared to
exclusive, or demote from exclusive to shared, the lock is actually unlocked.
I didn't check to see if fcntl() has the same bug.

Consider adding locks by a named resource on a file, so it is possible to have
multiple locks on the same file but with different purposes.  File data
sections don't cut it; I want names so I can define my own resources.  In
particular, I want a resource which indicates "file open by mailer", and which
must be exclusive in order to remove messages; and another one which indicates
"currently parsing mailbox data", and which must be exclusive in order to
append new data.  Neither flock() nor fcntl() gives this level of support, yet
DEC's and IBM's old proprietary operating systems did.



