Date:      Mon, 26 Apr 2004 12:25:47 -0600
From:      Tillman Hodgson <tillman@seekingfire.com>
To:        FreeBSD-Questions <freebsd-questions@freebsd.org>
Subject:   NFS occasionally gives "permission denied" in the middle of a large transfer
Message-ID:  <20040426182547.GF92049@seekingfire.com>

Howdy folks,

I run a -STABLE (Apr 15 at the moment) NFS file server named Athena on
generic Intel hardware. It has a large number of disks, primarily SCSI
though 2 are IDE, and has a variety of Vinum mirrors from which it
exports filesystems.

I also have a variety of clients that mount exports from that server.
They're a mix of -STABLE (on x86), -CURRENT (on x86 and sparc64), and
NetBSD (sgimips and pmax platforms).

I run a weekly cron job on all boxes to back up to a central
NFS-exported filesystem on Athena (/exports/backups, usually mounted as
/nfs/backups). I typically use a script like this (example is from the
host Caliban, which runs -CURRENT on sparc64):

[root@caliban ~]# cat /usr/local/etc/periodic/weekly/110.backup
#!/bin/sh
### Backup important directories to Athena
###  (To restore do a 'gzcat file.dump.gz | restore -i -f -')
mount /nfs/backups
dump 0Lf - /    | gzip > /nfs/backups/caliban/weekly/root.dump.gz
dump 0Lf - /var | gzip > /nfs/backups/caliban/weekly/var.dump.gz
dump 0Lf - /usr | gzip > /nfs/backups/caliban/weekly/usr.dump.gz
umount /nfs/backups

However, this has been failing sporadically with messages like:

  <snip>
  DUMP: 50.71% done, finished in 1:03
  DUMP: 53.06% done, finished in 1:01
  DUMP: 54.72% done, finished in 1:02
gzip: stdout: Permission denied
  DUMP: Broken pipe
  DUMP: The ENTIRE dump is aborted.

I've confirmed that the scripts work correctly when run by hand (and
most of the time when run from cron), that I'm not running out of disk
space, and that I'm not running out of network I/O (Athena is Gigabit on
the switch, all the other machines are 100Mbit). There shouldn't be any
concurrent access (nothing else uses this filesystem, and each host gets
its own directory tree) so locking shouldn't be an issue.

The odd part is how it's sporadic, and when it does occur it's typically
pretty far into what had been, until that point, a successful dump.
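
Plain sh reports only the exit status of the last command in a pipeline,
so a failed run does not show on its own whether dump or gzip broke
first, or exactly when. A minimal sketch of one way to record both
statuses with a timestamp follows. The function name backup_one, the LOG
variable, and the log path are hypothetical and not part of the script
above:

```shell
#!/bin/sh
# Sketch only: run "producer | gzip > outfile" and log the exit status of
# both stages. backup_one, LOG and the default log path are illustrative
# assumptions, not part of the original backup script.

LOG=${LOG:-/var/log/backup-debug.log}

backup_one() {
    # $1 = output file; remaining arguments = command writing to stdout
    out=$1; shift
    status_file=$(mktemp "${TMPDIR:-/tmp}/backup.XXXXXX") || return 1

    # The brace group saves the producer's exit status to a file, because
    # $? after the pipeline only reflects gzip (the last command).
    { "$@"; echo $? > "$status_file"; } | gzip > "$out"
    gzip_status=$?

    producer_status=$(cat "$status_file")
    rm -f "$status_file"
    # If the producer side was killed before writing its status, say so.
    : "${producer_status:=unknown}"

    echo "$(date '+%Y-%m-%d %H:%M:%S') out=$out producer=$producer_status gzip=$gzip_status" >> "$LOG"

    # Succeed only if both stages succeeded.
    [ "$producer_status" = 0 ] && [ "$gzip_status" = 0 ]
}
```

In the weekly script, each pipeline line could then be written as, for
example, `backup_one /nfs/backups/caliban/weekly/root.dump.gz dump 0Lf - /`,
and the timestamps in the log compared against server-side logs on
Athena for the same moment.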

Is this a known issue? Are there any workarounds for it? Am I doing
something blindingly-obviously-wrong? ;-)

- Tillman


-- 
If enlightenment is not where you are standing, where will you look?
	- Zen saying


