Date:      Wed, 19 Dec 2012 17:56:13 -0500 (EST)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Hub- Marketing <marketing@hub.org>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: 9-STABLE -> NFS -> NetAPP:
Message-ID:  <549354325.1509120.1355957773993.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <B7529290-01FC-4E14-ACE5-1EBFCF2367C3@hub.org>

Hub-Marketing wrote:
> I'm running a few servers sitting on top of a NetAPP file server …
> everything runs great, but periodically I'm getting:
>
> nfs_getpages: error 13
> vm_fault: pager read error, pid 11355 (https)
>
13 is EACCES. This message means that the Netapp server is
replying EACCES to a read for a pagein. I notice that both
root and www are running the executable. (Also, root is often
mapped to something like "nobody" in the NFS server.)
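
To double-check that mapping on a given client, here is a minimal sketch
using only Python's standard errno and os modules (assuming Python is
installed there; the variable names are just for illustration):

```python
import errno
import os

# Error number taken from the "nfs_getpages: error 13" console message.
code = 13

# errno.errorcode maps numeric errno values to their symbolic names
# on the platform where this runs.
name = errno.errorcode.get(code, "unknown")

# os.strerror returns the human-readable text for the same number.
print(f"{code} = {name}: {os.strerror(code)}")
```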

You could try making sure the httpd executable file has r_x
permissions for all users (chmod 555 httpd).
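
To see which classes of users currently lack read or execute permission,
something like the following sketch could be run on the client before and
after the chmod. The helper name missing_rx is made up for this example,
and it only reports the mode bits as the client sees them; any extra
access checks or ACLs on the NFS server are not visible this way.

```python
import os
import stat
import sys

def missing_rx(path):
    """Return the classes (user/group/other) lacking read or execute bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    needed = {
        "user": stat.S_IRUSR | stat.S_IXUSR,
        "group": stat.S_IRGRP | stat.S_IXGRP,
        "other": stat.S_IROTH | stat.S_IXOTH,
    }
    # A class is reported when either of its two bits is not set.
    return [who for who, bits in needed.items() if (mode & bits) != bits]

if __name__ == "__main__":
    # Pass the executable's path on the command line,
    # e.g. /usr/local/sbin/httpd as shown in the ps listing.
    for path in sys.argv[1:]:
        lacking = missing_rx(path)
        print(path, "is missing r-x for:", ", ".join(lacking) or "nobody")
```

An empty list means the file is already r-x for everyone, which is what
chmod 555 produces.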

If it still keeps happening once you've done that, you'd need
to capture packets when this happens and take a look at the
NFS RPCs via wireshark to see when the EACCES is returned and
what <uid, gids> are sent in the credentials for that Read.

rick

> errors on my screen … not always same pid … the annoying part is that
> it seems to always affect the same jail that is running .. if I
> shutdown all jails on that physical server, everything shuts down
> except for that *one* jail, with a ps listing looking like:
>
> USER   PID %CPU %MEM    VSZ   RSS TT STAT STARTED    TIME COMMAND
> root  6670  0.0  0.0   9936  1372 ?? DsJ   3:00AM 0:00.01 newsyslog
> root  6815  0.0  0.0   9936  1288 ?? DsJ   3:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
> root  8361  0.0  0.1 220740 11400 ?? DsJ   7:33PM 0:01.25 /usr/local/sbin/httpd -DNOHTTPACCEPT
> www   8364  0.0  0.0      0     0 ?? ZJ    7:33PM 0:00.00 <defunct>
> www  11866  0.0  0.1 318444 16792 ?? TJ    7:36PM 0:00.03 /usr/local/sbin/httpd -DNOHTTPACCEPT
> www  11872  0.0  0.1 297964 14008 ?? TJ    7:36PM 0:00.01 /usr/local/sbin/httpd -DNOHTTPACCEPT
> www  11873  0.0  0.1 306156 15028 ?? DEJ   7:36PM 0:00.02 /usr/local/sbin/httpd -DNOHTTPACCEPT
> root 17190  0.0  0.0   9936  1240 ?? DsJ   8:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
> root 24864  0.0  0.0   9936  1392 ?? DsJ   4:00AM 0:00.01 newsyslog
> root 24910  0.0  0.0   9936  1336 ?? DsJ   4:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
> root 29972  0.0  0.0   9936  1240 ?? DsJ   9:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
> root 34221  0.0  0.0  51480  4332 ?? DsJ   4:47AM 0:00.02 sshd: root@pts/1 (sshd)
> root 42452  0.0  0.0   9936  1296 ?? DsJ  10:00PM 0:00.01 newsyslog
> root 42522  0.0  0.0   9936  1240 ?? DsJ  10:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
> root 55179  0.0  0.0   9936  1296 ?? DsJ  11:00PM 0:00.01 newsyslog
> root 55244  0.0  0.0   9936  1240 ?? DsJ  11:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
> root 67592  0.0  0.0   9936  1336 ?? DsJ  12:00AM 0:00.01 newsyslog
> root 67762  0.0  0.0   9936  1288 ?? DsJ  12:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
> root 81603  0.0  0.0   9936  1340 ?? DsJ   1:00AM 0:00.01 newsyslog
> root 81640  0.0  0.0   9936  1284 ?? DsJ   1:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
> root 93792  0.0  0.0   9936  1344 ?? DsJ   2:00AM 0:00.01 newsyslog
> root 93815  0.0  0.0   9936  1288 ?? DsJ   2:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
> root 34228  0.0  0.0  67960  4464 1  Ds+J  4:47AM 0:00.00 sshd: root@pts/1 (sshd)
> root 38473  0.0  0.0  17556  3272 3  SJ    4:53AM 0:00.02 /bin/tcsh
> root 38475  0.0  0.0  14212  1512 3  R+J   4:53AM 0:00.00 ps aux
>
> I can do a 'jexec <JID> /bin/tcsh' to get into the jail, I can perform
> ps commands, etc … I just can't get those processes to shutdown …
>
> everything within the jail is 'up to date' … updates the userland and
> ports … I've checked over the NetApp, but everything appears fine, and
> it only seems to repeatedly affect that one jail, on that same
> physical server ...
>
> I have no ideas on what / how to debug this … thoughts? help?
>
> thx
>
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to
> "freebsd-stable-unsubscribe@freebsd.org"


