Date:      Tue, 18 Dec 2012 20:58:36 -0800
From:      Hub- Marketing <marketing@hub.org>
To:        "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject:   9-STABLE -> NFS -> NetAPP: 
Message-ID:  <B7529290-01FC-4E14-ACE5-1EBFCF2367C3@hub.org>


I'm running a few servers sitting on top of a NetAPP file server ...
everything runs great, but periodically I'm getting:

nfs_getpages: error 13
vm_fault: pager read error, pid 11355 (https)

errors on my screen ... not always the same pid ... the annoying part is
that it seems to always affect the same running jail .. if I shut down
all jails on that physical server, everything shuts down except for that
*one* jail, with a ps listing looking like:

USER   PID %CPU %MEM    VSZ   RSS TT  STAT STARTED    TIME COMMAND
root  6670  0.0  0.0   9936  1372 ??  DsJ   3:00AM 0:00.01 newsyslog
root  6815  0.0  0.0   9936  1288 ??  DsJ   3:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root  8361  0.0  0.1 220740 11400 ??  DsJ   7:33PM 0:01.25 /usr/local/sbin/httpd -DNOHTTPACCEPT
www   8364  0.0  0.0      0     0 ??  ZJ    7:33PM 0:00.00 <defunct>
www  11866  0.0  0.1 318444 16792 ??  TJ    7:36PM 0:00.03 /usr/local/sbin/httpd -DNOHTTPACCEPT
www  11872  0.0  0.1 297964 14008 ??  TJ    7:36PM 0:00.01 /usr/local/sbin/httpd -DNOHTTPACCEPT
www  11873  0.0  0.1 306156 15028 ??  DEJ   7:36PM 0:00.02 /usr/local/sbin/httpd -DNOHTTPACCEPT
root 17190  0.0  0.0   9936  1240 ??  DsJ   8:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 24864  0.0  0.0   9936  1392 ??  DsJ   4:00AM 0:00.01 newsyslog
root 24910  0.0  0.0   9936  1336 ??  DsJ   4:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 29972  0.0  0.0   9936  1240 ??  DsJ   9:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 34221  0.0  0.0  51480  4332 ??  DsJ   4:47AM 0:00.02 sshd: root@pts/1 (sshd)
root 42452  0.0  0.0   9936  1296 ??  DsJ  10:00PM 0:00.01 newsyslog
root 42522  0.0  0.0   9936  1240 ??  DsJ  10:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 55179  0.0  0.0   9936  1296 ??  DsJ  11:00PM 0:00.01 newsyslog
root 55244  0.0  0.0   9936  1240 ??  DsJ  11:00PM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 67592  0.0  0.0   9936  1336 ??  DsJ  12:00AM 0:00.01 newsyslog
root 67762  0.0  0.0   9936  1288 ??  DsJ  12:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 81603  0.0  0.0   9936  1340 ??  DsJ   1:00AM 0:00.01 newsyslog
root 81640  0.0  0.0   9936  1284 ??  DsJ   1:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 93792  0.0  0.0   9936  1344 ??  DsJ   2:00AM 0:00.01 newsyslog
root 93815  0.0  0.0   9936  1288 ??  DsJ   2:00AM 0:00.01 /usr/sbin/newsyslog -f /usr/local/etc/rotate_logs.cfg
root 34228  0.0  0.0  67960  4464  1  Ds+J  4:47AM 0:00.00 sshd: root@pts/1 (sshd)
root 38473  0.0  0.0  17556  3272  3  SJ    4:53AM 0:00.02 /bin/tcsh
root 38475  0.0  0.0  14212  1512  3  R+J   4:53AM 0:00.00 ps aux

I can do a 'jexec <JID> /bin/tcsh' to get into the jail, I can perform
ps commands, etc ... I just can't get those processes to shut down ...

everything within the jail is 'up to date' ... I've updated the userland
and ports ... I've checked over the NetApp, but everything appears fine,
and it only seems to repeatedly affect that one jail, on that same
physical server ...

I have no ideas on what / how to debug this ... thoughts?  help?

thx




