Date:      Sun, 10 Feb 2013 16:56:00 -0800
From:      Marc Fournier <scrappy@hub.org>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        freebsd-stable@freebsd.org, John Baldwin <jhb@freebsd.org>
Subject:   Re: 9-STABLE -> NFS -> NetAPP:
Message-ID:  <0EB27C56-93A1-4FAE-9FB5-CAD960098609@hub.org>
In-Reply-To: <1946688889.2870936.1360542666536.JavaMail.root@erie.cs.uoguelph.ca>
References:  <1946688889.2870936.1360542666536.JavaMail.root@erie.cs.uoguelph.ca>



On 2013-02-10, at 4:31 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:

> Marc Fournier wrote:
>> Hi John …
>>
>> Does this help?
>>
>> root@io:~ # ps auxl | grep du
>> root 1054 0.0 0.1 16176 6600 ?? D 3:15AM 0:05.38 du -skx /vm/2799 0 81426 0 20 0 newnfs
>> root 12353 0.0 0.1 16176 5104 ?? D Sat03AM 0:05.41 du -skx /vm/2799 0 91597 0 20 0 newnfs
>> root 64529 0.0 0.1 16176 5164 ?? D Fri03AM 0:05.40 du -skx /vm/2799 0 43227 0 20 0 newnfs
>> root 12855 0.0 0.0 16308 1988 0 S+ 5:26AM 0:00.00 grep du 0 12847 0 20 0 piperd
> It is probably too late, but all the lines (without the | grep du) would be
> more useful. I also include the "H" flag, so it lists threads as well as
> processes. The above just says the "du" command is waiting for a vnode lock.
> The interesting process/thread is the one that is holding a vnode lock
> while waiting for something else.
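
To make the suggestion above concrete, here is a minimal sketch of how the full thread listing could be narrowed down to blocked threads and their wait channels. The function name `blocked_threads` is hypothetical, and the sketch assumes the column layout seen in the listing above (STAT in field 8, wait channel in the last field); it only prints candidates and does not identify the lock holder by itself.

```shell
#!/bin/sh
# Sketch: read `ps auxl`-style output on stdin and print PID, state and
# wait channel for every entry in uninterruptible wait ("D" state).
# Assumption: STAT is field 8 and the wait channel is the last field,
# as in the listing quoted above.
blocked_threads() {
    awk '$8 ~ /^D/ { print $2, $8, $NF }'
}

# Usage on a live system (not run here):
#   ps auxlH | blocked_threads
```

Entries whose wait channel differs from the one the stuck "du" processes are sleeping on are the ones worth a closer look.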

As requested, 'ps auxlH' attached …



> 
> Are you still getting the:
> nfs_getpages: error 13
> vm_fault: pager read error, pid 11355 (https)

Fairly quiet:



And that is it since last reboot ~20 days ago …

>
> messages logged?
>
> With John's recent patch, the error# would no longer be 13 if it was
> caused by the "intr" flag resulting in a Read RPC terminating with EINTR.
> If you are still getting the above with "error 13", it suggests that
> the server is replying EACCES for the Read RPC.
> I suggested before that you check to make sure that the executable had
> read access for everyone on the file server. Since I didn't hear back,
> I'll assume this is the case.
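
As an illustration of the permission check mentioned above, the following is a rough sketch that lists files which carry an execute bit for the owner but are not readable by "other". The function name `unreadable_executables` and the directory argument are hypothetical; which tree actually holds the affected executable is not stated in this thread.

```shell
#!/bin/sh
# Sketch: print regular files under a directory that are executable by
# their owner but lack the world-read bit (mode 0004).
# Assumption: the directory passed in is a placeholder for wherever the
# executables live on the NFS mount.
unreadable_executables() {
    find "$1" -type f -perm -0100 ! -perm -0004
}

# Usage (hypothetical path):
#   unreadable_executables /path/to/nfs/mount
```

An empty result would mean no executable in that tree is missing read access for everyone, at least as far as the mode bits seen by the client go.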

Don't understand this question … I have 34 VPSs running off of this
server right now … that 'du process' runs against each of those VPSs
every night, and this problem started happening on Friday night's run …
~18 days into uptime … so the same process has run repeatedly, with no
issues, 18 times before it hung on Friday … also, the hang, once
'triggered', only seems to recur against the same directory … the same
directory doesn't necessarily trigger it, but once it starts, it appears
to do it for the same directory … I'm not sure if I've ever seen it
happening to two different directories at the same time …

Also, please note that the du command is run from the physical server,
as root …

> rick
> ps: If it is still up and hasn't been rebooted, you could:
>    sysctl debug.kdb.break_to_debugger=1
>    - then type <ctrl><alt><esc> at the console and do the following
>      from the debugger
>    http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>    How well this works depends on what options your kernel was built with.

My remote console on that one doesn't work very well … I can view, but
I can't type …


