Date:      Thu, 15 May 2003 11:02:20 +0200 (CEST)
From:      Martin Blapp <mb@imp.ch>
To:        rwatson@freebsd.org
Cc:        current@freebsd.org
Subject:   AMD non-blocking RPC problem now reproducable
Message-ID:  <20030515101503.A47986@cvs.imp.ch>


Hi all,

As mentioned before, we still encounter an AMD problem in
pre-5.1. With the help of Genesys from #bsdcode I could
reproduce it here.

I'm now able to reproduce it, but debugging is quite
difficult! It's not specific to Linux clients; a FreeBSD
client suffers too.

Export at least two filesystems from the server. In my example we use / and /usr.

On the client, do the following:

- Start amd
- Run this loop:

while true ; do amq -u /net/yourserver ; sleep 1 ; ls -ld \
/net/yourserver/usr/local || break ; done

It is important that you list a subdirectory of /usr, because
the first call always seems to succeed. It's the second one which fails.

You will see output like:

drwxr-xr-x   11 root     root          512 May  5 14:11 /net/yourserver/usr/local

It will fail after 2-150 successful tries. If the blocking case (old behaviour)
is used within the mountd server, this will not happen.

Even stranger: if I attach ktrace to the pid of mountd, the bug
always appears! I'm not sure if we trigger the same bug then, but it
appears to me that we do.

I am beginning to suspect that it's timing related. The faster the network
response, the less often we hit this bug.

This is a ktrace on the server ...

 86984 mountd   RET   read 4
 86984 mountd   CALL  gettimeofday(0x80589c0,0)
 86984 mountd   RET   gettimeofday 0
 86984 mountd   CALL  read(0x8,0x807a000,0x74)
 86984 mountd   GIO   fd 8 read 116 bytes

"~wG\^W\0\0\0\0\0\0\0\^B\0\^A\M^F\M-%\0\0\0\^C\0\0\0\^A\0\0\0\^A\0\0\0D>\M-COo\0\0\0\rlevais.imp.ch\0\0\0\0\0\0\0\
\0\0\0\0\0\0\0\b\0\0\0\0\0\0\0\0\0\0\0\^B\0\0\0\^C\0\0\0\^D\0\0\0\^E\0\0\0\^T\0\0\0\^_\0\0\0\0\0\0\0\0\0\0\0\^D/u\
        sr"
 86984 mountd   RET   read 116/0x74
 86984 mountd   CALL  gettimeofday(0x80589c0,0)
 86984 mountd   RET   gettimeofday 0
 86984 mountd   CALL  read(0x8,0x80545c8,0x4)
 86984 mountd   RET   read -1 errno 35 Resource temporarily unavailable
 86984 mountd   CALL  close(0x8)
 86984 mountd   RET   close 0
 86984 mountd   CALL  select(0x8,0xbfbffb98,0,0,0)
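
For readers following the trace: the pattern of a 4-byte read followed by a
0x74-byte read is consistent with ONC RPC record marking over TCP (RFC 5531),
where each fragment is preceded by a 4-byte big-endian header whose high bit
flags the last fragment and whose low 31 bits give the length. The sketch
below only illustrates that decoding. The names record_mark and
decode_record_mark are made up for this example, and the header bytes used
in the usage note are an assumption, since the trace does not show them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded form of a 4-byte RPC record-marking header (hypothetical type). */
struct record_mark {
    int      last_fragment;  /* 1 if the high bit of the header is set */
    uint32_t length;         /* fragment length in bytes (low 31 bits) */
};

/* Decode a record mark from 4 bytes in network (big-endian) order. */
struct record_mark
decode_record_mark(const unsigned char hdr[4])
{
    struct record_mark rm;
    uint32_t raw;

    assert(hdr != NULL);
    raw = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
          ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];
    rm.last_fragment = (raw & 0x80000000u) != 0;
    rm.length = raw & 0x7fffffffu;
    return rm;
}
```

If the header before the 116-byte read was 0x80 0x00 0x00 0x74 (assumed, not
taken from the trace), this would decode to a last fragment of length 116,
and the following 4-byte read would be the attempt to fetch the next header.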

EAGAIN is OK, since we use non-blocking RPC. But something then goes wrong
and the connection gets closed. Of course, additional requests from the
client side will then fail.

May 15 10:27:31 myclient amd[38168]: mountd rpc failed: RPC: Unable to receive
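
To make the EAGAIN point concrete, here is a minimal sketch of how a
non-blocking read can separate "no data yet" from "peer closed" and from a
real error. This is not the mountd or libc RPC code, and it does not claim
to show where the bug is. The names read_status, set_nonblocking and
try_read are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Possible outcomes of one non-blocking read attempt (hypothetical enum). */
enum read_status { READ_DATA, READ_WOULD_BLOCK, READ_EOF, READ_ERROR };

/* Put fd into non-blocking mode; returns 0 on success, -1 on failure. */
int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Try to read up to len bytes (len > 0) from fd into buf.
 * On READ_DATA, *got holds the number of bytes read.
 */
enum read_status
try_read(int fd, void *buf, size_t len, size_t *got)
{
    ssize_t n;

    assert(fd >= 0 && buf != NULL && got != NULL);
    *got = 0;

    /* Retry if a signal interrupted the call before any data arrived. */
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);

    if (n > 0) {
        *got = (size_t)n;
        return READ_DATA;
    }
    if (n == 0)
        return READ_EOF;            /* peer closed the connection */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return READ_WOULD_BLOCK;    /* nothing to read yet, keep fd open */
    return READ_ERROR;              /* genuine failure */
}
```

With a helper like this, a caller would close the descriptor only on
READ_EOF or READ_ERROR, and on READ_WOULD_BLOCK would go back to select()
and wait for the descriptor to become readable again.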

Martin

Martin Blapp, <mb@imp.ch> <mbr@FreeBSD.org>
------------------------------------------------------------------
ImproWare AG, UNIXSP & ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
Phone: +41 61 826 93 00 Fax: +41 61 826 93 01
PGP: <finger -l mbr@freebsd.org>
PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
------------------------------------------------------------------


