Date:      Sun, 14 Oct 2001 12:14:39 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Paul van der Zwan <paulz@trantor.xs4all.nl>
Cc:        freebsd-current@FreeBSD.ORG
Subject:   Re: Multiple NFS server problems with Solaris 8 clients
Message-ID:  <3BC9E41F.2D7BC40@mindspring.com>
References:  <200110141431.f9EEVFh22336@trantor.xs4all.nl>

Paul van der Zwan wrote:
> If I run snoop on Solaris I see a getattr request being sent and
> an answer being received but apparently it gets ignored by Solaris.
> This happens on both Sol x86 and Sparc ( both with MU5 installed)

Please do a tcpdump and examine it; I suspect you will find
that the reply comes from a different IP address than the one
the request was sent to.  In general, this is because the code
doesn't explicitly use recvfrom/sendto semantics, and just lets
the route pick the source address.  This will most often occur
when you mount using an IP alias, but the primary (non-alias)
IP address is on the same subnet as the alias.  It can also
occur if you are using two address sets on the same wire, and do
not use an intervening router.
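
As a rough illustration of what to look for in the trace (a
sketch only, not anything taken from the NFS code; the Packet
layout and the mismatched_replies name are made up for this
example), pair each reply with its request by RPC transaction
id and compare the addresses:

```python
from typing import NamedTuple


class Packet(NamedTuple):
    """One datagram as read off a trace (hypothetical layout)."""
    src: str  # source IP address
    dst: str  # destination IP address
    xid: int  # RPC transaction id, used to pair request and reply


def mismatched_replies(requests, replies):
    """Return (request, reply) pairs where the reply's source address
    is not the address the request was sent to."""
    by_xid = {req.xid: req for req in requests}
    bad = []
    for reply in replies:
        req = by_xid.get(reply.xid)
        if req is not None and reply.src != req.dst:
            bad.append((req, reply))
    return bad


# Example with documentation addresses: the client mounted via the
# alias 192.0.2.2, but the server answered from its primary 192.0.2.1.
request = Packet(src="192.0.2.10", dst="192.0.2.2", xid=1)
reply = Packet(src="192.0.2.1", dst="192.0.2.10", xid=1)
for req, rep in mismatched_replies([request], [reply]):
    print(f"xid {req.xid}: sent to {req.dst}, answered from {rep.src}")
```

Any pair this reports is a reply the client has good reason to
discard, since it does not come from the address it asked.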


> Another problem I see is that rebooting the client causes the server
> to ignore requests afterwards. I see SYNs sent to the server but no
> response at all...

Again, you will need to tcpdump it.  One possibility is that the
ARP table is different on the "who has" after the reboot.  I've
noticed that a ping socket gets a route, and even after an ICMP
redirect, I still get a bunch of redirects, since FreeBSD does
not update the route table for already created clones (this is a
bug in FreeBSD's routing code).

Another possibility is that the reboot reset the sequence number;
a common practice is to ensure that the random sequence number
used is later than the one last used for the same IP/port pair.
The client will most likely reuse the same numbers, or lower
numbers, even if it is RFC compliant as to non-guessable
sequence numbers (you will see this on the tcpdump).  FreeBSD
will not guarantee increasing sequence numbers -- and will thus
"ignore" the packets -- unless you enable the sysctl to disable
the "pure random" sequence number hack.  Look for it via the
command "sysctl -A | grep -i seq".
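
When reading the sequence numbers off the tcpdump, keep in mind
that "later" is judged modulo 2^32, so a numerically smaller
value can still be the later one after a wrap.  A minimal sketch
of that comparison, in the spirit of the SEQ_GT macro found in
BSD-derived stacks (the seq_gt name and the sample values are
assumptions for illustration, and this is not the full test the
kernel applies to an incoming SYN):

```python
MASK32 = 0xFFFFFFFF  # sequence numbers are 32-bit unsigned values


def seq_gt(a: int, b: int) -> bool:
    """True if sequence number a is later than b, modulo 2**32.

    Equivalent to treating (a - b) as a signed 32-bit integer and
    checking that it is positive.
    """
    diff = (a - b) & MASK32
    return diff != 0 and diff < 0x80000000


# Hypothetical values: the last sequence number seen before the
# reboot, and the initial sequence number in the SYN after it.
last_seen = 0x9000_0000
new_isn = 0x1000_0000
print("new SYN is later than old connection:", seq_gt(new_isn, last_seen))
```

If the comparison comes out false for the SYNs in your trace,
the client has come back with a number at or below the one the
server remembers.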

NB: FreeBSD also does not reset connections that are in TIME_WAIT
on the server when it gets packets from the same IP/port on the
client, even though those connections are dead.  Doing the reset
is a common hack (NT does this by default, and so does Solaris),
but it opens you up to connection force-down attacks on active
connections, if your network is improperly firewalled.


> One more problem is in nfsd, if I set it to use udp only it starts
> eating all cpu cycles it can get, but only the master process.
> Trussing the process shows no system calls whatsoever being performed.

The I/O daemons make a system call and never return to user
space.  To track down this problem, truss is of no use: you
must use DDB in the kernel (or remote kernel debugging, if
you have two systems available: see the FreeBSD Developer's
Handbook), and find out what it's doing in the kernel when
this happens... I suspect that you are having one of the
problems above, and are being packet-flooded by the clients,
when they get no response, or at least none they like, from
the server.

> BTW This is -current built yesterday (Oct 13).

You may also want to try 4.3 or 4.4 instead.

> PS Snoop logs or tcpdump logs are available for those who know what
> to look for...

I'll look at them if they are up on a web site, but not if
you mail them, so _DON'T_ mail them to me!

-- Terry
