Date:      Tue, 21 Jan 1997 16:24:35 -0500 (EST)
From:      Bill Paul <wpaul@skynet.ctr.columbia.edu>
To:        mark@grondar.za (Mark Murray)
Cc:        freebsd-current@freebsd.org
Subject:   Re: VM bogon? Was: Re: NIS breakage
Message-ID:  <199701212124.QAA06983@skynet.ctr.columbia.edu>
In-Reply-To: <199701212032.WAA10713@grackle.grondar.za> from "Mark Murray" at Jan 21, 97 10:31:55 pm

Of all the gin joints in all the towns in all the world, Mark Murray had 
to walk into mine and say:
[chop]

> I am still having one other YP problem. I thought it was related to the
> portmap crashes, but it is clear now that this is not the case:
> 
> I get these on the slave server every time I try to push the maps (one per
> map):
> 
> Jan 21 21:40:12 grackle ypxfr[5902]: call to rpc.ypxfrd failed: RPC: Timed out master.passwd.byname
> Jan 21 21:40:19 grackle ypxfr[5903]: call to rpc.ypxfrd failed: RPC: Timed out master.passwd.byuid
> Jan 21 21:40:24 grackle ypxfr[5904]: call to rpc.ypxfrd failed: RPC: Timed out passwd.byname
> Jan 21 21:40:28 grackle ypxfr[5905]: call to rpc.ypxfrd failed: RPC: Timed out passwd.byuid
> Jan 21 21:40:33 grackle ypxfr[5906]: call to rpc.ypxfrd failed: RPC: Timed out netid.byname

Hm. What's supposed to happen is that ypxfr on the slave will attempt
to use rpc.ypxfrd to transfer the maps, and if that fails it falls back
to the 'old' method (it basically uses yp_all() to suck over the contents
of the map directly from ypserv and constructs a copy of the map).
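
The try-fast-then-fall-back behaviour described above can be sketched
roughly as follows. This is only an illustration, not the actual ypxfr
source: the names transfer_map, xfer_methods, stub_xfer_timeout and
stub_xfer_ok are made up for the example, and the two stubs merely stand
in for the rpc.ypxfrd transfer and the yp_all()-based copy.

```c
#include <stddef.h>

/* Hypothetical transfer callback: returns 0 on success, nonzero on failure. */
typedef int (*xfer_fn)(const char *map);

enum xfer_result { XFER_FAST, XFER_FALLBACK, XFER_FAILED };

/* Hypothetical pair of methods: the fast path and the old-style path. */
struct xfer_methods {
    xfer_fn fast;      /* stands in for the rpc.ypxfrd transfer */
    xfer_fn fallback;  /* stands in for the yp_all()-based copy */
};

/* Try the fast method first; only if it fails, try the fallback. */
enum xfer_result transfer_map(const struct xfer_methods *m, const char *map)
{
    if (m->fast != NULL && m->fast(map) == 0)
        return XFER_FAST;
    if (m->fallback != NULL && m->fallback(map) == 0)
        return XFER_FALLBACK;
    return XFER_FAILED;
}

/* Stand-in stubs for illustration: one always fails, one always succeeds. */
int stub_xfer_timeout(const char *map) { (void)map; return -1; }
int stub_xfer_ok(const char *map) { (void)map; return 0; }
```

With stub_xfer_timeout as the fast method and stub_xfer_ok as the
fallback, transfer_map() reports XFER_FALLBACK, which corresponds to the
situation in the log above: the map still arrives, just by the slower
route.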

I'm not sure why rpc.ypxfrd is timing out, unless it's wedged for some
reason. This message is coming from the ypxfrd_getmap.c module in
/usr/libexec/ypxfr. Apparently it's able to establish an RPC handle to
rpc.ypxfrd, but the clnt_call() to initiate the map transfer times out.

The sucky thing about bugs like this is that they _never_ happen to me.
Somebody always finds some new and strange way to configure FreeBSD that
makes all my carefully crafted code fall apart. And of course the machine
where it fails is always thousands of miles away and I have to debug
by remote control. (Not that I don't want to fix the bugs, but it would
be much faster if I could temporarily put my brain in your head. Messy,
but faster. :)

> (The map name at the end is my debugging code).
> 
> I did an rpcinfo of the master:
> 
> $ rpcinfo -p grunt
>    program vers proto   port
>     100000    2   tcp    111  portmapper
>     100000    2   udp    111  portmapper
>     100004    1   udp   1021  ypserv
>     100004    2   udp   1021  ypserv
>     100004    1   tcp   1023  ypserv
>     100004    2   tcp   1023  ypserv
>  600100069    1   udp   1015
>  600100069    1   tcp   1022
   ^^^^^^^^^

This is actually the FreeBSD rpc.ypxfrd program number. The Sun ypxfrd
program number is 100069. The FreeBSD value falls into the 'user-assigned'
range. (I haven't tried registering the FreeBSD ypxfrd protocol with Sun. :)
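
For anyone who wants to check which range a program number falls into,
here is a small sketch based on the range boundaries given in the RPC
specification (RFC 1831). The enum and the function name
classify_rpc_prog are invented for this example; they are not part of
any RPC library.

```c
#include <stdint.h>

/* RPC program number ranges as laid out in RFC 1831. */
enum rpc_range {
    RPC_RANGE_ASSIGNED,   /* 0x00000000 - 0x1fffffff: centrally assigned */
    RPC_RANGE_USER,       /* 0x20000000 - 0x3fffffff: defined by the user */
    RPC_RANGE_TRANSIENT,  /* 0x40000000 - 0x5fffffff: transient */
    RPC_RANGE_RESERVED    /* 0x60000000 - 0xffffffff: reserved */
};

/* Hypothetical helper: map a program number to its range. */
enum rpc_range classify_rpc_prog(uint32_t prog)
{
    if (prog <= 0x1fffffffU)
        return RPC_RANGE_ASSIGNED;
    if (prog <= 0x3fffffffU)
        return RPC_RANGE_USER;
    if (prog <= 0x5fffffffU)
        return RPC_RANGE_TRANSIENT;
    return RPC_RANGE_RESERVED;
}
```

Here classify_rpc_prog(600100069) yields RPC_RANGE_USER, while the Sun
number 100069 and ypserv's 100004 both land in RPC_RANGE_ASSIGNED.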

> ...and see no ypxfrd in there. Is that the problem? I get the impression
> from RTFM and RTSL that ypxfr reverted to a slower method than rpc.ypxfrd
> to get the maps.

Correct.

> Could this be because rpc.ypxfrd is not mentioned in rpc?

No, it's there.

> Funny - there are no errors syslogged.

There's nothing on the master in /var/log/messages? Weird.
 
> if I make the slave the master (and vice-versa), I get the same problem -
> the slave times out reading rpc.ypxfrd.

This could be some kind of multihoming weirdness.

> My configuration:
> 
> master: the now-famous amd386dx40, 3.0current(real recent) 8MB, ed0, lo0.
> slave: i486dx50, 3.0current(real recent) 16MB, ed0, lo0.

Okay, maybe not. Can you try to manually run /usr/libexec/ypxfr on the
slave and try to read a map from the master? What should happen is that
rpc.ypxfrd will spawn a child to handle the transfer, so you should see
a second rpc.ypxfrd appear on the master. One thing that can happen is
if you use yppush in 'parallel' mode and specify a large number of jobs,
you can cause rpc.ypxfrd to fork() too many children. In this case, ypxfr
will again fail over to the old method of transferring maps. The default
/var/yp/Makefile doesn't use yppush in parallel mode though.
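
To illustrate the 'too many children' case, here is a generic sketch of
a forking server that refuses to spawn past a limit. It is not the
rpc.ypxfrd source: spawn_transfer_child, noop_transfer, the max_children
parameter and the return codes are all assumptions made for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of children we believe are still running. */
static int active_children = 0;

/* Collect any children that have already exited, without blocking. */
static void reap_children(void)
{
    while (active_children > 0 && waitpid(-1, NULL, WNOHANG) > 0)
        active_children--;
}

/*
 * Hypothetical helper: fork a child to run handler() unless the limit
 * has been reached. Returns 1 if a child was started, 0 if the request
 * was refused because of the limit, -1 if fork() itself failed.
 */
int spawn_transfer_child(void (*handler)(void), int max_children)
{
    pid_t pid;

    reap_children();
    if (active_children >= max_children)
        return 0;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        handler();   /* child: do the transfer work */
        _exit(0);    /* never return into the parent's code path */
    }
    active_children++;
    return 1;
}

/* Stand-in handler for illustration; a real one would send the map. */
void noop_transfer(void) { }
```

A refusal (return value 0) is the point at which a client such as ypxfr
would see its request fail and switch to the old transfer method.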

rpc.ypxfrd is really a very simple program. I'm not sure where it could
be going wrong. The ypxfrd_getmap() routine in ypxfr isn't that complex
either. Can you run a tcpdump to see if there's actually any data being
exchanged between ypxfr and rpc.ypxfrd? Better still, can you do a ktrace
on rpc.ypxfrd on the master to see if it actually receives the request
from the slave? Hm... seems to work for me over here. :(
 
> > Okay. Does somebody else want to check this stuff over and commit it to
> > -current (and 2.2 I would think)? I can do it but I don't know what the
> > hell I'm looking at.
> 
> I think BDE and JohnD better check this out first.

I agree.

-Bill

-- 
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================


