Date:      Thu, 08 May 2003 15:39:08 +0200
From:      "Ian Freislich" <ianf@za.uu.net>
To:        Lars Köller <Lars.Koeller@Uni-Bielefeld.DE>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Please, Urgent: Need ideas/help to solve PR bin/51586 
Message-ID:  <43122.1052401148@wcom.com>
In-Reply-To: Your message of "Thu, 08 May 2003 13:46:16 +0200." <200305081146.h48BkHP13996@rayadm.hrz.uni-bielefeld.de>
References:  <200305081146.h48BkHP13996@rayadm.hrz.uni-bielefeld.de> 

Lars wrote:
> > rresvport_af(3) returns this error because I suspect that it thinks
> > this address is already in use, perhaps because the address/port
> > pair is in TIME_WAIT, although I don't have time to test this
> > suspicion and my network programming and protocol experience is not
> > good enough to say this is the case outright without testing.
> 
> NO,NO! Netstat says nothing about that. Even I tune msl time to go out
> of TIME_WAIT very fast (only intranet connection on same switch!).
> The ethereal dump in the PR shown, that an initial communication takes
> place, but the final ACK to establish the connection fails!

Interesting.  I set up rshd and inetd exactly like you did and ran
your test script and it broke in almost exactly the same way it did
for you:

while true
do
	/usr/bin/rsh brane -l ianf pwd; ret=$?
	if [ "$ret" != "0" ]
	then
		echo "Return Code: $ret"
		break
	fi
done

It loops several hundred times and then immediately prints:

/usr/home/ianf
/usr/home/ianf
/usr/home/ianf
select: protocol failure in circuit setup
Return Code: 1

At this point on the server 'brane' I get the following in /var/log/messages:
May  8 14:23:10 brane rshd[16886]: can't get stderr port: Can't assign requested address

This message is logged by rshd when it is unable to open the
connection for stderr back to the originating rsh client.  Have you
turned on net.inet.tcp.blackhole=2, which would result in TCP RSTs for
closed ports not being sent? What is the output of 'netstat
-anf inet |grep -v TIME_WAIT' on machine2 after you get the timeout
connecting to machine2?  Is the tcp *.514 LISTEN line missing after
you get the timeout?  What do you get in your messages file on
machine2 (the one running the rsh server)?  I suspect that you're
not getting a TCP RST after inetd silently terminated the shell
service because of rshd's exit code, so your connection timed out.

> > (/usr/src/libexec/rshd, apply this, make and make install the patched rshd)
> > --- rshd.c.orig Thu May  8 12:55:46 2003
> > +++ rshd.c      Thu May  8 12:43:31 2003
> > @@ -296,7 +296,7 @@
> >                 s = rresvport_af(&lport, af);
> >                 if (s < 0) {
> >                         syslog(LOG_ERR, "can't get stderr port: %m");
> > -                       exit(1);
> > +                       exit(0);
> >                 }
> >                 if (port >= IPPORT_RESERVED ||
> >                     port < IPPORT_RESERVED/2) {
> >
> 
> > I know this is a horrible solution and shouldn't be committed, but
> > at least you have a work-around so you can get your virus scanner
> > farm up in the meantime while someone fixes this properly.
> 
> This doesn't help, cause the port can be reserved by the rshd. The
> problem is the establishing of the connection, so this is not the right
> place in the source.

Which port is reserved by rshd?  An incoming connection is established
on 514.  rshd reads a port number off that connection and initiates
a connection back to the originator on the specified port.  Both
these connections need to be established for the shell service to
come up.  I'm not sure that I trust the tcpdump in your PR because
I tried to dump the entire run from the script on both my test
servers and the two dumps didn't match and some sequences were out
of order.  Only when I dumped the packets to a file and used tcpdump
to read the file did the dumps from each server match.
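
To illustrate the "reads a port number off that connection" step:
as far as I understand the protocol, the client sends the stderr
port as ASCII decimal digits terminated by a NUL byte at the start
of the port-514 connection, and a port of 0 means no stderr channel
is wanted.  The sketch below is not the rshd source.  The names
parse_stderr_port and RESERVED_PORT_LIMIT are made up for this
example, and the range check simply mirrors the IPPORT_RESERVED test
visible in the patch context quoted above.

```c
#include <stddef.h>

/* Assumed to match IPPORT_RESERVED (1024); defined locally so the
 * sketch compiles on its own. */
#define RESERVED_PORT_LIMIT 1024

/*
 * Parse the NUL-terminated ASCII decimal port field from the first
 * `len` bytes received on the control connection.
 *
 * Returns the port number (0 = client wants no stderr channel), or
 * -1 if the field is malformed, unterminated, or outside the
 * reserved range [RESERVED_PORT_LIMIT/2, RESERVED_PORT_LIMIT).
 */
int
parse_stderr_port(const unsigned char *buf, size_t len)
{
	int port = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		if (buf[i] == '\0')
			break;			/* end of the port field */
		if (buf[i] < '0' || buf[i] > '9')
			return -1;		/* not a decimal digit */
		port = port * 10 + (buf[i] - '0');
		if (port > 65535)
			return -1;		/* cannot be a TCP port */
	}
	if (i == len)
		return -1;			/* terminating NUL never arrived */
	if (port != 0 &&
	    (port >= RESERVED_PORT_LIMIT || port < RESERVED_PORT_LIMIT / 2))
		return -1;			/* second port not reserved */
	return port;
}
```

In the good session below, the client's first 5-byte push
(P 1:6(5)) would be consistent with "1000" plus the NUL, which fits
the connection back to port 1000, although the dump does not show
the payload itself.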

Here's a good rsh session:

15:04:31.944902 196.7.162.26.1001 > 196.7.162.25.514: S 242763540:242763540(0) win 65535 <mss 1460,nop,wscale 1,nop,nop,timestamp 483614 0> (DF)
15:04:31.944965 196.7.162.25.514 > 196.7.162.26.1001: S 1769914383:1769914383(0) ack 242763541 win 57344 <mss 1460,nop,wscale 0,nop,nop,timestamp 14908587 483614> (DF)
15:04:31.945271 196.7.162.26.1001 > 196.7.162.25.514: . ack 1 win 33304 <nop,nop,timestamp 483614 14908587> (DF)
15:04:31.945572 196.7.162.26.1001 > 196.7.162.25.514: P 1:6(5) ack 1 win 33304 <nop,nop,timestamp 483614 14908587> (DF)
15:04:31.945600 196.7.162.25.514 > 196.7.162.26.1001: . ack 6 win 57915 <nop,nop,timestamp 14908587 483614> (DF)
15:04:31.952264 196.7.162.25.929 > 196.7.162.26.1000: S 206573132:206573132(0) win 57344 <mss 1460,nop,wscale 0,nop,nop,timestamp 14908588 0> (DF)
15:04:31.952525 196.7.162.26.1000 > 196.7.162.25.929: S 740063972:740063972(0) ack 206573133 win 65535 <mss 1460,nop,wscale 1,nop,nop,timestamp 483615 14908588> (DF)
15:04:31.952560 196.7.162.25.929 > 196.7.162.26.1000: . ack 1 win 57920 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.953030 196.7.162.26.1001 > 196.7.162.25.514: P 6:11(5) ack 1 win 33304 <nop,nop,timestamp 483615 14908587> (DF)
15:04:31.953064 196.7.162.25.514 > 196.7.162.26.1001: . ack 11 win 57915 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.953316 196.7.162.26.1001 > 196.7.162.25.514: P 11:20(9) ack 1 win 33304 <nop,nop,timestamp 483615 14908588> (DF)
15:04:31.953334 196.7.162.25.514 > 196.7.162.26.1001: . ack 20 win 57911 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.954560 196.7.162.25.514 > 196.7.162.26.1001: P 1:2(1) ack 20 win 57920 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.954787 196.7.162.26.1001 > 196.7.162.25.514: . ack 2 win 33303 <nop,nop,timestamp 483615 14908588> (DF)
15:04:31.958429 196.7.162.25.514 > 196.7.162.26.1001: P 2:17(15) ack 20 win 57920 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.958516 196.7.162.25.514 > 196.7.162.26.1001: F 17:17(0) ack 20 win 57920 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.958697 196.7.162.26.1001 > 196.7.162.25.514: . ack 17 win 33296 <nop,nop,timestamp 483615 14908588> (DF)
15:04:31.958795 196.7.162.26.1001 > 196.7.162.25.514: . ack 18 win 33296 <nop,nop,timestamp 483615 14908588> (DF)
15:04:31.959146 196.7.162.25.929 > 196.7.162.26.1000: F 1:1(0) ack 1 win 57920 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.959440 196.7.162.26.1000 > 196.7.162.25.929: . ack 2 win 33304 <nop,nop,timestamp 483616 14908588> (DF)
15:04:31.961198 196.7.162.26.1001 > 196.7.162.25.514: F 20:20(0) ack 18 win 33304 <nop,nop,timestamp 483616 14908588> (DF)
15:04:31.961239 196.7.162.25.514 > 196.7.162.26.1001: . ack 21 win 57920 <nop,nop,timestamp 14908589 483616> (DF)
15:04:31.961303 196.7.162.26.1000 > 196.7.162.25.929: F 1:1(0) ack 2 win 33304 <nop,nop,timestamp 483616 14908588> (DF)
15:04:31.961321 196.7.162.25.929 > 196.7.162.26.1000: . ack 2 win 57919 <nop,nop,timestamp 14908589 483616> (DF)

And here's the last one that failed:

15:04:31.984458 196.7.162.26.999 > 196.7.162.25.514: S 3911362959:3911362959(0) win 65535 <mss 1460,nop,wscale 1,nop,nop,timestamp 483618 0> (DF)
15:04:31.984514 196.7.162.25.514 > 196.7.162.26.999: S 834974100:834974100(0) ack 3911362960 win 57344 <mss 1460,nop,wscale 0,nop,nop,timestamp 14908591 483618> (DF)
15:04:31.984863 196.7.162.26.999 > 196.7.162.25.514: . ack 1 win 33304 <nop,nop,timestamp 483618 14908591> (DF)
15:04:31.985141 196.7.162.26.999 > 196.7.162.25.514: P 1:5(4) ack 1 win 33304 <nop,nop,timestamp 483618 14908591> (DF)
15:04:31.985165 196.7.162.25.514 > 196.7.162.26.999: . ack 5 win 57916 <nop,nop,timestamp 14908591 483618> (DF)
15:04:31.992888 196.7.162.25.514 > 196.7.162.26.999: F 1:1(0) ack 5 win 57920 <nop,nop,timestamp 14908592 483618> (DF)
15:04:31.993164 196.7.162.26.999 > 196.7.162.25.514: . ack 2 win 33304 <nop,nop,timestamp 483619 14908592> (DF)
15:04:31.993698 196.7.162.26.999 > 196.7.162.25.514: F 5:5(0) ack 2 win 33304 <nop,nop,timestamp 483619 14908592> (DF)
15:04:31.993737 196.7.162.25.514 > 196.7.162.26.999: . ack 6 win 57920 <nop,nop,timestamp 14908592 483619> (DF)

You'll notice the absence of the second SYN from 196.7.162.25 to
196.7.162.26 and instead 196.7.162.25 immediately sends a FIN.  It
was at this point that rshd couldn't get the second port and
terminated the connection.
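
If you want to check a saved dump for this mechanically, a rough
sketch along these lines could flag whether the server ever
attempted the connection back.  It assumes the tcpdump text layout
shown above ("time src.port > dst.port: flags ..."), which other
tcpdump versions may not produce, and is_active_open_from is a name
invented for this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if `line` is a bare SYN (an active open, with no ACK)
 * whose source address is `ip`, otherwise 0.
 */
int
is_active_open_from(const char *line, const char *ip)
{
	char src[64], flags[8];
	size_t iplen = strlen(ip);

	/* Skip the timestamp, keep "addr.port", skip "dst.port:",
	 * keep the flags column ("S", "P", "F" or "."). */
	if (sscanf(line, "%*s %63s > %*s %7s", src, flags) != 2)
		return 0;

	/* The source must be exactly `ip` followed by ".port". */
	if (strncmp(src, ip, iplen) != 0 || src[iplen] != '.')
		return 0;

	/* A SYN-ACK also shows "S" but carries an "ack" field. */
	return strcmp(flags, "S") == 0 && strstr(line, " ack ") == NULL;
}
```

Run over each line of the two sessions above with the server
address 196.7.162.25, this should match the SYN from port 929 in
the good session and nothing in the failed one.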

> However, the mailserver, which calls the rsh client is a solaris
> 8 machine :-(

That's not a problem because I believe the problem to be in rshd
and most likely in libc in rresvport_af(3).

Ian


