Date:      Fri, 27 Jan 2006 14:34:54 -0500
From:      Kurt Miller <lists@intricatesoftware.com>
To:        freebsd-hackers@freebsd.org
Cc:        Daniel Eischen <deischen@freebsd.org>
Subject:   Re: read hang on datagram socket
Message-ID:  <200601271434.54776.lists@intricatesoftware.com>
In-Reply-To: <200601271042.04315.lists@intricatesoftware.com>
References:  <Pine.GSO.4.43.0601270909190.10667-100000@sea.ntplx.net> <200601271042.04315.lists@intricatesoftware.com>

On Friday 27 January 2006 10:42 am, Kurt Miller wrote:
> On Friday 27 January 2006 9:16 am, Daniel Eischen wrote:
> > On Thu, 26 Jan 2006, Kurt Miller wrote:
> > 
> > > On Thursday 26 January 2006 7:26 pm, Daniel Eischen wrote:
> > > >
> > > > The modified version does not hang on 5.2.  Do you have multiple
> > > > interfaces on your 5.4 box?
> > >
> > > No, the 5.4 box is virtually identical to the 6.0 box. I set them both
> > > up at the same time from initial installs for the project.
> > >
> > > truk@freebsd5-4$ ifconfig
> > > lnc0: flags=108843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > >         inet6 fe80::250:56ff:fe40:451a%lnc0 prefixlen 64 scopeid 0x1
> > >         inet 172.16.1.36 netmask 0xffffff00 broadcast 172.16.1.255
> > >         ether 00:50:56:40:45:1a
> > > lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
> > >         inet 127.0.0.1 netmask 0xff000000
> > >         inet6 ::1 prefixlen 128
> > >         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
> > 
> > [ ... ]
> > 
> > > > What happens when you try using non-zero IP addresses and ports?
> > > >
> > >
> > > Setting the ports doesn't affect the problem; however, setting the
> > > addresses does. It really seems like binding to INADDR_ANY only binds
> > > to loopback address 127.0.0.1 and not all the interfaces.
> > >
> > > If sock1 is bound to the hostAddress and sock2 connects to sock1 at
> > > the hostAddress it works ok. If sock1 is bound to INADDR_ANY and sock2
> > > connects to sock1 using INADDR_ANY it works. But any mixture of
> > > using INADDR_ANY with the hostAddress fails.
> > 
> > According to Stevens' Network Programming, when binding to
> > INADDR_ANY, the operating system doesn't assign an address
> > until the first write.  This is unlike the port, where using
> > port 0, an ephemeral port is assigned right away.  I don't
> > have the book handy right now, so I forgot if the INADDR_ANY
> > behavior is only when you have multiple interfaces or not.
> 
> The book I'm using is not that clear about it (Advanced
> Programming in the UNIX Environment). It does say that using
> connect with a datagram socket will receive datagrams only
> from the address specified, which seems related to the problem.
> 
> > > Unfortunately, I don't have control over the addresses, the java
> > > programs do. This particular jck test binds the first socket with
> > > INADDR_ANY (InetAddress.getByName("0.0.0.0")) and connects the second
> > > socket to the first using the hostAddress (InetAddress.getLocalHost()).
> > 
> > You can try sending a byte before getting the address for the
> > port and see if that works.  Do you have anything weird, like
> > not having a default route (gateway)?
> 
> Yes, sending a byte before doing the connect(sock2, &sock1Addres
> does work. However, calling connect/send/read after that fails too.
> The problem appears to be related to sock1's selection of its source
> address, or perhaps the connect call is ignoring the hostAddress and
> using the loopback address. The netstat output leads me to believe
> it is the latter. It is behaving like a mismatch between source
> address of the message and the address enforced by the connect call.
> 
> I've confirmed that the sock1Addr struct is filled in correctly with
> the hostAddress and port of sock1 and sock2Addr is filled in correctly
> with the hostAddress and port of sock2.
> 
> I've got a pretty standard setup. All three machines are using DHCP
> to get their addresses, default route and name servers. I've set
> the dhcp server to give them the same IP addresses each time. Here's
> the routing table for each:
> 
> truk@freebsd6-0$ netstat -r -f inet
> Routing tables
> 
> Internet:
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            172.16.1.1         UGS         0       34   lnc0
> localhost          localhost          UH          0        0    lo0
> 172.16.1/24        link#1             UC          0        0   lnc0
> 172.16.1.1         00:00:24:c2:47:b5  UHLW        2        0   lnc0    671
> 172.16.1.10        00:13:46:c9:0a:5c  UHLW        1        0   lnc0   1103
> 172.16.1.72        00:12:f0:b5:f4:6c  UHLW        1      118   lnc0    961
> 
> truk@freebsd5-4$ netstat -r -f inet
> Routing tables
> 
> Internet:
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            172.16.1.1         UGS         0      112   lnc0
> localhost          localhost          UH          1       19    lo0
> 172.16.1/24        link#1             UC          0        0   lnc0
> 172.16.1.1         00:00:24:c2:47:b5  UHLW        1        0   lnc0     40
> 172.16.1.10        00:13:46:c9:0a:5c  UHLW        0        0   lnc0   1106
> 172.16.1.36        localhost          UGHS        0        7    lo0
> 172.16.1.72        00:12:f0:b5:f4:6c  UHLW        0     3151   lnc0    749
> 
> $ netstat -r -f inet #freebsd4-11
> Routing tables
> 
> Internet:
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            172.16.1.1         UGSc        1        0   lnc0
> localhost          localhost          UH          1        0    lo0
> 172.16.1/24        link#1             UC          2        0   lnc0
> 172.16.1.1         00:00:24:c2:47:b5  UHLW        2        0   lnc0   1200
> 172.16.1.30        localhost          UGHS        0        0    lo0
> 172.16.1.72        00:12:f0:b5:f4:6c  UHLW        1      112   lnc0   1195
> 
> Thanks for the ideas and suggestions.
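
The point raised earlier about INADDR_ANY and port 0 can be checked with
getsockname(). The following is only a minimal sketch, assuming an IPv4
UDP socket; wildcard_bind_name() is a hypothetical helper name and is not
part of the jck test. It binds to INADDR_ANY with port 0 and hands back
whatever local name the kernel reports right after bind().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a UDP socket to INADDR_ANY, port 0, and copy
 * the local name reported by getsockname() into *name.
 * Returns 0 on success, -1 on any error. */
int wildcard_bind_name(struct sockaddr_in *name)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof *name;
    int rc = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_ANY);  /* wildcard address */
    sa.sin_port = htons(0);                  /* ask for an ephemeral port */

    if (bind(fd, (struct sockaddr *)&sa, sizeof sa) == 0 &&
        getsockname(fd, (struct sockaddr *)name, &len) == 0)
        rc = 0;

    close(fd);
    return rc;
}
```

After a successful call the reported port is non-zero while the reported
address is still the wildcard, so getsockname() on such a socket gives
the port but not a usable host address.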

The problem turned out to be related to how DHCP sets up the routing table.
Switching to a fixed address setup changed the routing table, and now the
program works. Go figure. Here's the routing table when using a fixed
address:

Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            172.16.1.1         UGS         0       48   lnc0
localhost          localhost          UH          0        0    lo0
172.16.1/24        link#1             UC          0        0   lnc0
172.16.1.1         00:00:24:c2:47:b5  UHLW        1        0   lnc0   1200
172.16.1.20        00:07:e9:47:0f:f9  UHLW        0        2   lnc0    429
172.16.1.21        00:40:96:39:b6:f9  UHLW        0        3   lnc0   1197
172.16.1.36        00:50:56:40:45:1a  UHLW        0        1    lo0
172.16.1.72        00:12:f0:b5:f4:6c  UHLW        0      637   lnc0   1187

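Notice the difference in the gateway for 172.16.1.36: under DHCP the host
route pointed at localhost (flags UGHS); with the fixed address it shows
the interface's own ethernet address (flags UHLW).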
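
For anyone following along, the mechanism suspected earlier in the thread
(a connected datagram socket only accepts datagrams whose source matches
the connected address) can be seen in isolation. This is a minimal sketch,
not the jck test: it assumes a working loopback interface, uses 127.0.0.1
everywhere so it does not depend on the routing table, and the function
names udp_any(), readable() and connected_udp_filter_demo() are made up
for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket bound to INADDR_ANY with an ephemeral port.
 * On success *name holds the bound port paired with 127.0.0.1, i.e. an
 * address other local sockets can send to. Returns the fd, or -1. */
int udp_any(struct sockaddr_in *name)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof *name;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0 ||
        getsockname(fd, (struct sockaddr *)name, &len) < 0) {
        close(fd);
        return -1;
    }
    name->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return fd;
}

/* Returns 1 if fd has a datagram waiting within timeout_ms, else 0. */
int readable(int fd, int timeout_ms)
{
    struct pollfd p;

    p.fd = fd;
    p.events = POLLIN;
    p.revents = 0;
    return poll(&p, 1, timeout_ms) > 0 && (p.revents & POLLIN);
}

/* sock2 connects to sock1; sock3 is an unrelated sender.
 * Returns a bitmask: bit 0 set if sock1's datagram reached sock2,
 * bit 1 set if sock3's datagram reached sock2; -1 on setup failure. */
int connected_udp_filter_demo(void)
{
    struct sockaddr_in a1, a2, a3;
    int s1 = udp_any(&a1), s2 = udp_any(&a2), s3 = udp_any(&a3);
    int result = -1;

    if (s1 >= 0 && s2 >= 0 && s3 >= 0 &&
        connect(s2, (struct sockaddr *)&a1, sizeof a1) == 0) {
        result = 0;
        /* sock3 is not the connected peer: sock2 should not see this. */
        if (sendto(s3, "x", 1, 0, (struct sockaddr *)&a2, sizeof a2) == 1 &&
            readable(s2, 200))
            result |= 2;
        /* sock1 is the connected peer: sock2 should see this. */
        if (sendto(s1, "y", 1, 0, (struct sockaddr *)&a2, sizeof a2) == 1 &&
            readable(s2, 200))
            result |= 1;
    }
    if (s1 >= 0) close(s1);
    if (s2 >= 0) close(s2);
    if (s3 >= 0) close(s3);
    return result;
}
```

With a working loopback interface connected_udp_filter_demo() should
return 1: the datagram from the connected peer arrives and the one from
the other sender does not. A read that blocks instead of polling would
simply hang in the second case, which is consistent with what the test
program showed when the source address of sock1's datagram apparently did
not match the address given to connect.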

Thanks for all the help and suggestions.

-Kurt


