Date:      Mon, 11 Feb 2008 18:51:58 +1100
From:      Robert Jenssen <robertjenssen@ozemail.com.au>
To:        Brooks Davis <brooks@freebsd.org>
Cc:        net@freebsd.org
Subject:   Re: dhclient conflict between /sbin/devd and /etc/rc.d/netif ?
Message-ID:  <200802111851.58155.robertjenssen@ozemail.com.au>
In-Reply-To: <20080211010626.GA69153@lor.one-eyed-alien.net>
References:  <200802111137.21550.robertjenssen@ozemail.com.au> <20080211010626.GA69153@lor.one-eyed-alien.net>

Hi Brooks and all,

On Mon, 11 Feb 2008 12:06:26 pm you wrote:
> On Mon, Feb 11, 2008 at 11:37:21AM +1100, Robert Jenssen wrote:
> > Hi,
> > Every so often I have trouble connecting rt2560 based PCI wireless network 
> > card to my wireless router/access point. Typically I get:
> > 
> > # sudo /etc/rc.d/netif restart ral0
> > Starting wpa_supplicant.
> > ral0: no link .............. giving up
> > ral0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> >         ether 00:11:50:63:cd:47
> >         media: IEEE 802.11 Wireless Ethernet autoselect (DS/1Mbps)
> >         status: no carrier
> > 
> > Even though there seems to be plenty of signal power:
> > 
> > # sudo ifconfig ral0 list scan
> > SSID            BSSID              CHAN RATE   S:N     INT CAPS
> > xxxxxxx...      00:xx:xx:xx:xx:xx   10   54M -74:-95  100 EPS  WPA
> > 
> > Recently I noticed that sometimes, after the above "netif restart" fails,
> > the ral0 interface "automagically" comes up anyway. Then dhclient is owned
> > by /sbin/devd. The default devd.conf starts dhclient for both ethernet and 
> > PCI-cardbus devices. Is it a good idea for both /sbin/devd 
> > and /etc/rc.d/netif to start a dhclient on ral0 at about the same time? 

In the "magical" case above, I think the dhclient startup from
/etc/rc.d/netif, called by rc, fails. Later, /sbin/devd calls /etc/rc.d/netif
again via /etc/pccard_ether:pccard_ether_start(), and that call succeeds.

The rc system uses rcorder to determine the order in which to run the rc 
scripts. On my system rcorder shows devd fairly early in the list. The 
devd.conf file calls a number of rc scripts. So far as I can see /sbin/devd 
doesn't check that these are called in the order listed by rcorder. Is this a 
problem? 

I have disabled devd (and set the moused port explicitly in rc.conf) and done
some simple tests on /usr/src/sbin/dhclient/dhclient.c. In particular, at line
365 main() allows a hard-coded maximum of 10 seconds for the call to
interface_link_status() to succeed. I changed this to 20 seconds, added a
printout, and ran /etc/rc.d/netif restart a few times with rc_debug="YES". The
results were
15 15 5 5 5 5 5 15 15 5 5 5 5 5 21 (timed out!) 5 5 and 5 seconds. Presumably
the (10n+5)-second pattern comes from a magic number inside my wireless card
or router. I'm going to set the hard-coded value to 25 seconds. Would it be
possible for you to commit a similar change? Here is a patch:

*** src/sbin/dhclient/dhclient.c        2007-02-10 04:50:26.000000000 +1100
--- /usr/src/sbin/dhclient/dhclient.c   2008-02-11 18:09:25.000000000 +1100
***************
*** 360,370 ****
                fflush(stderr);
                sleep(1);
                while (!interface_link_status(ifi->name)) {
                        fprintf(stderr, ".");
                        fflush(stderr);
!                       if (++i > 10) {
                                fprintf(stderr, " giving up\n");
                                exit(1);
                        }
                        sleep(1);
                }
--- 360,370 ----
                fflush(stderr);
                sleep(1);
                while (!interface_link_status(ifi->name)) {
                        fprintf(stderr, ".");
                        fflush(stderr);
!                       if (++i > 25) {
                                fprintf(stderr, " giving up\n");
                                exit(1);
                        }
                        sleep(1);
                }


(I used "diff -C 5" so that the sleep() calls show up in the context.) Rather
than dhclient.c counting off 10 seconds and calling exit(), as shown above,
shouldn't the dhclient.conf "timeout" configuration item cover this situation?
I see that PR bin/98577 wants this hard-coded timeout reduced or made
adjustable via dhclient.conf.
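
To make the "adjustable" idea concrete, here is a rough sketch of the same
wait loop with the limit passed in as a parameter. It is only an illustration
and not dhclient's actual code: wait_for_link(), link_probe_fn, and
fake_probe() are hypothetical names, the probe callback stands in for
interface_link_status(), and how the limit would be read from dhclient.conf is
left out entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical callback type: returns nonzero once the link is up. */
typedef int (*link_probe_fn)(const char *ifname);

/*
 * Same structure as the loop in the patch above, but the retry limit and
 * the polling interval are parameters instead of hard-coded constants.
 * Returns 0 when the link comes up, -1 after giving up (the caller decides
 * whether to exit).
 */
static int
wait_for_link(const char *ifname, link_probe_fn probe, int max_tries,
    unsigned int interval_secs)
{
	int i = 0;

	assert(probe != NULL);
	if (probe(ifname))
		return (0);		/* link already up, nothing to wait for */

	fprintf(stderr, "%s: no link ...", ifname);
	fflush(stderr);
	sleep(interval_secs);
	while (!probe(ifname)) {
		fprintf(stderr, ".");
		fflush(stderr);
		if (++i > max_tries) {
			fprintf(stderr, " giving up\n");
			return (-1);
		}
		sleep(interval_secs);
	}
	fprintf(stderr, " got link\n");
	return (0);
}

/*
 * Stand-in probe for experiments (assumption: not a real driver query).
 * It reports "link up" on the fake_up_after-th call.
 */
static int fake_calls;
static int fake_up_after;

static int
fake_probe(const char *ifname)
{
	(void)ifname;
	return (++fake_calls >= fake_up_after);
}
```

With something like this, the caller could use wait_for_link(ifi->name,
interface_link_status, 25, 1) to get the behaviour of the patched loop, and a
configured value could replace the 25 later without touching the loop itself.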

Best regards,

Rob Jenssen


