Date:      Thu, 2 Aug 2007 06:31:29 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Dag-Erling Smørgrav <des@des.no>
Cc:        "Chauncey N. Menefee" <cmenefee@prism-grp.com>, freebsd-gnats-submit@freebsd.org, freebsd-i386@freebsd.org
Subject:   Re: i386/115054: NTP errors out on startup but restart of NTP fixes problem
Message-ID:  <20070802060947.O76862@delplex.bde.org>
In-Reply-To: <86odhrlb18.fsf@ds4.des.no>
References:  <200707301716.l6UHG3eD020378@www.freebsd.org> <20070731072434.F5028@besplex.bde.org> <86odhrlb18.fsf@ds4.des.no>

On Wed, 1 Aug 2007, Dag-Erling Smørgrav wrote:

> Bruce Evans <brde@optusnet.com.au> writes:
>> Several versions of FreeBSD have annoying behaviour for network
>> startup, involving the network not actually being up when ifconfig
>> returns and subsequent different mishandling of this by various
>> utilities.  [...]
>> This problem seems to get worse with each release of FreeBSD and/or
>> with newer NICs.  I never noticed fxp or even ed or rl NICs.  Now it
>> is barely noticeable with fxp and very noticeable with sk, bge and em
>> NICs.
>
> I have never seen this with any of the cards I've used (xl, fxp, rl, re,
> sis, bge, sk, msk and probably others, in no particular order).
>
> Perhaps there is a hardware issue involved?  Does the problem occur if
> you hardcode the link speed instead of relying on autonegotiation?

No difference.  I thought it might be the cheap switch, but going
direct makes no difference except to break hard-coding the link speed
for bge.  The following is with bge (1Gbps capable but reduced to
100baseTX full-duplex by autonegotiation) under -current, connected
to fxp (100baseTX full-duplex by autonegotiation or hard-coded) under
FreeBSD-~5.2:

%%%
ttyv0:root@besplex:~> ifconfig bge0 down; time ifconfig bge0 up; time ping -c1 delplex; time route get delplex; time route get delplex

         0.48 real         0.00 user         0.47 sys
PING delplex.bde.org (192.168.2.4): 56 data bytes
Aug  2 05:57:49 besplex kernel: bge0: link state changed to DOWN
Aug  2 05:57:51 besplex kernel: bge0: link state changed to UP

--- delplex.bde.org ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
        11.01 real         0.00 user         0.00 sys
    route to: delplex
destination: delplex
   interface: bge0
       flags: <UP,HOST,DONE,LLINFO,WASCLONED>
  recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
        0         0         0         0         0         0      1500      1191
         0.00 real         0.00 user         0.00 sys
    route to: delplex
destination: delplex
   interface: bge0
       flags: <UP,HOST,DONE,LLINFO,WASCLONED>
  recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
        0         0         0         0         0         0      1500      1191
         0.00 real         0.00 user         0.00 sys
%%%

Under -current, the differences are:
o ifconfig returns after 0.48 seconds instead of after 2+ seconds.  The
   "link state changed to UP" message still takes 2+ seconds altogether.
o The message is now printed to a different unwanted place (using tprintf()
   I think, instead of using printf(), but I want it in stderr).  The above
   output was captured using vidcontrol.
o The timestamps on the messages made by syslogd are almost precise enough
   to show the 2 second delay.
o ping still returns after 11+ seconds, but now it starts about 1.5 seconds
   earlier relative to the UP message, so the 11 seconds may be just ping's
   timeout and not related to UPness.

%%%
ttyv0:root@besplex:~> ifconfig bge0 down; time ifconfig bge0 up; time route get delplex; time route get delplex
         0.48 real         0.00 user         0.47 sys
    route to: delplex
Aug  2 05:58:25 besplex kernel: bge0: link state changed to DOWN
Aug  2 05:58:27 besplex kernel: bge0: link state changed to UP
destination: 192.168.2.0
        mask: 255.255.255.0
   interface: bge0
       flags: <UP,DONE,CLONING>
  recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
        0         0         0         0         0         0      1500        -7
         5.26 real         0.00 user         0.00 sys
    route to: delplex
destination: delplex
   interface: bge0
       flags: <UP,HOST,DONE,LLINFO,WASCLONED>
  recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
        0         0         0         0         0         0      1500      1196
         0.00 real         0.00 user         0.00 sys
%%%

The first "route get" still returns after 5+ seconds, but now it starts
about 1.5 seconds earlier relative to the UP message, so the 5 seconds
may be just route's timeout and not related to UPness.
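
Since the link only comes up about 2 seconds after "ifconfig up"
returns, anything started in that window (ping, route, ntpd) may see a
dead network.  A rough sketch of polling for the link first follows.
It is not something tested in this thread: the function name
wait_for_link, the IFCONFIG override and the 15-try default are
made up for illustration, and it assumes the driver reports
"status: active" in ifconfig output once the link is up.

```shell
#!/bin/sh
# Sketch only: poll until an interface reports an active link, or give up.
# IFCONFIG is a hypothetical override so the function can be exercised
# without real hardware; it defaults to the normal ifconfig command.
: "${IFCONFIG:=ifconfig}"

wait_for_link() {
    ifname=$1
    tries=${2:-15}      # assumed default: roughly 15 seconds of polling
    while [ "$tries" -gt 0 ]; do
        # Assumes the media status line reads "status: active" when up.
        if $IFCONFIG "$ifname" 2>/dev/null | grep -q 'status: active'; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1            # link never became active within the allowed tries
}
```

It could be used as "ifconfig bge0 up && wait_for_link bge0 && ping -c1
delplex".  Whether an active link is enough is not settled by the
output above, since the delays seen may be just the timeouts in ping
and route.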

The -current bge driver is acting identically to the ~5.2 bge driver.
Userland is ~5.2 in all tests.  One reason I didn't report this earlier is
that it might be due to the ~5.2 userland and I don't have time to test
with a full -current userland, but ifconfig and route(8) seem to be portable
enough to mostly work with both kernels.  route(8) has a known problem
concerning the base for the expire time (it was broken for a long time
in -current due to the change to mono-time, but this causes few problems).

Bruce