Date: Tue, 22 Jan 2002 15:56:19 -0800
From: Terry Lambert <tlambert2@mindspring.com>
To: Peter Wemm <peter@wemm.org>
Cc: Andrew Gallatin <gallatin@cs.duke.edu>, alpha@FreeBSD.ORG
Subject: Re: Is anybody actually able to netboot at the moment?
Message-ID: <3C4DFC23.F5391D2D@mindspring.com>
References: <20020122234007.1983E3BAD@overcee.wemm.org>

Peter Wemm wrote:
> > Actually, there's a bug in the one's complement case on the
> > FreeBSD checksum calculation, sometimes.  I was able to see
> > incorrect checksums on a number of packets.  I think it's in
> > the incremental update code, but since it doesn't seem to
> > stop things from working, I never tracked down the source of
> > the Ethereal traces where I saw this.
>
> Terry, what crack are you smoking this time?  We don't do incremental
> checksums in the libstand code.  That stuff is as simple and as unoptimized
> as it gets.

The bug is on transmit, not on receive, Peter.  8-).  If receive-side
validation were working, packets with bad checksums would stop the load.

To see if this is the problem, it would be wise to do a dump of a
failed boot attempt with Ethereal, which flags checksum errors on
packets on the wire.

As always, this may or may not be the problem at all, but in the
spirit of Sherlock Holmes...

> The alpha problems were in boot1 (the 7.5K loader) and that shares no
> code with netboot at all.

OK.  I typically don't use netboot, so I can believe this...

> I have experimented with alignment in the ethernet frame send code.. it
> seems that we are trying to send with 2-byte alignment for the bootp case.
> Fixing it doesn't seem to make much difference.  However, I wonder if SRM
> is doing some length rounding or something because the lengths are not 4 or
> 8 byte multiples for the bootp queries but are for the working rarp
> queries.  However, even that doesn't make sense because it sometimes works.
> I'm more suspicious of interactions between the tulip cards when being
> driven by SRM and the switch at the moment.

OK, another shot in the dark.

The first 16 bit NE1000 cards had an interesting problem: unless you
sent an even number of bus transfer units, the card would do an even
transfer anyway, so the last two bytes would be byte-swapped when you
went to checksum them, and you'd sum some garbage byte instead of the
right byte.
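
To make the odd-length failure concrete, here is a minimal sketch in C.
It assumes an RFC 1071 style Internet checksum over big-endian 16-bit
words.  The names inet_cksum and cksum_even_transfer are made up for
illustration; this is not the libstand code or any NE1000 driver, and
the second function is only a simplified model of "the trailing byte of
an even-sized transfer ends up in the sum".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One's complement Internet checksum in the style of RFC 1071.
 * The data is summed as big-endian 16-bit words.  An odd trailing
 * byte is treated as the high byte of a word whose low byte is zero.
 * Assumes packet-sized buffers, so the 32-bit accumulator cannot
 * overflow before the carries are folded.
 */
static uint16_t
inet_cksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(data != NULL || len == 0);

	/* Sum all complete 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* Odd length: pad the last byte with a zero low byte. */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* Fold the carries back in (end-around carry). */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}

/*
 * Hypothetical model of the failure: the hardware moved an even
 * number of bytes, and the checksum is taken over the whole transfer
 * instead of over payload_len.  Whatever sits in the extra byte is
 * summed.  The buffer must hold payload_len rounded up to even.
 */
static uint16_t
cksum_even_transfer(const uint8_t *buf, size_t payload_len)
{
	size_t xfer_len = (payload_len + 1) & ~(size_t)1;

	return inet_cksum(buf, xfer_len);
}
```

With a 3-byte payload 45 00 01 followed by a stray AA in the fourth
byte, the two functions disagree.  If the fourth byte is an explicit
zero pad, they agree, since a zero byte adds nothing to a one's
complement sum.
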
The fix for this was to always send an even number of bytes, even if
the payload was an odd length.

Maybe this is a byte-order problem?

If it is, the place to fix it is on the server (again), by making it
pad packets out to a 2 (or 4 or 8?) byte boundary so that the received
packets are transferred as a unit, but only the payload portion is
checked.

This "fix" would only apply if the packets sent on the wire were good
in both directions (i.e. it's still time for the Ethereal trace by an
otherwise uninvolved third party machine).

Hope this helps... I'm waving my hands as fast as I can... ;^)

-- Terry