From: Sean McNeil
To: Oliver Lehmann
Cc: amd64@freebsd.org
Subject: Re: NFS lockup when copying a "special" file
Date: Thu, 13 Jul 2006 12:31:40 -0700
Message-Id: <1152819100.21127.21.camel@triton.mcneil.com>
In-Reply-To: <20060713210902.b393812e.lehmann@ans-netz.de>
List-Id: Porting FreeBSD to the AMD64 platform

On Thu, 2006-07-13 at 21:09 +0200, Oliver Lehmann wrote:
> Sean McNeil wrote:
>
> > I used to have a similar problem and tracked it
> > down to my NIC and hardware checksums. Would this happen to be an
> > if_re device? Can you give ifconfig info and if hardware checksums
> > are on, try your test with them turned off (RXCSUM and TXCSUM). You
> > can try different combos of these if turning off both helps.
>
> Yeah, it is indeed an on-board if_re:
>
> re0: port 0xe800-0xe8ff mem 0xcfffff00-0xcfffffff irq 16 at device 11.0 on pci0
> miibus0: on re0
> rgephy0: on miibus0
> rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
> re0: Ethernet address: 00:11:09:cf:e2:e1
> re0: [FAST]
>
> olivleh1@kartoffel olivleh1> ifconfig
> re0: flags=8843 mtu 1500
>         options=18
>         inet6 fe80::211:9ff:fecf:e2e1%re0 prefixlen 64 scopeid 0x1
>         inet 10.0.1.51 netmask 0xffffff00 broadcast 10.0.1.255
>         ether 00:11:09:cf:e2:e1
>         media: Ethernet autoselect (100baseTX )
>         status: active
> plip0: flags=108810 mtu 1500
> lo0: flags=8049 mtu 16384
>         inet6 ::1 prefixlen 128
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
>         inet 127.0.0.1 netmask 0xff000000
>
> olivleh1@kartoffel olivleh1> uname -a
> FreeBSD kartoffel.salatschuessel.net 6.1-STABLE FreeBSD 6.1-STABLE #0: Fri Jul 7 18:40:18 CEST 2006 olivleh1@kartoffel.salatschuessel.net:/usr/obj/amd64-athlon64-6.1/usr/src/sys/KARTOFFEL amd64
>
> How can I disable checksum offloading?

man ifconfig:

...
     -rxcsum, -txcsum
             If the driver supports user-configurable checksum offloading,
             disable receive (or transmit) checksum offloading on the
             interface.  These settings may not always be independent of
             each other.
...

So, just do an

ifconfig re0 -rxcsum -txcsum

But it would appear that they are already disabled. This is the exact same device that I had issues with and I thought receive only hw checksums were disabled by default. ifconfig shows that they are both off. You would have seen them in the options field if on, i.e. options=b.

Maybe try setting one of them on? Try "ifconfig re0 txcsum". How about on the other side?
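
To check by hand what an options value means, the hex number can be decoded bit by bit. The following is a minimal POSIX sh sketch, assuming the IFCAP_* bit values from FreeBSD 6.x's <net/if.h> (RXCSUM=0x1, TXCSUM=0x2, NETCONS=0x4, VLAN_MTU=0x8, VLAN_HWTAGGING=0x10, JUMBO_MTU=0x20, POLLING=0x40). The function name decode_options is hypothetical and not part of ifconfig.

```shell
#!/bin/sh
# Decode the hex value from ifconfig's "options=" field into capability names.
# ASSUMPTION: bit values follow the IFCAP_* constants in FreeBSD 6.x
# <net/if.h>; other releases may define additional or different bits.
# decode_options is a hypothetical helper, not an ifconfig feature.
decode_options() {
    value=$((0x$1))          # argument is the hex string without "0x"
    names=""
    for pair in 1:RXCSUM 2:TXCSUM 4:NETCONS 8:VLAN_MTU \
                16:VLAN_HWTAGGING 32:JUMBO_MTU 64:POLLING; do
        bit=${pair%%:*}      # decimal bit value before the colon
        name=${pair#*:}      # capability name after the colon
        if [ $((value & bit)) -ne 0 ]; then
            names="${names:+$names,}$name"
        fi
    done
    echo "${names:-none}"
}

decode_options 18   # value from the ifconfig output quoted above
decode_options b    # value expected with both checksum offloads enabled
```

Under those assumptions, 18 decodes to VLAN_MTU,VLAN_HWTAGGING (no checksum offload bits set), while b decodes to RXCSUM,TXCSUM,VLAN_MTU.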
The checksum offloading problem exists for both the i386 and amd64 versions of the if_re driver. What does the NFS server have for a NIC, and what are its OS and ifconfig settings? I seem to recall the problem was with RXCSUM.

What is most likely occurring is that either the box writing to the NFS server sends a packet with an improperly calculated checksum, or the checksum is calculated incorrectly on the receive side. This leaves it stuck trying to resend that bad packet over and over. From your original email it appears that the amd64 box is sending to an NFS server.

The other possibility is that, perhaps, you have your firewall set up on your amd64 box in such a way that it is messing up the packet. The NFS client is less likely to be the problem than one of the above.

Cheers,
Sean