From owner-freebsd-current@FreeBSD.ORG Thu Oct 2 22:43:39 2003
Date: Thu, 2 Oct 2003 22:43:26 -0700
From: Kris Kennaway
To: current@FreeBSD.org
Message-ID: <20031003054326.GA51359@rot13.obsecurity.org>
Subject: NFS corruption on p4 machines (please test)

For some months now I have been experiencing NFS corruption on the three machines in the dosirak.kr package cluster - these are SMP pentium 4 machines that run -CURRENT. Setting DISABLE_PSE and DISABLE_PG_G does not fix these problems.

I am able to easily reproduce these problems using /usr/src/tools/regression/fsx on a loopback nfs mount - they are not deterministic, but it blows up within about 8000 operations (less than a minute of operation).
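
For readers who do not have fsx at hand, the kind of check it performs can be approximated with a short script. This is only a rough sketch and not fsx itself: it assumes that random writes, truncates, and full read-backs compared against an in-memory copy are enough to show the idea, and the function name `exercise` and its parameters are made up for illustration.

```python
import random


def exercise(path, ops=2000, max_size=0x40000, seed=0):
    """Apply random writes, truncates and reads to `path`, mirroring every
    change in memory. Return the index of the first operation whose
    read-back differs from the mirror, or None if every read agrees.
    (Hypothetical helper, a simplified stand-in for what fsx checks.)"""
    rng = random.Random(seed)
    mirror = bytearray()  # what the file is expected to contain
    with open(path, "w+b") as f:
        for i in range(ops):
            op = rng.choice(("write", "truncate", "read"))
            if op == "write":
                off = rng.randrange(max_size)
                # One repeated byte value keeps data generation cheap.
                data = bytes([rng.randrange(256)]) * rng.randint(1, 4096)
                end = off + len(data)
                if end > len(mirror):
                    # Writing past EOF leaves a zero-filled hole.
                    mirror.extend(b"\0" * (end - len(mirror)))
                mirror[off:end] = data
                f.seek(off)
                f.write(data)
            elif op == "truncate":
                size = rng.randrange(max_size)
                if size < len(mirror):
                    del mirror[size:]
                else:
                    # Assumes extending via truncate zero-fills (POSIX).
                    mirror.extend(b"\0" * (size - len(mirror)))
                f.truncate(size)
            else:
                f.flush()
                f.seek(0)
                if f.read() != mirror:
                    return i
    return None
```

On a healthy filesystem `exercise("/mnt/foo")` should return None (the path is only a placeholder for a file on the loopback NFS mount); a non-None result would mean a read-back disagreed with what was written.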
In fact sometimes it even manages to make fsx segfault, which is fairly impressive :)

Just mount something rw via loopback nfs, and run 'fsx foo' on the nfs filesystem for a few minutes. e.g.:

dosirak# fsx foo
truncating to largest ever: 0x13e76
truncating to largest ever: 0x2e52c
truncating to largest ever: 0x3c2c2
truncating to largest ever: 0x3f15f
truncating to largest ever: 0x3fcb9
ftruncate1: 30cc3
dotruncate: ftruncate: Permission denied

Is anyone else able to test this? The three machines I see this on have the same hardware specs, so it may be an interaction with certain hardware.

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.1-CURRENT #0: Fri Sep 26 20:23:51 KST 2003
root@dalki.kr.freebsd.org:/usr/obj/d/src/sys/DALKI
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0588000.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) XEON(TM) CPU 2.20GHz (2199.94-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0xf24 Stepping = 4
Features=0x3febfbff
Hyperthreading: 2 logical CPUs
real memory = 2147418112 (2047 MB)
avail memory = 2084302848 (1987 MB)
Programming 16 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
Programming 16 pins in IOAPIC #1
Programming 16 pins in IOAPIC #2
Programming 16 pins in IOAPIC #3
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
cpu0 (BSP): apic id: 0, version: 0x00050014, at 0xfee00000
cpu1 (AP): apic id: 1, version: 0x00050014, at 0xfee00000
cpu2 (AP): apic id: 2, version: 0x00050014, at 0xfee00000
cpu3 (AP): apic id: 3, version: 0x00050014, at 0xfee00000
io0 (APIC): apic id: 8, version: 0x000f0011, at 0xfec00000
io1 (APIC): apic id: 9, version: 0x000f0011, at 0xfec01000
io2 (APIC): apic id: 10, version: 0x000f0011, at 0xfec02000
io3 (APIC): apic id: 11, version: 0x000f0011, at 0xfec03000
Pentium Pro MTRR support enabled
ACPI-0660: *** Warning: Type override - [DEB_] had invalid type (Integer) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [MLIB] had invalid type (Integer) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [IO__] had invalid type (Integer) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [DATA] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [SIO_] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [SB__] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [PM__] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [ICNT] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [ACPI] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [IORG] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [SB__] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [PM__] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [SIO_] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [PM__] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [BIOS] had invalid type (Integer) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [CMOS] had invalid type (Integer) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [KBC_] had invalid type (Integer) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [OEM_] had invalid type (Integer) for Scope operator, changed to (Scope)
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
pcibios: BIOS version 2.10
Using $PIR table, 7 entries at 0xc00f4a70
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x508-0x50b on acpi0
acpi_cpu0: on acpi0
acpi_cpu1: on acpi0
acpi_cpu2: on acpi0
acpi_cpu3: on acpi0
acpi_cpu4: on acpi0
acpi_cpu5: on acpi0
acpi_cpu6: on acpi0
acpi_cpu7: on acpi0
acpi_button0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
IOAPIC #1 intpin 2 -> irq 2
IOAPIC #1 intpin 1 -> irq 3
IOAPIC #1 intpin 3 -> irq 5
pci0: at device 2.0 (no driver attached)
fxp0: port 0xce80-0xcebf mem 0xfe980000-0xfe99ffff,0xfe9fd000-0xfe9fdfff irq 3 at device 4.0 on pci0
fxp0: Ethernet address 00:30:48:12:59:16
miibus0: on fxp0
inphy0: on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp1: port 0xcf00-0xcf3f mem 0xfe9a0000-0xfe9bffff,0xfe9fe000-0xfe9fefff irq 5 at device 5.0 on pci0
fxp1: Ethernet address 00:30:48:12:49:d8
miibus1: on fxp1
inphy1: on miibus1
inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
isab0: at device 15.0 on pci0
isa0: on isab0
atapci0: port 0xffa0-0xffaf,0x374-0x377,0x170-0x177,0x3f4-0x3f7,0x1f0-0x1f7 at device 15.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
pcib1: on acpi0
pci1: on pcib1
pcib2: on acpi0
pci2: on pcib2
IOAPIC #1 intpin 14 -> irq 9
IOAPIC #1 intpin 15 -> irq 10
ahc0: port 0xe400-0xe4ff mem 0xfebfe000-0xfebfefff irq 9 at device 2.0 on pci2
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
ahc1: port 0xe800-0xe8ff mem 0xfebff000-0xfebfffff irq 10 at device 2.1 on pci2
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
atkbdc0: port 0x64,0x60 irq 1 on acpi0
fdc0: ready for input in output
fdc0: cmd 3 failed at out byte 1 of 3
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
ppc0 port 0x778-0x77f,0x378-0x37f irq 7 drq 1 on acpi0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
fdc0: ready for input in output
fdc0: cmd 3 failed at out byte 1 of 3
npx0: on motherboard
npx0: INT 16 interface
orm0: