Date:      Sun, 9 Feb 2003 18:24:37 +0100
From:      Gerrit Kühn <gerrit@pmp.uni-hannover.de>
To:        Andre Guibert de Bruet <andy@siliconlandmark.com>
Cc:        Vallo Kallaste <kalts@estpak.ee>, Attila Nagy <bra@fsn.hu>, current@FreeBSD.ORG
Subject:   Re: Does bg fsck have problems with large filesystems?
Message-ID:  <20030209172437.GA59271@pmp.uni-hannover.de>
In-Reply-To: <20030128173142.GF78630@pmp.uni-hannover.de>
References:  <20030127174127.GD71664@pmp.uni-hannover.de> <Pine.LNX.4.50.0301281141100.28577-100000@scribble.fsn.hu> <20030128125432.GB4813@tiiu.internal> <20030128110546.L66869@alpha.siliconlandmark.com> <20030128173142.GF78630@pmp.uni-hannover.de>

On Tue, Jan 28, 2003 at 06:31:42PM +0100, Gerrit Kühn wrote:

> > I've been trying to reproduce this bug on my desktop. This machine has 2
> > 80gb disks, one of which is dedicated with one slice. So far, after 8 hard
> > resets, I haven't had any problem with either the machine or bgfsck
> > hanging. 

> I'll try to reproduce the thing on my machine as soon as possible.
> Perhaps it was just because it was Monday, who knows...

In the meantime I have found that my problem is 100% reproducible.

My file systems look like this:

Filesystem  1K-blocks    Used    Avail Capacity  Mounted on
/dev/ad0s1a    257838   67338   169874    28%    /
devfs               1       1        0   100%    /dev
/dev/ad0s1g  57467672       2 52870258     0%    /export
/dev/ad0s1f   4125838       4  3795768     0%    /tmp
/dev/ad0s1e  12383502 1336152 10056670    12%    /usr
/dev/ad0s1d   4125838    3458  3792314     0%    /var
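
To compare the partitions at a glance, the table above can be parsed with
a few lines of Python. This is only a sketch: the names DF_SAMPLE, parse_df
and largest are made up for illustration, and it assumes the six-column
layout with 1K blocks shown here.

```python
# Sketch: find the largest filesystem in df output (sizes are in 1K blocks).
# DF_SAMPLE, parse_df and largest are illustrative names, not part of any tool.
DF_SAMPLE = """\
Filesystem  1K-blocks    Used    Avail Capacity  Mounted on
/dev/ad0s1a    257838   67338   169874    28%    /
devfs               1       1        0   100%    /dev
/dev/ad0s1g  57467672       2 52870258     0%    /export
/dev/ad0s1f   4125838       4  3795768     0%    /tmp
/dev/ad0s1e  12383502 1336152 10056670    12%    /usr
/dev/ad0s1d   4125838    3458  3792314     0%    /var
"""


def parse_df(text):
    """Turn df text into a list of dicts, skipping the header line."""
    entries = []
    for line in text.strip().splitlines()[1:]:
        # Six columns; the mount point is last and is kept whole.
        device, blocks, used, avail, capacity, mount = line.split(None, 5)
        entries.append({
            "device": device,
            "kib": int(blocks),
            "used_kib": int(used),
            "mount": mount.strip(),
        })
    return entries


def largest(entries):
    """Return the entry with the most 1K blocks."""
    return max(entries, key=lambda e: e["kib"])


big = largest(parse_df(DF_SAMPLE))
# 1 GiB = 1024**2 KiB
print(f'{big["device"]} on {big["mount"]}: {big["kib"] / 1024**2:.1f} GiB')
```

On this table it picks out ad0s1g (/export), which is several times larger
than any of the other partitions.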


When booting with unclean filesystems, bgfsck runs quickly over a,
d, e and f. On g, however, it never finishes. I can't kill the
fsck processes and I can't access g, though the rest of the system
seems to be usable as usual. Here is the output of ps axl:

  UID   PID  PPID CPU PRI NI   VSZ  RSS MWCHAN STAT  TT       TIME COMMAND
    0     0     0   0 -16  0     0   12 sched  DLs   ??    0:00.00  (swapper)
    0     1     0   0   8  0   712  392 wait   ILs   ??    0:00.01 /sbin/init -
    0     2     0   0  -8  0     0   12 g_even DL    ??    0:00.02  (g_event)
    0     3     0   0  -8  0     0   12 g_up   DL    ??    0:00.09  (g_up)
    0     4     0   0  -8  0     0   12 g_down DL    ??    0:00.19  (g_down)
    0     5     0   0 -84  0     0   12 actask IL    ??    0:00.00  (acpi_task0
    0     6     0   0 -84  0     0   12 actask IL    ??    0:00.00  (acpi_task1
    0     7     0   0 -84  0     0   12 actask IL    ??    0:00.00  (acpi_task2
    0     8     0   0 -16  0     0   12 psleep DL    ??    0:00.00  (pagedaemon
    0     9     0   0  20  0     0   12 psleep DL    ??    0:00.00  (vmdaemon)
    0    10     0   0 -16  0     0   12 ktrace DL    ??    0:00.00  (ktrace)
    0    11     0 110 -16  0     0   12 -      RL    ??    2:20.07  (idle)
    0    12     0   0 -48  0     0   12 -      WL    ??    0:00.12  (swi6: tty:
    0    14     0   0 -44  0     0   12 -      WL    ??    0:00.00  (swi1: net)
    0    15     0   0  76  0     0   12 sleep  DL    ??    0:00.05  (random)
    0    19     0   0 -28  0     0   12 -      WL    ??    0:00.00  (swi5: acpi
    0    22     0   0 -64  0     0   12 -      WL    ??    0:00.28  (irq14: ata
    0    24     0   0 -68  0     0   12 -      WL    ??    0:00.00  (irq11: rl0
    0    25     0   0   8  0     0   12 usbevt DL    ??    0:00.00  (usb0)
    0    26     0   0   8  0     0   12 usbtsk DL    ??    0:00.00  (usbtask)
    0    27     0   0   8  0     0   12 usbevt DL    ??    0:00.00  (usb1)
    0    28     0   5 -68  0     0   12 -      WL    ??    0:00.00  (irq12: fwo
    0    29     0   0 -64  0     0   12 -      WL    ??    0:00.00  (irq6: fdc0
    0    32     0   0 -60  0     0   12 -      WL    ??    0:00.00  (irq7: ppc0
    0    33     0   0 -60  0     0   12 -      WL    ??    0:00.02  (irq1: atkb
    0    36     0  34 171  0     0   12 pgzero DL    ??    0:00.47  (pagezero)
    0    37     0   2  -4  0     0   12 snaplk DL    ??    0:00.24  (bufdaemon)
    0    38     0   0  20  0     0   12 syncer DL    ??    0:00.01  (syncer)
    0    39     0   0  -4  0     0   12 vlruwt DL    ??    0:00.00  (vnlru)
    0    40     0   0   8  0     0   12 nfsidl IL    ??    0:00.00  (nfsiod 0)
    0    41     0   0   8  0     0   12 nfsidl IL    ??    0:00.00  (nfsiod 1)
    0    42     0   0   8  0     0   12 nfsidl IL    ??    0:00.00  (nfsiod 2)
    0    43     0   0   8  0     0   12 nfsidl IL    ??    0:00.00  (nfsiod 3)
    0   246     1   0  96  0  1172  736 select Ss    ??    0:00.03 /usr/sbin/sy
    0   267     1   0  96  0  1372 1016 select Ss    ??    0:00.03 /usr/sbin/rp
    0   350     1 155 115  0  1220  992 select Is    ??    0:00.01 /usr/sbin/mo
    0   353     1 112 110  0  1168  876 select Is    ??    0:00.13 nfsd: master
    0   355   353 155   4  0  1128  748 nfsd   I     ??    0:00.00 nfsd: server
    0   356   353 155   4  0  1128  748 nfsd   I     ??    0:00.00 nfsd: server
    0   357   353 155   4  0  1128  748 nfsd   I     ??    0:00.00 nfsd: server
    0   358   353 155   4  0  1128  748 nfsd   I     ??    0:00.00 nfsd: server
    0   374     1   0  96  0  1144  680 select Ss    ??    0:00.00 /usr/sbin/us
    0   394     1 154 115  0  1196  808 select Is    ??    0:00.01 /usr/sbin/lp
    0   454     1 153 115  0  3092 2200 select Is    ??    0:00.63 /usr/sbin/ss
    0   460     1   0  96  0  3092 2544 select Ss    ??    0:00.01 sendmail: ac
   25   463     1 153  20  0  2992 2500 pause  Is    ??    0:00.00 sendmail: Qu
    0   512     1   0   8  0  1236  956 nanslp Ss    ??    0:00.01 /usr/sbin/cr
    0   522     1   0   8  0  1532 1236 wait   Is    v0    0:00.05 login [pam] 
    0   530   522   0  20  0  1504 1092 pause  S     v0    0:00.06 -csh (csh)
    0   544   530   0  96  0   664  444 -      R+    v0    0:00.00 ps axl
    0   523     1   0   5  0  1184  864 ttyin  Is+   v1    0:00.01 /usr/libexec
    0   524     1   0   5  0  1184  864 ttyin  Is+   v2    0:00.01 /usr/libexec
    0   525     1   0   5  0  1184  864 ttyin  Is+   v3    0:00.01 /usr/libexec
    0   526     1   0   5  0  1184  864 ttyin  Is+   v4    0:00.01 /usr/libexec
    0   527     1   0   5  0  1184  864 ttyin  Is+   v5    0:00.01 /usr/libexec
    0   528     1   0   5  0  1184  864 ttyin  Is+   v6    0:00.01 /usr/libexec
    0   529     1   0   5  0  1184  864 ttyin  Is+   v7    0:00.01 /usr/libexec
    0   516     1 153   8  4   248  140 wait   IN   con-   0:00.01 fsck -B -p
    0   517     1 153  -8  0  1112  564 piperd I    con-   0:00.00 logger -p da
    0   521   516   2  -8  4   632  376 getbuf DN   con-   0:01.19 fsck_ufs -p 
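
In ps, a STAT value beginning with D marks a process in uninterruptible
(disk) wait, which fits with fsck not reacting to kill. The following
Python sketch filters a listing like the one above for such processes on
selected wait channels. The names PS_SAMPLE, parse_ps_axl and
uninterruptible are hypothetical, and the choice of "getbuf" and "snaplk"
is simply taken from the fsck_ufs and bufdaemon lines above; it does not
by itself establish the cause of the hang.

```python
# Sketch: list processes in uninterruptible sleep ("D" state) from `ps axl` text.
# PS_SAMPLE holds a few lines copied from the listing above.
PS_SAMPLE = """\
  UID   PID  PPID CPU PRI NI   VSZ  RSS MWCHAN STAT  TT       TIME COMMAND
    0    37     0   2  -4  0     0   12 snaplk DL    ??    0:00.24  (bufdaemon)
    0    38     0   0  20  0     0   12 syncer DL    ??    0:00.01  (syncer)
    0   516     1 153   8  4   248  140 wait   IN   con-   0:00.01 fsck -B -p
    0   521   516   2  -8  4   632  376 getbuf DN   con-   0:01.19 fsck_ufs -p
"""


def parse_ps_axl(text):
    """Parse `ps axl` text into a list of dicts keyed by the header names."""
    lines = text.strip("\n").splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # All columns but the last are single tokens; COMMAND may contain spaces.
        fields = line.split(None, len(header) - 1)
        row = dict(zip(header, fields))
        row["COMMAND"] = row.get("COMMAND", "").strip()
        rows.append(row)
    return rows


def uninterruptible(rows, wchans=("getbuf", "snaplk")):
    """Keep rows whose state starts with D and whose wait channel is listed."""
    return [
        r for r in rows
        if r["STAT"].startswith("D") and r["MWCHAN"] in wchans
    ]


for r in uninterruptible(parse_ps_axl(PS_SAMPLE)):
    print(r["PID"], r["MWCHAN"], r["STAT"], r["COMMAND"])
```

Passing a different wchans tuple narrows or widens the filter; many kernel
threads sit in D state during normal operation, so filtering on the state
alone would be too noisy.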


After turning off bgfsck in rc.conf, the system rebooted using foreground
fsck without further problems. Here is the dmesg from that boot:

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 5.0-RELEASE #1: Mon Jan 27 17:43:54 CET 2003
    root@comet.pmp.uni-hannover.de:/usr/obj/usr/src/sys/COMET
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0503000.
Preloaded elf module "/boot/kernel/acpi.ko" at 0xc05030a8.
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 930689508 Hz
CPU: VIA C3 Samuel 2 (930.69-MHz 686-class CPU)
  Origin = "CentaurHauls"  Id = 0x67a  Stepping = 10
  Features=0x803035<FPU,DE,TSC,MSR,MTRR,PGE,MMX>
real memory  = 125763584 (119 MB)
avail memory = 116744192 (111 MB)
Initializing GEOMetry subsystem
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <VIA603 AWRDACPI> on motherboard
    ACPI-0625: *** Info: GPE Block0 defined as GPE0 to GPE15
Using $PIR table, 6 entries at 0xc00fdcf0
acpi0: power button is handled as a fixed feature programming model.
Timecounter "ACPI-safe"  frequency 3579545 Hz
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
acpi_cpu0: <CPU> on acpi0
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <VIA Generic host to PCI bridge> mem 0xea000000-0xea3fffff at device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <display, VGA> at device 0.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <VIA 82C686 ATA100 controller> port 0xd000-0xd00f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: <VIA 83C572 USB controller> port 0xd400-0xd41f irq 11 at device 7.2 on pci0
usb0: <VIA 83C572 USB controller> on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
ugen0: American Power Conversion Back-UPS 500 FW: 6.2.I USB FW: c1, rev 1.10/1.00, addr 2
uhci1: <VIA 83C572 USB controller> port 0xd800-0xd81f irq 11 at device 7.3 on pci0
usb1: <VIA 83C572 USB controller> on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pci0: <bridge, PCI-unknown> at device 7.4 (no driver attached)
pcm0: <VIA VT82C686A> port 0xe400-0xe403,0xe000-0xe003,0xdc00-0xdcff irq 12 at device 7.5 on pci0
fwohci0: <VIA VT6306> port 0xe800-0xe87f mem 0xea400000-0xea4007ff irq 12 at device 11.0 on pci0
fwohci0: PCI bus latency was changing to 250.
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channel is 8.
fwohci0: EUI64 00:30:1b:ab:00:00:55:b8
fwohci0: Phy 1394a available S400, 3 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: <IEEE1394(FireWire) bus> on fwohci0
rl0: <RealTek 8139 10/100BaseTX> port 0xec00-0xecff mem 0xea401000-0xea4010ff irq 11 at device 12.0 on pci0
rl0: Realtek 8139B detected. Warning, this may be unstable in autoselect mode
rl0: Ethernet address: 00:30:1b:ab:55:54
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
ppc0 port 0x778-0x77b,0x378-0x37f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
orm0: <Option ROM> at iomem 0xd0000-0xd3fff on isa0
pmtimer0 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 10.000 msec
fwohci0: BUS reset
fwohci0: node_id = 0xc800ffc0, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
acpi_cpu: CPU throttling enabled, 2 steps from 100% to 50.0%
ad0: 78533MB <IC35L080AVVA07-0> [159560/16/63] at ata0-master UDMA100
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
ffs_snapshot_mount: old format snapshot inode 3


Sorry for posting so much debugging output, but I'm still hoping to
find out what the systems that have problems with bgfsck have in common...


cu
  Gerrit
-- 
