Date:      Thu, 11 Mar 2010 09:04:40 -0800
From:      David Wolfskill <david@catwhisker.org>
To:        current@freebsd.org
Subject:   SMP deadlock during multi-user mode transition after r204866
Message-ID:  <20100311170440.GR57205@bunrab.catwhisker.org>


--wOOEd1gxhwV8K3c+
Content-Type: multipart/mixed; boundary="MgsldsnE3DYXgZCe"
Content-Disposition: inline


--MgsldsnE3DYXgZCe
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

My build machine (dmesg attached) is a dual-CPU, single-core box; my
laptop is a single-CPU, single-core box.  I track head on each daily.
The build machine has been locking up during the transition to
multi-user mode since Tuesday, when I built CURRENT at r204909 (the
previous build, on Monday, was at r204866); it still boots to
single-user mode OK.  The laptop has not exhibited the problem.

This build machine was deployed fairly recently, and since a GENERIC
kernel had been working OK, I had left it that way (so that's the kernel
config).  I have a more customized config I had used on its predecessor;
I'm pretty sure I had that set up with DDB & assorted other "goodies" to
try to get something useful out of a misbehaving system, and am willing
to set that up (but probably won't have time for several hours, at
least, as I need to give a presentation at a work meeting).

One of the more peculiar symptoms is that after such a lock-up, I
power-cycle the machine, then boot to single-user mode, at which point I
typically start with:

	fsck -p

However, since Tuesday, that attempt yields:

Enter full pathname of shell or RETURN for /bin/sh:
# fsck -p
/dev/aacd0s4a: LINK COUNT DIR I=2  OWNER=root MODE=40755
/dev/aacd0s4a: SIZE=1024 MTIME=Mar 11 08:30 2010  COUNT 26 SHOULD BE 27
/dev/aacd0s4a: LINK COUNT INCREASING
/dev/aacd0s4a: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY.
#

My circumvention of choice at the moment is:

# fsck -y / && fsck -p

as it appears that the root file system is the only one thus affected.
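
For readers less familiar with the `&&` chaining, the workaround above
can be spelled out as the following minimal sketch.  The function name
`repair_then_preen` and the injectable `run` parameter are illustrative
assumptions, not part of fsck; the only behavior modeled is that the
second command runs only if the first one exits with status 0.

```python
import subprocess


def repair_then_preen(run=subprocess.call):
    """Mirror `fsck -y / && fsck -p`.

    `run` takes an argument list and returns the exit status; it defaults
    to subprocess.call and can be swapped out for a dry run.
    """
    # Step 1: repair the root file system, answering "yes" to every prompt.
    if run(["fsck", "-y", "/"]) != 0:
        # Same short-circuit as `&&`: do not go on if the repair failed.
        return False
    # Step 2: preen the file systems the way the boot-time check would.
    return run(["fsck", "-p"]) == 0
```

Passing a stub for `run` (for example, one that only records the
commands) shows the order of operations without touching a disk.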

Is this sufficiently well understood already that I should stop
disturbing folks who are trying to fix it?  Would it be useful for me to
configure a kernel that supports DDB & provide a backtrace (and maybe
additional stuff)?

To clarify, it appears that something after r204866 but no later than
r204909 has caused the observed problem.
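
One way that range could be narrowed is by bisection.  The sketch below
is only an illustration under stated assumptions: `is_bad` is a
hypothetical stand-in for "build a kernel at that revision, boot it, and
see whether it locks up", and the search assumes that every revision
after the culprit also locks up.

```python
import math

GOOD = 204866  # last revision known to reach multi-user mode
BAD = 204909   # first revision known to lock up


def bisect_revisions(good, bad, is_bad):
    """Return (first_bad_revision, tests_run) for the range (good, bad].

    is_bad(rev) is a caller-supplied check (hypothetical here) that
    reports whether a kernel built at `rev` shows the lock-up.
    """
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid    # culprit is at or before mid
        else:
            good = mid   # culprit is after mid
        steps += 1
    return bad, steps


def max_steps(good, bad):
    """Upper bound on how many kernels need to be built and booted."""
    return math.ceil(math.log2(bad - good))
```

With the two revisions above there are 43 candidates, so `max_steps(GOOD,
BAD)` comes to 6 build-and-boot cycles at most.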

Thanks.

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.

--MgsldsnE3DYXgZCe
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="dmesg.boot"
Content-Transfer-Encoding: quoted-printable

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.0-CURRENT #92: Mon Mar  8 06:14:22 PST 2010
    root@freebeast.catwhisker.org:/common/S4/obj/usr/src/sys/GENERIC i386
WARNING: WITNESS option enabled, expect reduced performance.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 3.60GHz (3614.54-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf41  Stepping = 1
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x659d<SSE3,DTES64,MON,DS_CPL,EST,TM2,CNXT-ID,CX16,xTPR>
  AMD Features=0x20100000<NX,LM>
  TSC: P-state invariant
real memory  = 2147483648 (2048 MB)
avail memory = 2086187008 (1989 MB)
ACPI APIC Table: <PTLTD  	 APIC  >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 2 package(s) x 1 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  6
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
ioapic2 <Version 2.0> irqs 48-71 on motherboard
kbd1 at kbdmux0
acpi0: <PTLTD   RSDT> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <unknown> at device 0.1 (no driver attached)
pci0: <base peripheral> at device 1.0 (no driver attached)
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 2.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib2
aac0: <Adaptec SCSI RAID 2200S> mem 0xdc000000-0xdfffffff irq 24 at device 1.0 on pci2
aac0: Enable Raw I/O
aac0: New comm. interface enabled
aac0: [ITHREAD]
aac0: Adaptec 2200S, aac driver 2.1.9-1
aacp0: <SCSI Passthrough Bus> on aac0
aacp1: <SCSI Passthrough Bus> on aac0
pcib3: <ACPI PCI-PCI bridge> at device 0.2 on pci1
pci3: <ACPI PCI bus> on pcib3
em0: <Intel(R) PRO/1000 Network Connection 6.9.25> port 0x2000-0x203f mem 0xd8200000-0xd821ffff irq 54 at device 2.0 on pci3
em0: [FILTER]
em0: Ethernet address: 00:30:48:2d:32:6a
em1: <Intel(R) PRO/1000 Network Connection 6.9.25> port 0x2040-0x207f mem 0xd8220000-0xd823ffff irq 55 at device 2.1 on pci3
em1: [FILTER]
em1: Ethernet address: 00:30:48:2d:32:6b
pcib4: <ACPI PCI-PCI bridge> irq 16 at device 4.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 6.0 on pci0
pci5: <ACPI PCI bus> on pcib5
uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0x1400-0x141f irq 16 at device 29.0 on pci0
uhci0: [ITHREAD]
usbus0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0x1420-0x143f irq 19 at device 29.1 on pci0
uhci1: [ITHREAD]
usbus1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0x1440-0x145f irq 18 at device 29.2 on pci0
uhci2: [ITHREAD]
usbus2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
uhci3: <Intel 82801EB (ICH5) USB controller USB-D> port 0x1460-0x147f irq 16 at device 29.3 on pci0
uhci3: [ITHREAD]
usbus3: <Intel 82801EB (ICH5) USB controller USB-D> on uhci3
ehci0: <Intel 82801EB/R (ICH5) USB 2.0 controller> mem 0xd8001000-0xd80013ff irq 23 at device 29.7 on pci0
ehci0: [ITHREAD]
usbus4: EHCI version 1.0
usbus4: <Intel 82801EB/R (ICH5) USB 2.0 controller> on ehci0
pcib6: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci6: <ACPI PCI bus> on pcib6
vgapci0: <VGA-compatible display> port 0x3000-0x30ff mem 0xd9000000-0xd9ffffff,0xd8300000-0xd8300fff irq 17 at device 1.0 on pci6
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x14a0-0x14af at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x77 irq 8 on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model Generic PS/2 mouse, device ID 0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
uart0: console (9600,n,8,1)
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
uart1: [FILTER]
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xc9000-0xc9fff,0xca000-0xcafff,0xcb000-0xcf7ff pnpid ORM0000 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: parallel port not found.
est0: <Enhanced SpeedStep Frequency Control> on cpu0
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 122d00000e24
device_attach: est0 attach returned 6
p4tcc0: <CPU Frequency Thermal Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 122d00000e24
device_attach: est1 attach returned 6
p4tcc1: <CPU Frequency Thermal Control> on cpu1
Timecounters tick every 1.000 msec
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 12Mbps Full Speed USB v1.0
usbus2: 12Mbps Full Speed USB v1.0
usbus3: 12Mbps Full Speed USB v1.0
usbus4: 480Mbps High Speed USB v2.0
ata1: DMA limited to UDMA33, controller found non-ATA66 cable
ugen0.1: <Intel> at usbus0
uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
ugen2.1: <Intel> at usbus2
uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
ugen3.1: <Intel> at usbus3
uhub3: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
ugen4.1: <Intel> at usbus4
uhub4: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus4
acd0: DVDROM <MATSHITADVD-ROM SR-8178/PZ16> at ata1-slave UDMA33
aacd0: <RAID 1 (Mirror)> on aac0
aacd0: 34970MB (71619584 sectors)
aacd1: <RAID 1 (Mirror)> on aac0
aacd1: 69974MB (143307008 sectors)
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub3: 2 ports with 2 removable, self powered
uhub4: 8 ports with 8 removable, self powered
ses0 at aacp0 bus 0 scbus0 target 6 lun 0
ses0: <SUPER GEM318 0> Fixed Uninstalled SCSI-2 device
ses0: 3.300MB/s transfers
ses0: SAF-TE Compliant Device
pass0 at aacp0 bus 0 scbus0 target 0 lun 0
pass0: <SEAGATE ST336754LC 0003> Fixed Uninstalled SCSI-3 device
pass0: 3.300MB/s transfers
pass1 at aacp0 bus 0 scbus0 target 1 lun 0
pass1: <SEAGATE ST336754LC 0003> Fixed Uninstalled SCSI-3 device
pass1: 3.300MB/s transfers
pass2 at aacp0 bus 0 scbus0 target 2 lun 0
pass2: <SEAGATE ST373454LC 0005> Fixed Uninstalled SCSI-3 device
pass2: 3.300MB/s transfers
pass3 at aacp0 bus 0 scbus0 target 3 lun 0
pass3: <SEAGATE ST373454LC 0005> Fixed Uninstalled SCSI-3 device
pass3: 3.300MB/s transfers
SMP: AP CPU #1 Launched!
WARNING: WITNESS option enabled, expect reduced performance.
Trying to mount root from ufs:/dev/aacd0s4a
em0: link state changed to UP

--MgsldsnE3DYXgZCe--

--wOOEd1gxhwV8K3c+
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (FreeBSD)

iEYEARECAAYFAkuZIqcACgkQmprOCmdXAD1H3gCfWK9zCdReRYUIfRN1ssEFa4r2
diEAn1/6oGuWVwWKnLCFkWGcHCYyg0ug
=xXp/
-----END PGP SIGNATURE-----

--wOOEd1gxhwV8K3c+--


