Date:      Tue, 23 Dec 2003 19:35:09 +0700 (NKZ)
From:      Eugene Grosbein <eugen@kuzbass.ru>
To:        FreeBSD-gnats-submit@FreeBSD.org
Subject:   kern/60526: Post-PAE stable SMP machine freezes
Message-ID:  <200312231235.hBNCZ97r051640@main.svzserv.kemerovo.su>
Resent-Message-ID: <200312231240.hBNCeIHb079351@freefall.freebsd.org>

>Number:         60526
>Category:       kern
>Synopsis:       Post-PAE stable SMP machine freezes
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Dec 23 04:40:18 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:     Eugene Grosbein
>Release:        FreeBSD 4.8-STABLE i386
>Organization:
Svyaz Service JSC
>Environment:
System: FreeBSD main.svzserv.kemerovo.su 4.8-STABLE FreeBSD 4.8-STABLE #4: Fri Dec 19 13:44:49 NKZ 2003 sa@main.svzserv.kemerovo.su:/usr/obj/usr/src/sys/MAIN i386
	CPUTYPE=p3 and no other optimizations
	SMP machine with old ServerWorks chipset

>Description:
	
	This machine ran rock-stable on 4-STABLE for more than 2 years.
It is still rock-stable with 4.8-STABLE cvsup'd up to
date=2003.08.08.00.00.00 (plus security patches).
It freezes hard with later 4-STABLE versions.

It boots, works for a short time as expected, and then hangs: network
services do not respond, the kernel does not respond to pings, syscons
does not respond to Alt-Fn (consoles do not switch), and the kernel does
not break to DDB on Ctrl-Alt-ESC. There are no error messages in the
system logs on the SCSI drive, nor in the system logs that are sent over
the network to another server. A power down/up cycle is needed to reboot
the box.

	Its uptime depends on ata(4) activity. When the ATA disk is mounted
read-write, it takes several minutes (sometimes less than a minute)
to hang after going multiuser. When the ATA disk is mounted read-only,
it hangs only several times per day. When the ATA disk is not mounted,
it hangs too, but may work for a long time before the freeze occurs.
Downgraded to 4.8-STABLE, it does not hang at all. The box is a loaded
FTP server, proxy server and mail server. FTP data reside on the ATA
drive; the rest is on SCSI.
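
	Since nothing reaches the logs before the freeze, one way to put
numbers on the time to hang in each mount mode would be a heartbeat file
on the SCSI drive. What follows is only a minimal sketch under stated
assumptions, not something that was run on this machine: it assumes
Python is available, and the names write_heartbeat, time_to_hang and run
and the file path passed to them are all hypothetical.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one timestamp line and force it to disk before returning."""
    stamp = time.time() if now is None else now
    with open(path, "a") as f:
        f.write("%.0f\n" % stamp)
        f.flush()
        os.fsync(f.fileno())  # so the line survives a hard freeze
    return stamp


def time_to_hang(path):
    """Seconds between the first and the last heartbeat in the file."""
    with open(path) as f:
        stamps = [float(line) for line in f if line.strip()]
    return stamps[-1] - stamps[0] if stamps else 0.0


def run(path, interval=5.0):
    """Write heartbeats until the machine freezes (assumed usage)."""
    while True:
        write_heartbeat(path)
        time.sleep(interval)
```

After the power cycle, time_to_hang() on the same file gives the
approximate uptime before the freeze, to within one interval.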

	This host had run 4.8-STABLE of March 2003 for several months when
I upgraded to 4-STABLE. It started to hang; I downgraded to the date
mentioned above and it works well again. I have tried to upgrade to
STABLE twice more since then, and both times found that this bug had not
been fixed. I once saw a message on the console about an ATA timeout,
but it does not appear with recent 4-STABLE.

	Here is dmesg.boot:

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 4.8-STABLE #4: Fri Dec 19 13:44:49 NKZ 2003
    sa@main.svzserv.kemerovo.su:/usr/obj/usr/src/sys/MAIN
Timecounter "i8254"  frequency 1193182 Hz
CPU: Intel Pentium III (866.43-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x686  Stepping = 6
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory  = 536805376 (524224K bytes)
config> q
avail memory = 518467584 (506316K bytes)
Programming 16 pins in IOAPIC #0
Programming 16 pins in IOAPIC #1
FreeBSD/SMP: Multiprocessor motherboard: 2 CPUs
 cpu0 (BSP): apic id:  3, version: 0x00040011, at 0xfee00000
 cpu1 (AP):  apic id:  0, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id:  4, version: 0x000f0011, at 0xfec00000
 io1 (APIC): apic id:  5, version: 0x000f0011, at 0xfec01000
Preloaded elf kernel "kernel" at 0xc03c0000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc03c009c.
VESA: v2.0, 4096k memory, flags:0x0, mode table:0xc034ed22 (1000022)
VESA: ATI MACH64
Pentium Pro MTRR support enabled
md0: Malloc disk
Using $PIR table, 268435454 entries at 0xc00fdf10
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <ServerWorks NB6635 3.0LE host to PCI bridge> on motherboard
IOAPIC #1 intpin 3 -> irq 2
IOAPIC #1 intpin 2 -> irq 5
IOAPIC #1 intpin 9 -> irq 9
pci0: <PCI bus> on pcib0
pci0: <ATI Mach64-GV graphics accelerator> at 2.0 irq 2
fxp0: <Intel 82557/8/9 EtherExpress Pro/100(B) Ethernet> port 0x5400-0x543f mem 0xfb000000-0xfb0fffff,0xfb201000-0xfb201fff irq 5 at device 3.0 on pci0
fxp0: Ethernet address 00:d0:b7:b6:2b:f9
inphy0: <i82555 10/100 media interface> on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp1: <Intel 82557/8/9 EtherExpress Pro/100(B) Ethernet> port 0x5440-0x547f mem 0xfb100000-0xfb1fffff,0xfb202000-0xfb202fff irq 9 at device 9.0 on pci0
fxp1: Ethernet address 00:02:b3:26:a9:52
inphy1: <i82555 10/100 media interface> on miibus1
inphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
isab0: <ServerWorks IB6566 PCI to ISA bridge> at device 15.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <ServerWorks ROSB4 ATA33 controller> port 0x5480-0x548f,0x374-0x377,0x170-0x177 at device 15.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pcib1: <ServerWorks NB6635 3.0LE host to PCI bridge> on motherboard
IOAPIC #1 intpin 0 -> irq 10
IOAPIC #1 intpin 1 -> irq 11
pci1: <PCI bus> on pcib1
ahc0: <Adaptec aic7899 Ultra160 SCSI adapter> port 0x5800-0x58ff mem 0xfd000000-0xfd000fff irq 10 at device 4.0 on pci1
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
ahc1: <Adaptec aic7899 Ultra160 SCSI adapter> port 0x6000-0x60ff mem 0xfd001000-0xfd001fff irq 11 at device 4.1 on pci1
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
orm0: <Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff,0xc9800-0xcf7ff,0xcf800-0xd07ff on isa0
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 flags 0x10 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppi0: <Parallel I/O> on ppbus0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0
DUMMYNET initialized (011031)
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to deny, unlimited logging
IPsec: Initialized Security Association Processing.
SMP: AP CPU #1 Launched!
ad0: 29410MB <QUANTUM FIREBALLP AS30.0> [59755/16/63] at ata0-master UDMA33
acd0: CDROM <ASUS CD-S400/A> at ata0-slave PIO4
Waiting 15 seconds for SCSI devices to settle
Mounting root from ufs:/dev/da0s1a
da0 at ahc0 bus 0 target 0 lun 0
da0: <QUANTUM ATLAS_V_18_WLS 0230> Fixed Direct Access SCSI-3 device 
da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 17510MB (35861388 512 byte sectors: 255H 63S/T 2232C)

>How-To-Repeat:

	It repeats by itself very soon after reboot with recent 4.9-STABLE.
	I have already lost important data on my SCSI drive because of an
	unclean reboot.
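
	Because the hang comes sooner when the ATA disk is mounted
	read-write, sustained writes to that filesystem might shorten the
	wait on a test box. This is an untested assumption, and the sketch
	below is only an illustration: the function generate_write_load,
	the scratch file name and the directory argument are hypothetical,
	and it assumes Python is installed.

```python
import os


def generate_write_load(directory, rounds=1000, block_size=64 * 1024):
    """Repeatedly write and sync a scratch file; return bytes written."""
    scratch = os.path.join(directory, "ata-load.tmp")  # hypothetical name
    block = b"\0" * block_size
    written = 0
    for _ in range(rounds):
        with open(scratch, "wb") as f:
            f.write(block)
            f.flush()
            os.fsync(f.fileno())  # push data to the disk, not just the cache
        written += block_size
    if os.path.exists(scratch):
        os.remove(scratch)  # leave the filesystem as it was found
    return written
```

	It would be pointed at a directory on the ATA filesystem, on a
	machine where an unclean reboot is acceptable.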

>Fix:

	Unknown
>Release-Note:
>Audit-Trail:
>Unformatted:


