Date:      Wed, 31 Oct 2007 21:46:57 -0600
From:      "Clayton Milos" <clay@milos.co.za>
To:        Дмитрий Комалеев <d.komaleev@konliga.ru>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: System hangs up every day
Message-ID:  <012601c81c39$dbf8fa20$0e917095@claylaptop>
References:  <2335ED0A1B2A294FACC6EB01EF0965F72EE7C0@exch01.konliga.ru> <b91012310710311854y536fcbbj8fe78c76533a7da@mail.gmail.com>

> A system failure of this sort (one which leaves no log entries of any
> kind) is generally a hardware fault; failing memory sticks tend to
> cause kernel panics that are easy to reproduce.
>
> I would suggest examining the hardware components: the motherboard
> could have faulty capacitors (burst, leaking, or swollen), the fans on
> the processors could be failing and causing a lockup, or the power
> supply fans could be failing and causing an undervolt and lockup,
> although that usually makes the system reset.
>
> You get the idea: in my opinion, your symptoms point to hardware
> issues.
>
> David

Or, as I've seen a few times, a power supply that cannot handle the load. You 
have 2 CPUs and a few hard disks, all of which draw a lot of power. What is 
the wattage rating of your power supply? I've found FreeBSD to be finicky 
about hardware. If the hardware is all good, it works perfectly and never 
lets you down. When something starts to fail, FreeBSD hangs. Other OSes tend 
to chug along unpredictably instead.
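
To put numbers on the power supply question, here is a rough budget check in Python. It is only a sketch: the per-component wattages and the 80% headroom factor are placeholder assumptions, not measurements of this server, so substitute figures from the component datasheets and the label on the supply.

```python
def psu_headroom(component_watts, psu_rating_watts, derate=0.8):
    """Compare an estimated load against a derated PSU rating.

    component_watts:  dict mapping a component name to its estimated draw in watts
    psu_rating_watts: rated output of the power supply in watts
    derate:           fraction of the rating treated as safely usable
                      (0.8 is an assumed rule of thumb, not a vendor figure)

    Returns (total_load, usable_capacity, fits).
    """
    total = sum(component_watts.values())
    usable = psu_rating_watts * derate
    return total, usable, total <= usable


if __name__ == "__main__":
    # Placeholder estimates only; none of these values come from the thread.
    estimated_load = {
        "cpu0": 100,
        "cpu1": 100,
        "disks": 30,
        "raid_card": 15,
        "board_ram_fans": 80,
    }
    total, usable, fits = psu_headroom(estimated_load, psu_rating_watts=400)
    print(f"estimated load {total} W, usable capacity {usable:.0f} W, fits: {fits}")
```

If the estimated load is close to or above the usable capacity, swapping in a larger supply is a cheap test before replacing anything else.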

If it's not the power supply, it's possibly the RAID card. I'm assuming you 
used the same RAID card when you moved the drives to the other server.

Just my 2c

-Clay


> On 10/31/07, Дмитрий Комалеев <d.komaleev@konliga.ru> wrote:
>> Hello everybody
>>
>> I have a big problem
>>
>> There is one FreeBSD server in our company. The server platform is: 
>> Supermicro SuperServer 6014V-T2B (2x Intel Xeon 2.8, 1Gb RAM, 3WARE 
>> 3W-8006-2LP RAID-Controller).
>> The server works as:
>> - a gateway between LAN and Internet
>> - an Intranet web- and database server (Apache + MySQL + PHP)
>> - a firewall (OpenBSD pf)
>> - a transparent proxy server (Squid)
>> Monthly traffic through this server is about 100Gb. There are about 200 
>> internet users in our company.
>> Here is part of my dmesg listing:
>>
>> Copyright (c) 1992-2007 The FreeBSD Project.
>> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>>         The Regents of the University of California. All rights reserved.
>> FreeBSD is a registered trademark of The FreeBSD Foundation.
>> FreeBSD 6.2-RELEASE-p8 #2: Thu Oct 11 19:51:25 MSD 2007
>>     sa@gateway.konliga.ru:/usr/obj/usr/src/sys/KERNEL01_NOSMP
>> module_register: module pci/em already exists!
>> Module pci/em failed to register: 17
>> ACPI APIC Table: <A M I  OEMAPIC >
>> Timecounter "i8254" frequency 1193182 Hz quality 0
>> CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2800.12-MHz 686-class CPU)
>>   Origin = "GenuineIntel"  Id = 0xf43  Stepping = 3
>> 
>> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>>   Features2=0x641d<SSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,<b14>>
>>   AMD Features=0x20000000<LM>
>>   Logical CPUs per core: 2
>> real memory  = 1073479680 (1023 MB)
>> avail memory = 1041465344 (993 MB)
>> ioapic0 <Version 2.0> irqs 0-23 on motherboard
>> ioapic1 <Version 2.0> irqs 24-47 on motherboard
>> ichwd module loaded
>> kbd1 at kbdmux0
>> ath_hal: 0.9.17.2 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, 
>> RF5413)
>> acpi0: <A M I OEMRSDT> on motherboard
>> acpi0: Power Button (fixed)
>> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
>> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
>> cpu0: <ACPI CPU> on acpi0
>> acpi_throttle0: <ACPI CPU Throttling> on cpu0
>> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
>> pci0: <ACPI PCI bus> on pcib0
>> pcib1: <ACPI PCI-PCI bridge> irq 16 at device 2.0 on pci0
>> pci1: <ACPI PCI bus> on pcib1
>> pcib2: <ACPI PCI-PCI bridge> irq 16 at device 3.0 on pci0
>> pci2: <ACPI PCI bus> on pcib2
>> pcib3: <ACPI PCI-PCI bridge> at device 28.0 on pci0
>> pci3: <ACPI PCI bus> on pcib3
>> twe0: <3ware Storage Controller. Driver version 1.50.01.002> port 
>> 0xbc00-0xbc0f mem 0xfc9ffc00-0xfc9ffc0f,0xfc000000-0xfc7fffff irq 24 at 
>> device 1.0 on pci3
>> twe0: [GIANT-LOCKED]
>> twe0: 2 ports, Firmware FE8S 1.05.00.068, BIOS BE7X 1.08.00.048
>> em0: <Intel(R) PRO/1000 Network Connection Version - 6.6.6> port 
>> 0xb800-0xb83f mem 0xfc9c0000-0xfc9dffff irq 26 at device 3.0 on pci3
>> em0: Ethernet address: 00:30:48:58:4d:2a
>> em0: [FAST]
>> em1: <Intel(R) PRO/1000 Network Connection Version - 6.6.6> port 
>> 0xb400-0xb43f mem 0xfc9a0000-0xfc9bffff irq 27 at device 4.0 on pci3
>> em1: Ethernet address: 00:30:48:58:4d:2b
>> em1: [FAST]
>> uhci0: <UHCI (generic) USB controller> port 0xe800-0xe81f irq 16 at 
>> device 29.0 on pci0
>> uhci0: [GIANT-LOCKED]
>> usb0: <UHCI (generic) USB controller> on uhci0
>> usb0: USB revision 1.0
>> uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
>> uhub0: 2 ports with 2 removable, self powered
>> uhci1: <UHCI (generic) USB controller> port 0xec00-0xec1f irq 19 at 
>> device 29.1 on pci0
>> uhci1: [GIANT-LOCKED]
>> usb1: <UHCI (generic) USB controller> on uhci1
>> usb1: USB revision 1.0
>> uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
>> uhub1: 2 ports with 2 removable, self powered
>> pci0: <base peripheral> at device 29.4 (no driver attached)
>> pci0: <base peripheral, interrupt controller> at device 29.5 (no driver 
>> attached)
>> ehci0: <Intel 6300ESB USB 2.0 controller> mem 0xfebffc00-0xfebfffff irq 
>> 23 at device 29.7 on pci0
>> ehci0: [GIANT-LOCKED]
>> usb2: EHCI version 1.0
>> usb2: companion controllers, 2 ports each: usb0 usb1
>> usb2: <Intel 6300ESB USB 2.0 controller> on ehci0
>> usb2: USB revision 2.0
>> uhub2: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
>> uhub2: 4 ports with 4 removable, self powered
>> pcib4: <ACPI PCI-PCI bridge> at device 30.0 on pci0
>> pci4: <ACPI PCI bus> on pcib4
>> pci4: <display, VGA> at device 5.0 (no driver attached)
>> isab0: <PCI-ISA bridge> at device 31.0 on pci0
>> isa0: <ISA bus> on isab0
>> atapci0: <Intel 6300ESB UDMA100 controller> port 
>> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 31.1 on pci0
>> ata0: <ATA channel 0> on atapci0
>> ata1: <ATA channel 1> on atapci0
>> pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
>> acpi_button0: <Power Button> on acpi0
>> acpi_button1: <Sleep Button> on acpi0
>> sio0: configured irq 4 not in bitmap of probed irqs 0
>> sio0: port may not be enabled
>> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on 
>> acpi0
>> sio0: type 16550A
>> sio1: configured irq 3 not in bitmap of probed irqs 0
>> sio1: port may not be enabled
>> sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
>> sio1: type 16550A
>> fdc0: <floppy drive controller (FDE)> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 
>> on acpi0
>> fdc0: [FAST]
>> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
>> ppc0: <ECP parallel printer port> port 0x378-0x37f,0x778-0x77f irq 7 drq 
>> 3 on acpi0
>> ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
>> ppc0: FIFO with 16/16/9 bytes threshold
>> ppbus0: <Parallel port bus> on ppc0
>> plip0: <PLIP network interface> on ppbus0
>> lpt0: <Printer> on ppbus0
>> lpt0: Interrupt-driven port
>> ppi0: <Parallel I/O> on ppbus0
>> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
>> atkbd0: <AT Keyboard> irq 1 on atkbdc0
>> kbd0 at atkbd0
>> atkbd0: [GIANT-LOCKED]
>> psm0: <PS/2 Mouse> irq 12 on atkbdc0
>> psm0: [GIANT-LOCKED]
>> psm0: model IntelliMouse, device ID 3
>> ichwd0: <Intel 6300ESB watchdog timer> on isa0
>> pmtimer0 on isa0
>> orm0: <ISA Option ROMs> at iomem 
>> 0xc0000-0xc7fff,0xc8000-0xc8fff,0xc9800-0xca7ff,0xca800-0xcb7ff on isa0
>> sc0: <System console> at flags 0x100 on isa0
>> sc0: VGA <16 virtual consoles, flags=0x300>
>> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
>> Timecounter "TSC" frequency 2800118202 Hz quality 800
>> Timecounters tick every 1.000 msec
>> acd0: CDROM <CD-224E-N/1.AA> at ata0-master UDMA33
>> twed0: <Unit 0, TwinStor, Normal> on twe0
>> twed0: 152626MB (312579760 sectors)
>> Trying to mount root from ufs:/dev/twed0s1a
>> ext0: link state changed to UP
>> int0: link state changed to UP
>> vlan0: link state changed to UP
>>
>> This server hangs every day without any messages in the log files or 
>> on the system console. The keyboard doesn't respond either. All I can do 
>> is a hard reset, and no core dump files appear after the restart.
>> Here is my kernel configuration file:
>>
>> include GENERIC
>> ident           KERNEL01_NOSMP
>> device          ichwd # Intel ICH watchdog timer
>> #options        SMP
>> options         ALTQ
>> options         ALTQ_CBQ
>> options         ALTQ_RED
>> options         ALTQ_RIO
>> options         ALTQ_HFSC
>> options         ALTQ_PRIQ
>> #options                ALTQ_NOPCC
>> options         SC_DISABLE_REBOOT
>> options         MP_WATCHDOG
>> options         SW_WATCHDOG
>>
>> If I build and install a kernel with the SMP option, the system under its 
>> working load begins to hang every two hours.
>>
>> Two days of running Memtest found no errors.
>> I tried installing the newest Intel Ethernet adapter driver, but it made 
>> no difference.
>> As an experiment, I also moved the system HDD to another server 
>> platform (SuperServer 6015V-TB), but the hangs didn't stop.
>> I think that it is not only a hardware problem.
>> Linux (Gentoo) and Windows Server 2003 worked fine on this hardware.
>>
>> Please help me find a solution to this problem.
>>
>> Yours faithfully
>> Dmitry Komaleev
>> IT Manager
>> "EDIPRESSE-KONLIGA" http://www.konliga.ru
>> Russia, Moscow
>> tel.:  +7 (495) 775-14-35, ext. 169
>> fax:   +7 (495) 775-14-34
>>
>> P.S. I have filed a bug report about my problem, but the only advice I 
>> received was to turn off the ACPI option.
>> If I disable ACPI, the RAID controller and both Ethernet controllers 
>> on my server receive the same IRQ. I believe this is not good.
>> _______________________________________________
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>>
>
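
On the IRQ question in the P.S.: PCI devices are designed to share interrupt lines, so a shared IRQ is not by itself proof of a problem, but it is worth knowing exactly which devices end up together. One way to check is to group the dmesg attach lines by their "irq N" field and compare the result with ACPI on and off. The Python below is a minimal sketch; it assumes raw dmesg output with one device per line (not re-wrapped by a mail client), and the function names are made up for illustration.

```python
import re

# Matches attach lines such as:
#   em0: <Intel(R) PRO/1000 ...> port 0xb800-0xb83f mem ... irq 26 at device 3.0 on pci3
# "irqs 0-23" (as printed by ioapic lines) is deliberately not matched.
IRQ_RE = re.compile(r"^(?P<dev>[a-z_]+\d+): .*\birq (?P<irq>\d+)\b")


def devices_by_irq(dmesg_text):
    """Return {irq_number: [device names]} in order of first appearance."""
    groups = {}
    for line in dmesg_text.splitlines():
        # Tolerate mail quote prefixes such as ">> " in front of a line.
        match = IRQ_RE.match(line.lstrip("> ").strip())
        if not match:
            continue
        devices = groups.setdefault(int(match.group("irq")), [])
        # A device can mention its IRQ on several lines; record it once.
        if match.group("dev") not in devices:
            devices.append(match.group("dev"))
    return groups


def shared_irqs(dmesg_text):
    """Return only the IRQs that more than one device reports."""
    return {irq: devs for irq, devs in devices_by_irq(dmesg_text).items()
            if len(devs) > 1}


if __name__ == "__main__":
    # A few lines taken from the listing above, unwrapped.
    sample = "\n".join([
        "pcib1: <ACPI PCI-PCI bridge> irq 16 at device 2.0 on pci0",
        "pcib2: <ACPI PCI-PCI bridge> irq 16 at device 3.0 on pci0",
        "em1: <Intel(R) PRO/1000 Network Connection Version - 6.6.6> "
        "port 0xb400-0xb43f mem 0xfc9a0000-0xfc9bffff irq 27 at device 4.0 on pci3",
    ])
    for irq, devs in sorted(shared_irqs(sample).items()):
        print(f"irq {irq}: {', '.join(devs)}")
```

Feeding it the full dmesg from a boot with ACPI disabled would show whether twe0, em0 and em1 really do land on one IRQ.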





