Date:      Mon, 01 Mar 2004 10:23:52 -0500
From:      "David A. Koran" <dak@solo.net>
To:        Evren Yurtesen <yurtesen@ispro.net.tr>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: Same Panic 12 on differnet servers
Message-ID:  <40435588.6010604@solo.net>
In-Reply-To: <40434F23.7070608@ispro.net.tr>
References:  <40434197.8060100@solo.net> <40434F23.7070608@ispro.net.tr>

See my comments inline. (I should note that I'm glad somebody is willing
to work with me on this; I'll do my best to comply with the requests
for information.)

Evren Yurtesen wrote:

> Hi,
>
> What does the panic message say? 

Unfortunately, I can only describe the results, not the panic type. The
events that were described in the original thread mirror what I have;
however, I'm remote and can't pick up the console messages (if somebody's
got a cool trick for that, I'd appreciate it). Usually all I can record
is that it has rebooted after a panic. Originally I was having more issues
with processes not clearing buffers (I can find the thread that proposed
the fix, but I remember it related to threaded servers like Apache and
MySQL).

>
>
> Did you do cvsup and world before the crashes started? or after? 

I cvsup about once or twice a day. I build world about once a week and
upgrade ports daily. I'm going through a portupgrade right now and will
build world again this afternoon. I don't have an exact date, only
anecdotal evidence to support this behaviour at a given time, but I
believe this started sometime after mid-January and has continued until
now.

>
> When was the last cvsup you made which worked stable? 80 days ago? 

See above. I think I may be able to go back to a January 15th tree and
see. I think the last "panic" build I had (meaning I just cvsuped for
major fixes and built without watching the build process) was shortly
after the shmat
(http://lists.freebsd.org/pipermail/freebsd-security-notifications/2004-February/000022.html)
alert. I will check my kernel config to see whether we may have a Sys V
issue.

>
> You know, you can do some testing and go back to the same day you made 
> cvsup last time when it was stable and see if the problem persists. If 
> you have multiple machines then you can set them with 10 day 
> difference and see which ones will crash and which wont. Then close 
> the gap and find the day when the code which is causing this has been 
> committed and eventually find the reason. There is no easy way to tell 
> what is your problem with the information you have sent. Well this 
> method would work if the problem is a software bug. Did you consider 
> that there might be some hardware problems which showed themselves 
> after a reboot after 80 days? However improbable, it is a possibility. 

The hardware is fine and has been working without a hitch. And since
I'm not sure EXACTLY when the last stable build occurred (I can look at
my saved daily logs for repeated reboots), I'm not going to have much to
go on right now. I was more or less soliciting any me-toos to see if we
can pinpoint the issue. I'll post back on my progress in finding out
when this occurred (or at least started).
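
The "close the gap" approach suggested above amounts to a bisection over
source dates. Here is a minimal sketch of the idea in Python, not a
finished tool. The function name and the is_stable callback are
hypothetical: in practice that step is a person checking out the tree as
of the given date, building world and kernel, and waiting to see whether
the box panics. The sketch also assumes a single date after which every
tree is bad, which only holds if the cause is a software change.

```python
from datetime import date, timedelta


def first_bad_date(good, bad, is_stable):
    """Return the earliest source date that shows the panic.

    good      -- a date whose tree is known to be stable
    bad       -- a later date whose tree is known to panic
    is_stable -- hypothetical callback; stands in for "build the tree
                 as of this date and observe whether it stays up"
    """
    # Invariant: `good` is stable and `bad` is not.
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_stable(mid):
            good = mid   # problem was introduced after mid
        else:
            bad = mid    # problem was introduced on or before mid
    return bad
```

Between a January 15th tree and the February 21st build that is 37 days,
so the halving needs about six test builds rather than one per day.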

>
>
> Which process is using the cpu so much before crashing? 

This is post-crash diagnostics, so I'm not monitoring processes yet.
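
One way to start would be to log the busiest processes at intervals, so
that the last entry before a reboot hints at what was running. A rough
sketch in Python follows, under stated assumptions: the function names
and the log path are made up for illustration, and the ps keywords
(pid, %cpu, command) should be checked against the local ps(1) before
relying on them.

```python
import os
import subprocess
import time


def top_cpu(ps_output, n=5):
    """Parse ps text with columns PID, %CPU, COMMAND; return the top n rows."""
    rows = []
    for line in ps_output.splitlines()[1:]:      # skip the header line
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, cpu, command = parts
        try:
            rows.append((float(cpu), int(pid), command.strip()))
        except ValueError:
            continue                             # ignore unparsable lines
    rows.sort(reverse=True)                      # highest CPU first
    return rows[:n]


def log_snapshot(path, n=5):
    """Append a timestamped list of the busiest processes to a log file."""
    out = subprocess.run(["ps", "-axo", "pid,%cpu,command"],
                         capture_output=True, text=True, check=True).stdout
    with open(path, "a") as log:
        log.write(time.strftime("%Y-%m-%d %H:%M:%S") + "\n")
        for cpu, pid, command in top_cpu(out, n):
            log.write(f"{cpu:6.1f} {pid:6d} {command}\n")
        log.flush()
        os.fsync(log.fileno())   # push to disk so the entry survives a panic
```

Calling log_snapshot("/var/log/cpu-snapshots.log") from cron every minute
or so (the path is only an example) would leave a trail to read after the
next reboot.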

>
>
> Did you recompile and update other binaries on your system which
> don't come with the default FreeBSD distribution? 

I have a ton of apps on the machine (it's a loaded web server and mail
server; most of the load comes from spam and virus scanning of incoming
e-mail right now), so pinpointing the offending app right now will
probably take more work.

>
>
> How many different servers are you getting this panic on? do they have 
> the same hardware? 

Just this one (my backup test box [read: laptop] is out for hardware 
maintenance... FreeBSD 5.x kept dying on it... urf!)

>
>
> Evren
>
> David A. Koran wrote:
>
>> I'm getting the same type of errors on a box that's been keeping
>> current on 4.9 (and the 4.x tree) for the past two years. The box had
>> been relatively stable up until late December, but in January and
>> all through February it has been rebooting on a regular basis.
>>
>> This is a dual-proc box with 256 MB of RAM. I'm running a pretty
>> balanced combination of web and mail server on it. The load used to
>> stay (and with some tuning still stays) below 1.00, but I've seen it
>> get above 3.00 and start crashing. I had it at 80.00+ before without
>> it dying, so I'm betting there's some code instability.
>>
>> I'd be willing to work with any developer on the list to test code to
>> get this condition mentioned here in the thread solved.
>>
>> (On a side note, to inspire some quicker work, we host Howard Shore's
>> website, the one who won the music Oscars last night on Lord of the
>> Rings, and I would be grateful for any help to keep the site stable)
>>
>> P.S. - the unmounted filesystem error below is after one of the crash
>> reboots.
>>
>>
>> mail# uname -a
>> FreeBSD mail.solo.net 4.9-STABLE FreeBSD 4.9-STABLE #20: Sat Feb 21
>> 12:03:07 EST 2004     root@mail.solo.net:/usr/obj/usr/src/sys/SOLONET 
>> i386
>> mail# dmesg
>> Copyright (c) 1992-2003 The FreeBSD Project.
>> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,
>> 1994
>>        The Regents of the University of California. All rights
>> reserved.
>> FreeBSD 4.9-STABLE #20: Sat Feb 21 12:03:07 EST 2004
>>    root@mail.solo.net:/usr/obj/usr/src/sys/SOLONET
>> Timecounter "i8254"  frequency 1193182 Hz
>> CPU: Pentium II/Pentium II Xeon/Celeron (350.80-MHz 686-class CPU)
>>  Origin = "GenuineIntel"  Id = 0x652  Stepping = 2
>>  Features=0x183fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR> 
>>
>> real memory  = 268435456 (262144K bytes)
>> avail memory = 257265664 (251236K bytes)
>> Programming 24 pins in IOAPIC #0
>> IOAPIC #0 intpin 2 -> irq 0
>> IOAPIC #0 intpin 16 -> irq 11
>> IOAPIC #0 intpin 18 -> irq 9
>> FreeBSD/SMP: Multiprocessor motherboard: 2 CPUs
>> cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee00000
>> cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee00000
>> io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec00000
>> Preloaded elf kernel "kernel" at 0xc03bc000.
>> Pentium Pro MTRR support enabled
>> md0: Malloc disk
>> npx0: <math processor> on motherboard
>> npx0: INT 16 interface
>> pcib0: <Intel 82443BX (440 BX) host to PCI bridge> on motherboard
>> pci0: <PCI bus> on pcib0
>> pcib1: <Intel 82443BX (440 BX) PCI-PCI (AGP) bridge> at device 1.0 on
>> pci0
>> pci1: <PCI bus> on pcib1
>> pci1: <ATI Mach64-GZ graphics accelerator> at 0.0 irq 11
>> isab0: <Intel 82371AB PCI to ISA bridge> at device 7.0 on pci0
>> isa0: <ISA bus> on isab0
>> atapci0: <Intel PIIX4 ATA33 controller> port 0xffa0-0xffaf at device
>> 7.1 on pci0
>> ata0: at 0x1f0 irq 14 on atapci0
>> ata1: at 0x170 irq 15 on atapci0
>> uhci0: <Intel 82371AB/EB (PIIX4) USB controller> at device 7.2 on pci0
>> uhci0: Could not map ports
>> device_probe_and_attach: uhci0 attach returned 6
>> Timecounter "PIIX"  frequency 3579545 Hz
>> chip1: <Intel 82371AB Power management controller> port 0x440-0x44f at
>> device 7.3 on pci0
>> pcib2: <DEC 21152 PCI-PCI bridge> at device 16.0 on pci0
>> pci2: <PCI bus> on pcib2
>> vx0: <3COM 3C590 Etherlink III PCI> port 0xdf80-0xdf9f irq 9 at device
>> 6.0 on pci2
>> utp[*utp*] address 00:a0:24:92:d2:d0
>> vx0: driver is using old-style compatibility shims
>> ahc0: <Adaptec aic7895 Ultra SCSI adapter> port 0xe400-0xe4ff mem
>> 0xffafe000-0xffafefff irq 11 at device 18.0 on pci0
>> aic7895C: Ultra Wide Channel A, SCSI Id=7, 32/253 SCBs
>> ahc1: <Adaptec aic7895 Ultra SCSI adapter> port 0xe800-0xe8ff mem
>> 0xffaff000-0xffafffff irq 11 at device 18.1 on pci0
>> aic7895C: Ultra Wide Channel B, SCSI Id=7, 32/253 SCBs
>> orm0: <Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc87ff on isa0
>> pmtimer0 on isa0
>> fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on
>> isa0
>> fdc0: FIFO enabled, 8 bytes threshold
>> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
>> atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
>> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on
>> isa0
>> sc0: <System console> at flags 0x100 on isa0
>> sc0: VGA <16 virtual consoles, flags=0x300>
>> sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
>> sio0: type 16550A
>> sio1 at port 0x2f8-0x2ff irq 3 on isa0
>> sio1: type 16550A
>> ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
>> ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode
>> plip0: <PLIP network interface> on ppbus0
>> lpt0: <Printer> on ppbus0
>> lpt0: Interrupt-driven port
>> ppi0: <Parallel I/O> on ppbus0
>> APIC_IO: Testing 8254 interrupt delivery
>> APIC_IO: routing 8254 via IOAPIC #0 intpin 2
>> DUMMYNET initialized (011031)
>> IP packet filtering initialized, divert enabled, rule-based forwarding
>> enabled, default to accept, logging limited to 100 packets/entry by
>> default
>> IP Filter: v3.4.31 initialized.  Default = pass all, Logging = enabled
>> SMP: AP CPU #1 Launched!
>> ad0: DMA limited to UDMA33, non-ATA66 cable or device
>> ad0: 76319MB <WDC WD800LB-00DNA0> [155061/16/63] at ata0-master UDMA33
>> ad2: 114473MB <WDC WD1200JB-00CRA1> [232581/16/63] at ata1-master
>> UDMA33
>> ad3: 114473MB <WDC WD1200JB-00CRA1> [232581/16/63] at ata1-slave
>> UDMA33
>> acd0: CDROM <TOSHIBA CD-ROM XM-6402B> at ata0-slave PIO4
>> Waiting 5 seconds for SCSI devices to settle
>> Mounting root from ufs:/dev/ad0s1a
>> WARNING: / was not properly dismounted
>> vx0: promiscuous mode enabled
>>
>>
>> _______________________________________________
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to 
>> "freebsd-stable-unsubscribe@freebsd.org"
>
>
>


