Date:      Fri, 28 Dec 2001 00:39:00 +0100
From:      "Kristian K. Nielsen" <jkkn@jkkn.dk>
To:        "Matthew Dillon" <dillon@apollo.backplane.com>, "Nils Holland" <nils@tisys.org>
Cc:        "Søren Schmidt" <sos@freebsd.dk>, "Matthew Gilbert" <agilbertm@earthlink.net>, <freebsd-stable@FreeBSD.ORG>, <freebsd-hackers@FreeBSD.ORG>
Subject:   Re: 4.4-STABLE crashes - suspects new ata-driver over wd-drivers
Message-ID:  <008d01c18f2f$ab983b40$bb5ca8c0@jkkn.net>
References:  <200112262355.fBQNtfK48250@apollo.backplane.com> <200112270945.fBR9j1e97273@freebsd.dk> <20011227163252.A151@tisys.org> <200112271847.fBRIlxh52129@apollo.backplane.com>

Hey,

It is great that you are finding a solution for the VIA chipset.
Do you have any idea whether I am experiencing a similar problem?

I do not know enough about chipset code to tell exactly what the patch
is doing, but maybe it just decreases the load on the kernel/system in a
way that avoids the crashes, even though there is still a bug out there
somewhere?!

I do not have a single VIA chip in my box that I know of. It is all Intel
and runs the latest BIOS version available for my motherboard, yet it
still crashes whenever I put any pressure on it, such as compiling or
moving large files across filesystems:

Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.4-STABLE #0: Fri Dec  7 14:21:48 CET 2001
    jkkn@jkkn.jkkn.net:/usr/src/sys/compile/JKKN_KRNL
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 300683283 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (300.68-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x634  Stepping = 4

  Features=0x80f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,MMX>
real memory  = 402640896 (393204K bytes)
avail memory = 387928064 (378836K bytes)
Preloaded elf kernel "kernel" at 0xc02ff000.
Pentium Pro MTRR support enabled
Using $PIR table, 6 entries at 0xc00f0d10
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82443BX (440 BX) host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pcib1: <Intel 82443BX (440 BX) PCI-PCI (AGP) bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
isab0: <Intel 82371AB PCI to ISA bridge> at device 4.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX4 ATA33 controller> port 0xd800-0xd80f at device 4.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: <Intel 82371AB/EB (PIIX4) USB controller> at 4.2 irq 12
chip1: <Intel 82371AB Power management controller> port 0xe800-0xe80f at device 4.3 on pci0
rl0: <RealTek 8139 10/100BaseTX> port 0xd000-0xd0ff mem 0xe3000000-0xe30000ff irq 10 at device 10.0 on pci0
rl0: Ethernet address: 00:40:95:30:2e:5e
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pci0: <S3 ViRGE DX/GX graphics accelerator> at 12.0 irq 11
orm0: <Option ROM> at iomem 0xc0000-0xc7fff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
device_probe_and_attach: atkbd0 attach returned 6
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
IPsec: Initialized Security Association Processing.
ad0: 39266MB <IC35L040AVER07-0> [79780/16/63] at ata0-master UDMA33
ad2: 9641MB <IBM-DTTA-371010> [19590/16/63] at ata1-master UDMA33
acd0: CD-RW <CD-RW CRX100E> at ata1-slave using PIO4
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
swapon: adding /dev/ad0s1b as swap device
Automatic boot in progress...
/dev/ad0s1a:
1312 files, 66500 used, 32691 free
(355 frags, 4042 blocks, 0.4% fragmentation)
/dev/ad2s1a:
FILESYSTEM CLEAN; SKIPPING CHECKS
/dev/ad2s1a:
clean, 15373 free
(197 frags, 1897 blocks, 0.5% fragmentation)
/dev/ad2s1f:
FILESYSTEM CLEAN; SKIPPING CHECKS
/dev/ad2s1f:
clean, 2784577 free
(35865 frags, 343589 blocks, 0.4% fragmentation)
/dev/ad2s1e:
FILESYSTEM CLEAN; SKIPPING CHECKS
/dev/ad2s1e:
clean, 10574 free
(1182 frags, 1174 blocks, 6.0% fragmentation)
/dev/ad0s1f:
UNREF FILE
 I=8558312  OWNER=cyrus MODE=100600
/dev/ad0s1f: SIZE=676 MTIME=Dec 26 03:35 2001  (CLEARED)
/dev/ad0s1f:
FREE BLK COUNT(S) WRONG IN SUPERBLK
 (SALVAGED)
/dev/ad0s1f:
SUMMARY INFORMATION BAD
 (SALVAGED)
/dev/ad0s1f:
BLK(S) MISSING IN BIT MAPS
 (SALVAGED)
/dev/ad0s1f:
237835 files, 5627465 used, 32454489 free
(75065 frags, 4047428 blocks, 0.2% fragmentation)
/dev/ad0s1e:
135 files, 568 used, 19247 free
(71 frags, 2397 blocks, 0.4% fragmentation)
Doing initial network setup:
 hostname
.
rl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet 192.168.1.2 netmask 0xffffff00 broadcast 192.168.1.255
        inet 192.168.1.3 netmask 0xffffffff broadcast 192.168.1.3
        inet 192.168.1.4 netmask 0xffffffff broadcast 192.168.1.4
        ether 00:40:95:30:2e:5e
        media: Ethernet autoselect (100baseTX)
        status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
        inet 127.0.0.1 netmask 0xff000000
add net default: gateway 192.168.1.1
Additional routing options:
 TCP keepalive=YES
.
Routing daemons:
.
Additional daemons:
 syslogd
.
dumpon: crash dumps to /dev/ad0s1b (116, 131073)
Checking for core dump:
savecore: reboot after panic: page fault
Dec 26 04:11:11 jkkn savecore: reboot after panic: page fault
savecore: system went down at Wed Dec 26 04:07:57 2001
savecore: writing core to /var/crash/vmcore.4

.....snip end.....


Regards
Kristian

----- Original Message -----
From: "Matthew Dillon" <dillon@apollo.backplane.com>
To: "Nils Holland" <nils@tisys.org>
Cc: "Søren Schmidt" <sos@freebsd.dk>; "Matthew Gilbert"
<agilbertm@earthlink.net>; <freebsd-stable@FreeBSD.ORG>;
<freebsd-hackers@FreeBSD.ORG>
Sent: Thursday, December 27, 2001 7:47 PM
Subject: Re: 4.4-STABLE crashes - suspects new ata-driver over wd-drivers


>     This is great news!  I'm crossing my fingers and hoping that Nils
>     can't reproduce the crash any more with Soren's fix.
>
>     Just to let you all know, Nils has been working his ass off helping me
>     track his crash down.  I've been pulling my hair out... I gave him
>     patch after patch to test various conditions & panic if the nfs_node's
>     hash list somehow got broken, and for the last week not a single one
>     of those tests detected the problem prior to the panic.  The
>     nfs_node's hash list was being corrupted seemingly out of nowhere.
>
>     The last two days I've had Nils use hardware watchpoints in DDB> to
>     try to track down what was modifying the memory location, with no
>     success.  The watchpoint was catching the (correct) write to the list
>     head but then failed to catch the corrupted write prior to the system
>     panicking, which is what makes me believe it is some sort of chipset
>     issue.
>
>     Another thing to note:  One of the really weird things about Nils's
>     crashes is that the same memory location was getting corrupted every
>     time, five times in a row (which made it possible to use a hardware
>     watchpoint).  The corruption changed somewhat when he added the
>     hardware watchpoint.  Another similar set of crashes in the
>     vm_page_list (that other people report, including a number of machines
>     at Yahoo) has a similar M.O.: IDE drive, medium/heavy activity, but
>     while the corrupted address always winds up in the (static) vm_page
>     array, it always tends to be slightly different.  I'm hoping that it
>     winds up being the same or similar issue.  I'm not ruling out the
>     possibility that chipsets other than the 686B have problems too.
>
>     In any case, Nils's description makes a lot of sense.  I've asked him
>     to continue testing his system to make sure that this particular crash
>     cannot be reproduced, and I am crossing my fingers.
>
>     I'm also wondering how applicable this patch might be in regards to
>     forcing a 'safe' mode for other PCI chipsets, to allow us to test
>     it on non-686B machines that have similar problems.
>
> -Matt
> Matthew Dillon
> <dillon@backplane.com>
>
>
> :On Thu, Dec 27, 2001 at 10:45:01AM +0100, Søren Schmidt stood up and
> :spoke:
> :>
> :> OK, here goes the VIA 686b patch, it is hand cut out from the bulk
> :> patches to go into 4.5 so beware :)
> :
> :Well, as Matt has said, I reported a crash that he's trying to debug.
> :Since I have the 686b in my machine, I applied the patch. Ever since then
> :I was not able to reproduce the crash again, although yesterday it was so
> :easy that I could do it twice an hour ;-)
> :
> :Anyway, you (Soren) said that the right way to fix this is a BIOS update.
> :Now, could it be that some mainboard manufacturers are incapable of
> :handling this? I'm using the latest BIOS for my board, and according to
> :http://www.chaintech.com.tw/DL/7xMB/7AJA0.HTM, this should already have
> :been fixed in their BIOS release from 2001-04-23...
> :
> :Second interesting thing: I was using a UDMA66 drive on my 686b until a
> :few weeks ago and never had any problems - the stuff Matt is looking at
> :only started to appear a short while after I exchanged that drive for a
> :UDMA100 one. So, it seems as if probably the slower drive didn't produce
> :a high enough PCI workload for anything to actually happen.
> :
> :This fix will probably also have some influence on a few other similar
> :problems (I read Matt was working on many of them). In the end I hope
> :that this fix - or a variation thereof - will actually go into 4.5.
> :
> :Greetings
> :Nils
> :
> :--
> :Nils Holland
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-stable" in the body of the message
>


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message



