Date: Sat, 23 Nov 2002 03:46:45 -0500
From: Scott Sipe <cscotts@mindspring.com>
To: Terry Lambert <tlambert2@mindspring.com>
Cc: John Baldwin <jhb@FreeBSD.org>, current@FreeBSD.org
Subject: Re: DP2 Fatal Trap
Message-ID: <200211230346.45917.cscotts@mindspring.com>
In-Reply-To: <3DDF305B.7C468ED6@mindspring.com>
References: <XFMail.20021122105737.jhb@FreeBSD.org> <200211222210.24713.cscotts@mindspring.com> <3DDF305B.7C468ED6@mindspring.com>

On Saturday 23 November 2002 02:38 am, Terry Lambert wrote:
> Scott Sipe wrote:
> > Alright, this is pretty frustrating. I've installed DP2 4 or 5 times now
> > (each time reformatting).
> >
> > The first time the installation program acted really weird and didn't do
> > the install correctly.
>
> Not useful information, but expected, based on your other reports.
> Most people would accuse you of overclocking. 8-).

I'm actually underclocking (and have been).

> > The third time I think was with the trap 12 when I started tcsh...
> > now I reinstalled and it SEEMS to be working fine. At least enough
> > that I was able to compile a custom kernel, and compile most of
> > the gnome suite from ports too (and then to remove it ;).
>
> The trap 12 is a real problem. The useful information you posted
> before was the traceback. The fact that the error occurred where
> no such error should be possible is indicative of a hardware
> problem: either bad RAM, or a cooked CPU (usually a result of
> overclocking), or a CPU bug from the vendor, or a problem with
> the data as it was transferred from the hard drive (a disk or
> controller problem; rarely, a driver problem, though it's not
> likely, since you got as far as you did).

As mentioned earlier, not overclocking (and I am running Stable and
WinXP with no stability problems).

> > There was ONE problem I had -- one of the g++ include files
> > (limits) had one line that was corrupted and I could fix.
> > the line was like:
> >
> > coint name_more; (something like that)
> >
> > when it should have been
> > const int name_more10;
>
> If this was actually it, then it dropped 32 bits on its way
> into cache. Did you try rebooting, to see if the file "healed
> itself"? This would support the theory of a disk/controller/driver
> error. Is this maybe a CMD 640B or similar IDE/ATAPI controller?
>
> Note that this could also be a result of a dirty cache line and/or
> a CPU bug, but it's more likely to *change* characters, rather
> than deleting them. The correct thing to do would probably be to
> "hd" the file, to see if the characters were not there, or if they
> were converted to non-displayed characters (e.g. four "0x00"'s).

I didn't make a backup copy of the bad file (or mark down the errors) or try
rebooting, which in retrospect would have been a good idea..sorry--I just
fixed the file and saved it so I could compile some ports--and that worked.

I have an IWILL KK266 motherboard which has an "AMI MegaRaid" controller and a
VIA Apollo KT133A chipset. The FreeBSD drive is primary master ad0 on the
VIA IDE line (both Current and Stable are on the same disk). I have a DVD
drive and a CD-RW on the secondary channel. Then 2 hard disks, one each on the
RAID controller (I use the BIOS to alternate which drives are used for
booting--the RAID or the IDE).

Some pertinent parts of my STABLE dmesg:

atapci0: <VIA 82C686 ATA100 controller> port 0xc000-0xc00f at device 7.1 on pci0
atapci0: Correcting VIA config for southbridge data corruption bug
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
atapci1: <CMD 649 ATA100 controller> port 0xe800-0xe80f,0xe400-0xe403,0xe000-0xe007,0xdc00-0xdc03,0xd800-0xd807 irq 10 at device 16.0 on pci0
ata2: at 0xd800 on atapci1
ata3: at 0xe000 on atapci1
ad0: 76319MB <MAXTOR 4K080H4> [155061/16/63] at ata0-master UDMA100
ad1: 114440MB <WDC WD1200JB-75CRA0> [232514/16/63] at ata2-master UDMA100
ad2: 114440MB <WDC WD1200JB-75CRA0> [232514/16/63] at ata3-master UDMA100
acd0: DVD-ROM <Pioneer DVD-ROM ATAPIModel DVD-106S 0109> at ata1-master PIO4
acd1: CD-RW <LITE-ON LTR-32123S> at ata1-slave PIO4

> > (line 1710 iirc)
> >
> > so basically it seems like I'm getting random data corruption at random
> > times with random results. fwiw, I'm running stable off the same
> > harddisk as I type this.
>
> Stable has this problem? Yes or No?

No, Stable has no problems at all like I see in Current. Install never messed
up, no data corruption, no traps, few core dumps (and none like I experienced
in Current). I've been running Stable on this computer since early
in the 4.x series and haven't seen this kind of problem before.

> > sorry if this throws a wrench in things again.
>
> No, it doesn't. But it would help if you answered the question
> about whether or not Stable has the problem, too, and the first
> three questions I asked. If it's the CPU bug, I *can* provide a
> kernel that fixes the problem, I believe. I just have to be able
> to create a kernel that has the problem, first, and since my hardware
> doesn't have the problem locally, that leaves your hardware, for the
> testing.

1) Yes, it happened with a generic kernel straight off the DP2 install CD.

2) I had the problems directly off a CD burned from the DP2 ISO image, so can
that tell you what you need to know about the cvs date, or do you want me to
do more?

3) Yes, I'm at college on a fast connection (though with a limited upload), so
if you need to I can set up an ftp login for you on my computer.

thanks,
Scott

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
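
The "hd" check suggested in the quoted text is meant to tell apart characters that are missing from the file and characters that were replaced by non-displayed bytes such as 0x00. A minimal sketch of the same kind of check in Python follows. The function name find_suspect_bytes and the sample bytes are hypothetical, made up for illustration; they are not the actual corrupted file from this thread, which was not preserved.

```python
def find_suspect_bytes(data: bytes):
    """Return (line, column, byte_value) for every byte that is neither
    printable ASCII nor a tab, newline, or carriage return."""
    allowed_controls = {0x09, 0x0A, 0x0D}
    hits = []
    line, col = 1, 1
    for b in data:
        if not (0x20 <= b <= 0x7E or b in allowed_controls):
            hits.append((line, col, b))
        if b == 0x0A:          # newline: advance to the next line
            line += 1
            col = 1
        else:
            col += 1
    return hits


# Hypothetical sample: the second line has four NUL bytes where text should be.
sample = b"const int ok;\nco\x00\x00\x00\x00int bad;\n"
for line, col, b in find_suspect_bytes(sample):
    print(f"line {line}, column {col}: byte 0x{b:02x}")
```

If such a scan reports nothing on a line that is visibly wrong, the characters are absent rather than overwritten, which is the distinction the quoted advice is after. To scan a real file, read it in binary mode (open(path, "rb").read()) and pass the result to the function.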