From owner-freebsd-current Tue Jan 6 00:51:07 1998
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id AAA17598 for current-outgoing; Tue, 6 Jan 1998 00:51:07 -0800 (PST) (envelope-from owner-freebsd-current)
Received: from dog.farm.org (gw-hssi-2.farm.org [209.66.103.33]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id AAA17570 for ; Tue, 6 Jan 1998 00:50:48 -0800 (PST) (envelope-from dog.farm.org!dk)
Received: (from dk@localhost) by dog.farm.org (8.7.5/dk#3) id AAA04597; Tue, 6 Jan 1998 00:52:04 -0800 (PST)
Date: Tue, 6 Jan 1998 00:52:04 -0800 (PST)
From: Dmitry Kohmanyuk
Message-Id: <199801060852.AAA04597@dog.farm.org>
To: asami@cs.berkeley.edu (Satoshi Asami)
Cc: freebsd-current@freebsd.org, dk@farm.org
Subject: Re: RAM parity error
Newsgroups: cs-monolit.gated.lists.freebsd.current
Organization: FARM Computing Association
Reply-To: dk+@ua.net
X-Newsreader: TIN [version 1.2 PL2]
Sender: owner-freebsd-current@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

In article <199712310100.RAA03280@vader.cs.berkeley.edu> you wrote:
> I have been seeing a "RAM parity error" in one of our machines lately.
> I have swapped machines and it still happens on the machine in the
> same position. The only things that are common in the old and new
> machines are the external SCSI disk array. It has happened on 3 PCs.
> Is it possible that data on a non-system filesystem would cause such
> an error? I always thought the parity error panic is caused by the
> chipset asserting a signal line dedicated for the NMI....
> Here's one of the dumps:
[...]
> #14 0xf0151f62 in spec_strategy ()
> #15 0xf0151689 in spec_vnoperate ()
> #16 0xf01aee01 in ufs_vnoperatespec ()
> #17 0xf0114606 in ccdstart ()
> #18 0xf0114560 in ccdstrategy ()
> #19 0xf0151f62 in spec_strategy ()
[...]

Note the ccd drives...

I have a dual P6 machine running 2.2.5, which was installed a year ago and has not had a single failure since. It is running as a news server.
Just recently, I configured ccd on it with 2 arrays, each with 2 drives (on its own controller). I have two 2940s (one Ultra, one not), 128M of parity RAM, and Hawk and Barracuda drives (all 4G). 10 days after that, I got the same error.

> P6-200
> Intel Venus (VS440FX) motherboard
> 96MB RAM (32MB fake, 64MB real)
> 2 x Adaptec 3940UW
> 14 x IBM Scorpion 9GB drives
> Intel EEPro100/B running at 100Mbps
> 3Com 595TX running at 10Mbps
> 3.0-current of 12/11 + CAM
> The disks are connected 7 each on channel A of both controllers, then
> combined into a 14-disk ccd array. The 3Com card is the interface to
> the world; the Intel card is connected via a crossover cable to a
> Windows NT machine. The panic always happens when I download a large
> directory through the crossover cable.

As you see, I have a different OS version and different disks, and the same kind of error... Looks like it has something to do with the PCI SCSI controller/bus?

dmesg info:

CPU: Pentium Pro (199.43-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x617 Stepping=7
  Features=0xfbff,MTRR,PGE,MCA,CMOV>
real memory = 134217728 (131072K bytes)
avail memory = 127344640 (124360K bytes)
DEVFS: ready for devices
Probing for devices on PCI bus 0:
chip0 rev 2 on pci0:0
chip1 rev 1 on pci0:7:0
chip2 rev 0 on pci0:7:1
de0 rev 18 int a irq 5 on pci0:10
de0: DEC DE500-XA 21140 [10-100Mb/s] pass 1.2
de0: address 00:00:f8:30:99:73
ahc0 rev 0 int a irq 9 on pci0:11
ahc0: aic7870 Single Channel, SCSI Id=7, 16 SCBs
ahc0 waiting for scsi devices to settle
(ahc0:0:0): "SEAGATE ST34371N 0338" type 0 fixed SCSI 2
sd0(ahc0:0:0): Direct-Access 4148MB (8496960 512 byte sectors)
sd0(ahc0:0:0): with 5168 cyls, 10 heads, and an average 164 sectors/track
(ahc0:1:0): "SEAGATE ST15230N 0638" type 0 fixed SCSI 2
sd1(ahc0:1:0): Direct-Access 4095MB (8386733 512 byte sectors)
sd1(ahc0:1:0): with 3992 cyls, 19 heads, and an average 110 sectors/track
(ahc0:3:0): "SEAGATE ST15230N 0498" type 0 fixed SCSI 2
sd2(ahc0:3:0): Direct-Access 4095MB (8386733 512 byte sectors)
sd2(ahc0:3:0): with 3992 cyls, 19 heads, and an average 110 sectors/track
vga0 rev 84 int a irq 10 on pci0:12
ahc1 rev 0 int a irq 11 on pci0:13
ahc1: aic7880 Single Channel, SCSI Id=7, 16 SCBs
ahc1 waiting for scsi devices to settle
(ahc1:1:0): "SEAGATE ST34371N 0280" type 0 fixed SCSI 2
sd3(ahc1:1:0): Direct-Access 4148MB (8496960 512 byte sectors)
sd3(ahc1:1:0): with 5168 cyls, 10 heads, and an average 164 sectors/track
(ahc1:2:0): "SEAGATE ST34371N 0280" type 0 fixed SCSI 2
sd4(ahc1:2:0): Direct-Access 4148MB (8496960 512 byte sectors)
sd4(ahc1:2:0): with 5168 cyls, 10 heads, and an average 164 sectors/track
(ahc1:5:0): "TEAC CD-ROM CD-56S 1.0A" type 5 removable SCSI 2
cd0(ahc1:5:0): CD-ROM can't get the size
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
sio0 at 0x3f8-0x3ff irq 4 on isa
sio0: type 16550A
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
bt0 not found at 0x330
npx0 on motherboard
npx0: INT 16 interface
DEVFS: ready to run
ccd0-3: Concatenated disk drivers
IP packet filtering initialized, divert enabled, logging disabled
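
On the NMI question in the quoted message, here is a minimal sketch of how an NMI cause can be told apart. It assumes the conventional PC/AT layout, where the status byte at I/O port 0x61 flags a memory parity check in bit 7 and an I/O channel check in bit 6. The names NMI_PARITY, NMI_CHANCHK and decode_nmi_status are made up for illustration, and the code only decodes a byte handed to it; it does not read the port, and it is not a copy of what the kernel does.

```python
# Bit masks for the status byte at I/O port 0x61 (assumed PC/AT layout).
NMI_PARITY = 0x80   # bit 7: memory parity check
NMI_CHANCHK = 0x40  # bit 6: I/O channel check


def decode_nmi_status(status):
    """Return the NMI causes flagged in a port 0x61 status byte."""
    causes = []
    if status & NMI_PARITY:
        causes.append("RAM parity error")
    if status & NMI_CHANCHK:
        causes.append("I/O channel check")
    return causes


if __name__ == "__main__":
    # Walk through the four combinations of the two flag bits.
    for status in (0x00, 0x40, 0x80, 0xC0):
        print(hex(status), decode_nmi_status(status))
```

If that layout holds, the two flags are separate, so an NMI raised on behalf of a bus device would be expected to show up as a channel check rather than as a memory parity check.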