Date:      Fri, 19 Jan 96 11:24:28 +0100
From:      wirth@zerberus.hai.siemens.co.at (Helmut F. Wirth)
To:        freebsd-bugs@freebsd.org
Cc:        wirth@zerberus.hai.siemens.co.at
Subject:   Bug with NCR810 driver and FreeBSD 2.1 release, please help
Message-ID:  <9601191024.AA01493@zerberus.hai.siemens.co.at>

Hello !

I think I have triggered a bug in the NCR driver code: if I try to
dump from one of the IBM disks (see below), an error shows up in the
NCR driver. Details follow.

My Hardware:

Pentium-120, ASUS TPX4 motherboard, 32MB memory 
NCR810 controller
Diamond Stealth 64 
SCSI Bus:
Quantum ATLAS (target 0)
IBM OEM 1GB   (target 1)
IBM OEM 1GB   (target 2)
VIPER ARCHIVE tape (target 5)
TOSHIBA CDROM (target 6)

Software:
MSDOS 6.2 (Win 3.1) on target 0
FreeBSD 2.1 release on target 0,1,2; target 0 contains /, swap and /usr

These are the FreeBSD kernel boot messages:
Jan 18 20:06:05 atlantis /kernel: FreeBSD 2.1.0-RELEASE #0: Wed Jan 17 21:39:28  1996
Jan 18 20:06:05 atlantis /kernel:     hfwirth@atlantis.ping.at:/usr/src/sys/compile/ATLANTIS
Jan 18 20:06:05 atlantis /kernel: CPU: 120-MHz Pentium 735\90 or 815\100 (Pentium-class CPU)
Jan 18 20:06:05 atlantis /kernel:   Origin = "GenuineIntel"  Id = 0x525  Stepping=5
Jan 18 20:06:05 atlantis /kernel:   Features=0x1bf<FPU,VME,PSE,MCE,CX8,APIC>
Jan 18 20:06:05 atlantis /kernel: real memory  = 33554432 (32768K bytes)
Jan 18 20:06:05 atlantis /kernel: avail memory = 30900224 (30176K bytes)
Jan 18 20:06:05 atlantis /kernel: Probing for devices on the ISA bus:
Jan 18 20:06:05 atlantis /kernel: sc0 at 0x60-0x6f irq 1 on motherboard
Jan 18 20:06:05 atlantis /kernel: sc0: VGA color <16 virtual consoles, flags=0x0>
Jan 18 20:06:05 atlantis /kernel: sio0 at 0x3f8-0x3ff irq 4 on isa
Jan 18 20:06:05 atlantis /kernel: sio0: type 16550A
Jan 18 20:06:05 atlantis /kernel: sio1 at 0x2f8-0x2ff irq 3 on isa
Jan 18 20:06:05 atlantis /kernel: sio1: type 16550A
Jan 18 20:06:06 atlantis /kernel: lpt0 at 0x378-0x37f irq 7 on isa
Jan 18 20:06:06 atlantis /kernel: lpt0: Interrupt-driven port
Jan 18 20:06:06 atlantis /kernel: lp0: TCP/IP capable interface
Jan 18 20:06:06 atlantis /kernel: fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
Jan 18 20:06:06 atlantis /kernel: fdc0: NEC 72065B
Jan 18 20:06:06 atlantis /kernel: fd0: 1.2MB 5.25in
Jan 18 20:06:06 atlantis /kernel: fd1: 1.44MB 3.5in
Jan 18 20:06:06 atlantis /kernel: npx0 on motherboard
Jan 18 20:06:06 atlantis /kernel: npx0: INT 16 interface
Jan 18 20:06:06 atlantis /kernel: sb0 at 0x220 irq 5 drq 1 on isa
Jan 18 20:06:06 atlantis /kernel: sb0: <SoundBlaster 16 4.5>
Jan 18 20:06:06 atlantis /kernel: sbxvi0 at 0x0 drq 5 on isa
Jan 18 20:06:06 atlantis /kernel: sbxvo0: <SoundBlaster 16 4.5>
Jan 18 20:06:06 atlantis /kernel: sbmidi0 at 0x300 on isa
Jan 18 20:06:06 atlantis /kernel:  <SoundBlaster MPU-401>
Jan 18 20:06:06 atlantis /kernel: bio_imask c0000040 tty_imask c003009a net_imask c003009a
Jan 18 20:06:06 atlantis /kernel: Probing for devices on the PCI bus:
Jan 18 20:06:06 atlantis /kernel: chip0 <Intel 82437 (Triton)> rev 2 on pci0:0
Jan 18 20:06:07 atlantis /kernel: chip1 <Intel 82371 (Triton)> rev 2 on pci0:7
Jan 18 20:06:07 atlantis /kernel: vga0 <Display device> rev 0 on pci0:9
Jan 18 20:06:07 atlantis /kernel: ncr0 <ncr 53c810 scsi> rev 1 int a irq 11 on pci0:12
Jan 18 20:06:07 atlantis /kernel: ncr0 waiting for scsi devices to settle
Jan 18 20:06:07 atlantis /kernel: (ncr0:0:0): "Quantum XP32150 81HB" type 0 fixed SCSI 2
Jan 18 20:06:07 atlantis /kernel: sd0(ncr0:0:0): Direct-Access 
Jan 18 20:06:07 atlantis /kernel: sd0(ncr0:0:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.
Jan 18 20:06:07 atlantis /kernel: 2050MB (4199760 512 byte sectors)
Jan 18 20:06:07 atlantis /kernel: (ncr0:1:0): 200ns (5 Mb/sec) offset 8.
Jan 18 20:06:07 atlantis /kernel: (ncr0:1:0): "IBM OEM 0662S12 3 30" type 0 fixed SCSI 2
Jan 18 20:06:07 atlantis /kernel: sd1(ncr0:1:0): Direct-Access 
Jan 18 20:06:07 atlantis /kernel: sd1(ncr0:1:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.
Jan 18 20:06:07 atlantis /kernel: 1003MB (2055035 512 byte sectors)
Jan 18 20:06:07 atlantis /kernel: (ncr0:2:0): "IBM DPES-31080 S31Q" type 0 fixed SCSI 2
Jan 18 20:06:07 atlantis /kernel: sd2(ncr0:2:0): Direct-Access 
Jan 18 20:06:07 atlantis /kernel: sd2(ncr0:2:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.
Jan 18 20:06:07 atlantis /kernel: 1034MB (2118144 512 byte sectors)
Jan 18 20:06:07 atlantis /kernel: (ncr0:5:0): "ARCHIVE VIPER 150  21247 -011" type 1 removable SCSI 1
Jan 18 20:06:07 atlantis /kernel: st0(ncr0:5:0): Sequential-Access st0: Archive  Viper 150 is a known rogue
Jan 18 20:06:07 atlantis /kernel: density code 0x0,  drive empty
Jan 18 20:06:07 atlantis /kernel: (ncr0:6:0): "TOSHIBA CD-ROM XM-3501TA 2694" type 5 removable SCSI 2
Jan 18 20:06:07 atlantis /kernel: cd0(ncr0:6:0): CD-ROM 
Jan 18 20:06:07 atlantis /kernel: cd0(ncr0:6:0): 250ns (4 Mb/sec) offset 8.
Jan 18 20:06:07 atlantis /kernel: cd present.[264427 x 2048 byte records]
Jan 18 20:06:05 atlantis lpd[94]: restarted

Bug description (send-pr is not possible yet because my mail does not work yet):

I discovered this while trying to dump from one IBM disk to the other
IBM disk, like this (details below): dump 0f - (diskname) | gzip -c > (file)
The disk that dump *reads* from has problems with the NCR driver.
Actually there were two different errors, but I think they are related.

Target 0 (Quantum ATLAS) holds /, /usr and swap
Target 1 (IBM OEM, old) is mounted at /home/disk1
Target 2 (IBM OEM, new) is mounted at /home/disk2

With this I got the following error (I will refer to it as ERROR_1)
while trying to dump from target 1 to a file on target 2:

 bash# dump 0f - /home/disk1 | gzip -c > /home/disk2/disk1.dump.gz
  DUMP: Date of this level 0 dump: Fri Jan 19 01:29:56 1996
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping /dev/rsd1s1e (/home/disk1) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 469684 tape blocks.
  DUMP: slave couldn't reopen disk: Device not configured
  DUMP:   DUMP: The ENTIRE dump is aborted.
 bash#

The /var/log/messages contained:
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , retries:2
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , retries:2
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , retries:1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , retries:1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , FAILURE
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , FAILURE
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , retries:2
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , retries:2
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , retries:1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , retries:1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , FAILURE
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , FAILURE
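
Each sense report in the excerpt above spans three syslog lines: the NOT READY line, the explanation, and a trailing ", retries:N" or ", FAILURE" line. As a minimal sketch (not part of the original report, and assuming the messages always arrive in that order), the following Python tallies the outcomes per device. The names `summarize`, `SENSE`, `OUTCOME` and `SAMPLE` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# First line of a sense report, e.g.
#   "... /kernel: sd1(ncr0:1:0): NOT READY asc:4,1"
SENSE = re.compile(r"/kernel: (\w+)\(ncr\d+:\d+:\d+\): NOT READY asc:\d+,\d+")
# Trailing line of a report, e.g.
#   "... /kernel: , retries:2"   or   "... /kernel: , FAILURE"
OUTCOME = re.compile(r"/kernel: , (retries:\d+|FAILURE)\s*$")


def summarize(lines):
    """Count (device, outcome) pairs found in an iterable of syslog lines."""
    counts = Counter()
    device = None  # device named by the most recent NOT READY line
    for line in lines:
        m = SENSE.search(line)
        if m:
            device = m.group(1)
            continue
        m = OUTCOME.search(line)
        if m and device is not None:
            counts[(device, m.group(1))] += 1
            device = None  # wait for the next NOT READY line
    return counts


# Two reports copied from the excerpt above.
SAMPLE = """\
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , retries:2
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0): NOT READY asc:4,1
Jan 19 01:30:02 atlantis /kernel: sd1(ncr0:1:0):  Logical unit is in process of becoming ready
Jan 19 01:30:02 atlantis /kernel: , FAILURE
""".splitlines()

if __name__ == "__main__":
    for (dev, outcome), n in sorted(summarize(SAMPLE).items()):
        print(dev, outcome, n)
```

Feeding it the whole of /var/log/messages (for example `summarize(open("/var/log/messages"))`) would give one count per device and outcome, which makes it easier to see whether only sd1 is affected.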

I experimented a bit and tried to dump from target 1 to a file on
target 0 and from target 1 to the tape:
 bash# dump 0f - /home/disk1 | gzip -c > /var/tmp/disk1.dump.gz
and
 bash# dump 0f /dev/rst0 /home/disk1
Both attempts produced exactly the same error as above.
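
A side note on the pipelines above: a POSIX shell reports only the exit status of the last command in a pipeline, so gzip can exit 0 and leave an output file behind even when dump aborts. The following minimal Python sketch (not part of the original report; `run_pipeline` is a hypothetical helper name) runs a producer and a consumer and returns both exit statuses, so a failed dump is not masked by a successful gzip.

```python
import subprocess
import sys


def run_pipeline(producer, consumer, out_path):
    """Run `producer | consumer > out_path` and return (producer_rc, consumer_rc).

    Both arguments are argument lists as accepted by subprocess.Popen.
    """
    with open(out_path, "wb") as out:
        p1 = subprocess.Popen(producer, stdout=subprocess.PIPE)
        p2 = subprocess.Popen(consumer, stdin=p1.stdout, stdout=out)
        # Close our copy of the pipe so the producer gets SIGPIPE
        # if the consumer exits early.
        p1.stdout.close()
        rc2 = p2.wait()
        rc1 = p1.wait()
    return rc1, rc2


# Example call mirroring the first command above (requires dump(8) and
# read access to the raw device, so it is shown only as a comment):
#
#   rc_dump, rc_gzip = run_pipeline(
#       ["dump", "0f", "-", "/home/disk1"],
#       ["gzip", "-c"],
#       "/var/tmp/disk1.dump.gz",
#   )
#   if rc_dump != 0 or rc_gzip != 0:
#       sys.exit("dump pipeline failed; discard the output file")
```

Any nonzero first status means the compressed file should not be trusted, even if it exists and gzip reported success.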

I then tried to dump from *target 2*, and here the error was different;
I will refer to it as ERROR_2.
Dumping from target 2 to a file on target 1:

 bash# dump 0f - /home/disk2 | gzip -c > /home/disk1/disk2.dump.gz
  DUMP: Date of this level 0 dump: Fri Jan 19 01:32:56 1996
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping /dev/rsd2e (/home/disk2) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 595453 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
^C  DUMP: Interrupt received.
  DUMP: Do you want to abort dump?: ("yes" or "no")   DUMP: Broken pipe
  DUMP: The ENTIRE dump is aborted.

(The dump seemed to work, but I did not trust it and aborted).

/var/log/messages contained:
Jan 19 01:17:02 atlantis /kernel: assertion "cp" failed: file "../../pci/ncr.c", line 5560
Jan 19 01:17:02 atlantis /kernel: assertion "cp" failed: file "../../pci/ncr.c", line 5560
Jan 19 01:17:02 atlantis /kernel: sd2(ncr0:2:0): COMMAND FAILED (4 28) @f0a2ce00.

I tried to dump target 2 to target 0 and to tape:
 bash# dump 0f - /home/disk2 | gzip -c > /var/tmp/disk2.dump.gz
 bash# dump 0f /dev/rst0 /home/disk2
and ERROR_2 occurred in both cases too.

Dumping *from* target 0 to all other targets works (except to the
CDROM and the NCR810 of course :-)).

So much for the description of what happened.

Considerations and further experimenting:

1) The SCSI bus, and termination:
 The controller and the CDROM are the last devices on the cable and are
 both terminated properly. The controller supplies terminator power.
 There is only one cable, inside the PC, and it is under 100 cm long.
 The machine worked with NetBSD 1.0A and MSDOS (Windows) without any
 problems.
 So I think this isn't a hardware-related problem.

2) Both IBM disks seem to trigger the problem:
 The first IBM disk (target 1) is about one year old. It never had problems,
 except that with an early version of the NCR driver under NetBSD 0.9 I had
 problems with the tags. This showed up as a "disk not ready" during
 savecore (without a core dump) while booting. I think this is very similar
 to ERROR_1. The problem disappeared with the next driver version, but I
 had found a workaround for it: disable the tags for the IBM disk.
 I tried this for the two bugs ERROR_1 and ERROR_2: with the tags
 disabled for all disks, both bugs disappeared. With the tags disabled only
 for the disk which I try to dump *from*, the bug disappears too.
 So could the disks be buggy? I think not, because the second disk
 is only about 2 weeks old and looks completely different. I think IBM
 handles tags in a way the driver does not like.

Playing around with ncrcontrol showed some strange things too, but I have
not had the time to look into it further:
ncrcontrol shows the SCSI devices, and for all three disks there are
4 tags. The CDROM is SCSI-2 and has no tags, the tape is SCSI-1.

Using ncrcontrol .. -t 1 -s tags=1 solved the problem for target 1, but
running ncrcontrol afterwards still showed 4 tags for target 1. The list
entry did not change, but the driver seems to have picked up the setting.

The datasheets for the disks mention jumpers to disable SCSI-ATTENTION after
a SCSI bus reset. For target 1 (only) there is a jumper to disable
active (target initiated) sync negotiation. Could one of these help?

That's all I know so far.
Thank you for any help and hints; completely disabling tags hurts
performance and I would like to find a better solution.

Helmut Wirth
 

So far the description of the bugs







