Date:      Tue, 20 Jun 1995 18:21:18 +0200
From:      esser@zpr.uni-koeln.de (Stefan Esser)
To:        Jeff Aitken <jaitken@cslab.cs.vt.edu>
Cc:        hackers@freebsd.org
Subject:   Re: NCR810 problem?
Message-ID:  <199506201621.AA03594@FileServ1.MI.Uni-Koeln.DE>
In-Reply-To: Jeff Aitken <jaitken@cslab.cs.vt.edu> "Re: NCR810 problem?" (Jun 20, 11:36)

On Jun 20, 11:36, Jeff Aitken wrote:
} Subject: Re: NCR810 problem?
} > On Jun 19, 16:22, Satoshi Asami wrote:
} > } Subject: Re: NCR810 problem?
} > }  * > ncr0 targ0?: ERROR (80:100) (e-ab-2) (8/13) @ (10d4:e000000).

This is a timeout waiting for the drive to
accept an identify message, I guess. The
NCR seems to be in a "sane" state, and the
command and SCSI control lines as driven
by the NCR are consistent.

The drive didn't respond within 0.1 second,
leading to a command abort.

What BIOS is on your motherboard, and what
version of the SDMS software?

} > }  * >              reg: da 10 0 13 47 8 0 1f 0 e 80 ab 80 0 3 0.
} > }  * > ncr0: restart (fatal error).
} > }  * > ncr0: reset by timeout.

This should re-initialise the SCSI bus
and all its devices. However, all commands
on all drives are aborted, and the
consequences will be fatal if they
aren't retried by the generic SCSI code.
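
As a rough illustration of why the retry matters (this is
not the ncr driver or the generic SCSI code; the function
name, the dictionary fields and the retry limit are all
hypothetical), a reset can be modelled as aborting every
in-flight command and putting each one back on the queue
unless it has run out of retries:

```python
from collections import deque


def requeue_after_reset(in_flight, queue, max_retries=3):
    """Model a bus reset: every in-flight command is aborted.

    Commands with retries left go back to the FRONT of `queue`
    in their original order; the rest are returned as failed.
    All names and the retry limit are illustrative assumptions.
    """
    failed = []
    # Walk backwards so that appendleft() preserves the original order.
    for cmd in reversed(in_flight):
        if cmd["retries"] < max_retries:
            cmd["retries"] += 1
            queue.appendleft(cmd)
        else:
            failed.append(cmd)
    in_flight.clear()
    failed.reverse()
    return failed
```

Any command that ends up in the returned list is lost to
its caller, which is the fatal case described above.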

} > That's funny, since there were no changes to the code 
} > over quite some time. The patches I wanted to be put
} > into 2.0.5 don't seem to have been included, and so there
} > must have been some other change of parameters ...
} 
} I have an NCR 53c810 controller, and I see the above mentioned error (or
} something that looks just like it) all too often these days.  I
} installed 2.0.5R the day it was released, and see the problem only
} when rebooting.  It occurs just after the filesystems are
} checked/mounted.  However, something I've discovered only lately, is
} that if, after I see the error, I turn the machine off then boot into
} single-user mode, do the fsck myself, then just exit, things work fine.

Could you please send the /var/log/messages lines
covering the complete boot messages up to the point
where the error is reported?
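
One possible way to cut that excerpt out of the log is
sketched below. It is only a sketch: the boot banner text
("Copyright (c)") is an assumption about what the kernel
prints at start-up, so adjust boot_marker to whatever your
log really shows; the error marker is taken from the ncr0
line quoted above.

```python
def boot_excerpt(lines, boot_marker="Copyright (c)", error_marker="ERROR ("):
    """Return the lines from the boot banner up to the first ncr0 error.

    boot_marker is an ASSUMED banner string; error_marker matches
    the 'ncr0 targ0?: ERROR (80:100) ...' line from the report.
    Returns an empty list if no ncr0 error line is found.
    """
    # Locate the first ncr0 error line in the log.
    err = None
    for i, line in enumerate(lines):
        if "ncr0" in line and error_marker in line:
            err = i
            break
    if err is None:
        return []
    # Walk back to the most recent boot banner before that error.
    start = 0
    for i in range(err, -1, -1):
        if boot_marker in lines[i]:
            start = i
            break
    return lines[start:err + 1]


if __name__ == "__main__":
    with open("/var/log/messages") as log:
        print("".join(boot_excerpt(log.readlines())), end="")
```

The register dump and the restart/reset lines that follow
the error are not included by this sketch, so please add
those by hand.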

What kind of drives do you use?

Would booting straight to multi-user after the
machine was turned off work, or would it hang?

Is it possible that the drive needs some more
time to become ready, and that by first booting
to single user, the fsck is just delayed enough
for the drive to become usable?
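
To make the hypothesis concrete: if the drive is simply
slow to come ready, then anything that polls it a few times
with a pause in between would succeed where a single
attempt times out. The following is a minimal sketch of
that idea only; probe, attempts and delay are hypothetical
names, and nothing here is what the driver or fsck really
does.

```python
import time


def wait_until_ready(probe, attempts=10, delay=1.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, pausing `delay` seconds between tries.

    probe is a hypothetical callable that returns True once the
    drive answers. Returns True on the first success, else False.
    """
    for attempt in range(attempts):
        if probe():
            return True
        # No need to pause after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Booting to single user and running fsck by hand would then
play the role of the pauses.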

} I don't always see this error on reboot, though.  Sometimes the machine
} boots normally.  Other times, it gets that damned error.  I also had a
} lock up in the middle of a disk-intensive activity last night, with no
} hint as to why.  What I don't understand, and would like to help Stefan
} with, is determining what is triggering these problems.  What can I do
} to try and narrow down the causes of the problem?

I guess no syslog messages were written
to disk because of that lock-up?
Were there any console messages?

} I might mention that I had a similar problem under 2.0R.  Every time I
} rebooted the machine "properly" (ie, shutdown -r or whatever, so that
} the disks got unmounted) the machine would not boot because of an NCR
} error right after the filesystems were mounted.  But if I just shut the
} power off, then rebooted, it would always work (it had to clean the
} filesystems).  

I have never observed that kind of behaviour.
Do you by chance have the error in some
old log file, or do you remember it?

Could you try forcing the drives to
asynchronous transfers:

# ncrcontrol -sasync

before the reboot?


Maybe I can form a picture of what's
going on from this information ...

Regards, Stefan

-- 
 Stefan Esser				Internet:	<se@ZPR.Uni-Koeln.DE>
 Zentrum fuer Paralleles Rechnen	Tel:		+49 221 4706017
 Universitaet zu Koeln			FAX:		+49 221 4705160
 Weyertal 80
 50931 Koeln


