Date:      Wed, 10 Mar 1999 10:23:42 -0800 (PST)
From:      Matthew Jacob <mjacob@feral.com>
To:        Tom Torrance at home <tom@tomqnx.com>
Cc:        chris@shenton.org, scsi@freebsd.org
Subject:   Re: 3.1-STABLE: nrsa0 T4000 doesn't honor "no rewind"? SCSI errs in logs
Message-ID:  <Pine.LNX.4.04.9903101001010.23447-100000@feral-gw>
In-Reply-To: <m10KZfr-000I4EC@TomQNX.tomqnx.com>

Okay. A long response. I won't respond point to point. It's also an
inaccurate view of many items, but that's not germane.

The summary I'll take from this is that the eject is wrong. I'll agree with
that- that was probably the wrong thing to do. You're further asserting
that any other state action that the driver attempts to take that will
protect the tape from being overwritten in the wrong place is also an
incorrect action.

Well- I'm not so sure. I can't claim 39 years in this business- only 20.
But I will assert that the applications we're talking about here are
ill-conceived at best. Yes, it's a problem for applications that try and
run on multiple platforms, but any application that receives an error
indication *and does not use the information at its disposal to assure
data integrity or take other steps to ensure data integrity* is a bad joke.
I don't give a rat's ass whether it's the popular backup program of choice
for the masses- it's just plain wrong.

The only question here is whether the driver should try and shield idiot
applications from doing something bad to existing data. The steps I took
previously are wrong- but mostly only because I screwed up and didn't
handle the case of tapes that don't eject.

This isn't entirely to do with EOM conditions. This also has to do with
any catastrophic error. Let's say a SCSI Bus reset resets the tape drive.
Let's say somebody manually ejects the tape and replaces it with another.
Either event means that neither the tape identity nor the tape position
is known- yet writing can continue, and if it does so blindly, data
destruction will occur.

There are two behaviours I should choose from at this juncture:

	1 Receive an error indicating that tape position has been lost.
	  Propagate it, mapped to EIO, back to the user application.
	  Nothing else.

	2 Receive an error that indicates that tape position has been
	  lost- and no, user applications aren't the owners of this
	  information- and require specific programmatic intervention
	  before writing can resume. Reading we don't really care about.

	  The step I took before wasn't such a hot idea. Requiring a
	  user application to then make the position and state of the
	  tape known via either a rewind/eject or a space to EOD is
	  sufficient. But a pain to implement.
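
To make the difference between the two concrete, here is a rough sketch as
a small user-space C model. It is an illustration only: every name in it
(tape_state, tape_position_lost, and so on) is hypothetical and none of it
is taken from the actual driver. It assumes that the failing operation
returns EIO under both behaviours, and that under #2 later writes also
fail with EIO until position is re-established.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two behaviours; not real driver code. */
enum policy {
	POLICY_PROPAGATE_ONLY = 1,	/* behaviour 1: report EIO, nothing else */
	POLICY_LATCH_WRITES = 2		/* behaviour 2: block writes until repositioned */
};

struct tape_state {
	enum policy policy;
	bool position_lost;	/* only ever set under behaviour 2 */
};

/*
 * Called when something invalidates the tape position (bus reset,
 * media change, ...). Both behaviours hand EIO back for the failing
 * operation; only behaviour 2 remembers that it happened.
 */
int
tape_position_lost(struct tape_state *ts)
{
	assert(ts != NULL);
	if (ts->policy == POLICY_LATCH_WRITES)
		ts->position_lost = true;
	return (EIO);
}

/* Writes are refused (assumed EIO) while the position is unknown. */
int
tape_write_check(const struct tape_state *ts)
{
	assert(ts != NULL);
	return (ts->position_lost ? EIO : 0);
}

/* Reads are never gated: "Reading we don't really care about." */
int
tape_read_check(const struct tape_state *ts)
{
	assert(ts != NULL);
	return (0);
}

/* A rewind/eject or a space to EOD makes the position known again. */
void
tape_position_known(struct tape_state *ts)
{
	assert(ts != NULL);
	ts->position_lost = false;
}
```

Under behaviour 1 the flag is never set, so tape_write_check() always
returns 0 and it is entirely up to the application to stop writing. Under
behaviour 2 the application has to rewind, eject or space to EOD, modelled
here by tape_position_known(), before a write will be accepted again.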

I want to get a sense of what people would like to see in this case. From
your mail, I'm assuming choice #1. It'd be easier to implement- that's for
sure (just remove any checks).

Taking the point of view of an application writer (I worked at Legato- the
mmd code I worked on supported 30 different platforms- a large fraction of
which *weren't* Unix) I'd also probably pick #1- because I would be
writing an application that doesn't want the tape driver to get in the way
and *I'll* manage data integrity. But in general, Unix user applications
that do backups don't manage data integrity, or manage it poorly. So, I'm
a little unsure as to the right choice- that's why I asked for opinions,
and I hope to see more of them than yours, Tom. But thanks for it all the
same- it was quite informative.

-matt











