Date:      Sun, 1 Jun 2003 17:13:45 -0700 (PDT)
From:      Matthew Jacob <mjacob@feral.com>
To:        Kern Sibbald <kern@sibbald.com>
Cc:        freebsd-scsi@freebsd.org
Subject:   Re: SCSI tape data loss
Message-ID:  <20030601163730.T97138@beppo>
In-Reply-To: <20030601124620.S18592@root.org>
References:  <20030601124620.S18592@root.org>

Hello, I'm the author of the SA driver. This specific case is something
I have indeed tried to handle correctly, but I could have missed something.
In particular, I've been wary of devices in fixed block mode.

The executive summary: I need more info. I need to know:

	a) Was the tape device in fixed or variable block mode?

	b) you claim to have lost blocks 1555..1567, and that
	1568 was the signifier to change tapes. Are these tape
	blocks reflective of single 'write' requests? Or are
	these multiple tape records issued in one write?

	c) What was the signifier you got that indicated that it
	was time to change tapes (viz block 1568)? -1 and an errno
	set? Or a residual indicating that some of the data you
	had asked to write had not been written?


	d) Other general info about whether you were indeed using
	the 'no-rewind' device, whether you'd changed the default
	EOT model (from 'dual filemark' to 'single filemark'- you
	*have* read the man pages, yes? :-))
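
For question (c), the cases can be told apart from the return value of
write(2) and errno. A minimal sketch of that distinction, assuming a plain
POSIX write() on the tape device; the enum and function names are
hypothetical and are not part of the sa driver or of any library:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical labels for the outcomes question (c) asks about. */
enum write_outcome {
	WR_COMPLETE,	/* every requested byte was accepted */
	WR_SHORT,	/* residual: only part of the request was accepted */
	WR_NOSPC,	/* write() returned -1 with errno == ENOSPC */
	WR_ERROR	/* write() returned -1 with some other errno */
};

/*
 * Classify one write() call.  'ret' is the value write() returned,
 * 'requested' is the byte count that was passed in, and 'err' is the
 * value of errno saved immediately after the call.
 */
static enum write_outcome
classify_write(ssize_t ret, size_t requested, int err)
{
	if (ret < 0)
		return (err == ENOSPC) ? WR_NOSPC : WR_ERROR;
	if ((size_t)ret == requested)
		return WR_COMPLETE;
	return WR_SHORT;
}

/* Bytes the caller asked to write that were not reported as written. */
static size_t
write_residual(ssize_t ret, size_t requested)
{
	if (ret < 0)
		return requested;
	return requested - (size_t)ret;
}
```

Logging the outcome and the residual for every write near the end of the
tape would answer (c) directly.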



There is one case I'm also worried about. This is from sa.c:saerror:

        if (csio->cdb_io.cdb_bytes[0] == SA_WRITE) {
                if (sense_key == SSD_KEY_VOLUME_OVERFLOW) {
                        csio->resid = resid;
                        error = ENOSPC;
                } else if (sense->flags & SSD_EOM) {
                        softc->flags |= SA_FLAG_EOM_PENDING;
                        /*
                         * Grotesque as it seems, the few times
                         * I've actually seen a non-zero resid,
                         * the tape drive actually lied and had
                         * written all the data!
                         */
                        csio->resid = 0;
                }

This is saying: if we were writing, and we got SSD_KEY_VOLUME_OVERFLOW,
we're at hard EOT- we have to assume we didn't write *any* data
for this last operation, and we return an errno.

Otherwise, if early warning was spotted, mark EOM pending, but *don't*
believe the residual field.

Every tape drive I'd tested with (and this was around 7 or 8) had,
when presenting a non-zero residual, lied about what it had actually
put on the tape.

What I'm obviously worried about here is whether or not your tape drive
was correct in reporting a residual. This would indeed fit your data.
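
To make that worry concrete, here is a small user-space model of the two
branches quoted above. It is a sketch, not driver code: the MODEL_* names,
the struct, and both functions are hypothetical stand-ins, and the constant
values are placeholders chosen for the model only.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder stand-ins for the sense key, sense flag, and softc flag. */
#define MODEL_KEY_VOLUME_OVERFLOW	0x0d
#define MODEL_EOM			0x40
#define MODEL_FLAG_EOM_PENDING		0x01

/* What the caller ends up seeing after a write hits end of medium. */
struct model_result {
	uint32_t resid;	/* residual passed back up to the caller */
	int error;	/* errno value, 0 if none */
	int flags;	/* MODEL_FLAG_EOM_PENDING if early warning was seen */
};

/* Mirror of the quoted logic for a WRITE command. */
static struct model_result
model_write_sense(int sense_key, int sense_flags, uint32_t reported_resid)
{
	struct model_result r = { 0, 0, 0 };

	if (sense_key == MODEL_KEY_VOLUME_OVERFLOW) {
		/* Hard EOT: believe the residual and return an error. */
		r.resid = reported_resid;
		r.error = ENOSPC;
	} else if (sense_flags & MODEL_EOM) {
		/* Early warning: mark EOM pending, discard the residual. */
		r.flags |= MODEL_FLAG_EOM_PENDING;
		r.resid = 0;
	}
	return r;
}

/*
 * Bytes that would be missing from tape without the caller being told,
 * IF the drive's reported residual was in fact truthful.
 */
static uint32_t
model_unreported_loss(struct model_result r, uint32_t reported_resid)
{
	if (r.flags & MODEL_FLAG_EOM_PENDING)
		return reported_resid - r.resid;
	return 0;
}
```

In this model a drive that truthfully reports a 4096-byte residual at early
warning produces a "successful" write with 4096 bytes unaccounted for, which
is the failure pattern the missing blocks would match.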

I'm pretty sure I also tested my EOT test program with an Archive
autoloader- but I don't remember for sure.



Other points:

> However, more recently Dan Langille did some extensive
> testing writing a 6GB file to six tapes. This brought
> out additional problems of the driver "freezing" the tape,

If the tape is 'freezing' it means that the tape position was lost.
Under what circumstances did this occur?


-matt


