Date:      Mon, 2 Jun 2003 15:45:12 -0700 (PDT)
From:      Matthew Jacob <mjacob@feral.com>
To:        Kern Sibbald <kern@sibbald.com>
Cc:        freebsd-scsi@freebsd.org
Subject:   Re: SCSI tape data loss
Message-ID:  <20030602154021.T71034@beppo>
In-Reply-To: <1054593075.13606.28.camel@rufus>
References:  <3EDB31AB.16420.C8964B7D@localhost> <3EDB59A4.27599.C93270FB@localhost> <577540000.1054579840@aslan.btc.adaptec.com> <20030602131225.F71034@beppo>  <1054590119.13606.8.camel@rufus>  <20030602145421.D71034@beppo> <1054593075.13606.28.camel@rufus>



On Mon, 3 Jun 2003, Kern Sibbald wrote:

> On Mon, 2003-06-02 at 23:55, Matthew Jacob wrote:
> > > I suspect that the problem is something very simple such as
> > > the drive buffering data then hitting the physical EOM and
> > > of course any buffered data goes down the bit bucket.
> >
> > A question to ask then is why tape_pattern_tester stopped at LEOT but
> > Bacula didn't and kept going to PEOT.
> >
> > -matt
>
> This was just a thought, because you or Justin said that
> the driver does not fail writes at the LEOF, which means
> that unless you are doing something special in your
> tpt, it is not stopping at the LEOF.

Yes, it does provide a signifier. At the end of an operation that has
a check condition indicating early warning:

                } else if (sense->flags & SSD_EOM) {
                        softc->flags |= SA_FLAG_EOM_PENDING;

and

        SA_FLAG_ERR_PENDING     = (SA_FLAG_EOM_PENDING|SA_FLAG_EIO_PENDING|
                                   SA_FLAG_EOF_PENDING),

and at the start of an I/O:

                } else if ((softc->flags & SA_FLAG_ERR_PENDING) != 0) {
                        ....
                        bp->b_resid = bp->b_bcount;
                        ...
                        if ((softc->flags & SA_FLAG_EOM_PENDING) != 0) {
                                /*
                                 * We now just clear errors in this case
                                 * and let the residual be the notifier.
                                 */
                                bp->b_error = 0;

The signifier back to the user application is a write that returns
less than the requested amount, with no error set.
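
A minimal sketch of how an application might check for that signifier,
assuming it writes fixed-size records with write(2). The names
tape_write_status, classify_tape_write and write_tape_record are
hypothetical and come from neither the sa driver nor Bacula. Since the
excerpt above sets the residual to the full byte count, the short write
may well report zero bytes, so the sketch treats any count below the
request as the early-warning case:

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for one tape record write (illustrative only). */
enum tape_write_status {
    TAPE_WRITE_OK,      /* the whole record was written */
    TAPE_WRITE_EOM,     /* short write, no error: treat as early warning */
    TAPE_WRITE_ERROR    /* write(2) returned -1; errno holds the reason */
};

/* Classify the return value of write(2) against the requested length. */
enum tape_write_status
classify_tape_write(ssize_t written, size_t requested)
{
    if (written < 0)
        return TAPE_WRITE_ERROR;
    if ((size_t)written < requested)
        return TAPE_WRITE_EOM;   /* residual > 0 with no error reported */
    return TAPE_WRITE_OK;
}

/* Write one record to the descriptor fd and classify the outcome. */
enum tape_write_status
write_tape_record(int fd, const void *buf, size_t len)
{
    return classify_tape_write(write(fd, buf, len), len);
}
```

A caller that gets TAPE_WRITE_EOM would stop writing data to this tape
and close out the volume instead of continuing toward PEOT.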


>
> One thought that I had is: the fact that Bacula backs
> up at the EOM to re-read the last record could cause
> some problems.  I've asked Dan if he will re-run the
> Bacula backup/restore test but with the re-read disabled.
> As someone said, this will give one more data point.

Yes.


>
> Another interesting test would be to see if the same
> data loss occurs in a situation where a tape size is
> specified such that Bacula stops writing before the
> EOM on the first tape.

That too.

-matt


