From owner-freebsd-scsi@FreeBSD.ORG  Tue Jun  3 09:41:09 2003
Return-Path: <owner-freebsd-scsi@FreeBSD.ORG>
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 1537837B401
	for <freebsd-scsi@freebsd.org>; Tue,  3 Jun 2003 09:41:09 -0700 (PDT)
Received: from matou.sibbald.com (matou.sibbald.com [195.202.201.48])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 5E5E343F85
	for <freebsd-scsi@freebsd.org>; Tue,  3 Jun 2003 09:41:07 -0700 (PDT)
	(envelope-from kern@sibbald.com)
Received: from [192.168.68.112] (rufus [192.168.68.112])
	by matou.sibbald.com (8.11.6/8.11.6) with ESMTP id h53GeWv10269;
	Tue, 3 Jun 2003 18:40:32 +0200
From: Kern Sibbald <kern@sibbald.com>
To: "Justin T. Gibbs" <gibbs@scsiguy.com>
In-Reply-To: <882210000.1054657530@aslan.btc.adaptec.com>
References: <3EDB31AB.16420.C8964B7D@localhost>
	<3EDB59A4.27599.C93270FB@localhost> <20030602110836.H71034@beppo>
	<20030602131225.F71034@beppo>
	<1054645616.13630.161.camel@rufus>  <20030603072944.U44880@beppo>
	<1054652678.13630.209.camel@rufus>
	<882210000.1054657530@aslan.btc.adaptec.com>
Content-Type: text/plain
Organization: 
Message-Id: <1054658432.13630.252.camel@rufus>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.4 
Date: 03 Jun 2003 18:40:32 +0200
Content-Transfer-Encoding: 7bit
cc: freebsd-scsi@freebsd.org
cc: mjacob@feral.com
Subject: Re: SCSI tape data loss

Yes, I probably should move the clrerror() call and the
check/set of errno inside the check for "stat == -1".
However, the code, though odd, is correct, since
I do not use errno unless the status is -1.
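
A minimal sketch of the restructuring described above, in which errno is
looked at only inside the "stat == -1" branch. The helper name
write_block() and its bool return are hypothetical and are not taken from
the Bacula source. Treating a short write as end of medium is likewise an
assumption made for illustration.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper, not Bacula's actual code: writes one block and
// consults errno only when write() itself returned -1.
// Returns true if the whole block was written.
bool write_block(int fd, const void *buf, size_t wlen)
{
    ssize_t stat = write(fd, buf, wlen);
    if (stat == -1) {
        // errno is meaningful only on this branch.
        if (errno == 0) {
            errno = ENOSPC;   // fallback mentioned in the quoted message
        }
        return false;
    }
    if (static_cast<size_t>(stat) != wlen) {
        // Short write: write() succeeded, so it did not set errno.
        // Assumption: report it as end of medium.
        errno = ENOSPC;
        return false;
    }
    return true;
}
```

With this shape a stale errno from an earlier call cannot be mistaken for
the result of the write, because it is never read on the success path.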

Our most recent tests are even more interesting.
We are getting the same data loss any time
Bacula switches tapes.  This means the data loss
does not have anything in particular to do with
the LEOM or PEOM status.

By the way, the funny casting is mandatory in C++,
because the ssize_t returned by write() is not the
same type as the size_t byte count passed to it.
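
To make the signed/unsigned point concrete, here is a small sketch of one
way to compare the two types without a single combined cast. The helper
names wrote_all() and as_u32() are hypothetical. Compilers typically warn
about a direct ssize_t versus size_t comparison, which is the situation the
cast works around.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/types.h>

// Hypothetical helper: true only if write() reported exactly wlen bytes.
bool wrote_all(ssize_t stat, size_t wlen)
{
    if (stat < 0) {
        return false;   // error return, never equal to a byte count
    }
    return static_cast<size_t>(stat) == wlen;
}

// What the single-cast form in the quoted code relies on: -1 converts to
// the largest uint32_t value, which cannot match a realistic block length.
uint32_t as_u32(ssize_t stat)
{
    return static_cast<uint32_t>(stat);
}
```

Handling the negative case first keeps the error return and the byte count
separate, so the later cast only ever sees a non-negative value.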

More after I look at the most recent test results.

Best regards,

Kern

On Tue, 2003-06-03 at 18:25, Justin T. Gibbs wrote:
> > What is clear from the output is that the write()
> > is returning a -1 status. errno could possibly be 0,
> > in which case I set it to ENOSPC, if it is not 0
> > then it is ENOSPC judging by the error message that
> > is printed "Write error on device ...".
> > 
> > You may want to see more, but here is the basic code
> > that does the write:
> >    if ((uint32_t)(stat=write(dev->fd, block->buf, (size_t)wlen)) != wlen) {
> >       /* We should check for errno == ENOSPC, BUT many 
> >        * devices simply report EIO when it is full.
> >        * with a little more thought we may be able to check
> >        * capacity and distinguish real errors and EOT
> >        * conditions.  In any case, we probably want to
> >        * simulate an End of Medium.
> >        */
> >       clrerror_dev(dev, -1);
> 
> Apart from the funny casting, the only obvious bug is that you
> are expecting errno to be set on every syscall.  Errno is only
> valid if stat == -1 or you explicitly clear it prior to the
> syscall (or after the last time it was set).  You don't seem
> to be doing that here.
> 
> See the errno man page for details
> 
> --
> Justin