From: Matthew Jacob
To: Kern Sibbald
cc: freebsd-scsi@freebsd.org
Subject: Re: SCSI tape data loss
Date: Tue, 3 Jun 2003 09:03:36 -0700 (PDT)
Message-ID: <20030603084701.U24586@wonky.in0.lcl>
In-Reply-To: <1054653106.13606.217.camel@rufus>
Reply-To: mjacob@feral.com

>
> This is exactly what it does. *Every* time the requested write
> size does not agree with the returned value, Bacula gives
> up on the tape. My last email has the code that does that.
>
> My email above was not very clear because I was telling you what
> happened in the particular case of loss of data (the -1 and errno=0
> or errno=ENOSPC I don't know which).
> As noted here, Bacula *will*
> stop writing if the driver returns a short block (assuming my
> code isn't broken), but I have never seen that case on FreeBSD.

That's really weird. I have to look at this closer. I've had some drives
not report LEOT at all, but since tape_pattern_tester didn't complain on
the same drive you were using, I know tape_pattern_tester is in fact
stopping at LEOT.

write(2) isn't necessarily returning -1. It may be returning 0, which
means that no data moved. I think the ENOSPC you report is a red herring
because you're setting this value, unless you actually *did* see -1
returned from write(2) and ENOSPC set in errno.

In any case, even if you hit PEOT instead of LEOT, you shouldn't *lose*
data. If you hit PEOT, we have to return -1/ENOSPC. Because this is Unix
or Linux or Solaris instead of a reasonable and modern OS, like RSX, VMS
or NT, which allow you to give realistic details about failures in I/O
requests, this means you have no way of telling the user application how
much was *actually* written when you hit *PEOT* (not LEOT, note!). As far
as the user application is concerned, *no* data was written at all for
this last write. But there may in fact be data on the tape media.

What is particularly annoying in the PEOT case is that your application
probably asked for the next tape and rewrote all the blocks from the
failed write. This is fine, but you then have to make damned sure, when
rereading the data later, that you can handle duplicate blocks, because
you may read blocks NOPQR on tapeA and then switch to tapeB and read
blocks OPQR again on tapeB.

I don't think this is your problem here, but I thought I'd have a
pre-coffee diatribe about it. Grump.

>
> > Ignoring the short write and waiting until you hit ENOSPC guarantees
> > you will hit PEOM, since the LEOM is only reported once. The tape
> > driver expects that you know what you are doing if you go on writing.
>
> The only additional writing Bacula does (unless I am missing something)
> is the two EOF marks.

This is one of the things that's bothering me. You shouldn't be writing
extra marks if you actually close the device.

I'd like to look over all the current Bacula source, but sourceforge is
offline at the moment.

-matt
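
To make the write(2) cases and the duplicate-block problem discussed above
concrete, here is a minimal C sketch. It is not code from Bacula or from
the tape driver. The enum, both function names, and the per-block sequence
number are hypothetical, and the sketch assumes every block the application
writes carries a strictly increasing sequence number starting at 1.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical outcome names, invented for this sketch. */
enum tape_write_outcome {
    TW_OK,          /* the whole block was written */
    TW_EOT_SHORT,   /* short write: stop writing to this tape */
    TW_EOT_NO_DATA, /* write(2) returned 0: no data moved */
    TW_EOT_NOSPC,   /* -1/ENOSPC: amount actually on tape is unknown */
    TW_ERROR        /* any other failure */
};

/*
 * Classify the result of one tape write(2).
 * ret is the return value, saved_errno is errno captured right after
 * the call, requested is the byte count that was asked for.
 */
static enum tape_write_outcome
classify_tape_write(ssize_t ret, int saved_errno, size_t requested)
{
    assert(requested > 0); /* a zero-length tape write is not meaningful here */

    if (ret < 0)
        return saved_errno == ENOSPC ? TW_EOT_NOSPC : TW_ERROR;
    if (ret == 0)
        return TW_EOT_NO_DATA;
    if ((size_t)ret < requested)
        return TW_EOT_SHORT;
    return TW_OK;
}

/*
 * Reader side: after TW_EOT_NOSPC the writer may have rewritten blocks on
 * the next tape that also made it onto the previous one. Accept a block
 * only if its (hypothetical) sequence number is newer than the last one
 * delivered. Returns 1 to keep the block, 0 to drop it as a duplicate.
 */
static int
accept_block(unsigned long *last_seq, unsigned long seq)
{
    if (seq <= *last_seq)
        return 0;
    *last_seq = seq;
    return 1;
}
```

If blocks N..R are numbered 14..18 on tapeA and O..R are repeated on tapeB
ahead of block 19, accept_block() delivers each block once and drops the
four repeats. The writer only needs to treat every outcome other than TW_OK
as a reason to stop writing to the current tape.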