From: Matthew Jacob
Date: Wed, 16 Jun 2010 16:31:50 -0700
To: freebsd-scsi@freebsd.org
Subject: Re: sa: write returns 0 = LEOM?
Message-ID: <4C195EE6.1050207@feral.com>

On 6/16/2010 3:52 PM, Dustin J. Mitchell wrote:
> I'm investigating a user bug report in Amanda:
>   http://forums.zmanda.com/showthread.php?t=2832
>
> The problem boils down to a write(2) call for a SCSI tape device
> (/dev/nsa0) returning 0 after quite a bit of data and a number of
> filemarks have been written.  Jean-Louis suspected that this was an
> early warning EOM indication, and that a subsequent write() would
> succeed, with Amanda having been duly warned that a physical EOM is
> coming up.

That is, I believe, a specific feature of Solaris (EOM detection triggers a
zero write, but allows for trailer records). I seem to recall helping
architect this back in 1996.

> But looking at scsi_sa.c, this doesn't seem to be the
> case.  It looks like an early warning would result in a successful
> write instead, because resid is set to zero.
>
> cam/scsi/scsi_sa.c:
>         /*
>          * Handle filemark, end of tape, mismatched record sizes....
>          * From this point out, we're only handling read/write cases.
>          * Handle writes && reads differently.
>          */
>
>         if (csio->cdb_io.cdb_bytes[0] == SA_WRITE) {
>                 if (sense_key == SSD_KEY_VOLUME_OVERFLOW) {
>                         csio->resid = resid;
>                         error = ENOSPC;
>                 } else if (sense->flags & SSD_EOM) {
>                         softc->flags |= SA_FLAG_EOM_PENDING;
>                         /*
>                          * Grotesque as it seems, the few times
>                          * I've actually seen a non-zero resid,
>                          * the tape drive actually lied and had
>                          * written all the data!.
>                          */
>                         csio->resid = 0;
>                 }
>

Yes, I remember this code. I remember from doing test readbacks that the
reported residual was in fact incorrect: the data had actually been
written. But this was really a long while back (at least 8 years ago).

> That said, I don't know my way around the kernel source, so I'm
> probably missing something obvious.  So:
>
> 1. What could cause a write syscall to return 0?
>

I'll try and look into this. Do you happen to know whether the device you
experienced this on was set to fixed-block or variable-block mode?

> 2. Since we will be using early warning in the next version of Amanda,
> hints as to the best way to handle early warning from userspace would
> be appreciated.
>

Urrr.... I used to have opinions about this. Now I'm not so sure.
Expecting consistent behaviour from platform to platform is tough.
Can't you write until you get a hard failure, back up one record (which, of course, you've hung onto), write a trailer label and then ask for a new tape?
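
Roughly what I have in mind, as a sketch only. The tape_ops wrapper, the
status codes and the in-memory fake_tape are all made up for illustration
so the logic can be run without a drive. On a real device write_rec would
wrap write(2) and backspace_rec would wrap ioctl(fd, MTIOCTOP, ...) with
MTBSR from <sys/mtio.h>. Treating ENOSPC and short or zero writes alike as
"end of medium" is my assumption here, not something the sa driver
promises, and the trailer is assumed to be no larger than a data record.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical tape abstraction; not part of any real API. */
struct tape_ops {
    ssize_t (*write_rec)(void *ctx, const void *buf, size_t len);
    int (*backspace_rec)(void *ctx);
    void *ctx;
};

enum tape_status { TAPE_OK = 0, TAPE_FULL = 1, TAPE_ERROR = -1 };

/*
 * Write one record. On end of medium, back up over the previous record
 * (if there is one) and write the trailer in its place. On TAPE_FULL the
 * caller must rewrite both the previous record and this one on the next
 * tape, so it has to hang onto the previous record.
 */
enum tape_status
tape_write_record(const struct tape_ops *t, const void *rec, size_t len,
    const void *trailer, size_t trailer_len, int have_prev)
{
    ssize_t n = t->write_rec(t->ctx, rec, len);

    if (n == (ssize_t)len)
        return TAPE_OK;
    if (n < 0 && errno != ENOSPC)
        return TAPE_ERROR;      /* a real I/O error, not end of medium */

    /* Assumed end of medium: make room for the trailer. */
    if (have_prev && t->backspace_rec(t->ctx) != 0)
        return TAPE_ERROR;
    if (t->write_rec(t->ctx, trailer, trailer_len) != (ssize_t)trailer_len)
        return TAPE_ERROR;
    return TAPE_FULL;
}

/* In-memory stand-in for a tape that holds FAKE_CAPACITY records. */
#define FAKE_CAPACITY 3

struct fake_tape {
    char first_byte[FAKE_CAPACITY]; /* first byte of each record */
    int count;                      /* records currently on "tape" */
};

static ssize_t
fake_write(void *ctx, const void *buf, size_t len)
{
    struct fake_tape *ft = ctx;

    if (ft->count >= FAKE_CAPACITY) {
        errno = ENOSPC;
        return -1;
    }
    ft->first_byte[ft->count++] = len ? *(const char *)buf : '\0';
    return (ssize_t)len;
}

static int
fake_backspace(void *ctx)
{
    struct fake_tape *ft = ctx;

    if (ft->count == 0) {
        errno = EIO;
        return -1;
    }
    ft->count--;
    return 0;
}
```

With the fake tape, records A, B and C fit, the write of D fails, and the
tape ends up holding A, B and the trailer; C and D then go to the next
tape. Whether a real drive lets you overwrite the last record at that
point is exactly the platform-dependent part.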