Date: Tue, 01 Jul 2003 20:07:40 -0400
From: "Dan Langille" <dan@langille.org>
To: Matthew Jacob <mjacob@feral.com>
Cc: freebsd-scsi@freebsd.org
Subject: Re: Differences between Solaris/Linux and FreeBSD
Message-ID: <3F01EA0C.420.5FD9390A@localhost>
In-Reply-To: <20030603111738.X24586@wonky.in0.lcl>
References: <1054550725.1582.1859.camel@rufus>
As a Bacula fan, I'd like to see it working on FreeBSD, but I don't know
what I can do to achieve that. Any ideas?

On 3 Jun 2003 at 11:39, Matthew Jacob wrote:

> > As promised, in this email, I will try my best to describe
> > the differences I found between Solaris/Linux and FreeBSD
> > concerning tape handling. There were five separate areas
> > where I noticed differences:
> >
> > 1. On Solaris/Linux, the default behavior for ioctl(MTEOM)
> > is to run in what they call slow mode. In this mode, the
> > tape is positioned to the end of the data, and the driver
> > returns the correct file number in the MTIOCGET packet.
> > It is possible to enable fast-EOM, but no one uses it to
> > my knowledge.
> >
> > On FreeBSD, you apparently always use the fast-EOM so that
> > the tape position is unknown after the ioctl().
>
> You *could* read block position. Particularly for h/w blocks this works
> very fast when you need to locate.
>
> NB: SCSI-3 changed the layout for h/w block position stuff and I haven't
> updated the FreeBSD driver to handle this yet.
>
> > Bacula always knows how many files are on a tape, and when
> > appending to a tape that is already written and newly opened,
> > it MUST know where it is on the tape. As a consequence, on
> > FreeBSD, I must explicitly use MTFSF with read()s in between
> > to position to the end of the tape -- a fairly slow affair.
>
> Uh, this is how 'slow' EOM works. It's not really faster to do it in the
> kernel as opposed to in the driver.
>
> I must point out that you cannot, and should not, depend absolutely on
> reported position. For tape you can ensure BOT or end of recorded media,
> but otherwise you really must use self-referential data on the tape if
> tape location is important.
>
> > 2. Your handling of EOM differs from Solaris/Linux. On both of
> > those systems, when Bacula reads the first EOF, the driver
> > returns 0 bytes read.
> > On reading the second EOF, the driver
> > returns 0 bytes read, but before returning backspaces over
> > the EOF, leaving you positioned correctly for appending to the
> > tape and having told you that you are at the end of the tape by
> > giving two consecutive 0-byte reads. Any further read()
> > requests return an I/O error.
> >
> > On FreeBSD, reading the first EOF returns 0 bytes, reading
> > the second EOF also returns 0 bytes (sometimes, I apparently
> > get "Illegal operation"). However, the tape is left positioned
> > after the second EOF, so appending from that point effectively
> > "loses" the data.
> >
> > To handle this correctly the FreeBSD user must add a configuration
> > statement to Bacula telling it to backspace a file at EOM.
>
> Yes. This is a problem.
>
> But part of the problem here is that dual-filemark at EOM is only one
> tape convention -- and a poorly thought out one at best -- it exists
> *solely* because a *few* (ancient) tape drives would unwind off the feed
> reel if you kept advancing them. For QIC drives, you *cannot* write dual
> filemarks (really).
>
> Note that there is a setting that can change the model to single EOM. If
> I could have gotten away with it, I would have made this the default.
>
> I think, though, I'd accept that the FreeBSD behaviour is a bug that
> should be fixed. If we have a dual fmk EOT model and are advancing along
> and hit two in a row, we *probably* should say we're at logical EOT and
> backspace over one of them. After all, this is what we do when we're
> *writing* to tape and close the no-rewind device.
>
> I also would agree that this situation is exacerbated by the 'space to
> end of recorded data' model for the MTEOM command. This now leaves us
> with a legacy of tapes with spurious dual filemarks in the middle.
>
> Oops. This means that I really can't fix things the way you'd like :-(.
>
> >
> > 3. I have previously described this but will do so again for
> > completeness here.
> > On Solaris/Linux when Bacula does:
> >
> >    write();
> >    ioctl(MTEOF);
> >    ioctl(MTEOF);
> >    ioctl(MTBSF);
> >    ioctl(MTBSF);
> >    ioctl(MTBSR);
> >    read();
> >
> > the read() re-reads the last write. On FreeBSD, the read returns
> > 0 bytes (there is also a problem of freezing the tape wrapped into
> > this example if I am not mistaken). Apparently the 0 bytes read is
> > because FreeBSD adds an additional EOF mark (not necessary) and
> > leaves the drive positioned *after* the mark, thus re-reading the
> > last record fails when it logically should not.
>
> I don't believe that FreeBSD adds an additional filemark here, but I
> should add this as a test case. I have another tester program that I use
> for testing block locate, but I haven't really validated it or finished
> it yet.
>
> Why, btw, are you issuing two MTEOFs? The mtop has a count field y'know
> :-).
>
> >
> > 4. Tape freezing: On Solaris/Linux, the tape never "freezes". On
> > FreeBSD it does freeze. As best I can determine, you freeze the
> > drive when you lose track of where you are. Typically, this
> > occurs when I do a MTBSR to re-read the last record. On Solaris/Linux
> > the tape is never frozen, but when they don't know the position,
> > they simply return -1s in the MTIOCGET packet, which is fine with
> > me because Bacula only uses that info when initially reading a
> > tape to append to it.
> >
> > Freezing the tape causes all sorts of problems because it generates
> > a flood of unexpected errors. Within a large complicated program like
> > Bacula, when a low level routine re-reads a record during writing and
> > the tape freezes, it cannot simply rewind the drive as this could
> > cause chaos and possible overwriting of the beginning of the drive.
> >
> > I've attempted to overcome tape freezing by providing the user a
> > means to turn off MTBSR (but they don't always do so), and by issuing
> > ioctl(MTIOCERRSTAT) after every return of -1 from any I/O request.
> >
> > I recommend that you do away with freezing the drive -- it seems to
> > me that it only causes more problems. In saying that, I have to
> > admit that I really do not understand tape freezing or why you do it,
> > since I found no documentation on it, and everything I write above I
> > have deduced from what Dan has reported back to me.
>
> Freezing the drive is precisely what Solaris and Linux *should* do. If
> you've lost position, you have to take some action to bring the tape to
> a known position. The unaware application should not be allowed to
> overwrite in random spots on the tape. If your low level read/write
> routines get any kind of error, you have to move to a "what do I have in
> my tape drive now?" state anyway.
>
> You know, I was pretty sure I'd documented the freeze option, but I
> cannot find it in the man page (sa(4)) now at all.
>
> >
> > 5. I am quite fuzzy on this point because I forget exactly what happened
> > and what I did about it.
> >
> > It seems to me that on Linux, if I read a block but specify a number
> > of bytes less than the number actually in the block on the tape, the
> > driver returns the data anyway. I then check if the block is
> > internally complete and if not, increase my record size to the size
> > indicated in the data received, backspace one record, and re-read it.
> >
> > If I am not mistaken, on FreeBSD, the first read returns an error,
> > and Bacula just immediately gives up. Your documentation specifies
> > that one can never read a partial record from a tape, but it does not
> > specify what error code is generated. As a consequence, rather than
> > recovering and re-reading the record, Bacula has to assume it was
> > a fatal error.
>
> The reason Linux 'succeeds' here is because Linux internally reads all
> tape data to an oversized buffer in kernel memory anyway. This means
> that it doesn't suffer an 'overrun' condition which is what you are
> doing if you attempt to read *less* than a tape record size.
> Solaris will fail the same way, btw, as FreeBSD.
>
> What you should always do is start out by reading the largest possible
> record size (a pathetic 64KB for FreeBSD) and adjust *downward* (if
> desired and you are just autosizing to find a tape record size).
>
> Thanks for doing the critique. There's definitely food for thought here
> and some changes that *should* be made.

-- 
Dan Langille : http://www.langille.org/
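
The advice quoted above, to start with the largest possible record size
and adjust downward, could be sketched in C roughly as follows. This is
only an illustration under stated assumptions, not code from the sa(4)
driver or from Bacula: probe_record_size() and MAX_TAPE_RECORD are
hypothetical names, and the 64 KB ceiling is simply the figure mentioned
in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: 64 KB is the largest record the driver transfers,
 * as stated in the message. Hypothetical name. */
#define MAX_TAPE_RECORD ((size_t)64 * 1024)

/*
 * Hypothetical helper: read one record with the largest buffer so the
 * request can never be smaller than the record on the tape, then lower
 * the caller's working record size to what was actually returned.
 *
 * Returns the number of bytes read (> 0), 0 when read() reports
 * nothing (a filemark on a tape device), or -1 on error with errno
 * left as set by read(). *record_size is only changed on success.
 */
ssize_t probe_record_size(int fd, void *buf, size_t buflen,
                          size_t *record_size)
{
    assert(buf != NULL && record_size != NULL);
    assert(buflen >= MAX_TAPE_RECORD);

    ssize_t n = read(fd, buf, MAX_TAPE_RECORD);
    if (n > 0)
        *record_size = (size_t)n;   /* adjust downward to the real size */
    return n;
}
```

A caller would open the no-rewind device, call probe_record_size() once
with a 64 KB buffer, and use the resulting record_size for later reads.
Because the first request is never smaller than the record, it avoids
the overrun case described in point 5.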