Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 8 Jul 2017 00:23:55 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Willem Jan Withagen <wjw@digiware.nl>,  FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject:   Re: newfs returns cg 0: bad magic number
Message-ID:  <20170707220108.F1124@besplex.bde.org>
In-Reply-To: <20170707062354.GP1935@kib.kiev.ua>
References:  <c98c813c-c393-9ba5-5c70-b9575fe59553@digiware.nl> <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org> <9fe3ec97-60ea-e9e6-fb65-9b163d64ac45@digiware.nl> <20170707062354.GP1935@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 7 Jul 2017, Konstantin Belousov wrote:

> On Fri, Jul 07, 2017 at 12:12:49AM +0200, Willem Jan Withagen wrote:
>> On 5-7-2017 08:55, Bruce Evans wrote:
>>> On Wed, 5 Jul 2017, Konstantin Belousov wrote:
>>>
>>>> On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote:
>>>>> Hi,
>>>>>
>>>>> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0.
>>>>> It looks like I can:
>>>>>     run dd on it
>>>>>     gpart the disk
>>>>>     create a zpool on it
>>>>>
>>>>> But when I try to create a UFS file system on it, newfs complains
>>>>> straight from the bat.
>>>>>
>>>>> # sudo newfs -E /dev/ggate0p1
>>>>> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment
>>>>> size 4096
>>>>>         using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes.
>>>>> Erasing sectors [128...2093055]
>>>>> super-block backups (for fsck_ffs -b #) at:
>>>>>  192, 523520, 1046848, 1570176
>>>>> cg 0: bad magic number
>>>>>
>>>>> Googling returns that this is on and off a problem with new devices, but
>>>>> there is no generic suggestion on how to debug this....
>>>>>
>>>>> Any/all suggestions are welcome,
>>>> Typically this error means that the drive returns wrong data, not the
>>>> bytes that were written to it and expected to be read.
>>>
>>> This might be for writing to a nonexistent sector.  Checking for write
>>> errors was broken by libufs, so some write errors are only sometimes
>>> detected as a side effect of reading back garbage.
>>>
>>> I use the following quick fix (the patch also fixes some style bugs).
>>>
>>> X Index: mkfs.c
>>> X ===================================================================
>>> X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v
>>> X retrieving revision 1.85
>>> X diff -u -1 -r1.85 mkfs.c
>>> X --- mkfs.c    9 Apr 2004 19:58:33 -0000    1.85
>>> X +++ mkfs.c    7 Apr 2005 23:51:56 -0000
>>> X @@ -437,16 +441,19 @@
>>> X      if (!Nflag && Oflag != 1) {
>>> X -        i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy,
>>> SBLOCKSIZE);
>>> X +        i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy,
>>> X +            SBLOCKSIZE);
>>> X          if (i == -1)
>>> X -            err(1, "can't read old UFS1 superblock: %s", disk.d_error);
>>> X -
>>> X +            err(1, "can't read old UFS1 superblock: %s",
>>> X +                disk.d_error);
>>> X          if (fsdummy.fs_magic == FS_UFS1_MAGIC) {
>>> X              fsdummy.fs_magic = 0;
>>> X -            bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy,
>>> SBLOCKSIZE);
>>> X +            bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy,
>>> X +                SBLOCKSIZE);
>>> X              for (i = 0; i < fsdummy.fs_ncg; i++)
>>> X -                bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)),
>>> X -                        chdummy, SBLOCKSIZE);
>>> X +                bwrite(&disk,
>>> X +                    fsbtodb(&fsdummy, cgsblock(&fsdummy, i)),
>>> X +                    chdummy, SBLOCKSIZE);
>>> X          }
>>> X      }
>>> X -    if (!Nflag)
>>> X -        sbwrite(&disk, 0);
>>> X +    if (!Nflag && sbwrite(&disk, 0) != 0)
>>> X +        err(1, "sbwrite: %s", disk.d_error);
>>> X      if (Eflag == 1) {
>>> X @@ -518,4 +525,4 @@
>>> X      }
>>> X -    if (!Nflag)
>>> X -        sbwrite(&disk, 0);
>>> X +    if (!Nflag && sbwrite(&disk, 0) != 0)
>>> X +        err(1, "sbwrite: %s", disk.d_error);
>>> X      for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize)
>>>
>>> libufs broke the error handling for the most important writes -- to
>>> the superblock.  Error handling is still done almost correctly in
>>> wtfs(), and most writes are still done using wtfs() which is now
>>> just a wrapper which adds error handling to libufs's bwrite(3), but
>>> writes to superblock are (were) now done internally by libufs's
>>> sbwrite(3) which (like most of libufs) is too hard to use.
>>>
>>> Note that -current needs a slightly different fix.  Part of libufs
>>> being too hard to use is that it is a library so it can't just exit
>>> for errors.  It returns errors in the string disk.d_error and the
>>> fix uses that for newfs, unlike for most other calls to sbwrite(3).
>>> However, newfs no longer uses sbwrite(3).  It uses a wrapper
>>> do_sbwrite() which reduces to pwrite(2).  The wrapper doesn't set
>>> d_error, so it is incompatible with sbwrite(3).
>>>
>>> This is an example that libufs is even harder to use than might first
>>> appear.  The version with the do_sbwrite() wrapper fixes a previous
>>> version which replaced bwrite(3) instead of wrapping it.  bwrite()
>>> in the application conflicted with bwrite(3) in libufs, since libufs
>>> is not designed to have its internals replaced by inconsistent parts
>>> like that.  Apparently, a special case is only needed for superblock
>>> writes, and do_sbwrite() does that, and since libufs doesn't call any
>>> sbwrite() function internally there is no need to replace sbwrite(3);
>>> sbwrite(3) is just useless for its main application.  All that the
>>> bwrite(3) and sbwrite(3) library functions do is handle the block
>>> size implicitly in a way that makes them harder to use than just
>>> multiplying by the block size like wtfs() used to do and do_sbwrite()
>>> now does.
>>
>> This is where the trouble originates:
>> /usr/srcs/11/src/lib/libufs/sblock.c:148
>>         /*
>>          * Write superblock summary information.
>>          */
>>         blks = howmany(fs->fs_cssize, fs->fs_fsize);
>>         space = (uint8_t *)disk->d_sbcsum;
>>         for (i = 0; i < blks; i += fs->fs_frag) {

I already said that newfs doesn't use this.

>>
>> But:
>>
>> (gdb) p disk->d_sbcsum
>> $19 = (struct csum *) 0x0
>>
>> and this pointer is later on used to write:
>>     for (i = 0; i < blks; i += fs->fs_frag) {
>>         size = fs->fs_bsize;
>>         if (i + fs->fs_frag > blks)
>>             size = (blks - i) * fs->fs_fsize;
>>         if (bwrite(disk, fsbtodb(fs, fs->fs_csaddr + i), space, size)
>>             == -1) {
>>             ERROR(disk, "Failed to write sb summary information");
>>             return (-1);
>>         }
>>         space += size;
>>     }
>>
>> But the bwrite returns error because the called pwrite() tries to write
>> 4096 bytes from a null pointer. And that it does not like.
>>
>> Now the question is: why isn't d_sbcsum not filled out?
>> Note that the disk is filled with random data.

d_sbcsum() isn't filled out because newfs also doesn't call sbread()
(or ufs_disk_fillout()).

>> I've been looking for quite some time, but I just don't get it.
>> Where should the superblock come from if a whole disk is being used?
>> (so there no MBR or gpart written. Dangerously dedicated)
>
> Indeed I am not sure what do you report there.  newfs(8) does not use
> sbwrite() function from libufs.  Set a breakpoint on the sbwrite()
> function and catch the backtrace of the call.

The full unusability of libufs is now even clearer.  sbread(3) is
obviously unusable by newfs since there is no fs initially.  libufs's
API is already ugly to support this as well as possible.  Closing is
done by ufs_disk_close(), but there is no corresponding ufs_disk_open().
Opening is done by the badly named functions ufs_disk_fillout() and
ufs_disk_fillout_blank().  These open a file a fill out a struct uufsd.
The 'blank' one is for use in newfs when there is no superblock.  It
fills out the the struct uufsd except for leaving the fields related
to the superblock blank.  The man pages don't give the details about
what these fields are, so you have to look at the implementation to
see which libufs functions can be safely called with blank initialization.

It might seem obvious that sbwrite() should not be called unless sbread()
(or a non-blank fillout) has been called to initialize libufs fields related
to the superblock, but the first libufs'ed versions of newfs made the call.
This was fragile: newfs is chummy with the implementation and hacks on
the libufs variable disk.d_bsize (it still does), so that this variable
is not "blank"; then sbwrite(3) uses this variable to "unblank" another
even more internal variable so that sbwrite(3) sort of worked.

Further history:
- r109926: first use of libufs in newfs
- r185588: to get variable block sizes, don't use libufs bwrite(3) in newfs
- r188520: replacing library functions as in r185588 is fragile in all cases
            and was broken for static linkage, so don't do it.  Don't use
 	   sbwrite(3) instead
- r207141: add soft update/journaling stuff to libufs only.  This adds
            complications to sbwrite(3).  Since newfs doesn't use sbwrite(3),
 	   it is unaffected.  I think it doesn't need any changes, but the
 	   replacement for sbwrite(3) is now even more incompatible.

sbwrite(3) is only used by:
- tunefs.  This shouldn't write more than the basic superblock
- fsck_ffs: the soft update/journaling parts call sbwrite(3).

So it seems that sbwrite(3) had no useful uses before soft update/journaling
started using it, and all not incorrect uses of it start with non-blank
initialization.

For newfs, there are a lot of silly complications involving the block size:
- the of initialization of disk.d_bsize to sectorsize (from the firmware
   or label) or even the -S option doesn't seem to be good for anything.
   Actually, the comment explains this.  It says that "Our blocks = sector
   size".  This is not what libufs wants.  It is what was convenient for
   newfs before libufs.  It sometimes works for libufs too, but is not
   documented to work (no method for initializing disk.d_bsize is documented).
- oops, the wrapper db_sbwrite() is no to handle complications for the block
   size.  It is to add the partition offset.  For most i/o's newfs tells
   libufs the full offset.  Superblock i/o's are special because the offsets
   are passed implicitly and the implicit values don't contain the offset.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20170707220108.F1124>